UNIT 1 DIVIDE-AND-CONQUER
Structure
1.0 Introduction
1.1 Objectives
1.2 General Issues in Divide-and-Conquer
1.3 Integer Multiplication
1.4 Binary Search
1.5 Sorting
    1.5.1 Merge Sort
    1.5.2 Quick Sort
1.6 Randomized Quicksort
1.7 Finding the Median
1.8 Matrix Multiplication
1.9 Exponentiation
1.10 Summary
1.11 Solutions/Answers
1.12 Further Readings
1.0 INTRODUCTION
We have already mentioned that solving a (general) problem, with or without
computers, is quite a complex and difficult task. We also mentioned that a large
number of problems, which we may encounter even in a formal discipline like
Mathematics, may not have any algorithmic/computer solutions at all. Even among
the problems which theoretically can be solved algorithmically, designing a solution
is, in general, quite difficult. In view of this difficulty, a number of standard
techniques which have been found helpful in solving problems have become popular
in computer science. Of these techniques, Divide-and-Conquer is probably the best
known.
The general plan for Divide-and-Conquer technique has the following three major
steps:
Step 1: An instance of the problem to be solved is divided into a number of smaller
instances of the (same) problem, generally of equal sizes. Any sub-instance
may be further divided into its sub-instances. A stage is reached when either
a direct solution of a sub-instance is available or the sub-instance is not
further sub-divisible. In the latter case, when no further sub-division is
possible, we attempt a direct solution for the sub-instance.
Step 2: Solve each of the smaller instances, recursively, or directly when a direct
solution is available.
Step 3: Combine the solutions so obtained of the smaller instances to get the
solution of the original instance of the problem.
1.1 OBJECTIVES
After going through this Unit, you should be able to:
• explain the essential idea behind the Divide-and-Conquer strategy for solving
  problems with the help of a computer, and
• use the Divide-and-Conquer strategy for solving problems.
Design Techniques-I
1.2 GENERAL ISSUES IN DIVIDE-AND-CONQUER
Recalling from the introduction, Divide-and-Conquer is a technique of designing
algorithms that (informally) proceeds as follows:
Given an instance of the problem to be solved, split it into more than one
sub-instance (of the given problem). If possible, divide each of the sub-instances
into still smaller instances, till a sub-instance either has a direct solution available
or is not further subdivisible. Then solve each of the sub-instances independently,
and finally combine the solutions of the sub-instances so as to yield a solution for
the original instance.
Example 1.2.1:
We have an algorithm, alpha say, which is known to solve all instances of size n of a
given problem in at most c·n² steps (where c is some constant). We then discover an
algorithm, beta say, which solves the same problem by:
• Dividing an instance into 3 sub-instances of size n/2.
• Solving these 3 sub-instances.
• Combining the three sub-solutions, taking d·n steps for the combining.
Suppose our original algorithm, alpha, is used to carry out Step 2, viz., 'solve these
sub-instances'. Let T(beta)(n) denote the number of steps taken by beta. Then

T(beta)(n) = 3·c·(n/2)² + d·n = (3/4)·c·n² + d·n

So if d·n < (c·n²)/4 (i.e., n > 4d/c), then beta is faster than alpha.
In particular, for all large enough n (viz., for n > 4d/c = constant), beta is faster
than alpha.
The algorithm beta improves upon the algorithm alpha by just a constant factor. But
if the problem size n is large enough, then for some i > 1 we have

n > 4d/c and also
n/2 > 4d/c and even
n/2^i > 4d/c,

which suggests using beta instead of alpha for Step 2 repeatedly, until the
sub-sub-…-sub-instances are of size n0 <= (4d/c); this will yield a still faster
algorithm.
Let us call this new algorithm gamma:

procedure gamma (n : input size);
if n <= n0 then
    Solve problem using Algorithm alpha;
else
    Split the problem instance into 3 sub-instances of size n/2;
    Use gamma to solve each sub-instance;
    Combine the 3 sub-solutions;
end if;
end gamma;
Let T(gamma)(n) denote the running time of this algorithm. Then

T(gamma)(n) = c·n²,                       if n <= n0
            = 3·T(gamma)(n/2) + d·n,      otherwise

Later in the course we shall show how relations of this form can be estimated. With
these methods it can be shown that

T(gamma)(n) = O(n^(log₂ 3)) = O(n^1.59)
This is a significant improvement upon algorithms alpha and beta, in view of the fact
that as n becomes larger, the difference between the values of n^1.59 and n² becomes
larger and larger.
The improvement that results from applying algorithm gamma is due to the fact that
it maximizes the savings achieved through beta. The (relatively) inefficient method
alpha is applied only to 'small' problem sizes.
Two general design decisions arise here: (i) the choice of the threshold below which
sub-division stops, and (ii) the sizes of the sub-instances. In (ii), it is more usual to
consider the ratio of initial problem size to sub-instance size; in our example, the
ratio was 2. The threshold in (i) is sometimes called the (recursive) base value. In
summary, the generic form of a divide-and-conquer algorithm is:
Procedure D-and-C (n : input size);
begin
    read (n0);  { n0 is the threshold value }
    if n <= n0 then
        solve problem without further sub-division;
    else
        Split into r sub-instances, each of size n/k;
        for each of the r sub-instances do
            D-and-C (n/k);
        Combine the resulting sub-solutions to produce
        the solution to the original problem;
    end if;
end D-and-C;
Such algorithms are naturally and easily realised as recursive procedures in
(suitable) high-level programming languages.
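As an illustration, the generic scheme above can be realised directly as a recursive
procedure. The following C++ sketch (our own hypothetical example, not part of the
original text) finds the maximum of an array by splitting it into 2 sub-instances,
with the base value n0 = 1:

```cpp
#include <algorithm>
#include <vector>

// Divide-and-conquer maximum: threshold n0 = 1, split ratio k = 2,
// r = 2 sub-instances.  Works on the half-open range [lo, hi).
int dc_max(const std::vector<int>& a, int lo, int hi)
{
    if (hi - lo <= 1)                 // base case: solve directly
        return a[lo];
    int mid = lo + (hi - lo) / 2;     // split into two sub-instances
    int left  = dc_max(a, lo, mid);   // solve each sub-instance recursively
    int right = dc_max(a, mid, hi);
    return std::max(left, right);     // combine the sub-solutions
}
```

The three stages of the template (split, recursive solution, combine) appear as the
three marked steps of the function.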
1.3 INTEGER MULTIPLICATION

Consider the problem of multiplying two n-digit numbers

x = x_(n─1) x_(n─2) … x_1 x_0 and
y = y_(n─1) y_(n─2) … y_1 y_0

to obtain their product

z = z_(2n─1) z_(2n─2) … z_1 z_0

Note: The algorithm given below works for any number base, e.g., binary, decimal,
hexadecimal, etc. We use decimal simply for convenience.
The classical algorithm for multiplication requires O(n²) steps to multiply two
n-digit numbers, where a step is regarded as a single operation involving two
single-digit numbers, e.g., 5 + 6, 3 × 4, etc.
Writing

x = x_(n─1) x_(n─2) … x_1 x_0 and
y = y_(n─1) y_(n─2) … y_1 y_0,

we have

x = Σ_(i=0)^(n─1) x_i · 10^i  and  y = Σ_(i=0)^(n─1) y_i · 10^i.

The product z, with representation

z = z_(2n─1) z_(2n─2) … z_1 z_0,

is given by

z = Σ_(i=0)^(2n─1) z_i · 10^i = ( Σ_(i=0)^(n─1) x_i · 10^i ) × ( Σ_(i=0)^(n─1) y_i · 10^i ).
i i i
For example:
581 = 5 × 10² + 8 × 10¹ + 1 × 10⁰
602 = 6 × 10² + 0 × 10¹ + 2 × 10⁰
581 × 602 = 349762 = 3 × 10⁵ + 4 × 10⁴ + 9 × 10³ + 7 × 10² + 6 × 10¹ + 2 × 10⁰
Let us denote

a = x_(n─1) x_(n─2) … x_[n/2]          (the more significant digits of x)
b = x_([n/2]─1) … x_1 x_0              (the less significant digits of x)
c = y_(n─1) y_(n─2) … y_[n/2]
d = y_([n/2]─1) … y_1 y_0

so that x = a × 10^[n/2] + b and y = c × 10^[n/2] + d.
From this we also know that the result of multiplying x and y (i.e., z) is

z = x × y = (a × 10^[n/2] + b) × (c × 10^[n/2] + d)
          = (a × c) × 10^(2[n/2]) + (a × d + b × c) × 10^[n/2] + (b × d)

where

2[n/2] = n,        if n is even
       = n ─ 1,    if n is odd
For a given n-digit number, whenever we divide the sequence of digits into two
subsequences, one of which has [n/2] digits, the other subsequence has n ─ [n/2]
digits, which is n/2 digits if n is even and (n+1)/2 digits if n is odd. However, for
convenience, we may call both of them (n/2)-digit sequences/numbers.
In the divide-and-conquer scheme, the four products a × c, a × d, b × c and b × d
are returned by recursive calls (the Divide and Conquer stages). Given the four
returned products, the calculation of the result of multiplying x and y involves only
additions (which can be done in O(n) steps) and multiplications by powers of 10
(which can also be done in O(n) steps, since each only requires placing the
appropriate number of 0s at the end of a number). (Combine stage.)

Karatsuba's method reduces the four recursive multiplications to three, using the
identity a × d + b × c = (a + b) × (c + d) ─ a × c ─ b × d. This saving is
accomplished at the expense of a slightly larger number of steps taken in the
'combine stage' (Step 3), although this still uses only O(n) operations.
We continue with the earlier notation in which z is the product of two numbers x and
y having, respectively, the decimal representations

x = x_(n─1) x_(n─2) … x_1 x_0
y = y_(n─1) y_(n─2) … y_1 y_0

Further, a, b, c, d are the numbers whose decimal representations are given by

a = x_(n─1) x_(n─2) … x_[n/2]
b = x_([n/2]─1) x_([n/2]─2) … x_1 x_0
c = y_(n─1) y_(n─2) … y_[n/2]
d = y_([n/2]─1) y_([n/2]─2) … y_1 y_0
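A compact C++ sketch of the three-multiplication scheme (our own illustrative
version on machine integers, splitting by a power of 10 rather than on digit strings;
the function name is an assumption of ours):

```cpp
#include <cstdint>

// Karatsuba-style multiplication of two non-negative integers,
// splitting each operand as  hi * 10^m + lo  and using the identity
//   a*d + b*c = (a + b)*(c + d) - a*c - b*d,
// so only three recursive multiplications are needed.
std::int64_t karatsuba(std::int64_t x, std::int64_t y)
{
    if (x < 10 || y < 10)                  // base case: a single-digit factor
        return x * y;
    int digits = 0;                        // digit count of the smaller operand
    for (std::int64_t t = (x < y ? x : y); t > 0; t /= 10)
        ++digits;
    std::int64_t p = 1;                    // p = 10^(digits/2)
    for (int i = 0; i < digits / 2; ++i)
        p *= 10;
    std::int64_t a = x / p, b = x % p;     // x = a*p + b
    std::int64_t c = y / p, d = y % p;     // y = c*p + d
    std::int64_t ac  = karatsuba(a, c);
    std::int64_t bd  = karatsuba(b, d);
    std::int64_t mid = karatsuba(a + b, c + d) - ac - bd;   // = a*d + b*c
    return ac * p * p + mid * p + bd;
}
```

The recursion makes exactly three multiplication calls per level, mirroring the
derivation above.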
Performance Analysis

One of the reasons why we study analysis of algorithms is that if there is more than
one algorithm that solves a given problem, then, through analysis, we can find the
running times of the various available algorithms, and then choose the one which
takes the least (or lesser) running time.

For a divide-and-conquer algorithm P, the important factors are:
• the number of sub-instances into which an instance is split (let us call this
  number α);
• the ratio of initial problem size to sub-instance size (let us call this ratio β;
  in our example, the ratio was 2);
• the number of steps required to divide the initial instance into sub-instances
  and to combine the sub-solutions, expressed as a function γ(n) of the input
  size n.

Let T_P(n) denote the number of steps taken by P on instances of size n. Then

T_P(n0) = constant             (recursive base)
T_P(n)  = α·T_P(n/β) + γ(n)

In the case when α and β are both constants (as in all the examples we have given),
there is a general method that can be used to solve such recurrence relations in
order to obtain an asymptotic bound for the running time T_P(n). These methods
were discussed in Block 1.
In general:

T(n) = α·T(n/β) + O(n^γ)

(where γ is constant) has the solution

T(n) = O(n^γ),             if α < β^γ
     = O(n^γ · log n),     if α = β^γ
     = O(n^(log_β α)),     if α > β^γ
Ex. 1) Using Karatsuba's Method, find the value of the product 1026732 × 732912.
1.4 BINARY SEARCH

int Binary_Search (int * A, int low, int high, int value)
{   int mid;
    while (low <= high)
    {   mid = (low + high) / 2;
        if (value == A[mid])
            return mid;
        else if (value < A[mid])
            high = mid - 1;
        else
            low = mid + 1;
    }
    return -1;
}
Explanation of the Binary Search Algorithm

It takes as parameters the array A, in which the value is to be searched, and the
lower and upper bounds of the array, viz., low and high respectively. At each
iteration of the while loop, the algorithm reduces the number of elements of the
array still to be searched by half. If the value is found, then its index is returned.
However, if the value is not found, the loop terminates when the value of low
exceeds the value of high: there are then no more items to be searched, and the
function returns a negative value to indicate that the item was not found.
Analysis

As mentioned earlier, each step of the algorithm divides the block of items being
searched in half. Hence, the presence or absence of an item in an array of n
elements can be established in at most lg n steps. Thus the running time of a binary
search is proportional to lg n, and we say this is an O(lg n) algorithm.
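As a quick check, here is a self-contained C++ version of the routine above, run on
the 15-element array used in Ex. 2 below (0-based indices here, so index 2
corresponds to position 3 of the 1-based discussion):

```cpp
// Iterative binary search over a sorted array; returns the index of
// value in A[low..high], or -1 if it is absent.
int binary_search(const int A[], int low, int high, int value)
{
    while (low <= high)
    {
        int mid = (low + high) / 2;   // middle of the current block
        if (value == A[mid])
            return mid;
        else if (value < A[mid])
            high = mid - 1;           // continue in the left half
        else
            low = mid + 1;            // continue in the right half
    }
    return -1;                        // low > high: not found
}
```

Each pass of the loop halves the block [low, high], so at most lg n passes occur.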
Ex. 2) Explain how the Binary Search method finds or fails to find, in the given
sorted array

8 12 15 26 35 48 57 78 86 93 97 108 135 168 201

the following values:
(i) 15
(ii) 93
(iii) 43
1.5 SORTING

We have already discussed two sorting algorithms, viz., Merge Sort and Quick Sort.
The purpose of repeating the algorithms here is mainly to discuss, not the design,
but the analysis part.

1.5.1 Merge Sort

Divide Step: If the given array A has zero or one element, then return the array A
as it is, since it is trivially sorted. Otherwise, chop the given array A at about the
middle to give two subarrays A1 and A2, each of which contains about half of the
elements in A.

The recursion stops when the subarray has just one element, so that it is trivially
sorted. Below is the Merge Sort function in C++.
void merge_sort (int A[], int p, int r)
{
    if (p < r)
    {
        int q = (p + r) / 2;
        merge_sort (A, p, q);
        merge_sort (A, q + 1, r);
        merge (A, p, q, r);
    }
}
Next, we define the merge function, which is called by merge_sort. At this stage,
we have an array A and indices p, q, r such that p <= q < r. Subarray A[p .. q] is
sorted and subarray A[q + 1 .. r] is sorted, and by the restrictions on p, q, r,
neither subarray is empty. We want the two subarrays to be merged into a single
sorted subarray in A[p .. r]. We will implement it so that it takes O(n) time, where
n = r ─ p + 1 is the number of elements being merged.
Let us consider two piles of cards. Each pile is sorted and placed face-up on a table
with the smallest card on top of each pile. We will merge these into a single sorted
pile, face-down on the table. A basic step will be to choose the smaller of the two top
cards, remove it from its pile, thereby exposing a new top card and then placing the
chosen card face-down onto the output pile. We will repeatedly perform these basic
steps until one input becomes empty. Once one input pile empties, just take the
remaining input pile and place it face-down onto the output pile. Each basic step
should take constant time, since we check just the two top cards and there are n basic
steps, since each basic step removes one card from the input piles, and we started with
n cards in the input piles. Therefore, this procedure should take O(n) time. We don‟t
actually need to check whether a pile is empty before each basic step. Instead we will
put on the bottom of each input pile a special sentinel card. It contains a special value
that we use to simplify the code. We know in advance that there are exactly
r ─ p + 1 non-sentinel cards. We will stop once we have performed r ─ p + 1 basic
steps. Below is the function merge, which runs in O(n) time.
void merge (int A[], int p, int q, int r)
{
    int n1 = q - p + 1;
    int n2 = r - q;
    int* L = new int [n1 + 2];    // 1-based, with room for a sentinel
    int* R = new int [n2 + 2];
    for (int i = 1; i <= n1; i++)
        L[i] = A[p + i - 1];
    for (int j = 1; j <= n2; j++)
        R[j] = A[q + j];
    L[n1 + 1] = R[n2 + 1] = INT_MAX;   // sentinel cards (INT_MAX from <climits>)
    int i = 1, j = 1;
    for (int k = p; k <= r; k++)
    {
        if (L[i] <= R[j])
        {
            A[k] = L[i];
            i = i + 1;
        }
        else
        {
            A[k] = R[j];
            j = j + 1;
        }
    }
    delete [] L;
    delete [] R;
}
Solving the merge-sort recurrence: The running time satisfies T(n) = 2T(n/2) + O(n),
and by the master theorem this recurrence has the solution T(n) = O(n lg n).
Compared to insertion sort (O(n²) worst-case time), merge sort is faster. Trading a
factor of n for a factor of lg n is a good deal. On small inputs, insertion sort may be
faster. But for large enough inputs, merge sort will always be faster, because its
running time grows more slowly than insertion sort's.
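The whole scheme can also be exercised with a compact, self-contained C++ sketch
(written here without the sentinel trick, using std::inplace_merge for the combine
step; this is our own variant, not the text's version):

```cpp
#include <algorithm>
#include <vector>

// Recursive merge sort on A[p..r] (inclusive bounds, 0-based).
// The combine step uses std::inplace_merge, which merges the two
// sorted halves A[p..q] and A[q+1..r] in place.
void merge_sort(std::vector<int>& A, int p, int r)
{
    if (p < r)
    {
        int q = (p + r) / 2;                 // divide at the middle
        merge_sort(A, p, q);                 // conquer left half
        merge_sort(A, q + 1, r);             // conquer right half
        std::inplace_merge(A.begin() + p,    // combine the two halves
                           A.begin() + q + 1,
                           A.begin() + r + 1);
    }
}
```

The recursion tree has lg n levels, each doing O(n) merging work, which is where
the O(n lg n) bound comes from.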
1.5.2 Quick Sort

Partition A[1…n] into subarrays A1 = A[1..q] and A2 = A[q + 1…n] such that all
elements in A2 are larger than all elements in A1.

Recursively sort A1 and A2.
Pseudo code for QUICKSORT:
QUICKSORT (A, p, r)
If p < r THEN
q = PARTITION (A, p, r)
QUICKSORT (A, p, q ─ 1)
QUICKSORT (A, q + 1, r)
end if
Then, in order to sort an array A of n elements, we call QUICKSORT with the three
parameters A, 1 and n: QUICKSORT (A, 1, n).

If the partition always produces q = n/2 and takes Θ(n) time, and T(n) denotes the
time taken by QUICKSORT in sorting an array of n elements, we again get the
recurrence

T(n) = 2T(n/2) + Θ(n)

Solving the recurrence, we get the running time T(n) = Θ(n log n).

The problem is that it is hard to develop a partition algorithm which always divides
A into two halves.
PARTITION (A, p, r)
x = A[r]
i = p ─ 1
FOR j = p TO r ─ 1 DO
    IF A[j] <= x THEN
        i = i + 1
        Exchange A[i] and A[j]
    end if
end DO
Exchange A[i + 1] and A[r]
RETURN i + 1
QUICKSORT correctness:
Easy to show inductively, if PARTITION works correctly
Example:
2 8 7 1 3 5 6 4 i = 0, j = 1
2 8 7 1 3 5 6 4 i = 1, j = 2
2 8 7 1 3 5 6 4 i = 1, j = 3
2 8 7 1 3 5 6 4 i = 1, j = 4
2 1 7 8 3 5 6 4 i = 2, j = 5
2 1 3 8 7 5 6 4 i = 3, j = 6
2 1 3 8 7 5 6 4 i = 3, j = 7
2 1 3 8 7 5 6 4 i = 3, j = 8
2 1 3 4 7 5 6 8 q=4
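The trace above can be checked with a short C++ rendering of PARTITION and
QUICKSORT (0-based indices here, so the returned split point 3 corresponds to
q = 4 in the 1-based trace):

```cpp
#include <utility>
#include <vector>

// Lomuto-style PARTITION: pivot is A[r]; returns the pivot's final index.
int partition(std::vector<int>& A, int p, int r)
{
    int x = A[r];                 // pivot element
    int i = p - 1;
    for (int j = p; j <= r - 1; j++)
        if (A[j] <= x)
        {
            i = i + 1;
            std::swap(A[i], A[j]);
        }
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

void quicksort(std::vector<int>& A, int p, int r)
{
    if (p < r)
    {
        int q = partition(A, p, r);
        quicksort(A, p, q - 1);
        quicksort(A, q + 1, r);
    }
}
```

Running partition on the example input {2, 8, 7, 1, 3, 5, 6, 4} reproduces the final
row of the trace.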
If we run QUICKSORT on a set of inputs that are already sorted, the average
running time will be close to the worst-case.
Similarly, if we run QUICKSORT on a set of inputs that give good splits, the
average running time will be close to the best-case.
If we run QUICKSORT on a set of inputs which are picked uniformly at
random from the space of all possible input permutations, then the average case
will also be close to the best-case. Why? Intuitively, if any input ordering is
equally likely, then we expect at least as many good splits as bad splits,
therefore on the average a bad split will be followed by a good split, and it gets
“absorbed” in the good split.
So, under the assumption that all input permutations are equally likely, the average
time of QUICKSORT is Θ(n lg n) (intuitively). Is this assumption realistic?
Not really. In many cases the input is almost sorted: think of rebuilding indexes
in a database etc.
The question is: how can we make QUICKSORT have a good average time
irrespective of the input distribution?
Using randomization.
Running time of a randomized algorithm depends not only on input but also on the
random choices made by the algorithm.
Randomized algorithms have best-case and worst-case running times, but the inputs
for which these are achieved are not known, they can be any of the inputs.
We are normally interested in analyzing the expected running time of a randomized
algorithm, that is the expected (average) running time for all inputs of size n
1.6 RANDOMIZED QUICKSORT

(This section may be omitted after one reading.)

One way is to pick the pivot element at random. Alternatively, we can modify
PARTITION slightly and exchange the last element in A with a random element of A
before partitioning.
RANDQUICKSORT (A, p, r)
IF p < r THEN
    q = RANDPARTITION (A, p, r)
    RANDQUICKSORT (A, p, q ─ 1)
    RANDQUICKSORT (A, q + 1, r)
END IF
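A C++ sketch of RANDPARTITION and RANDQUICKSORT (the random source,
std::mt19937 with a fixed seed, is our own choice for reproducibility):

```cpp
#include <random>
#include <utility>
#include <vector>

std::mt19937 rng(12345);   // any seed will do; fixed here for reproducibility

// Standard Lomuto partition around the pivot A[r].
int lomuto_partition(std::vector<int>& A, int p, int r)
{
    int x = A[r], i = p - 1;
    for (int j = p; j < r; j++)
        if (A[j] <= x) std::swap(A[++i], A[j]);
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

// RANDPARTITION: exchange A[r] with a randomly chosen element first.
int rand_partition(std::vector<int>& A, int p, int r)
{
    std::uniform_int_distribution<int> pick(p, r);
    std::swap(A[pick(rng)], A[r]);
    return lomuto_partition(A, p, r);
}

void rand_quicksort(std::vector<int>& A, int p, int r)
{
    if (p < r)
    {
        int q = rand_partition(A, p, r);
        rand_quicksort(A, p, q - 1);
        rand_quicksort(A, q + 1, r);
    }
}
```

Whatever the input ordering, the pivot choice is now random, so the expected
running time no longer depends on the input distribution.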
One call of PARTITION takes O(1) time plus time proportional to the number of
iterations of the FOR-loop. In each iteration of the FOR-loop, we compare an
element with the pivot element. Each pair of elements z_i and z_j is therefore
compared at most once (namely, when one of them is the pivot).
Let z_1, z_2, …, z_n denote the elements of A in sorted order, and let X be the total
number of comparisons performed. Then

X = Σ_(i=1)^(n─1) Σ_(j=i+1)^(n) X_ij, where

X_ij = 1, if z_i is compared to z_j
     = 0, if z_i is not compared to z_j

By linearity of expectation,

E[X] = Σ_(i=1)^(n─1) Σ_(j=i+1)^(n) Pr[z_i compared to z_j]
To compute Pr[z_i compared to z_j], it is useful to consider when two elements are
not compared.

For example, consider an input consisting of the numbers 1 through 10. Assume that
the first pivot is 7: the first partition separates the numbers into the sets
{1, 2, 3, 4, 5, 6} and {8, 9, 10}.

In this partitioning, 7 is compared to all the other numbers, but no number from the
first set will ever be compared to a number from the second set.

In general, once a pivot r with z_i < r < z_j is chosen, we know that z_i and z_j can
never later be compared.

On the other hand, if z_i is chosen as a pivot before any other element of the set
Z_ij = {z_i, z_(i+1), …, z_j}, then it is compared to each element of Z_ij. Similarly
for z_j.

In the example, 7 and 9 are compared because 7 is the first item of Z_(7,9) to be
chosen as a pivot, and 2 and 9 are not compared because the first pivot chosen from
Z_(2,9) is 7.

Prior to an element of Z_ij being chosen as a pivot, the whole set Z_ij is together in
the same partition, so any element of Z_ij is equally likely to be the first one chosen
as a pivot. Since |Z_ij| = j ─ i + 1, the probability that z_i or z_j is chosen first in
Z_ij is 2/(j ─ i + 1). Hence

Pr[z_i compared to z_j] = 2 / (j ─ i + 1)
We now have:

E[X] = Σ_(i=1)^(n─1) Σ_(j=i+1)^(n) Pr[z_i compared to z_j]
     = Σ_(i=1)^(n─1) Σ_(j=i+1)^(n) 2/(j ─ i + 1)
     = Σ_(i=1)^(n─1) Σ_(k=1)^(n─i) 2/(k + 1)
     < Σ_(i=1)^(n─1) Σ_(k=1)^(n) 2/k
     = Σ_(i=1)^(n─1) O(log n)
     = O(n log n)
Next we will see how to make a selection algorithm run in worst-case O(n) time.

1.7 FINDING THE MEDIAN

The selection problem is the following: given an array A of n elements and an
integer i, 1 <= i <= n, find the ith smallest element of A (the ith order statistic;
the median corresponds to i = ⌈n/2⌉).

We will give here two algorithms for the solution of the above problem. One is a
practical randomized algorithm with O(n) expected running time. The other
algorithm, which is of more theoretical interest only, achieves O(n) worst-case
running time.
Randomized Selection

The key idea is to use the algorithm PARTITION () from quicksort, but with the
saving that we only need to examine one subarray; this saving shows up in the O(n)
expected running time. We will use RANDPARTITION (A, p, r), which randomly
partitions the array A around an element A[q], such that all elements from A[p] to
A[q ─ 1] are less than A[q], and all elements from A[q + 1] to A[r] are greater
than A[q].
We can now give the pseudo code for Randomized Select (A, p, r, i). This procedure
selects the ith order statistic in the Array A [p ..r].
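The procedure can be sketched in C++ as follows (our own rendering, reusing a
Lomuto partition with a random pivot; the text's pseudo code itself is not reproduced
in this copy):

```cpp
#include <random>
#include <utility>
#include <vector>

std::mt19937 gen(7);   // fixed seed for reproducibility

// Lomuto partition with a randomly chosen pivot; returns the pivot index.
int rand_partition(std::vector<int>& A, int p, int r)
{
    std::uniform_int_distribution<int> pick(p, r);
    std::swap(A[pick(gen)], A[r]);
    int x = A[r], i = p - 1;
    for (int j = p; j < r; j++)
        if (A[j] <= x) std::swap(A[++i], A[j]);
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

// Randomized Select: returns the ith smallest element of A[p..r]
// (i is 1-based within the subarray).
int randomized_select(std::vector<int>& A, int p, int r, int i)
{
    if (p == r)
        return A[p];
    int q = rand_partition(A, p, r);
    int k = q - p + 1;               // A[q] is the kth smallest of A[p..r]
    if (i == k)
        return A[q];
    else if (i < k)
        return randomized_select(A, p, q - 1, i);
    else
        return randomized_select(A, q + 1, r, i - k);
}
```

Unlike quicksort, only one of the two subarrays is ever recursed into, which is
exactly the saving mentioned above.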
Worst case: The partition may always split in 0 : n ─ 1 fashion. Therefore, the time
taken by Randomized Select can be described by the recurrence

T(n) = T(n ─ 1) + O(n)
     = O(n²)    (arithmetic series)

Average case: Let us now analyse the average-case running time of Randomized
Select.
For an upper bound, assume that the ith element always occurs in the larger side of
the partition. Since the random partition is equally likely to return any position q,

T(n) <= (1/n) Σ_(k=1)^(n) T(max(k ─ 1, n ─ k)) + Θ(n)
     <= (2/n) Σ_(k=n/2)^(n─1) T(k) + Θ(n)

Let us prove by the substitution method that T(n) <= cn for some constant c:

T(n) <= (2/n) Σ_(k=n/2)^(n─1) T(k) + Θ(n)                  The recurrence we started with
     <= (2/n) Σ_(k=n/2)^(n─1) ck + Θ(n)                    Substitute T(k) <= ck
     =  (2c/n) [Σ_(k=1)^(n─1) k ─ Σ_(k=1)^(n/2─1) k] + Θ(n)   "Split" the sum
     =  (2c/n) [(n ─ 1)n/2 ─ (n/2 ─ 1)(n/2)/2] + Θ(n)      Expand arithmetic series
     =  c(n ─ 1) ─ (cn/4 ─ c/2) + Θ(n)                     Multiply it out
     =  cn ─ c ─ cn/4 + c/2 + Θ(n)
     =  cn ─ (cn/4 + c/2 ─ Θ(n))                           Rearrange the arithmetic
     <= cn   (if c is big enough)                          What we set out to prove
Worst-case linear-time selection chooses as pivot the median x of the medians of
⌈n/5⌉ groups of 5 elements. At least half of these 5-element medians are >= x, i.e.,
at least ⌊⌈n/5⌉/2⌋ >= n/10 of them, and hence there are at least 3·(n/10) elements
which are >= x (each such group contributes its median and the two elements above
it). Now, for large n, 3n/10 >= n/4. So at least n/4 elements are >= x, and similarly
at least n/4 elements are <= x. Thus, after partitioning around x, step 5 will call
Select () on at most 3n/4 elements. The recurrence is therefore:
T(n) <= T(n/5) + T(3n/4) + Θ(n)

Let us again prove T(n) <= cn by substitution:

T(n) <= T(n/5) + T(3n/4) + Θ(n)
     <= cn/5 + 3cn/4 + Θ(n)          Substitute T(n) <= cn
     =  19cn/20 + Θ(n)               Combine fractions
     =  cn ─ (cn/20 ─ Θ(n))          Express in desired form
     <= cn   (if c is big enough)    What we set out to prove
1.8 MATRIX MULTIPLICATION

The idea behind Strassen's algorithm is to multiply 2 × 2 matrices using only 7
scalar multiplications (instead of 8). Consider the matrices

( r  s )   ( a  b ) ( e  g )
( t  u ) = ( c  d ) ( f  h )
The seven submatrix products used are
P1 = a . (g – h)
P2 = (a + b) . h
P3 = ( c+ d ) . e
P4 = d . (f – e)
P5 = (a + d) . ( e + h)
P6 = (b – d) . (f + h)
P7 = ( a ─ c) . ( e + g)
Using these submatrix products the matrix products are obtained by
r = P 5 + P4 – P2 + P6
s = P1 + P2
t = P3 + P4
u = P 5 + P1 – P3 – P7
This method works, as can easily be checked; e.g., s = P1 + P2 = (ag ─ ah) +
(ah + bh) = ag + bh. In this method there are 7 multiplications and 18 additions/
subtractions. For (n × n) matrices, it can be worth replacing one multiplication by
18 additions, since, for matrices, multiplication costs are much higher than addition
costs.
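The seven products and four combinations can be checked directly for 2 × 2
matrices with scalar entries (a small self-contained C++ check; the struct name is
our own):

```cpp
// One level of Strassen's scheme for the 2x2 matrix product
// ( r s )   ( a b ) ( e g )
// ( t u ) = ( c d ) ( f h )
struct Mat2 { long long r, s, t, u; };

Mat2 strassen2x2(long long a, long long b, long long c, long long d,
                 long long e, long long f, long long g, long long h)
{
    long long P1 = a * (g - h);
    long long P2 = (a + b) * h;
    long long P3 = (c + d) * e;
    long long P4 = d * (f - e);
    long long P5 = (a + d) * (e + h);
    long long P6 = (b - d) * (f + h);
    long long P7 = (a - c) * (e + g);
    return { P5 + P4 - P2 + P6,    // r = a*e + b*f
             P1 + P2,              // s = a*g + b*h
             P3 + P4,              // t = c*e + d*f
             P5 + P1 - P3 - P7 };  // u = c*g + d*h
}
```

Running it on the numbers of the worked exercise in Section 1.11 reproduces the
product matrix computed there.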
1.9 EXPONENTIATION
Exponentiating by Squaring is an algorithm used for the fast computation of large
powers of number x. It is also known as the square-and-multiply algorithm or
binary exponentiation. It implicitly uses the binary expansion of the exponent. It is
of quite general use, for example, in modular-arithmetic.
Squaring Algorithm
The following recursive algorithm computes x^n, for a positive integer n:

Power (x, n) = x,                               if n = 1
             = Power (x², n/2),                 if n is even
             = x · Power (x², (n ─ 1)/2),       if n > 2 is odd
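A direct C++ transcription of this recursion (our own, on 64-bit integers):

```cpp
// Recursive exponentiation by squaring: computes x^n for n >= 1
// in O(log n) multiplications.
long long power(long long x, long long n)
{
    if (n == 1)
        return x;
    if (n % 2 == 0)
        return power(x * x, n / 2);         // x^n = (x^2)^(n/2)
    return x * power(x * x, (n - 1) / 2);   // x^n = x * (x^2)^((n-1)/2)
}
```

Each recursive call halves the exponent, so the depth is about lg n.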
Further Applications
The same idea allows fast computation of large exponents modulo a number.
Especially in cryptography, it is useful to compute powers in a ring of integers modulo
q. It can also be used to compute integer powers in a group, using the rule
Power (x, ─n) = (Power (x, n))^(─1).
The method works in every semigroup and is often used to compute powers of
matrices.
Examples 1.9.1:
Computing

13789^722341 (mod 2345)

would take a very long time and a lot of storage space if the naive method is used:
compute 13789^722341 in full, then take the remainder when divided by 2345. Even
a more effective method will take a long time: square 13789, take the remainder
when divided by 2345, multiply the result by 13789, take the remainder, and so on.
This will take 722340 modular multiplications. The square-and-multiply algorithm is
based on the observation that 13789^722341 = 13789 · (13789²)^361170. So, if we
computed 13789², then the full computation would only take 361170 modular
multiplications. This is a gain of a factor of two. But since the new problem is of
the same type, we can apply the same observation again, once more approximately
halving the size.
The repeated application of this algorithm is equivalent to decomposing the exponent
(by a base conversion to binary) into a sequence of squares and products. For
example,

x⁷ = x⁴ · x² · x¹
   = (x²)² · x² · x
   = (x² · x)² · x

so the algorithm needs only 4 multiplications instead of 7 ─ 1 = 6, where
7 = (111)₂ = 2² + 2¹ + 2⁰.
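The modular variant reduces every intermediate result mod q, so the numbers never
grow large. A C++ sketch, in iterative form (function name is our own):

```cpp
// Square-and-multiply, reducing mod q at every step, so intermediate
// values stay below q^2.
long long pow_mod(long long x, long long n, long long q)
{
    long long result = 1;
    x %= q;
    while (n > 0)
    {
        if (n % 2 == 1)                 // current binary digit of n is 1
            result = (result * x) % q;
        x = (x * x) % q;                // square for the next binary digit
        n /= 2;
    }
    return result;
}
```

This is the routine one would use for the modular example above; it performs
O(log n) modular multiplications rather than n ─ 1 of them.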
Addition Chain

An addition chain for a positive integer n is a sequence that starts with 1 and ends
with n, in which every element is the sum of two (not necessarily distinct) earlier
elements. For example: 1, 2, 3, 6, 12, 24, 30, 31 is an addition chain for 31, of
length 7, since
2=1+1
3=2+1
6=3+3
12 = 6 + 6
24 = 12 + 12
30 = 24 + 6
31 = 30 + 1
Addition chains can be used for exponentiation: thus, for example, we need only
7 multiplications to calculate 5³¹:

5² = 5¹ · 5¹
5³ = 5² · 5¹
5⁶ = 5³ · 5³
5¹² = 5⁶ · 5⁶
5²⁴ = 5¹² · 5¹²
5³⁰ = 5²⁴ · 5⁶
5³¹ = 5³⁰ · 5¹
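Following the chain 1, 2, 3, 6, 12, 24, 30, 31 literally, each new power is a product
of two earlier ones. A C++ check, using base 3 so that the values fit comfortably in
64-bit integers (the base is our choice; the chain is the one above):

```cpp
// Exponentiation along the addition chain 1, 2, 3, 6, 12, 24, 30, 31:
// exactly 7 multiplications to compute x^31.
long long chain_pow31(long long x)
{
    long long x2  = x   * x;    // 1 + 1   = 2
    long long x3  = x2  * x;    // 2 + 1   = 3
    long long x6  = x3  * x3;   // 3 + 3   = 6
    long long x12 = x6  * x6;   // 6 + 6   = 12
    long long x24 = x12 * x12;  // 12 + 12 = 24
    long long x30 = x24 * x6;   // 24 + 6  = 30
    return x30 * x;             // 30 + 1  = 31
}
```

Binary exponentiation of x³¹ would use 7 multiplications too (4 squarings and
3 extra products after the first); for some exponents a well-chosen chain does
strictly better.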
Addition chain exponentiation
In mathematics, addition chain exponentiation is a fast method of exponentiation. It
works by creating a minimal-length addition chain that generates the desired
exponent. Each exponentiation in the chain can be evaluated by multiplying two of
the earlier exponentiation results.
This algorithm works better than binary exponentiation for high exponents.
However, it trades off space for speed, so it may not be good on over-worked
systems.
1.10 SUMMARY
The unit discusses various issues in respect of the technique viz., Divide and Conquer
for designing and analysing algorithms for solving problems. First, the general plan of
the Divide and conquer technique is explained and then an outline of a formal Divide-
and-conquer procedure is defined. The issue of whether at some stage to solve a
problem directly or whether to further subdivide it, is discussed in terms of the relative
efficiencies in the two alternative cases.
1.11 SOLUTIONS/ANSWERS
Ex.1) 1026732 × 732912

Writing x = 1026732 = 1026 × 10³ + 732 and y = 732912 = 732 × 10³ + 912, i.e.,
a = 1026, b = 732, c = 732 and d = 912, we get

x × y = (a × c) × 10⁶ + [(a + b) × (c + d) ─ a × c ─ b × d] × 10³ + (b × d)   … (A)

Though the above may be simplified in another, simpler way, yet as we want to
explain Karatsuba's method, we next compute the products

U = 1026 × 732
V = 732 × 912
P = 1758 × 1644 = (1026 + 732) × (732 + 912)

Let us consider only the product 1026 × 732; the other involved products may be
computed similarly and substituted in (A).

Let us write

U = 1026 × 732 = (10 × 10² + 26) × (07 × 10² + 32)
  = (10 × 7) × 10⁴ + 26 × 32 + [(10 + 26) × (7 + 32) ─ 10 × 7 ─ 26 × 32] × 10²
  = 70 × 10⁴ + 26 × 32 + (36 × 39 ─ 70 ─ 26 × 32) × 10²

At this stage, we do not apply Karatsuba's algorithm any further, and compute the
products of the 2-digit numbers by the conventional method.
Ex. 2) The number of elements in the given list is 15. Let us store these in an array,
say A[1..15]. Thus, initially low = 1 and high = 15 and, hence,
mid = (1 + 15)/2 = 8.

In the first iteration, for each value, the search algorithm compares the value to be
searched with A[8] = 78.

(i) Searching 15: As 15 < A[8] = 78, we get high = 8 ─ 1 = 7 and
    (new) mid = (1 + 7)/2 = 4, with A[4] = 26.
    As 15 < A[4] = 26, we get low = 1, high = 4 ─ 1 = 3.
    Therefore, (new) mid = (1 + 3)/2 = 2, with A[2] = 12.
    As 15 > A[2] = 12, we get low = 2 + 1 = 3, high = 3, and hence
    (new) mid = 3.
    As A[3] = 15 (the value to be searched), the algorithm terminates and returns
    the index value 3 as output.

(ii) Searching 93: As 93 > A[8] = 78, we get low = 8 + 1 = 9, high = 15,
    and (new) mid = (9 + 15)/2 = 12, where A[12] = 108.
    As 93 < A[12] = 108, we get high = 12 ─ 1 = 11, low = 9,
    and (new) mid = (9 + 11)/2 = 10, with A[10] = 93, the value to be searched.
    Hence the algorithm returns the index value 10 as output.

(iii) Searching 43: As 43 < A[8] = 78, we get high = 7, low = 1 and
    (new) mid = 4, with A[4] = 26.
    As 43 > A[4] = 26, we get low = 4 + 1 = 5, high = 7, and
    (new) mid = 6, with A[6] = 48.
    As 43 < A[6] = 48, we get low = 5, high = 6 ─ 1 = 5,
    hence mid = 5, and A[5] = 35.
    As 43 > A[5] = 35, we get low = 5 + 1 = 6. But, at this stage, low is not less
    than or equal to high, and hence the algorithm returns ─1, indicating failure
    to find the given value in the array.
Let

( a  b )   (  5  6 )        ( e  g )   ( ─7  6 )
( c  d ) = ( ─4  3 )  and   ( f  h ) = (  5  9 )

Then

P1 = a . (g ─ h) = 5 × (6 ─ 9) = ─15
P2 = (a + b) . h = (5 + 6) × 9 = 99
P3 = (c + d) . e = (─4 + 3) × (─7) = 7
P4 = d . (f ─ e) = 3 × (5 ─ (─7)) = 36
P5 = (a + d) . (e + h) = (5 + 3) × (─7 + 9) = 16
P6 = (b ─ d) . (f + h) = (6 ─ 3) × (5 + 9) = 42
P7 = (a ─ c) . (e + g) = (5 ─ (─4)) × (─7 + 6) = ─9

Hence the product matrix is

( r  s )
( t  u )

where

r = P5 + P4 ─ P2 + P6 = 16 + 36 ─ 99 + 42 = ─5
s = P1 + P2 = ─15 + 99 = 84
t = P3 + P4 = 7 + 36 = 43
u = P5 + P1 ─ P3 ─ P7
  = 16 + (─15) ─ 7 ─ (─9)
  = 16 ─ 15 ─ 7 + 9
  = 3
UNIT 2 GRAPH ALGORITHMS

2.0 INTRODUCTION

A number of problems and games like chess, tic-tac-toe, etc. can be formulated and
solved with the help of graphical notations. The wide variety of problems that can
be solved by using graphs ranges from searching for particular information to
finding a good or bad move in a game. In this Unit, we discuss a number of
problem-solving techniques based on graph notations, including the ones involving
searches of graphs, and the application of these techniques in solving game and
sorting problems.
2.1 OBJECTIVES
After going through this Unit, you should be able to:
• explain and apply various graph search techniques, viz., Depth-First Search
  (DFS), Breadth-First Search (BFS), Best-First Search, and the Minimax
  Principle;
• discuss the relative merits and demerits of these search techniques, and
• apply graph-based problem-solving techniques to solve sorting problems and
  games.
2.2 EXAMPLES
To begin with, we discuss the applicability of graphs to a popular game known as
NIM.
Nim is a game for 2 players, in which the players take turns alternately. Initially the
players are given a position consisting of several piles, each pile having a finite
number of tokens. On each turn, a player chooses one of the piles and then removes at
least one token from that pile. The player who picks up the last token wins.
Marienbad is a variant of the nim game, and it is played with matches. The rules of
this game are similar to nim and are given below:
(1) It is a two-player game.
(2) It starts with n matches (n must be greater than or equal to 2, i.e., n >= 2).
(3) The winner of the game is the one who takes the last match; whosoever is
    left with no matches to take, loses the game.
(4) On the very first turn, up to n ─ 1 matches can be taken by the player having
    the very first move.
(5) On the subsequent turns, one must remove at least one match and at most
    twice the number of matches picked up by the opponent in the last move.
Before going into a detailed discussion through an example, let us explain what may
be the possible states which indicate different stages in the game. At any stage, the
following two numbers are significant:
(i) The total number of match sticks still available, after the pickings by the
    players so far.
(ii) The maximum number of match sticks that the player having the move can
    pick up.
After discussing some of possible states, we elaborate the game described above
through the following example.
Example 2.2.1:
Let the initial number of matches be 6, and let player A have the first move. What
should be A's strategy to win, for his first move? Generally, A will consider all
possible moves and choose the best one, as follows:
• if A takes 5 matches, that leaves just one for B; then B will take it and win
  the game;
• if A takes 4 matches, that leaves 2 for B; then B will take both and win;
• if A takes 3 matches, that leaves 3 for B; then B will take all 3 and win;
• if A takes 2 matches, that leaves 4 for B; then B will take all 4 and win;
• if A takes 1 match, that leaves 5 for B. In the next step, B can take 1 or 2
  (recall that B can take at most twice the number that A just took), and B will
  move to one of the states (4, 2) or (3, 3), both of which are winning positions
  for A: from (3, 3), A can take all the available sticks, so that no sticks remain
  for B to pick up, and from (4, 2), A can take one stick, leaving B in the losing
  position (3, 2).
Looking at this reasoning process, it is clear that the best move for A is to take just
one match stick.
The above process can be expressed by a directed graph, where each node
corresponds to a position (state) and each edge corresponds to a move between two
positions. Each node is expressed by a pair of numbers < i, j >, 0 <= j <= i, where
i: the number of matches left;
j: the upper limit on the number of matches which can be removed in the next move,
that is, any number of matches between 1 and j can be taken in the next move.
In the directed graph shown below, rectangular nodes denote losing nodes and oval
nodes denote winning nodes.

[Figure 1: the game graph for the initial position < 6, 5 >, with nodes < 6, 5 >,
< 5, 2 >, < 4, 4 >, < 4, 2 >, < 3, 3 >, < 3, 2 >, < 2, 2 >, < 1, 1 > and < 0, 0 >.]
• The terminal node < 0, 0 >, from which there is no legal move, is a losing
  position.
• A nonterminal node is a winning node (denoted by an oval) if at least one of
  its successors is a losing node, because the player currently having the move
  can then leave his opponent in a losing position.
• A nonterminal node is a losing node (denoted by a rectangle) if all of its
  successors are winning nodes; the player currently having the move cannot
  avoid leaving his opponent in one of these winning positions.
How do we determine the winning nodes and losing nodes in such a directed graph?
Intuitively, we can start at the losing node < 0, 0 > and work backwards, according
to the definitions of winning node and losing node. A node is a losing node, for the
current player, if every move leads to a state from which the opponent can force the
current player to lose. On the other hand, a node is a winning node if the current
player can make at least one move that leaves the opponent in a state from which
the opponent cannot win. For instance, from any of the nodes < 1, 1 >, < 2, 2 >,
< 3, 3 > and < 4, 4 >, a player can make a move and leave his opponent in the
position < 0, 0 >; thus these 4 nodes are winning nodes. From position < 3, 2 >,
two moves are possible, but both of them take the opponent to a winning position,
so it is a losing node. The initial position < 6, 5 > has one move which takes the
opponent to a losing position, so it is a winning node. Continuing this process
backwards, we can mark the types of all the nodes in the graph. A recursive C
program for the purpose can be implemented as follows:
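The original program is not reproduced in this copy; the following sketch (our own
function name, written in the C-like style used elsewhere in these units) classifies a
state < i, j > recursively according to the definitions above:

```cpp
#include <algorithm>

// A state < i, j >: i matches remain, and the player to move may take
// between 1 and j of them.  The state is winning if some legal move
// leaves the opponent in a losing state.  After taking m matches, the
// opponent faces < i - m, min(2m, i - m) >.
bool winning(int i, int j)
{
    if (i == 0)                       // terminal node < 0, 0 >: losing
        return false;
    for (int m = 1; m <= std::min(i, j); m++)
    {
        int left = i - m;
        if (!winning(left, std::min(2 * m, left)))
            return true;              // found a move to a losing node
    }
    return false;                     // all successors are winning nodes
}
```

On the example of Figure 1, this procedure classifies < 6, 5 > as winning and
< 3, 2 > and < 5, 2 > as losing, matching the discussion above.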
Ex.1) Draw a directed graph for a game of Marienbad when the number of match
sticks, initially, is 5.
Preconditioning

Consider a scenario in which a problem might have many similar situations or
instances which are required to be solved. In such a situation, it might be useful to
spend some time and energy in calculating auxiliary solutions (i.e., attaching some
extra information to the problem space) that can be used afterwards to speed up the
process of finding the solution of each of these instances. This is known as
preconditioning. Although some time has to be spent in calculating/finding the
auxiliary solutions, it has been seen that in the final trade-off, the benefit achieved
in terms of speeding up the process of finding the solutions may be much more than
the additional cost incurred in finding the auxiliary/additional information.
In other words, let x be the time taken to solve the problem without preconditioning,
y be the time taken to solve the problem with the help of the auxiliary results (i.e.,
after preconditioning) and let t be the time taken in preconditioning the problem space,
i.e., the time taken in calculating the additional/auxiliary information. Then, to solve n
typical instances, provided that y < x, preconditioning will be beneficial only
when

nx > t + ny
i.e., nx – ny > t
or n > t / (x – y)
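As a small worked example with hypothetical figures x = 10 ms, y = 2 ms and t = 400 ms: preconditioning pays off once n > 400/8 = 50, i.e., from the 51st instance onwards. A one-line C helper for this breakeven point (the function name is our own):

```c
#include <assert.h>

/* Smallest n for which preconditioning pays off, i.e., the smallest
   integer n with n > t / (x - y).  Requires y < x.                  */
int breakeven(int x, int y, int t)
{
    return t / (x - y) + 1;   /* integer floor division, then + 1    */
}
```

For the hypothetical figures above, breakeven(10, 2, 400) returns 51.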
Graph Algorithms
Preconditioning is also useful when only a few instances of a problem need to be
solved. Suppose we need a solution to a particular instance of a problem, and we need
it quickly. One way is to solve all the relevant instances in advance and store
their solutions so that they can be provided quickly whenever needed. But this is a
very inefficient and impractical approach, i.e., finding the solutions of all instances
when the solution of only one is needed. On the other hand, a popular alternative is to
calculate and attach some additional information to the problem space which will be
useful to speed up the process of finding the solution of any given instance that is
encountered.
As an example, let us consider the problem of finding the ancestor of any given node
in a rooted tree (which may be a binary or a general tree).
In any rooted tree, node u will be an ancestor of node v, if node u lies on the path
from the root to v. Also we must note that every node is an ancestor of itself and the
root is an ancestor of all nodes in the tree, including itself. Let us suppose we are
given a pair of nodes (u, v) and we are to find whether u is an ancestor of v or not.
If the tree contains n nodes, then any given instance can take Ω(n) time in the worst
case. But, if we attach some relevant information to each of the nodes of the tree,
then after spending O(n) time in preconditioning, we can answer any such ancestor
query in constant time.
Now to precondition the tree, we first traverse the tree in preorder and calculate the
precedence of each node in this order; similarly, we traverse the tree in postorder and
calculate the precedence of each node. For a node u, let precedepre[u] be its
precedence in preorder and let precedepost[u] be its precedence in postorder.
Let u and v be the two given nodes. Then, according to the rules of preorder and
postorder traversal, we can see that:
In preorder traversal, as the root is visited before the left subtree and the right
subtree,
If precedepre[u] <= precedepre[v], then
u is an ancestor of v or u is to the left of v in the tree.
In postorder traversal, the root is visited last, because first we visit the left subtree,
then the right subtree and at the last the root, so,
If precedepost[u] >= precedepost[v], then
u is an ancestor of v or u is to the right of v in the tree.
So for u to be an ancestor of v, both of the following conditions have to be satisfied:
precedepre[u] <= precedepre[v] and precedepost[u] >= precedepost[v].
Thus, we can see that after spending some time in calculating the preorder and
postorder precedence of each node in the tree, the ancestor of any node can be found
in constant time.
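This preconditioning can be sketched in C (the array-based tree representation, the bound MAXN and all names are our own assumptions): a single traversal assigns both precedence numbers, after which each ancestor query is answered in O(1).

```c
#include <assert.h>

#define MAXN 16
int nchild[MAXN];          /* number of children of each node        */
int child[MAXN][4];        /* children, in left-to-right order       */
int precede_pre[MAXN], precede_post[MAXN];
static int precnt = 0, postcnt = 0;

/* One recursive walk fills both precedence arrays.                  */
void precondition(int u)
{
    precede_pre[u] = ++precnt;          /* root before its subtrees  */
    for (int i = 0; i < nchild[u]; i++)
        precondition(child[u][i]);
    precede_post[u] = ++postcnt;        /* root after its subtrees   */
}

/* Constant-time query after the O(n) preconditioning above.         */
int is_ancestor(int u, int v)
{
    return precede_pre[u] <= precede_pre[v] &&
           precede_post[u] >= precede_post[v];
}
```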
[Figure: an example rooted tree on the nodes B, C, D, E, F, G and H]
Design Techniques-I
2.4 DEPTH-FIRST SEARCH
The depth-first search is a search strategy in which the examination of a given vertex
u is delayed when a new vertex, say v, is reached, and the examination of v is delayed
when a new vertex, say w, is reached, and so on. When a leaf is reached (i.e., a node
which does not have a successor node), the examination of the leaf is carried out. And
then the immediate ancestor of the leaf is examined. The process of examination is
carried out in the reverse order of reaching the nodes.
In depth-first search, for any given vertex u, we find or explore or discover the first
adjacent vertex v (in its adjacency list) not already discovered. Then, instead of
exploring other nodes adjacent to u, the search starts from vertex v, which finds its
first adjacent vertex not already known or discovered. The whole process is repeated
for each newly discovered node. When a vertex adjacent to v is explored down to a
leaf, we backtrack to explore the remaining adjacent vertices of v. So we search
farther or deeper in the graph whenever possible. This process continues until we
discover all the vertices reachable from the given source vertex. If any
undiscovered vertices still remain, then a next source is selected and the same search
process is repeated. This whole process goes on until all the vertices of the graph are
discovered.
The vertices have three different statuses during the process of traversal or
searching, the statuses being: unknown, discovered and visited. Initially all the
vertices have the status 'unknown'; after being explored, the status of a vertex is
changed to 'discovered', and after all vertices adjacent to a given vertex are discovered
its status is changed to 'visited'. This technique ensures that in the depth-first forest,
each vertex belongs to only one depth-first tree, so these trees are disjoint.
Because we leave partially visited vertices and move ahead, to backtrack later, a stack
is required as the underlying data structure to hold vertices. In the recursive
version of the algorithm given below, the stack is implemented implicitly;
however, if we write a non-recursive version of the algorithm, the stack operations
have to be specified explicitly.
In the algorithm, we assume that the graph is represented using adjacency list
representation. To store the parent or predecessor of a vertex in the depth-first search,
we use an array parent[]. Status of a „vertex‟ i.e., unknown, discovered, or visited is
stored in the array status. The variable time is taken as a global variable. V is the
vertex set of the graph G.
In depth-first search algorithm, we also timestamp each vertex. So the vertex u has
two times associated with it, the discovering time d[u] and the termination time t[u].
The discovery time corresponds to the status change of a vertex from unknown to
discovered, and termination time corresponds to status change from discovered to
visited. For the initial input graph when all vertices are unknown, time is initialized to
0. When we start from the source vertex, time is taken as 1 and with each new
discovery or termination of a vertex, the time is incremented by 1. Although DFS
algorithm can be written without time stamping the vertices, time stamping of vertices
helps us in a better understanding of this algorithm. However, one drawback of time
stamping is that the storage requirement increases.
Also in the algorithm we can see that for any given node u, its discovering time will
be less than its termination time i.e., d[u] < t[u].
The algorithm is:
Program
DFS(G)
//This fragment of the algorithm performs the initialization
//and starts the depth-first search process
1 {for each vertex u ε V
2 { status[u] = unknown;
3 parent[u] = NULL; }
4 time = 0 }
5 for each vertex u ε V
6 {if status[u] == unknown
7 VISIT(u)}

VISIT(u)
1 {status[u] = discovered;
2 time = time + 1;
3 d[u] = time;
4 for each vertex v ε V adjacent to u
5 {if status[v] == unknown
6 parent[v] = u;
7 VISIT(v);}
8 time = time + 1;
9 t[u] = time;
10 status[u] = visited;}
In the procedure DFS, the first for-loop initializes the status of each vertex to
unknown and the parent or predecessor vertex to NULL. Then it creates a global
variable time and initializes it to 0. In the second for-loop belonging to this procedure,
for each node in the graph, if that node is still unknown, the VISIT(u) procedure is
called. Now we can see that every time the VISIT(u) procedure is called, the vertex u
becomes the root of a new tree in the forest of depth-first search.
Whenever the procedure VISIT(u) is called with parameter u, the vertex u is
unknown. So in the procedure VISIT(u), first the status of vertex u is changed to
'discovered', time is incremented by 1 and it is stored as the discovery time of vertex u
in d[u].
When the VISIT procedure is called for the first time, d[u] will be 1. In the
for-loop, for each given vertex u, every unknown vertex adjacent to u is visited
recursively and the parent[] array is updated. When the for-loop concludes, i.e., when
every vertex adjacent to u is discovered, the time is incremented by 1 and is stored as
the termination time of u, i.e., t[u], and the status of vertex u is changed to 'visited'.
In the procedure DFS(), each for-loop takes time O(|V|), where |V| is the number of
vertices in V. The procedure VISIT is called once for every vertex of the graph. In the
procedure VISIT, the for-loop is executed a number of times equal to the number of
edges emerging from that node and not yet traversed. Considering the adjacency lists
of all nodes, the total number of edges traversed is O(|E|), where |E| is the number of
edges in E. The running time of DFS is, therefore, O(|V| + |E|).
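The pseudocode above can be turned into a small runnable program. The following C sketch uses an adjacency matrix instead of adjacency lists, purely to keep it short (the matrix, the bound MAXV and the name time_stamp are our own choices):

```c
#define MAXV 8
enum { UNKNOWN, DISCOVERED, VISITED };

int adj[MAXV][MAXV];                 /* adjacency matrix            */
int n;                               /* number of vertices          */
int status[MAXV], parent[MAXV], d[MAXV], t[MAXV];
int time_stamp;                      /* global time counter         */

void visit(int u)
{
    status[u] = DISCOVERED;
    d[u] = ++time_stamp;             /* discovery time              */
    for (int v = 0; v < n; v++)
        if (adj[u][v] && status[v] == UNKNOWN) {
            parent[v] = u;
            visit(v);
        }
    t[u] = ++time_stamp;             /* termination time            */
    status[u] = VISITED;
}

void dfs(void)
{
    for (int u = 0; u < n; u++) { status[u] = UNKNOWN; parent[u] = -1; }
    time_stamp = 0;
    for (int u = 0; u < n; u++)      /* one tree per unknown source */
        if (status[u] == UNKNOWN)
            visit(u);
}
```

On the path 0─1─2 plus an isolated vertex 3, dfs() produces d = 1, 2, 3, 7 and t = 6, 5, 4, 8, so the maximum timestamp is 2|V| = 8, in line with the observation made later in the text.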
Example 2.4.1:
For the graph given in Figure 2.4.1.1, use DFS to visit the various vertices. The vertex
D is taken as the starting vertex and, if there is more than one vertex adjacent to a
given vertex, then the adjacent vertices are visited in lexicographic order.
In the following,
(i) the label i/ indicates that the corresponding vertex is the ith discovered vertex.
(ii) the label i/j indicates that the corresponding vertex is the ith discovered vertex
and jth in the combined sequence of discovered and visited.
Figure 2.4.1.2: D has two neighbors; by convention A is visited first, i.e., the status of A changes to
'discovered', d[A] = 2
Figure 2.4.1.3: A has two unknown neighbors B and C, so the status of B changes to 'discovered', i.e.,
d[B] = 3
Figure 2.4.1.5: All of E's neighbors are discovered, so the status of vertex E is changed to 'visited' and
t[E] = 5
Figure 2.4.1.7: Similarly, vertices G, C and H are discovered respectively, with d[G] = 7, d[C] = 8
and d[H] = 9
Figure 2.4.1.8: Now, as all the neighbors of H are already discovered, we backtrack to C, storing
H's termination time as t[H] = 10
Figure 2.4.1.9: We find the termination time of remaining nodes in reverse order, backtracking
along the original path ending with D.
The resultant parent-pointer tree has its root at D, since this is the first node visited.
Each new node visited becomes the child of the most recently visited node. Also we
can see that while D is the first node to be 'discovered', it is the last node terminated.
This is due to recursion: each of D's neighbors must be discovered and
terminated before D can be terminated. Also, all the edges of the graph which are not
used in the traversal are between a node and its ancestor. This property of depth-first
search differentiates it from breadth-first search.
Also we can see that the maximum termination time for any vertex is 16, which is
twice the number of vertices in the graph because time is incremented only when a
vertex is discovered or terminated and each vertex is discovered once and terminated
once.
Note: We should remember that in depth-first search the third case of overlapping
intervals is not possible, i.e., the situation given below is not possible, because of
recursion.
(2) Another important property of depth-first search (sometimes called the white path
property) is that v is a descendant of u if and only if at the time of discovery of
u, there is at least one path from u to v containing only unknown vertices (i.e.,
white vertices, or vertices not yet found or discovered).
(3) Depth-first search can be used to find connected components in a given graph:
one useful aspect of the depth-first search algorithm is that it traverses one
connected component at a time, and hence it can be used to identify the connected
components in a given graph.
(4) Depth-first search can also be used to find cycles in an undirected graph:
we know that an undirected graph has a cycle if and only if, at some particular
point during the traversal, when u is already discovered, one of the neighbors v
of u is also already discovered and is not the parent or predecessor of u.
We can prove this property by the argument that if we discover v and find that
u is already discovered but u is not the parent of v, then u must be an ancestor of v,
and since we traveled from u to v via a different route, there is a cycle in the graph.
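Property (4) can be sketched in C (the adjacency-matrix representation and all names are our own): a cycle exists exactly when the DFS meets an already discovered neighbor that is not the parent of the current vertex.

```c
#define MAXV 8
int adj[MAXV][MAXV], n;           /* undirected adjacency matrix       */
int state[MAXV];                  /* 0 = unknown, 1 = discovered       */

/* Returns 1 if a cycle is reachable from u; par is u's DFS parent.    */
int cycle_from(int u, int par)
{
    state[u] = 1;
    for (int v = 0; v < n; v++)
        if (adj[u][v]) {
            if (state[v] == 1 && v != par)
                return 1;         /* discovered non-parent neighbor    */
            if (state[v] == 0 && cycle_from(v, u))
                return 1;
        }
    return 0;
}

int graph_has_cycle(void)
{
    for (int u = 0; u < n; u++) state[u] = 0;
    for (int u = 0; u < n; u++)   /* cover every connected component   */
        if (state[u] == 0 && cycle_from(u, -1))
            return 1;
    return 0;
}
```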
Ex.3) Trace how DFS traverses (i.e., discover and visits) the graph given below
when starting node/vertex is B.
[Figure: an undirected graph on the vertices B, C, E, F, G and H]
To perform depth first search in directed graphs, the algorithm given above can be
used with minor modifications. The main difference exists in the interpretation of an
“adjacent vertex”. In a directed graph vertex v is adjacent to vertex u if there is a
directed edge from u to v. If a directed edge exists from u to v but not from v to u,
then v is adjacent to u but u is not adjacent to v.
Because of this change, the algorithm behaves differently. Some of the previously
given properties may no longer necessarily apply in this new situation.
Edge Classification
Another interesting property of depth-first search is that the search can be used to
classify the different types of edges of the directed graph G(V, E). This edge
classification gives us some more information about the graph. The edges of the
depth-first forest itself are called tree edges; an edge from a vertex to one of its
ancestors in the forest is a back edge; an edge (other than a tree edge) from a vertex
to one of its descendants is a forward edge; all remaining edges are cross edges.
Note: In an undirected graph, every edge is either a tree edge or a back edge, i.e.,
forward edges or cross edges are not possible.
Example 2.4.2:
In the following directed graph, we consider the adjacent nodes in increasing
alphabetic order, and let the starting vertex be a.
Figure 2.4.2.2: a has two unknown neighbors b and d; by convention b is visited first, i.e., the status
of b changes to discovered, d[b] = 2
Figure 2.4.2.3: b has two unknown neighbors c and d, by convention c is discovered first i.e.,
d[c] = 3
Figure 2.4.2.4: c has only a single neighbor a, which is already discovered, so c is terminated, i.e.,
t[c] = 4
Figure 2.4.2.5: The algorithm backtracks recursively to b; the next unknown neighbor is d, whose
status is changed to discovered, i.e., d[d] = 5
Figure 2.4.2.7: The algorithm backtracks recursively to b, which has no unknown neighbors left, so
b is terminated ('visited'), i.e., t[b] = 7
Figure 2.4.2. 8: The algorithm backtracks to a which has no unknown neighbors so a is visited i.e.,
t[a] = 8.
Figure 2.4.2.9: The first connected component is completely visited, so the algorithm moves to the
next component, starting from e (because we are moving in increasing alphabetic
order); e is 'discovered', i.e., d[e] = 9
Figure 2.4.2. 10: e has two unknown neighbors f and g, by convention we discover f i.e.,
d[f] = 10
Figure 2.4.2. 12: The algorithm backtracks to e, which has g as the next ‘unknown’ neighbor, g is
‘discovered’ i.e., d[g] = 12
Figure 2.4.2.13: The only neighbor of g is e, which is already discovered, so g is terminated
('visited'), i.e., t[g] = 13
Figure 2.4.2.14: The algorithm backtracks to e, which has no unknown neighbors left, so
e is terminated ('visited'), i.e., t[e] = 14
Some more properties of depth-first search (in directed graphs)
(1) Given a directed graph, depth-first search can be used to determine whether it
contains a cycle.
(2) Cross-edges go from a vertex of higher discovery time to a vertex of lower
discovery time. Also a forward edge goes from a vertex of lower discovery time
to a vertex of higher discovery time.
(3) Tree edges, forward edges and cross edges all go from a vertex of higher
termination time to a vertex of lower termination time, whereas back edges go
from a vertex of lower termination time to a vertex of higher termination time.
(4) A graph is acyclic if and only if any depth-first search forest of graph G yields
no back edges. This fact can be realized from property (3) explained above: if
there are no back edges, then all edges will go from a vertex of higher
termination time to a vertex of lower termination time, so there will be no
cycles. So the property which checks cycles in a directed graph can be verified
by ensuring there are no back edges.
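The acyclicity test of property (4) can be sketched in C; here state 1 corresponds to "discovered but not yet terminated", i.e., on the current DFS path, so reaching such a vertex is exactly a back edge (the matrix representation and all names are our own assumptions):

```c
#define MAXV 8
int dadj[MAXV][MAXV], dn;      /* directed adjacency matrix           */
int dstate[MAXV];              /* 0 unknown, 1 discovered, 2 visited  */

int back_edge_from(int u)
{
    dstate[u] = 1;                       /* on the current DFS path   */
    for (int v = 0; v < dn; v++)
        if (dadj[u][v]) {
            if (dstate[v] == 1)
                return 1;                /* back edge -> cycle        */
            if (dstate[v] == 0 && back_edge_from(v))
                return 1;
        }
    dstate[u] = 2;                       /* terminated                */
    return 0;
}

int digraph_has_cycle(void)
{
    for (int u = 0; u < dn; u++) dstate[u] = 0;
    for (int u = 0; u < dn; u++)
        if (dstate[u] == 0 && back_edge_from(u))
            return 1;
    return 0;
}
```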
2.5 BREADTH-FIRST SEARCH
For recording the status of each vertex, we note whether it is still unknown, whether
it has been discovered (or found), and whether all of its adjacent vertices have also
been discovered. The vertices are termed unknown, discovered and visited,
respectively. So if (u, v) ε E and u is visited, then v will be either discovered or
visited, i.e., either v has just been discovered or the vertices adjacent to v have also
been found or visited.
As breadth-first search forms a breadth-first tree, if the edge (u, v) causes vertex v to
be discovered from the adjacency list of an already discovered vertex u, then we say
that u is the parent or predecessor vertex of v. Each vertex is discovered only once.
The data structure we use in this algorithm is a queue to hold the vertices. In this
algorithm we assume that the graph is represented using the adjacency list
representation. front() is used to access the element at the front of the queue. The
empty() procedure returns true if the queue is empty; otherwise it returns false. The
queue is represented as Q. The procedures enqueue() and dequeue() are used to
insert and delete an element from the queue, respectively. The data structure status[]
is used to store the status of each vertex as unknown, discovered or visited.
The algorithm works as follows. Lines 1-2 initialize each vertex to 'unknown'.
Because we have to start searching from vertex s, line 3 gives the status 'discovered'
to vertex s. Line 4 inserts the initial vertex s in the queue. The while loop contains the
statements from line 5 to the end of the algorithm. The while loop runs as long as
there remain 'discovered' vertices in the queue, and we can see that the queue will
only contain 'discovered' vertices. Line 6 takes the element u at the front of the queue,
and in lines 7 to 12 the adjacency list of vertex u is traversed; each unknown vertex
v in the adjacency list of u has its status marked as discovered and its parent marked
as u, and is then inserted in the queue. In lines 13-15, when there are no more
elements in the adjacency list of u, vertex u is removed from the queue, its status is
changed to 'visited' and it is also printed as visited.
The algorithm given above can also be improved by storing the distance of each
vertex u from the source vertex s using an array distance[] and also by permanently
recording the predecessor or parent of each discovered vertex in the array parent[]. In
fact, the distance of each reachable vertex from the source vertex as calculated by the
BFS is the shortest distance in terms of the number of edges traversed. So next we
present the modified algorithm for breadth first search.
In the above algorithm, the newly inserted line 3 initializes the parent of each vertex
to NULL, line 4 initializes the distance of each vertex from the source vertex to
infinity, line 6 initializes the distance of the source vertex s to 0, line 7 initializes the
parent of the source vertex s to NULL, line 14 records the parent of v as u, and
line 15 calculates the shortest distance of v from the source vertex s as the distance
of u plus 1.
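The printed listing of the modified algorithm is not reproduced here; the following C sketch follows its description (the array-based queue, the adjacency matrix and the use of INT_MAX as a stand-in for infinity are our own simplifications):

```c
#include <limits.h>

#define MAXV 8
enum { UNKNOWN, DISCOVERED, VISITED };

int adj[MAXV][MAXV], n;
int status[MAXV], parent[MAXV], dist[MAXV];
int Q[MAXV], head, tail;                 /* simple array-based queue  */

static void enqueue(int u) { Q[tail++] = u; }
static int  dequeue(void)  { return Q[head++]; }
static int  empty(void)    { return head == tail; }

void bfs(int s)
{
    for (int u = 0; u < n; u++) {
        status[u] = UNKNOWN; parent[u] = -1; dist[u] = INT_MAX;
    }
    head = tail = 0;
    status[s] = DISCOVERED; dist[s] = 0;
    enqueue(s);
    while (!empty()) {
        int u = dequeue();
        for (int v = 0; v < n; v++)
            if (adj[u][v] && status[v] == UNKNOWN) {
                status[v] = DISCOVERED;
                parent[v] = u;
                dist[v] = dist[u] + 1;   /* shortest edge count       */
                enqueue(v);
            }
        status[u] = VISITED;             /* all neighbors discovered  */
    }
}
```

As the text notes, dist[] then holds, for every reachable vertex, the shortest distance from s in terms of the number of edges traversed.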
Example 2.5.3:
In the figure given below, we can see the graph given initially, in which only the
source s is discovered.
We take the unknown (i.e., undiscovered) adjacent vertices of s and insert them in
the queue, first a and then b. The values of the data structures are modified as given
below:
Next, after completing the visit of a, we get the figure and the data structures as
given below:
Figure 2: We take unknown (i.e., undiscovered) adjacent vertices of s and insert them
in the queue.
Figure 3: Now the gray (i.e., discovered) vertices in the adjacency list of u are b, c and d, and we
can visit any of them, depending upon which vertex was inserted in the queue first. As,
in this example, we inserted b first, and it is now at the front of the queue, next we
will visit b.
Figure 5: Vertices e and f are discovered as adjacent vertices of c, so they are inserted
in the queue and then c is removed from the queue and is visited.
The best first search belongs to a branch of search algorithms known as heuristic
search algorithms. The basic idea of heuristic search is that, rather than trying all
possible search paths at each step, we try to find which paths seem to be getting us
nearer to our goal state. Of course, we can't be sure that we are really near to our goal
state; it could be that we have to take some really complicated and circuitous
sequence of steps to get there. But we might be able to make a good guess.
Heuristics are used to help us make that guess.
To use any heuristic search we need an evaluation function that scores a node in the
search tree according to how close to the goal or target node it seems to be. It will just
be an estimate, but it should still be useful. However, the estimate should always be
on the lower side to find the optimal or lowest-cost path. For example, to find the
optimal path/route between Delhi and Jaipur, an estimate could be the straight aerial
distance between the two cities.
There is a whole batch of heuristic search algorithms, e.g., hill climbing, best first
search, A*, AO*, etc., but here we will be focussing on best first search.
Best First Search combines the benefits of both depth first and breadth first search
by moving along a single path at a time but changing paths whenever some other
path looks more promising than the current path.
At each step in the best first search, we first generate the successors of the current
node and then apply a heuristic function to find the most promising child/successor.
We then expand/visit the chosen successor, i.e., find its unknown successors. If one
of the successors is a goal node we stop. If not, then all these nodes are added to the
list of nodes generated or discovered so far. During this process of generating
successors, a bit of depth-first search is performed, but ultimately, if the solution,
i.e., the goal node, is not found, then at some point the newly
found/discovered/generated node will have a less promising heuristic value than one
of the top-level nodes which were ignored previously. If this is the case, then we
backtrack to the previously ignored but currently most promising node and we
expand/visit that node. But when we backtrack, we do not forget the older branch
from where we have come. Its last node remains in the list of nodes which have been
discovered but not yet expanded/visited. The search can always return to it if at some
stage during the search process it again becomes the most promising node to move
ahead.
Choosing the most appropriate heuristic function for a particular search problem is not
easy, and it also incurs some cost. One of the simplest heuristic functions is an
estimate of the cost of getting to a solution from a given node; this cost could be in
terms of the number of expected edges or hops to be traversed to reach the goal node.
We should always remember that in best first search, although one path might be
selected at a time, the others are not thrown away, so that they can be revisited in the
future if the selected path becomes less promising.
Although the example we have given below shows the best first search of a tree, it is
sometimes important to search a graph instead of a tree, so we have to take care that
duplicate paths are not pursued. To perform this job, an algorithm will work by
searching a directed graph in which a node represents a point in the problem space.
Each node, in addition to describing the problem space and the heuristic value
associated with it, will also contain a link or pointer to its best parent and pointers to
its successor nodes. Once the goal node is found, the parent links will allow us to
trace the path from the source node to the goal node. The list of successors will allow
it to pass the improvement down to its successors if any of them already exist.
The OPEN list is the list of nodes which have been found but not yet expanded,
i.e., the nodes which have been discovered/generated but whose
children/successors are not yet discovered. The OPEN list can be implemented in
the form of a queue in which the nodes are arranged in order of decreasing
priority from the front, i.e., the node with the most promising heuristic value (i.e.,
the highest priority node) will be at the first place in the list.
The CLOSED list contains the expanded/visited nodes, i.e., the nodes whose
successors have also been generated. We require these nodes to be kept in memory
if we want to search a graph rather than a tree, since whenever a new node is
generated we need to check if it has been generated before.
The algorithm can be written as:
Best First Search
1. Place the start node on the OPEN list.
2. Create a list called CLOSED i.e., initially empty.
3. If the OPEN list is empty search ends unsuccessfully.
4. Remove the first node on OPEN list and put this node on CLOSED list.
5. If this is a goal node, search ends successfully.
6. Generate successors of this node:
For each successor :
(a) If it has not been discovered / generated before, i.e., it is not on OPEN,
evaluate this node by applying the heuristic function, add it to OPEN
and record its parent.
(b) If it has been discovered / generated before, change the parent if the new
path is better than the previous one. In that case update the cost of getting to
this node and to any successors that this node may already have.
7. Reorder the list OPEN, according to the heuristic merit.
8. Go to step 3.
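The steps above can be sketched compactly in C (the heuristic values h[], the graph representation and the linear scan for the minimum are our own simplifications; the parent-relinking of step 6(b) is omitted for brevity):

```c
#define MAXV 8
int adj[MAXV][MAXV], n;
int h[MAXV];                 /* heuristic estimate to the goal          */
int parent[MAXV];

/* Returns 1 if goal is reached; parent[] then traces the path back.   */
int best_first(int start, int goal)
{
    int open[MAXV], open_cnt = 0, seen[MAXV] = {0};
    open[open_cnt++] = start;            /* step 1: start on OPEN      */
    seen[start] = 1;
    parent[start] = -1;
    while (open_cnt > 0) {               /* step 3: fail if OPEN empty */
        int best = 0;                    /* OPEN node of smallest h    */
        for (int i = 1; i < open_cnt; i++)
            if (h[open[i]] < h[open[best]])
                best = i;
        int u = open[best];
        open[best] = open[--open_cnt];   /* step 4: move u to CLOSED   */
        if (u == goal)
            return 1;                    /* step 5: success            */
        for (int v = 0; v < n; v++)      /* step 6: generate successors */
            if (adj[u][v] && !seen[v]) {
                seen[v] = 1;
                parent[v] = u;
                open[open_cnt++] = v;
            }
    }
    return 0;
}
```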
Example
In this example, each node has a heuristic value showing the estimated cost of getting
to a solution from this node. The example shows part of the search process using best
first search.
[Figures 1-5: successive snapshots of the search tree rooted at A; A's successors B and C carry heuristic values 8 and 7, and C's successors D and E carry values 4 and 9]
Figure 3: As the estimated goal distance of C is smaller, expand C to find its successors
D and E.
Figure 4: Now D has the lesser estimated goal distance, i.e., 4, so expand D to generate
F and G with distances 9 and 11 respectively.
Figure 5: Now, among all the nodes which have been discovered but not yet expanded,
B has the smallest estimated goal distance, i.e., 8, so we backtrack and expand B, and
so on.
Best first search will always find good paths to a goal node if there is one, but it
requires a good heuristic function for better estimation of the distance to a goal node.
Minimax is a method in decision theory for minimizing the expected maximum loss.
It is applied in two-player games such as tic-tac-toe or chess, where the two players
take alternate moves. It has also been extended to more complex games which require
general decision making in the presence of increased uncertainty. All these games
have a common property: they are logic games. This means that these games can
be described by a set of rules and premises, so it is possible to know, at a given point
of time, what the next available moves are. We can also call them full information
games, as each player has complete knowledge about the possible moves of the
adversary.
In the subsequent discussion of games, the two players are named MAX and MIN.
We use the assumption that MAX moves first and that after that the two players
move alternately. The extent of search before each move will depend on the play
depth – the amount of lookahead measured in terms of pairs of alternating moves for
MAX and MIN.
For searching we can use either breadth first, depth first or heuristic methods except
that the termination conditions must now be specified. Several artificial termination
conditions can be specified based on factors such as time limit, storage space and the
depth of the deepest node in the search tree.
In a two-player game, the first step is to define a static evaluation function efun(),
which attaches a value to each position or state of the game. This value indicates how
good it would be for a player to reach that position. So, after the search terminates, we
must extract from the search tree an estimate of the best first move by applying the
static evaluation function efun() to the leaf nodes of the search tree. The evaluation
function measures the worth of a leaf node position. For example, in chess a simple
static evaluation function might attach one point for each pawn, four points for each
rook and eight points for the queen, and so on. But this static evaluation is too simple
to be of any real use. Sometimes we might have to sacrifice the queen to prevent the
opponent from making a winning move and to gain advantage in the future, so the
key lies in the amount of lookahead. The greater the number of moves we are able to
look ahead before evaluating a move, the better will be the choice.
In analyzing game trees, we follow the convention that the value of the evaluation
function increases as the position becomes favourable to player MAX; so positive
values indicate positions that favour MAX, whereas positions favourable to player
MIN are represented by negative values of the static evaluation function, and values
near zero correspond to game positions not strongly favourable to either MAX or
MIN. In a terminal position, the static evaluation function returns either positive
infinity or negative infinity, where positive infinity represents a win for player MAX
and negative infinity represents a win for player MIN, and a value of zero
represents a draw.
In the algorithm we give ahead, the search tree is generated starting with the current
game position, up to the end-game position or until the lookahead limit is reached.
Increasing the lookahead limit increases the search time but results in a better choice.
The final game position is evaluated from MAX's point of view. The nodes that
belong to the player MAX receive the maximum value of their children. The nodes
for the player MIN select the minimum value of their children.
In the algorithm, lookahead limit represents the lookahead factor in terms of number
of steps, u and v represent game states or nodes, maxmove() and minmove() are
functions to describe the steps taken by player MAX or player MIN to choose a
move, efun() is the static evaluation function which attaches a positive or negative
integer value to a node ( i.e., a game state), value is a simple variable.
Now, to move a number of steps equal to the lookahead limit from a given game state
u, MAX should move to the game state v given by the following code:

maxval = -∞
for each game state w that is a successor of u
    val = minmove(w, lookaheadlimit)
    if (val >= maxval)
        maxval = val
        v = w    // move to the state v
The minmove() function is as follows:

minmove(w, lookaheadlimit)
{
    if (lookaheadlimit == 0 or w has no successor)
        return efun(w)
    else
        minval = +∞
        for each successor x of w
            val = maxmove(x, lookaheadlimit – 1)
            if (minval > val)
                minval = val
        return (minval)
}
maxmove(w, lookaheadlimit)
{
    if (lookaheadlimit == 0 or w has no successor)
        return efun(w)
    else
        maxval = -∞
        for each successor x of w
            val = minmove(x, lookaheadlimit – 1)
            if (maxval < val)
                maxval = val
        return (maxval)
}
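The pseudocode above can be made runnable; in this C sketch the game tree is a hypothetical array-based structure, and a large constant stands in for ±∞:

```c
#define MAXN 32
#define INF  1000000            /* stands in for infinity         */

int nsucc[MAXN];                /* number of successors of a node */
int succ[MAXN][4];              /* successor indices              */
int leafval[MAXN];              /* static evaluation values       */

int efun(int w) { return leafval[w]; }

int minmove(int w, int limit);

int maxmove(int w, int limit)
{
    if (limit == 0 || nsucc[w] == 0)
        return efun(w);
    int maxval = -INF;
    for (int i = 0; i < nsucc[w]; i++) {
        int val = minmove(succ[w][i], limit - 1);
        if (val > maxval)
            maxval = val;
    }
    return maxval;
}

int minmove(int w, int limit)
{
    if (limit == 0 || nsucc[w] == 0)
        return efun(w);
    int minval = INF;
    for (int i = 0; i < nsucc[w]; i++) {
        int val = maxmove(succ[w][i], limit - 1);
        if (val < minval)
            minval = val;
    }
    return minval;
}
```

On a two-level tree whose MIN nodes cover leaves {3, 5} and {2, 9}, MAX backs up min(3, 5) = 3 and min(2, 9) = 2 and chooses 3.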
We can see that in the minimax technique, player MIN tries to minimize the
advantage he allows to player MAX, and on the other hand, player MAX tries to
maximize the advantage he obtains after each move.
Let us suppose the graph given below shows part of a game. The values of the leaf
nodes are given by the efun() procedure for a particular game; the values of the nodes
above them can then be calculated using the minimax principle. Suppose the lookahead
limit is 4 and it is MAX's turn.
[Figure: a game tree of depth 4. The leaf values 5, -7, 9, 3, -2, 8, 7, 2, 5, -4, 5, -7, 8,
3, 1, 5, 2, -2 are computed by efun(); the interior node values are filled in bottom-up
by the minimax rule, MIN levels taking the minimum and MAX levels the maximum
of their children's values.]
Consider the tree below, where it is MAX's turn at the root. Once MIN node A has
been evaluated to 5, and the first child of MIN node B evaluates to 3, the value of B
can be at most 3, so MAX will prefer A in any case. There is, therefore, no point in
spending time to search for children of the B node, and so we can safely ignore all the
remaining children of B.

This shows that the search on some paths can sometimes be aborted (i.e., it is not
required to explore all paths) because we find out that the search subtree will not take
us to any viable answer.
[Figure: a MAX node with two MIN children, A and B. A's children evaluate to 5 and
7, so A = 5; B's first child evaluates to 3, so B can be at most 3 and its remaining
children are pruned.]
This optimization is known as the alpha-beta pruning procedure, and the values beyond
which the search need not be carried out are known as alpha-beta cutoffs.

The alpha values of MAX nodes (including the start value) can never decrease.

The beta values of MIN nodes can never increase.

So we can see that remarkable reductions in the amount of search needed to evaluate a
good move are possible by using the alpha-beta pruning procedure.
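The pruning idea can be sketched as a small extension of the minimax code. This is a hedged illustration under the same assumptions as before: the successor and evaluation functions (succ, efun) and the little tree mirroring the figure above are made up for demonstration, not taken from the unit.

```python
def alphabeta(state, depth, successors, efun, alpha, beta, maximizing):
    """Minimax with alpha-beta cutoffs.

    `alpha` is the best value MAX is guaranteed so far, `beta` the best
    value MIN is guaranteed; once alpha >= beta, the remaining children
    of this node cannot affect the result and are skipped.
    """
    children = list(successors(state))
    if depth == 0 or not children:
        return efun(state)
    if maximizing:
        value = float('-inf')
        for c in children:
            value = max(value, alphabeta(c, depth - 1, successors, efun,
                                         alpha, beta, False))
            alpha = max(alpha, value)   # alpha of a MAX node never decreases
            if alpha >= beta:
                break                   # cutoff: MIN will avoid this branch
        return value
    else:
        value = float('inf')
        for c in children:
            value = min(value, alphabeta(c, depth - 1, successors, efun,
                                         alpha, beta, True))
            beta = min(beta, value)     # beta of a MIN node never increases
            if alpha >= beta:
                break                   # cutoff: MAX will avoid this branch
        return value

# The situation from the figure: A's leaves are 5 and 7, B's first leaf is 3.
tree = {'root': ['A', 'B'], 'A': ['a1', 'a2'], 'B': ['b1', 'b2']}
leaves = {'a1': 5, 'a2': 7, 'b1': 3, 'b2': 8}
succ = lambda s: tree.get(s, [])
efun = lambda s: leaves.get(s, 0)

print(alphabeta('root', 2, succ, efun,
                float('-inf'), float('inf'), True))   # → 5
```

After A is evaluated to 5, alpha at the root is 5; inside B, the first leaf gives beta = 3, so alpha ≥ beta and the leaf with value 8 is never examined, exactly the cutoff described above.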
A directed graph that does not have any cycles is known as a directed acyclic graph.
Hence, dependencies or precedences among events can be represented by using
directed acyclic graphs.

There are many problems in which we can easily tell which event follows or precedes
a given event, but we cannot easily work out the order in which all the events are held.
For example, it is easy to specify/look up prerequisite relationships between modules
in a course, but it may be hard to find an order in which to take all the modules so that
all prerequisite material is covered before the modules that depend on it. The same is
the case with a compiler evaluating the sub-expressions of an expression like the
following:
(a + b)(c − d) − (a − b)(c + d)
Both of these problems are essentially equivalent. The data of both problems can be
represented by a directed acyclic graph (see the figure below). In the first example
each node is a module; in the second each node is an operator or an operand. A
directed edge occurs when one node depends on another, because of the prerequisite
relationships among courses or the parenthesised order of the expression. The problem
in both cases is to find an acceptable ordering of the nodes satisfying the
dependencies. This is referred to as a topological ordering. More formally, it is
defined below.
[Figure: the directed acyclic graph for (a + b)(c − d) − (a − b)(c + d); each operand
a, b, c, d is shared by two of the + and − nodes, which in turn feed the two * nodes.]
A topological ordering of a directed acyclic graph G = (V, E) is a linear ordering of its
vertices such that if G contains an edge (u, v), then v appears after u in the ordering.
Therefore, a cyclic graph cannot have a topological order. A
topological sort of a graph can be viewed as an ordering of its vertices along a
horizontal line so that all directed edges go in one direction.
The term topological sort comes from the study of partial orders and is sometimes
called a topological order or linear order.
The algorithm given below assumes that the directed acyclic graph is represented
using adjacency lists. Each node of the adjacency list contains a variable indeg which
stores the indegree of the given vertex. Adj is an array of |V| lists, one for each vertex
in V.
Topological-Sort(G)
1 for each vertex u ∈ G
2     do indeg[u] = in-degree of vertex u
3         if indeg[u] = 0
4             then enqueue(Q, u)
5 while Q ≠ ∅
6     do u = dequeue(Q)
7         print u
8         for each v ∈ Adj[u]
9             do indeg[v] = indeg[v] − 1
10            if indeg[v] = 0
11                then enqueue(Q, v)
The for loop of lines 1-3 calculates the indegree of each node; if the indegree of any
node is found to be 0, then it is immediately enqueued. The while loop of lines 5-11
works as follows. We dequeue a vertex u from the queue; its indegree will be zero
(why?). We then output the vertex, and decrement the indegree of each vertex
adjacent to u. If, in the process, the indegree of any vertex adjacent to u becomes 0,
then it is also enqueued.
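The queue-based pseudocode above translates almost line for line into Python. The sketch below assumes the graph is given as an adjacency-list dictionary (an assumption for this illustration, matching the Adj array in the text); the example graph is a small made-up prerequisite structure.

```python
from collections import deque

def topological_sort(adj):
    """Topological sort as in lines 1-11 above: repeatedly output a vertex of
    indegree 0, decrementing the indegrees of its successors."""
    # Lines 1-2: compute the indegree of every vertex.
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    # Lines 3-4: enqueue every vertex whose indegree is already 0.
    q = deque(u for u in adj if indeg[u] == 0)
    order = []
    # Lines 5-11: dequeue, output, and decrement successors' indegrees.
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    if len(order) != len(adj):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

# Prerequisite-style example: an edge u -> v means u must come before v.
g = {'a': ['c'], 'b': ['c', 'd'], 'c': ['e'], 'd': ['e'], 'e': []}
print(topological_sort(g))   # → ['a', 'b', 'c', 'd', 'e']
```

The final length check also answers the "(Why?)" above in reverse: if a cycle exists, its vertices never reach indegree 0, so they are never dequeued.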
We can also use Depth First Search traversal for topologically sorting a directed
acyclic graph. The DFS algorithm can be used as it is, or slightly changed, to find the
topological ordering: we simply run DFS on the input directed acyclic graph and
insert each vertex at the front of a linked list as it finishes, or simply print the vertices
in decreasing order of their termination times.
To see why this approach works, suppose that DFS is run on a given dag G = (V, E) to
determine the finishing times of its vertices. Let u, v ∈ V. If there is an edge in G from
u to v, then the termination time of v will be less than the termination time of u, i.e.,
t[v] < t[u]. Since we output the vertices in decreasing order of termination time, every
vertex is output before all the vertices that depend on it.
ALGORITHM
1. Run the DFS algorithm on graph G. In doing so, compute the termination time
of each vertex.
2. Whenever a vertex terminates (i.e., its DFS call finishes), insert it at the front
of a list.
3. Output the list.
RUNNING TIME
Let n be the number of vertices (or nodes, or activities) and m the number of edges
(constraints). Each vertex is discovered only once, and for each vertex we loop over
all its outgoing edges once. Therefore, the total running time is O(n + m).
2.8 SUMMARY
This unit discusses some searching and sorting techniques for solving those problems
each of which can be efficiently represented in the form of a graph. In a graphical
representation of a problem, generally, a node represents a state of the problem, and
an arrow/arc represents a move between a pair of states.
2.9 SOLUTIONS/ANSWERS
Ex. 1)
[Figure: the state-space graph for Ex. 1, with states labelled by the ordered pairs
(0,0), (1,1), (2,1), (2,2), (3,2), (3,3), (4,2) and (4,3), connected by the legal moves.]
Ex.2)
[Figures: (i) the input graph for Ex. 2, with A at the top, children B and C, a next
level E, F, G, H, and leaves D, I, K, L; (ii) the same graph with DFS discovery times
marked at each vertex; (iii) the graph with both discovery and finishing times marked,
e.g. A = 1/11, B = 2/5, C = 7/10.]
Ex.3)
[Figures: successive snapshots of the traversal for Ex. 3 over vertices B, C, E, F, G,
H, with discovery times 1/, 2/, 3/, ... marked at the vertices as the search proceeds.]