Introduction to Algorithm Design Lecture Notes 3
ROAD MAP
Greedy Technique
Knapsack Problem
Minimum Spanning Tree Problem
  Prim's Algorithm
  Kruskal's Algorithm
Single Source Shortest Paths
  Dijkstra's Algorithm
Job Sequencing With Deadlines
Greedy Technique
The greedy technique is for solving optimization problems, such as engineering problems.
The greedy approach suggests constructing a solution through a sequence of steps, called decision steps, each expanding a partially constructed solution, until a complete solution to the problem is reached.
It is similar to dynamic programming, but not all possible solutions are explored.
Greedy Technique
On each decision step the choice should be:
Feasible: it has to satisfy the problem's constraints.
Locally optimal: it has to be the best local choice among all feasible choices available on that step.
Irrevocable: once made, it cannot be changed.
Greedy Technique
GreedyAlgorithm(a[1..n]) {
    solution = ∅
    for i = 1 to n {
        x = select(a)
        if isFeasible(solution, x)
            solution = solution ∪ {x}
    }
    return solution
}
Greedy Technique
In each step, the greedy technique suggests a greedy grab of the best alternative available: a locally optimal decision, in the hope that it yields a globally optimal solution.
The greedy technique does not give the optimal solution for all problems; there are problems for which a sequence of locally optimal choices does not yield an optimal solution.
EX: TSP, graph coloring. For such problems the greedy technique produces an approximate solution.
Knapsack Problem
Given:
wi : weight of object i
m : capacity of the knapsack
pi : profit if all of object i is taken
Find:
xi : fraction of object i taken (0 ≤ xi ≤ 1)
Feasibility:
  Σ_{i=1..n} xi · wi  ≤  m
Optimality: maximize
  Σ_{i=1..n} xi · pi
Knapsack Problem
Algorithm Knapsack(m, n) {
    // assumes objects are sorted in nonincreasing order of pi/wi
    U = m
    for i = 1 to n
        x[i] = 0.0
    for i = 1 to n {
        if (w[i] > U) break
        x[i] = 1.0
        U = U - w[i]
    }
    if (i ≤ n) x[i] = U / w[i]
}
Example: n = 3, M = 20, p = (25, 24, 15), w = (18, 15, 10)
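The greedy procedure above can be sketched in Python on the example instance. This is a minimal sketch: the function name `fractional_knapsack` and the explicit sort by profit/weight ratio are my additions (the pseudocode assumes the objects are already sorted that way).

```python
def fractional_knapsack(m, weights, profits):
    """Greedy fractional knapsack: take objects in nonincreasing
    profit/weight order, splitting the last object if needed."""
    n = len(weights)
    # Sort object indices by profit/weight ratio, best first.
    order = sorted(range(n), key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * n          # fraction of each object taken
    u = m                  # remaining capacity
    for i in order:
        if weights[i] > u:
            x[i] = u / weights[i]   # take a fraction of the first object that does not fit
            break
        x[i] = 1.0
        u -= weights[i]
    total = sum(x[i] * profits[i] for i in range(n))
    return x, total

# The instance from the notes: M = 20, p = (25, 24, 15), w = (18, 15, 10)
x, total = fractional_knapsack(20, [18, 15, 10], [25, 24, 15])
print(x, total)   # [0.0, 1.0, 0.5] 31.5
```

Object 2 has the best ratio (24/15 = 1.6) and is taken whole, object 3 (ratio 1.5) fills half the remaining capacity, and object 1 is not taken at all, for a profit of 31.5.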
Proof of Optimality
Assume the objects are ordered by nonincreasing pi/wi, and let G = (x1, x2, ..., xn) be the greedy solution. If all xi = 1, G is clearly optimal. Otherwise let j be the least index with xj < 1; by construction xi = 1 for 1 ≤ i < j, 0 ≤ xj < 1, and xi = 0 for j < i ≤ n.
Let O = (y1, y2, ..., yn) be an optimal solution and let k be the least index with yk ≠ xk; then yk < xk. Increase yk to xk and decrease as many of the later ys as needed so that the total weight stays at most m. Since pk/wk is at least as large as the ratios of the later objects, the profit of the modified solution O' is at least that of O, so O' is also optimal. Repeating this transformation turns O into G without decreasing the profit; hence G is optimal.
Minimum Spanning Tree (MST)
Problem Instance:
A weighted, connected, undirected graph G = (V, E)
Definition:
A spanning tree of a connected graph is a connected acyclic subgraph that contains all of its vertices. A minimum spanning tree of a weighted connected graph is a spanning tree of the smallest weight, where the weight of a tree is defined as the sum of the weights of all its edges.
Feasible Solution:
A spanning tree G' = (V, E'), E' ⊆ E, a subgraph of G
Minimum Spanning Tree
Objective function:
The sum of all edge costs in G':
  C(G') = Σ_{e ∈ E'} C(e)
Optimum Solution:
The minimum cost spanning tree
Minimum Spanning Tree
(Figure omitted: of the spanning trees shown, T1 is the minimum spanning tree.)
Prim's Algorithm
Prim's algorithm constructs a MST through a sequence of expanding subtrees.
Greedy choice: choose the minimum cost edge and add it to the subtree.
Approach:
1. Each vertex j outside the tree T keeps near[j] ∈ T such that cost(j, near[j]) is minimum
2. near[j] = 0 if j ∈ T; cost(j, near[j]) = ∞ if there is no edge between j and T
3. Use a heap to select the minimum of all candidate edges
Prim's Algorithm Example
(Figures omitted: the algorithm's pseudocode and a step-by-step example.)
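A heap-based sketch of Prim's algorithm in Python, under the representation the analysis below assumes (adjacency lists plus a min-heap). This lazy variant pushes candidate edges into the heap and skips vertices already in the tree instead of maintaining near[] explicitly; the function name and graph encoding are my choices.

```python
import heapq

def prim(n, adj, start=0):
    """Prim's algorithm on an undirected weighted graph.
    adj[u] is a list of (weight, v) pairs; returns MST edges and total cost."""
    in_tree = [False] * n
    mst, total = [], 0
    heap = [(0, start, -1)]              # (edge weight, vertex, parent in tree)
    while heap:
        w, u, parent = heapq.heappop(heap)
        if in_tree[u]:
            continue                     # u was already reached by a cheaper edge
        in_tree[u] = True
        if parent != -1:
            mst.append((parent, u, w))
            total += w
        for wv, v in adj[u]:             # offer all edges leaving the new vertex
            if not in_tree[v]:
                heapq.heappush(heap, (wv, v, u))
    return mst, total

# Small demo graph: 4 vertices, MST cost 1 + 2 + 4 = 7
adj = [[(1, 1), (3, 2)], [(1, 0), (2, 2)], [(2, 1), (3, 0), (4, 3)], [(4, 2)]]
print(prim(4, adj)[1])   # 7
```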
Correctness proof of Prim's Algorithm
Prim's algorithm always yields a MST. We can prove it by induction:
T0, which consists of a single vertex, is a part of any MST.
Assume that Ti-1 is a part of some MST T. We need to prove that Ti, generated from Ti-1 by Prim's algorithm, is also a part of a MST. We prove it by contradiction: assume that no MST of the graph can contain Ti.
Let ei = (u, v) be the minimum weight edge from a vertex in Ti-1 to a vertex not in Ti-1, used by Prim's algorithm to expand Ti-1 to Ti. By our assumption, ei cannot belong to T, so adding ei to T forms a cycle. In addition to ei, this cycle must contain another edge ek = (u', v') connecting a vertex in Ti-1 to a vertex not in Ti-1. If we delete ek, we obtain another spanning tree. Since w(ei) ≤ w(ek), the weight of this new spanning tree is less than or equal to that of T, so it is a minimum spanning tree too, and it contains Ti. This contradicts the assumption that no minimum spanning tree contains Ti.
Prim's Algorithm
Analysis:
How efficient is Prim's algorithm? It depends on the data structures chosen. If the graph is represented by its weight matrix and the priority queue is implemented as an unordered array, the algorithm's running time is in Θ(|V|²). If the graph is represented by adjacency linked lists and the priority queue is implemented as a min-heap, the running time of the algorithm is in O(|E| log |V|).
Kruskal's Algorithm
It is another algorithm to construct a MST, and it always yields an optimal solution. The algorithm constructs a MST as an expanding sequence of subgraphs, which are always acyclic but not necessarily connected at intermediate stages of the algorithm.
Kruskal's Algorithm
Greedy choice: choose the minimum cost edge connecting two disconnected subgraphs.
Approach:
1. Sort the edges in increasing order of cost
2. Start with an empty tree T
3. Consider the edges one by one in this order: if T ∪ {e} does not contain a cycle, T = T ∪ {e}
Kruskal's Algorithm Example
(Figures omitted: the algorithm's pseudocode and a step-by-step example.)
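The three steps above can be sketched in Python. Cycle detection uses union-find (described later in these notes); the function name and the (weight, u, v) edge encoding are my choices.

```python
def kruskal(n, edges):
    """Kruskal's algorithm: edges is a list of (weight, u, v) tuples.
    Uses union-find with path compression and union by size."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression (halving)
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                    # x and y already connected: cycle
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx                     # attach smaller tree under larger
        size[rx] += size[ry]
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):           # step 1: edges by increasing cost
        if union(u, v):                     # steps 2-3: add edge if no cycle
            mst.append((u, v, w))
            total += w
    return mst, total

# Same demo graph as before: MST cost 1 + 2 + 4 = 7
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, edges)[1])   # 7
```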
Proof of Optimality
Suppose the tree T produced by the algorithm, with edges e1, e2, ..., en-1 where c(e1) ≤ c(e2) ≤ ... ≤ c(en-1), is not optimal, and let T' be an optimal spanning tree. Let e be an edge of T that is not in T'. Adding e to T' forms a cycle, and this cycle contains an edge e* that is not in T. Then T' ∪ {e} - {e*} forms another spanning tree with cost(T') + c(e) - c(e*). By the greedy choice, c(e) ≤ c(e*), so this new tree costs no more than T'; repeating the exchange transforms T' into T without increasing the cost, so T is optimal.
Analysis: O(|E| log |E|)
Disjoint Subsets and Union-Find Algorithms
Some applications (such as Kruskal's algorithm) require a dynamic partition of some n-element set S into a collection of disjoint subsets S1, S2, ..., Sk. After being initialized as a collection of n one-element subsets, each containing a different element of S, the collection is subjected to a sequence of intermixed union and find operations.
Disjoint Subsets and Union-Find Algorithms
We are dealing with an abstract data type for a collection of disjoint subsets of a finite set, with the following operations:
makeset(x): creates the one-element set {x}; it is assumed that this operation can be applied to each element of S only once
find(x): returns the subset containing x
union(x, y): constructs the union of the disjoint subsets Sx and Sy containing x and y, respectively, and adds it to the collection to replace Sx and Sy, which are deleted from it
Disjoint Subsets and Union-Find Algorithms
Example:
S = {1, 2, 3, 4, 5, 6}. makeset(i) creates the set {i}; applying this operation six times initializes the structure to the collection of six singleton sets: {1}, {2}, {3}, {4}, {5}, {6}.
Performing union(1,4) and union(5,2) yields {1, 4}, {5, 2}, {3}, {6}; if followed by union(4,5) and then by union(3,6): {1, 4, 5, 2}, {3, 6}.
Disjoint Subsets and Union-Find Algorithms
There are two principal alternatives for implementing this data structure:
1. quick find: optimizes the time efficiency of the find operation
2. quick union: optimizes the union operation
Disjoint Subsets and Union-Find Algorithms
1. quick find
Optimizes the time efficiency of the find operation. It uses an array indexed by the elements of the underlying set S; each subset is implemented as a linked list whose header contains pointers to the first and last elements of the list.
(Figure omitted: linked list representation of subsets {1, 4, 5, 2} and {3, 6} obtained by quick find after performing union(1,4), union(5,2), union(4,5) and union(3,6).)
Disjoint Subsets and Union-Find Algorithms
The time efficiency of makeset(x) is Θ(1), hence initialization of n singleton subsets is Θ(n). The time efficiency of find(x) is Θ(1). Executing union(x, y) takes longer: Θ(n²) for a sequence of n union operations. A simple way to improve the overall efficiency is to append the shorter of the two lists to the longer one; this modification is called union-by-size, but it does not improve the worst case efficiency.
Disjoint Subsets and Union-Find Algorithms
2. quick union
Represents each subset by a rooted tree. The nodes of the tree contain the subset's elements (one per node), and the tree's edges are directed from children to their parents.
Disjoint Subsets and Union-Find Algorithms
2. quick union
The time efficiency of makeset(x) is Θ(1), hence initialization of n singleton subsets is Θ(n). The time efficiency of find(x) is Θ(n): a find is performed by following the pointer chain from the node containing x to the tree's root, and a tree representing a subset can degenerate into a linked list with n nodes. Executing union(x, y) takes Θ(1): it is implemented by attaching the root of y's tree to the root of x's tree.
Disjoint Subsets and Union-Find Algorithms
(Figure omitted: a) forest representation of subsets {1, 4, 5, 2} and {3, 6} used by quick union; b) result of union(5, 6).)
Disjoint Subsets and Union-Find Algorithms
In fact, a better efficiency can be obtained by combining either variety of quick union with path compression. This modification makes every node encountered during the execution of a find operation point to the tree's root. It brings the time efficiency of a sequence of at most n-1 unions and m finds to only slightly worse than linear.
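A quick-union sketch in Python combining union-by-size with path compression, exercised on the example S = {1, ..., 6} from above. The class name `DisjointSets` is my choice; the operations mirror makeset, find and union as defined in the notes.

```python
class DisjointSets:
    """Quick union with union by size and path compression."""
    def __init__(self):
        self.parent = {}
        self.size = {}

    def makeset(self, x):
        self.parent[x] = x          # each element starts as its own root
        self.size[x] = 1

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:       # path compression pass:
            self.parent[x], x = root, self.parent[x]   # repoint to the root
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx                # attach smaller tree under larger
        self.size[rx] += self.size[ry]

# The example from the notes: S = {1, ..., 6}
ds = DisjointSets()
for i in range(1, 7):
    ds.makeset(i)
ds.union(1, 4); ds.union(5, 2)    # {1,4}, {5,2}, {3}, {6}
ds.union(4, 5); ds.union(3, 6)    # {1,4,5,2}, {3,6}
print(ds.find(2) == ds.find(1))   # True: 2 and 1 are in the same subset
print(ds.find(3) == ds.find(1))   # False
```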
Single Source Shortest Paths
Definition:
For a given vertex, called the source, in a weighted connected graph, find the shortest paths to all its other vertices.
Dijkstra's Algorithm
Idea:
Incrementally add nodes to an initially empty tree; each time, add the node that has the smallest path length from the source.
Approach:
1. S = { }
2. Initialize dist[v] for all v
3. Insert the vertex v with minimum dist[v] into the tree T
4. Update dist[w] for all w ∉ S
Dijkstra's Algorithm
(Figure omitted: the idea of Dijkstra's algorithm.)
Dijkstra's Algorithm Example
(Figures omitted: a step-by-step example.)
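The four steps above can be sketched in Python with a min-heap, matching the O(|E| log |V|) variant discussed in the analysis below. The function name and adjacency-list encoding are my choices; stale heap entries are skipped rather than decreased in place.

```python
import heapq

def dijkstra(n, adj, source=0):
    """Dijkstra's algorithm with a min-heap.
    adj[u] is a list of (v, weight) pairs; returns dist[] from source."""
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)       # vertex with smallest known distance
        if d > dist[u]:
            continue                     # stale entry, already improved
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w          # relax edge (u, v)
                heapq.heappush(heap, (dist[v], v))
    return dist

# Small demo graph: shortest distances from vertex 0
adj = [[(1, 4), (2, 1)], [(3, 1)], [(1, 2), (3, 5)], []]
print(dijkstra(4, adj))   # [0, 3, 1, 4]
```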
Dijkstra's Algorithm
Analysis:
The time efficiency depends on the data structures used for the priority queue and for representing the input graph itself. For graphs represented by their weight matrix with the priority queue implemented as an unordered array, the efficiency is in Θ(|V|²). For graphs represented by their adjacency lists with the priority queue implemented as a min-heap, the efficiency is in O(|E| log |V|). A better upper bound for both Prim's and Dijkstra's algorithms can be achieved if a Fibonacci heap is used.
Job Sequencing With Deadlines
Given:
n jobs 1, 2, ..., n
deadline di > 0 and profit pi > 0 for each job
each job takes 1 unit of time
1 machine available
Find:
J ⊆ {1, 2, ..., n}
Feasibility:
The jobs in J can be completed before their deadlines.
Optimality:
maximize Σ_{i ∈ J} pi
Job Sequencing With Deadlines
Example: n = 4, di = (2, 1, 2, 1), pi = (100, 10, 15, 27)
J = {1, 2}: profit = 110
J = {1, 3}: profit = 115
J = {1, 4}: profit = 127 (optimal)
J = {1, 2, 3} is not feasible
Job Sequencing With Deadlines
Greedy strategy?
Job Sequencing With Deadlines
Greedy Choice: take the job that gives the largest profit; process jobs in nonincreasing order of the pi.
Job Sequencing With Deadlines
How to check feasibility? Naively, we would need to check all possible schedules: for k jobs there are k! permutations. For a schedule i1, i2, ..., ik, we check whether dij ≥ j for each j. Checking all permutations of k jobs therefore requires at least k! time. What about using a greedy strategy to check feasibility?
Job Sequencing With Deadlines
Order the jobs in nondecreasing order of their deadlines:
J = i1, i2, ..., ik with di1 ≤ di2 ≤ ... ≤ dik
We only need to check this one permutation: the subset is feasible if and only if this permutation is feasible.
Job Sequencing With Deadlines
Analysis: use presorting.
Sorting and selection take O(n log n) time in total.
Checking feasibility: each check takes linear time in the worst case, O(n²) in total.
The total runtime is O(n²). Can we improve this?
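The two ideas above, picking jobs in nonincreasing profit order and testing feasibility only on the deadline-ordered permutation, combine into a short O(n²)-style Python sketch. The function name and 0-based job indices are my choices, so jobs 1..4 of the example become indices 0..3.

```python
def job_sequencing(deadlines, profits):
    """Greedy job sequencing with deadlines (unit-time jobs).
    Consider jobs in nonincreasing profit order; keep a job if the
    selected set, ordered by nondecreasing deadline, stays feasible."""
    n = len(deadlines)
    order = sorted(range(n), key=lambda i: -profits[i])
    selected = []
    for i in order:
        candidate = sorted(selected + [i], key=lambda j: deadlines[j])
        # feasible iff the job in slot pos (0-based) meets its deadline
        if all(deadlines[j] >= pos + 1 for pos, j in enumerate(candidate)):
            selected = candidate
    total = sum(profits[i] for i in selected)
    return sorted(selected), total

# The instance from the notes: d = (2, 1, 2, 1), p = (100, 10, 15, 27)
jobs, total = job_sequencing([2, 1, 2, 1], [100, 10, 15, 27])
print(jobs, total)   # [0, 3] 127  (i.e. jobs 1 and 4, profit 127)
```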
Encoding Text
Suppose we have to encode a text that comprises characters from some n-character alphabet by assigning to each of the text's characters some sequence of bits called a codeword. We can use a fixed-length encoding that assigns to each character a codeword of the same length. This is good if each character has the same frequency, but what if some characters are more frequent than others?
Encoding Text
EX: The number of bits in the encoding of a text 100 characters long:

char  freq  fixed-length  variable-length
a     45    000           0
b     13    001           101
c     12    010           100
d     16    011           111
e     9     100           1101
f     5     101           1100

Fixed-length encoding: 100 × 3 = 300 bits. Variable-length encoding: 45·1 + (13 + 12 + 16)·3 + (9 + 5)·4 = 224 bits.
Prefix Codes
A codeword is not a prefix of another codeword; otherwise decoding is not easy and may not even be possible.
Encoding:
Replace each character with its codeword.
Decoding:
Start with the first bit and find the codeword; a unique codeword can be found because it is a prefix code. Continue with the bits following that codeword.
Codewords can be represented in a binary tree.
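The decoding procedure above can be sketched in Python. Instead of walking a tree, this version accumulates bits until they match a codeword; the match is unique precisely because the code is prefix-free. The function name and the dictionary encoding of the code are my choices.

```python
def decode(bits, code):
    """Decode a bit string with a prefix code given as {char: codeword}.
    Because no codeword is a prefix of another, each match is unique."""
    inverse = {cw: ch for ch, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:        # unique codeword found: emit and restart
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# The variable-length code from the earlier text-encoding example
code = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
print(decode('0101100', code))   # abc
```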
Prefix Codes
EX: Trees for the fixed-length codewords (a = 000, b = 001, c = 010, d = 011, e = 100, f = 101) and the variable-length codewords (a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100). (Figures omitted.)
Huffman Codes
Given: the characters and their frequencies.
Find: the coding tree.
Cost: minimize
  Cost = Σ_{c ∈ C} f(c) · d(c)
where f(c) is the frequency of c and d(c) is the depth of c in the tree.
Huffman Codes
What is the greedy strategy?
Huffman Codes
Approach:
1. Q = forest of one-node trees
   // initialize n one-node trees;
   // label the nodes with the characters;
   // label the trees with the frequencies of the characters
2. for i = 1 to n-1
3.   x = select the least-frequency tree in Q and delete it
4.   y = select the least-frequency tree in Q and delete it
5.   z = new tree
6.   z.left = x and z.right = y
7.   f(z) = f(x) + f(y)
8.   insert z into Q
Huffman Codes Example
Consider five characters {A, B, C, D, -} with given occurrence probabilities.
(Figures omitted: the probabilities and the Huffman tree construction for this input.)
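The approach above translates directly into Python with `heapq` as the priority queue, checked against the frequencies of the text-encoding example (a=45, b=13, c=12, d=16, e=9, f=5), whose optimal code costs 224 bits. The function name, the tuple representation of trees, and the counter used to break frequency ties are my choices.

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code from {char: frequency}; returns {char: codeword}.
    Repeatedly merges the two least frequent trees (the greedy choice)."""
    tiebreak = count()                      # avoids comparing tree nodes on ties
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)      # least frequent tree
        fy, _, y = heapq.heappop(heap)      # second least frequent tree
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    _, _, root = heap[0]

    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")     # left child
            walk(node[1], prefix + "1")     # right child
        else:
            code[node] = prefix or "0"      # single-character alphabet edge case
    walk(root, "")
    return code

# Frequencies from the text-encoding example earlier in the notes
freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
code = huffman(freq)
total_bits = sum(freq[ch] * len(code[ch]) for ch in freq)
print(total_bits)   # 224
```

Ties may be broken differently by other implementations, which changes the individual codewords but not the total cost: every Huffman code for these frequencies uses 224 bits.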
Huffman Codes
Optimality Proof:
The tree should be full (every internal node has two children). The two least frequent characters x and y must be the two deepest (sibling) nodes. The rest is an induction argument: after a merge operation we obtain a new character set in which the merged characters are replaced by a single character at the root of the merged tree, with the sum of their frequencies as its new frequency.
Huffman Codes
Analysis:
Using priority queues for the forest: O(|C| log |C|), i.e. one BuildHeap plus (2|C| - 2) DeleteMin and (|C| - 1) Insert operations.