UNIT V
NP-COMPLETE AND APPROXIMATION ALGORITHM
Tractable and intractable problems: Polynomial time algorithms – Venn diagram representation -
NP-algorithms - NP-hardness and NP-completeness – Bin Packing problem - Problem reduction:
TSP – 3-CNF problem. Approximation Algorithms: TSP - Randomized Algorithms: concept and
application - primality testing - randomized quick sort - Finding kth smallest number
Tractable and Intractable Problems
Tractable problems are computational problems that can be solved efficiently, i.e., by
algorithms whose running time scales well with the input size. In other words, the time
required to solve a tractable problem increases at most polynomially with the input size.
On the other hand, intractable problems are computational problems for which no known algorithm
can solve them efficiently in the worst-case scenario. This means that the time required to solve an
intractable problem grows exponentially or even faster with the input size.
One example of a tractable problem is computing the sum of a list of n numbers. The time required
to solve this problem scales linearly with the input size, as each number can be added to a running
total in constant time. Another example is computing the shortest path between two nodes in a
graph, which can be solved efficiently using algorithms like Dijkstra's algorithm or the A* algorithm.
In contrast, some well-known intractable problems include the traveling salesman problem, the
knapsack problem, and the Boolean satisfiability problem. These problems are NP-hard, meaning
that any problem in NP (the set of problems that can be solved in polynomial time using a non-
deterministic Turing machine) can be reduced to them in polynomial time. While it is possible to find
approximate solutions to these problems, there is no known algorithm that can solve them exactly in
polynomial time.
In summary, tractable problems are those that can be solved efficiently with algorithms that scale
well with the input size, while intractable problems are those that cannot be solved efficiently in the
worst-case scenario.
Examples of Tractable problems
1. Sorting: Given a list of n items, the task is to sort them in ascending or descending order.
Algorithms like QuickSort and MergeSort can solve this problem in O(n log n) time
complexity.
2. Matrix multiplication: Given two matrices A and B, the task is to find their product C = AB.
The best-known algorithm for matrix multiplication runs in O(n^2.37) time complexity,
which is considered tractable for practical applications.
3. Shortest path in a graph: Given a graph G and two nodes s and t, the task is to find the
shortest path between s and t. Algorithms like Dijkstra's algorithm and the A* algorithm can
solve this problem in O(m + n log n) time complexity, where m is the number of edges and n
is the number of nodes in the graph.
4. Linear programming: Given a system of linear constraints and a linear objective function, the
task is to find the values of the variables that optimize the objective function subject to the
constraints. Interior-point methods solve this problem in polynomial time; the simplex
method, although exponential in the worst case, is also very fast in practice.
5. Graph coloring: Given an undirected graph G, the task is to assign a color to each node such
that no two adjacent nodes have the same color. A greedy algorithm can produce a valid
(though not necessarily minimum) coloring in O(n^2) time, where n is the number of nodes
in the graph; note that finding a coloring with the fewest possible colors is itself NP-hard.
These problems are considered tractable because algorithms exist that can solve them in polynomial
time complexity, which means that the time required to solve them grows no faster than a
polynomial function of the input size.
Examples of intractable problems
1. Traveling salesman problem (TSP): Given a set of cities and the distances between them, the
task is to find the shortest possible route that visits each city exactly once and returns to the
starting city. The best-known algorithms for solving the TSP have an exponential worst-case
time complexity, which makes it intractable for large instances of the problem.
2. Knapsack problem: Given a set of items with weights and values, and a knapsack that can
carry a maximum weight, the task is to find the most valuable subset of items that can be
carried by the knapsack. The knapsack problem is also NP-hard and is intractable for large
instances of the problem.
3. Boolean satisfiability problem (SAT): Given a boolean formula in conjunctive normal form
(CNF), the task is to determine if there exists an assignment of truth values to the variables
that makes the formula true. The SAT problem is one of the most well-known NP-complete
problems, which means that any NP problem can be reduced to SAT in polynomial time.
4. Subset sum problem: Given a set of integers and a target sum, the task is to find a subset of
the integers that sums up to the target sum. Like the knapsack problem, the subset sum
problem is also intractable for large instances of the problem.
5. Graph isomorphism problem: Given two graphs G1 and G2, the task is to determine if there
exists a bijection between the nodes of the two graphs such that the edge structure is
preserved. The graph isomorphism problem is suspected to be intractable, but it has not
been proven to be NP-hard or polynomial-time solvable.
These problems are considered intractable because no known algorithm can solve them in
polynomial time complexity in the worst-case scenario, which means that the time required to solve
them grows exponentially or faster with the input size.
Polynomial Algorithm
Polynomial time algorithms are algorithms that can solve a problem with an upper bound on the
time complexity that is polynomial in the size of the input. In other words, the time required to solve
a problem using a polynomial time algorithm grows no faster than a polynomial function of the size
of the input.
Polynomial time algorithms are considered efficient and are typically the preferred approach for
solving problems, as they can handle input sizes that are much larger than what is possible for
exponential time algorithms.
Here are a few examples of polynomial time algorithms:
1. Linear search: Given a list of n items, the task is to find a specific item in the list. The time
complexity of linear search is O(n), which is a polynomial function of the input size.
2. Bubble sort: Given a list of n items, the task is to sort them in ascending or descending order.
The time complexity of bubble sort is O(n^2), which is also a polynomial function of the
input size.
3. Shortest path in a graph: Given a graph G and two nodes s and t, the task is to find the
shortest path between s and t. Algorithms like Dijkstra's algorithm and the A* algorithm can
solve this problem in O(m + n log n) time complexity, which is a polynomial function of the
input size.
4. Maximum flow in a network: Given a network with a source node and a sink node, and
capacities on the edges, the task is to find the maximum flow from the source to the sink.
The Ford-Fulkerson algorithm runs in O(mf) time, where m is the number of edges and f is
the maximum flow value; this is only pseudo-polynomial, since f can be exponential in the
number of input bits, but the Edmonds-Karp refinement runs in O(nm^2), which is a true
polynomial bound in the input size.
5. Linear programming: Given a system of linear constraints and a linear objective function, the
task is to find the values of the variables that optimize the objective function subject to the
constraints. Interior-point methods solve this problem in polynomial time, and the simplex
method is efficient in practice despite its exponential worst case.
P (Polynomial) problems
P problems refer to problems that an algorithm can solve in a polynomial amount of time,
i.e. whose Big-O is a polynomial (e.g. O(1), O(n), O(n²)). These are problems that would be
considered 'easy' to solve, and thus do not generally have immense run times.
NP (Non-deterministic Polynomial) Problems
NP (non-deterministic polynomial) problems are those whose candidate solutions can be
verified in polynomial time or, equivalently, solved in polynomial time by a non-deterministic
Turing machine. Every problem in P is also in NP, but for many NP problems the best known
deterministic algorithms for finding a solution take super-polynomial time, such as O(2^n)
or O(n!).
NP-Hard Problems
A problem is classified as NP-Hard when every NP problem can be reduced to it in polynomial
time. Then we can say this problem is at least as hard as any NP problem, but it could be
much harder or more complex (and it need not itself belong to NP).
NP-Complete Problems
NP-Complete problems are problems that live in both the NP and NP-Hard classes. This
means that NP-Complete problems can be verified in polynomial time and that any NP
problem can be reduced to this problem in polynomial time.
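To make the "verified in polynomial time" part concrete, here is a small C sketch (the literal
encoding is our own convention, not a standard SAT input format): it checks a proposed truth
assignment against a 3-CNF formula in time linear in the formula's size. Checking is the easy
direction; actually finding a satisfying assignment is the hard part.
#include <stdio.h>

/* Checks whether a truth assignment satisfies a 3-CNF formula.
   Literal encoding (our own convention): +v means variable v,
   -v means its negation; assign[v] is 0 or 1. */
int satisfies(int clauses[][3], int m, int assign[])
{
    for (int i = 0; i < m; i++) {        /* every clause must be true */
        int ok = 0;
        for (int j = 0; j < 3; j++) {
            int lit = clauses[i][j];
            int v = lit > 0 ? lit : -lit;
            if ((lit > 0 && assign[v]) || (lit < 0 && !assign[v])) {
                ok = 1;
                break;
            }
        }
        if (!ok) return 0;               /* found a falsified clause */
    }
    return 1;
}

int main(void)
{
    /* (x1 OR x2 OR NOT x3) AND (NOT x1 OR x3 OR x2) */
    int clauses[2][3] = { {1, 2, -3}, {-1, 3, 2} };
    int assign[4] = {0, 1, 1, 0};        /* x1 = 1, x2 = 1, x3 = 0 */
    printf("%s\n", satisfies(clauses, 2, assign)
                       ? "satisfied" : "not satisfied");
    return 0;
}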
Bin Packing problem
The Bin Packing problem involves assigning n items of given weights to bins, each of capacity
c, such that the total number of bins used is minimized. It may be assumed that all items
have weights smaller than the bin capacity.
The following algorithms depend on the order of their inputs: they pack the item given
first and then move on to the next item.
1) Next Fit algorithm
The simplest approximate approach to the bin packing problem is the Next-Fit (NF)
algorithm. The first item is assigned to bin 1. Items 2, ..., n are then considered by
increasing indices: each item is assigned to the current bin if it fits; otherwise, it is
assigned to a new bin, which becomes the current one.
Visual Representation
Let us consider bins of size 1 and the following instance, which will serve as the running
example for all the algorithms in this section.
Assuming the sizes of the items to be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6},
the number of bins required is at least Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins, and 4 bins indeed suffice here (the optimal solution).
The Next-Fit solution NF(I) for this instance I would be:
Considering 0.5 sized item first, we can place it in the first bin
Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a
new bin.
Moving on to the 0.5 sized item, we cannot place it in the current bin. Hence we place it in a
new bin.
Moving on to the 0.2 sized item, we can place it in the current (third bin)
Placing all the other items by the Next-Fit rule in the same way, we end up using 6 bins,
as opposed to the 4 bins of the optimal solution. Thus we can see that this algorithm is
not very efficient.
Analyzing the approximation ratio of Next-Fit algorithm
The time complexity of the algorithm is clearly O(n). It is easy to prove that, for any instance
I of the bin packing problem, the solution value NF(I) provided by the algorithm satisfies the
bound
NF(I) < 2 z(I),
where z(I) denotes the optimal solution value. Furthermore, there exist instances for which
the ratio NF(I)/z(I) is arbitrarily close to 2, i.e. the worst-case approximation ratio of NF is
r(NF) = 2.
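To see why the ratio can come arbitrarily close to 2, here is a sketch of the standard tight
family of instances. Take 2k items of alternating sizes 1/2 and 1/(2k), with bin capacity 1
and k even. Next Fit pairs each 1/2-item with the small item that follows it, so NF(I) = k.
An optimal packing instead puts two 1/2-items in each of k/2 bins and all k small items
(total size 1/2) together in one extra bin, so z(I) = k/2 + 1. Hence NF(I)/z(I) = k/(k/2 + 1),
which tends to 2 as k grows.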
Pseudocode
NEXTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the
    // number of items and c is the capacity of each bin

    // Initialize result (count of bins) and the remaining capacity of the
    // current bin; no bin is open before the first item is placed
    res = 0
    bin_rem = 0

    // Place items one by one
    for (int i = 0; i < n; i++) {
        // If this item can't fit in the current bin, use a new bin
        if (size[i] > bin_rem) {
            res++
            bin_rem = c - size[i]
        }
        else
            bin_rem -= size[i];
    }
    return res;
}
2) First Fit algorithm
A better algorithm, First-Fit (FF), considers the items according to increasing
indices and assigns each item to the lowest-indexed initialized bin into which it
fits; only when the current item cannot fit into any initialized bin is a new bin
introduced.
Visual Representation
Let us consider the same example as used above and bins of size 1
Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
The minimum number of bins required would be Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins.
The First-Fit solution FF(I) for this instance I would be:
Considering 0.5 sized item first, we can place it in the first bin
Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a
new bin.
Moving on to the next 0.5 sized item, we can place it in the first bin, filling it completely.
Moving on to the 0.2 sized item, we cannot place it in the first bin; checking the second
bin, we can place it there.
Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a
new bin.
Placing all the other items by the First-Fit rule in the same way, we need 5 bins, as
opposed to the 4 bins of the optimal solution, but this is much more efficient than the
Next-Fit algorithm.
Analyzing the approximation ratio of First-Fit algorithm
If FF(I) is the First-Fit solution value for instance I and z(I) is the optimal solution value,
then it can be shown that First Fit never uses more than 1.7 * z(I) bins (asymptotically). So
First Fit is better than Next Fit in terms of the upper bound on the number of bins.
Pseudocode
FIRSTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the
    // number of items and c is the capacity of each bin

    // Initialize result (count of bins)
    res = 0;

    // Create an array to store the remaining space in bins;
    // there can be at most n bins
    bin_rem[n];

    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the first bin that can accommodate size[i]
        int j;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i]) {
                bin_rem[j] = bin_rem[j] - size[i];
                break;
            }
        }
        // If no bin could accommodate size[i], open a new bin
        if (j == res) {
            bin_rem[res] = c - size[i];
            res++;
        }
    }
    return res;
}
3) Best Fit Algorithm
The next algorithm, Best-Fit (BF), is obtained from FF by assigning the current
item to the feasible bin (if any) having the smallest residual capacity (breaking
ties in favor of the lowest indexed bin).
Simply put, the idea is to place the next item in the tightest spot, that is, in the
bin that leaves the smallest empty space.
Visual Representation
Let us consider the same example as used above and bins of size 1.
Assuming the sizes of the items to be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6},
the minimum number of bins required would be Ceil((Total Weight) / (Bin Capacity)) =
Ceil(3.7/1) = 4 bins.
The Best-Fit solution BF(I) for this instance I would be:
Considering 0.5 sized item first, we can place it in the first bin
Moving on to the 0.7 sized item, we cannot place it in the first bin. Hence we place it in a
new bin.
Moving on to the 0.5 sized item, we can place it in the first bin tightly.
Moving on to the 0.2 sized item, we cannot place it in the first bin but we can place it in
second bin tightly.
Moving on to the 0.4 sized item, we cannot place it in any existing bin. Hence we place it in a
new bin.
Placing all the other items by the Best-Fit rule in the same way, we need 5 bins, as
opposed to the 4 bins of the optimal solution; like First Fit, this is much more efficient
than the Next-Fit algorithm.
Analyzing the approximation ratio of Best-Fit algorithm
It can be noted that Best-Fit (BF) is obtained from FF by assigning the current item to the
feasible bin (if any) having the smallest residual capacity (breaking ties in favour of the
lowest indexed bin). A tighter analysis shows that BF satisfies the same asymptotic
worst-case bounds as FF.
Analysis of upper bound of Best-Fit algorithm
A simple argument shows that if z(I) is the optimal number of bins, Best Fit never uses more
than 2 * z(I) - 2 bins; by this simple bound, Best Fit is comparable to Next Fit, while the
tighter 1.7 bound above puts it on par with First Fit.
Pseudocode
BESTFIT(size[], n, c)
{
    // size[] is the array containing the sizes of the items, n is the
    // number of items and c is the capacity of each bin

    // Initialize result (count of bins)
    res = 0;

    // Create an array to store the remaining space in bins;
    // there can be at most n bins
    bin_rem[n];

    // Place items one by one
    for (int i = 0; i < n; i++) {
        // Find the best bin that can accommodate size[i]:
        // initialize the minimum space left and the index of the best bin
        int j;
        int min = c + 1, bi = 0;
        for (j = 0; j < res; j++) {
            if (bin_rem[j] >= size[i] && bin_rem[j] - size[i] < min) {
                bi = j;
                min = bin_rem[j] - size[i];
            }
        }
        // If no bin could accommodate size[i], create a new bin
        if (min == c + 1) {
            bin_rem[res] = c - size[i];
            res++;
        }
        else
            // Assign the item to the best bin
            bin_rem[bi] -= size[i];
    }
    return res;
}
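For reference, here is a compact, runnable C version of the three online heuristics above,
exercised on the running example. The item sizes are scaled by 10 (capacity 10) so that all
arithmetic is exact integer arithmetic; the printed counts (6, 5, 5 bins) match the
walkthroughs above.
#include <stdio.h>

int nextFit(int size[], int n, int c)
{
    int res = 0, bin_rem = 0;            /* no bin open initially */
    for (int i = 0; i < n; i++) {
        if (size[i] > bin_rem) {         /* open a new bin */
            res++;
            bin_rem = c - size[i];
        } else
            bin_rem -= size[i];
    }
    return res;
}

int firstFit(int size[], int n, int c)
{
    int res = 0;
    int bin_rem[n];                      /* at most n bins */
    for (int i = 0; i < n; i++) {
        int j;
        for (j = 0; j < res; j++)        /* first bin that fits */
            if (bin_rem[j] >= size[i]) { bin_rem[j] -= size[i]; break; }
        if (j == res)                    /* none fits: new bin */
            bin_rem[res++] = c - size[i];
    }
    return res;
}

int bestFit(int size[], int n, int c)
{
    int res = 0;
    int bin_rem[n];                      /* at most n bins */
    for (int i = 0; i < n; i++) {
        int bi = -1, min = c + 1;        /* tightest feasible bin */
        for (int j = 0; j < res; j++)
            if (bin_rem[j] >= size[i] && bin_rem[j] - size[i] < min) {
                bi = j;
                min = bin_rem[j] - size[i];
            }
        if (bi == -1)                    /* none fits: new bin */
            bin_rem[res++] = c - size[i];
        else
            bin_rem[bi] -= size[i];
    }
    return res;
}

int main(void)
{
    /* {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6} scaled by 10 */
    int size[] = {5, 7, 5, 2, 4, 2, 5, 1, 6};
    int n = sizeof size / sizeof size[0], c = 10;
    printf("Next Fit : %d bins\n", nextFit(size, n, c));   /* 6 */
    printf("First Fit: %d bins\n", firstFit(size, n, c));  /* 5 */
    printf("Best Fit : %d bins\n", bestFit(size, n, c));   /* 5 */
    return 0;
}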
In the offline version, we have all the items at our disposal from the start of the execution.
The natural approach is to sort the items from largest to smallest and then apply the
algorithms discussed above.
NOTE: In the online algorithms above we have given the inputs upfront for simplicity, but
they also work interactively, processing each item as it arrives.
Let us look at the various offline algorithms
1) First Fit Decreasing
We first sort the array of items in decreasing order by their sizes and apply the First-Fit
algorithm as discussed above.
Algorithm
Read the inputs of items
Sort the array of items in decreasing order by their sizes
Apply First-Fit algorithm
Visual Representation
Let us consider the same example as used above and bins of size 1
Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
Sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}
The First fit Decreasing solution would be-
We will start with 0.7 and place it in the first bin
We then select 0.6 sized item. We cannot place it in bin 1. So, we place it in bin 2
We then select the 0.5 sized item. We cannot place it in any existing bin. So, we place it in bin 3
We then select 0.5 sized item. We can place it in bin 3
Doing the same for all the remaining items, we get a complete packing.
Thus only 4 bins are required, which is the same as the optimal solution.
2) Best Fit Decreasing
We first sort the array of items in decreasing order by their sizes and apply the Best-Fit
algorithm as discussed above.
Algorithm
Read the inputs of items
Sort the array of items in decreasing order by their sizes
Apply Best-Fit algorithm
Visual Representation
Let us consider the same example as used above and bins of size 1
Assuming the sizes of the items be {0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6}.
Sorting them we get {0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1}
The Best fit Decreasing solution would be-
We will start with 0.7 and place it in the first bin
We then select 0.6 sized item. We cannot place it in bin 1. So, we place it in bin 2
We then select the 0.5 sized item. We cannot place it in any existing bin. So, we place it in bin 3
We then select 0.5 sized item. We can place it in bin 3
Doing the same for all the remaining items, we get a complete packing.
Thus only 4 bins are required, which is the same as the optimal solution.
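The offline heuristics are easy to express in code. Below is a minimal runnable C sketch of
First Fit Decreasing on the running example (Best Fit Decreasing differs only in using the
Best-Fit placement rule); as in the earlier sketch, the sizes are scaled by 10 so that all
arithmetic is exact.
#include <stdio.h>
#include <stdlib.h>

static int desc(const void *a, const void *b)
{
    return *(const int *)b - *(const int *)a;   /* sort descending */
}

int main(void)
{
    /* Running example, scaled by 10 for exact arithmetic */
    int size[] = {5, 7, 5, 2, 4, 2, 5, 1, 6};
    int n = sizeof size / sizeof size[0], c = 10;
    qsort(size, n, sizeof(int), desc);          /* {7,6,5,5,5,4,2,2,1} */

    int res = 0;
    int bin_rem[n];                             /* at most n bins */
    for (int i = 0; i < n; i++) {               /* plain First Fit */
        int j;
        for (j = 0; j < res; j++)
            if (bin_rem[j] >= size[i]) { bin_rem[j] -= size[i]; break; }
        if (j == res)
            bin_rem[res++] = c - size[i];
    }
    printf("First Fit Decreasing: %d bins\n", res);   /* prints 4 */
    return 0;
}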
Approximation Algorithms for the Traveling Salesman Problem
We solved the traveling salesman problem by exhaustive search in Section 3.4, mentioned
its decision version as one of the most well-known NP-complete problems in Section 11.3,
and saw how its instances can be solved by a branch-and-bound algorithm in Section 12.2.
Here, we consider several approximation algorithms, a small sample of dozens of such
algorithms suggested over the years for this famous problem.
But first let us answer the question of whether we should hope to find a polynomial-time
approximation algorithm with a finite performance ratio on all instances of the traveling
salesman problem. As the following theorem [Sah76] shows, the answer turns out to be no,
unless P = N P .
THEOREM 1 If P != NP, there exists no c-approximation algorithm for the traveling salesman
problem, i.e., there exists no polynomial-time approximation algorithm for this problem such
that for all its instances
f(sa) ≤ c f(s∗)
for some constant c.
Nearest-neighbour algorithm
The following well-known greedy algorithm is based on the nearest-neighbor heuristic:
always go next to the nearest unvisited city.
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been visited: go to the
unvisited city nearest the one visited last (ties can be broken arbitrarily).
Step 3 Return to the starting city.
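These steps translate directly into a short C program. The weight matrix below is our
reconstruction of a 4-city instance consistent with Example 1 that follows (the figure itself
is not reproduced in this text); the program prints the tour a − b − c − d − a of length 10.
#include <stdio.h>
#define N 4

int main(void)
{
    /* Symmetric weight matrix consistent with Example 1's tours
       (an assumption; Figure 12.10 is not reproduced here). */
    int w[N][N] = {
        {0, 1, 3, 6},
        {1, 0, 2, 3},
        {3, 2, 0, 1},
        {6, 3, 1, 0}
    };
    int visited[N] = {1, 0, 0, 0};     /* start at city 0 (= a) */
    int cur = 0, length = 0;
    printf("a");
    for (int step = 1; step < N; step++) {
        int next = -1;
        for (int v = 0; v < N; v++)    /* nearest unvisited city */
            if (!visited[v] && (next == -1 || w[cur][v] < w[cur][next]))
                next = v;
        visited[next] = 1;
        length += w[cur][next];
        cur = next;
        printf(" - %c", 'a' + next);
    }
    length += w[cur][0];               /* return to the start */
    printf(" - a, length = %d\n", length);   /* prints 10 */
    return 0;
}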
EXAMPLE 1 For the instance represented by the graph in Figure 12.10, with a as the starting
vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian
circuit) sa: a − b − c − d − a of length 10.
The optimal solution, as can be easily checked by exhaustive search, is the
tour s∗: a − b − d − c − a of length 8. Thus, the accuracy ratio of this approximation is
r(sa) = f(sa)/f(s∗) = 10/8 = 1.25.
Unfortunately, except for its simplicity, not many good things can be said about the nearest-
neighbor algorithm. In particular, nothing can be said in general about the accuracy of
solutions obtained by this algorithm because it can force us to traverse a very long edge on
the last leg of the tour. Indeed, if we change the weight of edge (a, d) from 6 to an arbitrary
large number w ≥ 6 in Example 1, the algorithm will still yield the tour a − b − c − d − a of
length 4 + w, and the optimal solution will still be a − b − d − c − a of length 8. Hence,
r(sa) = f(sa)/f(s∗) = (4 + w)/8,
which can be made as large as we wish by choosing an appropriately large value of w.
Hence, RA = ∞ for this algorithm (as it should be, according to Theorem 1).
Twice-around-the-tree algorithm
Step 1 Construct a minimum spanning tree of the graph corresponding to a given instance of
the traveling salesman problem.
Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning tree
recording all the vertices passed by. (This can be done by a DFS traversal.)
Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated occurrences
of the same vertex except the starting one at the end of the list. (This step is equivalent to
making shortcuts in the walk.) The vertices remaining on the list will form a Hamiltonian
circuit, which is the output of the algorithm.
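A sketch of the whole pipeline in C follows, on a hypothetical 5-city instance (not the
textbook's Figure 12.11a) whose weights happen to satisfy the triangle inequality, which the
approximation guarantee discussed below requires: Prim's algorithm builds the minimum
spanning tree (Step 1), and a preorder DFS of that tree produces the shortcut Hamiltonian
circuit (Steps 2 and 3).
#include <stdio.h>

#define N 5
#define INF 1000000

/* Hypothetical symmetric 5-city instance (an assumption) */
int w[N][N] = {
    {0, 2, 5, 4, 7},
    {2, 0, 3, 5, 6},
    {5, 3, 0, 6, 8},
    {4, 5, 6, 0, 3},
    {7, 6, 8, 3, 0}
};

int parent[N], seen[N], tour[N + 1], tlen = 0;

void preorder(int u)                      /* DFS of the MST = shortcut walk */
{
    seen[u] = 1;
    tour[tlen++] = u;
    for (int v = 0; v < N; v++)
        if (!seen[v] && parent[v] == u)
            preorder(v);
}

int main(void)
{
    int inMST[N] = {0}, key[N];
    for (int i = 0; i < N; i++) { key[i] = INF; parent[i] = -1; }
    key[0] = 0;
    for (int it = 0; it < N; it++) {      /* Step 1: Prim's MST, O(n^2) */
        int u = -1;
        for (int v = 0; v < N; v++)
            if (!inMST[v] && (u == -1 || key[v] < key[u])) u = v;
        inMST[u] = 1;
        for (int v = 0; v < N; v++)
            if (!inMST[v] && w[u][v] < key[v]) { key[v] = w[u][v]; parent[v] = u; }
    }
    preorder(0);                          /* Steps 2-3: walk + shortcuts */
    tour[tlen] = 0;                       /* close the circuit */
    int len = 0;
    for (int i = 0; i < N; i++) {
        printf("%c - ", 'a' + tour[i]);
        len += w[tour[i]][tour[i + 1]];
    }
    printf("a, length = %d\n", len);      /* a - b - c - d - e - a, length 21 */
    return 0;
}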
EXAMPLE 2 Let us apply this algorithm to the graph in Figure 12.11a. The minimum
spanning tree of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e) . A twice-
around-the-tree walk that starts and ends at a is
a, b, c, b, d, e, d, b, a.
Eliminating the second b (a shortcut from c to d), the second d, and the third b (a shortcut
from e to a) yields the Hamiltonian circuit
a, b, c, d, e, a
of length 39.
The tour obtained in Example 2 is not optimal. Although that instance is small enough to find
an optimal solution by either exhaustive search or branch-and-bound, we refrained from
doing so to reiterate a general point. As a rule, we do not know what the length of an
optimal tour actually is, and therefore we cannot compute the accuracy ratio f(sa)/f(s∗). For
the twice-around-the-tree algorithm, we can at least bound it from above, provided the graph
is Euclidean, i.e., its weights satisfy the triangle inequality: in that case f(sa) ≤ 2f(s∗),
so the algorithm is a 2-approximation.
Primality Testing (Fermat's Method)
Fermat's Little Theorem:
If n is a prime number, then for every a with 1 <= a < n,
a^(n-1) ≡ 1 (mod n)
OR
a^(n-1) % n = 1
Example: Since 5 is prime, 2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1],
3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5).
Since 7 is prime, 2^6 ≡ 1 (mod 7),
3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7),
5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7).
Algorithm
1) Repeat following k times:
a) Pick a randomly in the range [2, n - 2]
b) If gcd(a, n) ≠ 1, then return false
c) If a^(n-1) ≢ 1 (mod n), then return false
2) Return true [probably prime].
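This Fermat test is a Monte Carlo algorithm: a composite n may still pass all k rounds (in
particular, Carmichael numbers fool every coprime base), but a "false" answer is always
correct. Below is a minimal runnable C sketch, using fast modular exponentiation so that
a^(n-1) mod n is computed without overflow for moduli that fit in 32 bits.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Modular exponentiation: (a^e) mod n via repeated squaring */
unsigned long long modpow(unsigned long long a, unsigned long long e,
                          unsigned long long n)
{
    unsigned long long r = 1;
    a %= n;
    while (e > 0) {
        if (e & 1) r = r * a % n;
        a = a * a % n;
        e >>= 1;
    }
    return r;
}

unsigned long long gcd(unsigned long long a, unsigned long long b)
{
    while (b) { unsigned long long t = a % b; a = b; b = t; }
    return a;
}

/* Fermat test: 1 = probably prime, 0 = surely composite;
   k is the number of random bases tried */
int isProbablyPrime(unsigned long long n, int k)
{
    if (n < 4) return n == 2 || n == 3;            /* handle 1, 2, 3 */
    while (k-- > 0) {
        unsigned long long a = 2 + rand() % (n - 3);   /* a in [2, n-2] */
        if (gcd(a, n) != 1) return 0;
        if (modpow(a, n - 1, n) != 1) return 0;        /* Fermat fails */
    }
    return 1;   /* probably prime */
}

int main(void)
{
    srand((unsigned)time(NULL));
    printf("%d\n", isProbablyPrime(97, 5));    /* 1: 97 is prime */
    printf("%d\n", isProbablyPrime(100, 5));   /* 0: composite */
    return 0;
}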
Randomized Quick Sort
Unlike merge sort, we don't need to merge the two sorted halves, so Quicksort requires
less auxiliary space than Merge Sort, which is why it is often preferred to Merge Sort.
By choosing the pivot uniformly at random, we can guarantee an expected O(n log n)
running time of QuickSort on every input.
Algorithm for random pivoting
partition(arr[], lo, hi)
    pivot = arr[hi]
    i = lo                      // place for swapping
    for j := lo to hi - 1 do
        if arr[j] <= pivot then
            swap arr[i] with arr[j]
            i = i + 1
    swap arr[i] with arr[hi]
    return i

partition_r(arr[], lo, hi)
    r = random number in the range [lo, hi]
    swap arr[r] and arr[hi]
    return partition(arr, lo, hi)

quicksort(arr[], lo, hi)
    if lo < hi
        p = partition_r(arr, lo, hi)
        quicksort(arr, lo, p - 1)
        quicksort(arr, p + 1, hi)
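The pseudocode above translates directly into C. The following runnable sketch sorts a
sample array using a uniformly random pivot:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition around arr[hi]; returns the pivot's final index */
int partition(int arr[], int lo, int hi)
{
    int pivot = arr[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (arr[j] <= pivot) swap(&arr[i++], &arr[j]);
    swap(&arr[i], &arr[hi]);
    return i;
}

/* Move a randomly chosen element into the pivot position first */
int partition_r(int arr[], int lo, int hi)
{
    int r = lo + rand() % (hi - lo + 1);
    swap(&arr[r], &arr[hi]);
    return partition(arr, lo, hi);
}

void quicksort(int arr[], int lo, int hi)
{
    if (lo < hi) {
        int p = partition_r(arr, lo, hi);
        quicksort(arr, lo, p - 1);
        quicksort(arr, p + 1, hi);
    }
}

int main(void)
{
    srand((unsigned)time(NULL));
    int a[] = {10, 3, 6, 9, 2, 4, 15, 23};
    int n = sizeof a / sizeof a[0];
    quicksort(a, 0, n - 1);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* sorted output */
    printf("\n");
    return 0;
}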
Finding kth smallest element
Problem Description: Given an array A[] of n elements and a positive integer K, find the Kth
smallest element in the array. It is given that all array elements are distinct.
For Example :
Input : A[] = {10, 3, 6, 9, 2, 4, 15, 23}, K = 4
Output: 6
Input : A[] = {5, -8, 10, 37, 101, 2, 9}, K = 6
Output: 37
Quick-Select : Approach similar to quick sort
This approach is similar to the quick sort algorithm where we use the partition on the input
array recursively. But unlike quicksort, which processes both sides of the array recursively,
this algorithm works on only one side of the partition. We recur for either the left or right
side according to the position of pivot.
Solution Steps
1. Partition the array A[left .. right] into two subarrays A[left .. pos] and A[pos + 1 .. right] such
that each element of A[left .. pos] is less than or equal to each element of A[pos + 1 .. right].
2. Compute the number of elements in the subarray A[left .. pos], i.e. count = pos - left + 1.
3. If (count == K), then A[pos] is the Kth smallest element.
4. Otherwise, determine in which of the two subarrays A[left .. pos - 1] and A[pos + 1 .. right]
the Kth smallest element lies:
If (count > K), then the desired element lies on the left side of the partition.
If (count < K), then the desired element lies on the right side of the partition. Since we
already know count values that are no larger than the Kth smallest element of A[left .. right],
the desired element is the (K - count)th smallest element of A[pos + 1 .. right].
The base case is a single-element array, i.e. left == right; return A[left].
Pseudo-Code
// Original value for left = 0 and right = n-1
int kthSmallest(int A[], int left, int right, int K)
{
if (left == right)
return A[left]
int pos = partition(A, left, right)
count = pos - left + 1
if ( count == K )
return A[pos]
else if ( count > K )
return kthSmallest(A, left, pos-1, K)
else
return kthSmallest(A, pos+1, right, K-count)
}
int partition(int A[], int l, int r)
{
int x = A[r]
int i = l-1
for ( j = l to r-1 )
{
if (A[j] <= x)
{
i=i+1
swap(A[i], A[j])
}
}
swap(A[i+1], A[r])
return i+1
}
Complexity Analysis
Time Complexity: The worst-case time complexity of this algorithm is O(n²), but it can be
improved by choosing the pivot element randomly. With a randomly selected pivot, the
expected time complexity becomes linear, O(n).
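A runnable C sketch of randomized Quick-Select, combining the partition and selection
routines above with a random pivot:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

int partition(int A[], int l, int r)      /* Lomuto, pivot = A[r] */
{
    int x = A[r], i = l - 1;
    for (int j = l; j < r; j++)
        if (A[j] <= x) swap(&A[++i], &A[j]);
    swap(&A[i + 1], &A[r]);
    return i + 1;
}

/* Randomized Quick-Select: expected O(n) time */
int kthSmallest(int A[], int left, int right, int K)
{
    if (left == right) return A[left];
    int r = left + rand() % (right - left + 1);   /* random pivot */
    swap(&A[r], &A[right]);
    int pos = partition(A, left, right);
    int count = pos - left + 1;
    if (count == K) return A[pos];
    if (count > K)  return kthSmallest(A, left, pos - 1, K);
    return kthSmallest(A, pos + 1, right, K - count);
}

int main(void)
{
    srand((unsigned)time(NULL));
    int A[] = {10, 3, 6, 9, 2, 4, 15, 23};
    printf("%d\n", kthSmallest(A, 0, 7, 4));   /* prints 6 */
    return 0;
}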