Lecture 30.
P, NP and NP Complete Problems
1
Recap
Data compression is a technique for reducing the size of data
represented in text, audio or image form.
Two important compression techniques are lossy and lossless
compression.
LZW is the foremost technique for general purpose data
compression due to its simplicity and versatility.
LZW compression uses a code table, with 4096 as a common
choice for the number of table entries.
2
Optimization & Decision Problems
Decision problems
– Given an input and a question regarding a problem, determine if the
answer is yes or no
Optimization problems
– Find a solution with the “best” value
Optimization problems can be cast as decision problems that are
easier to study
– E.g.: Shortest path: G = unweighted directed graph
Optimization: find a path between u and v that uses the fewest edges
Decision: does a path exist from u to v consisting of at most k edges?
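The shortest-path example can be sketched in a few lines of Python. This is only an illustration: the adjacency-list dictionary, the function names path_within_k and fewest_edges, and the sample graph are assumptions made here, not part of the lecture. It shows the decision question answered with breadth-first search, and the optimization answer recovered by asking the decision question for k = 0, 1, 2, ...

```python
from collections import deque

def path_within_k(graph, u, v, k):
    """Decision version: is there a path from u to v using at most k edges?"""
    dist = {u: 0}                      # fewest edges found so far from u
    queue = deque([u])
    while queue:
        node = queue.popleft()
        if node == v:
            return dist[node] <= k
        for nxt in graph.get(node, []):
            if nxt not in dist:        # first visit is the shortest in BFS
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return False                       # v is unreachable from u

def fewest_edges(graph, u, v):
    """Optimization version, recovered by repeated decision queries."""
    # Assumes every vertex appears as a key; a simple path has < len(graph) edges.
    for k in range(len(graph)):
        if path_within_k(graph, u, v, k):
            return k
    return None

# Hypothetical example graph (unweighted, directed).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
print(path_within_k(graph, "a", "e", 2))  # False
print(fewest_edges(graph, "a", "e"))      # 3
```

Each decision query runs in time linear in the size of the graph, so the optimization answer costs at most |V| such queries.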
3
Algorithmic vs Problem Complexity
The algorithmic complexity of a computation is some measure
of how difficult it is to perform the computation (i.e., specific to
an algorithm)
The complexity of a computational problem or task is the
complexity of the algorithm with the lowest order of growth of
complexity for solving that problem or performing that task.
– e.g. the problem of searching an ordered list has at most O(lg n) time
complexity.
Computational Complexity: deals with classifying problems by
how hard they are.
4
Computational Complexity Theory
In computer science, computational complexity theory is the branch of the
theory of computation that studies the resources, or cost, of the computation
required to solve a given computational problem.
The relative computational difficulty of computable functions is the subject
matter of computational complexity.
Complexity theory analyzes the difficulty of computational problems in terms
of many different computational resources.
Example: looking up a word in a dictionary has only logarithmic
complexity, because a dictionary of twice the size needs just one more
opening (e.g. open exactly in the middle, and the problem is reduced to half).
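The dictionary argument is the idea behind binary search. The sketch below is an illustration added here (the function name binary_search_steps and the test data are assumptions); it counts how many times the "dictionary" is opened, so the effect of doubling the size can be observed directly.

```python
def binary_search_steps(sorted_items, target):
    """Return (index or -1, number of probes) for a sorted list."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1                     # one "opening" of the dictionary
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1               # keep the right half
        else:
            hi = mid - 1               # keep the left half
    return -1, steps

# Search for a value larger than every entry, which forces the longest search.
_, steps_small = binary_search_steps(list(range(1024)), 1024)
_, steps_large = binary_search_steps(list(range(2048)), 2048)
print(steps_small, steps_large)        # 11 12
```

Doubling the list from 1024 to 2048 entries adds exactly one probe in this run.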
5
Complexity Classes
A complexity class is the set of all of the computational problems which
can be solved using a certain amount of a certain computational resource.
The complexity class P is the set of decision problems that can be solved
by a deterministic machine in polynomial time. This class corresponds to
an intuitive idea of the problems which can be effectively solved in the
worst cases.
The complexity class NP is the set of decision problems that can be solved
by a non-deterministic machine in polynomial time. This class contains
many problems that people would like to be able to solve effectively. All
the problems in this class have the property that their solutions can be
checked effectively.
6
Complexity Classes (taxonomy)
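[Figure: taxonomy of decision problems, split into undecidable and decidable. Labelled classes: PSPACE, PSPACE-Complete, NP, Co-NP, NP-Complete, P, P-Complete, shown alongside the Chomsky hierarchy: Type 0 (recursively enumerable), Type 1 (context sensitive), Type 2 (context free), Type 3 (regular). Legend: one marker for a strict subset relationship, another where set equality is unknown.]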
7
Deterministic (Turing) Machine
Deterministic Turing machines are extremely basic symbol-manipulating
devices which, despite their simplicity, can be adapted to simulate the logic of
any computer that could possibly be constructed.
They were described in 1936 by Alan Turing. Though they were intended to be
technically feasible, Turing machines were not meant to be a practical computing
technology, but a thought experiment about the limits of mechanical computation;
thus they were not actually constructed.
Studying their abstract properties yields many insights into computer science and
complexity theory.
Turing machines capture the informal notion of effective method in logic and
mathematics, and provide a precise definition of an algorithm or 'mechanical
procedure'.
8
Nondeterministic (Turing) Machine
In theoretical computer science, a non-deterministic Turing
machine (NTM) is a Turing machine whose control mechanism
works like a non-deterministic finite automaton.
An ordinary (deterministic) Turing machine (DTM) has a
transition function that, for a given state and symbol under the
tape head, specifies three things:
– the symbol to be written to the tape
– the direction (left or right) in which the head should move
– the subsequent state of the finite control
An NTM differs in that the state and tape symbol no longer
uniquely specify these things - many different actions may apply
for the same combination of state and symbol.
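The difference between the two transition functions can be made concrete with a small simulator. This is a sketch under assumptions made here, not the lecture's construction: the function name ntm_accepts, the dictionary encoding of the transition relation, the step bound, and the toy machine are all illustrative. Each (state, symbol) pair maps to a set of (write, move, next state) triples; a deterministic machine is the special case where every set has at most one element.

```python
def ntm_accepts(delta, start, accept, tape_input, max_steps, blank="_"):
    """Explore every branch of a nondeterministic machine, breadth first."""
    # A configuration is (state, head position, tape contents).
    frontier = {(start, 0, tuple(tape_input) or (blank,))}
    for _step in range(max_steps + 1):
        if any(state == accept for state, _head, _tape in frontier):
            return True
        next_frontier = set()
        for state, head, tape in frontier:
            # Several actions may apply to the same (state, symbol) pair.
            for write, move, new_state in delta.get((state, tape[head]), ()):
                cells = list(tape)
                cells[head] = write
                new_head = head + (1 if move == "R" else -1)
                if new_head < 0:               # grow the tape to the left
                    cells.insert(0, blank)
                    new_head = 0
                elif new_head == len(cells):   # grow the tape to the right
                    cells.append(blank)
                next_frontier.add((new_state, new_head, tuple(cells)))
        frontier = next_frontier
    return False

# Toy machine: accept binary strings containing "11" by guessing where it starts.
delta = {
    ("q0", "0"): {("0", "R", "q0")},
    ("q0", "1"): {("1", "R", "q0"), ("1", "R", "q1")},  # two choices
    ("q1", "1"): {("1", "R", "acc")},
}
print(ntm_accepts(delta, "q0", "acc", "0110", max_steps=10))  # True
print(ntm_accepts(delta, "q0", "acc", "0101", max_steps=10))  # False
```

The simulator keeps every reachable configuration, so the deterministic simulation may take far longer than any single branch of the nondeterministic machine.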
9
Complexity Class P
P is the complexity class containing decision problems which can be solved by a
deterministic Turing machine using a polynomial amount of computation time, or
polynomial time.
P is often taken to be the class of computational problems which are "efficiently solvable"
or "tractable".
Problems that are solvable in theory, but cannot be solved in practice, are called
intractable.
There exist problems in P which are intractable in practical terms; for example, some
require at least n^1000000 operations.
P is known to contain many natural problems, including the decision
versions of linear programming, calculating the greatest common divisor,
and finding a maximum matching. In 2002, it was shown that the problem
of determining if a number is prime is in P.
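As one concrete instance of a problem in P, the greatest common divisor can be computed with Euclid's algorithm. The sketch below is illustrative (the function name gcd_with_steps is an assumption made here); it also counts loop iterations, which grow roughly in proportion to the number of digits of the inputs rather than to their magnitude.

```python
def gcd_with_steps(a, b):
    """Euclid's algorithm; returns (gcd, number of loop iterations)."""
    steps = 0
    while b:
        a, b = b, a % b    # replace (a, b) by (b, a mod b)
        steps += 1
    return a, steps

print(gcd_with_steps(1071, 462))   # (21, 3)
```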
10
Complexity Class NP
In computational complexity theory, NP ("Non-deterministic
Polynomial time") is the set of decision problems solvable in
polynomial time on a non-deterministic Turing machine.
Equivalently, it is the set of decision problems whose "yes" answers can be
verified by a deterministic Turing machine in polynomial time.
All the problems in this class have the property that their solutions
can be checked effectively.
This class contains many problems that people would like to be
able to solve effectively, including
– the Boolean satisfiability problem (SAT)
– the Hamiltonian path problem (special case of TSP)
– the Vertex cover problem.
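To illustrate what "checked effectively" means for SAT, here is a minimal verifier sketch. The encoding is an assumption made for this example: a formula in conjunctive normal form is a list of clauses, each clause a list of non-zero integers, where i stands for variable i and -i for its negation. The function name satisfies is also hypothetical.

```python
def satisfies(cnf, assignment):
    """Check a truth assignment against a CNF formula in one pass."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
print(satisfies(cnf, {1: True, 2: True, 3: False}))   # True
print(satisfies(cnf, {1: True, 2: False, 3: True}))   # False
```

The check touches each literal once, so it runs in time linear in the size of the formula, even though no polynomial-time method for finding a satisfying assignment is known.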
11
Complexity Class NP-Complete
In complexity theory, the NP-complete problems are the most difficult
problems in NP ("non-deterministic polynomial time") in the sense that they
are the ones most likely not to be in P.
If one could find a way to solve any NP-complete problem quickly (in
polynomial time), then they could use that algorithm to solve all NP problems
quickly.
At present, all known algorithms for NP-complete problems require time that
is superpolynomial in the input size.
To solve an NP-complete problem for any nontrivial problem size, generally
one of the following approaches is used:
– Approximation
– Probabilistic
– Special cases
– Heuristic
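As an example of the approximation approach, the sketch below uses a standard technique for vertex cover that is not taken from the lecture: repeatedly pick an uncovered edge and add both of its endpoints. The resulting cover is at most twice the size of an optimal one. The function name approx_vertex_cover and the sample graph are assumptions made here.

```python
def approx_vertex_cover(edges):
    """Greedy 2-approximation: take both endpoints of each uncovered edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge is still uncovered
            cover.update((u, v))
    return cover

# Path 1-2-3-4: an optimal cover is {2, 3}; the sketch returns all four vertices.
edges = [(1, 2), (2, 3), (3, 4)]
cover = approx_vertex_cover(edges)
print(sorted(cover))   # [1, 2, 3, 4]
```

The answer is not optimal, but it is found in a single pass over the edges and its size is guaranteed to stay within a factor of two of the optimum.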
12
Complexity Class NP-Complete (cont)
Some well-known problems that are NP-complete are:
– Boolean satisfiability problem (SAT)
– N-puzzle
– Knapsack problem
– Hamiltonian cycle problem
– Traveling salesman problem
– Subgraph isomorphism problem
– Subset sum problem
– Clique problem
– Vertex cover problem
– Independent set problem
– Graph coloring problem
– Minesweeper
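Several problems in this list are closely linked. For instance, a set S of vertices is independent exactly when the remaining vertices V \ S form a vertex cover, so a solver for one problem answers the other. The sketch below illustrates this link; the function names and the brute-force solver are assumptions made for this example, and the brute-force search takes exponential time.

```python
from itertools import combinations

def has_vertex_cover(vertices, edges, size):
    """Brute force: is there a vertex cover with at most `size` vertices?"""
    if size < 0:
        return False
    size = min(size, len(vertices))
    # A cover of at most `size` vertices can be padded to exactly `size`.
    return any(
        all(u in chosen or v in chosen for u, v in edges)
        for chosen in map(set, combinations(vertices, size))
    )

def has_independent_set(vertices, edges, k):
    """Answer the independent-set question by asking about vertex cover."""
    return has_vertex_cover(vertices, edges, len(vertices) - k)

# Triangle 1-2-3 with an extra vertex 4 attached to 3.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(has_independent_set(vertices, edges, 2))   # True, e.g. {1, 4}
print(has_independent_set(vertices, edges, 3))   # False
```

The translation between the two questions is a single subtraction, which is why a fast algorithm for either problem would give a fast algorithm for the other.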
13
Complexity Classes P and NP
14
Class of “P” Problems
Class P consists of (decision) problems that are solvable in
polynomial time
Polynomial-time algorithms
– Worst-case running time is O(n^k), for some constant k
Examples of polynomial time:
– O(n^2), O(n^3), O(1), O(n lg n)
Examples of non-polynomial time:
– O(2^n), O(n^n), O(n!)
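The gap between these growth rates shows up even for small inputs. The following sketch simply evaluates a few of the functions above; the helper name growth_row and the chosen values of n are assumptions made for illustration.

```python
import math

def growth_row(n):
    """Evaluate some polynomial and non-polynomial functions at n."""
    return {
        "n^2": n ** 2,
        "n^3": n ** 3,
        "2^n": 2 ** n,
        "n!": math.factorial(n),
    }

for n in (10, 20, 30):
    print(n, growth_row(n))
```

At n = 10 the values are 100, 1000, 1024 and 3628800; by n = 30 the exponential and factorial entries dwarf the polynomial ones.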
15
Tractable/Intractable Problems
Problems in P are also called tractable
Problems not in P are intractable or unsolvable
– Can be solved in reasonable time only for small inputs
– Or, can not be solved at all
Are non-polynomial algorithms always worse than polynomial
algorithms?
– n^1,000,000 is technically tractable, but really impossible
– n^(log log log n) is technically intractable, but easy
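To see why n^(log log log n) is "easy" in practice, it helps to evaluate the exponent for realistic input sizes. The sketch below assumes base-2 logarithms (the slide does not state a base) and uses a hypothetical helper name, triple_log_exponent.

```python
import math

def triple_log_exponent(n):
    """Exponent log log log n, with base-2 logarithms assumed."""
    return math.log2(math.log2(math.log2(n)))

for n in (10 ** 6, 2 ** 64):
    print(n, round(triple_log_exponent(n), 3))
```

For a million items the exponent is only a little above 2, and even for n = 2^64 it stays below 3, so the running time behaves like a low-degree polynomial on any input that fits in a real machine.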
16
Example of Unsolvable Problem
Turing discovered in the 1930’s that there are
problems unsolvable by any algorithm.
The most famous of them is the halting problem
– Given an arbitrary algorithm and its input, will that
algorithm eventually halt, or will it continue forever in an
“infinite loop?”
17
Examples of Intractable Problems
18
Intractable Problems
Can be classified in various categories based on their degree
of difficulty, e.g.,
– NP
– NP-complete
– NP-hard
Let’s define NP algorithms and NP problems
19
Nondeterministic and NP Algorithms
Nondeterministic algorithm = two-stage procedure:
1) Nondeterministic (“guessing”) stage:
generate an arbitrary string that can be thought of as a candidate
solution (“certificate”)
2) Deterministic (“verification”) stage:
take the certificate and the problem instance as input and return YES if the
certificate represents a solution
NP algorithms (nondeterministic polynomial)
– the verification stage is polynomial
20
Class of “NP” Problems
Class NP consists of problems that could be solved by NP
algorithms
– i.e., verifiable in polynomial time
If we were given a “certificate” of a solution, we could verify that
the certificate is correct in time polynomial in the size of the input
Warning: NP does not mean “non-polynomial”
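The guess-and-verify view can be sketched for the subset sum problem. Everything named here is an assumption made for illustration: the certificate is taken to be a tuple of distinct indices, and the guessing stage is replaced by a deterministic loop over all candidate certificates, which is exponential in the worst case. Only the verification stage is polynomial.

```python
from itertools import combinations

def verify_subset_sum(numbers, target, certificate):
    """Verification stage: polynomial-time check of one certificate."""
    return (
        len(set(certificate)) == len(certificate)              # no repeats
        and all(0 <= i < len(numbers) for i in certificate)    # valid indices
        and sum(numbers[i] for i in certificate) == target
    )

def decide_by_exhaustive_guessing(numbers, target):
    """Deterministic stand-in for the guessing stage: try every certificate."""
    indices = range(len(numbers))
    for size in range(len(numbers) + 1):
        for certificate in combinations(indices, size):
            if verify_subset_sum(numbers, target, certificate):
                return certificate
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(decide_by_exhaustive_guessing(numbers, 9))    # (2, 4), i.e. 4 + 5
print(decide_by_exhaustive_guessing(numbers, 30))   # None
```

A nondeterministic machine would "guess" the right certificate in one step; a deterministic program has to search for it, which is where the exponential cost comes from.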
21
Hamiltonian Cycle
Given: a directed graph G = (V, E), determine whether it has a simple cycle
that contains each vertex in V
– Each vertex can only be visited once
Certificate:
– Sequence: v1, v2, v3, …, v|V|
[Figure: two example graphs, one hamiltonian and one not hamiltonian]
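Checking such a certificate is straightforward, which is what places the problem in NP. The sketch below is an illustration under assumptions made here: the graph is a dictionary mapping each vertex to its list of successors, and is_hamiltonian_cycle is a hypothetical name.

```python
def is_hamiltonian_cycle(graph, order):
    """Verify that `order` visits every vertex once and follows existing edges."""
    vertices = set(graph)
    # Each vertex must appear exactly once in the sequence.
    if len(order) != len(vertices) or set(order) != vertices:
        return False
    # Consecutive vertices, including last back to first, must be joined by an edge.
    return all(
        order[(i + 1) % len(order)] in graph[order[i]]
        for i in range(len(order))
    )

# Hypothetical directed graph with the cycle 1 -> 2 -> 3 -> 4 -> 1.
graph = {1: [2], 2: [3], 3: [4], 4: [1, 2]}
print(is_hamiltonian_cycle(graph, [1, 2, 3, 4]))   # True
print(is_hamiltonian_cycle(graph, [1, 3, 2, 4]))   # False
```

The verifier makes one pass over the sequence, so it runs in polynomial time, while no polynomial-time method is known for finding the sequence in the first place.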
22
Is P = NP?
Any problem in P is also in NP: P ⊆ NP
The big (and open) question is whether NP ⊆ P, that is, whether P = NP
– i.e., if it is always easy to check a solution, should it also be easy to
find a solution?
Most computer scientists believe that this is false but we do not
have a proof …
23
NP-Completeness (informally)
NP-complete problems are defined as the hardest problems in NP
[Figure: NP drawn as a region containing P and NP-complete]
Most practical problems turn out to be either in P or NP-complete.
Study NP-complete problems …
24
Summary
Decision problems
– Given an input and a question regarding a problem, determine if the answer is yes or no
Optimization problems
– Find a solution with the “best” value
NP-complete: problems that are 'complete' in NP, i.e. the most difficult
to solve in NP
NP-hard: 'at least' as hard as NP (but not necessarily in NP)
NP-easy: 'at most' as hard as NP (but not necessarily in NP)
NP-equivalent: exactly as difficult as NP (but not necessarily in NP)
25
In next Lecture
In the next lecture, we will revise the first fifteen lectures.
26