0% found this document useful (0 votes)
7 views41 pages

Daa Unit-5

The document discusses the Branch and Bound algorithm, a systematic method for solving combinatorial optimization problems such as the Traveling Salesman Problem (TSP) and the 0/1 knapsack problem. It outlines the principles of branching and bounding, the structure of state space trees, and the algorithm's application in various optimization scenarios. Additionally, it covers NP-Hard and NP-Complete problems, computability classes, and advanced topics like approximation algorithms and randomized methods.

Uploaded by

raju
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views41 pages

Daa Unit-5

The document discusses the Branch and Bound algorithm, a systematic method for solving combinatorial optimization problems such as the Traveling Salesman Problem (TSP) and the 0/1 knapsack problem. It outlines the principles of branching and bounding, the structure of state space trees, and the algorithm's application in various optimization scenarios. Additionally, it covers NP-Hard and NP-Complete problems, computability classes, and advanced topics like approximation algorithms and randomized methods.

Uploaded by

raju
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

UNIT – V

Branch and Bound: General method, applications - Traveling salesperson problem, 0/1 knapsack
problem - LC Branch and Bound solution, FIFO Branch and Bound solution.
NP-Hard and NP-Complete problems: Basic concepts, non-deterministic algorithms, NP-Hard and NP-
Complete classes, Cook’s theorem.

Tractable and Intractable Problems: Computability of Algorithms, Computability classes – P, NP, NP-
complete and NP-hard, Cook’s theorem. Advanced Topics: Approximation algorithms, Randomized
Branch and Bound:
algorithms.

Branch and bound is an algorithm design paradigm which is generally used for solving combinatorial
optimization problems. These problems are typically exponential in terms of time complexity and may
require exploring all possible permutations in the worst case. The Branch and Bound Algorithm technique
solves these problems relatively quickly.

The Branch and Bound Algorithm is a method used in combinatorial optimization problems to
systematically search for the best solution. It works by dividing the problem into smaller subproblems, or
branches, and then eliminating certain branches based on bounds on the optimal solution. This process continues
until the best solution is found or all branches have been explored. Branch and Bound is commonly used in
problems like the traveling salesman and job scheduling.

Branch and bound is a systematic method for solving optimization problems. Branch and bound is a
rather general optimization technique that applies where the greedy method and dynamic programming fail.
However, it is much lower. Indeed, it often leads to exponential time complexities in the worst case. On the
other hand, if applied carefully, it can lead to algorithms that run reasonably fast on average.
Branch-and-bound procedure requires two tools.
1. Branching
⮚ a way of covering the feasible region by several smaller feasible sub-regions (ideally,
splitting into sub-regions).
⮚ since the procedure may be repeated recursively to each of the
sub regions and all produced sub-regions naturally form a tree structure, called search tree
or branch-and-bound-tree. Its nodes are the constructed sub-regions.
2. Bounding
⮚ which is a fast way of finding upper and lower bounds for the optimal
solution within a feasible sub-region

The Branch and bound technique like Backtracking explores the implicit graph and deals with the
optimal solution to a given problem. In this technique at each stage we calculate the bound for a particular
node and check whether this bound will be able to give the solution or not .If we find that at any node the
solution so obtained is appropriate but the remaining solution is not leading to a best case then we leave
this node without exploring.

Terminologies

⮚ Each node in this tree defines a problem state


⮚ All paths from the root to other nodes define the state space of the problem
⮚ Solution states are those problem states “S” for which the path from the root to ’s’ defines a tuple in
the solution space.
⮚ Answer states are those solution states ’s’ for which the path from the root to “S”defines a tuple
that is a member of the set of solutions.
⮚ The tree organization of the solution space referred to as the state space tree.
⮚ A node which has been generated and all of whose children have not yet been generated is called a
live node.
⮚ The live node whose children are currently being generated is called the E-node (node being
expanded).
⮚ A dead node is a generated node which is not to be expanded further or all of whose children have
been generated.
The term branch and bound refers to all state space search methods in which all children of E-node are
generated before any other live node can become the E-node.

BRANCH AND BOUND FOR TSP (Traveling Salesman Problem)

TSP includes as a salesperson who has to visit a number of cities during a tour and the condition is to visit
all the cities exactly once and return back to the same city where the person started.

Basic steps

Let G=(V,E) be a direct graph defining an instance of the TSP.

1. This graph is first represented by a cost matrix where


cij = the cost of edge , if there is a path between vertex i and vertex j cij =
∞ , if there is no path
2. Convert cost matrix into reduced matrix i.e.,every row and column should contain at least
one zero entry.
3. Cost of the reduced matrix is the sum of elements that are subtracted from rows and
columns of cost matrix to make it reduced.
4. Make the state space tree for reduced matrix.
5. To find the next E-node, find the least cost valued node by calculating the reduced cost matrix with
every node.
6. If (i, j) edge is to be included, then there are three conditions to accomplish this task:
1. Change all entries in row i and column j of A to ∞ .
2. set A [j, 1] =∞
3. Reduce all rows and columns in resulting matrix except for rows and columns containing ∞.
7. Calculate the cost of the matrix where
cost = L + cost (i, j) + r
where L = cost of original reduced cost matrix and r= new
reduced cost matrix
8. Repeat the above steps for all the nodes until all the nodes are generated and we get a path.

General method-

The principle consists in partitioning the solution space into disjoint subsets, which are represented by the
nodes of the branching tree. Then, the algorithm explores the branches of the tree according to a route
strategy. To avoid exploring the entire tree, before creating a new node in the tree, the algorithm evaluates
the node by comparing the value of the best possible solution, which could be found in the corresponding
subtree, with the best current solution. If a better solution cannot belong to the subtree rooted at the
considered node, the subtree is discarded. Otherwise, the exploration process continues. This method can be
seen as a divide-and-conquer approach. Lots of combinatorial optimization problems may be solved, more
or less efficiently, using a branch and bound strategy.

Branch-and-bound this is one more problem solving strategy a method by which we can solve a problem this
is similar to backtracking in the sense that it also uses a state-space stream for solving the problem solution
by represented like a state space tree but it is useful for solving optimization problem and only minimization
problem not maximization problem anyway we can convert a maximization problem into minimization and
we can solve it.

Applications :

● Travelling salesperson problem


● 0/1 knapsack problem
● Job Sequencing with deadline Problems

Travelling salesperson problem-

0/1 knapsack problem -

Branch and Bound solution is the best suited method when item weights are not integers. To check if a
particular node can give us a better solution or not, we compute the optimal solution (through the node)
using the Greedy approach. If the solution computed by the Greedy approach itself is more than the best
so far, then we can’t get a better solution through the node.
Branch and Bound

⮲ Definitions:
 Branch and Bound is a state space search method in which all the
children of a node are generated before expanding any of its children.

 Live-node: A node that has not been expanded.

 It is similar to backtracking technique but uses BFS-like search.

1 FIFO Branch 1& LIFO Branch & 1Bound (D-Search)


Bound (BFS) Children of E-node are inserted
Children of E- in a stack.
2 3 4 5 node2 are 3 4 5 2 3 45
inserted in a
Live Node: 2, 3, 4, and 5 queue.
6 7 8 9 8 96 7

● Dead-node: A node that has been expanded


● Solution-node

⮲ LC-Search (Least Cost Search):


 The selection rule for the next E-node in FIFO or LIFO branch-and-
bound is sometimes “blind”. i.e. the selection
rule does not give any preference to a node that has a very good chance of
getting the search to an answer node quickly.

 The search for an answer node can often be speeded by using an “intelligent”
ranking function, also called an
^

approximate cost function C

^
 Expanded-node (E-node): is the live node with the best C

value

⮲ Requirements
 Branching: A set of solutions, which is represented by a node, can be
partitioned into mutually exclusive sets. Each subset in the partition is
represented by a child of the original node.

 Lower bounding: An algorithm is available for calculating a lower bound


on the cost of any solution in a given subset.

⮲ Searching: Least-cost search (LC)


 Cost and approximation

✔Each node, X, in the search tree is associated with a cost: C(X)

✔C(X) = cost of reaching the current node, X (E- node), from


the root + the cost of reaching an answer node from X.
C(X) = g(X) + h(X)
^

✔Get an approximation of C(x), (x) such that

^
C (x) ≤C(x), and

C (x) = C(x) if x is a solution-node.

✔The approximation part of C (x) is

h(x)=the cost of reaching a solution-node from X, not known.

● Least-cost search:
^
The next E-node is the one with least C

⮲ Example: 8-puzzle
^
= g(x) +h(x)
 Cost function: C

where
h(x) = the number of misplaced tiles and
g(x) = the number of moves so far

 Assumption: move one tile in any direction cost 1.


Initial State Final State
1 2 3 1 2 3
5 6 5 8 6
7 8 4 7 4

1 2 3
5 6
7 8 4

^ ^
C=1+4=5 C  1  4 ^ 5
C=1+2=3

1 2 3 1 2 3 1 2
5 6 4 5 6 5 6 3
7 8 7 8 4 7 8 4

^ ^
^
C=2+1=3 C=2+
C=2+3=5
3=5

1 2 3 1 2 3 1 3
5 8 6 5 6 5 2 6
7 4 7 8 4 7 8 4
^
C=3+2=5 ^
C=3+0=3

1 2 3 1 2 3
5 8 6 5 8 6
7 4 7 4

Note: In case of tie, choose the leftmost node.


⮲ Algorithm:
/* live_node_set: set to hold the live nodes at any time */
/* lowcost: variable to hold the cost of the best cost at any given node
*/
Begin
Lowcost = ∞;
While live_node_set ≠∞ do
- choose a branching node, k, such that k
∈live_node_set; /* k is a E-node */
- live_node_set= live_node_set - {k};
- Generate the children of node k and the
corresponding lower bounds;
Sk={(i,zi): i is child of k and zi its lower bound}
- For each element (i,zi) in Sk do

- If zi > U
- then

- Kill child i; /* i is a child node */

- Else
If child i is a solution
Then
U =zi; current best = child i;
Else
Add child i to live_node_set;
Endif;
Endif;
- Endfor;
Endwhile;
⮲ Travelling Salesman Problem: A Branch and Bound algorithm
 Definition: Find a tour of minimum cost starting from a node S going
through other nodes only once and returning to the starting point S.

 Definitions:

 A row(column) is said to be reduced iff it contains at least one


zero and all remaining entries are non- negative.

 A matrix is reduced iff every row and column is reduced.

 Branching:

 Each node splits the remaining solutions into two groups:


those that include a particular edge and those that exclude
that edge

 Each node has a lower bound.

 Example: Given a graph G=(V,E), let <i,j> ∈ E,

L
All Solutions

L1 L2

Solutions with <i,j> Solutions without <i,j>


 Bounding: How to compute the cost of each node?

 Subtract of a constant from any row and any column does not
change the optimal solution (The path).

 The cost of the path changes but not the path itself.

 Let A be the cost matrix of a G=(V,E).

 The cost of each node in the search tree is computed as follows:

● Let R be a node in the tree and A(R) its reduced


matrix
● The cost of the child (R), S:

● Set row i and column j to infinity


● Set A(j,1) to infinity
● Reduced S and let RCL be the
reduced cost.
● C(S) = C(R) + RCL+A(i,j)

 Get the reduced matrix A' of A and let L be the value


subtracted from A.
 L: represents the lower bound of the path solution
 The cost of the path is exactly reduced by L.

 What to determine the branching edge?

 The rule favors a solution through left subtree rather than


right subtree, i.e., the matrix is reduced by a dimension.
 Note that the right subtree only sets the branching edge to
infinity.

 Pick the edge that causes the greatest increase in the lower
bound of the right subtree, i.e., the lower bound of the root
of the right subtree is greater.

●Example:
o The reduced cost matrix is done as follows:
- Change all entries of row i and column j to

infinity
- Set A(j,1) to infinity (assuming the start node is 1)

- Reduce all rows first and then column of the resulting


matrix
● Given the following cost matrix:

● State Space Tree:


1 25

Vertex = 2 Vertex = 5
Vertex = 3 Vertex = 4
2 3 4 5
35 53 25 31

Vertex = 2 Vertex = 3 Vertex = 5


36
6 28 7 50 8

Vertex = 3 Vertex = 5

9 52 10 28

Vertex = 3

28
11
● The TSP starts from node 1: Node 1
o Reduced Matrix: To get the lower bound of the path starting at
node 1
▪ Row # 1: reduce by 10

▪ Row #2: reduce 2

▪ Row #3: reduce by 2


▪ Row # 4: Reduce by 3:

▪ Row # 4: Reduce by 4

▪ Column 1: Reduce by 1

▪ Column 2: It is reduced.
▪ Column 3: Reduce by 3

𝖥𝑖𝑛𝑓 1010 17
17 00 11
⎢ 12
𝖥𝑖𝑛𝑓𝑖𝑛𝑓 11
11 22 00
⎢ 0 3𝑖𝑛𝑓 𝑖𝑛𝑓 0 2
⎢ ⎢ 12
⎢⎢ 15 3 12 𝑖𝑛𝑓 0
⎣ 110 03 0 12
⎢ 𝑖𝑛𝑓 2

0

15 3 12 0

𝑖𝑛𝑓
⎣ 11 0 0 12 𝑖𝑛𝑓

▪ Column 4: It is reduced.

▪ Column 5: It is reduced.

▪ The reduced cost is: RCL = 25

▪ So the cost of node 1 is:


● Cost(1) = 25
▪ The reduced matrix is:

cost(1) = 25
𝖥𝑖𝑛𝑓 ⎤
⎢ 12 ⎥
⎢0 3 2 0⎥
1 17 0 1

𝑖 𝑖𝑛𝑓02
⎢ 15 0 ⎥
0

⎢ 3 𝑛 12𝑖𝑛𝑓 ⎥
11

⎣ 11 0 𝑓 0 12 𝑖𝑛𝑓 ⎦
● Choose to go to vertex 2: Node 2
- Cost of edge <1,2> is: A(1,2) = 10
- Set row #1 = inf since we are choosing edge <1,2>

- Set column # 2 = inf since we are choosing edge


<1,2>
- Set A(2,1) = inf

- The resulting cost matrix is:

- The matrix is reduced:


o RCL = 0
- The cost of node 2 (Considering vertex 2 from vertex 1) is:

▪ Cost(2) = cost(1) + A(1,2) = 25 + 10 = 35

● Choose to go to vertex 3: Node 3


- Cost of edge <1,3> is: A(1,3) = 17 (In the reduced matrix
- Set row #1 = inf since we are starting from node 1
- Set column # 3 = inf since we are choosing edge
<1,3>
- Set A(3,1) = inf
- The resulting cost matrix is:

● Reduce the matrix:


o Rows are reduced
o The columns are reduced except for column # 1:
▪ Reduce column 1 by 11:

● The lower bound is:


o RCL = 11
● The cost of going through node 3 is:
o cost(3) = cost(1) + RCL + A(1,3) = 25 + 11 + 17
= 53
● Choose to go to vertex 4: Node 4
o Remember that the cost matrix is the one that was reduced at the
starting vertex 1
o Cost of edge <1,4> is: A(1,4) = 0
o Set row #1 = inf since we are starting from node 1
o Set column # 4 = inf since we are choosing edge
<1,4>
o Set A(4,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Rows are reduced
▪ Columns are reduced
o The lower bound is: RCL = 0
o The cost of going through node 4 is:
▪ cost(4) = cost(1) + RCL + A(1,4) = 25 + 0
+ 0 = 25
● Choose to go to vertex 5: Node 5
o Remember that the cost matrix is the one that was reduced at starting
vertex 1
o Cost of edge <1,5> is: A(1,5) = 1
o Set row #1 = inf since we are starting from node 1
o Set column # 5 = inf since we are choosing edge
<1,5>
o Set A(5,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Reduce rows:
● Reduce row #2: Reduce by 2

● Reduce row #4: Reduce by 3


▪ Columns are reduced

o The lower bound is:


▪ RCL = 2 + 3 = 5
o The cost of going through node 5 is:
▪ cost(5) = cost(1) + RCL + A(1,5) = 25 + 5 + 1 = 31

● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 4: cost(4) = 25, path: 1->4
▪ 5: cost(5) = 31, path: 1->5
o Explore the node with the lowest cost: Node 4 has a cost of 25
o Vertices to be explored from node 4: 2, 3, and 5
o Now we are starting from the cost matrix at node 4 is:

Cost(4) = 25
𝖥𝑖𝑛𝑓𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 ⎤
⎢ 12𝑖𝑛𝑓 11 𝑖𝑛𝑓 0 ⎥
0 2:3Node𝑖𝑛𝑓
● Choose to go⎢to vertex 6 (path𝑖𝑛𝑓 2 ⎥
o Cost of⎢edge𝑖𝑛𝑓 12= 3 𝑖𝑛𝑓 0
<4,2> is:3A(4,2) ⎥

is 1->4->2)

⎢ #4 11 0 we 0𝑖𝑛𝑓 𝑖𝑛𝑓⎦

o Set row = inf since are considering edge
<4,2>
o Set column # 2 = inf since we are considering edge <4,2>
o Set A(2,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Rows are reduced
▪ Columns are reduced
o The lower bound is: RCL = 0
o The cost of going through node 2 is:
▪ cost(6) = cost(4) + RCL + A(4,2) = 25 + 0 + 3 = 28

● Choose to go to vertex 3: Node 7 ( path is 1->4->3 )


o Cost of edge <4,3> is: A(4,3) = 12
o Set row #4 = inf since we are considering edge
<4,3>
o Set column # 3 = inf since we are considering edge <4,3>
o Set A(3,1) = inf
o The resulting cost matrix is:
o Reduce the matrix:
▪ Reduce row #3: by 2:

▪ Reduce column # 1: by 11
o The lower bound is: RCL = 13

o So the RCL of node 7 (Considering vertex 3 from vertex 4) is:


▪ Cost(7) = cost(4) + RCL + A(4,3) = 25 + 13
+ 12 = 50

● Choose to go to vertex 5: Node 8 ( path is 1->4->5 )


o Cost of edge <4,5> is: A(4,5) = 0
o Set row #4 = inf since we are considering edge
<4,5>
o Set column # 5 = inf since we are considering edge <4,5>
o Set A(5,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Reduced row 2: by 11
▪ Columns are reduced
o The lower bound is: RCL = 11
o So the cost of node 8 (Considering vertex 5 from vertex 4) is:
▪ Cost(8) = cost(4) + RCL + A(4,5) = 25 + 11
+ 0 = 36

● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 5: cost(5) = 31, path: 1->5
▪ 6: cost(6) = 28, path: 1->4->2
▪ 7: cost(7) = 50, path: 1->4->3
▪ 8: cost(8) = 36, path: 1->4->5
o Explore the node with the lowest cost: Node 6 has a cost of 28
o Vertices to be explored from node 6: 3 and 5

o Now we are starting from the cost matrix at node 6 is:

Cost(6) = 28
𝖥𝑖𝑛𝑓
𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓

𝑖𝑛 1 𝑖𝑛 0

𝑓 1 𝑓

𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 2

𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓


𝑖𝑛𝑓 0 𝑖𝑛𝑓 𝑖𝑛𝑓

● Choose to go to vertex 3: Node 9 ( path is 1->4->2->3


)
o Cost of edge <2,3> is: A(2,3) = 11
o Set row #2 = inf since we are considering edge
<2,3>
o Set column # 3 = inf since we are considering edge <2,3>
o Set A(3,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Reduce row #3: by 2

▪ Reduce column # 1: by 11

o The lower bound is: RCL = 2 +11 = 13


o So the cost of node 9 (Considering vertex 3 from vertex 2) is:
▪ Cost(9) = cost(6) + RCL + A(2,3) = 28 + 13
+ 11 = 52

● Choose to go to vertex 5: Node 10 ( path is 1->4->2-


>5 )
o Cost of edge <2,5> is: A(2,5) = 0
o Set row #2 = inf since we are considering edge
<2,3>
o Set column # 3 = inf since we are considering edge <2,3>
o Set A(5,1) = inf
o The resulting cost matrix is:
Reduce the matrix:
▪ Rows reduced

▪ Columns reduced
o The lower bound is: RCL = 0
o So the cost of node 10 (Considering vertex 5 from vertex 2) is:
▪ Cost(10) = cost(6) + RCL + A(2,3) = 28 + 0
+ 0 = 28

● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 5: cost(5) = 31, path: 1->5
▪ 7: cost(7) = 50, path: 1->4->3
▪ 8: cost(8) = 36, path: 1->4->5
▪ 9: cost(9) = 52, path: 1->4->2->3
▪ 10: cost(2) = 28, path: 1->4->2->5

o Explore the node with the lowest cost: Node 10 has a cost of 28
o Vertices to be explored from node 10: 3
o Now we are starting from the cost matrix at node 10 is:
● Choose to go to vertex 3: Node 11 ( path is 1->4->2-
>5->3 )
o Cost of edge <5,3> is: A(5,3) = 0
o Set row #5 = inf since we are considering edge
<5,3>
o Set column # 3 = inf since we are considering edge <5,3>
o Set A(3,1) = inf
o The resulting cost matrix is:

o Reduce the matrix:


▪ Rows reduced

▪ Columns reduced
o The lower bound is: RCL = 0
o So the cost of node 11 (Considering vertex 5 from vertex 3) is:
▪ Cost(11) = cost(10) + RCL + A(5,3) = 28 + 0 + 0 = 28
`
The Backtracking Solution can be optimized if we know a bound on the best possible solution
subtree rooted with every node. If the best in subtree is worse than the current best, we can
simply ignore this node and its subtrees. So we compute the bound (best solution) for every
node and compare the bound with the current best solution before exploring the node.

Algorithm:

● Sort all items in decreasing order of ratio of value per unit weight so that an upper bound
can be computed using Greedy Approach.
● Initialize maximum profit, maxProfit = 0
● Create an empty queue, Q.
● Create a dummy node of the decision tree and enqueue it to Q. Profit and weight of the
dummy node are 0.
● Do the following while Q is not empty.
● Extract an item from Q. Let the extracted item be u.
● Compute profit of next level node. If the profit is more than maxProfit, then update
maxProfit.
● Compute bound of next level node. If bound is more than maxProfit, then add the next
level node to Q.
● Consider the case when the next level node is not considered as part of the solution and
add a node to queue with level as next, but weight and profit without considering next
level nodes.
Example bounds used in below diagram are, A down can give $315, B down can $275, C down
can $225, D down can $125 and E down can $30 and knapsack capacity is 30
Design & Analysis of UNIT-5
Algorithm

NP-Hard and NP-Complete


Tractable and Intractable Problems:

Tractable Problem: a problem that is solvable by a polynomial-time algorithm. The upper


bound is polynomial. Intractable Problem: a problem that cannot be solved by a polynomial-
time algorithm.

Let’s start by reminding ourselves of some common functions, ordered by how fast they grow.

• Computer Scientists divide these functions into two classes:

k k
Polynomial functions: Any function that is O(n ), i.e. bounded from above by n for some
constant k.
E.g. O(1), O(log n), O(n), O(n × log n), O(n 2 ), O(n 3 ).
We can define ‘polynomial’ as any function of the form a kn k + ak−1n k−1 + . . . + a1n + a0. But
here the word ‘polynomial’ is used to put together the functions that are bounded from above
by polynomials. So, log n and n × log n, which are not polynomials in our original sense, are
polynomials by our alternative definition, because they are bounded from above by, e.g., n and
n 2 respectively.

Exponential functions: The remaining functions. E.g. O(2n), O(n!), O(n n). A function of the
form k n is genuinely exponential. But now some functions which are worse than polynomials
but not quite exponential, such as O(n log n), are also (incorrectly) called exponential. And some
functions which are worse than exponential, such as the super exponentials, e.g. O(n n), will
also (incorrectly) be called exponential. A better word than ‘exponential’ would be ‘super-
polynomial’. But ‘exponential’ is what everyone uses, so it’s what we’ll use.

On the basis of this classification of functions into polynomial and exponential, we can classify
algorithms:

Polynomial-Time Algorithm: an algorithm whose order-of-magnitude time performance is


bounded from above by a polynomial function of n, where n is the size of its inputs.

Exponential Algorithm: an algorithm whose order-of-magnitude time performance is not


bounded from above by a polynomial function of n.

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
Computability of Algorithms

Computability is the ability to solve a problem in an effective manner. It is a key topic of the
field of computability theory within mathematical logic and the theory of computation within
computer science. The computability of a problem is closely linked to the existence of an
algorithm to solve the problem.

Computability classes (Complexity Classes) –

In Computer Science, many problems are solved where the objective is to maximize or
minimize some values, whereas in other problems we try to find whether there is a solution or
not. Hence, the problems can be categorized as follows −
Optimization Problem- Optimization problems are those for which the objective is to maximize
or minimize some values.

For example,

● Finding the minimum number of colors needed to color a given graph.

● Finding the shortest path between two vertices in a graph.

Decision Problem- There are many problems for which the answer is a Yes or a No. These
types of problems are known as decision problems.

For example,

● Whether a given graph can be colored by only 4-colors.

● Finding a Hamiltonian cycle in a graph is not a decision problem, whereas checking


whether a graph is Hamiltonian or not is a decision problem.

In this context, we can categorize the problems as follows −P, NP, NP-complete and NP-hard.
P, NP, NP-complete and NP-hard

P-Class:

The class of problems that have polynomial-time deterministic algorithms.

– That is, they are solvable in O(p(n)), where p(n) is a polynomial on n

– A deterministic algorithm is (essentially) one that always computes the correct answer

The class P consists of those problems that are solvable in polynomial time, i.e. these problems
can be solved in time O(nk) in the worst-case, where k is constant. These problems are called
tractable, while others are called intractable or superpolynomial.

Formally, an algorithm is a polynomial time algorithm, if there exists a polynomial p(n) such that

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
the algorithm can solve any instance of size n in a time O(p(n)).

Problems requiring Ω(n50) time to solve are essentially intractable for large n. Most known
polynomial time algorithms run in time O(n k) for a fairly low value of k. The advantages in
considering the class of polynomial-time algorithms is that all reasonable deterministic single
processor models of computation can be simulated on each other with at most a polynomial.

Example:

● Searching for key element among n number of array elements- O(n)


● All pair Shortest path algorithm- O(n3)
● Sorting of n elements- O(n2)
● Addition of matrices -O(n2)
● Multiplication of matrices- O(n3)

NP-Class:

The class of decision problems that are solvable in polynomial time on a nondeterministic
machine or with a nondeterministic algorithm.

– A deterministic computer is what we know. (There exist unique state from any state with
specified input)

– A nondeterministic computer is one that can “guess” the right answer or solution. (There
exist multiple states from any states for a given input)

• Thus NP can also be thought of as the class of problems “whose solutions can be verified in
polynomial time”

• Note that NP stands for “Nondeterministic Polynomial-time”

NP is the class of decision problems for which it is easy to check the correctness of a claimed
answer, with the aid of a little extra information. Hence, we aren’t asking for a way to find a
solution, but only to verify that an alleged solution (Without proof) really is correct. Every
problem in this class can be solved in exponential time using exhaustive search.

Example:

● 0/1 Knapsack Problem - O(2n/2)


● Traveling Salesperson Problem- O(n2 2n)
● Graph Coloring Problem- O(n mn)
● Hamiltonian path Problem- O(n2 2n)

Relation between P Class and NP Class Problems:

● Every Problem whis is P class is also in NP class but Every class which is NP is not in P
class.
● P class problems can be solved efficiently but NP class problems can’t be solved efficiently.

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm

NP class problems again divided into two categories: NP-Hard problems and NP-complete
Problems.

NP-complete class: A problem L1 is in NP-complete, if and only if it satisfies the following


two conditions

● L1 is in NP i.e., L1 Є NP
● Every Problem that is in NP is polynomial time reducible to L1. i.e., L2 ∝ L1

How to prove that a given problem is NP complete?

From the definition of NP-complete, it appears impossible to prove that a problem L is NP-
Complete. By definition, it requires us to show that every problem in NP is polynomial time
reducible to L. Fortunately, there is an alternate way to prove it. The idea is to take a known
NP-Complete problem and reduce it to L. If polynomial time reduction is possible, we can
prove that L is NP-Complete by transitivity of reduction (If a NP-Complete problem is
reducible to L in polynomial time, then all problems are reducible to L in polynomial time).

What is Reduction?

Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That is, if y is an
input for L2 then algorithm A2 will answer Yes or No depending upon whether y belongs to L2
or not.

The idea is to find a transformation from L1 to L2 so that the algorithm A2 can be part of an
algorithm A1 to solve L1.

Examples:

● Determining whether a given graph has a hamiltonian cycle.


● Determining whether a boolean formula is satisfiable.

NP-hard class: A problem L is in NP-Hard, if and only if satisfiability reduces to L. i.e.,


satisfiability
∝ L. Let L1 and L2 be two problems, problem L1 reducible to L2 (l1∝L2) if and only if there is a
way to solve L1 by a deterministic algorithm in a polynomial time.

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
Examples,

● Set cover problem


● Node cover (vertex Cover) problem
● Travelling salesperson problem.

Relationship between P,NP,NP-hard,NP-complete

What was the first problem proved as NP-Complete?

There must be some first NP-Complete problem proved by definition of NP-Complete problems.
SAT (Boolean satisfiability problem) is the first NP-Complete problem proved by Cook.

SAT (Satisfiability Problems)

A formula φ is satisfiable (SAT) if there exists an assignment of values to its variables that makes
φ
true.
For example, formula (¬x ∨ y) ∧ (¬y ∨ z) ∧ (x ∨ ¬z ∨ y) is
satisfiable. Choose x = false,y = true, z = true

Conjunctive Normal Form (CNF)

When we’re discussing the SAT problem, it’s essential to know about the conjunctive normal
form (CNF). A boolean expression is said to be in CNF form if it’s a conjunction of a set of
clauses, where each clause is defined as a disjunction (logical OR) of literals. We can define a
literal as a variable or a negation of a variable. In the above example x, ¬x, y,¬y,z,¬z are all
literals.

Every boolean expression can be expressed as a CNF by repeatedly using De Morgan’s Laws,
the law of double negation and distributive laws under AND and OR: C1∧ C2∧ C3….∧ Cn
where, Ci=a i ∨ a i ∨ …….∨ a i
1 2 k

The above CNF is a k-CNF, that is, it has a maximum of k literals in every clause. We pick

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
each literal from a set of m variables: x1,x2,x3,...xm.¬x1,¬x2,¬x3…..¬xm
There are a number of SAT variants:

● 2-SAT
● Weighted 2-SAT
● 3-SAT
● Exactly 3-SAT.

2- SAT

In the 2-SAT problem, every clause has two literals and is solvable in a polynomial-time problem.

We can pick a variable a, and then assign true. We assign true to the other variable of all the
clauses with ¬ a. If those clauses are satisfied, then we pick the remaining clauses which don’t
have a or ¬ a and continue the process.

On the other hand, if the clauses with ¬ a are not all satisfied at this point, then we assign false
to a and repeat the above process. If all the clauses containing a and ¬ a are all not satisfied,
then the 2-SAT is unsatisfiable.

An important application of 2-SAT is to identify an undirected graph that can be represented as


a complete bipartite graph and independent set. This helps to identify the business relationships
among autonomous subsystems of the internet.

Many algorithms for the automatic label placement problem use the 2-SAT concept.

Weighted 2-SAT

Given a CNF with 2 literals per clause and an integer k, the problem of determining if there is a
truth assignment with at most k variables set to true for which the CNF is satisfiable.

This problem is NP-Complete. Many algorithms use the weighted 2-SAT concept for solving
the vertex cover problem.

3- SAT

3SAT, or the Boolean satisfiability problem, is a problem that asks what is the fastest algorithm
to tell for a given formula in Boolean algebra (with unknown number of variables) whether it is
satisfiable, that is, whether there is some combination of the (binary) values of the variables that
will give 1

In the 3-SAT problem, each clause has at most 3 literals and at least 1 literal must be true. This
problem is NP-Complete.

3- SAT defines the problem of determining whether a given CNF, with each clause containing
at most 3 literals, is satisfiable or not.

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
The 3-SAT problem is simpler then 2-SAT as it seeks to solve the 2-SAT problem where there
can be at most three variables in each parenthesis in the boolean expression. Still, no
polynomial-time algorithm exists for 3-SAT.

Let’s see an example. We’ll take an SAT problem and make it a 3-

SAT. We’re taking an example of an SAT problem:

Ø = (x1 V x2 V ¬ x4 V ¬ x5) Λ (¬ x1 V x4 V x5 V ¬ x6) Λ (x6 V x7) Λ

(¬ x2 V x3) Ø is satisfiable for truth values:

x1 = 1, x2 = 0, x3 = 0, x4 = 1, x5 = 0, x6 = 0, x7=1

We can introduce two new variables y1, y2 to make the given SAT problem to a 3-SAT problem:

Ø’ = (x1 V x2 V y1) Λ ( ¬ y1 V ¬ x4 V ¬ x5) Λ (¬ x1 V x4 V y2) Λ ( ¬ y2 V x5 V ¬ x6) Λ (x6


V x7) Λ (¬ x2 V x3)

Ø’ is satisfiable for truth values x1 = 1, x2 = 0, x3 = 0, x4 = 1, x5 = 0, x6 = 0, x7=1, y1= 0, y2


= 0. In fact, for either values of y1 and y2, the 3-CNF is satisfiable.

Exactly One 3-SAT

Each clause in the 3-SAT problem has at least 1 literal that must be true. But in Exactly one 3-
SAT problem, we determine if the CNF is satisfiable where exactly one of the three literals in
every clause is true and the other two are false. This problem is NP-Complete.

Exactly one 3-SAT is a known NP-Complete problem, and it’s used in reduction to prove other
problems NP-complete.

Cook’s theorem

Cook’s Theorem states that- ”Any NP problem can be converted to SAT in polynomial time.”

The Cook-Levin Theorem gives a proof that the problem SAT is NP-Complete, via the
technique of showing that any problem in NP may be reduced to it. Before proceeding to the
theorem itself, we revisit some basic definitions relating to NP-Completeness.

First, we need to understand what problems belong to the classes P and NP. If we restrict
ourselves to decision problems, every problem has a set of accepted inputs; we call this set of
inputs a language, and can think of the original problem as reducing to the question of whether
or not a given input is in this language. Then, P is the class of languages for which membership
can be decided in polynomial time, and NP is the class of languages such that x ∈ L if and only
if there’s a certificate of its membership that can be verified in polynomial time.

Second, we need the notion of reducibility of one problem to another. We say that a problem B
can be reduced to A, or B ∝ A if, given access to a solution for A, we can solve B in
polynomial time using polynomially many calls to this solution for A. This idea of reducibility

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
is crucial to completeness of a problem for a class. If a problem L ∈ NP satisfies that for any L
′ ∈ NP, L ′ ∝ L, then we say L is NP-Complete. It is worth noting that for proving NP-
Completeness, typically a more restrictive form of reduction is used, namely a Karp reduction.
It follows the same basic structure, but requires that only one query be made to the solution for
the problem being reduced to, and that the solution be directly computable from the result of the
query.

The Cook-Levin Theorem shows that SAT is NP-Complete, by showing that a reduction exists
to SAT for any problem in NP. Before proving the theorem, we give a formal definition of the
SAT problem (Satisfiability Problem): Given a boolean formula φ in CNF (Conjunctive Normal
Form), is the formula satisfiable?

In other words, we are given some boolean formula φ with n boolean variables x1, x2, . . . , xn, φ
= C1
∧ C2 ∧ · · · ∧ Ck, where each of the Ci is a clause of the form (t i1 ∨ ti2 ∨ . . . ∨ til), with each lij
is a literal drawn from the set {x1, x2, . . . , xn, ¬x1, ¬x2, . . . , ¬xn}. We need to decide
whether or not there exists some setting of the x1, x2, . . . , xn that causes the formula φ to be
satisfied (take on a boolean 1 value of true).

Cook-Levin Theorem: SAT is NP-complete.

Proof: Suppose L is a NP problem; then L has a polynomial time verifier V :


1. If x ∈ L, ∃ witness y, V (x, y) = 1

2. If x ∉ L,∀ witness y, V (x, y) = 0

We can build a circuit with polynomial size for the verifier V , since the verifier runs in
polynomial time (note that this fact is nontrivial; however, it is left to the reader to verify that it
is true). The circuit contains AND, OR and NOT gates. The circuit has |x| + |y| sources, where |
x| of them are computed to the values of the bits in x and the rest |y| are variables.

Now to solve problem L, we only need to find a setting of |y| variables in the input which
causes the circuit to output a 1. That means problem L has been reduced to determining whether
or not the circuit can be caused to output a 1. The following shows how the circuit satisfaction
problem can be reduced to an instance SAT.

Each gate in the circuit can be represented by a 3CNF (each clause has exactly three terms). For
example:
1. The functionality of an OR gate with input a, b and output Zi is represented as: (a ∨ b ∨ ¬Z i) ∧
(Zi ∨ ¬a) ∧ (Zi ∨ ¬b)

2. The functionality of a NOT gate with input a and output Zi is represented as: (a ∨ ¬Zi) ∧
(¬a ∨ Z i)

Notice although some of the clauses have fewer than 3 terms, it is quite easy to pad them with
independant literals to form clauses in 3CNF. The value of an independent literal won’t affect

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm
the overall boolean value of the clauses. We essentially include clauses for it taking the values 0
or 1; regardless of the actual value, our clauses have the same value as the simpler two-term
clause.

Suppose we have q gates in V marked as Z1, Z2, . . . , Zq with Zq representing the final output
of V . Each of them either takes some of the sources or some intermediate output Zi as input.

Therefore the whole circuit can be represented as a formula in CNF:


φ = C1 ∧ C2 ∧ . . . ∧ Cq ∧
Zq where
Ci = (t1 ∨ t2 ∨ t3), t1, t2, t3 ∈ (x, y, Z1, Z2, . . . Zq,¬Z 1,¬Z 2, . . . ,¬Z q)
As we said before, even if the last clause of φ has only one term Zq, we may extend φ to an
equivalent formula in 3CNF by adding independent variables. Thus, we have shown that the
circuit can be reduced to φ, a formula in 3CNF which is satisfiable if and only if the original
circuit could be made to output a value of 1.

Hence, L ∝ SAT. Since SAT can be trivially shown to be in NP (any 2 satisfying assignments
may be used as a certificate of membership), we may conclude that SAT is NP-Complete.

Furthermore, it should be noted that throughout the proof, we restricted ourselves to formulas
that were in 3CNF. In general, if we restrict the formulas under consideration to be in k-CNF
for some k
∈ Z, we get a subproblem of SAT which we may refer to as k-SAT. Since the above proof
ensured that φ was in 3CNF, it actually demonstrated not only that SAT is NP-Complete, but
that 3-SAT is NP-Complete as well. In fact, one may prove in general that k-SAT is NP-
Complete for k ≥ 3; for k = 2, however, this problem is known to be in P.

Ms. Rajashree Sutrawe, Associate Professor of CSE


Design & Analysis of UNIT-5
Algorithm

The decision variant of this problem makes only two changes to its specification: a positive
integer k is added to the given s; and rather than searching for a V ′ of minimal size, the goal is
to decide whether a V ′ exists with |V ′ | = k. Furthermore, we can reduce the search version of
this problem to the decision version.

We know that the size of a minimal vertex cover must be between 0 and V , so if we are given
access to a solution for the decision version we can just do a binary search over this range to
find the exact value for the minimum possible size for a vertex cover. Once we know this, we
can just check vertices one by one to see if they’re in a minimal vertex cover.

Let k be the size we found for a minimal vertex cover. For each vertex, remove it (and its
incident edges) from the graph, and check whether it is possible to find a vertex cover of size k
− 1 in the resultant graph. If so, the removed vertex is in a minimal cover, so reduce k by 1 and
continue the process on the resulting graph. If not, add the vertex back into the graph, and try
the next one. When k reaches 0, we have our vertex cover. This property of being able to solve
a search problem via queries to an oracle for its decision variant is called self-reducibility.

There are many cases where no polynomial time algorithm is known for finding an optimal
solution. In those situations, we can compromise by seeking a polynomial time near-optimal
solution. However, we want to guarantee in this case that this alternative is close to optimal in
some well-defined sense. To demonstrate that a solution generated by an approximation
algorithm is close to optimal, we need to compare its solution to the optimal one. One way of
formalizing the idea of “close enough” is that of α-approximations which may be formalized
with the following definitions

Ms. Rajashree Sutrawe, Associate Professor of CSE

You might also like