Daa Unit-5
Daa Unit-5
Branch and Bound: General method, applications - Traveling salesperson problem, 0/1 knapsack
problem - LC Branch and Bound solution, FIFO Branch and Bound solution.
NP-Hard and NP-Complete problems: Basic concepts, non-deterministic algorithms, NP-Hard and NP-
Complete classes, Cook’s theorem.
Tractable and Intractable Problems: Computability of Algorithms, Computability classes – P, NP, NP-
complete and NP-hard, Cook’s theorem. Advanced Topics: Approximation algorithms, Randomized
Branch and Bound:
algorithms.
Branch and bound is an algorithm design paradigm which is generally used for solving combinatorial
optimization problems. These problems are typically exponential in terms of time complexity and may
require exploring all possible permutations in the worst case. The Branch and Bound Algorithm technique
solves these problems relatively quickly.
The Branch and Bound Algorithm is a method used in combinatorial optimization problems to
systematically search for the best solution. It works by dividing the problem into smaller subproblems, or
branches, and then eliminating certain branches based on bounds on the optimal solution. This process continues
until the best solution is found or all branches have been explored. Branch and Bound is commonly used in
problems like the traveling salesman and job scheduling.
Branch and bound is a systematic method for solving optimization problems. Branch and bound is a
rather general optimization technique that applies where the greedy method and dynamic programming fail.
However, it is much lower. Indeed, it often leads to exponential time complexities in the worst case. On the
other hand, if applied carefully, it can lead to algorithms that run reasonably fast on average.
Branch-and-bound procedure requires two tools.
1. Branching
⮚ a way of covering the feasible region by several smaller feasible sub-regions (ideally,
splitting into sub-regions).
⮚ since the procedure may be repeated recursively to each of the
sub regions and all produced sub-regions naturally form a tree structure, called search tree
or branch-and-bound-tree. Its nodes are the constructed sub-regions.
2. Bounding
⮚ which is a fast way of finding upper and lower bounds for the optimal
solution within a feasible sub-region
The Branch and bound technique like Backtracking explores the implicit graph and deals with the
optimal solution to a given problem. In this technique at each stage we calculate the bound for a particular
node and check whether this bound will be able to give the solution or not .If we find that at any node the
solution so obtained is appropriate but the remaining solution is not leading to a best case then we leave
this node without exploring.
Terminologies
TSP includes as a salesperson who has to visit a number of cities during a tour and the condition is to visit
all the cities exactly once and return back to the same city where the person started.
Basic steps
General method-
The principle consists in partitioning the solution space into disjoint subsets, which are represented by the
nodes of the branching tree. Then, the algorithm explores the branches of the tree according to a route
strategy. To avoid exploring the entire tree, before creating a new node in the tree, the algorithm evaluates
the node by comparing the value of the best possible solution, which could be found in the corresponding
subtree, with the best current solution. If a better solution cannot belong to the subtree rooted at the
considered node, the subtree is discarded. Otherwise, the exploration process continues. This method can be
seen as a divide-and-conquer approach. Lots of combinatorial optimization problems may be solved, more
or less efficiently, using a branch and bound strategy.
Branch-and-bound this is one more problem solving strategy a method by which we can solve a problem this
is similar to backtracking in the sense that it also uses a state-space stream for solving the problem solution
by represented like a state space tree but it is useful for solving optimization problem and only minimization
problem not maximization problem anyway we can convert a maximization problem into minimization and
we can solve it.
Applications :
Branch and Bound solution is the best suited method when item weights are not integers. To check if a
particular node can give us a better solution or not, we compute the optimal solution (through the node)
using the Greedy approach. If the solution computed by the Greedy approach itself is more than the best
so far, then we can’t get a better solution through the node.
Branch and Bound
⮲ Definitions:
Branch and Bound is a state space search method in which all the
children of a node are generated before expanding any of its children.
The search for an answer node can often be speeded by using an “intelligent”
ranking function, also called an
^
^
Expanded-node (E-node): is the live node with the best C
value
⮲ Requirements
Branching: A set of solutions, which is represented by a node, can be
partitioned into mutually exclusive sets. Each subset in the partition is
represented by a child of the original node.
^
C (x) ≤C(x), and
● Least-cost search:
^
The next E-node is the one with least C
⮲ Example: 8-puzzle
^
= g(x) +h(x)
Cost function: C
where
h(x) = the number of misplaced tiles and
g(x) = the number of moves so far
1 2 3
5 6
7 8 4
^ ^
C=1+4=5 C 1 4 ^ 5
C=1+2=3
1 2 3 1 2 3 1 2
5 6 4 5 6 5 6 3
7 8 7 8 4 7 8 4
^ ^
^
C=2+1=3 C=2+
C=2+3=5
3=5
1 2 3 1 2 3 1 3
5 8 6 5 6 5 2 6
7 4 7 8 4 7 8 4
^
C=3+2=5 ^
C=3+0=3
1 2 3 1 2 3
5 8 6 5 8 6
7 4 7 4
- If zi > U
- then
- Else
If child i is a solution
Then
U =zi; current best = child i;
Else
Add child i to live_node_set;
Endif;
Endif;
- Endfor;
Endwhile;
⮲ Travelling Salesman Problem: A Branch and Bound algorithm
Definition: Find a tour of minimum cost starting from a node S going
through other nodes only once and returning to the starting point S.
Definitions:
Branching:
L
All Solutions
L1 L2
Subtract of a constant from any row and any column does not
change the optimal solution (The path).
The cost of the path changes but not the path itself.
Pick the edge that causes the greatest increase in the lower
bound of the right subtree, i.e., the lower bound of the root
of the right subtree is greater.
●Example:
o The reduced cost matrix is done as follows:
- Change all entries of row i and column j to
infinity
- Set A(j,1) to infinity (assuming the start node is 1)
Vertex = 2 Vertex = 5
Vertex = 3 Vertex = 4
2 3 4 5
35 53 25 31
Vertex = 3 Vertex = 5
9 52 10 28
Vertex = 3
28
11
● The TSP starts from node 1: Node 1
o Reduced Matrix: To get the lower bound of the path starting at
node 1
▪ Row # 1: reduce by 10
▪ Row # 4: Reduce by 4
▪ Column 1: Reduce by 1
▪ Column 2: It is reduced.
▪ Column 3: Reduce by 3
𝖥𝑖𝑛𝑓 1010 17
17 00 11
⎢ 12
𝖥𝑖𝑛𝑓𝑖𝑛𝑓 11
11 22 00
⎢ 0 3𝑖𝑛𝑓 𝑖𝑛𝑓 0 2
⎢ ⎢ 12
⎢⎢ 15 3 12 𝑖𝑛𝑓 0
⎣ 110 03 0 12
⎢ 𝑖𝑛𝑓 2
0
⎢
15 3 12 0
𝑖𝑛𝑓
⎣ 11 0 0 12 𝑖𝑛𝑓
⎦
▪ Column 4: It is reduced.
▪ Column 5: It is reduced.
cost(1) = 25
𝖥𝑖𝑛𝑓 ⎤
⎢ 12 ⎥
⎢0 3 2 0⎥
1 17 0 1
𝑖 𝑖𝑛𝑓02
⎢ 15 0 ⎥
0
⎢ 3 𝑛 12𝑖𝑛𝑓 ⎥
11
⎣ 11 0 𝑓 0 12 𝑖𝑛𝑓 ⎦
● Choose to go to vertex 2: Node 2
- Cost of edge <1,2> is: A(1,2) = 10
- Set row #1 = inf since we are choosing edge <1,2>
● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 4: cost(4) = 25, path: 1->4
▪ 5: cost(5) = 31, path: 1->5
o Explore the node with the lowest cost: Node 4 has a cost of 25
o Vertices to be explored from node 4: 2, 3, and 5
o Now we are starting from the cost matrix at node 4 is:
Cost(4) = 25
𝖥𝑖𝑛𝑓𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 ⎤
⎢ 12𝑖𝑛𝑓 11 𝑖𝑛𝑓 0 ⎥
0 2:3Node𝑖𝑛𝑓
● Choose to go⎢to vertex 6 (path𝑖𝑛𝑓 2 ⎥
o Cost of⎢edge𝑖𝑛𝑓 12= 3 𝑖𝑛𝑓 0
<4,2> is:3A(4,2) ⎥
⎥
is 1->4->2)
⎢ #4 11 0 we 0𝑖𝑛𝑓 𝑖𝑛𝑓⎦
⎣
o Set row = inf since are considering edge
<4,2>
o Set column # 2 = inf since we are considering edge <4,2>
o Set A(2,1) = inf
o The resulting cost matrix is:
▪ Reduce column # 1: by 11
o The lower bound is: RCL = 13
● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 5: cost(5) = 31, path: 1->5
▪ 6: cost(6) = 28, path: 1->4->2
▪ 7: cost(7) = 50, path: 1->4->3
▪ 8: cost(8) = 36, path: 1->4->5
o Explore the node with the lowest cost: Node 6 has a cost of 28
o Vertices to be explored from node 6: 3 and 5
Cost(6) = 28
𝖥𝑖𝑛𝑓
𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓 𝑖𝑛𝑓
⎤
𝑖𝑛 1 𝑖𝑛 0
𝑓 1 𝑓
▪ Reduce column # 1: by 11
▪ Columns reduced
o The lower bound is: RCL = 0
o So the cost of node 10 (Considering vertex 5 from vertex 2) is:
▪ Cost(10) = cost(6) + RCL + A(2,3) = 28 + 0
+ 0 = 28
● In summary:
o So the live nodes we have so far are:
▪ 2: cost(2) = 35, path: 1->2
▪ 3: cost(3) = 53, path: 1->3
▪ 5: cost(5) = 31, path: 1->5
▪ 7: cost(7) = 50, path: 1->4->3
▪ 8: cost(8) = 36, path: 1->4->5
▪ 9: cost(9) = 52, path: 1->4->2->3
▪ 10: cost(2) = 28, path: 1->4->2->5
o Explore the node with the lowest cost: Node 10 has a cost of 28
o Vertices to be explored from node 10: 3
o Now we are starting from the cost matrix at node 10 is:
● Choose to go to vertex 3: Node 11 ( path is 1->4->2-
>5->3 )
o Cost of edge <5,3> is: A(5,3) = 0
o Set row #5 = inf since we are considering edge
<5,3>
o Set column # 3 = inf since we are considering edge <5,3>
o Set A(3,1) = inf
o The resulting cost matrix is:
▪ Columns reduced
o The lower bound is: RCL = 0
o So the cost of node 11 (Considering vertex 5 from vertex 3) is:
▪ Cost(11) = cost(10) + RCL + A(5,3) = 28 + 0 + 0 = 28
`
The Backtracking Solution can be optimized if we know a bound on the best possible solution
subtree rooted with every node. If the best in subtree is worse than the current best, we can
simply ignore this node and its subtrees. So we compute the bound (best solution) for every
node and compare the bound with the current best solution before exploring the node.
Algorithm:
● Sort all items in decreasing order of ratio of value per unit weight so that an upper bound
can be computed using Greedy Approach.
● Initialize maximum profit, maxProfit = 0
● Create an empty queue, Q.
● Create a dummy node of the decision tree and enqueue it to Q. Profit and weight of the
dummy node are 0.
● Do the following while Q is not empty.
● Extract an item from Q. Let the extracted item be u.
● Compute profit of next level node. If the profit is more than maxProfit, then update
maxProfit.
● Compute bound of next level node. If bound is more than maxProfit, then add the next
level node to Q.
● Consider the case when the next level node is not considered as part of the solution and
add a node to queue with level as next, but weight and profit without considering next
level nodes.
Example bounds used in below diagram are, A down can give $315, B down can $275, C down
can $225, D down can $125 and E down can $30 and knapsack capacity is 30
Design & Analysis of UNIT-5
Algorithm
Let’s start by reminding ourselves of some common functions, ordered by how fast they grow.
k k
Polynomial functions: Any function that is O(n ), i.e. bounded from above by n for some
constant k.
E.g. O(1), O(log n), O(n), O(n × log n), O(n 2 ), O(n 3 ).
We can define ‘polynomial’ as any function of the form a kn k + ak−1n k−1 + . . . + a1n + a0. But
here the word ‘polynomial’ is used to put together the functions that are bounded from above
by polynomials. So, log n and n × log n, which are not polynomials in our original sense, are
polynomials by our alternative definition, because they are bounded from above by, e.g., n and
n 2 respectively.
Exponential functions: The remaining functions. E.g. O(2n), O(n!), O(n n). A function of the
form k n is genuinely exponential. But now some functions which are worse than polynomials
but not quite exponential, such as O(n log n), are also (incorrectly) called exponential. And some
functions which are worse than exponential, such as the super exponentials, e.g. O(n n), will
also (incorrectly) be called exponential. A better word than ‘exponential’ would be ‘super-
polynomial’. But ‘exponential’ is what everyone uses, so it’s what we’ll use.
On the basis of this classification of functions into polynomial and exponential, we can classify
algorithms:
Computability is the ability to solve a problem in an effective manner. It is a key topic of the
field of computability theory within mathematical logic and the theory of computation within
computer science. The computability of a problem is closely linked to the existence of an
algorithm to solve the problem.
In Computer Science, many problems are solved where the objective is to maximize or
minimize some values, whereas in other problems we try to find whether there is a solution or
not. Hence, the problems can be categorized as follows −
Optimization Problem- Optimization problems are those for which the objective is to maximize
or minimize some values.
For example,
Decision Problem- There are many problems for which the answer is a Yes or a No. These
types of problems are known as decision problems.
For example,
In this context, we can categorize the problems as follows −P, NP, NP-complete and NP-hard.
P, NP, NP-complete and NP-hard
P-Class:
– A deterministic algorithm is (essentially) one that always computes the correct answer
The class P consists of those problems that are solvable in polynomial time, i.e. these problems
can be solved in time O(nk) in the worst-case, where k is constant. These problems are called
tractable, while others are called intractable or superpolynomial.
Formally, an algorithm is a polynomial time algorithm, if there exists a polynomial p(n) such that
Problems requiring Ω(n50) time to solve are essentially intractable for large n. Most known
polynomial time algorithms run in time O(n k) for a fairly low value of k. The advantages in
considering the class of polynomial-time algorithms is that all reasonable deterministic single
processor models of computation can be simulated on each other with at most a polynomial.
Example:
NP-Class:
The class of decision problems that are solvable in polynomial time on a nondeterministic
machine or with a nondeterministic algorithm.
– A deterministic computer is what we know. (There exist unique state from any state with
specified input)
– A nondeterministic computer is one that can “guess” the right answer or solution. (There
exist multiple states from any states for a given input)
• Thus NP can also be thought of as the class of problems “whose solutions can be verified in
polynomial time”
NP is the class of decision problems for which it is easy to check the correctness of a claimed
answer, with the aid of a little extra information. Hence, we aren’t asking for a way to find a
solution, but only to verify that an alleged solution (Without proof) really is correct. Every
problem in this class can be solved in exponential time using exhaustive search.
Example:
● Every Problem whis is P class is also in NP class but Every class which is NP is not in P
class.
● P class problems can be solved efficiently but NP class problems can’t be solved efficiently.
NP class problems again divided into two categories: NP-Hard problems and NP-complete
Problems.
● L1 is in NP i.e., L1 Є NP
● Every Problem that is in NP is polynomial time reducible to L1. i.e., L2 ∝ L1
From the definition of NP-complete, it appears impossible to prove that a problem L is NP-
Complete. By definition, it requires us to show that every problem in NP is polynomial time
reducible to L. Fortunately, there is an alternate way to prove it. The idea is to take a known
NP-Complete problem and reduce it to L. If polynomial time reduction is possible, we can
prove that L is NP-Complete by transitivity of reduction (If a NP-Complete problem is
reducible to L in polynomial time, then all problems are reducible to L in polynomial time).
What is Reduction?
Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That is, if y is an
input for L2 then algorithm A2 will answer Yes or No depending upon whether y belongs to L2
or not.
The idea is to find a transformation from L1 to L2 so that the algorithm A2 can be part of an
algorithm A1 to solve L1.
Examples:
There must be some first NP-Complete problem proved by definition of NP-Complete problems.
SAT (Boolean satisfiability problem) is the first NP-Complete problem proved by Cook.
A formula φ is satisfiable (SAT) if there exists an assignment of values to its variables that makes
φ
true.
For example, formula (¬x ∨ y) ∧ (¬y ∨ z) ∧ (x ∨ ¬z ∨ y) is
satisfiable. Choose x = false,y = true, z = true
When we’re discussing the SAT problem, it’s essential to know about the conjunctive normal
form (CNF). A boolean expression is said to be in CNF form if it’s a conjunction of a set of
clauses, where each clause is defined as a disjunction (logical OR) of literals. We can define a
literal as a variable or a negation of a variable. In the above example x, ¬x, y,¬y,z,¬z are all
literals.
Every boolean expression can be expressed as a CNF by repeatedly using De Morgan’s Laws,
the law of double negation and distributive laws under AND and OR: C1∧ C2∧ C3….∧ Cn
where, Ci=a i ∨ a i ∨ …….∨ a i
1 2 k
The above CNF is a k-CNF, that is, it has a maximum of k literals in every clause. We pick
● 2-SAT
● Weighted 2-SAT
● 3-SAT
● Exactly 3-SAT.
2- SAT
In the 2-SAT problem, every clause has two literals and is solvable in a polynomial-time problem.
We can pick a variable a, and then assign true. We assign true to the other variable of all the
clauses with ¬ a. If those clauses are satisfied, then we pick the remaining clauses which don’t
have a or ¬ a and continue the process.
On the other hand, if the clauses with ¬ a are not all satisfied at this point, then we assign false
to a and repeat the above process. If all the clauses containing a and ¬ a are all not satisfied,
then the 2-SAT is unsatisfiable.
Many algorithms for the automatic label placement problem use the 2-SAT concept.
Weighted 2-SAT
Given a CNF with 2 literals per clause and an integer k, the problem of determining if there is a
truth assignment with at most k variables set to true for which the CNF is satisfiable.
This problem is NP-Complete. Many algorithms use the weighted 2-SAT concept for solving
the vertex cover problem.
3- SAT
3SAT, or the Boolean satisfiability problem, is a problem that asks what is the fastest algorithm
to tell for a given formula in Boolean algebra (with unknown number of variables) whether it is
satisfiable, that is, whether there is some combination of the (binary) values of the variables that
will give 1
In the 3-SAT problem, each clause has at most 3 literals and at least 1 literal must be true. This
problem is NP-Complete.
3- SAT defines the problem of determining whether a given CNF, with each clause containing
at most 3 literals, is satisfiable or not.
x1 = 1, x2 = 0, x3 = 0, x4 = 1, x5 = 0, x6 = 0, x7=1
We can introduce two new variables y1, y2 to make the given SAT problem to a 3-SAT problem:
Each clause in the 3-SAT problem has at least 1 literal that must be true. But in Exactly one 3-
SAT problem, we determine if the CNF is satisfiable where exactly one of the three literals in
every clause is true and the other two are false. This problem is NP-Complete.
Exactly one 3-SAT is a known NP-Complete problem, and it’s used in reduction to prove other
problems NP-complete.
Cook’s theorem
Cook’s Theorem states that- ”Any NP problem can be converted to SAT in polynomial time.”
The Cook-Levin Theorem gives a proof that the problem SAT is NP-Complete, via the
technique of showing that any problem in NP may be reduced to it. Before proceeding to the
theorem itself, we revisit some basic definitions relating to NP-Completeness.
First, we need to understand what problems belong to the classes P and NP. If we restrict
ourselves to decision problems, every problem has a set of accepted inputs; we call this set of
inputs a language, and can think of the original problem as reducing to the question of whether
or not a given input is in this language. Then, P is the class of languages for which membership
can be decided in polynomial time, and NP is the class of languages such that x ∈ L if and only
if there’s a certificate of its membership that can be verified in polynomial time.
Second, we need the notion of reducibility of one problem to another. We say that a problem B
can be reduced to A, or B ∝ A if, given access to a solution for A, we can solve B in
polynomial time using polynomially many calls to this solution for A. This idea of reducibility
The Cook-Levin Theorem shows that SAT is NP-Complete, by showing that a reduction exists
to SAT for any problem in NP. Before proving the theorem, we give a formal definition of the
SAT problem (Satisfiability Problem): Given a boolean formula φ in CNF (Conjunctive Normal
Form), is the formula satisfiable?
In other words, we are given some boolean formula φ with n boolean variables x1, x2, . . . , xn, φ
= C1
∧ C2 ∧ · · · ∧ Ck, where each of the Ci is a clause of the form (t i1 ∨ ti2 ∨ . . . ∨ til), with each lij
is a literal drawn from the set {x1, x2, . . . , xn, ¬x1, ¬x2, . . . , ¬xn}. We need to decide
whether or not there exists some setting of the x1, x2, . . . , xn that causes the formula φ to be
satisfied (take on a boolean 1 value of true).
We can build a circuit with polynomial size for the verifier V , since the verifier runs in
polynomial time (note that this fact is nontrivial; however, it is left to the reader to verify that it
is true). The circuit contains AND, OR and NOT gates. The circuit has |x| + |y| sources, where |
x| of them are computed to the values of the bits in x and the rest |y| are variables.
Now to solve problem L, we only need to find a setting of |y| variables in the input which
causes the circuit to output a 1. That means problem L has been reduced to determining whether
or not the circuit can be caused to output a 1. The following shows how the circuit satisfaction
problem can be reduced to an instance SAT.
Each gate in the circuit can be represented by a 3CNF (each clause has exactly three terms). For
example:
1. The functionality of an OR gate with input a, b and output Zi is represented as: (a ∨ b ∨ ¬Z i) ∧
(Zi ∨ ¬a) ∧ (Zi ∨ ¬b)
2. The functionality of a NOT gate with input a and output Zi is represented as: (a ∨ ¬Zi) ∧
(¬a ∨ Z i)
Notice although some of the clauses have fewer than 3 terms, it is quite easy to pad them with
independant literals to form clauses in 3CNF. The value of an independent literal won’t affect
Suppose we have q gates in V marked as Z1, Z2, . . . , Zq with Zq representing the final output
of V . Each of them either takes some of the sources or some intermediate output Zi as input.
Hence, L ∝ SAT. Since SAT can be trivially shown to be in NP (any 2 satisfying assignments
may be used as a certificate of membership), we may conclude that SAT is NP-Complete.
Furthermore, it should be noted that throughout the proof, we restricted ourselves to formulas
that were in 3CNF. In general, if we restrict the formulas under consideration to be in k-CNF
for some k
∈ Z, we get a subproblem of SAT which we may refer to as k-SAT. Since the above proof
ensured that φ was in 3CNF, it actually demonstrated not only that SAT is NP-Complete, but
that 3-SAT is NP-Complete as well. In fact, one may prove in general that k-SAT is NP-
Complete for k ≥ 3; for k = 2, however, this problem is known to be in P.
The decision variant of this problem makes only two changes to its specification: a positive
integer k is added to the given s; and rather than searching for a V ′ of minimal size, the goal is
to decide whether a V ′ exists with |V ′ | = k. Furthermore, we can reduce the search version of
this problem to the decision version.
We know that the size of a minimal vertex cover must be between 0 and V , so if we are given
access to a solution for the decision version we can just do a binary search over this range to
find the exact value for the minimum possible size for a vertex cover. Once we know this, we
can just check vertices one by one to see if they’re in a minimal vertex cover.
Let k be the size we found for a minimal vertex cover. For each vertex, remove it (and its
incident edges) from the graph, and check whether it is possible to find a vertex cover of size k
− 1 in the resultant graph. If so, the removed vertex is in a minimal cover, so reduce k by 1 and
continue the process on the resulting graph. If not, add the vertex back into the graph, and try
the next one. When k reaches 0, we have our vertex cover. This property of being able to solve
a search problem via queries to an oracle for its decision variant is called self-reducibility.
There are many cases where no polynomial time algorithm is known for finding an optimal
solution. In those situations, we can compromise by seeking a polynomial time near-optimal
solution. However, we want to guarantee in this case that this alternative is close to optimal in
some well-defined sense. To demonstrate that a solution generated by an approximation
algorithm is close to optimal, we need to compare its solution to the optimal one. One way of
formalizing the idea of “close enough” is that of α-approximations which may be formalized
with the following definitions