BIL206: Algorithm Analysis and Design
Dynamic Programming
Slides courtesy from Jianhua Ruan, The University of Texas at San Antonio
And some slides from Jeff Edmonds @ York University
5/15/2023 1
In the next few lectures
• Two important algorithm design techniques
– Dynamic programming
– Greedy algorithm
• Meta algorithms, not actual algorithms (like
divide-and-conquer)
• Very useful in practice
• Once you understand them, it is often easier to
reinvent certain algorithms than trying to look
them up!
Optimization Problems
• An important and practical class of computational
problems.
– For each problem instance, there are many feasible solutions
(often exponential)
– Each solution has a value (or cost)
– The goal is to find a solution with the optimal value
• For most of these, the best known algorithm runs in
exponential time.
• Industry would pay dearly to have faster algorithms.
• For some of them, there are quick dynamic
programming or greedy algorithms.
• For the rest, you may have to consider acceptable
solutions rather than optimal solutions.
Dynamic programming
• The word “programming” is historical and predates
computer programming.
• A very powerful, general tool for solving certain
optimization problems.
• Once understood it is relatively easy to apply.
• It looks like magic until you have seen enough
examples.
A motivating example: Fibonacci
numbers
• 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …
F(0) = 1;
F(1) = 1;
F(n) = F(n-1) + f(n-2)
• How to compute F(n)?
A recursive algorithm
function fib(n)
if (n == 0 or n == 1) return 1;
else return fib(n-1) + fib(n-2);
What’s the time complexity?
F(9)
F(8) F(7)
h=n F(7) F(6) F(6) F(5) h = n/2
F(6) F(5) F(5) F(4) F(5) F(4) F(4) F(3)
• Top-down solution: divide & conquer - > O(2^n)
• Time complexity between 2n/2 and 2n
T(n) = F(n) = O(1.6n)
An iterative algorithm
function fib(n)
F[0] = 1; F[1] = 1;
for i = 2 to n
F[i] = F[i-1] + F[i-2];
Return F[n];
Time complexity?
An iterative algorithm
• Problem with the recursive Fib algorithm:
– Each subproblem was solved for many times!
• Solution: avoid solving the same subproblem more than
once
(1) pre-compute all subproblems that may be needed later
(2) Compute on demand, but memorize the solution to avoid
recomputing
• Can you always speedup a recursive algorithm by making it
an iterative algorithm?
– E.g., merge sort
– No. since there is no overlap between the two sub-problems
An iterative algorithm
• bottom-up solution: dynamic programming
• calculate F2, F3, then F4 on a table - > O(n)
Another motivating example:
Shortest path problem
start
goal
• Find the shortest path from start to goal
• An optimization problem
Possible approaches
• Brute-force algorithm
– Enumerate all possible paths and compare
– How many?
• Greedy algorithm
– Always use the currently shortest edge
– Does not necessarily lead to the optimal solution
• Can you think about a recursive strategy?
Optimal subpaths
• Claim: if a path startgoal is optimal, any sub-path xy,
where x, y is on the optimal path, is also optimal.
b
a c a + b + c is shortest
start x y goal
b’ < b
b’ a + b’ + c < a + b + c
• Proof by contradiction
– If the subpath b between x and y is not the shortest
– we can replace it with a shorter one, b’
– which will reduce the total length of the new path
– the optimal path from start to goal is not the shortest =>
contradiction!
– Hence, the subpath b must be the shortest among all paths from x
to y
Shortest path problem
dist(start, A) + SP(A, goal)
SP(start, goal) = min dist(start, B) + SP(B, goal)
dist(start, C) + SP(C, goal)
A special shortest path problem
S
G
n
Each edge has a length (cost). We need to get to G from S. Can only move
to the right or bottom. Aim: find a path with the minimum total length
Use a greedy approach
S 3 9 1 2
0
5 3 3 3 3
3 2 5 2
2 3 3 9 3
2 4 2 3
6 2 3 7 4
3 6 3 3
4 6 3 1 3
1 2 3 2
Recursive solution
• Suppose we’ve found the shortest path n
• It must use one of the two edges:
– (m, n-1) to (m, n) Case 1
– (m-1, n) to (m, n) Case 2
• If case 1
– find shortest path from (0, 0) to (m, n-1)
– SP(0, 0, m, n-1) + dist(m, n-1, m, n) is the
m
overall shortest path
• If case 2
– find shortest path from (0, 0) to (m-1, n)
– SP(0, 0, m-1, n) + dist(m-1, n, m, n) is the overall shortest path
• We don’t know which case is true
– But if we’ve found the two paths, we can compare
– Actual shortest path is the one with shorter overall length
Recursive solution
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is length of SP from (0, 0) to (m, n)
n
F(m-1, n) + dist(m-1, n, m, n)
F(m, n) = min
F(m, n-1) + dist(m, n-1, m, n)
Generalize
F(i-1, j) + dist(i-1, j, i, j)
m
F(i, j) = min
F(i, j-1) + dist(i, j-1, i, j)
• To find shortest path from
(0, 0) to (m, n), we need to
find shortest path to both (m-
1, n) and (m, n-1)
m • If we use recursive call, some
subpaths will be recomputed
for many times
• Strategy: solve subproblems in
certain order such that
redundant computation can
n be avoided
Dynamic Programming
Let F(i, j) = SP(0, 0, i, j). => F(m, n) is length of SP from (0, 0) to (m, n)
n
F(i-1, j) + dist(i-1, j, i, j)
F(i, j) = min
F(i, j-1) + dist(i, j-1, i, j)
m
i = 1 .. m, j = 1 .. n
Boundary condition: i = 0 or j = 0.
Easy to figure out. Number of unique subproblems =
m*n
(i-1, j)
Data dependency determines
order to compute: left to right, top
(i, j-1)
to bottom
(i, j)
Dynamic programming illustration
S 3 9 1 2
0
5 3 3 3 3
3 2 5 2
2 3 3 9 3
2 4 2 3
6 2 3 7 4
3 6 3 3
4 6 3 1 3
1 2 3 2
F(i-1, j) + dist(i-1, j, i, j)
F(i, j) = min
F(i, j-1) + dist(i, j-1, i, j)
Dynamic programming illustration
S 3 9 1 2
0 3 12 13 15
5 3 3 3 3
3 2 5 2
5 6 8 13 15
2 3 3 9 3
2 4 2 3
7 9 11 13 16
6 2 3 7 4
3 6 3 3
13 11 14 17 20
4 6 3 1 3
1 2 3 2
17 17 17 18 20
G
F(i-1, j) + dist(i-1, j, i, j)
F(i, j) = min
F(i, j-1) + dist(i, j-1, i, j)
Trace back
S 3 9 1 2
0 3 12 13 15
5 3 3 3 3
3 2 5 2
5 6 8 13 15
2 3 3 9 3
2 4 2 3
7 9 11 13 16
6 2 3 7 4
3 6 3 3
13 11 14 17 20
4 6 3 1 3
1 2 3 2
17 17 17 18 20
G
Shortest path has length 20
What is the actual shortest path?
Elements of dynamic programming
• Optimal sub-structures
– Optimal solutions to the original problem contains
optimal solutions to sub-problems
• Overlapping sub-problems
– Some sub-problems appear in many solutions
• Memorization and reuse
– Carefully choose the order that sub-problems are solved,
such that each sub-problem will be solved for at most
once and the solution can be reused
Two steps to dynamic programming
• Formulate the solution as a recurrence relation of
solutions to subproblems.
• Specify an order to solve the subproblems so you
always have what you need.
– Bottom-up
• Tabulate the solutions to all subproblems before they are used
– What if you cannot determine the order easily, of if not
all subproblems are needed?
– Top-down
• Compute when needed.
• Remember the ones you’ve computed
Question
• If instead of shortest path, we want (simple)
longest path, can we still use dynamic
programming?
– Simple means no cycle allowed
Example problems that can be
solved by dynamic programming
• Sequence alignment problem (several different
versions)
• Shortest path in general graphs
• Knapsack (several different versions)
• Scheduling
• etc.
• We’ll discuss alignment, knapsack, and scheduling
problems.
Knapsack problem
• Each item has a value and a weight
• Objective: maximize value
• Constraint: knapsack has a weight limitation
Three versions:
0-1 knapsack problem: take each
item or leave it
Fractional knapsack problem:
items are divisible
Unbounded knapsack problem:
unlimited supplies of each item.
Which one is easiest to solve?
We study the 0-1 problem today.
Formal definition (0-1 problem)
• Knapsack has weight limit W
• Items labeled 1, 2, …, n (arbitrarily)
• Items have weights w1, w2, …, wn
– Assume all weights are integers
– For practical reason, only consider wi < W
• Items have values v1, v2, …, vn
• Objective: find a subset of items, S, such that
iS wi W and iS vi is maximal among all such
(feasible) subsets
Naïve algorithms
• Enumerate all subsets.
– Optimal. But exponential time
• Greedy 1: take the item with the largest value
– Not optimal
– Give an example
• Greedy 2: take the item with the largest
value/weight ratio
– Not optimal
– Give an example
A DP algorithm
• Suppose you’ve find the optimal solution S
• Case 1: item n is included
• Case 2: item n is not included
wn wn
Total weight limit: Total weight limit:
W W
Find an optimal solution using items Find an optimal solution using items
1, 2, …, n-1 with weight limit W - wn 1, 2, …, n-1 with weight limit W
Dynamic Approach for Knapsack
• We need to derive a recurrence relation that
expresses a solution to an instance of the problem
in terms of solutions to its smaller subinstances.
• Consider an instance defined by the first i items
1<=i<=n
– with weights w1, …, wi
– values v1, …, vi
– capacity j, 1<=j<=W
– V[i, j] be the value of an optimal solution to this
instance the most suitable subset of first i items that fit
into the knapsack of capacity j.
Dynamic Approach for Knapsack
• We can divide all subsets of the first i items that
fit the knapsack of capacity w into two categories
– 1. Among the subsets that do not include the ith item,
the value of an optimal subset is, V[ i-1, w]
– 2. Among the subsets that do include the ith item, an
optimal subset is made up of this item and an optimal
subset of the first i-1 items that fit into the knapsack
of capacity w-wi (w-wi>=0)
– The value of such an optimal subset is vi+ V[i-1, w-wi]
Recursive formulation
• Let V[i, w] be the optimal total value when items 1, 2, …, i
are considered for a knapsack with weight limit w
=> V[n, W] is the optimal solution
V[n-1, W-wn] + vn
V[n, W] = max
V[n-1, W]
Generalize
V[i-1, w-wi] + vi item i is taken
V[i, w] = max
V[i-1, w] item i not taken
V[i-1, w] if wi > w item i not taken
Boundary condition: V[i, 0] = 0, V[0, w] = 0. Number of sub-problems = ?
Example
• n = 6 (# of items)
• W = 10 (weight limit)
• Items (weight, value):
2 2
4 3
3 3
5 6
2 4
6 9
w 0 1 2 3 4 5 6 7 8 9 10
i wi vi 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
2 4 3 0 wi
3 3 3 0 V[i-1, w-wi] V[i-1, w]
4 5 6 0 V[i, w]
5 2 4 0
6 6 9 0
V[i-1, w-wi] + vi item i is taken
max
V[i-1, w] item i not taken
V[i, w] =
V[i-1, w] if wi > w item i not taken
w 0 1 2 3 4 5 6 7 8 9 10
i wi vi 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0
2 4 3 0
3 3 3 0
4 5 6 0
5 2 4 0
6 6 9 0
V[i-1, w-wi] + vi item i is taken
max
V[i-1, w] item i not taken
V[i, w] =
V[i-1, w] if wi > w item i not taken
w 0 1 2 3 4 5 6 7 8 9 10
i wi vi 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 2 2 2 2 2 2 2 2 2
2 4 3 0 0 2 2 3 3 5 5 5 5 5
3 3 3 0 0 2 3 3 5 5 6 6 8 8
4 5 6 0 0 2 3 3 6 6 8 9 9 11
5 2 4 0 0 4 4 6 7 7 10 10 12 13
6 6 9 0 0 4 4 6 7 9 10 13 13 15
V[i-1, w-wi] + vi item i is taken
max
V[i-1, w] item i not taken
V[i, w] =
V[i-1, w] if wi > w item i not taken
w 0 1 2 3 4 5 6 7 8 9 10
i wi vi 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 2 2 2 2 2 2 2 2 2
2 4 3 0 0 2 2 3 3 5 5 5 5 5
3 3 3 0 0 2 3 3 5 5 6 6 8 8
4 5 6 0 0 2 3 3 6 6 8 9 9 11
5 2 4 0 0 4 4 6 7 7 10 10 12 13
6 6 9 0 0 4 4 6 7 9 10 13 13 15
Optimal value: 15
Item: 6, 5, 1
Weight: 6 + 2 + 2 = 10
Value: 9 + 4 + 2 = 15
Binary Knapsack
Time complexity
• Θ (nW)
• Polynomial?
– Pseudo-polynomial
– Works well if W is small
• Consider following items (weight, value):
(10, 5), (15, 6), (20, 5), (18, 6)
• Weight limit 35 Events scheduling problem
– Optimal solution: item 2, 4 (value = 12). Iterate: 2^4 = 16
subsets
– Dynamic programming: fill up a 4 x 35 = 140 table entries
• What’s the problem?
– Many entries are unused: no such weight combination
– Top-down may be better
Sequence alignment
• Compare two strings to see if they are similar
– We need to define similarity
– Very useful in many applications
– Comparing DNA sequences, articles, source code, etc.
– Example: Longest Common Subsequence problem
(LCS)
Common subsequence
• A subsequence of a string is the string with zero
or more chars left out
• A common subsequence of two strings:
– A subsequence of both strings
– Ex: x = {A B C B D A B }, y = {B D C A B A}
– {B C} and {A A} are both common subsequences of x
and y
Longest Common Subsequence
• Given two sequences x[1 . . m] and y[1 . . n], find a
longest subsequence common to them both.
“a” not “the”
x: A B C B D A B
BCBA =
LCS(x, y)
y: B D C A B A
functional notation,
but not a function
Brute-force LCS algorithm
Check every subsequence of x[1 . . m] to see
if it is also a subsequence of y[1 . . n].
Analysis
• 2m subsequences of x (each bit-vector of length m
determines a distinct subsequence of x).
• Hence, the runtime would be exponential !
Towards a better algorithm: a DP strategy
• Key: optimal substructure and overlapping sub-
problems
• First we’ll find the length of LCS. Later we’ll modify
the algorithm to find LCS itself.
Optimal substructure
• Notice that the LCS problem has optimal substructure:
parts of the final solution are solutions of subproblems.
– If z = LCS(x, y), then any prefix of z is an LCS of a prefix of
x and a prefix of y.
i m
x
z
n
y
j
• Subproblems: “find LCS of pairs of prefixes of x and y”
Recursive thinking
m
x
n
y
• Case 1: x[m]=y[n]. There is an optimal LCS that matches
x[m] with y[n]. Find out LCS (x[1..m-1], y[1..n-1])
• Case 2: x[m] y[n]. At most one of them is in LCS
– Case 2.1: x[m] not in LCS Find out LCS (x[1..m-1], y[1..n])
– Case 2.2: y[n] not in LCS Find out LCS (x[1..m], y[1..n-1])
Recursive thinking
m
x
n
y
• Case 1: x[m]=y[n] Reduce both sequences by 1 char
– LCS(x, y) = LCS(x[1..m-1], y[1..n-1]) || x[m]
• Case 2: x[m] y[n] concatenate
– LCS(x, y) = LCS(x[1..m-1], y[1..n]) or
LCS(x[1..m], y[1..n-1]), whichever is longer
Reduce either sequence by 1 char
Finding length of LCS
m
x
n
y
• Let c[i, j] be the length of LCS(x[1..i], y[1..j])
=> c[m, n] is the length of LCS(x, y)
• If x[m] = y[n]
c[m, n] = c[m-1, n-1] + 1
• If x[m] != y[n]
c[m, n] = max { c[m-1, n], c[m, n-1] }
Generalize: recursive formulation
c[i–1, j–1] + 1 if x[i] = y[j],
c[i, j] = max{c[i–1, j], c[i, j–1]} otherwise.
1 2 i m
x: ...
1 2 j n
y: ...
Recursive algorithm for LCS
LCS(x, y, i, j)
if x[i] = y[ j]
then c[i, j] LCS(x, y, i–1, j–1) + 1
else c[i, j] max{ LCS(x, y, i–1, j),
LCS(x, y, i, j–1)}
Worst-case: x[i] y[ j], in which case the
algorithm evaluates two subproblems, each
with only one parameter decremented.
Recursion tree
m = 3, n = 4: 3,4
2,4 same 3,3
subproblem
1,4 2,3 2,3 3,2 m+n
1,3 2,2 1,3 2,2
Height = m + n work potentially exponential.,
but we’re solving subproblems already solved!
DP Algorithm
• Key: find out the correct order to solve the sub-problems
• Total number of sub-problems: m * n
c[i–1, j–1] + 1 if x[i] = y[j],
c[i, j] = max{c[i–1, j], c[i, j–1]} otherwise.
0 j n
i C(i, j)
m
DP Algorithm
LCS-Length(X, Y)
1. m = length(X) // get the # of symbols in X
2. n = length(Y) // get the # of symbols in Y
3. for i = 1 to m c[i,0] = 0 // special case: Y[0]
4. for j = 1 to n c[0,j] = 0 // special case: X[0]
5. for i = 1 to m // for all X[i]
6. for j = 1 to n // for all Y[j]
7. if ( X[i] == Y[j])
8. c[i,j] = c[i-1,j-1] + 1
9. else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
LCS Example
We’ll see how LCS algorithm works on the following
example:
• X = ABCB
• Y = BDCAB
What is the LCS of X and Y?
LCS(X, Y) = BCB
X =AB C B
Y= B DCAB
LCS Example (0) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
A
1
2 B
3 C
4 B
X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[5,6]
LCS Example (1) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0
2 B
0
3 C 0
4 B 0
for i = 1 to m c[i,0] = 0
for j = 1 to n c[0,j] = 0
LCS Example (2) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0
2 B
0
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (3) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0
2 B
0
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (4) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1
2 B
0
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (5) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (6) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (7) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (8) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (9) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (10) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (11) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (12) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (13) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Example (14) ABCB
j 0 1 2 3 4 5
BDCAB
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
if ( Xi == Yj )
c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )
LCS Algorithm Running Time
• LCS algorithm calculates the values of each entry of
the array c[m,n]
• So what is the running time?
O(m*n)
since each c[i,j] is calculated in
constant time, and there are m*n
elements in the array
How to find actual LCS
• The algorithm just found the length of LCS, but not LCS itself.
• How to find the actual LCS?
• For each c[i,j] we know how it was acquired:
c[i 1, j 1] 1 if x[i] y[ j ],
c[i, j ]
max( c[i, j 1], c[i 1, j ]) otherwise
• A match happens only when the first equation is taken
• So we can start from c[m,n] and go backwards, remember
x[i] whenever c[i,j] = c[i-1, j-1]+1.
2 2 For example, here
2 3 c[i,j] = c[i-1,j-1] +1 = 2+1=3
Finding LCS
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
Time for trace back: O(m+n).
Finding LCS (2)
j 0 1 2 3 4 5
i Y[j] B D C A B
0 X[i]
0 0 0 0 0 0
A
1 0 0 0 0 1 1
2 B
0 1 1 1 1 2
3 C 0 1 1 2 2 2
4 B 0 1 1 2 2 3
LCS (reversed order): B C B
LCS (straight order): B C B
(this string turned out to be a palindrome)
A more general problem
• Aligning two strings, such that
Match = 1
Mismatch = 0 (or other scores)
Insertion/deletion = -1
• Aligning ABBC with CABC
– LCS = 3: ABC
– Best alignment
ABBC ABBC
CABC CABC
Score = 2 Score = 1
Aligment Problem
• Let F(i, j) be the best alignment score between
X[1..i] and Y[1..j].
• F(m, n) is the best alignment score between X and Y
• Recurrence
F(i-1, j-1) + (i, j) Match/Mismatch
F(i, j) = max
F(i-1,j) – 1 Insertion on Y
F(i, j-1) – 1 Insertion on X
(i, j) = 1 if X[i]=Y[j] and 0 otherwise.
Alignment Example ABBC
j 0 1 2 3 4
CABC
i Y[j] C A B C
0 X[i]
A
1
2 B
3 B
4 C
X = ABBC; m = |X| = 4
Y = CABC; n = |Y| = 4
Allocate array F[5,5]
Alignment Example ABBC
CABC
j 0 1 2 3 4
i Y[j] C A B C
X[i]
0 0 -1 -2 -3 -4
A -1 0 0 -1 -2
1
2 B -2 -1 0 1 0
3 B -3 -2 -1 1 1
4 C -4 -2 -2 0 2
F(i-1, j-1) + (i, j) Match/Mismatch
F(i, j) = max F(i-1,j) – 1 Insertion on Y
F(i, j-1) – 1 Insertion on X
(i, j) = 1 if X[i]=Y[j] and 0 otherwise.
Alignment Example ABBC
CABC
j 0 1 2 3 4
i Y[j] C A B C
X[i]
0 0 -1 -2 -3 -4
A -1 0 0 -1 -2
1
2 B -2 -1 0 1 0
ABBC
3 B -3 -2 -1 1 1 CABC
Score = 2
4 C -4 -2 -2 0 2
F(i-1, j-1) + (i, j) Match/Mismatch
F(i, j) = max F(i-1,j) – 1 Insertion on Y
F(i, j-1) – 1 Insertion on X
(i, j) = 1 if X[i]=Y[j] and 0 otherwise.
Alignment as a longest path problem
C A B C
-1
1 0
Alignment as a longest path problem
C A B C
-1
1 0