Understanding Dynamic Programming and Edit Distance

Dynamic programming speeds up recursive algorithms by avoiding repeated calculations, which makes it well suited to problems such as edit distance. The edit distance is the minimum number of point mutations (changing, inserting, or deleting a letter) needed to transform one string into another, with applications in spelling correction, plagiarism detection, computational molecular biology, and speech recognition. The algorithm defines a recurrence and builds the solution bottom-up from base cases, giving a running time of O(mn) for strings of lengths m and n, i.e. O(n^2) when both strings have length n.

Dynamic Programming

Dr. Rashid Amin

 Dynamic programming is essentially
recursion without repetition.
 Developing a dynamic programming
algorithm generally involves two separate
steps:
◦ Formulate problem recursively. Write down a
formula for the whole problem as a simple
combination of answers to smaller subproblems.
◦ Build solution to recurrence from bottom up. Write
an algorithm that starts with base cases and works
its way up to the final solution.
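
As a minimal sketch of "recursion without repetition", the following Python snippet compares a naive recursive Fibonacci function with a bottom-up version that starts at the base cases. It is an illustration only and not taken from the slides; the function names are assumptions.

```python
def fib_recursive(n):
    """Naive recursion: the same subproblems are recomputed many times."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_bottom_up(n):
    """Bottom-up version: start at the base cases and work up to n."""
    if n < 2:
        return n
    prev, curr = 0, 1  # F(0) and F(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr


print(fib_recursive(10), fib_bottom_up(10))
```

Both functions return the same values, but the bottom-up version computes each subproblem only once.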
 The words “computer” and “commuter” are
very similar: a change of just one letter,
p->m, turns the first into the second.
 The word “sport” can be changed into “sort”
by the deletion of the ‘p’.
 The edit distance of two strings, s1 and s2, is
defined as the minimum number of point
mutations required to change s1 into s2,
where a point mutation is one of:
◦ change a letter,
◦ insert a letter, or
◦ delete a letter.
 For example, the edit distance between FOOD
and MONEY is at most four:
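
One possible sequence of four point mutations can be sketched in Python as below. This is an illustration and not necessarily the sequence shown on the original slide; the helper names `substitute`, `insert`, and `delete` are hypothetical.

```python
def substitute(s, i, ch):
    """Change the letter at position i to ch."""
    return s[:i] + ch + s[i + 1:]


def insert(s, i, ch):
    """Insert ch before position i."""
    return s[:i] + ch + s[i:]


def delete(s, i):
    """Delete the letter at position i."""
    return s[:i] + s[i + 1:]


steps = ["FOOD"]
steps.append(substitute(steps[-1], 0, "M"))  # FOOD  -> MOOD
steps.append(substitute(steps[-1], 2, "N"))  # MOOD  -> MOND
steps.append(insert(steps[-1], 3, "E"))      # MOND  -> MONED
steps.append(substitute(steps[-1], 4, "Y"))  # MONED -> MONEY
print(" -> ".join(steps))
```

Four mutations are enough, so the edit distance is at most four. The same helpers reproduce the earlier examples: `substitute("computer", 3, "m")` and `delete("sport", 1)`.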
 There are numerous applications of the Edit
Distance algorithm.
 Spelling Correction
◦ If a text contains a word that is not in the dictionary, a
‘close’ word, i.e. one with a small edit distance, may be
suggested as a correction.
◦ Word processors, such as Microsoft Word, have spell-checking
and correction facilities.
 Plagiarism Detection
◦ If someone copies, say, a C program and makes a few
changes here and there, for example, changing variable
names or adding a comment or two, the edit distance between
the source and the copy may be small.
 Computational Molecular Biology
◦ DNA is a polymer. The major units of DNA are nucleotides.
There are four different types of nucleotides found in DNA,
which are given one-letter abbreviations as shorthand for
the four bases.
 A-adenine G-guanine
 C-cytosine T-thymine
 Speech Recognition
◦ Algorithms similar to those for the edit-distance problem
are used in some speech recognition systems.
◦ Find a close match between a new utterance and one in a
library of classified utterances.
 A better way to display this editing process is
to place the words one above the other:

 The first word has a gap for every insertion (I)
and the second word has a gap for every
deletion (D).
 Columns with two different characters
correspond to substitutions (S). Matches (M)
do not count.
 The Edit transcript is defined as a string over
the alphabet M, S, I, D that describes a
transformation of one string into another. For
example

 S D I M D M
 1+ 1+ 1+ 0+ 1+ 0+ = 4
 In general, it is not easy to determine the
optimal edit distance.
 For example, the distance between
ALGORITHM and ALTRUISTIC is at most 6.
 Suppose we have an m-character string A and
an n-character string B.
 Define E(i, j) to be the edit distance between
the first i characters of A and the first j
characters of B. For example,

 The edit distance between the entire strings A
and B is E(m, n).
 The gap representation for the edit sequences
has a crucial “optimal substructure” property.
 If we remove the last column, the remaining
columns must represent the shortest edit
sequence for the remaining substrings.
 The edit distance is 6 for the following two
words.

 If we remove the last column, the edit distance
reduces to 5.
 We can use the optimal substructure property
to devise a recursive formulation of the edit
distance problem.
 There are a couple of obvious base cases:
◦ The only way to convert an empty string into a
string of j characters is by doing j insertions. Thus
 E(0, j) = j
◦ The only way to convert a string of i characters into
the empty string is with i deletions:
 E(i, 0) = i
 There are four possibilities for the last
column in the shortest possible edit
sequence:
 Deletion: The last entry in the bottom row is
empty.
 In this case E(i, j) = E(i - 1, j) + 1
 Insertion: The last entry in the top row is
empty.
 In this case E(i, j) = E(i, j - 1) + 1
 Substitution: Both rows have characters in the
last column.
 If the characters are different, then
 E(i, j) = E(i - 1, j - 1) + 1
 If the characters are the same, no substitution is
needed: E(i, j) = E(i - 1, j - 1)
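 Thus the edit distance E(i, j) is the smallest of
the four possibilities:
 E(i, j) = min { E(i - 1, j) + 1,
 E(i, j - 1) + 1,
 E(i - 1, j - 1) + 1 if A[i] ≠ B[j],
 E(i - 1, j - 1) if A[i] = B[j] }
where A[i] is the i-th character of A and B[j] is
the j-th character of B.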
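
The recursive formulation, together with the base cases E(0, j) = j and E(i, 0) = i, can be transcribed directly into Python. This is a sketch under the assumption that strings are 0-indexed, so the i-th character of A is `a[i - 1]`; the function name is hypothetical.

```python
def edit_distance_recursive(a, b, i=None, j=None):
    """E(i, j): edit distance between a[:i] and b[:j], by plain recursion."""
    if i is None:
        i, j = len(a), len(b)
    if i == 0:
        return j  # base case: j insertions
    if j == 0:
        return i  # base case: i deletions
    # Cost of the last column when both rows hold a character.
    cost = 0 if a[i - 1] == b[j - 1] else 1
    return min(
        edit_distance_recursive(a, b, i - 1, j) + 1,         # deletion
        edit_distance_recursive(a, b, i, j - 1) + 1,         # insertion
        edit_distance_recursive(a, b, i - 1, j - 1) + cost,  # substitution or match
    )


print(edit_distance_recursive("FOOD", "MONEY"))
```

This version is correct but slow, because it evaluates the same E(i, j) many times.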
 Consider the example of edit between the
words “ARTS” and “MATHS”:

 The edit distance would be in E(4, 5). If we
use recursion to compute it, E(4, 5) calls
E(3, 5), E(4, 4) and E(3, 4), each of which
makes up to three further calls, and so on.
 Recursion clearly leads to the same repetitive
call pattern that we saw in the Fibonacci
sequence.
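
The repetition can be made visible by counting how often each subproblem is evaluated. The sketch below instruments a plain recursive E(i, j) for “ARTS” and “MATHS”; the counter `calls` and the function layout are assumptions for illustration only.

```python
from collections import Counter

calls = Counter()  # how many times each (i, j) is evaluated


def E(a, b, i, j):
    """Plain recursive edit distance that records every call."""
    calls[(i, j)] += 1
    if i == 0:
        return j
    if j == 0:
        return i
    cost = 0 if a[i - 1] == b[j - 1] else 1
    return min(
        E(a, b, i - 1, j) + 1,
        E(a, b, i, j - 1) + 1,
        E(a, b, i - 1, j - 1) + cost,
    )


distance = E("ARTS", "MATHS", 4, 5)
print("distance:", distance)
print("total calls:", sum(calls.values()))
print("distinct subproblems:", len(calls))
```

The total number of calls is far larger than the number of distinct (i, j) pairs, which is the waste that the table-based approach removes.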
 To avoid this, we will use the DP approach.
 We will use the base case E(0, j) to fill the first
row and the base case E(i, 0) to fill the first
column.
 We will fill the remaining E matrix row by row.
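
The table-filling procedure can be sketched in Python as follows, assuming the matrix is stored as a list of lists with m + 1 rows and n + 1 columns. The function name `edit_distance_table` is hypothetical.

```python
def edit_distance_table(a, b):
    """Fill the whole E matrix bottom-up and return it."""
    m, n = len(a), len(b)
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        E[0][j] = j  # first row: base case E(0, j) = j
    for i in range(m + 1):
        E[i][0] = i  # first column: base case E(i, 0) = i
    for i in range(1, m + 1):  # remaining entries, row by row
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            E[i][j] = min(
                E[i - 1][j] + 1,         # deletion
                E[i][j - 1] + 1,         # insertion
                E[i - 1][j - 1] + cost,  # substitution or match
            )
    return E


table = edit_distance_table("ARTS", "MATHS")
for row in table:
    print(row)
print("E(4, 5) =", table[4][5])
```

Each entry depends only on its left, upper, and upper-left neighbours, so filling the matrix row by row guarantees that those values are already available.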
 Possible edit scripts. The red arrows from E[0,
0] to E[4, 5] show the paths that can be
followed to extract edit scripts.
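
One way to extract an edit script is to walk back from E[m, n] to E[0, 0], at each step moving to a neighbour that explains the current value. The sketch below returns a single optimal transcript over M, S, I, D. Ties are broken by preferring the diagonal move, which is an arbitrary choice made here, and the function name is hypothetical.

```python
def edit_transcript(a, b):
    """Return one optimal edit transcript turning a into b."""
    m, n = len(a), len(b)
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        E[0][j] = j
    for i in range(m + 1):
        E[i][0] = i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            E[i][j] = min(E[i - 1][j] + 1, E[i][j - 1] + 1,
                          E[i - 1][j - 1] + cost)

    # Trace back from E[m][n] to E[0][0].
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and a[i - 1] == b[j - 1]
        if i > 0 and j > 0 and E[i][j] == E[i - 1][j - 1] + (0 if same else 1):
            ops.append("M" if same else "S")  # diagonal move
            i -= 1
            j -= 1
        elif i > 0 and E[i][j] == E[i - 1][j] + 1:
            ops.append("D")  # move up: delete a[i - 1]
            i -= 1
        else:
            ops.append("I")  # move left: insert b[j - 1]
            j -= 1
    return "".join(reversed(ops))


print(edit_transcript("ARTS", "MATHS"))
```

Other tie-breaking rules follow different paths through the matrix and yield different, equally short, edit scripts.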
 There are Θ(mn) entries in the matrix, which is
Θ(n^2) when both strings have length n. Each
entry E(i, j) takes Θ(1) time to compute. The
total running time is Θ(mn).
