CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Dynamic Programming
Atul Gupta
Problem Solving
Standard design principles/techniques looked so far
Divide and Conquer Graph Algorithms Greedy
Good aspect : Fast Algorithms Not so good aspect: specific and limited use
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Dynamic Programming
A sledgehammer of the algorithms craft Can be applied in many situations where the other techniques will fail However, a little more costly
DAG
There are two subclasses of graphs that have no negative cycles
Graph without negative edges Graph without cycles (DAG)
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
DAG
An important property
They can be linearized, i.e., In any path of a DAG, the vertices appear in increasing linearized order This allows to find shortest path in linear time
Shortest Paths in DAG
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Shortest Paths in DAG
Shortest Paths in DAG
The Algorithm
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Dynamic Programming
a problem is solved by identifying a collection of sub-problems and tackling them one by one,
smallest first, using the answers to small problems to help figure out larger ones, until the whole lot of them is solved
Dynamic Programming and DAG
A DAG is not given but implicit
Nodes are sub-problems Edges are dependencies between the subproblems An edge from A to B indicate, we require solution of A to solve for B, and hence, A have to be a smaller problem than B
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
DP Example 1: LIS
LIS Longest Increasing Subsequences
Input: A sequence of numbers Output: Find the increasing subsequence of greatest length
LIS
The DAG
The Goal: Find the longest path in the DAG
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
LIS
The Algorithm
The Complexity? n^2 The sequence itself?
Solving LIS by Recursion?
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Revisiting Fibonacci
Using Recursion
Why did recursion worked in Divide and Conquer (E.g. Binary Search, Merge sort, etc)
sub-problems were substantially smaller! the recursion tree had logarithmic depth and polynomial number of nodes !!!
Why recursion does not work here (i.e. say, in LIS)
sub-problems were not smaller! the recursion tree had polynomial depth and exponential number of nodes !!! Most of the sub-problems are REPEATS only
In Dynamic programming, the efficiency is achieve by
explicitly enumerating the distinct sub-problems, and solving them in the right order
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Another Problem: Edit Distance
How a spell checker search for a close match when encounters a misspelled word?
S S
N O W N N Cost: 3
Y Y
S U
N O W N Cost: 5
Y Y
Three edits for the first case: insert U, substitute O and delete W (Cost : 3)
N,
Terribly inefficient if we follow a nave approach!!!
A DP Solution for Edit Distance
1. What are the sub-problems? 2. There is an ordering on the sub-problems, and a relation that shows how to solve a subproblem, given the answers to smaller subproblems
(A DAG representing the problem graph)
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Sub-Problems for Edit Distance
Given two strings of size x[1..m] and y[1..n], Find the minimum Edit Distance between them
Find Edit Distance between two prefix strings x[1..i] and y[1..j] Call this problem as E(i, j) The original problem is then E(m, n)
Solving A Sub-Problem
Solving E(i, j)
Three cases for the rightmost column
E(i 1, j),
E(i, j 1)
diff(i, j) + E(I 1, j1)}
E(i, j) = min{1 + E(i 1, j), 1 + E(i, j 1), diff(i, j) + E(I 1, j1)}, where diff(i, j) is defined to be 0 if x[i] = y[j] and 1 otherwise.
10
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Edit Distance: An Example
E(7, 5) =
Solving Sub-problems: Edit Distance
E(i, j) = min{1 + E(i 1, j), 1 + E(i, j 1), di(i, j) + E(I 1, j1)}
11
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Edit Distance: Smallest Sub-problems
The Base Cases
E(i, 0) = i E(j, 0) = j
Edit Distance
The Algorithm
12
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Edit Distance: The Example
Edit Distance: The Examples DAG
One Solution -
13
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Edit Distance
A category of efficient String matching algorithms A more general form of the problem may require considering different weights for insertions, deletions, and/or substitutions
Computational Genomics
Our bodies are extraordinary machines Capability is programmed as DNA sequence
3 billion characters long over the alphabet {A, C, G, T }
The DNA sequences of any two people differ by only about 0.1%
still leaves three million positions on which they vary
14
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
Computational Genomics
DNA is further broken down into smaller units called genes Computers have become a crucial tool in understanding the genes Computational Genomics
A new gene is discovered, and looked for places
A variation of Edit Distance and Dynamic Programming The database GenBank already has a total length of over 1010 and counting
Methods for sequencing DNA to construct it The EVOLUTION
More Problems on DP
DP solutions for some more problems are included in the next lecture slide
15
CS 306 DAA, IIITDM Jabalpur, 2012
10/11/2012
References
Chapter 6, Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani
16