Warning! You are probably being recorded (and transcribed).
Approximate overview
None
(s,t) represents “number of characters in the LCS between
S[s:] and T[t:]. S (indexed by s)
0:D 1:E 2:F 3:E 4:A 5:T 6:
+---+---+---+---+---+---+---+
0:F | | | | | | | 0 |
+---+---+---+---+---+---+---+
1:A | | | | | | | 0 |
+---+---+---+---+---+---+---+
T 2:T | | | | | | | 0 |
+---+---+---+---+---+---+---+
3:E | | | | | | | 0 |
+---+---+---+---+---+---+---+
4: | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
// Find the length of the longest common substring between S and T.
lcs(S, T)
return lcs_kernel(S, 0, T, 0 )
lcs_kernel(S, s, T, t)
if (table[s,t] is empty)
if (s >= S.length) or (t >= T.length)
table[s,t] = 0
else if S[s] == T[t]
table[s,t] = 1 + lcs_kernel(S, s+1, T, t+1)
else
table[s,t] = max(lcs_kernel(S, s+1, T, t) // Drop in S
lcs_kernel(S, s, T, t+1)) // Drop in T
return table[s,t]
Which of the approaches do you want to use (left to right or right to left)?
How might we initialize parts of the table?
How do we fill in the rest of the table?
lcs(S,T)
lcs_table = fill_lcs(S, T)
return lcs_table[0,0]
fill_lcs(S, T)
// Initialize (fill in the last row and column)
for (col = 0; col <= S.length; col++)
table[T.length, col] = 0
for (row = 0; row <= T.length; row++)
table[row, S.length] = 0;
// Fill the rest
for (row = T.length-1; row >=0; row--)
for (col = S.length-1; col >= 0; col--)
if (S[col] == T[row])
table[row,col] = 1 + table[row+1,col+1]
else
table[row,col] = max(table[row+1, col], table[row, col+1])
return table
S (indexed by s)
0:D 1:E 2:F 3:E 4:A 5:T 6:
+---+---+---+---+---+---+---+
0:F | 3 | 3 | 3 | 2 | 2 | 1 | 0 |
+---+---+---+---+---+---+---+
1:A | 2 | 2 | 2 | 2 | 2 | 1 | 0 |
+---+---+---+---+---+---+---+
T 2:T | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
+---+---+---+---+---+---+---+
3:E | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
4: | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+
The iterative algorithm should be \(O(|S|x|T|)\) because the table is that size and we can fill in each cell in constant time..
Once we’ve built the table, how can you extract the LCS?
Note that Problem Set 5 and Assessment 4 will ask a similar question.
Approach one: Walk the first row, any time you drop the value, add the character.
extractLCS(S,T,table)
lcs = {};
row = 0, col = 0
while (row < S.length && col < T.length)
if (S[col] = T[row])
lcs.push(S[col])
row += 1
col += 1
else if (table[row,col] == table[row+1,col])
row += 1
else
col += 1
return lcs
Given two strings, \(S\) and \(T\), what is the fewest number of changes needed to convert \(S\) to \(T\)?
Valid changes:
delete(i) - delete the letter at position i in S.insert(i,ch) - insert the letter ch immediately before position i.replace(i,ch) - replace the letter at position i with ch.
(This is often called substitute.)A bad approach
// "samr" delete(1) // "smr" delete(1) // "sr" delete(1) // "s" insert(1, 'p') // "sp" insert(2, 'a') // "spa" insert(3, 'm') // "spam"
A less-bad approach
// "samr"
replace(1, 'p') // "spmr"
replace(2, 'a') // "spar"
replace(3, 'm') // "spam"
Can we do better?
// "samr"
insert(1, 'p') // "spamr"
delete(3) // "spam"
// "kitten"
delete(0) // "itten"
insert(0,'s') // "sitten"
insert(1,'p') // "spitten"
insert(2,'l') // "splitten"
insert(6,'i') // "splittin"
insert(8,'g') // "splitting"
// "splitting"
Iimprove by making the first step replace(0, 's')
Write a recursive procedure, ed(S,T) that returns the minimum edit
distance between S and T.
Things to think about:
ed(S, T)
// Base case(s)
if (S == T)
return 0
else if (S.length == 0)
return T.length
else if (T.length == 0)
return S.length
// Easy case
else if S[0] == T[0]
return ed(S.substring(1), T.substring(1))
// Minimize
else
return min(1 + ed(S.substring(1), T), // delete first character of S
1 + ed(S, T.substring(1)), // insert first character of T at start of S
1 + ed(S.substring(1),
T.substring(1))) // replace first character of S with 1st of T
ed(S[0:row],T[0:col]).row elements of Scol elements of T. TARGET
s p a m
0 1 2 3 4
+---+---+---+---+---+
0 | | | | | |
S +---+---+---+---+---+
O s 1 | | | | | |
U +---+---+---+---+---+
R a 2 | | | | | |
C +---+---+---+---+---+
E m 3 | | | | | |
+---+---+---+---+---+
r 4 | | | | | |
+---+---+---+---+---+
Note that we’ve flipped almost everything from our LCS solution.
QUESTION: HOW DO WE DO THIS ITERATRIVELY
ed(S,T)
edTable = fillTableED(S,T)
return edTable[S.length, T.length]
fillTableED(S,T)
// Initialize
// Loop
for (row = ?; row ? ?; row++)
for (col = ?; col ? ?; col++)
CODE
edits(S,T)
edTable = fillTableED(S,T)
return editsHelper(S, T, edTable, S.length, T.length, {})
editsHelper(S, T, edTable, col, row, edits)
// If we reach the top left of the table, no edits are needed
if (col == 0) && (row == 0)
return edits
// If the target is empty, we need to ?
else if (col == 0)
?
// If the source is empty, we need to ?
else if (row == 0)
?
// If the current characters match, we don't need an edit here
else if S[row] == T[col]
return editsHelper(S, T, edTable, col-1, row-1, edits)
// At this point, we need an edit
else
// Compute the best direction
best = min(edTable[row-1, col-1], // Represents ?
edTable[row-1, col], // Represents ?
edTable[row, col-1]) // Represents ?
if (edTable[row-1, col-1] == best)
edits.append(?)
return editsHelper(S, T, edTable, row-1, col-1, edits)
else if (edTable[row-1, col] == best)
edits.append(?)
return editsHelper(S, T, edTable, row-1, col, edits)
else
edits.append(?)
return editsHelper(S, T, edTable, row, col-1, edits)
We can also make this iterative.