Longest Common Subsequence
HARSHADA SONKAMBLE
Longest Common Subsequence
● The longest common subsequence (LCS) is defined as the longest subsequence that is
common to all the given sequences, provided that the elements of the subsequence
are not required to occupy consecutive positions within the original sequences.
● A subsequence is nothing but a series of elements that occur in the same order but are
not necessarily contiguous.
● If two sequences are given in the LCS problem, then our task is to identify a common
subsequence that has a maximum length.
Example
S1 = “ABCDE” S2 = “CDE”
Common Subsequences: “C”, “D”, “E”, “CD”, “DE”, “CE”, “CDE”
Out of these common subsequences, CDE has the maximum length, so it is the
longest common subsequence of S1 and S2.
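The example above can be checked by brute force. The following is a minimal sketch that enumerates every non-empty subsequence of both strings and intersects the two sets; the function names `subsequences` and `common_subsequences` are illustrative and not part of the original slides. It takes exponential time, so it is only suitable for very short strings like these.

```python
from itertools import combinations


def subsequences(s):
    """Return every non-empty subsequence of s as a set of strings."""
    # combinations() keeps the original left-to-right order of the characters,
    # which is exactly the definition of a subsequence.
    return {
        "".join(chars)
        for r in range(1, len(s) + 1)
        for chars in combinations(s, r)
    }


def common_subsequences(s1, s2):
    """Return the subsequences that occur in both s1 and s2."""
    return subsequences(s1) & subsequences(s2)


common = common_subsequences("ABCDE", "CDE")
longest = max(common, key=len)

# Shortest first, then alphabetical, to make the list easy to compare.
print(sorted(common, key=lambda t: (len(t), t)))
print(longest)
```

For S1 = "ABCDE" and S2 = "CDE" this yields the same seven common subsequences listed in the example, with CDE as the longest.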
Algorithm
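As a rough sketch of the standard dynamic-programming approach that the solved example walks through: each table entry is 1 plus the upper diagonal entry when the two characters match, and otherwise the maximum of the entry above and the entry to the left. The name `lcs_length` and the table name `c` are assumptions made for illustration.

```python
def lcs_length(x, y):
    """Return the length of a longest common subsequence of x and y."""
    m, n = len(x), len(y)
    # c[i][j] holds the LCS length of x[:i] and y[:j];
    # row 0 and column 0 stay 0 (empty prefix).
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Same character: add 1 to the upper diagonal value.
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                # Different characters: take the larger neighbouring value.
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]


print(lcs_length("ABCDE", "CDE"))  # 3
```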
Solved Example -
Consider two strings:
X = a b a a b a
Y = b a b b a b
Since the characters are different, we take the maximum value. Both
neighbouring entries contain the same value, i.e., 0, so put 0 in (a, b).
Suppose we take the 0 value from the 'X' string, so we put the arrow towards
'a' as shown in the above table.
Both characters are the same, so the value is calculated by adding 1 to the
upper diagonal value. Here, the upper diagonal value is 0, so the value of
this entry is 1 + 0 = 1. Since we are using the upper diagonal value, the
arrow points diagonally.
Since the characters are different, we take the maximum value. The entry for
character 'a' has the maximum value, i.e., 1. The new entry, i.e., (a, b),
will contain the value 1, with the arrow pointing to that 1.
Since the characters are different, we take the maximum value. The entry for
character 'a' has the maximum value, i.e., 1. The new entry, i.e., (a, b),
will contain the value 1, with the arrow pointing to that 1.
Both characters are the same, so the value is calculated by adding 1 to the
upper diagonal value. Here, the upper diagonal value is 0, so the value of
this entry is 1 + 0 = 1. Since we are using the upper diagonal value, the
arrow points diagonally.
Since the characters are different, we take the maximum value. The entry for
character 'a' has the maximum value, i.e., 1. The new entry, i.e., (a, b),
will contain the value 1, with the arrow pointing to that 1.
In this way, fill the complete table. The final table would be:
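The filled table can be reproduced with a short sketch. It assumes that X labels the rows and Y labels the columns, and that a tie between the left and upper neighbours is resolved towards the left; both choices are assumptions, as are the names `lcs_table`, `value` and `arrow`. The arrows are recorded as letters: D (diagonal), L (left), U (up).

```python
def lcs_table(x, y):
    """Fill the LCS table for x (rows) and y (columns), recording arrows."""
    m, n = len(x), len(y)
    value = [[0] * (n + 1) for _ in range(m + 1)]
    arrow = [[" "] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Same character: 1 + upper diagonal value, arrow is diagonal.
                value[i][j] = value[i - 1][j - 1] + 1
                arrow[i][j] = "D"
            elif value[i][j - 1] >= value[i - 1][j]:
                # Different characters, left neighbour is at least as large.
                value[i][j] = value[i][j - 1]
                arrow[i][j] = "L"
            else:
                # Different characters, upper neighbour is larger.
                value[i][j] = value[i - 1][j]
                arrow[i][j] = "U"
    return value, arrow


X, Y = "abaaba", "babbab"
value, arrow = lcs_table(X, Y)

# Print one row per character of X, one column per character of Y.
print("    " + "  ".join(Y))
for i, ch in enumerate(X, start=1):
    cells = [f"{value[i][j]}{arrow[i][j]}" for j in range(1, len(Y) + 1)]
    print(f"{ch}   " + " ".join(cells))
```

Under these assumptions the first row of values is 0 1 1 1 1 1, which matches the six steps described above, and the bottom-right entry is 4.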
In the above table, we can observe that all the entries are filled. We start at the last
cell, which has the value 4. This value came from the previous column, so move to the
previous column. In that column, 4 was calculated using the diagonal value. When moving to
the diagonal, take that character, i.e., a.
After moving diagonally we have the value 3. Again, this value came from the previous
column, so move to the previous column. In that column, 3 was calculated using the
diagonal value, so when moving to the diagonal, take that character, i.e., b.
After moving diagonally we have the value 2. This value came from the diagonal. Move to
the diagonal and take that character, i.e., a.
After moving diagonally we have the value 1. This value came from the diagonal. Move to
the diagonal and take that character, i.e., b.
Reading the collected characters in reverse order, the longest common subsequence of the
given strings is: baba
Length of the subsequence: 4
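The walk back through the table can be sketched as follows. This is one possible implementation, not necessarily the one used to produce the slides: the function name `lcs` is illustrative, and the rule of moving left when the left and upper neighbours are equal is an assumption chosen to mirror the "previous column" moves described above.

```python
def lcs(x, y):
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Trace back from the last cell, collecting characters on diagonal moves.
    collected = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            collected.append(x[i - 1])  # value came from the diagonal
            i -= 1
            j -= 1
        elif table[i][j - 1] >= table[i - 1][j]:
            j -= 1  # value came from the previous column (ties go left)
        else:
            i -= 1  # value came from the previous row
    # Characters were collected from the end, so reverse them.
    return "".join(reversed(collected))


result = lcs("abaaba", "babbab")
print(result, len(result))  # baba 4
```

With a different tie-breaking rule the traceback can return another subsequence of the same length, for example abab, which is also common to both strings.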
Time Complexity
The time taken by the dynamic programming approach to fill the table is O(mn), where m
and n are the lengths of the two sequences.
Longest Common Subsequence Applications
1. In compressing genome resequencing data
2. To authenticate users within their mobile phone through in-air signatures