Longest Common Subsequence

The document discusses algorithms for finding the longest common subsequence (LCS) of two strings. It presents: 1. The definition of LCS, with examples. 2. A dynamic programming formulation that finds an LCS in O(nm) time, where n and m are the string lengths. 3. A faster method that reduces LCS to finding a longest increasing subsequence, which can be found together with a smallest cover in O(n log n) time for a sequence of n numbers.

Uploaded by

Pervez_Alam_7255

Longest Common Subsequence

Definition: The longest common subsequence (LCS) of two strings S1 and S2 is the longest subsequence common to both strings.
S1: A -- A T -- G G C C -- A T A    n=10
S2: A T A T A A T T C T A T --      m=12

The LCS is AATCAT. The length of the LCS is 6.

The solution is not unique for every pair of strings. Consider the pair (ATTA, ATAT): both ATT and ATA are solutions. In general, an arbitrary pair of strings may have many solutions.

LCS Theorem

The LCS can be found by a dynamic programming formulation. One can easily show:
Theorem: With a score of 1 for each match and 0 for each mismatch or space, the matched characters in an alignment of maximum value form an LCS.
Since this uses the general dynamic programming algorithm, its complexity is O(nm).
The longest common substring problem, on the other hand, has an O(n+m) solution. Subsequences are much more complex than substrings.
Can we do better for the LCS problem? We will see …

S1 : A -- A T -- G G C C -- A T A n=10
S2: A T A T A A T T C T A T -- m=12

The optimal alignment is shown above. Note that the alignment shows three insert operations (dark), one delete operation (green) and three substitution or replacement operations (blue), which gives an edit distance of 7.
However, the 3 replacement operations can be realized by 3 insert and 3 delete operations, because a replacement is equivalent to first deleting the character and then inserting a character in its place, like:
G -- G -- C --
-- A -- T -- T

If we give a cost of 2 to the replace operation and a cost of 1 to both the insert and delete operations, the minimum edit distance D can be computed in terms of the length L of the LCS as:
D = m + n − 2L
For the above example, n=10, m=12, L=6.
So D = 12 + 10 − 2·6 = 10 (6 inserts and 4 deletes).
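
The relation D = m + n − 2L can be checked numerically. The following is a minimal sketch, assuming the cost model stated above (insert = delete = 1, replace = 2); the function name `weighted_edit_distance` is made up for this illustration and does not come from the slides.

```python
def weighted_edit_distance(s1, s2):
    """Edit distance with insert = delete = 1 and replace = 2 (assumed costs)."""
    n, m = len(s1), len(s2)
    # D[i][j] = cost of turning the first i chars of s1 into the first j chars of s2
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i          # delete all i characters
    for j in range(m + 1):
        D[0][j] = j          # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            replace_cost = 0 if s1[i - 1] == s2[j - 1] else 2
            D[i][j] = min(
                D[i - 1][j] + 1,                 # delete s1[i-1]
                D[i][j - 1] + 1,                 # insert s2[j-1]
                D[i - 1][j - 1] + replace_cost,  # match or replace
            )
    return D[n][m]


s1, s2 = "AATGGCCATA", "ATATAATTCTAT"   # the two strings without gap symbols
L = 6                                    # LCS length stated in the text
print(weighted_edit_distance(s1, s2), len(s1) + len(s2) - 2 * L)
```

Both printed values should agree, since with these costs a replacement is never cheaper than a delete followed by an insert.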

Direct Computation of LCS by Dynamic Programming
This is more efficient than the general alignment algorithm, although the asymptotic complexity remains the same, O(nm).
Let L(i, j) denote the length of an LCS of the first i characters of S1 and the first j characters of S2. The equations are given below without proof (which is simple).
L(0, 0) = 0
L(i, 0) = 0
L(0, j) = 0
L(i, j) = 1 + L(i − 1, j − 1)              if S1(i) = S2(j)
L(i, j) = max[L(i, j − 1), L(i − 1, j)]    if S1(i) ≠ S2(j)

Again, if we leave suitable back pointers in the matrix, trace(s) can be derived for the LCS.
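
The recurrence can be sketched in a few lines. This is an illustrative implementation, not the slides' own code: the function name `lcs` is an assumption, and instead of storing explicit back pointers it re-derives each step from the table during the trace, which recovers one LCS out of possibly many.

```python
def lcs(s1, s2):
    """Return (table, one LCS) using the recurrence for L(i, j)."""
    n, m = len(s1), len(s2)
    # Row 0 and column 0 hold the boundary values L(i, 0) = L(0, j) = 0.
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s1[i - 1] == s2[j - 1]:
                L[i][j] = 1 + L[i - 1][j - 1]
            else:
                L[i][j] = max(L[i][j - 1], L[i - 1][j])

    # Trace back from L(n, m); ties are broken toward moving up (an arbitrary choice).
    chars = []
    i, j = n, m
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            chars.append(s1[i - 1])
            i, j = i - 1, j - 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L, "".join(reversed(chars))


table, seq = lcs("ATCAAT", "TAATCATA")
print(table[6][8], seq)
```

The table uses O(nm) space; if only the length is needed, two rows suffice.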

Example 3: Edit Distance

S1 = A T C A A T        (n = 6)
S2 = T A A T C A T A    (m = 8)
The table below gives L = L(6, 8) = 5, so D = 6 + 8 − 2·5 = 4.

        j   0   1   2   3   4   5   6   7   8
  i             T   A   A   T   C   A   T   A
  0         0   0   0   0   0   0   0   0   0
  1   A     0   0   1   1   1   1   1   1   1
  2   T     0   1   1   1   2   2   2   2   2
  3   C     0   1   1   1   2   3   3   3   3
  4   A     0   1   2   2   2   3   4   4   4
  5   A     0   1   2   3   3   3   4   4   5
  6   T     0   1   2   3   4   4   4   5   5

A Faster Algorithm for LCS

We now present an algorithm for determining the LCS that can be asymptotically better than O(nm).
This implies that for special cases of edit distance there exist more efficient algorithms.
Definition:
Let π be a list of n integers, not necessarily distinct.
Definition:
An increasing subsequence (IS) of π is a subsequence of π whose values are strictly increasing from left to right.
Example: π=(5,3,4,4,9,6,2,1,8,7,10).
IS=(3,4,6,8,10), (5,9,10)

Definition:
A longest increasing subsequence (LIS) of π is an IS of π of maximum length.
Definition:
A decreasing subsequence (DS) of π is a non-increasing subsequence of π.
Example: DS=(5,4,4,2,1).

Definition:
A cover is a set of disjoint DS of π that together contain all elements of π. The size c of the cover is the number of DS in the cover.
Example: π=(5,3,4,9,6,2,1,8,7). Cover: {(5,3,2,1), (4), (9,6), (8,7)}. c = number of DS = 4.
Definition:
A smallest cover (SC) is a cover with a minimum value of c.

Determine LIS and SC simultaneously in O(n log n)
Lemma:
If I is an IS of π with length equal to the size c of a cover C of π, then I is a LIS of π and C is a smallest cover of π.

Proof

An increasing subsequence cannot contain more than one element from any decreasing subsequence.
This means that no increasing subsequence can be longer than the size of any cover C: if
C = C1 ∪ C2 ∪ ... ∪ Ck
then at most one element from each Ci can participate in any increasing subsequence.
Thus any IS has length at most |C| = c, so an IS I of length c is a LIS. Conversely, C must be a smallest cover. If not, let C' be a cover of size c' < c. Then I, which has c elements, must contain more than one element from one of the decreasing subsequences of C', which is not possible.
Hence C has to be of smallest size.

Construction of a cover

Greedy algorithm to derive a cover:

Starting from the left of π, examine each successive number in π.
Append the current number to the left-most subsequence derived so far to which it can be appended while maintaining the decreasing sequence property.
If there is no such subsequence, start a new decreasing subsequence beginning with the current element.
Proceed until π is exhausted.

Example

π=(5,3,4,9,6,2,1,8,7,10)
D1=(5,3,2,1), D2=(4), D3=(9,6), D4=(8,7), D5=(10)
The algorithm has O(n^2) complexity. We will present an O(n log n) algorithm.
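
The greedy procedure can be sketched directly as a double loop, which makes the quadratic cost visible. The function name `greedy_cover` is chosen for this illustration only, and the test `seq[-1] >= x` assumes the non-increasing definition of a decreasing subsequence given earlier.

```python
def greedy_cover(pi):
    """Greedy cover of pi by decreasing (non-increasing) subsequences, O(n^2)."""
    cover = []
    for x in pi:
        # Scan the subsequences from left to right and use the first one
        # whose last element is not smaller than x.
        for seq in cover:
            if seq[-1] >= x:
                seq.append(x)
                break
        else:
            # No existing subsequence can take x: start a new one.
            cover.append([x])
    return cover


print(greedy_cover([5, 3, 4, 9, 6, 2, 1, 8, 7, 10]))
# [[5, 3, 2, 1], [4], [9, 6], [8, 7], [10]]
```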

An Efficient Algorithm for Constructing the Cover
We use a data structure which is a list L containing the last number of each of the decreasing sequences being constructed.
The list is always sorted in increasing order. Each entry also includes an identifier indicating which decreasing sequence the number belongs to.
Procedure Decreasing Sequence Cover
Input: π = (x1, x2, ..., xn), the list of input numbers.
Output: the set of decreasing sequences Di constituting the cover.

O(n log n) Algorithm

Initialize: j←1; D1=(x1); L=((x1, 1));

For i=2 to n do
Search the x-fields of L to find the first entry (x, k) such that xi ≤ x (for distinct values this is the same as xi < x). By binary search this takes O(log n) time.
If such an entry exists, then append xi to the end of the list Dk and replace x by xi in that entry of L. This step takes constant time.
If such an entry does not exist in L, then set j←j+1, append a new entry (xi, j) to L and start a new decreasing sequence Dj=(xi).
End

Lemma:
At any point in the execution of the algorithm, the list L is sorted in increasing order with respect to the x-values as well as with respect to the identifier values.
In fact, two separate lists are better from a practical implementation point of view.
Theorem:
The greedy cover can be constructed in O(n log n) time. A longest increasing subsequence and a smallest cover can thus be constructed in O(n log n) time.
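
A minimal sketch of the procedure, assuming Python's standard `bisect` module for the binary search. Following the remark about separate lists, the x-values of L are kept in their own list `tails`, and the identifier of an entry is simply its index. The name `cover_nlogn` is hypothetical, and the sketch returns only the cover; by the lemma, its size equals the length of a LIS.

```python
from bisect import bisect_left


def cover_nlogn(pi):
    """Greedy decreasing-sequence cover of pi in O(n log n) time."""
    tails = []   # tails[k] = last number of cover[k]; always sorted increasingly
    cover = []   # cover[k] = k-th decreasing sequence
    for x in pi:
        # First position whose tail is >= x, found by binary search.
        k = bisect_left(tails, x)
        if k < len(tails):
            cover[k].append(x)   # extend the left-most sequence that can take x
            tails[k] = x
        else:
            cover.append([x])    # start a new decreasing sequence
            tails.append(x)
    return cover


cover = cover_nlogn([5, 3, 4, 9, 6, 2, 1, 8, 7, 10])
print(cover)        # [[5, 3, 2, 1], [4], [9, 6], [8, 7], [10]]
print(len(cover))   # 5, the length of a longest increasing subsequence
```

Replacing `tails[k]` by `x` keeps the list sorted, because every tail to the left of position k is smaller than x.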

Example: π=(5,3,4,9,6,2,1,8,7,10)

  i   xi   L                                  D1          D2    D3      D4      D5
  1   5    {(5,1)}                            (5)
  2   3    {(3,1)}                            (5,3)
  3   4    {(3,1),(4,2)}                      (5,3)       (4)
  4   9    {(3,1),(4,2),(9,3)}                (5,3)       (4)   (9)
  5   6    {(3,1),(4,2),(6,3)}                (5,3)       (4)   (9,6)
  6   2    {(2,1),(4,2),(6,3)}                (5,3,2)     (4)   (9,6)
  7   1    {(1,1),(4,2),(6,3)}                (5,3,2,1)   (4)   (9,6)
  8   8    {(1,1),(4,2),(6,3),(8,4)}          (5,3,2,1)   (4)   (9,6)   (8)
  9   7    {(1,1),(4,2),(6,3),(7,4)}          (5,3,2,1)   (4)   (9,6)   (8,7)
 10   10   {(1,1),(4,2),(6,3),(7,4),(10,5)}   (5,3,2,1)   (4)   (9,6)   (8,7)   (10)

The x-components of the list, if separated, look like the following during execution:
(5), (3), (3,4), (3,4,9), (3,4,6), (2,4,6), (1,4,6), (1,4,6,8), (1,4,6,7), (1,4,6,7,10)

Reduction of the LCS problem to the LIS problem
Definition:
Given sequences S1 and S2, let r(i) be the number of occurrences of the ith character of S1 in S2.

Example: S1 = a b a c x and S2 = b a a b c a
Position index in S2:  1 2 3 4 5 6
S2:                    b a a b c a
Then r(1)=3, r(2)=2, r(3)=3, r(4)=1, r(5)=0.

Definition:
For each distinct character x in S1, define list(x) to be the positions of x in S2 in decreasing order.
Example: list(a)=(6,3,2), list(b)=(4,1), list(c)=(5), list(x)=φ (empty sequence).

Definition: Let Π(S1,S2) be the sequence obtained by concatenating list(si) for i=1,2,…,n, where n is the length of S1 and si is the ith symbol of S1.
Example: Π(S1,S2) = (6,3,2,4,1,6,3,2,5).
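
The three definitions translate almost literally into code. The sketch below is only an illustration: the function name `build_pi` and the dictionary `lists` are assumptions, and positions are 1-based to match the examples.

```python
def build_pi(s1, s2):
    """Return (r, lists, pi) for s1 and s2, with 1-based positions in s2."""
    # list(x): positions of x in s2, in decreasing order, for each distinct x in s1.
    lists = {
        ch: [pos for pos in range(len(s2), 0, -1) if s2[pos - 1] == ch]
        for ch in set(s1)
    }
    # r(i): number of occurrences of the i-th character of s1 in s2.
    r = [len(lists[ch]) for ch in s1]
    # Pi(S1, S2): concatenation of list(s_i) for i = 1..n.
    pi = [pos for ch in s1 for pos in lists[ch]]
    return r, lists, pi


r, lists, pi = build_pi("abacx", "baabca")
print(r)    # [3, 2, 3, 1, 0]
print(pi)   # [6, 3, 2, 4, 1, 6, 3, 2, 5]
```

The length of Π is the sum of the r(i), so it depends on how often characters repeat and can be as large as nm.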

Theorem:
Every increasing subsequence I of Π(S1,S2) specifies a common subsequence of S1 and S2 of equal length, and vice versa. Thus a longest common subsequence of S1 and S2 corresponds to a longest increasing subsequence of Π(S1,S2).
Example: Π(S1,S2) = (6,3,2,4,1,6,3,2,5). Longest increasing subsequences of Π(S1,S2), used as indices into S2, yield LCSs such as (1,2,5) = b a c, (2,3,5) = a a c and (3,4,6) = a b a for S1 = a b a c x and S2 = b a a b c a.
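
Putting the reduction together, an LCS can be read off from a longest increasing subsequence of Π(S1,S2). The sketch below is one possible realization, not the slides' own procedure: it uses the common binary-search formulation of LIS with predecessor links (an assumption made here so the subsequence itself can be recovered), and the name `lcs_via_lis` is hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_via_lis(s1, s2):
    """One LCS of s1 and s2, via a strictly increasing subsequence of Pi."""
    # list(x): 1-based positions of each character of s2, in decreasing order.
    positions = defaultdict(list)
    for pos in range(len(s2), 0, -1):
        positions[s2[pos - 1]].append(pos)
    pi = [pos for ch in s1 for pos in positions.get(ch, [])]

    tail_vals = []          # tail_vals[k] = smallest last value of an IS of length k+1
    tail_idx = []           # index in pi where that value sits
    prev = [-1] * len(pi)   # predecessor links for reconstruction
    for idx, value in enumerate(pi):
        k = bisect_left(tail_vals, value)   # strict increase: equal values are replaced
        prev[idx] = tail_idx[k - 1] if k > 0 else -1
        if k == len(tail_vals):
            tail_vals.append(value)
            tail_idx.append(idx)
        else:
            tail_vals[k] = value
            tail_idx[k] = idx

    # Follow the links back from the end of a longest IS and read characters from s2.
    chars = []
    idx = tail_idx[-1] if tail_idx else -1
    while idx != -1:
        chars.append(s2[pi[idx] - 1])
        idx = prev[idx]
    return "".join(reversed(chars))


print(lcs_via_lis("abacx", "baabca"))   # bac
```

The running time of this sketch is O(r log r) after building Π, where r is the length of Π. That is an improvement over O(nm) only when the strings have few matching character pairs.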
