Compare sequences in Python using the difflib module
Last Updated :
24 Feb, 2021
The difflib module in the Python standard library provides classes and functions for comparing sequences. It can be used to compare files and to report the differences in several formats, including HTML and context and unified diffs.
It provides the following classes and functions for comparing sequences:
Class SequenceMatcher
It is a flexible class for comparing pairs of sequences of any type, as long as the sequence elements are hashable. Its main methods are discussed below:
- The ratio() method returns a similarity score between 0.0 and 1.0 for the two sequences passed to the constructor. The score is computed with the formula below.
2*X/Y
Where X is the number of matching elements and
Y is the total number of elements in both sequences.
Example 1:
Python3
# import required module
import difflib
# assign parameters
par1 = ['g', 'f', 'g']
par2 = 'gfg'
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())
Output:
1.0
Example 2:
Python3
# import required module
import difflib
# assign parameters
par1 = 'Geeks for geeks!'
par2 = 'geeks'
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())
Output:
0.47619047619047616
Example 3:
Python3
# import required module
import difflib
# assign parameters
par1 = 'gfg'
par2 = 'GFG'
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())
Output:
0.0
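The formula can be checked by hand. The sketch below reuses the strings from Example 2 and assumes that X can be obtained by summing the sizes of the blocks returned by get_matching_blocks(); the variable names are illustrative only.

```python
from difflib import SequenceMatcher

a = 'Geeks for geeks!'
b = 'geeks'
sm = SequenceMatcher(None, a, b)

# X: total number of matching elements, summed over all matching blocks
matches = sum(block.size for block in sm.get_matching_blocks())
# Y: total number of elements in both sequences
total = len(a) + len(b)

manual_ratio = 2 * matches / total
print(matches, total)              # 5 21
print(manual_ratio == sm.ratio())  # True
```

Example 3 scores 0.0 because the comparison is case-sensitive. Lowercasing both strings before comparing them would make the two sequences identical.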
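- The get_matching_blocks() method returns a list of triples describing matching subsequences. Each triple is a named tuple of the form (a, b, size) and means that the slice of the first sequence starting at index a and the slice of the second sequence starting at index b, both of length size, are equal. The last triple is always a dummy with size 0, so the loops in the examples below print an empty line for it.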
Example 1:
Python3
# import required module
import difflib
# assign parameters
par1 = 'Geeks for geeks!'
par2 = 'geeks'
# compare
matches = difflib.SequenceMatcher(
None, par1, par2).get_matching_blocks()
for ele in matches:
    print(par1[ele.a:ele.a + ele.size])
Output:
geeks
Example 2:
Python3
# import required module
import difflib
# assign parameters
par1 = 'GFG'
par2 = 'gfg'
# compare
matches = difflib.SequenceMatcher(
None, par1, par2).get_matching_blocks()
for ele in matches:
    print(par1[ele.a:ele.a + ele.size])
Output:
The comparison is case-sensitive, so there are no matching subsequences between GFG and gfg. Only the terminating zero-size block is returned, so a single empty line is printed.
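To see the triples themselves, including the terminating dummy block, the blocks can be printed directly. This is a minimal sketch that reuses the strings from Example 1; the name real_blocks is made up for illustration.

```python
from difflib import SequenceMatcher

a = 'Geeks for geeks!'
b = 'geeks'
blocks = SequenceMatcher(None, a, b).get_matching_blocks()

for block in blocks:
    print(block)
# Match(a=10, b=0, size=5)
# Match(a=16, b=5, size=0)   <- dummy terminator (len(a), len(b), 0)

# Keep only the blocks that describe an actual match
real_blocks = [block for block in blocks if block.size > 0]
print([a[m.a:m.a + m.size] for m in real_blocks])  # ['geeks']
```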
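- get_close_matches() function: This is a module-level function, not a method of SequenceMatcher. It returns a list of the best "good enough" matches. It takes word, the sequence for which close matches are wanted (usually a string), and possibilities, a list of sequences to match word against (usually a list of strings). At most n matches (3 by default) that score at least cutoff (0.6 by default) are returned, most similar first.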
Example :
Python3
# import required module
import difflib
# assign parameters
string = "Geeks4geeks"
listOfStrings = ["for", "Gks", "G4g", "geeks"]
# find common strings
print(difflib.get_close_matches(string, listOfStrings))
Output:
['geeks']
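The optional n and cutoff arguments control how many matches are returned and how similar they must be. The sketch below reuses the inputs from the example above; the cutoff values 0.4 and 0.9 are arbitrary choices for illustration.

```python
from difflib import get_close_matches

word = "Geeks4geeks"
candidates = ["for", "Gks", "G4g", "geeks"]

# Default: n=3, cutoff=0.6
default = get_close_matches(word, candidates)
print(default)  # ['geeks']

# A lower cutoff lets weaker matches through; the best match still comes first
relaxed = get_close_matches(word, candidates, n=3, cutoff=0.4)
print(relaxed)

# A very strict cutoff may leave nothing at all
strict = get_close_matches(word, candidates, cutoff=0.9)
print(strict)  # []
```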
Class Differ
This class is used for comparing sequences of lines of text and producing human-readable differences, or deltas. Each line of a Differ delta begins with a two-letter code:
Code | Meaning |
---|---|
'- ' | line unique to sequence 1 |
'+ ' | line unique to sequence 2 |
'  ' | line common to both sequences |
'? ' | line not present in either input sequence |
The main method of this class is:
- The compare() method compares two sequences of lines and generates the delta (a sequence of lines). When a plain string is passed, each of its characters is treated as a separate line.
Example 1:
Python3
# import required module
from difflib import Differ
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
# compare parameters
for ele in Differ().compare(par1, par2):
    print(ele)
Output:
- G
+ g
  e
  e
  k
  s
+ !
Example 2:
Python3
# import required module
from difflib import Differ
# assign parameters
par1 = ['Geeks','for','geeks!']
par2 = 'geeks!'
# compare parameters
for ele in Differ().compare(par1, par2):
    print(ele)
Output:
- Geeks
- for
- geeks!
+ g
+ e
+ e
+ k
+ s
+ !
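Differ is normally given lists of text lines rather than single strings. The sketch below uses two made-up lists (old and new are illustrative names) to show all four codes from the table, including the '? ' hint line that marks where two similar lines differ.

```python
from difflib import Differ

# Hypothetical input: two versions of a short text, one line per element
old = ['apple pie\n', 'banana\n']
new = ['apple pies\n', 'banana\n', 'cherry\n']

delta = list(Differ().compare(old, new))
for line in delta:
    print(line, end='')  # each line already ends with a newline

# The first two characters of every delta line are its code
codes = [line[:2] for line in delta]
print(codes)
```

The '? ' line is not part of either input. It only points at the characters that changed between the two similar lines.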
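- ndiff() function: The same comparison can be performed with this module-level function, which produces a Differ-style delta. Each element of the inputs is treated as one line, so a list of strings is compared element by element, while a plain string is compared character by character.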
Example 1:
Python3
# import required module
import difflib
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
# compare parameters
for ele in difflib.ndiff(par1, par2):
    print(ele)
Output:
- G
+ g
  e
  e
  k
  s
+ !
Example 2:
Python3
# import required module
import difflib
# assign parameters
par1 = ['Geeks','for','geeks!']
par2 = 'geeks!'
# compare parameters
for ele in difflib.ndiff(par1, par2):
    print(ele)
Output:
- Geeks
- for
- geeks!
+ g
+ e
+ e
+ k
+ s
+ !
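A delta produced by ndiff() keeps enough information to rebuild either input. As a small sketch, difflib.restore() can be applied to the delta from Example 2; the names first and second are illustrative.

```python
import difflib

a = ['Geeks', 'for', 'geeks!']
b = list('geeks!')  # same elements as iterating over the string 'geeks!'

delta = list(difflib.ndiff(a, b))

# which=1 recovers the first sequence, which=2 the second
first = list(difflib.restore(delta, 1))
second = list(difflib.restore(delta, 2))

print(first == a)   # True
print(second == b)  # True
```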
- context_diff() function: Context diffs are a compact way of showing only the lines that have changed, plus a few lines of context. The changes are shown in a before/after style. The number of context lines is set by n, which defaults to three.
Example 1:
Python3
# import required module
import difflib
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
# compare parameters
for ele in difflib.context_diff(par1, par2):
    print(ele)
Output:
***
---
***************
*** 1,5 ****
! G
  e
  e
  k
  s
--- 1,6 ----
! g
  e
  e
  k
  s
+ !
Example 2:
Python3
# import required module
import difflib
# assign parameters
par1 = ['Geeks', 'for', 'geeks!']
par2 = 'geeks!'
# compare parameters
for ele in difflib.context_diff(par1, par2):
    print(ele)
Output:
***
---
***************
*** 1,3 ****
! Geeks
! for
! geeks!
--- 1,6 ----
! g
! e
! e
! k
! s
! !
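In practice context_diff() is usually called on lists of lines, with labels for the two inputs. The sketch below is an illustration under assumed inputs: the file names old.txt and new.txt and the line contents are made up, n=1 narrows the context to one line, and lineterm='' is used because the input lines carry no trailing newlines.

```python
import difflib

# Hypothetical inputs: two versions of a five-line text
old = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
new = ['alpha', 'beta', 'GAMMA', 'delta', 'epsilon']

diff = list(difflib.context_diff(
    old, new,
    fromfile='old.txt',  # label shown after '***' in the header
    tofile='new.txt',    # label shown after '---' in the header
    n=1,                 # one line of context around each change
    lineterm='',         # do not append newlines to the control lines
))

for line in diff:
    print(line)
```

With n=1 only beta and delta appear as context around the changed line, while alpha and epsilon are left out.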