Knuth-Morris-Pratt Algorithm KENT

The Knuth-Morris-Pratt algorithm is a linear-time string matching algorithm that improves upon the naive algorithm. It preprocesses the pattern to compute a failure function that allows it to avoid re-examining characters when a mismatch occurs. This failure function encodes repeated substrings in the pattern to indicate the maximum shift possible after a mismatch. By utilizing information from previous comparisons, the Knuth-Morris-Pratt algorithm runs in O(n+m) time, which is optimal in the worst case where all characters must be examined.

Uploaded by

Grama Silviu

Knuth-Morris-Pratt Algorithm

http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/StringMatch/kuthMP.htm

Knuth, Morris and Pratt discovered the first linear-time string-matching
algorithm through a tight analysis of the naive algorithm. The
Knuth-Morris-Pratt algorithm keeps the information that the naive approach
gathers during its scan of the text and then wastes. By avoiding this waste, it
achieves a running time of O(n + m), where n is the length of the text and m is
the length of the pattern. This is optimal in the worst case, because in the
worst case an algorithm has to examine all the characters in the text and the
pattern at least once.

The Failure Function

The KMP algorithm preprocesses the pattern P by computing a failure function f
that indicates the largest possible shift s using previously performed
comparisons. Specifically, the failure function f(j) is defined as the length of
the longest prefix of P that is a suffix of P[1 . . j].

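KNUTH-MORRIS-PRATT FAILURE (P)

Input: Pattern P with m characters
Output: Failure function f for P, which maps j to the length of the longest
prefix of P that is a suffix of P[1 . . j]
i ← 1
j ← 0
f(0) ← 0
while i < m do
    if P[j] = P[i]
        f(i) ← j + 1
        i ← i + 1
        j ← j + 1
    else if j > 0
        j ← f(j - 1)
    else
        f(i) ← 0
        i ← i + 1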
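
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `failure_function` and the use of a list to hold f are assumptions made for the example, and the pattern is indexed from 0 as in the pseudocode.

```python
def failure_function(pattern: str) -> list[int]:
    """Return f, where f[j] is the length of the longest prefix of
    pattern that is also a suffix of pattern[1..j] (0-indexed)."""
    m = len(pattern)
    f = [0] * m
    i, j = 1, 0
    while i < m:
        if pattern[j] == pattern[i]:
            # j + 1 characters of the prefix match the suffix ending at i.
            f[i] = j + 1
            i += 1
            j += 1
        elif j > 0:
            # Fall back to the next shorter candidate prefix.
            j = f[j - 1]
        else:
            # No prefix matches a suffix ending at i.
            f[i] = 0
            i += 1
    return f


print(failure_function("abacab"))  # [0, 0, 1, 0, 1, 2]
```

Each entry is computed from earlier entries only, so the whole table is filled in a single left-to-right pass over the pattern.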

Note that the failure function f for P, which maps j to the length of the longest
prefix of P that is a suffix of P[1 . . j], encodes repeated substrings inside the
pattern itself.

As an example, consider the pattern P = a b a c a b. The failure function
f(j), computed by the above algorithm, is

j      0  1  2  3  4  5
P[j]   a  b  a  c  a  b
f(j)   0  0  1  0  1  2

From this table we can see, for example, that f(5) = 2: the longest prefix of
the pattern P that is also a proper suffix of P is "a b".
Consider an attempt to match at position i, that is, when the pattern
P[0 . . m-1] is aligned with the text T[i . . i + m-1].
T: a b a c a a b a c c
P: a b a c a b

Assume that the first mismatch occurs between characters T[i + j] and P[j]
for 0 < j < m. In the above example, i = 0 and the first mismatch is between
T[5] = a and P[5] = b.
Then T[i . . i + j-1] = P[0 . . j-1] = u. In the example,
T[0 . . 4] = P[0 . . 4] = u = a b a c a.
Also, T[i + j] ≠ P[j]. In the example, T[5] = a ≠ b = P[5].

When shifting, it is reasonable to expect that a prefix v of the pattern matches
some suffix of the portion u of the text. In our example, u = a b a c a and the
longest such prefix is v = a: the 'a' at the start of the pattern matches the
'a' at the end of u. Let l(j) be the length of the longest prefix of P that is
also a proper suffix of P[0 . . j-1], so that l(j) = f(j-1). Then after a shift
of j - l(j) positions, the comparisons can resume between characters T[i + j]
and P[l(j)], i.e., T[5] and P[1].

T: a b a c a a b a c c
P:         a b a c a b

Note that no comparison between T[4] and P[0] is needed here, because that
match is already known from the previous comparisons.
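
The shift in this example can be checked directly from the definition. The sketch below is illustrative only: `longest_border` is a hypothetical helper that tries every candidate length in turn, which takes quadratic time and is not how KMP itself obtains the value (KMP reads it from the failure function).

```python
def longest_border(u: str) -> int:
    """Length of the longest proper prefix of u that is also a suffix of u,
    found by checking candidates directly (quadratic, for illustration)."""
    for k in range(len(u) - 1, 0, -1):
        if u[:k] == u[-k:]:
            return k
    return 0


text = "abacaabacc"
pattern = "abacab"
i, j = 0, 5                  # alignment position and mismatch index from the example
u = pattern[:j]              # "abaca", the part of the pattern that matched
resume = longest_border(u)   # pattern index at which comparisons resume
shift = j - resume           # how far the pattern moves to the right
print(u, resume, shift)      # abaca 1 4
```

The pattern moves 4 positions to the right, and the next comparison is between T[5] and P[1], as in the alignment shown above.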

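KNUTH-MORRIS-PRATT (T, P)
Input: Strings T[0 . . n-1] and P[0 . . m-1]
Output: Starting index of the first substring of T matching P, or an
indication that P is not a substring of T
f ← compute failure function of pattern P
i ← 0
j ← 0
while i < length[T] do
    if P[j] = T[i] then
        if j = m - 1 then
            return i - m + 1 // we have a match
        i ← i + 1
        j ← j + 1
    else if j > 0
        j ← f(j - 1)
    else
        i ← i + 1
return "no substring of T matches P"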
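
A runnable version of the matcher might look like the sketch below. The names `kmp_search` and `failure_function` are assumptions made for the example, as are two conventions the pseudocode does not specify: returning -1 when there is no match, and returning 0 for an empty pattern.

```python
def failure_function(pattern: str) -> list[int]:
    """f[j] = length of the longest prefix of pattern that is a suffix of pattern[1..j]."""
    f = [0] * len(pattern)
    i, j = 1, 0
    while i < len(pattern):
        if pattern[j] == pattern[i]:
            f[i] = j + 1
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]
        else:
            i += 1
    return f


def kmp_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0  # assumed convention: the empty pattern matches at index 0
    f = failure_function(pattern)
    i = j = 0  # i indexes the text, j indexes the pattern
    while i < n:
        if pattern[j] == text[i]:
            if j == m - 1:
                return i - m + 1  # the whole pattern has been matched
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]  # reuse the part of the pattern already matched
        else:
            i += 1  # nothing matched; move to the next text character
    return -1


print(kmp_search("abacaabacc", "abacab"))        # -1
print(kmp_search("abacaabaccabacab", "abacab"))  # 10
```

Note that i never decreases, so no text character is revisited after it has been matched.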

Analysis
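The running time of the Knuth-Morris-Pratt algorithm is proportional to the
time needed to read the characters in the text and the pattern. Each iteration
of the matching loop either advances i or increases the shift i - j, and neither
can exceed n, so the loop runs at most 2n times; the same argument bounds the
failure function computation by 2m iterations. In other words, the worst-case
running time of the algorithm is O(m + n), and it requires O(m) extra space.
It is important to note that these quantities are independent of the size of the
underlying alphabet.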
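
The linear bound can be observed empirically by counting character comparisons. The sketch below is an illustration under stated assumptions: the function names, the choice of test input, and the decision to stop at the first match are all made up for the example. It compares KMP with the naive algorithm on an input that is unfavourable for the naive approach.

```python
def count_kmp_comparisons(text: str, pattern: str) -> tuple[int, int]:
    """Return (comparisons while building f, comparisons while searching)."""
    n, m = len(text), len(pattern)
    # Preprocessing phase: build the failure function, counting comparisons.
    f = [0] * m
    pre = 0
    i, j = 1, 0
    while i < m:
        pre += 1
        if pattern[j] == pattern[i]:
            f[i] = j + 1
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]
        else:
            i += 1
    # Search phase: stop at the first match, counting comparisons.
    search = 0
    i = j = 0
    while i < n and m > 0:
        search += 1
        if pattern[j] == text[i]:
            if j == m - 1:
                break
            i += 1
            j += 1
        elif j > 0:
            j = f[j - 1]
        else:
            i += 1
    return pre, search


def count_naive_comparisons(text: str, pattern: str) -> int:
    """Comparisons made by the naive algorithm, stopping at the first match."""
    n, m = len(text), len(pattern)
    count = 0
    for s in range(n - m + 1):
        k = 0
        while k < m:
            count += 1
            if text[s + k] != pattern[k]:
                break
            k += 1
        if k == m:
            break
    return count


text = "a" * 1000
pattern = "a" * 9 + "b"
pre, search = count_kmp_comparisons(text, pattern)
naive = count_naive_comparisons(text, pattern)
print(pre <= 2 * len(pattern), search <= 2 * len(text))  # True True
print(search, naive)
```

On this input the naive algorithm re-reads almost the whole pattern at every alignment, while the KMP counts stay within 2m and 2n.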
