Rabin-Karp and KMP Algorithms Explained

The document discusses three string matching algorithms: Rabin-Karp, Knuth-Morris-Pratt (KMP), and Automata-Based. It provides C implementations for each algorithm, explaining their key concepts, parameters, and time complexities. Each algorithm is designed to efficiently find occurrences of a pattern in a given text, with unique approaches to handle mismatches and optimize search performance.
NAME: MANIPRIYAA R

REG NO: 23BAI0214

1. Rabin-Karp Algorithm (C Implementation)


#include <stdio.h>
#include <string.h>

#define d 256   // size of the input alphabet

void rabinKarp(char pattern[], char text[], int q) {
    int M = strlen(pattern);
    int N = strlen(text);
    int p = 0, t = 0, h = 1;
    int i, j;

    // h = d^(M-1) % q, the weight of the leading character of a window
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    // Initial hash values of the pattern and the first text window
    for (i = 0; i < M; i++) {
        p = (d * p + pattern[i]) % q;
        t = (d * t + text[i]) % q;
    }

    // Slide the pattern over the text
    for (i = 0; i <= N - M; i++) {
        if (p == t) {
            // Hashes match: verify character by character
            for (j = 0; j < M; j++)
                if (text[i + j] != pattern[j])
                    break;
            if (j == M)
                printf("Pattern found at index %d\n", i);
        }
        // Roll the hash: drop text[i], append text[i + M]
        if (i < N - M) {
            t = (d * (t - text[i] * h) + text[i + M]) % q;
            if (t < 0)
                t += q;
        }
    }
}

int main() {
    char text[] = "ABCCDEABCDABCD";
    char pattern[] = "ABCD";
    int q = 101; // a prime number used as the modulus
    rabinKarp(pattern, text, q);
    return 0;
}

1. Understanding the Algorithm

 The Rabin-Karp algorithm is a pattern-searching algorithm that uses hashing to find occurrences of a pattern in a given text efficiently.

 It calculates the hash value of the pattern and compares it with the hash values of substrings of the text.

2. Defining Key Parameters

 The text and pattern lengths are calculated.

 Variables are initialized to store the hash values of the pattern and the current substring of the
text.

 h is the precomputed weight of the window's leading character, used when updating the rolling hash.

3. Precomputing the Hash Factor (h)

 A helper value h = (d^(M-1)) % q is computed.

 This helps in efficiently removing the first character and adding a new character while updating the rolling hash.
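 As a quick check with the sample values in main() (M = 4, d = 256, q = 101): h = 256^3 mod 101 = 5, i.e. 5 is the weight carried by the leading character of each window when it is removed.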

4. Calculating Initial Hash Values

 The hash values of the pattern and the first window of text (of the same length as the pattern)
are computed.

 This avoids comparing the pattern character-by-character initially.
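 For example, with d = 256 and q = 101, the pattern "ABCD" hashes to 22 while the first text window "ABCC" hashes to 21, so the very first window is rejected by a single integer comparison.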

5. Sliding the Pattern Over the Text

 The algorithm moves the pattern one character at a time over the text.
 If the hash values match, a character-by-character comparison is performed to confirm the
match.

6. Rolling Hash Technique

 Instead of recalculating the hash from scratch for each new substring, the hash is updated using
the rolling hash formula:

o Remove the first character's contribution.

o Add the new character at the end.

 This reduces redundant computations and makes the algorithm efficient.
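 The update can also be seen in isolation in the short sketch below: a minimal standalone program (not part of the implementation above, but reusing its d = 256, q = 101, and sample text) that prints the rolling hash of every window of length M.

#include <stdio.h>
#include <string.h>

int main(void) {
    const char *text = "ABCCDEABCDABCD";
    int d = 256, q = 101, M = 4;
    int N = strlen(text);
    int h = 1, t = 0, i;

    /* h = d^(M-1) mod q, the weight of the leading character */
    for (i = 0; i < M - 1; i++)
        h = (h * d) % q;

    /* Hash of the first window */
    for (i = 0; i < M; i++)
        t = (d * t + text[i]) % q;
    printf("hash of window at 0 = %d\n", t);

    /* Roll: drop text[i-1], append text[i+M-1] */
    for (i = 1; i <= N - M; i++) {
        t = (d * (t - text[i - 1] * h) + text[i + M - 1]) % q;
        if (t < 0)
            t += q;
        printf("hash of window at %d = %d\n", i, t);
    }
    return 0;
}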

7. Handling Negative Hash Values

 If the updated hash value becomes negative, q is added to bring it back into the range [0, q-1].

8. Output the Match Index

 If a pattern match is found, the index of occurrence is printed.

9. Main Function Execution

 A sample text ("ABCCDEABCDABCD") and pattern ("ABCD") are given.

 The Rabin-Karp function is called, and it finds all occurrences of "ABCD" in the text.
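 With this input the matches begin at indices 6 and 10, so the program should print "Pattern found at index 6" followed by "Pattern found at index 10".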

10. Time Complexity

 Best case: O(N) (when there are few hash collisions).

 Worst case: O(NM) (when many hash collisions occur, leading to character-by-character comparisons).

 Average case: O(N) (efficient due to hashing).

2. Knuth-Morris-Pratt (KMP) Algorithm


#include <stdio.h>
#include <string.h>

// Build the LPS array: lps[i] = length of the longest proper prefix of
// pattern[0..i] that is also a suffix of pattern[0..i]
void computeLPSArray(char* pattern, int M, int* lps) {
    int len = 0, i = 1;
    lps[0] = 0;
    while (i < M) {
        if (pattern[i] == pattern[len]) {
            len++;
            lps[i] = len;
            i++;
        } else {
            if (len != 0)
                len = lps[len - 1];   // fall back to the previous border
            else {
                lps[i] = 0;
                i++;
            }
        }
    }
}

void KMPSearch(char* pattern, char* text) {
    int M = strlen(pattern);
    int N = strlen(text);
    int lps[M];

    computeLPSArray(pattern, M, lps);

    int i = 0, j = 0;   // i indexes the text, j indexes the pattern
    while (i < N) {
        if (pattern[j] == text[i]) {
            i++;
            j++;
        }
        if (j == M) {
            printf("Pattern found at index %d\n", i - j);
            j = lps[j - 1];           // continue searching for further matches
        } else if (i < N && pattern[j] != text[i]) {
            if (j != 0)
                j = lps[j - 1];       // skip characters already known to match
            else
                i++;
        }
    }
}

int main() {
    char text[] = "ABABDABACDABABCABAB";
    char pattern[] = "ABABCABAB";
    KMPSearch(pattern, text);
    return 0;
}

1. Compute the LPS (Longest Prefix Suffix) Array

 The LPS array helps determine how much the pattern should be shifted when a mismatch occurs.

 It stores the length of the longest proper prefix which is also a suffix for each prefix of the
pattern.

 If a mismatch happens, the LPS array tells where to resume the comparison instead of restarting
from the beginning.

2. Constructing the LPS Array

 Start with the first character (lps[0] = 0).

 Compare the current character with the last matched prefix character:

o If they match, extend the LPS length and update lps[i].

o If they don’t match, reduce the prefix length using previous LPS values until a match is
found or reset to zero.
 Continue until the entire pattern is processed; a worked example for the sample pattern follows this list.
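For the sample pattern "ABABCABAB" used in main(), these steps produce the following LPS array:

index:    0  1  2  3  4  5  6  7  8
pattern:  A  B  A  B  C  A  B  A  B
lps[i]:   0  0  1  2  0  1  2  3  4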

3. Searching the Pattern in the Text

 Traverse the text and compare it with the pattern.

 If characters match, move both pointers forward.

 If a complete match occurs, print the index and use lps[j-1] to determine how much to shift.

 If a mismatch occurs:

o Use the LPS array to avoid unnecessary rechecking.

o Move the pattern according to lps[j-1], instead of shifting it back completely.
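 With the sample text "ABABDABACDABABCABAB" and pattern "ABABCABAB", this procedure reports exactly one match, printing "Pattern found at index 10".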

4. Time Complexity

 Preprocessing (LPS Calculation): O(M)

 Pattern Matching: O(N)

 Overall Complexity: O(N + M), making KMP much faster than naive string-matching algorithms.

5. Key Advantage

 Unlike brute force methods, KMP doesn’t backtrack in the text, making it highly efficient for
searching in large strings.

3. Automata-Based String Matching Algorithm


#include <stdio.h>
#include <string.h>

#define NO_OF_CHARS 256

// Build the transition function: TF[state][ch] = next state
void computeTF(char* pattern, int M, int TF[][NO_OF_CHARS]) {
    int i, lps = 0, x;

    // State 0: only the first pattern character advances the automaton
    for (x = 0; x < NO_OF_CHARS; x++)
        TF[0][x] = 0;
    TF[0][pattern[0]] = 1;

    for (i = 1; i < M; i++) {
        // Copy transitions from the longest prefix-suffix (fallback) state
        for (x = 0; x < NO_OF_CHARS; x++)
            TF[i][x] = TF[lps][x];
        // A matching character moves to the next state
        TF[i][pattern[i]] = i + 1;
        // Update the fallback state for the next iteration
        lps = TF[lps][pattern[i]];
    }
    // Final (accepting) state behaves like its fallback state
    for (x = 0; x < NO_OF_CHARS; x++)
        TF[M][x] = TF[lps][x];
}

void search(char* text, char* pattern) {
    int M = strlen(pattern);
    int N = strlen(text);
    int TF[M + 1][NO_OF_CHARS];

    computeTF(pattern, M, TF);

    int state = 0, i;
    for (i = 0; i < N; i++) {
        state = TF[state][text[i]];
        if (state == M)
            printf("Pattern found at index %d\n", i - M + 1);
    }
}

int main() {
    char text[] = "AABAACAADAABAABA";
    char pattern[] = "AABA";
    search(text, pattern);
    return 0;
}

1. Transition Function (TF) Table Construction

 The Transition Function (TF) table determines the next state based on the current state and the
incoming character.

 It ensures that when a mismatch occurs, the algorithm jumps to the longest prefix which is also a
suffix of the matched portion.

Steps for Building the TF Table:

1. Initialize the table: Set all transitions for state 0 to 0, except for the first character.

2. Iterate through the pattern:

o Copy transitions from the longest prefix suffix (LPS) state.

o Update the transition for the current character.

o Update the LPS state using previous transitions.

3. Final state transitions: ensure that the last row (state M) contains proper fallback transitions so that matching can continue efficiently after a full match; a worked table for the sample pattern follows this list.
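As a concrete illustration of these steps, the transition table produced for the sample pattern "AABA" is shown below (only the transitions for 'A' and 'B' are listed; every other character sends the automaton back to state 0):

state | on 'A' | on 'B' | any other character
  0   |   1    |   0    |   0
  1   |   2    |   0    |   0
  2   |   2    |   3    |   0
  3   |   4    |   0    |   0
  4   |   2    |   0    |   0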

2. Pattern Matching Process

 Start from state 0.

 For each character in the text:

o Move to the next state using the TF table.

o If the final state (equal to pattern length) is reached, a match is found.
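For the sample text "AABAACAADAABAABA" and pattern "AABA", the state after reading each character is 1 2 3 4 2 0 1 2 0 1 2 3 4 2 3 4; state 4 (the final state) is reached at text positions 3, 12, and 15, which corresponds to matches at indices 0, 9, and 12.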

3. Time Complexity

 Preprocessing (TF Table Construction): O(M × NO_OF_CHARS), where M is the pattern length.

 Pattern Matching: O(N), where N is the text length.

 Overall Complexity: O(M × NO_OF_CHARS + N).

4. Key Advantages
 Efficient searching: Uses precomputed shifts, avoiding unnecessary comparisons.

 No backtracking: unlike brute-force methods, the finite automaton never re-reads a text character; it simply follows precomputed state transitions.

5. Example Execution

 Text: "AABAACAADAABAABA"

 Pattern: "AABA"

 The algorithm identifies matches at indices 0, 9, and 12 in a single left-to-right pass over the text.
