0% found this document useful (0 votes)
6 views7 pages

Module 6

Uploaded by

pathakpunit720
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views7 pages

Module 6

Uploaded by

pathakpunit720
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Module 6: String Matching Algorithm

String Matching Introduction


 String Matching Algorithm is also called "String Searching Algorithm."
 This is a vital class of string algorithm is declared as "this is the method to find a place
where one is several strings are found within the larger string."
 A string-searching algorithm, sometimes called string-matching algorithm, is
an algorithm that searches a body of text for portions that match by pattern.
 Algorithms used for String Matching:
1. The Naive String Matching Algorithm
2. The Rabin-Karp-Algorithm
3. The Knuth-Morris-Pratt Algorithm

The Naïve String-Matching Algorithm


 Naive pattern searching is the simplest method among other pattern searching
algorithms. Although, it is more efficient than the brute force approach
 it also checks for all characters of the main string in order to find the pattern. Hence, its
time complexity is O(m*n) where the 'm' is the size of pattern and 'n' is the size of the
main string. This algorithm is helpful for smaller texts only.
 It compares first character of pattern with searchable text. If Match Found, pointer in
the both string is advanced.
 If Matched is not found, pointer of the text is incremented & pointer of pattern is reset.
 Algorithm:

Step 1: There will be two loops: One loop for input pattern and the second loop for text
string.
Step 2: Now, At the time of searching algorithm in a window, there will be two cases:
o Case 1: If a match is found, we will match the entire pattern with the current
window of the text string. And if found the pattern string is found in the current
window after traversing, print the result. Else traverse in the next window.
o Case 2: If a match is not found, we will break from the loop (using the 'break'
keyword), and the j pointer of the inner loop will move one index more and start
the search algorithm in the next window.
Step 3: This process is repeated till the end of text.

 Time Complexity:
o Worst case: O((n - m + 1) * m)
o Best case: O(n) (when mismatch happens quickly)
 Example:
Rabin-Karp Algorithm
 The Rabin-Karp algorithm is a string matching algorithm that efficiently finds patterns
within a larger text using a technique called hashing.
 The basic idea is to convert the pattern and each possible substring of the text into
numeric values (hashes) and then compare these values rather than the strings
themselves. This allows for faster comparisons, especially when dealing with large texts.
 The Rabin-Karp algorithm for string matching is useful because it can quickly find patterns
in large texts. It’s especially good when you need to search for multiple patterns at once
or when the text is very long.
 Algorithm:
Step 1: Initially calculate the hash value of the pattern.
Step 2: Start iterating from the starting of the string:
o Calculate the hash value of the current substring having length m.
o If the hash value of the current substring and the pattern are same check if the
substring is same as the pattern.
o If they are same, store the starting index as a valid answer. Otherwise,
continue for the next substrings.
Step 3: Return the starting indices as the required answer.
 Time Complexity:
o Average case: O(n + m)
o Worst case: O(nm) (due to hash collisions)
 Example:
Knuth-Morris-Pratt (KMP) Algorithm
 To avoid such redundancy, a linear sequence-matching algorithm named the KMP pattern
matching algorithm. It is also referred to as Knuth Morris Pratt pattern matching
algorithm.
 The KMP algorithm starts the search operation from left to right. It uses the prefix
function to avoid unnecessary comparisons while searching for the pattern.
 The KMP algorithm starts the search operation from left to right. It uses the prefix
function to avoid unnecessary comparisons while searching for the pattern.
 This function stores the number of characters matched so far which is known as LPS
value.
 Algorithm:
Step 1: Define a prefix function LPS (Longest Prefix Suffix) array.
Step 2: Slide the pattern over the text for comparison.
Step 3: If all the character’s match, we have found a match.
Step 4: If not, use the prefix function LPS Array to skip the unnecessary comparisons.
 Time Complexity:
 Preprocessing: O(m)
 Search: O(n)
 Overall: O(n + m)
 Example:
Comparison Summary
Algorithm Time Complexity Space Best For

Naïve O((n-m+1)*m) O(1) Small texts

Rabin-Karp Avg: O(n + m) O(1) Hash-based searching

Efficient string matching with


Knuth-Morris-Pratt O(n + m) O(m)
preprocessing

Summary
 Choose Naïve for simplicity and small data
 Use Rabin-Karp for large text and multiple patterns
 Use KMP for efficiency with a single long pattern and repetitive text

You might also like