0% found this document useful (0 votes)
17 views20 pages

Data-Structures-Lecture 3

The document discusses string processing and pattern matching algorithms. It introduces strings, how they are stored, and the pattern matching problem of determining if a pattern occurs within a text string. It then describes a first algorithm that compares all substring combinations and has complexity of O(|P| * (|T| - |P| + 1)) in the worst case. Finally, it introduces a fast pattern matching algorithm that uses a table derived from the pattern to efficiently determine matches.

Uploaded by

Zakir Hossen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views20 pages

Data-Structures-Lecture 3

The document discusses string processing and pattern matching algorithms. It introduces strings, how they are stored, and the pattern matching problem of determining if a pattern occurs within a text string. It then describes a first algorithm that compares all substring combinations and has complexity of O(|P| * (|T| - |P| + 1)) in the worst case. Finally, it introduces a fast pattern matching algorithm that uses a table derived from the pattern to efficiently determine matches.

Uploaded by

Zakir Hossen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd

Data Structures

Class Lecture-3

Mohammad Ashraful Islam


Lecturer
Department of CSE, JU
08/15/20 1
String Processing
String is a finite sequence S of zero or more
character.
The String with zero character is called empty
string.
Alphabet: a, b, c, d ... x, y, z
Digits: 0 to 9
Special character: +,-,/,(), $, =

08/15/20 2
String Processing
String are stored in three types of structure:
• Fix length structure
• Variable length structure
• Linked structure

08/15/20 3
Pattern Matching Problem
You are given a string text T and string pattern P. You
have to determine whether P occurs or appears in T
or not. This problem is known as pattern matching
algorithm.
For example:
o Let T =“abcdef” and P = “bcd”. Here P occurs in T.
o Let T =“abcdef” and P = “abd”. Here P does not
occurs in T.

08/15/20 4
First Pattern Matching Algorithm
• T = GCTACACATC
• P = ACAT
• T = T[1]T[2]T[3] T[4] T[5] T[6] T[7] T[8] T[9]T[10]
• P = P[1]P[2]P[3]P[4]
• So, T has 10-4+1 substrings of length 4.

08/15/20 5
First Pattern Matching Algorithm
• These 10-4+1 =7 substrings of length 4 are,
• W1 = T[1][2][3][4]
• W2 = T [2][3][4][5]
• W3 = T[3][4][5][6]
• W4 = T[4][5][6][7]
• W5 = T[5][6][7] [8]
• W6 = T[6][7] [8] [9]
• W7 = T[7] [8] [9][10]
08/15/20 6
First Pattern Matching Algorithm
• More formally, substring of T starts at kth
position : WK = T[K][K+1][K+2][K+3]

• In first pattern matching algorithm, for each K =


1 to 7, we will compare WK with Pattern P. If any
of the substring matches exactly with the
pattern P, then we can say Text T contains
pattern P.

08/15/20 7
First Pattern Matching Algorithm

08/15/20 8
First Pattern Matching Algorithm

08/15/20 9
First Pattern Matching Algorithm

08/15/20 10
Complexity
• The complexity of this pattern matching
algorithm is the number of comparison
between characters in pattern Pand
characters in text T.
In the given example, C = 1+1+1+4+1+4 =12

08/15/20 11
Complexity of First Pattern
Matching Algorithm
• Let, P = aaaab
• T = aaaaaaaaaaa
• |P| = 5, |T|= 11
• So, W1 = W2 = W3 = W4 = W5= W6=W7 = aaaaa
• C = 5+5+5+5+5+5+5 = 35.
• So, in worse case, C = |P| × (|T| - |P| + 1)

08/15/20 12
Fast Pattern Matching Algorithm

In second pattern matching algorithm we use a table


which is derived from P.
P = aaba
Q0 = “” (Empty string)
Q1= “a”
Q2= “aa”
Q3= “aab”
Q4= “aaba”
08/15/20 13
Fast Pattern Matching Algorithm
Initially, we have an empty string Q0.
Character ‘a’ after Q0 gives us Q1 =“a”
Character ‘a’ after Q1 gives us Q2 =“aa”
Character ‘b’ after Q2 gives us Q3 =“aab”
Character ‘a’ after Q3 gives us Q4 =“aaba” = P

08/15/20 14
Fast Pattern Matching Algorithm
Initially, we have an empty string Q0.
Character ‘a’ after Q0 gives us Q1 =“a”
But, what will happen if we get ‘b’ after Q0 ?
Answer : ‘b’ after Q0 means string, Str = “b”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.

08/15/20 15
Fast Pattern Matching Algorithm
What will happen if we get ‘b’ after Q 1 ?
Answer : ‘b’ after Q1 means string, Str = “ab”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “b” , “ab”
The prefix of P are , “a” , “aa”

08/15/20 16
Fast Pattern Matching Algorithm
What will happen if we get ‘a’ after Q2 ?
Answer : ‘a’ after Q2 means string, Str = “aaa”.
The maximum length Suffix of Str that matches with
Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “a” , “aa” , “aaa”
The prefix of P are , “a” , “aa” , “aab”
So, Maximum length that matches is 2.

08/15/20 17
Fast Pattern Matching Algorithm
What will happen if we get ‘b’ after Q 3 ?
Answer : ‘b’ after Q3 means string, Str = “aabb”.
The maximum length Suffix of Str that matches
with Prefix of P is 0. So, it takes us to Q0.
The Suffix of Str are , “b” , “bb” , “abb”, “aabb”
The prefix of P are , “a” , “aa” , “aab” , “aaba”.

08/15/20 18
Pattern Matching Table
a b

Q0 Q1 Q0

Q1 Q2 Q0

Q2 Q2 Q3

Q3 Q4 = P Q0

08/15/20 19
Thank You

20

You might also like