MODULE 3
Input Enhancement in
String Matching :
HORSPOOL’S ALGORITHM
Horspool’s Algorithm
It is a pattern matching algorithm used to find the first occurrence of a
particular pattern string in the main string.
Whenever there is no match found, the pattern string is shifted by more than
one position but how many positions; that is decided by the SHIFT TABLE
Horspool’s Algorithm
Shift Table Algorithm Pattern Matching Algorithm
Shift Table Algorithm
EXAMPLE :
PATTERN :
INDIA
MAIN STRING :
I LOVE MY COUNTRY INDIA
Shift Table
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Uppercase Letters – 26 letters [ A – Z ]
Space character – 1
Total Size of the Shift Table - 27
Shift Table
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
for i = 0 to size – 1
i.e. from i = 0 to 26
Table[ i ] = m
Where m is the length of the pattern string
m=5
Therefore Table [ i ] = 5
Shift Table
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
for j = 0 to m - 2 j = 0 -> Table [ P [ 0 ] ] = Table [ I ] = 5 – 1 – 0 = 4
i.e. from j = 0 to 3
j = 1 -> Table [ P [ 1 ] ] = Table [ N ] = 5 – 1 – 1 = 3
Table [ P [ j ] ] = m – 1 - j j = 2 -> Table [ P [ 2 ] ] = Table [ D ] = 5 – 1 – 2 = 2
j = 3 -> Table [ P [ 3 ] ] = Table [ I ] = 5 – 1 – 3 = 1
Shift Table
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
2 4 3
1
Horspool’s Algorithm
Size of the Main String = 23
Example : I LOVE MY COUNTRY INDIA
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
I L O V E M Y C O U N T R Y I N D I A
I N D I A
I N D I A
I N D I A
I N D I A
I N D I A
Size of the Main String = 23
Example : I LOVE MY COUNTRY INDIA
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
I L O V E M Y C O U N T R Y I N D I A
I N D I A
I N D I A
I N D I A
I N D I A
I N D I A
1.Call the Shift_Table ( ) function
2.i = m - 1 where m is the length of the pattern string
Here, i = 5 – 1 = 4
i=4
3.while i < = n – 1
k 0
i < = n – 1 where n is the length of the Main String n = 23
i < = 22
4 < = 22 conditions satisfies
therefore k = 0
4. while k < = m – 1 and P [ m – 1 – k ] = T [ i – k ]
k=0;m=5;i=4
P [ ] is the pattern array and T is the main string i.e. the Text array
0 < = 5 - 1 and P [ 5 – 1 – 0 ] = T [ 4 – 0 ]
0 < = 4 and P [ 4 ] = T [ 4 ]
check if A = V condition fails ( out of while loop )
if k = m --> condition fails
else i i + Table [ T [ i ] ]
i = 4 + Table [ V ] = 4 + 5 = 9
Therefore i = 9
0 < = 5 - 1 and P [ 5 – 1 – 0 ] = T [ 9 – 0 ] 0 < = 5 - 1 and P [ 5 – 1 – 0 ] = T [ 14 – 0 ]
0 < = 4 and P [ 4 ] = T [ 9 ] 0 < = 4 and P [ 4 ] = T [ 14 ]
check if A = space condition fails check if A = T condition fails
if k = m --> condition fails if k = m --> condition fails
else i i + Table [ T [ i ] ] else i i + Table [ T [ i ] ]
i = 4 + Table [ space ] = 9 + 5 = 14 i = 4 + Table [ T ] = 14 + 5 = 19
Therefore i = 14 Therefore i = 19
0 < = 5 - 1 and P [ 5 – 1 – 0 ] = T [ 19 – 0 ]
0 < = 4 and P [ 4 ] = T [ 19 ]
check if A = N condition fails
if k = m --> condition fails
else i i + Table [ T [ i ] ]
i = 4 + Table [ N ] = 19 + 3 = 22
Therefore i = 22
For k = 0 => 0 < = 4 and P [ 4 ] = T [ 22 ] ; check if A = A - TRUE
For k = 1 => 1 < = 4 and P [ 3 ] = T [ 21 ] ; check if I = I - TRUE
For k = 2 => 2 < = 4 and P [ 2 ] = T [ 20 ] ; check if D = D - TRUE
For k = 3 => 3 < = 4 and P [ 1 ] = T [ 19 ] ; check if N = N - TRUE
For k = 4 => 4 < = 4 and P [ 0 ] = T [ 18 ] ; check if I = I - TRUE
For k = 5 => 5 < = 4
if k = m therefore k = 5 and m = 5 condition satisfies
returns i – m + 1
22 – 5 + 1 = 18
Index of the left end of the first matching sub string is returned i.e. 18 is returned
Previous Year’s Questions
Question:
Write Horspool’s Algorithm. Apply Horspool Algorithm to search for the pattern BAOBAB in
the text BESS_KNEW_ABOUT_BAOBABA
2008-2009 [10 marks]
THANK YOU
!!