0% found this document useful (0 votes)
8 views2 pages

IR Assignment7

The document presents an assignment on constructing and searching using a suffix array in the context of Information Retrieval. It includes a theoretical explanation of suffix arrays, a Python program for generating and searching them, and a user interface built with Tkinter. The program allows users to input a string and a pattern to find the pattern's index in the suffix array.

Uploaded by

vinayostwal707
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views2 pages

IR Assignment7

The document presents an assignment on constructing and searching using a suffix array in the context of Information Retrieval. It includes a theoretical explanation of suffix arrays, a Python program for generating and searching them, and a user interface built with Tkinter. The program allows users to input a string and a pattern to find the pattern's index in the suffix array.

Uploaded by

vinayostwal707
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

WALCHAND INSTITUTE OF TECHNOLOGY, SOLAPUR

INFORMATION TECHNOLOGY
2021-22 SEMESTER - II
ASSIGNMENT - 7
Subject: Information Retrieval

Name: Ayush Pande Roll no: 74 Class: Final Year Btech IT

Title: Construction of Suffix Array & Searching using Suffix Array

Theory:
A suffix array will contain integers that represent the starting indexes of the all
the suffixes of a given string, after the aforementioned suffixes are sorted. We
first sort all suffixes according to first character, then according to first 2
characters, then first 4 characters and so on while the number of characters to be
considered is smaller than 2n. The important point is, if we have sorted suffixes
according to first 2i characters, then we can sort suffixes according to first 2i+1
character in O(nLogn) time using a nLogn sorting algorithm like Merge Sort.

Program Code:
#7
def find():
text = variable1.get()
suffix = [text[i:] for i in range(len(text))]
Sortedsuffix = sorted([text[i:] for i in range(len(text))])
print("Suffixarray:", Sortedsuffix)
SuffixArray = [suffix.index(ss) for ss in Sortedsuffix]
print(SuffixArray)
return [SuffixArray, Sortedsuffix]
def bisect_left(lo=0, hi=None):
m=find()
a=m[0]
text=variable1.get()
x=variable2.get()
if lo < 0:
raise ValueError('lo must be non negative')
if hi is None:
hi = len(a)
#Binary search
while lo < hi:
mid = (lo+hi)//2
if text[a[mid]:] == x:
return a[mid]
elif text[a[mid]:] < x:
lo = mid+1
else:
hi = mid
if not text[a[lo]:].startswith(x):
raise IndexError('not found')
listNodes = Listbox(root, width=80,height=15)
listNodes.grid(column=0, row=12)
listNodes.insert(END,"\nSuffixArray: {}".format(m[0]))
listNodes.insert(END, "\nSuffixIndexArray: {}".format(m[1]))
listNodes.insert(END, "\n\nPatter Found In Suffix Array at Index : {}".format(a[lo]))
print(a[lo])
from tkinter import *
root = Tk()
root.geometry("500x550")
root.title('Suffix Array and Searching using Suffix Array')
var = IntVar()
variable1=StringVar() # Value saved here
variable2=StringVar() # Value saved here
Label(root, text = "Suffix Array and Searching using Suffix Array: ").grid(row=4, sticky=W)
Label(root, text = "Enter String: ").grid(row=6, sticky=W)
Entry(root, width=30, textvariable=variable1).place(y=25, x=130)
Label(root, text = "Enter Pattern: ").grid(row=8, sticky=W)
Entry(root, width=30, textvariable=variable2).place(y=45, x=130)
Button(root, text = "Show Results", command=bisect_left).grid(row=10, sticky=W)
root.mainloop()

Screenshots/Output:

You might also like