UNIT-2
INTRODUCTION TO STRING MATCHING
• Find pattern occurrences within a text.
• Fundamental problem with many applications.
• Components of String Matching:
• Pattern: The string to search for. Example: "abc"
• Text: The larger string where the pattern is searched.
• Example: "abcabcabc"
TYPES OF STRING MATCHING:
• Exact Matching:
• The goal is to find all exact occurrences of the pattern in the text.
• Example:Text: "abababa"
• Pattern: "aba"
• Output: Positions 0, 2, and 4 (0-based index).
• Approximate Matching:
• The pattern can match the text approximately, allowing mismatches or edits.
• Example: DNA sequence alignment, where "AGCT" may match "AGTT" with one
substitution.
APPLICATIONS OF STRING MATCHING
CHALLENGES IN STRING MATCHING:
• Efficiency: For long texts or patterns, naive approaches are
too slow.
• Variability: In approximate matching, handling mismatches
increases complexity.
COMPUTATIONAL GEOMETRY
• Introduction to Computational Geometry
• Definition: Algorithms to solve geometric problems involving points, lines,
polygons, etc.
• Applications:
• Computer graphics
• Robotics and pathfinding
• Geographic Information Systems (GIS)
• Game development
KEY CHARACTERISTICS
• Focuses on the design, analysis, and implementation of
geometric algorithms.
• Deals with problems in two or more dimensions.
APPLICATIONS
SEQUENTIAL SEARCH (LINEAR
SEARCH)
• Sequentially check elements in list.
• Find desired key or traverse all.
• Simple algorithm for searching arrays.
• Works on sorted or unsorted data.
• Compares each element with key.
• Stops if key is found [Link] until key or end.
• Inefficient for large data sizes.
STEPS OF SEQUENTIAL SEARCH
• Start from the first element in the list.
• Compare the key with the current element.
• If a match is found, return the index of the element.
• If no match is found, move to the next element.
• Repeat until the element is found or the list ends.
PSEUDOCODE
• SequentialSearch(array, key):
• for i = 0 to [Link] - 1:
• if array[i] == key:
• return i # Return index of key
• return -1 # Key not found
• Advantages:
• Simple to implement.
• Works on both sorted and unsorted data.
• No additional memory required.
• Disadvantages:
• Inefficient for large datasets.
• Requires linear traversal, making it slower compared to other
search algorithms like binary search.
APPLICATIONS:
• Small datasets where simplicity is preferred.
• Searching in unsorted or unordered lists.
• Scenarios where preprocessing (like sorting) is not feasible.
BRUTE-FORCE STRING MATCHING
• Simplest method for string matching.
• Finds a pattern within a text.
• Checks every position for a match.
• Compares pattern characters with text.
• Works without preprocessing text or pattern.
• Inefficient for large text or patterns.
STEPS
[Link] the pattern with the start of the text.
[Link] characters of the pattern with the corresponding
characters of the text.
[Link] all characters match, record the starting position.
[Link] the pattern by one position and repeat until the end of
the text.
PSEUDOCODE
• BruteForceStringMatch(text, pattern):
• n = length of text
• m = length of pattern
• for i = 0 to n - m:
• match = True
• for j = 0 to m - 1:
• if text[i + j] != pattern[j]:
• match = False
• break
• if match:
• return i # Pattern found at index i
Advantages:
• Simple and easy to implement.
• Does not require preprocessing of the text or pattern.
Disadvantages:
• Inefficient for large texts and patterns.
• High time complexity compared to optimized algorithms like
KMP or Boyer-Moore.
• Example:
• Text: "abcabcabc“
• Pattern: "abc“
• Result: Pattern found at positions 0, 3, and 6.
APPLICATION
CLOSEST-PAIR PROBLEM
• - Find two closest points in set.
• - Typically works in a 2D plane.
• - A classic problem in geometry.
• - Applications include clustering and analysis.
• - Used in pattern recognition and more.
CLOSEST-PAIR PROBLEM
• Find the two closest points in a set of n points (in the two
dimensional Cartesian plane).
• Brute-force algorithm
• Compute the distance between every pair of distinct points
and return the indexes of the points for which the distance is
the smallest.
• Standard Euclidean distance:
ALGORITHM
CONVEX HULL
• Convex hull is the smallest convex polygon that contain all the points
in the plane.
• Given a set of X of point Pi(Xi,Yi)……..Pn(Xn,Yn) in plane find the
convex hull.
• Divide and conquer approach take O(nlogn) time to compute the
convex hull in clock wise order.
ALGORITHM
• Step1 Partition X into half X1 and X2 acc to X coordinates.
• Step2 If X1 is empty then upper hull is simple line with end
point p1 and pn
• Step3 If x1 is not empty then find Pmax in x1 which is fastest
from lines p1,Pn.
• Step 4If Pmax tie then point the maximum angel Pmax P1 Pn.
• Step 5 Now identify all the point of X1 that are left of line
P1Pmax
• goto step 2
EXAMPLE CONVEX HULL