Design & Analysis of Algorithms Exam
B. Tech.
(SEM V) THEORY EXAMINATION 2022-23
DESIGN & ANALYSIS OF ALGORITHM
Solution
Q1.
Sol. The development of an algorithm is a process that involves several basic steps.
These steps are as follows:
1. Problem definition: state precisely what the algorithm must accomplish, including its inputs and expected outputs.
2. Design: develop the logic of the solution, for example as a flowchart or pseudocode, choosing a suitable design technique (divide and conquer, greedy, dynamic programming, etc.).
3. Analysis: determine the time and space requirements of the design.
4. Implementation: translate the design into a program in a chosen programming language.
5. Testing and verification: run the implementation on representative inputs and correct any errors.
b) Explain and compare best and worst time complexity of Quick Sort.
The best-case time complexity of Quick Sort occurs when the pivot element divides
the input array into two roughly equal parts. The best-case time complexity is O(n log
n), where n is the number of elements in the array.
The worst-case time complexity of Quick Sort occurs when the pivot always ends up as
the smallest or largest element, producing maximally unbalanced partitions; with a
first- or last-element pivot this happens when the input array is already sorted or
reverse sorted. The worst-case time complexity is O(n^2), where n is the number of
elements in the array.
The average-case time complexity of Quick Sort is O(n log n), which is faster than
many other sorting algorithms.
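To make the two cases concrete, here is a minimal Python sketch of Quick Sort with a last-element (Lomuto) pivot; the function names and pivot choice are illustrative, not prescribed by the question:

def quick_sort(arr, low, high):
    # Sorts arr[low..high] in place using the last element as pivot.
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)    # sort the left part
        quick_sort(arr, p + 1, high)   # sort the right part

def partition(arr, low, high):
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

# Best case: every partition splits the array evenly -> O(n log n).
# Worst case: the array is already sorted, so the pivot is always
# the maximum and one side is empty -> O(n^2).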
Sol. A skip list is a data structure that provides a way to perform fast searches,
insertions, and deletions in a sorted sequence of elements. It is similar to a linked list,
but with additional links that allow for faster search times. The basic idea behind skip
lists is to maintain multiple linked lists with different "skip distances", allowing for
faster searches.
A skip list is a series of linked lists that are layered on top of one another, with the
bottom layer containing all the elements in sorted order. Each higher layer has fewer
elements than the one below it, and the elements in each layer are chosen randomly
from the layer below it. The topmost layer contains only two elements, a negative
infinity element and a positive infinity element.
The operations of skip list are as follows:
1. Search.
2. Insertion
3. Deletion
The expected time complexity of the search operation in a skip list is O(log n), where
n is the number of elements in the skip list; insertion and deletion also take expected
O(log n) time.
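A minimal Python sketch of the search operation may help. It assumes the common textbook layout in which each node stores a key and a forward array of next-pointers, one per level, and the list starts at a sentinel head node; these names are illustrative:

class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * (level + 1)   # one next-pointer per level

def skip_list_search(head, key, max_level):
    # head is a sentinel (conceptually -infinity) with pointers on every level.
    x = head
    for level in range(max_level, -1, -1):    # top level down to level 0
        while x.forward[level] is not None and x.forward[level].key < key:
            x = x.forward[level]              # move right while keys are smaller
    x = x.forward[0]                          # candidate on the bottom level
    return x is not None and x.key == key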
Sol. The principle of optimality states that an optimal solution to a problem contains
within it optimal solutions to its subproblems. For example, in the classic dynamic
programming problem of finding the shortest path between two nodes in a graph, the
principle of optimality can be applied to break the problem down into subproblems of
finding the shortest path between intermediate pairs of nodes. By solving each
subproblem optimally, the shortest path between any two nodes in the graph can be
obtained by combining the shortest paths through intermediate nodes.
Difference between Backtracking and Branch-and-Bound:
Traversal: Backtracking traverses the state space tree by DFS (depth-first search); Branch-and-Bound may traverse the tree in any manner, DFS or BFS.
Solve: Backtracking is typically used for decision and enumeration problems (chess, sudoku, etc.); Branch-and-Bound is typically used for optimization problems.
Nodes: In backtracking, nodes of the state space tree are explored in depth-first order; in Branch-and-Bound, nodes may be explored in depth-first or breadth-first order.
Next move: In backtracking, the next move from the current state can lead to a bad choice that must be undone; in Branch-and-Bound, the next move is always towards the most promising node, guided by bounds.
Solution: Backtracking stops as soon as a solution is found in the state space tree; Branch-and-Bound searches the entire state space tree in order to find the optimal solution.
Example:
(Refer to the state space tree figure.) Here, green is the start point, blue marks intermediate points,
red marks points with no feasible solution, and dark green is the final solution.
When the algorithm reaches an end point, it checks whether that point is a solution; if it is, the
solution is returned, otherwise the algorithm backtracks to the point one step behind and follows
the track to the next point in search of a solution.
Sol. A problem is in the class NPC (NP-complete) if it is in NP and is as hard as any problem in
NP. A problem is NP-hard if every problem in NP is polynomial-time reducible to it, even though
it may not be in NP itself.
If a polynomial time algorithm exists for any of these problems, all problems in NP would be
polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-
completeness is important for both theoretical and practical reasons.
Definition of NP-Completeness: a language B is NP-complete if it satisfies two conditions:
• B is in NP
• Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the language B is
known as NP-hard. Informally, a search problem B is NP-hard if there exists some NP-
complete problem A that Turing reduces to B.
Problems that are NP-hard cannot be solved in polynomial time unless P = NP. If a problem is
proved to be NPC, there is no need to waste time trying to find an efficient exact algorithm for
it. Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-complete problems, for which no polynomial-time algorithm is known:
boolean satisfiability (SAT), 3-SAT, clique, vertex cover, Hamiltonian cycle, subset sum, and
the decision version of the travelling salesman problem.
NP-Hard Problems
Examples of NP-hard problems include the halting problem and the optimization version of the
travelling salesman problem.
Sol. An algorithm that uses random numbers to decide what to do next anywhere in its logic
is called a randomized algorithm. For example, in randomized Quick Sort, we use a
random number to pick the next pivot (or we randomly shuffle the array).
How to analyse randomized algorithms?
Some randomized algorithms have deterministic time complexity but may produce an
incorrect answer with some small probability. For example, the standard implementation of
Karger's min-cut algorithm runs in deterministic O(E) time, but each run finds the minimum
cut only with a certain probability. Such algorithms are called Monte Carlo algorithms and
are easier to analyse for the worst case.
On the other hand, some randomized algorithms always produce a correct result, but their
running time depends on the values of the random choices. Such algorithms are called Las
Vegas algorithms (randomized Quick Sort is one). These algorithms are typically analysed
for expected running time: all possible values of the random variable are considered, the
time taken for each value is evaluated, and the average of these times gives the expected
worst-case time complexity.
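As a concrete Las Vegas example, here is a minimal Python sketch of randomized Quick Sort: the output is always correctly sorted, and only the running time depends on the random pivot choices (the function name is illustrative):

import random

def randomized_quick_sort(arr):
    # Always returns a correctly sorted list; only the running time is random.
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)               # random pivot: expected O(n log n)
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    return randomized_quick_sort(less) + equal + randomized_quick_sort(greater)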
Q 2.
a) Explain Merge sort algorithm and sort the following sequence {23, 11, 5,
15, 68, 31, 4, 17} using merge sort.
Sol. Merge sort is a divide-and-conquer algorithm that sorts an array by recursively dividing
it into two halves, sorting the two halves separately, and then merging the sorted halves to
produce a single sorted array.
To sort the sequence {23, 11, 5, 15, 68, 31, 4, 17} using Merge sort, we can follow these
steps:
1. Divide the input sequence into two halves: {23, 11, 5, 15} and {68, 31, 4, 17}.
2. Recursively sort each half of the sequence by applying the merge sort algorithm to
each half.
o Sort {23, 11, 5, 15}:
▪ Divide into {23, 11} and {5, 15}.
▪ Recursively sort each half: {11, 23} and {5, 15}.
▪ Merge the two sorted halves: {5, 11, 15, 23}.
o Sort {68, 31, 4, 17}:
▪ Divide into {68, 31} and {4, 17}.
▪ Recursively sort each half: {31, 68} and {4, 17}.
▪ Merge the two sorted halves: {4, 17, 31, 68}.
3. Merge the two sorted halves {5, 11, 15, 23} and {4, 17, 31, 68} to produce the sorted
sequence {4, 5, 11, 15, 17, 23, 31, 68}.
Therefore, the sorted sequence using Merge sort is {4, 5, 11, 15, 17, 23, 31, 68}.
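As a concrete illustration of the procedure above, here is a minimal Python sketch (the function names are illustrative):

def merge_sort(arr):
    # Divide, recursively sort both halves, then merge.
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    return merge(left, right)

def merge(left, right):
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])    # append whatever remains
    merged.extend(right[j:])
    return merged

# merge_sort([23, 11, 5, 15, 68, 31, 4, 17]) -> [4, 5, 11, 15, 17, 23, 31, 68]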
Sol. Binomial heap and Fibonacci heap are two types of priority queues, which are used to
efficiently implement operations such as insert, delete, and extract-min. While both data
structures have similar performance characteristics in most cases, there are some differences
between them.
1. Structure: A binomial heap is a collection of binomial trees, where each binomial tree
follows the heap property, i.e., the key of each node is greater than or equal to the key
of its parent. In contrast, a Fibonacci heap is a collection of trees that are not
necessarily binomial trees, but still follow the heap property.
2. Merge operation: In binomial heap, merging two heaps of the same order involves
merging the two corresponding trees by making one tree the child of the other. In
Fibonacci heap, merging two heaps involves simply concatenating the two root lists,
which is a constant time operation.
3. Potential function: the amortized analysis of a Fibonacci heap uses the potential
function t(H) + 2m(H), where t(H) is the number of trees in the root list and m(H) is
the number of marked nodes; the amortized analysis of a binomial heap can use just
the number of trees in the heap.
4. Time complexity: in terms of amortized time, most operations are faster in a
Fibonacci heap. Insert and decrease-key take O(1) amortized time in a Fibonacci
heap, whereas in a binomial heap insert takes O(log n) in the worst case; extract-min
takes O(log n) in both (amortized in the Fibonacci heap, worst case in the binomial
heap).
5. Space complexity: in terms of space, the binomial heap is somewhat more space-
efficient, since its trees have a strict binomial structure, whereas Fibonacci heap
nodes carry extra bookkeeping (mark bits and more flexible tree shapes).
6. Decrease-key operation: the decrease-key operation is more efficient in a Fibonacci
heap, where it takes O(1) amortized time, than in a binomial heap, where it can take
O(log n) time in the worst case.
In summary, both binomial heap and Fibonacci heap are efficient data structures for
implementing priority queues. While Fibonacci heap has better worst-case time complexity
and more efficient decrease-key operation, binomial heap is more space-efficient and can be
more predictable in terms of the number of trees.
We can use Kruskal's algorithm to find the minimum spanning tree. Here is the step-by-step
process:
1. Sort all edges in non-decreasing order of weight.
2. Pick the smallest remaining edge; if it does not form a cycle with the edges already
chosen (checked with a disjoint-set structure), add it to the spanning tree, otherwise
discard it.
3. Repeat step 2 until the tree contains V − 1 edges.
Using this algorithm, we can find the minimum spanning tree for the given graph. The edges
in the minimum spanning tree are: (A, B), (B, D), (A, C).
Kruskal's algorithm is a greedy algorithm for finding the minimum spanning tree of a
connected undirected graph. It works by maintaining a forest of trees, where each tree
initially contains a single vertex. The algorithm repeatedly adds the next cheapest edge that
connects two trees in the forest, until all vertices are in a single tree.
Kruskal's algorithm is efficient, with a time complexity of O(E log E), where E is the number
of edges in the graph.
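A short Python sketch of Kruskal's algorithm, assuming vertices are numbered 0..V−1 and edges are given as (weight, u, v) triples, with a simple union-find structure to detect cycles (these representations are assumptions for illustration):

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v) triples
    parent = list(range(num_vertices))

    def find(x):                              # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):        # cheapest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                          # skip edges that would form a cycle
            parent[ru] = rv                   # union the two trees
            mst.append((u, v, weight))
    return mst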
LCS Problem Statement: Given two sequences, find the length of the longest subsequence
present in both of them. A subsequence is a sequence that appears in the same relative order
but is not necessarily contiguous. For example, “abc”, “abg”, “bdf”, “aeg”, “acefg”, etc. are
subsequences of “abcdefg”.
In order to find the complexity of the brute force approach, we first need to know the number
of possible different subsequences of a string of length n, i.e., the number of subsequences
of lengths 1, 2, ..., n. Recall from the theory of permutations and combinations that the
number of combinations with 1 element is nC1, the number of combinations with 2 elements
is nC2, and so on. We know that nC0 + nC1 + nC2 + … + nCn = 2^n. So a string of length n
has 2^n − 1 different possible subsequences, since we do not count the subsequence of
length 0. It takes O(n) time to check whether a subsequence is common to both strings, so
the time complexity of the brute force approach is O(n · 2^n). This time complexity can be
improved using dynamic programming.
Algorithm
1. Create a 2D array “dp” with rows and columns equal to the length of each input string
plus 1.
2. Initialize the first row and column of the dp array to 0.
3. Iterate through the rows of the dp array, starting from 1.
4. Within the outer loop, iterate through the columns of the dp array, also starting from
1.
5. If the character at the current row of the first input string is equal to the character at
the current column of the second input string, set the current element of the dp array
to the value of the element above-left of the current element, plus 1.
6. Else, set the current element of the dp array to the maximum value of the element
above or to the left of the current element.
7. After the nested loops, the last element of the dp array will contain the length of the
LCS.
8. To find the actual LCS, initialize an empty string and iterate through the dp array,
starting from the last element and going towards the first element.
9. At each step, if the characters at the current row and column of the two input strings
are equal, add that character to the LCS string and move diagonally up and to the
left.
10. Otherwise, move up or left, towards whichever element is greater.
11. Reverse the LCS string to get the final result.
Examples:
LCS for input Sequences “ABCDGH” and “AEDFHR” is “ADH” of length 3.
LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4.
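The algorithm above can be sketched in Python as follows (names are illustrative):

def lcs(X, Y):
    m, n = len(X), len(Y)
    # dp[i][j] = length of the LCS of X[:i] and Y[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Trace back from dp[m][n] to recover the LCS itself.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

# lcs("ABCDGH", "AEDFHR") -> "ADH"; lcs("AGGTAB", "GXTXAYB") -> "GTAB"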
Sol. The Naïve-String matching algorithm is a simple and straightforward algorithm used to
find occurrences of a pattern string P in a text string T. The algorithm works by sliding the
pattern string P over the text string T, comparing each character of P with the corresponding
character of T until the entire pattern is found or the end of T is reached.
1. Start with the first character of the text string T and compare it with the first character
of the pattern string P.
2. If the first characters of T and P match, then compare the second character of T with
the second character of P, and continue until all the characters in P have been
compared with the corresponding characters in T.
3. If all the characters of P match with the corresponding characters in T, then an
occurrence of P is found in T.
4. If the characters do not match, then slide the pattern string P one position to the right,
and repeat steps 2 and 3 until the end of T is reached.
Let's apply the Naïve-String Matching algorithm on the given pattern P = aab and text T =
acaabc:
Step 1 (shift 0): Compare the pattern with the first window of T.
T: acaabc
P: aab
The first characters match (a = a), but the second character of the window (c) does not
match the second character of P (a). Slide P one position to the right.
Step 2 (shift 1): Compare the pattern with the next window.
T: acaabc
P:  aab
The first character of the window (c) does not match the first character of P (a). Slide P one
position to the right.
Step 3 (shift 2): Compare the pattern with the next window.
T: acaabc
P:   aab
All three characters match (a, a, b): an occurrence of P is found at shift 2.
Step 4 (shift 3): Continue sliding the pattern one position to the right until the end of T is
reached.
T: acaabc
P:    aab
The second character of the window (b) does not match the second character of P (a), and
no further shifts are possible.
Therefore, the Naïve-String Matching algorithm found one occurrence of the pattern P = aab
in the text T = acaabc.
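A direct Python sketch of this procedure (the function name is illustrative; shifts are 0-indexed):

def naive_match(text, pattern):
    n, m = len(text), len(pattern)
    occurrences = []
    for s in range(n - m + 1):               # try every possible shift
        if text[s:s + m] == pattern:         # character-by-character comparison
            occurrences.append(s)
    return occurrences

# naive_match("acaabc", "aab") -> [2]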
To solve the recurrence relation T(n) = T(n−1) + n^4, we can use the iteration (recursion
tree) method. We expand the recurrence until we reach the base case:
T(n) = T(n−1) + n^4
     = T(n−2) + (n−1)^4 + n^4
     = …
     = T(1) + 2^4 + 3^4 + … + n^4.
We can simplify the sum using the formula for the sum of the first n fourth powers:
1^4 + 2^4 + … + n^4 = n(n+1)(2n+1)(3n^2+3n−1)/30 = Θ(n^5).
Since the recurrence adds n^4 work at each level and there are n levels, the total work is
dominated by this sum. Therefore, the solution to the recurrence relation T(n) = T(n−1) + n^4
is T(n) = Θ(n^5), i.e., the time complexity of the algorithm is O(n^5).
To solve the recurrence relation T(n) = 2T(n/4) + n^2, we can use the Master Theorem,
which gives the time complexity of divide-and-conquer recurrences of the form:
T(n) = a T(n/b) + f(n)
where a ≥ 1 is the number of subproblems, b > 1 is the factor by which the problem size
shrinks, and f(n) is the work done outside the recursive calls. Here a = 2, b = 4, and
f(n) = n^2.
Now we compare f(n) to n^(log_b a) to determine the time complexity of the algorithm:
n^(log_b a) = n^(log_4 2) = n^(1/2)
f(n) = n^2 = Ω(n^(1/2 + ε)) for some ε > 0, and the regularity condition holds, since
a·f(n/b) = 2(n/4)^2 = n^2/8 ≤ c·n^2 for c = 1/8 < 1.
Therefore, by case 3 of the Master Theorem, the time complexity of the algorithm is Θ(n^2).
Sol Counting sort is a sorting algorithm that works by counting the number of occurrences of
each element in the input array and then using that information to determine the sorted order
of the array. The algorithm assumes that each element in the array is an integer within a
specific range.
1. Find the maximum value in the input array A and create a new array C of size
max(A)+1, initialized to all zeros.
2. Traverse the input array A and for each element A[i], increment the corresponding
count in the array C.
3. Calculate the running total of the array C by summing up the values in C from left to
right. This step ensures that the sorted output will maintain the relative order of
elements with equal values.
4. Create a new output array B of the same size as A.
5. Traverse the input array A in reverse order; for each element A[i], decrement
C[A[i]] and place A[i] at index C[A[i]] of the output array B (using 0-based
indexing). Traversing in reverse keeps the sort stable.
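A Python sketch of these steps, assuming non-negative integer keys (the function name is illustrative):

def counting_sort(A):
    k = max(A)
    C = [0] * (k + 1)
    for x in A:                      # step 2: count occurrences
        C[x] += 1
    for i in range(1, k + 1):        # step 3: prefix sums, C[x] = # elements <= x
        C[i] += C[i - 1]
    B = [0] * len(A)
    for x in reversed(A):            # step 5: reverse scan keeps the sort stable
        C[x] -= 1
        B[C[x]] = x
    return B

# counting_sort([4, 2, 2, 8, 3, 3, 1]) -> [1, 2, 2, 3, 3, 4, 8]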
Q4 a) Discuss the various cases for insertion of key in red-black tree for
given sequence of key in an empty red-black tree- {15,13,12,16,19,23,5,8}.
Also show that a red-black tree with n internal nodes has height at most
2lg(n+1).
Sol. To insert the given sequence of keys in an empty red-black tree, we first create a root
node with key 15 and color black (as per the rules of a red-black tree). Then we insert the
remaining keys one by one, maintaining the properties of a red-black tree at each step.
1. Insert 13: Insert 13 as the red left child of the root 15. The parent is black, so no
property is violated.
2. Insert 12: Insert 12 as the red left child of 13. Now 13 and 12 are both red, violating
property 4. The uncle of 12 (the right child of 15) is nil, i.e. black, and this is a left-
left case, so we right-rotate about the grandparent 15 and swap colors: 13 becomes
the black root with red children 12 and 15.
3. Insert 16: Insert 16 as the red right child of 15. The parent 15 and the uncle 12 are
both red, so we recolor: 12 and 15 become black, and the grandparent 13 stays
black because it is the root.
4. Insert 19: Insert 19 as the red right child of 16. The parent 16 is red and the uncle
(the left child of 15) is nil, i.e. black; this is a right-right case, so we left-rotate about
the grandparent 15 and swap the colors of 15 and 16: 16 becomes black with red
children 15 and 19.
5. Insert 23: Insert 23 as the red right child of 19. The parent 19 and the uncle 15 are
both red, so we recolor: 15 and 19 become black and 16 becomes red. The parent of
16 is the black root 13, so we stop.
6. Insert 5: Insert 5 as the red left child of 12. The parent 12 is black, so no property is
violated.
7. Insert 8: Insert 8 as the red right child of 5. The parent 5 is red and the uncle (the
right child of 12) is nil, i.e. black; this is a left-right case, so we left-rotate about 5
and then right-rotate about 12, recoloring: 8 becomes black with red children 5 and
12.
The resulting red-black tree is:
             13(B)
            /     \
        8(B)       16(R)
        /   \      /    \
    5(R)   12(R) 15(B)  19(B)
                            \
                           23(R)
Now we will prove that a red-black tree with n internal nodes has height at most 2lg(n+1).
First, the subtree rooted at any node x contains at least 2^bh(x) − 1 internal nodes, where
bh(x) is the black-height of x; this follows by induction on the height of x. Second, since no
red node can have a red child, at least half of the nodes on any path from the root to a leaf
(not counting the root) are black, so bh(root) ≥ h/2, where h is the height of the tree.
Combining the two facts, n ≥ 2^(h/2) − 1, which gives h ≤ 2lg(n+1).
Q 4 b) Explain and write an algorithm for union of two binomial heaps and
write its time complexity.
A binomial heap is a collection of binomial trees, where each binomial tree follows the min-
heap property: the key of a parent node is always smaller than or equal to the keys of its
children. The trees present in a binomial heap correspond to the binary representation of the
number of nodes n: the heap contains a binomial tree B_k exactly when the k-th bit of n is 1.
The union of two binomial heaps is the process of combining the two heaps into a single
heap while maintaining the heap property.
The algorithm for the union of two binomial heaps H1 and H2 is as follows:
1. Merge the root lists of H1 and H2 into a single root list sorted by increasing degree.
2. Traverse the merged root list; whenever two adjacent trees have the same degree (and
the following tree does not also have that degree), link them: the root with the larger key
becomes the leftmost child of the root with the smaller key, producing a tree of the next
higher degree.
3. Continue until no two trees in the root list have the same degree; the result is a valid
binomial heap.
The time complexity of the union of two binomial heaps is O(log N), where N is the total
number of nodes in the two heaps. Merging the root lists takes O(log N) time, since a
binomial heap with N nodes contains at most ⌊log N⌋ + 1 trees. Each link operation in step 2
takes O(1) time, and at most O(log N) links are performed. Therefore, the overall time
complexity of the union operation is O(log N).
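A Python sketch of the linking pass, following the outline above. It assumes each root stores key, degree, child, parent and sibling fields (the usual left-child, right-sibling layout), and a helper merge_root_lists that merges the two root lists by increasing degree (a standard sorted-list merge, omitted here). This is an outline under those assumptions, not a complete heap implementation:

def binomial_link(y, z):
    # Make root y (larger key) the leftmost child of root z (smaller key).
    y.parent = z
    y.sibling = z.child
    z.child = y
    z.degree += 1

def union(h1, h2):
    head = merge_root_lists(h1, h2)   # assumed helper: merge by degree, O(log N)
    if head is None:
        return None
    prev, x, nxt = None, head, head.sibling
    while nxt is not None:
        if x.degree != nxt.degree or (nxt.sibling and nxt.sibling.degree == x.degree):
            prev, x = x, nxt          # degrees differ (or three equal): move on
        elif x.key <= nxt.key:
            x.sibling = nxt.sibling   # link nxt under x
            binomial_link(nxt, x)
        else:
            if prev is None:
                head = nxt
            else:
                prev.sibling = nxt
            binomial_link(x, nxt)     # link x under nxt
            x = nxt
        nxt = x.sibling
    return head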
Sol A greedy algorithm is an algorithmic paradigm that follows the problem-solving heuristic
of making the locally optimal choice at each stage with the hope of finding a global optimum.
The idea is to make the best possible decision at each step without considering the overall
structure of the problem. In other words, the algorithm makes the best choice that is
immediately available, without considering the future consequences of that choice.
A common example of a problem that can be solved using a greedy algorithm is the
Fractional Knapsack problem. In this problem, we are given a set of items, each with a
weight and a value, and a knapsack with a maximum weight capacity. The objective is to fill
the knapsack with the maximum value of items without exceeding the weight capacity of the
knapsack.
The greedy-choice property of the Fractional Knapsack problem states that the optimal
solution can be obtained by selecting items in decreasing order of their value-to-weight ratios.
In other words, the algorithm chooses the item with the highest value-to-weight ratio and puts
as much of it as possible into the knapsack, and repeats this process until the knapsack is full.
Here is a short Python rendering of the greedy algorithm for the Fractional Knapsack
problem:

def fractional_knapsack(items, capacity):
    # items: list of (value, weight) pairs
    # Greedy choice: highest value-to-weight ratio first.
    items.sort(key=lambda vw: vw[0] / vw[1], reverse=True)
    total_value = 0.0
    for value, weight in items:
        if capacity == 0:
            break
        take = min(weight, capacity)          # take as much of this item as fits
        total_value += take * value / weight
        capacity -= take
    return total_value

# Example: fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50) -> 240.0
The time complexity of the Fractional Knapsack problem using a greedy algorithm is O(n log
n) for sorting the items by value-to-weight ratio, plus O(n) for the loop over the items.
Therefore, the overall time complexity is O(n log n).
Single source shortest paths is a problem in graph theory that involves finding the shortest
paths from a single source vertex to all other vertices in a weighted graph. The problem can
be solved using various algorithms, such as Dijkstra's algorithm, Bellman-Ford algorithm,
and Floyd-Warshall algorithm.
Dijkstra's algorithm is a popular algorithm for solving the single source shortest paths
problem in weighted graphs with non-negative edge weights. The algorithm works by
repeatedly selecting the vertex with the smallest distance from the source vertex, relaxing its
adjacent edges, and adding it to the set of visited vertices. The algorithm maintains a set of
visited vertices and their distances from the source vertex.
Here is the pseudocode for Dijkstra's algorithm:
Dijkstra(G, s):
    Initialize the distance of every vertex as infinite and the distance of the source vertex s as 0
    Create an empty set S
    While S does not contain all vertices:
        Select the vertex u with the smallest distance from the source that is not in S
        Add u to S
        For each neighbor v of u that is not in S:
            If the distance to v through u is smaller than the current distance to v:
                Update the distance of v
    Return the distances of all vertices from the source vertex
In the above pseudocode, G is the graph, s is the source vertex, and S is the set of visited
vertices. The algorithm initializes the distances of all vertices as infinite, except the distance
of the source vertex, which is set to 0. The algorithm then selects the vertex with the smallest
distance from the source vertex that is not in S, adds it to S, and relaxes its adjacent edges.
The algorithm repeats this process until S contains all vertices in the graph.
The time complexity of Dijkstra's algorithm depends on the priority queue used to store the
vertices. With a binary heap, each queue operation takes O(log V) time; the algorithm visits
each vertex at most once and relaxes each edge once, giving O((V + E) log V) time, where
E is the number of edges and V is the number of vertices. With a Fibonacci heap, decrease-
key costs O(1) amortized and the complexity improves to O(E + V log V).
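For reference, a runnable Python sketch using the standard library's heapq as the priority queue; the adjacency-dict representation is an assumption for illustration:

import heapq

def dijkstra(graph, s):
    # graph: dict mapping each vertex u to a list of (v, weight) pairs;
    # all weights are assumed non-negative.
    dist = {u: float('inf') for u in graph}
    dist[s] = 0
    pq = [(0, s)]                          # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                    # stale queue entry, skip
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:            # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist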
The sum of subsets problem is a well-known problem in computer science and mathematics
that involves finding all subsets of a given set of numbers that sum to a specified target value.
In the given example, we have the set w = {5, 7, 10, 12, 15, 18, 20} and the target value m =
35. The task is to find all possible subsets of w that add up to 35.
One way to solve this problem is by using a recursive backtracking algorithm. The algorithm
works as follows: at each level of the state space tree we decide whether to include the next
element of w in the current subset (left branch) or to exclude it (right branch), and we prune
a branch as soon as the running sum exceeds m. The first two levels of the tree look like
this:
                 []
            /         \
          [5]          []
         /    \       /    \
     [5,7]    [5]   [7]    []
      ...     ...   ...    ...
Exploring the full tree enumerates every subset of w, and each leaf whose sum equals 35 is a
solution. There are four subsets of w that sum to 35: {5, 10, 20}, {5, 12, 18}, {7, 10, 18}, and
{15, 20}.
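A Python sketch of the include/exclude backtracking described above (names are illustrative):

def sum_of_subsets(w, m):
    solutions = []

    def backtrack(i, chosen, total):
        if total == m:                       # found a valid subset
            solutions.append(list(chosen))
            return
        if i == len(w) or total > m:         # dead end: prune this branch
            return
        chosen.append(w[i])                  # left branch: include w[i]
        backtrack(i + 1, chosen, total + w[i])
        chosen.pop()                         # right branch: exclude w[i]
        backtrack(i + 1, chosen, total)

    backtrack(0, [], 0)
    return solutions

# sum_of_subsets([5, 7, 10, 12, 15, 18, 20], 35)
# -> [[5, 10, 20], [5, 12, 18], [7, 10, 18], [15, 20]]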
The n-queens problem is a classic problem in computer science and mathematics, which
involves placing n queens on an n×n chessboard, such that no two queens threaten each other.
In other words, no two queens can be placed on the same row, column, or diagonal of the
chessboard.
For example, in the 4-queens problem, we need to place 4 queens on a 4x4 chessboard such
that no two queens threaten each other. The backtracking algorithm can be used to solve this
problem.
Here's how the backtracking algorithm works for the 4-queens problem:
1. Start by placing the first queen in the first row and the first column.
2. Move to the second row and try to place the second queen in a column that does not
conflict with the first queen.
3. If a conflict arises, backtrack to the previous row and try the next column.
4. If all columns in the current row have been tried and a valid position for the queen has
not been found, backtrack to the previous row and try the next column there.
5. Repeat the above steps until all queens have been placed on the board, or a valid
placement cannot be found.
Step 1: Place the first queen in row 1, column 1.
Q . . .
. . . .
. . . .
. . . .
Step 2: In row 2, columns 1 and 2 conflict with the first queen (same column, same
diagonal), so place the second queen in column 3.
Q . . .
. . Q .
. . . .
. . . .
Step 3: In row 3, every column conflicts with one of the first two queens, so backtrack to
row 2 and move the second queen to the next valid column, column 4.
Q . . .
. . . Q
. . . .
. . . .
Step 4: In row 3, column 2 is now the first valid column; place the third queen there.
Q . . .
. . . Q
. Q . .
. . . .
Step 5: In row 4, every column conflicts with one of the first three queens, so backtrack
again. No further columns work in rows 3 or 2 either, so backtrack all the way to row 1 and
move the first queen to column 2.
. Q . .
. . . .
. . . .
. . . .
Step 6: Repeating the same procedure now succeeds: row 2 takes column 4, row 3 takes
column 1, and row 4 takes column 3.
. Q . .
. . . Q
Q . . .
. . Q .
We have found a valid placement of 4 queens on the chessboard, and the problem is solved.
This backtracking algorithm can be used to solve the n-queens problem for any value of n.
However, the size of the search tree grows roughly as O(n!), since there are up to n choices
in the first row, at most n − 1 usable columns in the second, and so on, which makes the
naive algorithm infeasible for large values of n. Therefore, various optimizations (stronger
pruning, symmetry reduction, constraint propagation) have been proposed to solve this
problem more efficiently.
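A compact Python sketch of this backtracking procedure, using sets to check column and diagonal conflicts in O(1) (names are illustrative):

def solve_n_queens(n):
    cols, diag1, diag2 = set(), set(), set()
    board = [-1] * n                       # board[row] = column of the queen

    def place(row):
        if row == n:
            return True                    # all queens placed
        for col in range(n):
            if col in cols or row - col in diag1 or row + col in diag2:
                continue                   # conflicts with an earlier queen
            board[row] = col
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            if place(row + 1):
                return True
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)
        return False                       # no column works: backtrack

    return board if place(0) else None

# solve_n_queens(4) -> [1, 3, 0, 2], i.e. the board .Q.. / ...Q / Q... / ..Q.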
Q7.a) What is string matching algorithm? Explain Rabin-Karp method
with examples.
String matching is a well-known problem in computer science that involves finding all
occurrences of a pattern string in a larger text string. There are many algorithms to solve this
problem, each with its strengths and weaknesses.
The Rabin-Karp algorithm is one such algorithm that is particularly useful for matching large
patterns in large text strings. It uses a rolling hash function to compute the hash value of the
pattern and the hash value of substrings of the text. By comparing the hash values, we can
quickly determine if a potential match exists and move on to the next substring.
1. Compute the hash value of the pattern using a hash function. A common hash
function treats the pattern as a base-r number and computes its value modulo some
large prime p. For example, if we are matching the pattern "abc" and r = 256 (the
number of possible ASCII characters), we can compute the hash value as
(97·256^2 + 98·256 + 99) mod p.
2. Compute the hash value of the first substring of length m (the length of the pattern) in
the text.
3. Compare the hash values. If they are equal, compare the substrings character by
character to verify that a match exists.
4. If a match is not found, compute the hash value of the next substring of length m by
"rolling" the hash to the next position: subtract the contribution of the first character
of the previous substring (its value times r^(m−1)), multiply the remaining hash
value by r, and add the value of the next character. For example, if the previous
substring was "abc" and the next character is "d", the hash of the next substring is
((hash − 97·256^2) · 256 + 100) mod p.
5. Repeat steps 3-4 until all substrings of length m in the text have been checked.
Suppose we want to find all occurrences of the pattern "abc" in the text "abdcabcabc". We
can use the Rabin-Karp algorithm with r = 256 and p = 101. Since 256 ≡ 54 (mod 101) and
256^2 ≡ 88 (mod 101), the hash of a substring "xyz" is (x·88 + y·54 + z) mod 101.
1. Compute the hash value of the pattern "abc": (97·88 + 98·54 + 99) mod 101 = 90.
2. Compute the hash value of the first substring "abd": (97·88 + 98·54 + 100) mod 101
= 91. No match.
3. Compute the hash value of the second substring "bdc": 84. No match.
4. Compute the hash value of the third substring "dca": 2. No match.
5. Compute the hash value of the fourth substring "cab": 9. No match.
6. Compute the hash value of the fifth substring "abc": 90. The hash matches the
pattern's hash, so we compare the substring with the pattern character by character
and confirm an occurrence at position 5.
7. Compute the hash value of the sixth substring "bca": 28. No match.
8. Compute the hash value of the seventh substring "cab": 9. No match.
9. Compute the hash value of the eighth substring "abc": 90. The hash matches again,
and character-by-character comparison confirms a second occurrence at position 8.
The algorithm therefore finds two occurrences of the pattern "abc" in "abdcabcabc".
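A runnable Python sketch of the algorithm, using the same base r = 256 and modulus p = 101; returned positions are 0-indexed shifts:

def rabin_karp(text, pattern, r=256, p=101):
    n, m = len(text), len(pattern)
    if m > n:
        return []
    h = pow(r, m - 1, p)                       # r^(m-1) mod p, used for rolling
    hp = ht = 0
    for i in range(m):                         # initial hashes of pattern and window
        hp = (hp * r + ord(pattern[i])) % p
        ht = (ht * r + ord(text[i])) % p
    matches = []
    for s in range(n - m + 1):
        if hp == ht and text[s:s + m] == pattern:   # verify on every hash hit
            matches.append(s)
        if s < n - m:                          # roll the window one step right
            ht = ((ht - ord(text[s]) * h) * r + ord(text[s + m])) % p
    return matches

# rabin_karp("abdcabcabc", "abc") -> [4, 7]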
Q7 b) Explain approximation algorithm. Explore set cover problem using
approximation algorithm.
Sol. Approximation algorithms are algorithms that provide efficient solutions to optimization
problems, often in situations where finding an exact solution is computationally intractable.
Instead of guaranteeing an optimal solution, approximation algorithms provide a near-optimal
solution that is usually within a certain factor of the optimal solution.
The set cover problem is a classic optimization problem that can be solved using an
approximation algorithm. The problem involves selecting a minimum number of sets from a
collection of sets to cover all elements in a given universe. More formally, given a universe U
and a collection of sets S1, S2, ..., Sm whose union equals U, the goal is to find a minimum
subset of the sets whose union is equal to U.
Here's how the approximation algorithm for the set cover problem works:
1. Start with an empty set C that will contain the selected sets.
2. While U is not empty:
a. Find the set Si that covers the most uncovered elements in U (break ties
arbitrarily).
b. Add Si to C.
c. Remove all the elements of Si from U.
3. Return C.
The above algorithm is known as the greedy algorithm for set cover, and it has a provable
approximation guarantee of ln n, where n is the number of elements in U. This means that the
number of sets selected by the algorithm is at most ln n times the optimal number of sets.
Here's an example to illustrate the approximation algorithm for the set cover problem:
Suppose we have the universe U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and the collection of sets S1 =
{1, 2, 3}, S2 = {2, 3, 4}, S3 = {3, 4, 5}, S4 = {4, 5, 6}, S5 = {5, 6, 7}, S6 = {6, 7, 8}, S7 =
{7, 8, 9}, S8 = {8, 9, 10}.
The greedy algorithm first picks S1 (every set initially covers 3 new elements; the tie is
broken arbitrarily), leaving {4, ..., 10} uncovered. It then picks S4, covering {4, 5, 6}, then
S7, covering {7, 8, 9}, and finally S8 to cover the remaining element 10. The returned cover
is C = {S1, S4, S7, S8}, with 4 sets. Since every set has at most 3 elements, any cover of the
10 elements needs at least ⌈10/3⌉ = 4 sets, so in this instance the greedy solution happens
to be optimal.
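A minimal Python sketch of the greedy algorithm, assuming (as stated above) that the union of the sets equals the universe:

def greedy_set_cover(universe, sets):
    uncovered = set(universe)
    cover = []
    while uncovered:
        # pick the set covering the most still-uncovered elements
        best = max(sets, key=lambda s: len(uncovered & s))
        cover.append(best)
        uncovered -= best
    return cover

# Example from above:
# U = set(range(1, 11))
# S = [{1,2,3}, {2,3,4}, {3,4,5}, {4,5,6}, {5,6,7}, {6,7,8}, {7,8,9}, {8,9,10}]
# greedy_set_cover(U, S) returns 4 sets, e.g. {1,2,3}, {4,5,6}, {7,8,9}, {8,9,10}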