DAA Module 2 Notes
1. General method
Divide and Conquer is one of the best-known general algorithm design techniques. It works
according to the following general plan:
• Given a function to compute on 'n' inputs, the divide-and-conquer strategy suggests
splitting the inputs into 'k' distinct subsets, 1 < k ≤ n, yielding 'k' sub problems.
• These sub problems must be solved, and then a method must be found to combine sub
solutions into a solution of the whole.
• If the sub problems are still relatively large, then the divide-and-conquer strategy can
possibly be reapplied.
• Often the sub problems resulting from a divide-and-conquer design are of the same
type as the original problem. For those cases the reapplication of the divide-and-conquer
principle is naturally expressed by a recursive algorithm.
Figure: a problem of size n is divided into smaller sub problems, which are solved and combined.
The computing time of divide and conquer is described by the recurrence
T(n) = g(n) for small n
T(n) = T(n1) + T(n2) + ... + T(nk) + f(n) otherwise
Where,
• T(n) is the time for divide and conquer method on any input of size n and
• g(n) is the time to compute answer directly for small inputs.
• The function f(n) is the time for dividing the problem ‘p’ and combining the solutions
to sub problems.
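The general plan can be summarized as a control abstraction. The following toy Python sketch
(added here for illustration; dc_sum and its structure are our own example, not from the text)
computes the sum of a list by divide and conquer, marking which steps account for g(n) and f(n):

def dc_sum(a, lo, hi):
    # Divide-and-conquer sum of a[lo:hi], a toy instance of the general plan.
    if hi - lo == 1:               # small input: solve directly, cost g(n)
        return a[lo]
    mid = (lo + hi) // 2           # divide into k = 2 sub problems
    left = dc_sum(a, lo, mid)      # solve each sub problem recursively
    right = dc_sum(a, mid, hi)
    return left + right            # combine the sub solutions, cost f(n)

# Example: dc_sum([3, 1, 4, 1, 5], 0, 5) returns 14.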
For divide and conquer based algorithms that produce sub problems of the same type as the
original problem, it is very natural to first describe them by using recursion.
More generally, an instance of size n can be divided into b instances of size n/b, with a of
them needing to be solved. (Here, a and b are constants; a ≥ 1 and b > 1.) Assuming that
size n is a power of b (i.e. n = b^k), to simplify our analysis, we get the following recurrence for
the running time T(n):
T(n) = aT(n/b) + f(n) ..... (1)
where f(n) is a function that accounts for the time spent on dividing the problem into smaller
ones and on combining their solutions.
The recurrence relation can be solved by using i) the substitution method or ii) the master
theorem.
1. Substitution Method - This method repeatedly makes substitution for each
occurrence of the function T in the right-hand side until all such occurrences
disappear.
2. Master Theorem - The efficiency analysis of many divide-and-conquer algorithms is
greatly simplified by the master theorem. It states that, in the recurrence equation T(n) =
aT(n/b) + f(n), if f(n) ∈ Θ(n^d) where d ≥ 0, then
T(n) ∈ Θ(n^d) if a < b^d
T(n) ∈ Θ(n^d log n) if a = b^d
T(n) ∈ Θ(n^(log_b a)) if a > b^d
Problems on Substitution method & Master theorem to solve the recurrence relation
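As a worked illustration (this example is ours, not from the notes), consider the recurrence
T(n) = 2T(n/2) + n with T(1) = 1. By the substitution method:
T(n) = 2T(n/2) + n
     = 2[2T(n/4) + n/2] + n = 4T(n/4) + 2n
     = 4[2T(n/8) + n/4] + 2n = 8T(n/8) + 3n
     = ... = 2^k T(n/2^k) + kn.
For n = 2^k the expansion stops at T(1), giving T(n) = n + n log2 n ∈ Θ(n log n).
The master theorem agrees: here a = 2, b = 2 and f(n) ∈ Θ(n^1), so d = 1 and a = b^d; hence
T(n) ∈ Θ(n^d log n) = Θ(n log n).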
3. Binary Search
Problem definition: Let ai, 1 ≤ i ≤ n, be a list of elements that are sorted in non-decreasing
order. The problem is to find whether a given element x is present in the list or not. If x is
present we have to determine a value j (the element's position) such that aj = x. If x is not in the
list, then j is set to zero.
Solution: Let P = (n, ai…al, x) denote an arbitrary instance of the search problem, where n is the
number of elements in the list, ai…al is the list of elements and x is the key element to be
searched for in the given list. Binary search on the list is done as follows:
Step 1: Pick an index q in the middle of the range [i, l], i.e. q = ⌊(i + l)/2⌋, and compare x with aq.
Step 2: if x = aq, i.e. the key element is equal to the mid element, the problem is immediately solved.
Step 3: if x < aq, x has to be searched for only in the sub-list ai, ai+1, ..., aq−1.
Therefore, the problem reduces to (q−i, ai…aq−1, x).
Step 4: if x > aq, x has to be searched for only in the sub-list aq+1, ..., al. Therefore, the problem
reduces to (l−q, aq+1…al, x).
For the above solution procedure, the algorithm can be implemented as a recursive or a
non-recursive algorithm.
Recursive binary search algorithm
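The algorithm itself appeared as a figure in the original notes; the following Python sketch is
a plausible reconstruction of the recursive version described above (0-based indices; it returns
-1 for an unsuccessful search instead of the notes' j = 0, which assumes 1-based positions):

def bin_srch(a, i, l, x):
    # Searches for x in the sorted sub-list a[i..l] (inclusive).
    if i > l:                          # empty range: x is not in the list
        return -1
    q = (i + l) // 2                   # Step 1: middle index
    if x == a[q]:                      # Step 2: key equals the mid element
        return q
    elif x < a[q]:                     # Step 3: search the left sub-list
        return bin_srch(a, i, q - 1, x)
    else:                              # Step 4: search the right sub-list
        return bin_srch(a, q + 1, l, x)

# Example: bin_srch([-15, -6, 0, 7, 9, 23, 54], 0, 6, 9) returns 4.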
Analysis
In binary search the basic operation is key comparison. Binary Search can be analyzed with
the best, worst, and average case number of comparisons. The numbers of comparisons for the
recursive and iterative versions of Binary Search are the same, if comparison counting is
relaxed slightly. For Recursive Binary Search, count each pass through the if-then-else block
as one comparison. For Iterative Binary Search, count each pass through the while block as
one comparison. Let us find out how many such key comparisons the algorithm makes on
an array of n elements.
Best case – Θ(1). In the best case, the key is the middle element of the array. A constant
number of comparisons (actually just 1) is required.
Worst case – Θ(log2 n). In the worst case, the key does not exist in the array at all. Through
each recursion or iteration of Binary Search, the size of the admissible range is halved. This
halving can be done ⌈log2 n⌉ times. Thus, ⌈log2 n⌉ comparisons are required.
Sometimes, in the case of a successful search too, it may take this maximum number of
comparisons, ⌈log2 n⌉. So the worst-case complexity of a successful binary search is also Θ(log2 n).
Average case – Θ(log2 n). To find the average case, take the sum of the product of the number of
comparisons required to find each element and the probability of searching for that element.
To simplify the analysis, assume that no item which is not in the array will be searched for, and
that the probabilities of searching for each element are uniform.
Space Complexity - The space requirements for the recursive and iterative versions of binary
search are different. Iterative Binary Search requires only a constant amount of space, while
Recursive Binary Search requires space proportional to the number of comparisons to
maintain the recursion stack.
Advantages: Efficient on very big lists; can be implemented iteratively or recursively.
Limitations:
• Interacts poorly with the memory hierarchy
• Requires sorted list as an input
• Due to random access of list elements, it needs arrays instead of linked lists.
4. Finding the maximum and minimum
Problem definition: Given a list of n elements, the problem is to find the maximum and the
minimum items in the list. In the straightforward method (StraightMaxMin), each element is
compared with the current max and the current min.
Explanation:
StraightMaxMin requires 2(n−1) comparisons in the best, average and worst cases.
By realizing that the comparison a[i] < min is needed only when a[i] > max is false, an
improvement in the algorithm can be made. Hence we can replace the contents of the for loop by:
if (a[i] > max) then max = a[i]; else if (a[i] < min) then min = a[i];
On average, a[i] is greater than max half the time. So, the average number of comparisons
is 3n/2 − 1.
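The recursive MaxMin algorithm referred to below was given as a figure in the original notes;
here is a hedged Python sketch of it (our reconstruction with 0-based indices; the textbook
version returns max and min through reference parameters):

def max_min(a, i, j):
    # Returns (max, min) of a[i..j] by divide and conquer.
    if i == j:                           # one element: it is both max and min
        return a[i], a[i]
    if j == i + 1:                       # two elements: a single comparison
        return (a[i], a[j]) if a[i] > a[j] else (a[j], a[i])
    mid = (i + j) // 2                   # divide the range into two halves
    max1, min1 = max_min(a, i, mid)      # solve the left half
    max2, min2 = max_min(a, mid + 1, j)  # solve the right half
    # combine: two more comparisons per merge
    return (max1 if max1 > max2 else max2,
            min1 if min1 < min2 else min2)

# Example: max_min([22, 13, -5, -8, 15, 60, 17, 31, 47], 0, 8) returns (60, -8).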
Space Complexity
Compared to the straightforward method, the MaxMin method requires extra stack space for
i, j, max, min, max1 and min1. Given n elements, there will be ⌈log2 n⌉ + 1 levels of
recursion, and we need to save seven values for each recursive call (6 + 1 for the return address).
5. Merge Sort
Merge sort is a perfect example of a successful application of the divide-and-conquer
technique. It sorts a given array A[0 .. n−1] by dividing it into two halves A[0 .. ⌊n/2⌋−1]
and A[⌊n/2⌋ .. n−1], sorting each of them recursively, and then merging the two smaller
sorted arrays into a single sorted one.
The merging of two sorted arrays is done as follows: two pointers (array indices) are
initialized to point to the first elements of the arrays being merged; the elements pointed to
are compared, and the smaller of them is added to the new array being constructed.
After that, the index of the smaller element is incremented to point to its immediate
successor in the array it was copied from. This operation is repeated until one of the
two given arrays is exhausted, and then the remaining elements of the other array are
copied to the end of the new array.
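A Python sketch of mergesort as described above (an illustrative reconstruction, not the
textbook's pseudocode):

def merge(b, c):
    # Merges two sorted lists b and c into one sorted list.
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:                 # one key comparison per step
            merged.append(b[i]); i += 1
        else:
            merged.append(c[j]); j += 1
    merged.extend(b[i:])                 # copy the rest of the unexhausted list
    merged.extend(c[j:])
    return merged

def merge_sort(a):
    # Sorts list a by splitting it into two halves, sorting each recursively,
    # and merging the two sorted halves.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

# Example: merge_sort([8, 3, 2, 9, 7, 1, 5, 4]) returns [1, 2, 3, 4, 5, 7, 8, 9].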
Example: The operation of the algorithm on the list 8, 3, 2, 9, 7, 1, 5, 4 is illustrated in
the figure.
Analysis
Here the basic operation is key comparison. As merge sort execution does not depend on the
order of the data, best case and average case runtimes are the same as the worst case runtime.
Worst case: The case where, during merging, neither of the two arrays becomes empty before
the other one contains just one element leads to the worst case of merge sort. Assuming for
simplicity that n is a power of 2, the recurrence relation for the number of key comparisons C(n) is
C(n) = 2C(n/2) + Cmerge(n) for n > 1, C(1) = 0
where Cmerge(n) is the number of key comparisons made during the merging stage.
Let us analyze Cmerge(n), the number of key comparisons performed during the merging stage.
At each step, exactly one comparison is made, after which the total number of elements in the
two arrays still needing to be processed is reduced by 1. In the worst case, neither of the two
arrays becomes empty before the other one contains just one element (e.g., smaller elements
may come from the alternating arrays). Therefore, for the worst case, Cmerge(n) = n − 1.
Now,
Cworst(n) = 2Cworst(n/2) + n − 1 for n > 1, Cworst(1) = 0.
By the master theorem, Cworst(n) ∈ Θ(n log n); the exact solution for n = 2^k is
Cworst(n) = n log2 n − n + 1.
Advantages:
• Number of comparisons performed is nearly optimal.
• For large n, the number of comparisons made by this algorithm in the average case
turns out to be about 0.25n less than in the worst case and hence is also in Θ(n log n).
• Mergesort will never degrade to O(n²).
• Another advantage of mergesort over quicksort and heapsort is its stability. (A sorting
algorithm is said to be stable if two objects with equal keys appear in the same order
in the sorted output as they appear in the input array to be sorted.)
Limitations:
• The principal shortcoming of mergesort is the linear amount [O(n)] of extra storage
the algorithm requires. Though merging can be done in place, the resulting algorithm
is quite complicated and of theoretical interest only.
6. Quick sort
Quicksort is the other important sorting algorithm that is based on the divide-and-conquer
approach. Unlike mergesort, which divides its input elements according to their position in
the array, quicksort divides (or partitions) them according to their value.
A partition is an arrangement of the array's elements so that all the elements to the left of
some element A[s] are less than or equal to A[s], and all the elements to the right of A[s] are
greater than or equal to it:
A[0] … A[s−1]  ≤  A[s]  ≤  A[s+1] … A[n−1]
Obviously, after a partition is achieved, A[s] will be in its final position in the sorted array,
and we can continue sorting the two subarrays to the left and to the right of A[s]
independently (e.g., by the same method).
In quicksort, the entire work happens in the division stage, with no work required to combine
the solutions to the subproblems.
Partitioning
We start by selecting a pivot, an element with respect to whose value we are going to divide
the subarray. There are several different strategies for selecting a pivot. We use the
sophisticated method suggested by C.A.R. Hoare, the prominent British computer scientist
who invented quicksort.
Select the subarray's first element as the pivot: p = A[l]. Now scan the subarray from both
ends, comparing the subarray's elements to the pivot.
The left-to-right scan, denoted below by index pointer i, starts with the second
element. Since we want elements smaller than the pivot to be in the left part of the
subarray, this scan skips over elements that are smaller than the pivot and stops upon
encountering the first element greater than or equal to the pivot.
The right-to-left scan, denoted below by index pointer j, starts with the last element of
the subarray. Since we want elements larger than the pivot to be in the right part of the
subarray, this scan skips over elements that are larger than the pivot and stops on
encountering the first element smaller than or equal to the pivot.
After both scans stop, three situations may arise, depending on whether or not the scanning
indices have crossed.
1. If scanning indices i and j have not crossed, i.e., i < j, we simply exchange A[i] and
A[j] and resume the scans by incrementing i and decrementing j, respectively.
2. If the scanning indices have crossed over, i.e., i > j, we will have partitioned the
subarray after exchanging the pivot with A[j].
3. If the scanning indices stop while pointing to the same element, i.e., i = j, the value
they are pointing to must be equal to p. Thus, we have the subarray partitioned, with
the split position s = i = j.
ALGORITHM HoarePartition(A[l..r])
//Partitions a subarray by Hoare's algorithm, using the first element as a pivot
//Input: Subarray of array A[0..n − 1], defined by its left and right indices l and r (l < r)
//Output: Partition of A[l..r], with the split position returned as this function's value
p ← A[l]; i ← l; j ← r + 1
repeat
    repeat i ← i + 1 until A[i] ≥ p
    repeat j ← j − 1 until A[j] ≤ p
    swap(A[i], A[j])
until i ≥ j
swap(A[i], A[j]) //undo the last swap when i ≥ j
swap(A[l], A[j]) //place the pivot at the split position
return j
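A Python rendering of quicksort with Hoare's partition (a sketch consistent with the
pseudocode above; the explicit bounds check on i is added because Python lacks the sentinel
the textbook version relies on):

def hoare_partition(a, l, r):
    p = a[l]                             # pivot: first element of the subarray
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < p:       # left-to-right scan
            i += 1
        j -= 1
        while a[j] > p:                  # right-to-left scan (stops at a[l] = p)
            j -= 1
        if i >= j:                       # scanning indices crossed or met
            break
        a[i], a[j] = a[j], a[i]          # exchange a[i] and a[j], resume scans
    a[l], a[j] = a[j], a[l]              # place the pivot at split position j
    return j

def quick_sort(a, l, r):
    if l < r:
        s = hoare_partition(a, l, r)     # a[s] is in its final sorted position
        quick_sort(a, l, s - 1)          # sort the left subarray
        quick_sort(a, s + 1, r)          # sort the right subarray

# Example: a = [5, 3, 1, 9, 8, 2, 4, 7]; quick_sort(a, 0, 7) leaves a sorted.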
Analysis
Here the basic operation is key comparison.
Best Case - The number of key comparisons made before a partition is achieved is n + 1 if
the scanning indices cross over, and n if they coincide.
If all the splits happen in the middle of the corresponding subarrays, we will have the best case.
The number of key comparisons in the best case satisfies the recurrence
Cbest(n) = 2Cbest(n/2) + n for n > 1, Cbest(1) = 0.
According to the master theorem, Cbest(n) ∈ Θ(n log2 n); solving it exactly for n = 2^k yields
Cbest(n) = n log2 n.
Worst Case – In the worst case, all the splits will be skewed to the extreme: one of the two
subarrays will be empty, and the size of the other will be just 1 less than the size of the
subarray being partitioned. This unfortunate situation will happen, in particular, for increasing
arrays. Indeed, if A[0..n − 1] is a strictly increasing array and we use A[0] as the pivot, the
left-to-right scan will stop on A[1] while the right-to-left scan will go all the way to reach
A[0], indicating the split at position 0. So, after making n + 1 comparisons to get to this
partition and exchanging the pivot A[0] with itself, the algorithm will be left with the strictly
increasing array A[1..n − 1] to sort. This sorting of strictly increasing arrays of diminishing
sizes will continue until the last one A[n−2..n−1] has been processed. The total number of key
comparisons made will be equal to
Cworst(n) = (n + 1) + n + ... + 3 = (n + 1)(n + 2)/2 − 3 ∈ Θ(n²).
Average Case - Let Cavg(n) be the average number of key comparisons made by quicksort on
a randomly ordered array of size n. A partition can happen in any position s (0 ≤ s ≤ n−1) after
n + 1 comparisons are made to achieve the partition. After the partition, the left and right
subarrays will have s and n − 1 − s elements, respectively. Assuming that the partition split can
happen in each position s with the same probability 1/n, we get the following recurrence
relation:
Cavg(n) = (1/n) Σ s=0..n−1 [(n + 1) + Cavg(s) + Cavg(n − 1 − s)] for n > 1,
Cavg(0) = 0, Cavg(1) = 0. Its solution turns out to be Cavg(n) ≈ 2n ln n ≈ 1.39 n log2 n.
Thus, on the average, quicksort makes only 39% more comparisons than in the best case.
Moreover, its innermost loop is so efficient that it usually runs faster than mergesort on
randomly ordered arrays of nontrivial sizes. This certainly justifies the name given to the
algorithm by its inventor.
Variations: Because of quicksort's importance, there have been persistent efforts over the
years to refine the basic algorithm. Among several improvements discovered by
researchers are:
• Better pivot selection methods, such as randomized quicksort that uses a random
element, or the median-of-three method that uses the median of the leftmost,
rightmost, and middle elements of the array
• Switching to insertion sort on very small subarrays (between 5 and 15 elements for
most computer systems), or not sorting small subarrays at all and finishing the
algorithm with insertion sort applied to the entire nearly sorted array
• Modifications of the partitioning algorithm, such as the three-way partition into
segments smaller than, equal to, and larger than the pivot
Limitations: 1. It is not stable. 2. It requires a stack to store parameters of subarrays that are
yet to be sorted. 3. Its performance on randomly ordered arrays is known to be sensitive
not only to the implementation details of the algorithm but also to both computer architecture
and data type.
7. Strassen's Matrix Multiplication
The straightforward divide-and-conquer method multiplies two n × n matrices by partitioning
each of them into four n/2 × n/2 submatrices and computing the product blockwise.
In this method, we do 8 multiplications for matrices of size n/2 × n/2 and 4 additions.
Addition of two matrices takes O(n²) time. So the time complexity can be written as T(n) =
8T(n/2) + O(n²), which happens to be O(n³); the same as the direct method.
Divide and Conquer through Strassen's Method: By using the divide-and-conquer approach
proposed by Strassen in 1969, we can reduce the number of multiplications.
Multiplication of 2×2 matrices: The principal insight of the algorithm lies in the discovery
that we can find the product C of two 2 × 2 matrices A and B with just 7 multiplications, as
opposed to the eight required by the brute-force algorithm. This is accomplished by using the
following formulas:
C00 = M1 + M4 − M5 + M7    C01 = M3 + M5
C10 = M2 + M4              C11 = M1 + M3 − M2 + M6
where
M1 = (A00 + A11)(B00 + B11)
M2 = (A10 + A11)B00
M3 = A00(B01 − B11)
M4 = A11(B10 − B00)
M5 = (A00 + A01)B11
M6 = (A10 − A00)(B00 + B01)
M7 = (A01 − A11)(B10 + B11)
It is not difficult to verify that one can treat these submatrices as numbers to get the correct
product. For example, C00 can be computed as M1 + M4 − M5 + M7 where M1, M4, M5, and M7
are found by Strassen's formulas, with the numbers replaced by the corresponding
submatrices. If the seven products of n/2 × n/2 matrices are computed recursively by the
same method, we have Strassen's algorithm for matrix multiplication.
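Assuming availability of NumPy, Strassen's scheme can be sketched in Python as follows (an
illustrative reconstruction for n a power of 2; a production version would switch to ordinary
multiplication for small blocks):

import numpy as np

def strassen(A, B):
    # Multiplies two n x n matrices (n a power of 2) with 7 recursive products.
    n = A.shape[0]
    if n == 1:                           # base case: 1 x 1 matrices
        return A * B
    h = n // 2                           # split each matrix into four h x h blocks
    A00, A01, A10, A11 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B00, B01, B10, B11 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A00 + A11, B00 + B11)
    M2 = strassen(A10 + A11, B00)
    M3 = strassen(A00, B01 - B11)
    M4 = strassen(A11, B10 - B00)
    M5 = strassen(A00 + A01, B11)
    M6 = strassen(A10 - A00, B00 + B01)
    M7 = strassen(A01 - A11, B10 + B11)
    C = np.empty((n, n), dtype=A.dtype)
    C[:h, :h] = M1 + M4 - M5 + M7        # C00
    C[:h, h:] = M3 + M5                  # C01
    C[h:, :h] = M2 + M4                  # C10
    C[h:, h:] = M1 + M3 - M2 + M6        # C11
    return C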
Analysis (from textbook T1: Levitin et al.)
Here the basic operation is multiplication. If M(n) is the number of multiplications made by
Strassen's algorithm in multiplying two n × n matrices (where n is a power of 2), we get the
following recurrence relation for it:
M(n) = 7M(n/2) for n > 1, M(1) = 1.
Solving it by backward substitutions for n = 2^k yields M(n) = 7^k = 7^(log2 n) = n^(log2 7) ≈ n^2.807,
which is smaller than the n³ required by the brute-force algorithm.
8. Decrease and Conquer
In the decrease-by-a-constant variation, the size of an instance is reduced by the same
constant on each iteration of the algorithm. Typically, this constant is equal to one, although
other constant size reductions do happen occasionally. In the decrease-by-a-constant-factor
variation, the size of an instance is reduced by the same constant factor, typically by half.
Figure: Decrease-(by half)-and-conquer technique.
Example: Figure showing a digraph and its DFS forest (referenced below).
Note that a back edge in a DFS forest of a directed graph can connect a vertex to its parent.
Whether or not that is the case, the presence of a back edge indicates that the digraph has a
directed cycle. A directed cycle in a digraph is a sequence of three or more of its vertices that
starts and ends with the same vertex and in which every vertex is connected to its immediate
predecessor by an edge directed from the predecessor to the successor. For example, a, b, a is
a directed cycle in the digraph in the figure given above. Conversely, if a DFS forest of a
digraph has no back edges, the digraph is a dag, an acronym for directed acyclic graph.
Motivation: Consider a set of courses, some of which have other courses as prerequisites. In
which order should a student take the courses? The situation can be modeled by a digraph in
which vertices represent courses and directed edges indicate prerequisite requirements.
In terms of this digraph, the question is whether we can list its vertices in such an order that
for every edge in the graph, the vertex where the edge starts is listed before the vertex where
the edge ends. In other words, can you find such an ordering of this digraph's vertices? This
problem is called topological sorting.
Topological Sort: For topological sorting to be possible, the digraph in question must be a
DAG; i.e., if a digraph has no directed cycles, the topological sorting problem for it has a
solution.
There are two efficient algorithms that both verify whether a digraph is a dag and, if it is,
produce an ordering of vertices that solves the topological sorting problem. The first one is
based on depth-first search; the second is based on a direct application of the decrease-by-one
technique.
Topological Sorting based on DFS
Method
1. Perform a DFS traversal and note the order in which vertices become dead-ends
(i.e., are popped off the traversal stack).
2. Reversing this order yields a solution to the topological sorting problem, provided, of
course, no back edge has been encountered during the traversal. If a back edge has
been encountered, the digraph is not a DAG, and topological sorting of its vertices is
impossible.
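A Python sketch of the DFS-based method (our reconstruction; the graph representation as a
successor-list dictionary is an assumption):

def topo_sort_dfs(graph):
    # graph: dict mapping each vertex to the list of its successors.
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / being explored / dead end
    color = {v: WHITE for v in graph}
    order = []                            # vertices in dead-end (pop-off) order
    def dfs(v):
        color[v] = GRAY
        for w in graph[v]:
            if color[w] == GRAY:          # back edge found: there is a cycle
                raise ValueError("not a DAG: topological sorting impossible")
            if color[w] == WHITE:
                dfs(w)
        color[v] = BLACK
        order.append(v)                   # v has become a dead end
    for v in graph:
        if color[v] == WHITE:
            dfs(v)
    return order[::-1]                    # reverse the dead-end order

# Example: topo_sort_dfs({'C1': ['C3'], 'C2': ['C3'], 'C3': ['C4'],
#                         'C4': ['C5'], 'C5': []})
# returns ['C2', 'C1', 'C3', 'C4', 'C5'].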
Illustration
a) Digraph for which the topological sorting problem needs to be solved.
b) DFS traversal stack with the subscript numbers indicating the popping-off order.
c) Solution to the problem. Here we have drawn the edges of the digraph, and they all
point from left to right as the problem's statement requires. It is a convenient way to
check visually the correctness of a solution to an instance of the topological sorting
problem.
Topological Sorting based on Source Removal (decrease-by-one technique)
1. Repeatedly identify in the remaining digraph a source, which is a vertex with no
incoming edges, and delete it along with all the edges outgoing from it. (If there are
several sources, break the tie arbitrarily. If there are none, stop because the problem
cannot be solved.)
2. The order in which the vertices are deleted yields a solution to the topological sorting
problem.
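A Python sketch of the source-removal method (our reconstruction, using the same graph
representation as the DFS sketch above):

def topo_sort_source_removal(graph):
    # Decrease-by-one: repeatedly delete a source (a vertex of in-degree 0).
    indeg = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indeg[w] += 1
    sources = [v for v in graph if indeg[v] == 0]
    order = []
    while sources:
        v = sources.pop()                 # several sources: tie broken arbitrarily
        order.append(v)                   # deletion order is the solution
        for w in graph[v]:                # delete v's outgoing edges
            indeg[w] -= 1
            if indeg[w] == 0:             # w has become a source
                sources.append(w)
    if len(order) != len(graph):          # no source left but vertices remain
        raise ValueError("digraph has a cycle: problem cannot be solved")
    return order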
Illustration - An illustration of the source-removal algorithm for the topological sorting
problem is given here. On each iteration, a vertex with no incoming edges is deleted from the
digraph.
***