Empirical Analysis of Sorting and Searching Algorithms
Searching and Sorting
Searching algorithms
Linear Search (Iterative and Recursive)
Binary Search (Iterative and Recursive)
Sorting algorithms
Quick Sort (Iterative and Recursive)
Merge Sort (Iterative and Recursive)
Heap Sort (Iterative and Recursive)
Iterative Binary Search
int binarySearch(int A[], int n, int target)
{
    int low = 0, high = n - 1;
    while (low <= high)
    {
        int mid = (low + high)/2;
        if (target == A[mid])
        {
            return mid;
        }
        else if (target < A[mid])
        {
            high = mid - 1;
        }
        else {
            low = mid + 1;
        }
    }
    return -1;
}

Example array (indices 0 to 7): 5 8 12 18 24 36 48 56
First mid = (0+7)/2 = 3. Let us trace the comparison tree: the root is index 3, its children are indices 1 and 5, the next level holds indices 0, 2, 4, 6, and the last level holds index 7.
Target 18: 1 comparison needed
Target 36: 2 comparisons needed
Target 6: 3 comparisons needed
It can be observed from the analysis that the number of comparisons equals the height of the tree, i.e., log(n).
Recursive Binary Search
int binarySearch(int A[], int low, int high, int target)  ------ T(n)
{
    if (low > high) {
        return -1;
    }
    int mid = (low + high)/2;                             ---- constant time (1)
    if (target == A[mid]) {
        return mid;
    }
    else if (target < A[mid]) {
        return binarySearch(A, low, mid - 1, target);     ---- T(n/2)
    }
    else {
        return binarySearch(A, mid + 1, high, target);    ---- T(n/2)
    }
}

T(n) = 1             for n = 1
T(n) = T(n/2) + 1    for n > 1
Recursive Binary Search
Use the back-substitution method:
T(n) = T(n/2) + 1                  (i)
T(n) = [T(n/2^2) + 1] + 1
     = T(n/2^2) + 2                (ii)
T(n) = [T(n/2^3) + 1] + 2
     = T(n/2^3) + 3                (iii)
After k substitutions:
T(n) = T(n/2^k) + k                (iv)
Assume n/2^k = 1  =>  2^k = n
T(n) = T(1) + k                    (v)
Take log on both sides:
k log 2 = log n
k = log n                          (vi)
Put the value of k in equation (v):
T(n) = O(log n)
Recursive Merge Sort
Merge(A, p, q, r)
{
    n1 = q - p + 1;
    n2 = r - q;
    Let L[n1+1] and R[n2+1] be new arrays
    for (i = 1 to n1)
        L[i] = A[p+i-1]
    for (j = 1 to n2)
        R[j] = A[q+j]
    L[n1+1] = inf;
    R[n2+1] = inf;
    i = 1;
    j = 1;
    for (k = p to r)
        if L[i] <= R[j]
            A[k] = L[i];
            i = i + 1;
        else
            A[k] = R[j];
            j = j + 1;
}

void mergesort(A, p, r)           ----- T(n)
{
    if (p >= r)
        return;
    int q = floor((p+r)/2);
    mergesort(A, p, q);           ----- T(n/2)
    mergesort(A, q+1, r);         ----- T(n/2)
    merge(A, p, q, r);            ----- n
}

T(n) = 1               for n = 1
T(n) = 2T(n/2) + n     for n > 1
Analysis of Recursive Merge Sort
Time complexity of merge sort:
T(n) = T(n/2) + T(n/2) + n
Solving the above recurrence gives T(n) = O(n log n).
Iterative Merge Sort
void Imergesort(A, n)
{
    int p, l, h, q, i;
    for (p = 2; p <= n; p = p*2)
    {
        for (i = 0; i+p-1 < n; i = i+p)
        {
            l = i;
            h = i + p - 1;
            q = (l + h)/2;
            Merge(A, l, q, h);
        }
    }
    if (p/2 < n)
        Merge(A, 0, p/2 - 1, n - 1);
}

Example: 8, 3, 7, 4, 9, 2, 6, 5
Pass 1: 3, 8 | 4, 7 | 2, 9 | 5, 6
Pass 2: 3, 4, 7, 8 | 2, 5, 6, 9
Pass 3: 2, 3, 4, 5, 6, 7, 8, 9
It can be observed from the example that at each level n elements have to be merged and there are log n levels. So, the time complexity will be equal to O(n*log n).
Recursive Quick Sort
PARTITION(A, p, r)
{
    x = A[r]
    i = p - 1
    for j = p to r - 1
    {
        if A[j] <= x
        {
            i = i + 1
            exchange A[i] with A[j]
        }
    }
    exchange A[i + 1] with A[r]
    return i + 1
}

QUICKSORT(A, p, r)                    -------- T(n)
{
    if p < r
    {
        q = PARTITION(A, p, r)        -------- n
        QUICKSORT(A, p, q - 1)        -------- T(i)
        QUICKSORT(A, q + 1, r)        -------- T(n-i-1)
    }
}

Assume T(n) is the time complexity of quicksort for n integers. The recursive method uses the divide-and-conquer approach to sort elements.
Analysis of Recursive Quick Sort
T(n) = T(i) + T(n-i-1) + O(n)
In the worst case, T(i) = 0:
T(n) = T(0) + T(n-1) + n
Solving the above recurrence gives T(n) = O(n^2).
In the best case, the list is divided equally, or at least some elements lie on each side of the pivot:
T(n) = T(n/2) + T(n/2) + n   or   T(n) = T(9n/10) + T(n/10) + n
Solving the above recurrence gives T(n) = O(n log n).
Iterative Quick Sort
int partition(int arr[], int l, int h)
{
    int x = arr[h];
    int i = (l - 1);
    for (int j = l; j <= h - 1; j++)
    {
        if (arr[j] <= x)
        {
            i++;
            swap(arr[i], arr[j]);
        }
    }
    swap(arr[i + 1], arr[h]);
    return (i + 1);
}

void quickSortIterative(int arr[], int l, int h)
{
    int stack[h - l + 1];
    int top = -1;
    stack[++top] = l;
    stack[++top] = h;
    while (top >= 0)
    {
        h = stack[top--];
        l = stack[top--];
        int p = partition(arr, l, h);
        if (p - 1 > l) {
            stack[++top] = l;
            stack[++top] = p - 1;
        }
        if (p + 1 < h) {
            stack[++top] = p + 1;
            stack[++top] = h;
        }
    }
}
Counting Sort
• Counting sort is a sorting algorithm that sorts the elements of an array by counting the number of occurrences of each unique element in the array.
• This sort:
• Finds the count of every distinct element
• Calculates the position of each element in the sorted array
Weaknesses:
• Restricted inputs: counting sort only works when the range of potential items in the input is known ahead of time.
• Space cost: if the range of potential values is big, then counting sort requires a lot of space (perhaps more than O(n)).
Counting Sort
Consider data in the range 0 to 8.
Take a count array of size max+1 (8+1 = 9) to store the count of each unique element, and store the count of each element at its respective index in the count array.
Next, store the cumulative sum of the elements of the count array. It helps in placing the elements into the correct index of the sorted array.
To sort the elements, take an empty auxiliary array of the input size and perform the following:
1. Traverse the original array from right to left to get an input element, go to the index of the cumulative count array that equals the input element, and decrease that entry by 1 to get the element's position in the final array.
2. Repeat the first step until all elements of the input array are traversed.
Counting Sort
(Figure: step-by-step trace of the placement, one element per step, reading the input array from right to left, decrementing the matching entry of the cumulative count array, and writing the element into the final array at the resulting index.)
Counting Sort Algorithm
Time Complexity
Time taken to create an array of max range size ----- O(max)
Time taken to fill the frequencies of each element ----- O(size), i.e., O(n)
Time taken to find the cumulative array ----- O(max)
Time taken to iterate the cumulative count array and place the elements in the final array ----- O(size), i.e., O(n)
Overall complexity = O(max) + O(size) + O(max) + O(size) = O(max + size) = O(n + k)
Space Complexity = O(max)
Counting Sort for mixed numbers
The standard counting sort algorithm will not work for an array that contains mixed (positive and negative) elements. The following modification is needed to handle this:
Manually map the negative numbers to positive indexes in the count array, and then use the standard counting sort algorithm to sort them.
For example, consider an array with the elements -9, 1, 2, 7, 3, 6, -1.
The least element is -9, so take the index for -9 as 0 (i.e., -9 + 9), the index for 1 as 1 + 9 = 10, the index for 2 as 2 + 9 = 11, and so on.
By doing such index manipulation, negative numbers can be easily handled.
Sort -9, 1, 2, 7, 3, 6, -1, -9, -1 using counting sort.
Bucket Sort
Assumption:
the input is generated by a random process that distributes elements uniformly over [0, 1)
Idea:
Divide [0, 1) into n equal-sized buckets
Distribute the n input values into the buckets
Sort each bucket (e.g., using insertion sort or quicksort)
Go through the buckets in order and combine their elements
Input: A[1 .. n], where 0 ≤ A[i] < 1 for all i
Output: the elements of A in sorted order
Bucket Sort Algorithm
Alg.: BUCKET-SORT(A, n)   // n is the number of elements and also the number of buckets (10 in the examples)
    for i ← 1 to n
        do insert A[i] into list B[⌊n·A[i]⌋]
    for i ← 0 to n - 1
        do sort list B[i] with insertion sort or quicksort
    concatenate lists B[0], B[1], ..., B[n-1] together in order
    return the concatenated list
Example - Bucket Sort
A = [.79, .13, .64, .39, .20, .89, .53, .42, .06, .94]
Distribute into buckets B[0..9]:
B[0]: .06    B[1]: .13    B[2]: .20    B[3]: .39    B[4]: .42
B[5]: .53    B[6]: .64    B[7]: .79    B[8]: .89    B[9]: .94
Since the elements are uniformly distributed, each bucket has only one element. Just combine them from bucket 0 to n-1 to get the sorted sequence:
.06, .13, .20, .39, .42, .53, .64, .79, .89, .94
Bucket Sort: Another example
After distribution, sort within each bucket:
B[0]: (empty)
B[1]: .12, .17
B[2]: .21, .23, .26
B[3]: .39
B[4]: (empty)
B[5]: (empty)
B[6]: .68
B[7]: .72, .78
B[8]: (empty)
B[9]: .94
Example - Bucket Sort
Concatenate the lists from bucket 0 to n - 1 together, in order:
.12, .17, .21, .23, .26, .39, .68, .72, .78, .94
Analysis of Bucket Sort
Alg.: BUCKET-SORT(A, n)
    for i ← 1 to n
        do insert A[i] into list B[⌊n·A[i]⌋]                  -- O(n)
    for i ← 0 to n - 1
        do sort list B[i] with insertion sort or quicksort    -- O(n log n) or O(n^2)
    concatenate lists B[0], B[1], ..., B[n-1] together in order  -- O(k)
    return the concatenated list
Analysis of Bucket Sort
Best case: elements are uniformly distributed across the buckets, so no sorting within buckets is required:
T(n) = O(n + k) = O(n)
Worst case: elements are not uniformly distributed across the buckets (in the extreme, all land in one bucket, which must then be sorted):
T(n) = O(n) + O(n log n) + O(k)
     = O(n log n)
Bucket sort can also be used to sort an array that contains mixed elements. In that case, the range of elements varies from -1 to 1, and the standard algorithm is modified.
Bucket Sort for mixed numbers
sortMixed(arr[], n)
1) Split array into two parts
create two Empty vector Neg[], Pos[]
(for negative and positive element respectively)
Store all negative element in Neg[] by converting
into positive (Neg[i] = -1 * Arr[i] )
Store all +ve in pos[] (pos[i] = Arr[i])
2) Call function bucketSortPositive(Pos, [Link]())
Call function bucketSortPositive(Neg, [Link]()) and store
elements in reverse order with negative sign
bucketSortPositive(arr[], n)
3) Create n empty buckets (Or lists).
4) Do following for every array element arr[i].
a) Insert arr[i] into bucket[n*array[i]]
5) Sort individual buckets using insertion sort.
6) Concatenate all sorted buckets.
32
Radix Sort
• Radix sort is a non-comparative sorting algorithm that normally sorts elements in linear time. Radix sort can be implemented using one of the following:
Use queues/buckets to do a digit-by-digit sort, starting from the least significant digit and moving to the most significant digit.
Use counting sort as a subroutine to do a digit-by-digit sort, starting from the least significant digit and moving to the most significant digit.
• Radix sort falls in the category of stable sorting.
• Keys/elements in radix sort are represented as d-digit numbers in some base k:
key = x1x2...xd where 0 ≤ xi ≤ k-1
• Example: key = 15
key in base 10 = 15, d = 2, k = 10, where 0 ≤ xi ≤ 9
key in base 2 = 1111, d = 4, k = 2, where 0 ≤ xi ≤ 1
Radix Sort
• Assuming decimal elements and 10 buckets, we would put each element into the bucket associated with its units digit.
• The buckets are actually queues, so the elements are added at the end of the bucket.
• At the end of the pass, the buckets are combined in increasing order.
• On the second pass, we separate the elements based on the "tens" digit, and on the third pass we separate them based on the "hundreds" digit.
• Each pass must make sure to process the elements in order and to put the buckets back together in the correct order.
Radix Sort
Sorting looks at one digit position (column) at a time:
For a d-digit number, sort on the least significant digit first
Continue sorting on the next least significant digit, until all digits have been sorted
Requires only d passes through the list
radixSort(array)
    d ← maximum number of digits in the largest element
    create 10 buckets (for digits 0-9)
    for i ← 1 to d
        sort the elements according to the ith place digits using a stable sorting algorithm
Radix Sort
(Figure: worked example of the digit-by-digit passes.)
Analysis of Radix Sort
Each element is examined once for each of the digits it contains, so if the elements have at most d digits and there are n elements, this algorithm has time complexity of order O(d*n).
For the radix sort that uses counting sort as the intermediate stable sort, the time complexity is O(d(n+k)). Here, d is the number of passes and O(n+k) is the time complexity of counting sort.
Radix Sort for mixed numbers
Store the positive and negative numbers in separate arrays
Convert the negative numbers to positive and sort them using radix sort
Sort the positive numbers using radix sort
Combine them: first print the negative numbers from right to left with a negative sign, and then print all elements of the positive array
Insertion Sort
Insertion sort is an efficient algorithm for sorting a small number of elements. Insertion sort works the way many people sort cards by hand while playing a card game.
INSERTION-SORT(A):
    for j = 2 to A.length
        key = A[j]
        // Insert A[j] into the sorted sequence A[1..j-1]
        i = j - 1
        while i > 0 and A[i] > key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
Insertion Sort
Analysis of Insertion Sort:
Best case: the input array is already sorted.
Time complexity = O(n)
Worst case: occurs when the array is sorted in reverse order, i.e., in decreasing order.
Time complexity = O(n^2)
Since multiple keys with the same value are placed in the sorted array in the same order that they appear in the input array, insertion sort is stable.
Heap
The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.
In a heap, the root's key of any subtree is larger (max-heap) or smaller (min-heap) than the keys of its children.
Macro functions:
•PARENT(i): return floor(i/2)
•LEFT(i): return 2 * i
•RIGHT(i): return 2 * i + 1
Heap
Procedure functions:
•The MAX-HEAPIFY procedure, which runs in O(lg n) time, is the key to maintaining the max-heap property
•The BUILD-MAX-HEAP procedure, which runs in linear time, produces a max-heap from an unordered input array
•The HEAPSORT procedure, which runs in O(n lg n) time, sorts an array in place
•The MAX-HEAP-INSERT, HEAP-EXTRACT-MAX, HEAP-INCREASE-KEY and HEAP-MAXIMUM procedures all run in O(lg n) time.
Heap: HEAPIFY procedure
▪ The inputs for this procedure are an array A and an index i into the array. When HEAPIFY is called, it is assumed that the binary trees rooted at LEFT(i) and RIGHT(i) are heaps, but that A[i] may be smaller than its children, thus violating the heap property.
▪ The function of HEAPIFY is to let the value at A[i] "float down" in the heap so that the subtree rooted at index i becomes a heap.
MAX-HEAPIFY(A, i):
    l = LEFT(i)
    r = RIGHT(i)
    if l <= A.heap-size and A[l] > A[i]
        largest = l
    else largest = i
    if r <= A.heap-size and A[r] > A[largest]
        largest = r
    if largest != i
        exchange A[i] with A[largest]
        MAX-HEAPIFY(A, largest)
Heap: HEAPIFY procedure
Figure: The action of MAX-HEAPIFY(A, 2), where A.heap-size = 10
Heap: HEAPIFY procedure
Running Time:
Total work done by the MAX-HEAPIFY procedure to let a value float down in the given complete binary tree:
For this example, 2 comparisons are needed at each level, and this continues until l and r exceed A.heap-size. So, total work done = 2 * height of tree.
T(n) = 2*O(h) = O(log n)
Heap: Build_Heap procedure
Build_Heap converts A[1...n] to a max-heap.
First create a complete binary tree from the input array and find the leaf nodes. The leaf nodes already satisfy the max-heap property, since they appear at the last level. So, to convert the binary tree to a heap, apply the HEAPIFY procedure to the non-leaves.
The leaf nodes appear in a complete binary tree from index floor(A.length/2)+1 to A.length.
The non-leaf nodes appear from index floor(A.length/2) down to 1.
BUILD-MAX-HEAP(A):
    A.heap-size = A.length
    for i = floor(A.length / 2) downto 1
        MAX-HEAPIFY(A, i)
Heap: Build_Heap procedure
Running Time:
The time complexity of the MAX-HEAPIFY procedure is O(h). In the worst case, this procedure is applied at every non-leaf. It has been proven in the literature that in an n-element heap there are at most ceil(n/2^(h+1)) nodes of height h.
For example, with n = 15: number of nodes of height 1 = ceil(15/2^2) = 4.
Summing the O(h) work over all heights, the total work done (even if MAX-HEAPIFY were applied to the leaves too) is O(n).
Heap Sort
The heapsort algorithm starts with BUILD-MAX-HEAP to build a max-heap on the input array A[1 .. n], where n = A.length. Since the maximum element of the array is stored at the root A[1], we can put it into its correct final position by exchanging it with A[n].
HEAPSORT(A):
    BUILD-MAX-HEAP(A)
    for i = A.length downto 2
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)
Heap Sort
The HEAPSORT procedure takes time O(n log n), since the call to BUILD-MAX-HEAP takes time O(n) and each of the n - 1 calls to MAX-HEAPIFY takes time O(lg n).
Heap: Priority Queue
The heap data structure itself has many uses; one of the most popular applications of a heap is an efficient priority queue. A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.
A priority queue supports the following operations:
▪ EXTRACT-MAX(S) removes and returns the element of S with the largest key.
▪ HEAP-INCREASE-KEY implements the INCREASE-KEY operation.
▪ HEAP-DECREASE-KEY implements the DECREASE-KEY operation.
▪ Etc.