Searching and Selection
Searching Algorithms
Linear Search: Simple, examines each element sequentially.
Algorithm: LinearSearch(A, x)
for i ← 0 to A.Size() − 1 do
if A[i] == x then
Return i;
end
end
Return -1;
Runtime Analysis:
Worst-Case: Θ(n), when x is not in the array (or its first index is in Ω(n))
Best-Case: Θ(1), when x is the first element (or its first index is in O(1))
Average-Case: Usually Θ(n), but it depends on the assumed distribution of inputs (e.g., how likely x is to appear at each position, or to be absent).
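For reference, a minimal Python sketch of LinearSearch; the function name and the 0-indexed list A are our own conventions, not from the notes:

def linear_search(A, x):
    # scan left to right and return the first index holding x
    for i in range(len(A)):
        if A[i] == x:
            return i
    return -1          # x does not occur in A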
Binary Search: Efficient for sorted arrays, divides the search space in half.
Algorithm: BinarySearchRec(A, x, l = 0,r = n−1)
if l > r then
Return -1;
end
m ← ⌊(l + r)/2⌋;
if A[m] == x then
Return m;
end
if A[m] > x then
Return BinarySearchRec(A, x, l, m − 1);
end
// A[m] < x
Return BinarySearchRec(A, x, m + 1,r);
Algorithm: BinarySearch(A, x)
l ← 0;
r ← A.Size() − 1;
while l ≤ r do
m ← ⌊(l + r)/2⌋;
if A[m] == x then
Return m;
end
if A[m] > x then
r ← m − 1;
end
else
l ← m + 1;
end
end
Return -1;
Runtime Analysis:
Worst-Case: Θ(log n)
Best-Case: Θ(1), if the target x is in the first index we check (the middle).
Average-Case: Θ(log n)
Space Complexity:
The iterative variant requires Θ(1) auxiliary space for all cases.
The recursive variant generally requires Θ(log n) auxiliary space for the recursive calls. However,
many languages and compilers incorporate tail call elimination, which reduces the space
complexity to Θ(1) as well.
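A minimal Python sketch of the iterative BinarySearch, assuming A is a 0-indexed list sorted in ascending order (names are ours):

def binary_search(A, x):
    l, r = 0, len(A) - 1
    while l <= r:
        m = (l + r) // 2        # floor of the midpoint
        if A[m] == x:
            return m
        if A[m] > x:
            r = m - 1           # continue in the left half
        else:
            l = m + 1           # continue in the right half
    return -1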
Interpolation Search: Makes educated guesses based on the distribution of data.
Runtime Analysis:
Worst-Case: Θ(n)
Best-Case: Θ(1)
Average-Case: Θ(log log n) when the keys are (roughly) uniformly distributed; Θ(n) without such an assumption
Space Complexity:
The iterative variant is in Θ(1) like binary search.
The recursive variant requires Θ(n) auxiliary space in the worst-case, but it also
improves to Θ(1) with tail call elimination.
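The notes give no pseudocode for interpolation search, so the following is only a rough Python sketch, assuming integer keys and the standard linear-interpolation probe position:

def interpolation_search(A, x):
    # A must be sorted in ascending order
    l, r = 0, len(A) - 1
    while l <= r and A[l] <= x <= A[r]:
        if A[l] == A[r]:                     # all remaining keys are equal
            return l if A[l] == x else -1
        # guess where x should lie between A[l] and A[r]
        m = l + (x - A[l]) * (r - l) // (A[r] - A[l])
        if A[m] == x:
            return m
        if A[m] > x:
            r = m - 1
        else:
            l = m + 1
    return -1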
Selection Problem
SortSelect: Simple algorithm involving sorting the array. Θ(n log n) runtime with HeapSort, Θ(1) space.
HeapSelect: Uses a min-heap to find the k-th smallest element. Θ(n + k log n) runtime.
Algorithm: HeapSelect(A, k)
Min-Heapify(A);
repeat k times
A.DeleteMin ();
end
Return A.Root();
Runtime Analysis:
Worst-Case: Θ(n + k log n)
Best-Case: Θ(n), since Min-Heapify alone takes Θ(n) (e.g., when k is a constant)
Average-Case: Θ(n + k log n)
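A short Python sketch of HeapSelect using the standard-library heapq module; as in the pseudocode, k is treated as a 0-indexed rank (k deletions, then the root has rank k):

import heapq

def heap_select(A, k):
    heap = list(A)
    heapq.heapify(heap)          # Min-Heapify, Θ(n)
    for _ in range(k):           # k DeleteMin operations, O(k log n)
        heapq.heappop(heap)
    return heap[0]               # root now holds the element of rank k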
QuickSelect: Algorithm based on partitioning and pivoting, aiming to solve the selection problem
efficiently.
Algorithm: ChoosePivotFirst(A)
Return A[0];
Algorithm: Partition(A, p)
i ← 1, j ← n − 1;
while True do
while i < n and A[i] < p do i++;
while A[j] > p do j--;
if i ≥ j then Break;
else
Swap(A[i], A[j]);
i++, j--;
end
end
Swap(A[0], A[j]);
Return j;
Algorithm: QuickSelectFirst(A,k, l = 0,r = n − 1)
p ← ChoosePivotFirst(A);
pvind ← Partition(A, p);
if k == pvind then
// pivot has rank k
Return p;
end
if k < pvind then
// rank k in left partition
Return QuickSelectFirst(A, k, l, pvind − 1);
end
else
// rank k in right partition
Return QuickSelectFirst(A, k − (pvind + 1), pvind + 1, r);
end
QuickSelect Analysis
Worst-Case runtime: Θ(n²), e.g., on an already-sorted array when the first element is always chosen as the pivot.
Average-Case runtime: Θ(n), averaging over all input orderings (the pivot is equally likely to have any rank).
Best-Case runtime: Θ(n) if the element at rank k is the pivot after the first call to Partition, which runs in
Θ(n) time.
Randomized QuickSelect:
Modifies QuickSelect with random pivot selection. Expected runtime Θ(n).
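A Python sketch of randomized QuickSelect; it folds ChoosePivotRandom and the Partition routine above into a single iterative loop, which is our own restructuring of the recursive pseudocode:

import random

def quick_select(A, k):
    # returns the element of rank k (0-indexed); requires 0 <= k < len(A); modifies A
    l, r = 0, len(A) - 1
    while True:
        if l == r:
            return A[l]
        p = random.randint(l, r)            # random pivot index
        A[l], A[p] = A[p], A[l]             # move the pivot to the front, as Partition expects
        pivot = A[l]
        i, j = l + 1, r
        while True:                         # Partition from the notes, restricted to A[l..r]
            while i <= r and A[i] < pivot:
                i += 1
            while A[j] > pivot:
                j -= 1
            if i >= j:
                break
            A[i], A[j] = A[j], A[i]
            i += 1
            j -= 1
        A[l], A[j] = A[j], A[l]             # pivot ends up at index j, its final rank
        if k == j:
            return A[k]
        if k < j:
            r = j - 1                       # rank k lies in the left partition
        else:
            l = j + 1                       # rank k lies in the right partition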
Comparison-Based Sorting Algorithms
1. Sorting Problem:
Definition: Given a collection of data, the task is to arrange it in sorted order.
Formal definition: Given an input array A = [a_0, a_1, ..., a_{n−1}], return an output array B that is a permutation of A sorted in ascending order.
Applicability: Sorting is useful for speeding up searching (e.g., binary search) and selection.
2. Sorting Algorithms:
SelectionSort:
Idea: Find the smallest element, move it to the first index, repeat.
Complexity: O(n²) time; generally impractical.
Algorithm: SelectionSort(A)
for i ← 0 to A.Size − 1 do
mn ← i;
for j ← i + 1 to A.Size − 1 do
if A[j] < A[mn] then
mn ← j;
end
end
Swap A[i] with A[mn];
end
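A minimal Python rendering of SelectionSort:

def selection_sort(A):
    n = len(A)
    for i in range(n):
        mn = i
        for j in range(i + 1, n):    # find the smallest remaining element
            if A[j] < A[mn]:
                mn = j
        A[i], A[mn] = A[mn], A[i]    # move it to position i
    return A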
BubbleSort:
Idea: Scan array, swap adjacent elements if out of order, repeat.
Complexity: O(n²) time; still relevant in parallel computing.
Algorithm: BubbleSort(A)
for i ← 1 to A.Size − 1 do
for j ← 1 to A.Size − i do
if A[j] < A[j − 1] then
Swap A[j] with A[j − 1];
end
end
end
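A minimal Python rendering of BubbleSort:

def bubble_sort(A):
    n = len(A)
    for i in range(1, n):
        for j in range(1, n - i + 1):          # the last i - 1 elements are already in place
            if A[j] < A[j - 1]:
                A[j], A[j - 1] = A[j - 1], A[j]
    return A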
3. InsertionSort:
Idea: Maintain a sorted array, add elements one by one at appropriate positions.
Complexity: O(n²) worst-case, O(n) best-case (already sorted array); in-place (Θ(1) auxiliary space).
Algorithm: InsertionSort(A)
for i ← 1 to A.Size − 1 do
val ← A[i];
for j ← i downto 0 do
if j == 0 or A[j − 1] ≤ val then
A[j] ← val;
Break;
end
A[j] ← A[j − 1];
end
end
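A minimal Python rendering of InsertionSort:

def insertion_sort(A):
    for i in range(1, len(A)):
        val = A[i]
        j = i
        while j > 0 and A[j - 1] > val:    # shift larger elements one slot to the right
            A[j] = A[j - 1]
            j -= 1
        A[j] = val                         # insert val at its sorted position
    return A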
4. QuickSort:
Recursive sorting algorithm built on the Partition routine.
Expected time complexity: O(n log n); expected auxiliary space: O(log n) for the recursion stack.
Randomized pivot selection to avoid worst-case scenarios.
Algorithm: QuickSort(A, l = 0,r = n − 1)
if l ≥ r then
Return;
end
p ← ChoosePivotRandom(A);
pvind ← Partition(A, p);
QuickSort(A, l, pvind − 1);
QuickSort(A, pvind + 1,r );
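A Python sketch of in-place QuickSort with a random pivot, reusing the same partition scheme as in QuickSelect (names are ours):

import random

def quick_sort(A, l=0, r=None):
    # sorts A[l..r] in place
    if r is None:
        r = len(A) - 1
    if l >= r:
        return
    p = random.randint(l, r)                # ChoosePivotRandom
    A[l], A[p] = A[p], A[l]
    pivot = A[l]
    i, j = l + 1, r
    while True:                             # Partition restricted to A[l..r]
        while i <= r and A[i] < pivot:
            i += 1
        while A[j] > pivot:
            j -= 1
        if i >= j:
            break
        A[i], A[j] = A[j], A[i]
        i += 1
        j -= 1
    A[l], A[j] = A[j], A[l]
    quick_sort(A, l, j - 1)                 # sort the left partition
    quick_sort(A, j + 1, r)                 # sort the right partition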
5. MergeSort:
Divide-and-conquer algorithm, recursively divides array, then merges sorted subarrays.
Time complexity: O(n log n); space complexity: O(n).
Stable sorting algorithm, preserves relative order of equal elements.
Algorithm: Merge(A, B)
Initialize C as an empty output array;
i ← 0;
j ← 0;
while i < A.Size() or j < B.Size() do
if j == B.Size() or (i < A.Size() and A[i] ≤ B[j]) then
C.Append(A[i]);
i ← i + 1;
end
else
C.Append(B[j]);
j ← j + 1;
end
end
Return C;
Algorithm: MergeRec(A, lA, rA, B, lB ,rB)
// if one array is empty, return the other array
if lA > rA then
Return B[lB . . . rB];
end
if lB > rB then
Return A[lA . . . rA];
end
if A[lA] ≤ B[lB ] then
Return A[lA] concatenated with MergeRec(A, lA + 1,rA, B, lB, rB);
end
else
Return B[lB ] concatenated with MergeRec(A, lA, rA, B, lB + 1, rB);
end
Algorithm: MergeSort(A, l, r)
// base case: a range with at most one element is already sorted
if l ≥ r then
Return A[l . . . r];
end
m ← ⌊(l + r)/2⌋;
A[l . . . m] ← MergeSort(A, l, m);
A[m + 1 . . . r] ← MergeSort(A, m + 1, r);
Return Merge(A[l . . . m], A[m + 1 . . . r]);
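A Python sketch of MergeSort in its out-of-place form; it returns a new list instead of writing back into A, which keeps the sketch short:

def merge(A, B):
    # merge two sorted lists into one sorted list (stable: ties are taken from A first)
    C = []
    i = j = 0
    while i < len(A) or j < len(B):
        if j == len(B) or (i < len(A) and A[i] <= B[j]):
            C.append(A[i])
            i += 1
        else:
            C.append(B[j])
            j += 1
    return C

def merge_sort(A):
    if len(A) <= 1:
        return A
    m = len(A) // 2
    return merge(merge_sort(A[:m]), merge_sort(A[m:]))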
6. Stability of Sorting Algorithms:
Stable algorithms: MergeSort, InsertionSort, BubbleSort.
Unstable algorithms: QuickSort, HeapSort, SelectionSort.
Stability defined by the preservation of relative order for equal elements.
7. Choosing the Appropriate Algorithm:
No universal sorting algorithm; choice depends on requirements.
MergeSort for stability, HeapSort for space efficiency, QuickSort for practical speed.
InsertionSort for nearly sorted arrays, stable, and in-place.
8. Table of Comparison-Based Sorting Algorithms:
Algorithm       Runtime                 Space      Stable   Other
InsertionSort   Θ(n²) worst-case        Θ(1)       Yes      Good for nearly sorted arrays
HeapSort        Θ(n log n) worst-case   Θ(1)       No       Space-efficient
QuickSort       Θ(n log n) expected     Θ(log n)   No       Fast in practice, may perform poorly in the worst case
MergeSort       Θ(n log n)              Θ(n)       Yes      Stable
BubbleSort      Θ(n²)                   Θ(1)       Yes      Impractical
SelectionSort   Θ(n²)                   Θ(1)       No       Impractical
Conclusion:
No one-size-fits-all sorting algorithm.
Consider trade-offs between stability, runtime, and space complexity based on specific requirements.
Non-Comparison-Based Sorting Algorithms
1. Introduction:
Non-Comparison-Based Sorting:
Alternative to comparison-based sorting algorithms.
Specific requirements on array data for applicability.
2. BucketSort:
Algorithm: BucketSort(A, R)
B ← Array of R empty buckets (i.e., linked lists);
for i ← 0 to A.Size − 1 do
// element A[i] goes to the bucket with index key(A[i])
Add A[i] to the end of bucket B[Key(A[i])];
end
idx ← 0;
for i ← 0 to R − 1 do
foreach element x in Bucket B[i] do
A[idx] ← x;
idx + +;
end
end
Return A;
Distributes elements into one bucket per key value (R buckets in total).
The buckets are then concatenated in key order to produce the sorted output.
Complexity:
Runtime: Θ(n + R), where R is the range of key values.
Space: Θ(n + R).
Stability:
Stable sorting algorithm.
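A minimal Python sketch of this BucketSort variant; modeling Key as a caller-supplied function returning an integer in 0..R−1 is our own parameterization:

def bucket_sort(A, R, key=lambda x: x):
    buckets = [[] for _ in range(R)]
    for x in A:                        # stable: elements keep their input order inside a bucket
        buckets[key(x)].append(x)
    out = []
    for b in buckets:                  # concatenate buckets in key order
        out.extend(b)
    return out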
3. CountSort:
Algorithm: StableCountSort(A, R)
C ← Array of size R, initially all 0s;
// when we see A[i], increase the count for key(A[i])
for i ← 0 to A.Size − 1 do C[Key(A[i])]++;
P ← Array of size R;
// P[i] denotes first available index for key i
P[0] ← 0;
for i ← 1 to R − 1 do
P[i] ← P[i − 1] + C[i − 1];
B ← Array of size A.Size() for the Output;
for i ← 0 to A.Size − 1 do
k ← Key(A[i]);
idx ← P[k] ; // next available index for key k
B[idx] ← A[i];
P[k]++;
end
Return B;
Uses counts of elements to determine their final positions.
Complexity:
Runtime: Θ(n + R).
Space: Θ(n + R).
Stability:
Stable sorting algorithm.
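A Python sketch of StableCountSort under the same assumptions (key maps every element to an integer in 0..R−1):

def stable_count_sort(A, R, key=lambda x: x):
    C = [0] * R
    for x in A:                        # count occurrences of each key
        C[key(x)] += 1
    P = [0] * R                        # P[k] = first output index for key k
    for k in range(1, R):
        P[k] = P[k - 1] + C[k - 1]
    B = [None] * len(A)
    for x in A:                        # stable: equal keys are written in input order
        B[P[key(x)]] = x
        P[key(x)] += 1
    return B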
4. Space-Saving CountSort:
Algorithm: SpaceSavingCountSort(A, R)
C ← Array of size R, initially all 0s;
for i ← 0 to A.Size − 1 do C[Key(A[i])]++;
P ← Array of size R;
// P[i] denotes first available index for key i
P[0] ← 0;
for i ← 1 to R − 1 do
P[i] ← P[i − 1] + C[i − 1];
S ← Copy of P ; // S[i] stays fixed at the start index of key i's block
i ← 0 ; // changes start here
while i < A.Size() do
k ← Key(A[i]);
if S[k] ≤ i and i < P[k] then
i++ ; // A[i] was already placed in its block by an earlier swap
else if i == P[k] then
i++, P[k]++;
else
idx ← P[k];
Swap A[i] with A[idx];
P[k]++ ; // i remains the same here
end
end
Return A;
Swaps elements within the input array instead of using a separate output array.
Reduces space complexity but sacrifices stability.
Complexity:
Runtime: Θ(n + R).
Space: Θ(R).
Stability:
Unstable sorting algorithm.
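A Python sketch of the space-saving variant, including the fixed block-start array S used in the pseudocode above:

def space_saving_count_sort(A, R, key=lambda x: x):
    C = [0] * R
    for x in A:
        C[key(x)] += 1
    P = [0] * R                        # P[k] = next free slot in key k's block
    for k in range(1, R):
        P[k] = P[k - 1] + C[k - 1]
    S = P[:]                           # S[k] = fixed start index of key k's block
    i = 0
    while i < len(A):
        k = key(A[i])
        if S[k] <= i < P[k]:
            i += 1                     # A[i] was already placed here by an earlier swap
        elif i == P[k]:
            i += 1                     # A[i] already sits in its block's next free slot
            P[k] += 1
        else:
            idx = P[k]                 # send A[i] to its block; re-examine what gets swapped in
            A[i], A[idx] = A[idx], A[i]
            P[k] += 1
    return A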
5. Comparing BucketSort and CountSort:
Asymptotic Analysis:
Similar runtime and auxiliary space complexity.
CountSort tends to be faster in practice.
CountSort can be modified for in-place sorting.
6. LSD-RadixSort:
Algorithm: LSDRadixSort(A, R, d)
for j ← d downto 1 do
Sort A with a stable sort, defining Key(x) = j-th digit of original key of x;
end
Return A;
Sorts keys based on individual digits from the least significant to the most significant.
Complexity:
Runtime: Θ(d(n + R)), where d is the number of digits.
Space: Θ(n + R).
Stability:
Always stable.
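A Python sketch of LSD-RadixSort; for brevity it relies on Python's built-in sorted(), which is stable, in place of a per-digit CountSort, and it assumes the elements are integer keys in 0..R^d − 1:

def lsd_radix_sort(A, R, d):
    for j in range(d):
        # stable sort by the j-th least significant base-R digit; ties keep the order
        # established by the previous passes, which is what makes LSD-RadixSort correct
        A = sorted(A, key=lambda x: (x // R ** j) % R)
    return A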
7. MSD-RadixSort:
Algorithm: MSDRadixSort(A, R, d, j = 1, l = 0, r = A.Size − 1)
if l ≥ r or j > d then
Return A;
end
Sort A[l . . . r] (e.g., with CountSort), defining Key(x) = j-th digit of the original key of x;
for i ← 0 to R − 1 do
l′ ← index of the first value in A[l . . . r] whose j-th digit is i;
r′ ← index of the last value in A[l . . . r] whose j-th digit is i;
MSDRadixSort(A, R, d, j + 1, l′, r′);
end
Return A;
Sorts keys based on individual digits from the most significant to the least significant.
Generates subproblems for each digit.
Complexity:
Runtime: Θ(dnR), where d is the number of digits.
Space: Θ(d + n + R).
Stability:
Depends on the stability of the digit-sorting algorithm used.
8. LSD-RadixSort vs. MSD-RadixSort:
Advantages:
Radix sorts reduce the range of values handled in each pass from R^d down to R (one digit at a time).
LSD-RadixSort generally performs better in practice, but it requires a stable digit-sorting algorithm.
9. Conclusion:
Non-Comparison Based Algorithms:
Effective for specific scenarios.
Considerations for stability and suitability based on data characteristics.
10. Table of Non-Comparison-Based Sorting Algorithms:
Algorithm Overview:
Provides a summary of algorithms, their runtimes, space complexities, stability, and conditions of
use.
Algorithm                      Runtime     Space         Stable    Conditions
BucketSort / CountSort         Θ(n + R)    Θ(n + R)      Yes       Elements from a known set of size R
CountSort (Space-Saving)       Θ(n + R)    Θ(R)          No        Elements from a known set of size R
MSD-RadixSort (Bucket/Count)   Θ(dnR)      Θ(d + n + R)  Yes       Radix-R keys with d digits, e.g., 0 to R^d − 1
MSD-RadixSort (Space-Saving)   Θ(dnR)      Θ(d + R)      No        Radix-R keys with d digits, e.g., 0 to R^d − 1
LSD-RadixSort (Bucket/Count)   Θ(dn + dR)  Θ(n + R)      Yes       Radix-R keys with d digits, e.g., 0 to R^d − 1
MSD-RadixSort (Other)          O(dRT)      O(d + S)      Depends   Radix-R, d digits; any digit sort with runtime T and space S
LSD-RadixSort (Other)          O(dT)       O(S)          Yes       Radix-R, d digits; a stable digit sort with runtime T and space S