CSCE 3110
Data Structures &
Algorithm Analysis
Rada Mihalcea
http://www.cs.unt.edu/~rada/CSCE3110
Sorting (II)
Reading: Chap.7, Weiss
Today…
• Quick Review
• Divide and Conquer paradigm
• Merge Sort
• Quick Sort
• Two more sorting algorithms
• Bucket Sort
• Radix Sort
Divide-and-Conquer
• Divide and Conquer is a method of algorithm
design.
• This method has three distinct steps:
• Divide: If the input size is too large to deal with in
a straightforward manner, divide the data into two
or more disjoint subsets.
• Recur: Use divide and conquer to solve the
subproblems associated with the data subsets.
• Conquer: Take the solutions to the subproblems
and “merge” these solutions into a solution for the
original problem.
Merge-Sort
• Algorithm:
• Divide: If S has at leas two elements (nothing needs
to be done if S has zero or one elements), remove all
the elements from S and put them into two
sequences, S1 and S2, each containing about half of
the elements of S. (i.e. S1 contains the first ⎡n/2⎤
elements and S2 contains the remaining ⎣n/2⎦
elements.
• Recur: Recursive sort sequences S1 and S2.
• Conquer: Put back the elements into S by merging
the sorted sequences S1 and S2 into a unique sorted
sequence.
Merge-Sort Example
Running Time of Merge-Sort
• At each level in the binary tree created for Merge
Sort, there are n elements, with O(1) time spent at
each element
• 🡪 O(n) running time for processing one level
• The height of the tree is O(log n)
• Therefore, the time complexity is O(nlog n)
Quick-Sort
1) Divide : If the sequence S has 2 or more elements,
select an element x from S to be your pivot. Any
arbitrary element, like the last, will do. Remove all
the elements of S and divide them into 3 sequences:
L, holds S’s elements less than x
E, holds S’s elements equal to x
G, holds S’s elements greater than x
2) Recurse: Recursively sort L and G
3) Conquer: Finally, to put elements back into S in
order, first inserts the elements of L, then those of E,
and those of G.
Idea of Quick Sort
1) Select: pick an element
2) Divide: rearrange elements
so that x goes to its final
position E
3) Recurse and Conquer:
recursively sort
Quick-Sort Tree
In-Place Quick-Sort
Divide step: l scans the sequence from the left, and r from the right.
A swap is performed when l is at an element larger than the pivot and r is at one
smaller than the pivot.
In Place Quick Sort (cont’d)
A final swap with the pivot completes the divide step
Quick Sort Running Time
• Worst case: when the pivot does not divide the sequence in
two
• At each step, the length of the sequence is only reduced by 1
• Total running time
• General case:
• Time spent at level i in the tree is O(n)
• Running time: O(n) * O(height)
• Average case:
• O(n log n)
More Sorting Algorithms
• Bucket Sort
• Radix Sort
• Stable sort
• A sorting algorithm where the order of elements
having the same key is not changed in the final
sequence
• Is bubble sort stable?
• Is merge sort stable?
Bucket Sort
• Bucket sort
• Assumption: the keys are in the range [0,
N)
• Basic idea:
1. Create N linked lists (buckets) to divide interval
[0,N) into subintervals of size 1
2. Add each input element to appropriate bucket
3. Concatenate the buckets
• Expected total time is O(n + N), with n =
size of original sequence
• if N is O(n) 🡪 sorting algorithm in O(n) !
Bucket Sort
Each element of the array is put in one of the N “buckets”
Bucket Sort
Now, pull the elements from
the buckets into the array
At last, the sorted array
(sorted in a stable way):
Does it Work for Real Numbers?
• What if keys are not integers?
• Assumption: input is n reals from [0, 1)
• Basic idea:
• Create N linked lists (buckets) to divide interval [0,1) into
subintervals of size 1/N
• Add each input element to appropriate bucket and sort
buckets with insertion sort
• Uniform input distribution 🡪 O(1) bucket size
• Therefore the expected total time is O(n)
• Distribution of keys in buckets similar with …. ?
Radix Sort
• How did IBM get rich originally?
• Answer: punched card readers for census
tabulation in early 1900’s.
• In particular, a card sorter that could sort cards
into different bins
• Each column can be punched in 12 places
• (Decimal digits use only 10 places!)
• Problem: only one column can be sorted on at a
time
Radix Sort
• Intuitively, you might sort on the most
significant digit, then the second most
significant, etc.
• Problem: lots of intermediate piles of cards to
keep track of
• Key idea: sort the least significant digit first
RadixSort(A, d)
for i=1 to d
StableSort(A) on digit i
Radix Sort
• Can we prove it will work?
• Inductive argument:
• Assume lower-order digits {j: j<i}are sorted
• Show that sorting next digit i leaves array correctly
sorted
• If two digits at position i are different, ordering numbers
by that digit is correct (lower-order digits irrelevant)
• If they are the same, numbers are already sorted on the
lower-order digits. Since we use a stable sort, the
numbers stay in the right order
Radix Sort
• What sort will we use to sort on digits?
• Bucket sort is a good choice:
• Sort n numbers on digits that range from 1..N
• Time: O(n + N)
• Each pass over n numbers with d digits takes
time O(n+k), so total time O(dn+dk)
• When d is constant and k=O(n), takes O(n) time
Radix Sort Example
• Problem: sort 1 million 64-bit numbers
• Treat as four-digit radix 216 numbers
• Can sort in just four passes with radix sort!
• Running time: 4( 1 million + 216 ) ≈ 4 million
operations
• Compare with typical O(n lg n) comparison
sort
• Requires approx lg n = 20 operations per number
being sorted
• Total running time ≈ 20 million operations
Radix Sort
• In general, radix sort based on bucket sort is
• Asymptotically fast (i.e., O(n))
• Simple to code
• A good choice
• Can radix sort be used on floating-point
numbers?
Summary: Radix Sort
• Radix sort:
• Assumption: input has d digits ranging from 0 to k
• Basic idea:
• Sort elements by digit starting with least significant
• Use a stable sort (like bucket sort) for each stage
• Each pass over n numbers with 1 digit takes time
O(n+k), so total time O(dn+dk)
• When d is constant and k=O(n), takes O(n) time
• Fast, Stable, Simple
• Doesn’t sort in place
Sorting Algorithms:
Running Time
• Assuming an input sequence of length n
• Bubble sort
• Insertion sort
• Selection sort
• Heap sort
• Merge sort
• Quick sort
• Bucket sort
• Radix sort
Sorting Algorithms:
In-Place Sorting
• A sorting algorithm is said to be in-place if
• it uses no auxiliary data structures (however, O(1) auxiliary variables
are allowed)
• it updates the input sequence only by means of operations
replaceElement and swapElements
• Which sorting algorithms seen so far can be made to work in
place?