Analysis of Algorithms
Quicksort

Review: Quicksort
- Sorts in place
- Sorts O(n lg n) in the average case
- Sorts O(n²) in the worst case
  - But in practice, it's quick
  - And the worst case doesn't happen often (but more on this later…)
Quicksort
- Another divide-and-conquer algorithm
  - The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]
    - Invariant: all elements in A[p..q] are less than all elements in A[q+1..r]
  - The subarrays are recursively sorted by calls to quicksort
  - Unlike merge sort, no combining step: the two subarrays form an already-sorted array

Quicksort Code

Quicksort(A, p, r)
{
    if (p < r)
    {
        q = Partition(A, p, r);
        Quicksort(A, p, q);
        Quicksort(A, q+1, r);
    }
}
Partition
- Clearly, all the action takes place in the partition() function
  - Rearranges the subarray in place
  - End result:
    - Two subarrays
    - All values in the first subarray ≤ all values in the second
  - Returns the index of the "pivot" element separating the two subarrays
- How do you suppose we implement this?

Partition In Words
- Partition(A, p, r):
  - Select an element to act as the "pivot" (which?)
  - Grow two regions, A[p..i] and A[j..r]
    - All elements in A[p..i] <= pivot
    - All elements in A[j..r] >= pivot
  - Increment i until A[i] >= pivot
  - Decrement j until A[j] <= pivot
  - Swap A[i] and A[j]
  - Repeat until i >= j
  - Return j
- Note: slightly different from the book's partition()
Partition Code

Partition(A, p, r)
    x = A[p];
    i = p - 1;
    j = r + 1;
    while (TRUE)
        repeat
            j--;
        until A[j] <= x;
        repeat
            i++;
        until A[i] >= x;
        if (i < j)
            Swap(A, i, j);
        else
            return j;

- Illustrate on A = {5, 3, 2, 6, 4, 1, 3, 7}
- What is the running time of partition()?
  - partition() runs in O(n) time
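For reference, here is one way the pseudocode above could be rendered as runnable C. This is an added sketch, not the slides' own code: the names partition, quicksort, swap, and the main() driver on the slides' example array are illustrative.

#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Hoare-style partition from the slides: pivot is A[p]; returns j such
   that every element of A[p..j] is <= every element of A[j+1..r]. */
int partition(int A[], int p, int r)
{
    int x = A[p];        /* pivot value */
    int i = p - 1;
    int j = r + 1;
    for (;;) {
        do { j--; } while (A[j] > x);   /* decrement j until A[j] <= x */
        do { i++; } while (A[i] < x);   /* increment i until A[i] >= x */
        if (i < j)
            swap(&A[i], &A[j]);
        else
            return j;
    }
}

void quicksort(int A[], int p, int r)
{
    if (p < r) {
        int q = partition(A, p, r);
        quicksort(A, p, q);       /* note: q is included, matching the slides */
        quicksort(A, q + 1, r);
    }
}

int main(void)
{
    int A[] = {5, 3, 2, 6, 4, 1, 3, 7};   /* the slides' example array */
    int n = (int)(sizeof A / sizeof A[0]);
    quicksort(A, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", A[i]);              /* expected: 1 2 3 3 4 5 6 7 */
    printf("\n");
    return 0;
}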
Analyzing Quicksort
- What will be the worst case for the algorithm?
  - Partition is always unbalanced
- What will be the best case for the algorithm?
  - Partition is perfectly balanced
- Which is more likely?
  - The latter, by far, except...
- Will any particular input elicit the worst case?
  - Yes: already-sorted input

Analyzing Quicksort
- In the worst case:
  T(1) = Θ(1)
  T(n) = T(n - 1) + Θ(n)
- Works out to
  T(n) = Θ(n²)
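As an added sanity check (not on the original slide), unrolling the worst-case recurrence shows where the Θ(n²) comes from:

\begin{aligned}
T(n) &= T(n-1) + \Theta(n) \\
     &= T(n-2) + \Theta(n-1) + \Theta(n) \\
     &= \cdots \\
     &= \Theta(1) + \sum_{k=2}^{n}\Theta(k)
      = \Theta\!\Big(\sum_{k=1}^{n} k\Big)
      = \Theta\!\Big(\tfrac{n(n+1)}{2}\Big)
      = \Theta(n^2).
\end{aligned}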
Analyzing Quicksort
- In the best case:
  T(n) = 2T(n/2) + Θ(n)
- What does this work out to?
  T(n) = Θ(n lg n)
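An added note: this is the same recurrence as merge sort's, so the same argument applies. By the master theorem with a = 2, b = 2, and f(n) = Θ(n):

\[
f(n) = \Theta\big(n^{\log_b a}\big) = \Theta(n)
\;\Rightarrow\;
T(n) = \Theta\big(n^{\log_b a}\lg n\big) = \Theta(n \lg n)
\quad\text{(case 2).}
\]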
Improving Quicksort
- The real liability of quicksort is that it runs in O(n²) on already-sorted input
- The book discusses two solutions:
  - Randomize the input array, OR
  - Pick a random pivot element
- How will these solve the problem?
  - By ensuring that no particular input can be chosen to make quicksort run in O(n²) time
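A sketch (not from the slides) of the second fix, picking a random pivot: before partitioning, swap a uniformly random element of A[p..r] into position p, so the partition code above can keep using A[p] as the pivot. This reuses swap() and partition() from the earlier C sketch; the names randomized_partition and randomized_quicksort are illustrative.

#include <stdlib.h>   /* rand() */

/* Choose the pivot uniformly from A[p..r], move it to A[p], then reuse
   the deterministic partition() unchanged.
   (Assumes srand() has been called once, e.g. in main().) */
int randomized_partition(int A[], int p, int r)
{
    int k = p + rand() % (r - p + 1);   /* random index in [p, r] */
    swap(&A[k], &A[p]);                  /* pivot is now A[p] */
    return partition(A, p, r);
}

void randomized_quicksort(int A[], int p, int r)
{
    if (p < r) {
        int q = randomized_partition(A, p, r);
        randomized_quicksort(A, p, q);
        randomized_quicksort(A, q + 1, r);
    }
}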
Analyzing Quicksort: Average Case
- Assuming random input, average-case running time is much closer to O(n lg n) than O(n²)
- First, a more intuitive explanation/example:
  - Suppose that partition() always produces a 9-to-1 split. This looks quite unbalanced!
  - The recurrence is thus (using n instead of O(n) for convenience (how?)):
    T(n) = T(9n/10) + T(n/10) + n
  - How deep will the recursion go? (draw it)
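An added note on the "draw it" question: every root-to-leaf path shrinks the problem by at least a factor of 10/9, and each level of the recursion tree does at most n work in total, so

\[
\text{depth} \le \log_{10/9} n = \frac{\lg n}{\lg(10/9)} = \Theta(\lg n),
\qquad
\text{cost per level} \le n
\;\Rightarrow\;
T(n) = O(n \lg n).
\]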
Analyzing Quicksort: Average Case
- Intuitively, a real-life run of quicksort will produce a mix of "bad" and "good" splits
  - Randomly distributed among the recursion tree
  - Pretend for intuition that they alternate between best-case (n/2 : n/2) and worst-case (n-1 : 1)
  - What happens if we bad-split the root node, then good-split the resulting size (n-1) node?
    - We end up with three subarrays, of size 1, (n-1)/2, and (n-1)/2
    - Combined cost of the two splits = n + n - 1 = 2n - 1 = O(n)
    - No worse than if we had good-split the root node!
Analyzing Quicksort: Average Case
- Intuitively, the O(n) cost of a bad split (or 2 or 3 bad splits) can be absorbed into the O(n) cost of each good split
- Thus the running time of alternating bad and good splits is still O(n lg n), with slightly higher constants
- How can we be more rigorous?
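One way to make the alternating-splits intuition slightly more precise (an added sketch, in the spirit of the textbook's "lucky/unlucky" argument, not on the slides): let L(n) be the cost when the next split is good and U(n) the cost when it is bad. Then

\begin{aligned}
L(n) &= 2\,U(n/2) + \Theta(n) && \text{(good split now, bad splits one level down)} \\
U(n) &= L(n-1) + \Theta(n)    && \text{(bad split now, good splits one level down)} \\
\Rightarrow\; L(n) &= 2\big(L(n/2 - 1) + \Theta(n/2)\big) + \Theta(n)
                    = 2\,L(n/2 - 1) + \Theta(n)
                    = \Theta(n \lg n).
\end{aligned}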
Analyzing Quicksort: Average Case
- For simplicity, assume:
  - All inputs distinct (no repeats)
  - Slightly different partition() procedure
    - partition around a random element, which is not included in the subarrays
    - all splits (0:n-1, 1:n-2, 2:n-3, …, n-1:0) equally likely
- What is the probability of a particular split happening?
- Answer: 1/n
Analyzing Quicksort: Average Case
- So partition generates the splits (0:n-1, 1:n-2, 2:n-3, …, n-2:1, n-1:0), each with probability 1/n
- If T(n) is the expected running time,
  T(n) = (1/n) Σ_{k=0}^{n-1} [T(k) + T(n-1-k)] + Θ(n)
- What is each term under the summation for?
- What is the Θ(n) term for?
Analyzing Quicksort: Average Case
- So…
  T(n) = (1/n) Σ_{k=0}^{n-1} [T(k) + T(n-1-k)] + Θ(n)
       = (2/n) Σ_{k=0}^{n-1} T(k) + Θ(n)
  (Write it on the board)
- Note: this is just like the book's recurrence (p. 166), except that the summation starts with k = 0
- We'll take care of that in a second
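An added remark on the second line above: it holds because the two halves of the summation are the same sum read in opposite directions (substitute j = n - 1 - k):

\[
\sum_{k=0}^{n-1} T(n-1-k) = \sum_{j=0}^{n-1} T(j)
\quad\Rightarrow\quad
\frac{1}{n}\sum_{k=0}^{n-1}\big[T(k) + T(n-1-k)\big]
= \frac{2}{n}\sum_{k=0}^{n-1} T(k).
\]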