Algo
Alexandre Duret-Lutz
December 1, 2020

These are incomplete¹ lecture notes for the ‘‘ALGO’’ course taught to ING1 students at EPITA. This course studies the complexity of algorithms, and introduces students to several mathematical tools useful to analyze algorithms and derive complexity bounds.

¹ The latest version can be retrieved from https://www.lrde.epita.fr/~adl/ens/algo/algo.pdf. Please email me any correction to [email protected].
Contents

Mathematical Background 4
    Read this before the second lecture.
MergeSort 30
Exploring Recurrence Equations 31
★ Solving Recurrence Equations by Differentiation 32
Master Theorem for Recurrence Equations 33
Establishing Upper Bounds by Mathematical Induction 34
When Mathematical Induction on Recurrence Fails 35
More Examples of Complexities 36
Nearly Complete Binary Trees 36
Heaps 36
Heapify and BuildHeap 37
The Complexity of Heapify 38
The Complexity of BuildHeap 39
HeapSort 40
Partition 41
QuickSort 42
Worst and Best Cases for QuickSort 43
Average Complexity of QuickSort 44
QuickSort Optimizations 45
IntroSort, or How to Avoid QuickSort's Worst Case 46
QuickSelect 47
Average complexity of QuickSelect 48
Linear Selection 49
Space Complexity 50
In-place Algorithms 50
More Sort-Related Topics 51
Stable Sorting Techniques 51
Worst Case for Comparison Sort 52
Counting Sort 53
Radix Sort (LSD) 54
Radix Sort (MSD) 54
Further Reading 55
Notation

Although this document is written in English, it targets French students. As such, it mixes conventions from different origins. For instance, I prefer to write ⊆ and ⊂ (for the analogy with ≤ and <) rather than the ⊂ and ⊊ convention commonly used in France.
N                   the set of natural numbers, including 0: N = {0, 1, 2, . . .}
N+                  the set of natural numbers, excluding 0: N+ = {1, 2, 3, . . .}
Z                   the set of integers
R                   the set of real numbers
A ⊆ B               A is a subset of (and possibly equal to) B
A ⊂ B               A is a proper subset of B
log_a x             the logarithm of x in base a (page 5)
ln x                log_e x, the natural logarithm (page 5)
⌊x⌋                 the floor function: the largest integer less than or equal to x (page 6)
⌈x⌉                 the ceiling function: the smallest integer greater than or equal to x (page 6)
x^{\underline{n}}   the nth falling power of x: x^{\underline{n}} = x(x − 1)(x − 2) · · · (x − n + 1) (page 7)
(n choose k)        a binomial coefficient: there are (n choose k) = n^{\underline{k}}/k! ways to choose k items out of n (page 7)
O(f(n))             the set of functions that, up to some multiplicative factor, are dominated by f(n) asymptotically (pages 25–26)
Ω(f(n))             the set of functions that, up to some multiplicative factor, dominate f(n) asymptotically (pages 25–26)
Θ(f(n))             the set of functions that, up to some multiplicative factors, dominate and are dominated by f(n) asymptotically (pages 25–26)
f(n) ∼ g(n)         f(n) is asymptotically equivalent to g(n), i.e., lim_{n→∞} f(n)/g(n) = 1 (page 26)
iff                 if and only if
positive            (strictly) greater than zero (x > 0)
negative            (strictly) less than zero (x < 0)
non-positive        less than or equal to zero (x ≤ 0)
non-negative        greater than or equal to zero (x ≥ 0)
Mathematical Background

    ∑_{k=0}^{n−1} a_k    or, using a more general form,    ∑_{0≤k<n} a_k

The latter form has a couple of advantages over the former one.

• As this example demonstrates, we can use the semi-open interval² 0 ≤ k < n instead of the more tedious 0 ≤ k ≤ n − 1.

² As programmers you should learn to love semi-open intervals. Think for instance about how begin() and end() are used in the C++ standard library. If the interval was closed (i.e., if it included the value pointed to by end()) you would not be able to specify an empty range.

• We can also state arbitrary constraints below the ∑ sign. For instance, the sum of all odd numbers between 1 and 1000 can be written

    ∑_{1≤k<1000, k odd} k    or, less intuitively,    ∑_{k=0}^{499} (2k + 1).        (1)

To go from the first form to the second, substitute 2k + 1 for k and simplify the constraints:

    ∑_{1≤k<1000, k odd} k = ∑_{1≤2k+1<1000, 2k+1 odd} (2k + 1)
                          = ∑_{1≤2k+1<1000} (2k + 1)
                          = ∑_{0≤k<499.5} (2k + 1)
Logarithms

The logarithm in base a, i.e., the function x ↦ log_a x, is the reciprocal function of x ↦ a^x. Figure 1 shows a few examples.

N Figure 1: Various logarithms (log₂ x, ln x, log₁₀ x) and their reciprocal functions (2^x, e^x, 10^x). This figure is restricted to positive values because negative values will never occur in the analysis of algorithms.

It is common to write ln x = log_e x for the natural logarithm⁴, i.e., the logarithm in base e. But this natural logarithm will have almost no use to us. When analyzing algorithms, we will usually encounter log_a x for various integer values of a, and most often a = 2.

⁴ Trivia: x ↦ ln x is sometimes called Napierian logarithm after John Napier (known as Neper in France), despite the fact that the function he defined in 1614 was different. The natural logarithm was introduced by Nicolaus Mercator in 1668 using the series expansion of ln(1 + x). Finally around 1730 Leonhard Euler defined the functions e^x = lim_{n→∞}(1 + x/n)^n and ln x = lim_{n→∞} n(x^{1/n} − 1) and proved that they are the reciprocal of each other.

There is a simple algorithm for computing a logarithm in any base, using only elementary operations. This algorithm is also a perfect exercise to practice logarithms.

Let us compute log₁₀(1385) up to two decimal places. Because x ↦ log₁₀ x is the reciprocal function of x ↦ 10^x we know that log₁₀(1000) = log₁₀(10³) = 3 and log₁₀(10000) = log₁₀(10⁴) = 4. Furthermore, since 1000 < 1385 < 10000 and log₁₀ is an increasing function, it follows that 3 < log₁₀(1385) < 4. We are therefore looking for two digits a and b such that

    log₁₀(1385) = 3.ab . . .        (2)

To find a, we should subtract 3 from both sides, multiply everything by 10, and rework the left-hand side as a log₁₀:

    log₁₀(1385) − 3 = 0.ab . . .
    log₁₀(1385) − log₁₀(10³) = 0.ab . . .
    log₁₀(1385/1000) = 0.ab . . .
    log₁₀(1.385) = 0.ab . . .        (3)
    10 log₁₀(1.385) = a.b . . .
    log₁₀(1.385¹⁰) = a.b . . .        (4)
    log₁₀(25.9715419 . . .) = a.b . . .

Since 10¹ < 25.9715419 . . . < 10², we conclude that a = 1. Repeating the same process from the fractional part, log₁₀(2.59715419 . . .) = 0.b . . ., we raise 2.59715419 . . . to the 10th power and obtain 13962.955 . . . Since 10⁴ < 13962.955 . . . < 10⁵ we conclude that b = 4 and we have just computed that log₁₀(1385) ≈ 3.14.

You can adjust this algorithm to compute a logarithm in any base. Using paper and pen, the only difficult step is to compute x¹⁰.⁷ However, unless you plan to compute a lot of decimal places, you do not necessarily need a very precise result.

⁷ You can compute x¹⁰ using only 4 multiplications. Can you see how? Hint: x⁴ requires 2 multiplications.
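To make the procedure concrete, here is a small C sketch of this digit-by-digit method. It is only an illustration (in practice you would call log10() from <math.h>); the function name and its two-decimal limit are choices made for this example.

#include <stdio.h>

/* Compute log10(x) with two decimal places, mimicking the pen-and-paper
   method: extract the integer part, then repeatedly raise the remaining
   mantissa to the 10th power to obtain the next decimal digit.
   Assumes x >= 1. */
double log10_two_decimals(double x)
{
    double result = 0.0, digit_weight = 1.0;
    while (x >= 10.0) { x /= 10.0; result += digit_weight; }
    for (int d = 0; d < 2; ++d) {              /* two decimal places */
        double p = x;
        for (int i = 1; i < 10; ++i) p *= x;   /* p = x^10 */
        x = p;
        digit_weight /= 10.0;
        while (x >= 10.0) { x /= 10.0; result += digit_weight; }
    }
    return result;
}

int main(void)
{
    printf("%.2f\n", log10_two_decimals(1385.0));   /* prints 3.14 */
    return 0;
}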
Floor ⌊x⌋ and Ceiling ⌈x⌉ of x

Given a real number x, the notation ⌊x⌋ denotes the largest integer less than or equal to x. Conversely, ⌈x⌉ denotes the smallest integer greater than or equal to x. Figure 2 illustrates both functions, called respectively floor and ceiling.⁸ For instance ⌊π⌋ = 3 and ⌈π⌉ = 4. These two functions have no effect on integers: ⌊12⌋ = ⌈12⌉ = 12. In fact for any real number x we have:

    ⌊x⌋ ≤ ⌈x⌉
    ⌊x⌋ = ⌈x⌉ iff x ∈ Z
    1 + ⌊x⌋ = ⌈x⌉ iff x ∉ Z

N Figure 2: The functions ⌊x⌋ and ⌈x⌉.

⁸ The C standard library has two functions floor() and ceil() that round a double accordingly.

For any n ∈ Z and any x ∈ R the following properties hold⁹:

    ⌊x⌋ < n ⟺ x < n
    ⌈x⌉ ≤ n ⟺ x ≤ n
    n < ⌈x⌉ ⟺ n < x
    n ≤ ⌊x⌋ ⟺ n ≤ x
    ⌊x⌋ = n ⟺ x − 1 < n ≤ x ⟺ n ≤ x < n + 1
    ⌈x⌉ = n ⟺ x ≤ n < x + 1 ⟺ n − 1 < x ≤ n

⁹ Exercise: demonstrate ⌊−x⌋ = −⌈x⌉ using these properties.

For any n ∈ N we have n = ⌊n/2⌋ + ⌈n/2⌉. We can prove this equation by considering the parity of n. If n is even, ⌊n/2⌋ = ⌈n/2⌉ = n/2 and the equation holds trivially. If n is odd, then ⌊n/2⌋ = n/2 − 1/2 and ⌈n/2⌉ = n/2 + 1/2 so the sum is indeed n.

Rounding to the nearest integer can be done with ⌊x + 0.5⌋ or ⌈x − 0.5⌉ depending on how you want to round half-integers¹⁰.

¹⁰ i.e., values of the form n + 0.5 with n ∈ N. Should they be rounded down to n, or up to n + 1?

Now let us nest these rounding notations. It should be easy to see that ⌈⌈x⌉⌉ = ⌊⌈x⌉⌋ = ⌈x⌉ and ⌈⌊x⌋⌉ = ⌊⌊x⌋⌋ = ⌊x⌋, i.e., only the innermost rounding function matters. Furthermore, for any n ∈ N+, m ∈ N+ and x ∈ R we have¹¹:

    ⌊⌊x/n⌋/m⌋ = ⌊x/nm⌋
    ⌈⌈x/n⌉/m⌉ = ⌈x/nm⌉

¹¹ Note that these two equations are only good when n and m are integers. For instance ⌊⌊10/.3⌋/.3⌋ = 110 but ⌊10/.3/.3⌋ = 111.
The floor notation should be used any time we want to represent an integer division, for instance as in Figure 3.

int avg(int a, int b)
{
    return (a + b) / 2;
}

N Figure 3: If we ignore overflows, this function computes ⌊(a + b)/2⌋ because dividing an int by another int will always round the result towards zero.

When rounding logarithms you should know the following identity:

    ⌈log₂(n + 1)⌉ = ⌊log₂(n)⌋ + 1

To prove that, rewrite n as 2^m + p where m ∈ N and 0 ≤ p < 2^m. Then:

    ⌊log₂(n)⌋ + 1 = ⌊log₂(2^m (1 + p/2^m))⌋ + 1 = m + ⌊log₂(1 + p/2^m)⌋ + 1
                  = m + 0 + 1 = m + 1        (since 1 ≤ 1 + p/2^m < 2), and

    ⌈log₂(n + 1)⌉ = ⌈log₂(2^m (1 + (p + 1)/2^m))⌉ = m + ⌈log₂(1 + (p + 1)/2^m)⌉
                  = m + 1        (since 1 < 1 + (p + 1)/2^m ≤ 2).
Simple Combinatorics

• Assume you have a set S of n different letters. How many different words¹² of length k can we build using only letters from S, assuming we can use the same letter multiple times? There are n possible letters for each of the k positions in the word, so the number of choices is n × n × n × · · · × n (k terms) = n^k. See Figure 4.

¹² By word, we mean just any sequence of letters, not necessarily a meaningful word in some dictionary.

N Figure 4: Over the alphabet {a, b, c} there are 3¹ ways to build a 1-letter word, 3² ways to build a 2-letter word, and 3³ ways to build a 3-letter word. There is only 3⁰ = 1 way to build the empty word (denoted ε).

• What if we are only allowed to use each letter of S at most once? Then after we have selected the first letter among the n available, we are left with only n − 1 choices for the second letter, n − 2 for the third letter, etc. The number of words of length k ≤ n we can build without repeated letters is therefore

    n × (n − 1) × (n − 2) × · · · × (n − k + 1)   (k terms)   = n^{\underline{k}} = n!/(n − k)!

See Figure 5 for an example. The notation n^{\underline{k}}, with an underlined exponent, is the kth falling power of n: it works like a power except that its argument is decremented by one after each product.¹³ We can define the falling power recursively as n^{\underline{0}} = 1 and, for k > 0, n^{\underline{k}} = n × (n − 1)^{\underline{k−1}}. In particular we have n^{\underline{n}} = n!.

¹³ When both n and k are natural numbers such that k ≤ n, we have n^{\underline{k}} = n!/(n − k)!. However, the falling power can be used even when n is a complex number, or when k is larger than n, two cases that are not supported by the expression using factorials.

N Figure 5: Without repeating letters there are only 3^{\underline{1}} = 3 ways to build a 1-letter word, 3^{\underline{2}} = 3 × 2 ways to build a 2-letter word, and 3^{\underline{3}} = 3 × 2 × 1 ways to build a 3-letter word.

• Let us now build subsets of S that contain k letters. We could proceed as we did for building words of length k with unique letters: choosing the first letter among n, then the second among n − 1, etc. We can actually associate each word to a set. For instance, the word ab would correspond to the set {a, b}, the word bc to {b, c}. The problem is that this correspondence is not a one-to-one mapping: the word ba would also be mapped to the set {a, b} since sets are not ordered. For a given set with k letters, there are k^{\underline{k}} = k! different words. So the number of subsets of size k built from a set of size n is equal to the number of k-letter words we can build without repeating letters from n letters, divided by the k! ways to order these k letters:

    n^{\underline{k}} / k! = n! / ((n − k)! k!) = (n choose k)

N Figure 6: When the words of Figure 5 are converted to sets, the tree collapses into this lattice.

The number (n choose k), pronounced ‘n choose k’, is called binomial coefficient because it is the coefficient of x^k y^{n−k} in the polynomial expansion of the nth power of the binomial x + y:

    (x + y)^n = ∑_{k=0}^{n} (n choose k) x^k y^{n−k}

• What is the total number of subsets of S (of any size)? To build one subset, we iterate over each letter of S and decide whether we take it or not. We have 2 possibilities for each of the n letters, that makes 2^n different subsets. On the other hand, this number of subsets is also the sum of all subsets of different sizes, as computed in the previous paragraph. So we have

    ∑_{k=0}^{n} (n choose k) = 2^n

as illustrated by Figure 7.

N Figure 7: The sum of each line of Pascal's triangle is a power of 2.
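As a small illustration of these formulas (not part of the original notes), the C sketch below computes a binomial coefficient from the falling power, dividing by i at each step so the intermediate values stay small:

#include <stdio.h>

/* (n choose k) computed as the falling power n^(k) divided by k!,
   one factor at a time.  Each intermediate result is itself a
   binomial coefficient, so every division is exact. */
unsigned long long binomial(unsigned n, unsigned k)
{
    if (k > n) return 0;               /* outside Pascal's triangle */
    if (k > n - k) k = n - k;          /* symmetry: (n choose k) = (n choose n-k) */
    unsigned long long result = 1;
    for (unsigned i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

int main(void)
{
    /* the sum of one row of Pascal's triangle is a power of 2 */
    unsigned long long sum = 0;
    for (unsigned k = 0; k <= 7; ++k)
        sum += binomial(7, k);
    printf("%llu\n", sum);             /* prints 128 = 2^7 */
    return 0;
}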
Triangular Numbers

The numbers A_n = 0 + 1 + 2 + · · · + n = ∑_{k=0}^{n} k are called triangular numbers because they can be represented as in Figure 8.

N Figure 8: The first triangular numbers: A₀ = 0, A₁ = 1, A₂ = 3, A₃ = 6, A₄ = 10, A₅ = 15, A₆ = 21.

The equality A_n = n(n + 1)/2 can be demonstrated in several ways.

• Write the sum once in increasing order and once in decreasing order, and add the two lines term by term:

    A_n  = 0 + 1 + 2 + · · · + n
    A_n  = n + (n − 1) + (n − 2) + · · · + 0
    2A_n = n + n + n + · · · + n

Since there are n + 1 terms on the right-hand side of the last line, we find that 2A_n = n(n + 1). Figure 9 shows a graphical version of this demonstration.

N Figure 9: Another way to see that 2A_n = n(n + 1): there are A_n dots of each color arranged in an n by n + 1 rectangle.

• The previous demonstration is easily performed using the ∑ notation as well:

    2A_n = (∑_{0≤k≤n} k) + (∑_{0≤k≤n} k)

Replace k by n − k in the second sum:

    2A_n = (∑_{0≤k≤n} k) + (∑_{0≤n−k≤n} (n − k))
    2A_n = ∑_{0≤k≤n} (k + n − k) = ∑_{0≤k≤n} n = n(n + 1)

• As seen on page 7, there are (n+1 choose 2) subsets of length 2 in {0, 1, . . . , n}. Let {x, y} be such a subset, and assume x < y. Let us count all subsets existing for the different values of x. If x = 0, there are n possible values for y; if x = 1 we have n − 1 possible values for y; etc. If x = n there is no value available for y. The sum of all these, n + (n − 1) + · · · + 0, just happens to be A_n. So we have

    A_n = (n+1 choose 2) = (n + 1)^{\underline{2}} / 2! = (n + 1)n / 2.

Figure 10 should therefore not be a surprise.¹⁶

N Figure 10: Triangular numbers form a diagonal of Pascal's triangle.

¹⁶ By convention (n choose k) = 0 when k > n or k < 0 (i.e., outside of Pascal's triangle) so our (n+1 choose 2) is also valid for A₀.
Tetrahedral Numbers

What happens when we sum all consecutive triangular numbers?

    B_n = A₀ + A₁ + · · · + A_n = ∑_{j=0}^{n} A_j = ∑_{j=0}^{n} ∑_{k=0}^{j} k

We get tetrahedral numbers, so called because stacking the triangles of Figure 8 gives you a triangular pyramid as shown in Figure 11.

N Figure 11: The first tetrahedral numbers: B₀ = 0, B₁ = 1, B₂ = 4, B₃ = 10, B₄ = 20, B₅ = 35, B₆ = 56.

The closed formula is B_n = n(n + 1)(n + 2)/6 and there are again a couple of ways to prove it.¹⁷

¹⁷ Do not confuse this formula with C_n = n(n + 1)(2n + 1)/6 from page 10.

• Induction is still a possible option. The key step is that B_n = B_{n−1} + A_n = (n − 1)n(n + 1)/6 + n(n + 1)/2 = n(n + 1)(n + 2)/6.

• Note that the above formula (n − 1)n(n + 1)/6 + n(n + 1)/2 = n(n + 1)(n + 2)/6 is simply a long way to write (n+1 choose 3) + (n+1 choose 2) = (n+2 choose 3). You may find it easier to remember that B_n = (n+2 choose 3), forming another diagonal of Pascal's triangle (Figure 12).

N Figure 12: Tetrahedral numbers form another diagonal of Pascal's triangle. (Note that these sums implicitly start at 1, not 0 like in the rest of the page; do you see why it matters in this picture?)

Since each diagonal of Pascal's triangle is made of the partial sums of the previous diagonal, you should find it very easy to guess a formula for the sum of consecutive tetrahedral numbers:

    ∑_{k=0}^{n} k = (n+1 choose 2),    ∑_{j=0}^{n} ∑_{k=0}^{j} k = (n+2 choose 3),    ∑_{i=0}^{n} ∑_{j=0}^{i} ∑_{k=0}^{j} k = (n+3 choose 4).

• The above two points require you to know (or suspect) that B_n = n(n + 1)(n + 2)/6 or B_n = (n+2 choose 3) in order to prove it by induction. How can we find a closed formula for B_n if we do not know that? Looking at how balls are stacked in 3D in Figure 11, we can assume that B_n should represent some volume, i.e., a cubic polynomial. Or if you prefer a more mathematical view: A_j is a quadratic polynomial, so B_n, as the sum of n of these terms, should be expressible as a cubic polynomial. So we guess B_n = an³ + bn² + cn + d and we just need to evaluate this for a couple of values of n to find a, b, c, and d. Evaluating B₀ = 0 tells us that d = 0. From B₁ = 1, B₂ = 4, and B₃ = 10 we get:

    a + b + c = 1                 c = 1 − a − b              c = 1 − a − b            c = 2/6
    8a + 4b + 2c = 4     hence    6a + 2b = 2       hence    b = 1 − 3a       hence   b = 3/6
    27a + 9b + 3c = 10            24a + 6b = 7               6a + 6 = 7               a = 1/6

Thus we have found that (n³ + 3n² + 2n)/6, which happens to be equal to n(n + 1)(n + 2)/6, is a polynomial that will work for n = 0, 1, 2, 3, and we can prove by induction that it is correct for any n ∈ N.
Pyramidal Numbers

The numbers C_n = 0² + 1² + 2² + · · · + n² = ∑_{k=0}^{n} k² are called pyramidal numbers because they represent the number of spheres stacked in a pyramid with a square base, as shown in Figure 13.

N Figure 13: The first pyramidal numbers: C₀ = 0, C₁ = 1, C₂ = 5, C₃ = 14, C₄ = 30, C₅ = 55, C₆ = 91.

Unlike previous numbers, we will not give the closed formula directly. It seems remembering the formula is hard for many students, so maybe it is best to learn three ways to rediscover it.

• One way is to guess a cubic polynomial C_n = an³ + bn² + cn + d and to determine its coefficients from C₀, C₁, C₂, and C₃, as we did for tetrahedral numbers. Hence our polynomial is (2n³ + 3n² + n)/6 and without too much effort¹⁸ we can factorize it as n(n + 1)(2n + 1)/6.

¹⁸ Because two of the three roots are easy to find: 0 and −1.

By construction this formula is correct from C₀ to C₃. If we assume that C_{n−1} = (n − 1)n(2n − 1)/6, then C_n = C_{n−1} + n² = (n(2n² − 3n + 1) + 6n²)/6 = n(2n² + 3n + 1)/6 = n(n + 1)(2n + 1)/6. Hence by induction our formula is correct for all n ∈ N.
• Let us compute S = ∑_{i=0}^{n} ((i + 1)³ − i³) in two ways. First, we separate it in two sums which almost cancel each other out¹⁹:

    S = ∑_{i=0}^{n} (i + 1)³ − ∑_{i=0}^{n} i³ = ∑_{i=1}^{n+1} i³ − ∑_{i=1}^{n} i³ = (n + 1)³        (5)

¹⁹ Watch out for the indices in these two sums! The first sum is changed by replacing i by i − 1 and rewriting the range 0 ≤ i − 1 ≤ n into 1 ≤ i ≤ n + 1. In the second sum we just omit the first term, because it is equal to 0.

In a second approach, we develop the summand and express the result using the sums we already know:

    S = ∑_{i=0}^{n} (3i² + 3i + 1) = 3C_n + 3A_n + n + 1        (6)

Since (5) and (6) are two expressions for S, we get that 3C_n + 3A_n + n + 1 = (n + 1)³. Knowing a formula for A_n, we get 3C_n = (n + 1)((n + 1)² − (3/2)n − 1) hence C_n = n(n + 1)(2n + 1)/6.
The same two-direction trick used for triangular numbers also computes the partial sum D_n of an arithmetic progression with first term a and common difference b:

    D_n  = a + (a + b) + (a + 2b) + · · · + (a + nb)
    D_n  = (a + nb) + (a + (n − 1)b) + (a + (n − 2)b) + · · · + a
    2D_n = (2a + nb) + (2a + nb) + (2a + nb) + · · · + (2a + nb)

Hence 2D_n = (2a + nb)(n + 1). Figure 15 gives an example with a = 1 and b = 2.

N Figure 15: The sum O_n = ∑_{k=0}^{n} (2k + 1) of the first n + 1 odd numbers is such that 2O_n = (2n + 2)(n + 1), hence O_n = (n + 1)².

The above trick has a huge advantage over expressing D_n using A_n: it can be generalized very easily to any partial sum of an arithmetic progression. For instance, let us assume you want to sum all the terms 3 + 5i for 100 ≤ i ≤ 1000. Calling S the result, you would write

    S = ((a + vb) + (a + wb)) (w − v + 1)/2

with a = 3, b = 5, v = 100, and w = 1000; that is: the sum of the first and last terms, multiplied by half the number of terms.²¹

²¹ But do also remember that this is only valid for arithmetic progressions.
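As a sanity check, the C sketch below (illustrative only) compares this closed formula against a direct loop for the example above:

#include <stdio.h>

int main(void)
{
    /* arithmetic progression a + ib, summed for v <= i <= w */
    long a = 3, b = 5, v = 100, w = 1000;

    /* direct summation */
    long loop_sum = 0;
    for (long i = v; i <= w; ++i)
        loop_sum += a + i * b;

    /* (first term + last term) * (number of terms) / 2 */
    long formula = ((a + v * b) + (a + w * b)) * (w - v + 1) / 2;

    printf("%ld %ld\n", loop_sum, formula);   /* both values are equal */
    return 0;
}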
Recall the sum of a geometric progression, valid for r ≠ 1:

    ∑_{k=0}^{n} r^k = (1 − r^{n+1}) / (1 − r)        (7)

For r = 1, each term of the sum is 1^k = 1, so

    ∑_{k=0}^{n} 1^k = ∑_{k=0}^{n} 1 = n + 1,

which is consistent with the limit of the right-hand side of (7):

    lim_{r→1} (1 − r^{n+1})/(1 − r) = lim_{r→1} (−(n + 1)r^n)/(−1) = n + 1.

(Margin note: L'Hôpital's rule, lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x).)

★ Equation (7) can be used to rediscover the formula for Triangular Numbers (page 8). To transform ∑ r^k into ∑ k, we differentiate ∑ r^k with respect to r, giving ∑ k r^{k−1}, and then we set r = 1. Of course we must do these operations on both sides of (7), and we have to take a limit for r → 1 on the right:

    d/dr ∑_{k=0}^{n} r^k = d/dr ((1 − r^{n+1})/(1 − r))

    ∑_{k=1}^{n} k r^{k−1} = (−(n + 1)r^n (1 − r) + (1 − r^{n+1})) / (1 − r)²

    ∑_{k=1}^{n} k = lim_{r→1} (n r^{n+1} − (n + 1)r^n + 1) / (1 − r)²
                  = lim_{r→1} ((n + 1)n r^n − (n + 1)n r^{n−1}) / (2(r − 1))
                  = lim_{r→1} ((n + 1)n r^{n−1}(r − 1)) / (2(r − 1)) = (n + 1)n / 2

Similarly d/dr (r · d/dr ∑ r^k) = ∑ k² r^{k−1}, so by setting r = 1 we get the formula for the Pyramidal Numbers (page 10).²³

²³ Doing so is left as an exercise to the reader. If you survive the double differentiation and the computation of the limit, and obtain the expected n(n + 1)(2n + 1)/6, treat yourself with a well-earned lollipop.
★ Catalan Numbers

A Dyck word of length 2n is a string built using n opening parentheses and n closing parentheses, in such a way that a closing parenthesis always matches an opening one. For instance w₁ = ‘((()()))(())’ is a Dyck word, but w₂ = ‘(()))(())()(’ is not.

Let P_n be the number of Dyck words of length 2n. This integer sequence (Table 1) is known as the Catalan numbers²⁴.

²⁴ Named after Eugène Charles Catalan (1814–1894).

N Table 1: Number of Dyck words for various n, a.k.a. Catalan numbers.
    n    P_n    Dyck words
    0    1      ε (empty word)
    1    1      ()
    2    2      ()(), (())
    3    5      ()()(), ()(()), (())(), (()()), ((()))
    4    14     …
    5    42     …
    6    132    …

A string with n opening parentheses and n closing parentheses can be interpreted as a path on a square grid (Figure 17). Starting from the lower left corner and interpreting the letters ‘(’ and ‘)’ respectively as up and right, we necessarily reach the upper right corner. The number of paths that join the two corners using only n up and n right movements is (2n choose n): from the total of 2n movements we simply have to choose n which will be the ups. (Or if you prefer working with words: in a string of 2n characters we have to choose n positions among the 2n positions available to put the ‘(’ letters.)

N Figure 17: The words w₁ = ‘((()()))(())’ and w₂ = ‘(()))(())()(’ interpreted as paths on a grid. The letter ‘(’ is up, while ‘)’ is right. Dyck words correspond to paths that stay above the diagonal.

Not all these (2n choose n) paths correspond to Dyck words, only those that stay above the diagonal. To count the number of paths that do not correspond to Dyck words, let us consider the first segment of the path that goes below the diagonal, and flip all up and right movements afterwards (Figure 18). This is a reversible operation that can only be done on paths that do not represent a Dyck word. Since the resulting path has only n − 1 up movements, there are (2n choose n−1) words of length 2n that are not Dyck words. We have established that

    P_n = (2n choose n) − (2n choose n−1)        (8)
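Equation (8) translates directly into code. The following C sketch (an illustration, not part of the notes) prints the first Catalan numbers of Table 1, reusing a local binomial-coefficient helper:

#include <stdio.h>

/* binomial coefficient, computed as on page 7 */
static unsigned long long binom(unsigned n, unsigned k)
{
    if (k > n) return 0;
    if (k > n - k) k = n - k;
    unsigned long long r = 1;
    for (unsigned i = 1; i <= k; ++i)
        r = r * (n - k + i) / i;
    return r;
}

int main(void)
{
    /* P(n) = C(2n, n) - C(2n, n-1), equation (8) */
    for (unsigned n = 0; n <= 7; ++n) {
        unsigned long long p = binom(2 * n, n)
                             - (n == 0 ? 0 : binom(2 * n, n - 1));
        printf("P(%u) = %llu\n", n, p);   /* 1 1 2 5 14 42 132 429 */
    }
    return 0;
}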
For a non-decreasing function f, a sum of consecutive values of f can be bounded by two integrals:

    ∫_{−1}^{n} f(k) dk ≤ ∑_{k=0}^{n} f(k) ≤ ∫_{0}^{n+1} f(k) dk

Note that the length of the two integration intervals is equal to the number of terms in the sum.²⁷

N Figure 21: If f(i) is interpreted as an area between i − 1 and i, we have f(0) + · · · + f(n) ≥ ∫_{−1}^{n} f(k) dk.

²⁷ Using a semi-open interval for the sum, we can rewrite these inequalities using the same bounds for the sum and integrals:
    ∫_{0}^{n+1} f(k − 1) dk ≤ ∑_{0≤k<n+1} f(k) ≤ ∫_{0}^{n+1} f(k) dk

These inequalities come in handy to bound a sum that we do not know how to simplify. For instance, let us pretend that we do not know how to compute triangular numbers (page 8). We simply rewrite the above inequalities with f(k) = k:

    ∫_{−1}^{n} k dk ≤ ∑_{k=0}^{n} k ≤ ∫_{0}^{n+1} k dk

Since the antiderivative²⁸ of k is k²/2 we get:

    [k²/2]_{−1}^{n} ≤ ∑_{k=0}^{n} k ≤ [k²/2]_{0}^{n+1}

    (n² − 1)/2 ≤ ∑_{k=0}^{n} k ≤ (n + 1)²/2

²⁸ a.k.a. primitive

We do not have an exact value for this sum, but from these bounds we can at least derive some asymptotic equivalence²⁹:

    ∑_{k=0}^{n} k ∼ n²/2

²⁹ f ∼ g iff lim_{n→∞} f(n)/g(n) = 1
The same technique can bound log₂(n!) = log₂(2) + log₂(3) + · · · + log₂(n), using f(k) = log₂ k:³⁰

    ∫_{1}^{n} log₂ k dk ≤ ∑_{k=2}^{n} log₂(k) ≤ ∫_{2}^{n+1} log₂ k dk

    [k log₂(k/e)]_{1}^{n} ≤ log₂(n!) ≤ [k log₂(k/e)]_{2}^{n+1}

    n log₂ n − n log₂(e) + log₂(e) ≤ log₂(n!) ≤ (n + 1) log₂(n + 1) − (n + 1) log₂(e) − 2 log₂(2/e)

³⁰ A formula for the antiderivative that works in any base: the antiderivative of log_a(x) is x log_a(x/e).

From that we easily conclude log₂(n!) ∼ n log₂ n. A sorting algorithm that performs in the order of n log₂ n operations is actually pretty good.³¹

³¹ On page 52 we will demonstrate that any sorting algorithm that uses comparisons to order values requires at least log₂(n!) comparisons in the worst case.

For a more precise tie between sums and integrals, look up the Euler-Maclaurin formula in your preferred encyclopedia.
Figure 22 shows the values ⌊log₂ k⌋ for 1 ≤ k ≤ 16 as a staircase. Counting the area below the staircase as the area of the enclosing rectangle minus the area above the staircase, we have

    ∑_{k=1}^{16} ⌊log₂ k⌋ = 17 × 4 − ∑_{k=1}^{4} 2^k.

N Figure 22: The values ⌊log₂ k⌋ for k = 1, . . . , 16; the colored area below the staircase is ∑_{k=1}^{16} ⌊log₂ k⌋.

The same decomposition works for any n:

    ∑_{k=1}^{n} ⌊log₂ k⌋ = (n + 1)⌊log₂ n⌋ − ∑_{k=1}^{⌊log₂ n⌋} 2^k = (n + 1)⌊log₂ n⌋ + 1 − ∑_{k=0}^{⌊log₂ n⌋} 2^k
                        = (n + 1)⌊log₂ n⌋ + 2 − 2^{⌊log₂ n⌋+1}

Finally, F_n = n + ∑_{i=1}^{n} ⌊log₂ i⌋ = n + (n + 1)⌊log₂ n⌋ + 2 − 2^{⌊log₂ n⌋+1}.

Why do we care about such a function? Because ⌊log₂(n)⌋ + 1 is the number of bits required to represent the number n in binary, and many algorithms have a run time proportional to that. F_n is the sum of the bits needed to represent each number between 1 and n (see Figure 23).

N Figure 23: The first 16 positive integers with their binary representation. The ⌊log₂(i)⌋ + 1 bits needed to represent the number i are highlighted. If you omit the last column (a total of n bits), the two colored areas are the same as in Figure 22.

For instance, running a recursive implementation of BinarySearch (page 24) on an array of length n involves at most 2 + ⌊log₂(n)⌋ calls (the first one plus the recursive ones) to BinarySearch.

Now let us assume that you are doing a binary search to insert a new element into a sorted array, and that you do this in a loop, so that each binary search is applied to an array that has one entry more than the previous one. The total number of calls to BinarySearch (including recursive calls) will therefore have the form

    ∑_{k=i}^{j} (2 + ⌊log₂ k⌋)

where i and j depend on the initial size of the array and the number of iterations (i.e., binary search + insertion) performed.
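The closed formula for F_n is easy to cross-check numerically. The C sketch below (illustrative only; the bounds 100 and 200 are arbitrary example values, not taken from the notes) compares the direct sum with the closed form and also evaluates ∑ (2 + ⌊log₂ k⌋) for a run of successive insertions:

#include <stdio.h>

/* floor(log2(n)) for n >= 1, i.e. position of the left-most 1-bit */
static unsigned floor_log2(unsigned n)
{
    unsigned l = 0;
    while (n >>= 1) ++l;
    return l;
}

int main(void)
{
    unsigned n = 1000;

    /* F(n) = n + sum_{i=1}^{n} floor(log2(i)), by direct summation */
    unsigned long long direct = n;
    for (unsigned i = 1; i <= n; ++i)
        direct += floor_log2(i);

    /* closed formula: n + (n+1)*floor(log2 n) + 2 - 2^(floor(log2 n)+1) */
    unsigned L = floor_log2(n);
    unsigned long long closed =
        n + (unsigned long long)(n + 1) * L + 2 - (1ULL << (L + 1));

    /* total BinarySearch calls for insertions into arrays of sizes 100..200 */
    unsigned long long calls = 0;
    for (unsigned k = 100; k <= 200; ++k)
        calls += 2 + floor_log2(k);

    printf("%llu %llu %llu\n", direct, closed, calls);
    return 0;
}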
★ Finite Calculus

For those of you curious to learn new tricks, here is something called Finite Calculus. This page should by no means be understood as a reference on this subject; instead, consider it as a teaser³³.

³³ For a more serious presentation of Finite Calculus, I suggest you start with Finite Calculus: A Tutorial for Solving Nasty Sums, by David Gleich.

The idea is that a sum of k^{\underline{α}} over a half-open interval behaves like the integral of k^α over the same interval. So summing these falling powers³⁴ should be very natural (at least if you remember how to integrate). Compare the following equations:

³⁴ The falling power was defined on page 7 as k^{\underline{0}} = 1 and k^{\underline{α}} = k (k − 1)^{\underline{α−1}}.

    ∑_{0≤k<n} 1 = n                                       ∫_{0}^{n} 1 dk = n
    ∑_{0≤k<n} k = n^{\underline{2}}/2                     ∫_{0}^{n} k dk = n²/2
    ∑_{0≤k<n} k^{\underline{2}} = n^{\underline{3}}/3     ∫_{0}^{n} k² dk = n³/3
    ∑_{0≤k<n} k^{\underline{α}} = n^{\underline{α+1}}/(α + 1)     ∫_{0}^{n} k^α dk = n^{α+1}/(α + 1)

but the sum and integral of x^k do not actually exhibit that much similarity:

    ∑_{i≤k<j} x^k = (x^j − x^i)/(x − 1)                   ∫_{i}^{j} x^k dk = (x^j − x^i)/ln x
★ Comparison of Two Uniformly Distributed Values

Let us assume we have two dice x and y, returning uniformly-distributed integers such that 20 ≤ x ≤ 40 and 30 ≤ y ≤ 50. We want to compute the probability that x ≥ y. Maybe this is part of a role-playing game that we are developing, and this comparison will decide the outcome of a battle. Being able to figure this probability will allow us to fine-tune the ranges of x and y to balance the game.

We can represent the different values that x and y may take in two dimensions, as on Figure 28, and then it is just a matter of counting the dots:

    P(x ≥ y) = (∑_{y=30}^{50} ∑_{x=y}^{40} 1) / (∑_{y=30}^{50} ∑_{x=20}^{40} 1)

N Figure 27: When x and y are integers, the probability P(x ≥ y) is the number of highlighted dots divided by the total number of dots.

The constraint y ≤ x ≤ 40 on the inner sum of the numerator implies that the outer sum should be restricted to y ≤ 40:

    P(x ≥ y) = (∑_{y=30}^{40} ∑_{x=y}^{40} 1) / (∑_{y=30}^{50} ∑_{x=20}^{40} 1)
             = (∑_{y=30}^{40} (41 − y)) / (∑_{y=30}^{50} 21)
             = ((12 × 11)/2) / (21 × 21) = 66/441
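For this kind of question, brute-force enumeration is a convenient cross-check. The C sketch below (illustrative only) counts the dots of Figure 27 directly:

#include <stdio.h>

int main(void)
{
    /* x uniform in [20,40], y uniform in [30,50]: count pairs with x >= y */
    unsigned favorable = 0, total = 0;
    for (int y = 30; y <= 50; ++y)
        for (int x = 20; x <= 40; ++x) {
            ++total;
            if (x >= y) ++favorable;
        }
    printf("%u/%u\n", favorable, total);   /* prints 66/441 */
    return 0;
}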
Notes
SelectionSort

This is probably the simplest sorting algorithm to study.

N Figure 32: Loop invariant for SelectionSort: A[0..i − 1] holds the smallest i values, sorted; A[i..n − 1] holds the remaining n − i values, unsorted and all larger than or equal to those of the first part.

Given an array A containing n values to sort in increasing order, SelectionSort maintains the loop invariant depicted by Figure 32: at any iteration i, the values in the range A[0..i − 1] are the smallest i values of A in increasing order (i.e., this part is sorted and will not be changed), while the values in A[i..n − 1] are unsorted and larger than all the values in A[0..i − 1].

When i = 0 the array is completely unsorted, and when i = n the array is fully sorted. Actually we can stop after i = n − 2 because after this iteration the only ‘‘unsorted’’ value, A[n − 1], is necessarily the largest value in the array, so it is already at the right place.

To increase i while maintaining this invariant, all we need is to exchange A[i] with the minimum value of A[i..n − 1]. This gives us the following algorithm (illustrated by Figure 33):
SelectionSort(A, n)                              (executions)
1  for i ← 0 to n − 2 do                          n − 1
2      min ← i                                    n − 1
3      for j ← i + 1 to n − 1 do                  (n − 1)n/2
4          if A[j] < A[min]                       (n − 1)n/2
5              min ← j                            ≤ (n − 1)n/2
6      A[min] ↔ A[i]                              n − 1

N Figure 33: The colored arrows show the exchanges performed on line 6 by SelectionSort. Exactly n − 1 swaps are needed to sort an array of n elements.

Because line 1 iterates from 0 to n − 2, we can easily tell that lines 1, 2, and 6 will be executed n − 1 times. The other three lines are involved in two nested loops: for a given i the loop on line 3 will make (n − 1) + 1 − (i + 1) = n − i − 1 iterations. We have to sum this for all i using for instance the formula from page 11:

    ∑_{i=0}^{n−2} (n − i − 1) = ((n − 1) + 1)(n − 1)/2 = (n − 1)n/2

Finally line 5 should have an execution count that is at most (n − 1)n/2 since it is only executed if the previous comparison succeeds.
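For reference, a direct C translation of this pseudo-code might look as follows (a sketch for illustration, not code taken from the notes):

#include <stdio.h>

/* SelectionSort: at iteration i, exchange A[i] with the minimum of A[i..n-1]. */
void selection_sort(int A[], int n)
{
    for (int i = 0; i <= n - 2; ++i) {
        int min = i;
        for (int j = i + 1; j <= n - 1; ++j)
            if (A[j] < A[min])
                min = j;
        int tmp = A[min]; A[min] = A[i]; A[i] = tmp;
    }
}

int main(void)
{
    int A[] = { 2, 7, 1, 4, 6, 5, 8, 3 };
    int n = sizeof A / sizeof A[0];
    selection_sort(A, n);
    for (int i = 0; i < n; ++i)
        printf("%d ", A[i]);           /* 1 2 3 4 5 6 7 8 */
    printf("\n");
    return 0;
}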
SelectionSort performs (n − 1)n/2 comparisons (line 4) and n − 1 exchanges (line 6). Since the execution counts of all the other lines are also expressed using these quantities, we should be able to approximate⁴⁷ the total number of operations performed (or even the time to execute the algorithm) as a linear combination of these two quantities: a polynomial of the form an² + bn + c.

⁴⁷ This is not exactly true, because line 5 may not be proportional to any of those. However, because line 5 is simple and executes less often than the comparisons, it should have little influence in practice.

We can now predict the behavior of a SelectionSort implementation after measuring it on a few arrays of different sizes. For instance, if an implementation gives the following timings:

    size:  1,000     2,000     5,000     10,000    20,000    50,000    100,000   200,000
    time:  0.001166  0.004356  0.018224  0.052226  0.173569  0.921581  3.678394  14.70667

we can use a least-square regression to fit these points to the following quadratic polynomial as shown on Figure 34:

    3.676 · 10⁻¹⁰ × n²  −  4.422 · 10⁻⁸ × n  +  9.810 · 10⁻³
          (a)                   (b)                 (c)

Now we can estimate we need 24.5 minutes to sort 2,000,000 values.

>>> import numpy as np
>>> x=[1000,2000,5000,10000,
...    20000,50000,100000,200000]
>>> y=[0.001166,0.004356,
...    0.018224,0.052226,0.173569,
...    0.921581,3.678394,14.70667]
>>> p = np.polyfit(x, y, deg=2)
>>> print(p)
[ 3.67619503e-10 -4.42217128e-08
  9.81022492e-03]
>>> np.polyval(p, 2000000)/60
24.506656289555412

N Figure 34: Using numpy to find a quadratic polynomial that is a best fit (in a least-square sense) to our data set, and then predict the run time of our implementation on an input of size 2,000,000.
InsertionSort

N Figure 35: Loop invariant for InsertionSort: A[0..i − 1] contains sorted values, A[i..n − 1] contains unsorted values.

While InsertionSort's loop invariant (Figure 35) looks similar to the invariant of SelectionSort (page 21), it is actually more relaxed: there is no requirement for all the sorted values in A[0..i − 1] to be smaller than the unsorted values in A[i..n − 1]. We can start with i = 1 because a sub-array of size 1 is always sorted. To increase i, it is necessary to insert A[i] at the correct position in the sorted range, shifting some values to the right to make some room. The array will be sorted when we reach i = n, i.e., after the iteration for i = n − 1.

There are actually a couple of ways to implement the shift-to-insert procedure.⁴⁸ The pseudo-code below (illustrated by Figure 36) scans the sorted values from right to left, shifting right all values greater than the one we want to insert (stored in the variable key), until it finds a smaller value or the start of the array.

N Figure 36: Successive insertions performed by InsertionSort on the array (2, 7, 1, 4, 6, 5, 8, 3); at each step the value stored in key is inserted into the sorted prefix.
InsertionSort(A, n)                              (executions)
1  for i ← 1 to n − 1 do                          n − 1
2      key ← A[i]                                 n − 1
3      j ← i − 1                                  n − 1
4      while j ≥ 0 and A[j] > key do              ∑_{i=1}^{n−1} (t_i + 1)
5          A[j + 1] ← A[j]                        ∑_{i=1}^{n−1} t_i
6          j ← j − 1                              ∑_{i=1}^{n−1} t_i
7      A[j + 1] ← key                             n − 1

Here t_i denotes the number of values shifted right to insert A[i], i.e., the number of iterations of the while loop for a given i.

• The best case is when lines 5 and 6 are never executed.⁴⁹ In that case, t_i = 0 for all i, and line 4 is executed ∑_{i=1}^{n−1} (t_i + 1) = n − 1 times. The entire algorithm therefore executes a number of operations that is proportional to n − 1, i.e., it is a linear function.

⁴⁹ For this to occur, key (which contains A[i]) must always be larger than or equal to A[i − 1], i.e., A must already be sorted.
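Here is the same procedure in C (a sketch following the description above, not code taken from the notes); the comments tie the loop back to the line numbers used in the analysis:

/* InsertionSort: insert A[i] into the sorted prefix A[0..i-1],
   shifting greater values one position to the right. */
void insertion_sort(int A[], int n)
{
    for (int i = 1; i <= n - 1; ++i) {
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) {   /* line 4: t_i + 1 tests  */
            A[j + 1] = A[j];             /* line 5: one shift      */
            --j;                         /* line 6                 */
        }
        A[j + 1] = key;
    }
}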
Average-Case Analysis

Knowing the worst and best case complexities of some algorithm is important, but it does not really tell us how it behaves usually. This is where an average analysis can be helpful: if possible we would like to consider all possible inputs of size n, compute the number of operations performed on each of them, and average the result. This procedure is not really practical, because we are usually not able (or willing) to compute a complexity for each individual case. Therefore, we resort to statistics and probabilities, making some hypothesis on the distribution of inputs. Instead of averaging on all possible inputs, we will also usually consider only different possible shapes of inputs, maybe with different probabilities.

N Figure 37: Two different arrays, (2, 7, 1, 4, 6, 5, 8, 3) and (4, 10, 2, 6, 9, 8, 11, 5), that have the same input order will be handled similarly by the sorting algorithm.

Let us consider InsertionSort from page 22 again, and assume for simplicity that all the values in the array A are different. Although the two arrays of Figure 37 are different, from the point of view of the sorting algorithm they correspond to the same input order and they can be sorted with the exact same operations.

So instead of averaging InsertionSort over all inputs of size n, we will only consider all possible input orders. Each order can be given as a permutation π = (π₁, π₂, . . . , π_n) of {1, 2, . . . , n}, and there are n! such permutations possible. For instance, the input order of the two arrays in Figure 37 corresponds to the permutation (2, 7, 1, 4, 6, 5, 8, 3).
Given a permutation π, we say that (i, j) is an inversion if i < j and π_i > π_j. For instance, the permutation (2, 7, 1, 4, 6, 5, 8, 3) has 11 inversions: (1, 3), (2, 3), (2, 4), (2, 5), (2, 6), (2, 8), (4, 8), (5, 6), (5, 8), (6, 8), and (7, 8). Note that the sorted permutation (1, 2, 3, 4, 5, 6, 7, 8) contains no inversion, while the reverse permutation (8, 7, 6, 5, 4, 3, 2, 1) contains n(n − 1)/2 inversions⁵². At every iteration i of InsertionSort, when t_i values are shifted right to insert A[i] to their left, exactly t_i inversions are canceled. The total number of executions of line 5 of InsertionSort, i.e., ∑_{i=1}^{n−1} t_i, is therefore equal to the number of inversions in the input array.⁵³

⁵² This is the maximum number of inversions. Indeed, every inversion is a pair (i, j) satisfying i < j. There are n(n − 1) pairs, and half of them satisfy i < j.

⁵³ We counted 11 inversions for (2, 7, 1, 4, 6, 5, 8, 3), and you can check that there are indeed 11 blue arrows on Figure 36 (page 22).

Back to our average-case analysis. To count how many times line 5 will be executed on average⁵⁴ we only need to know the average number of inversions in a permutation of size n. For each permutation (π₁, . . . , π_i, . . . , π_j, . . . , π_n) that contains the inversion (i, j), there is a matching permutation, identical except that the values at positions i and j are exchanged, which does not contain this inversion. Each of the n(n − 1)/2 pairs (i, j) is therefore an inversion in exactly half of the n! permutations, so the average number of inversions, and hence the average number of executions of line 5, is n(n − 1)/4. The average complexity of InsertionSort is therefore a quadratic function.

⁵⁴ We should write this as E[∑_{i=1}^{n−1} t_i], that is, the expected sum of all t_i s.
BinarySearch

So far we have studied two iterative algorithms, but we should also know how to deal with recursive algorithms. As a very simple example, let us consider BinarySearch. This takes a sorted array A[b..e − 1], a value v, and returns the index where v is in A, or where it should be in case it is not.

BinarySearch(A, b, e, v)
1  if b < e then
2      m ← ⌊(b + e)/2⌋
3      if v = A[m] then
4          return m
5      else
6          if v < A[m] then
7              return BinarySearch(A, b, m, v)
8          else
9              return BinarySearch(A, m + 1, e, v)
10 else
11     return b

N Figure 38: Recursive calls to BinarySearch showing the evolution of b and e (and the calculated m) when searching for the value 7 in the array (2, 4, 5, 6, 8, 9, 10, 11): BinarySearch(A, 0, 8, 7), then BinarySearch(A, 0, 4, 7), BinarySearch(A, 3, 4, 7), and finally BinarySearch(A, 4, 4, 7), which returns 4.

The algorithm first checks whether the middle value A[m] is equal to v, otherwise it looks for v recursively in either A[b..m − 1] or A[m + 1..e − 1].
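In C, the same function could be written as follows (a sketch mirroring the pseudo-code, not code taken from the notes); it works on the semi-open range [b, e):

/* Return the index of v in the sorted range A[b..e-1], or the index
   where v should be inserted if it is absent. */
unsigned binary_search(const int A[], unsigned b, unsigned e, int v)
{
    if (b < e) {
        unsigned m = (b + e) / 2;       /* m = floor((b+e)/2) */
        if (v == A[m])
            return m;
        if (v < A[m])
            return binary_search(A, b, m, v);
        return binary_search(A, m + 1, e, v);
    }
    return b;
}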
Let us write n = e − b for the size of the array, and S(n) for the number of calls (including recursive calls) to BinarySearch needed to locate a value in the worst case. Clearly S(0) = 1 because calling BinarySearch with b = e will return immediately. For n ≥ 1, the worst-case scenario is when the value is never found, and the recursion always occurs on the larger of the two halves. Since one value has been removed, these halves have length ⌊(n − 1)/2⌋ and ⌈(n − 1)/2⌉ = ⌊n/2⌋, and the latter is the larger one. Therefore, in the worst case, the number of calls to BinarySearch satisfies

    S(n) = 1                     when n = 0,
    S(n) = 1 + S(⌊n/2⌋)          when n ≥ 1.

unsigned s(unsigned n)
{
    if (n == 0) return 1;
    return 1 + s(n/2);
}
N Figure 39: Straightforward, recursive implementation of S(n).

unsigned s(unsigned n)
{
    unsigned res = 1;
    while (n != 0)
    {
        ++res;
        n /= 2; // or n >>= 1
    }
    return res;
}
N Figure 40: Iterative implementation of S(n).

You can actually solve this recursive equation (i.e., find a formula for S(n)) as if you had to replace a recursive implementation of this function (Figure 39) by an iterative version. Every time S is called recursively, its argument is divided by two, and 1 is added to the result. We could do this in a simple loop, as in Figure 40. So S(n) is equal to 1 plus the number of times we need to perform an integer division of n by 2 to reach 0. This integer division is similar to a right shift by one bit, so S(n) is equal to 1 plus the number of bits needed to represent n.⁵⁷ In other words:

    S(n) = 2 + ⌊log₂(n)⌋ if n ≥ 1, and S(0) = 1

⁵⁷ The number (110010)₂ = 2⁵ + 2⁴ + 2¹ needs 6 bits, because its left-most 1-bit is at position 5 (counting from 0). We have 2⁵ ≤ m < 2⁶ hence 5 ≤ log₂(m) < 6. More generally, the position of the left-most 1-bit in the binary representation of any positive integer m is ⌊log₂(m)⌋, and since bits are numbered from 0, the number of bits needed to represent m in base 2 is 1 + ⌊log₂(m)⌋.

From this formula, we have a pretty good idea of the behavior of BinarySearch. Since the number of operations performed during each of these calls can be bounded by some constant c, the run time of BinarySearch cannot exceed c × S(n) in the worst-case scenario.⁵⁸

⁵⁸ Later, using notation introduced on page 25, we shall write that BinarySearch is a O(log n) algorithm for this reason.
algo 25
Ω( g(n)) = { f (n) | ∃c > 0, ∃n0 ∈ N, ∀n ≥ n0 , 0 ≤ cg(n) ≤ f (n)} N Figure 43: f (n) ∈ Ω( g(n)): after
some n0 the function f (n) is bounded
below by cg(n) for some constant c.
These definitions imply that Θ( g(n)) = O( g(n)) ∩ Ω( g(n)).
    λ = Θ(1)                                      λ = O(1)
    f(n) = Θ(f(n))                                f(n) = O(f(n))
    Θ(f(n)) + Θ(g(n)) = Θ(f(n) + g(n))            O(f(n)) + O(g(n)) = O(f(n) + g(n))
    Θ(f(n) + g(n)) = Θ(max(f(n), g(n)))           O(f(n) + g(n)) = O(max(f(n), g(n)))
    Θ(f(n)) · Θ(g(n)) = Θ(f(n) · g(n))            O(f(n)) · O(g(n)) = O(f(n) · g(n))
    Θ(λ f(n)) = Θ(f(n))                           O(λ f(n)) = O(f(n))

    if lim_{n→∞} f(n)/g(n) = ∞ then f(n) = Ω(g(n)) and f(n) ≠ Θ(g(n))

Note that lim_{n→∞} f(n)/g(n) = 0 is the definition of f(n) = o(g(n)). We actually have o(g(n)) ⊂ O(g(n)) \ Θ(g(n)). Similarly, people occasionally write f(n) = ω(g(n)) when lim_{n→∞} f(n)/g(n) = ∞, so that we have f(n) = o(g(n)) ⟺ g(n) = ω(f(n)) just like we have f(n) = O(g(n)) ⟺ g(n) = Ω(f(n)).

N Figure 44: Relation between o(g(n)), O(g(n)), Θ(g(n)), Ω(g(n)), and ω(g(n)). If the limit ℓ = lim_{n→∞} f(n)/g(n) exists, f(n) belongs to one of the round classes.

See Figure 44 for a Venn diagram showing how these different sets relate to each other.

Exercises. 1. Show that 1 + sin(n) + n is in Θ(n). 2. Show that for any a and any b > 0, the function (n + a)^b is in Θ(n^b). 3. Show that n + n sin(n) is in O(n) but is not in Θ(n). 4. Show that 2n + n sin(n) is in Θ(n). 5. Prove Θ(log_i n) = Θ(log_j n) for any i > 1 and j > 1.
SelectionSort(A, n)
1  for i ← 0 to n − 2 do              Θ(n)
2      min ← i                        Θ(n)
3      for j ← i + 1 to n − 1 do      Θ(n²)
4          if A[j] < A[min]           Θ(n²)
5              min ← j                O(n²)
6      A[min] ↔ A[i]                  Θ(n)
                               total: Θ(n²)
Merging two Sorted Sub-Arrays

The Merge procedure will be used on the next page to build MergeSort, a better sorting algorithm than what we have seen so far. Merge takes an array A and three indices i, j, and k, such that the values in the sub-array A[i..j − 1] are sorted (in increasing order), and the values in A[j..k − 1] are also sorted. The goal is to reorganize all these values so that A[i..k − 1] is sorted (Figure 45).

N Figure 45: Merge(A, i, j, k) takes two consecutive sorted sub-arrays A[i..j − 1] and A[j..k − 1] and reorders the entire range A[i..k − 1].

Merge(A, i, j, k)
1  ℓ ← i                                          Θ(1)
2  r ← j                                          Θ(1)
3  for b ← i to k − 1 do                          Θ(n) for n = k − i
4      if r = k or (ℓ < j and A[ℓ] ≤ A[r])        Θ(n)
5          B[b] ← A[ℓ]                            O(n)
6          ℓ ← ℓ + 1                              O(n)
7      else
8          B[b] ← A[r]                            O(n)
9          r ← r + 1                              O(n)
10 A[i..k − 1] ← B[i..k − 1]                      Θ(n)
                                           total: Θ(n)
MergeSort
‘‘Divide and conquer algorithms’’⁷⁹ are designed around the following idea: when faced with a complex problem, try to divide it into smaller sub-problems that are easier to solve, then combine their solutions. MergeSort divides the input array into two halves of size n/2 and recursively sorts those.⁸⁰ Once the two halves are sorted, the complete sorted array is built using the Merge procedure described on page 29. Of course the recursive calls to sort the arrays of size n/2 will probably divide the arrays into two arrays of size n/4. Eventually the recursion will stop on arrays of size 1: those are already sorted!

⁷⁹ We will discuss this class of algorithms in more details later.

Here is the pseudo-code for MergeSort. We assume that A, the array to be sorted between indices i (included) and j (excluded), will be modified in place. Figure 47 illustrates it.
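As a concrete reference, here is a C sketch of Merge and MergeSort following the descriptions above (an illustration, not the pseudo-code from the notes); both work in place on semi-open ranges, with a temporary buffer for the merge:

#include <stdlib.h>
#include <string.h>

/* Merge the two sorted sub-arrays A[i..j-1] and A[j..k-1] so that
   A[i..k-1] becomes sorted (cf. page 29). */
static void merge(int A[], unsigned i, unsigned j, unsigned k)
{
    unsigned n = k - i;
    int *B = malloc(n * sizeof *B);
    unsigned l = i, r = j;
    for (unsigned b = 0; b < n; ++b)
        if (r == k || (l < j && A[l] <= A[r]))
            B[b] = A[l++];
        else
            B[b] = A[r++];
    memcpy(A + i, B, n * sizeof *B);
    free(B);
}

/* Sort A[i..j-1] by sorting both halves recursively and merging them. */
void merge_sort(int A[], unsigned i, unsigned j)
{
    if (j - i <= 1)                /* 0 or 1 element: already sorted */
        return;
    unsigned m = (i + j) / 2;
    merge_sort(A, i, m);
    merge_sort(A, m, j);
    merge(A, i, m, j);
}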
Exploring Recurrence Equations

Let us first consider recurrence equations that do not involve the Θ, O, Ω notations. For instance, let M(n) denote the number of times line 4 of Merge (page 29) is executed while running MergeSort (page 30) on an array of length n. Since each call to Merge on a sub-array of length n executes line 4 exactly n times, we have:

    M(n) = 0                                  if n = 1
    M(n) = M(⌈n/2⌉) + M(⌊n/2⌋) + n            for n ≥ 2

At first, the mix of ⌈·⌉ and ⌊·⌋ might look intimidating. One can wonder if it would not be easier to solve equations such as

    M_floor(n) = 2 M_floor(⌊n/2⌋) + n    with M_floor(1) = 0
or  M_ceil(n)  = 2 M_ceil(⌈n/2⌉) + n     with M_ceil(1) = 0

We can write a small program (Figure 48) to compute the first values of these functions and plot them (Figure 49 on this page, and Table 3 on the next page).

#include <stdio.h>

unsigned m(unsigned n)
{
    if (n == 1) return 0;
    return m(n / 2) + m(n - n/2) + n;
}

unsigned m_floor(unsigned n)
{
    if (n == 1) return 0;
    return 2 * m_floor(n / 2) + n;
}

unsigned m_ceil(unsigned n)
{
    if (n == 1) return 0;
    return 2 * m_ceil(n - n / 2) + n;
}

int main()
{
    for (unsigned n = 1; n <= 256; ++n)
        printf("%5u %5u %5u %5u\n",
               n, m_floor(n),
               m(n), m_ceil(n));
}
N Figure 48: Computing M(n), M_floor(n), and M_ceil(n) to draw Figure 49.

N Figure 49: Plot of M(n), M_floor(n), and M_ceil(n), as computed in Figure 48.

What can we make from this plot? First, we obviously have M_floor(n) ≤ M(n) ≤ M_ceil(n) and this is easy to prove from our definitions. Then, these three functions coincide on values of n that are powers of 2: this should not be a surprise as ⌈·⌉ and ⌊·⌋ are useless in this case. If n = 2^m, solving any of these equations amounts to solving:

    M(2^m) = 2 M(2^{m−1}) + 2^m   if m ≥ 1,   and   M(2⁰) = 0

Dividing everything by 2^m, we have M(2^m)/2^m = M(2^{m−1})/2^{m−1} + 1, and we can iterate this definition until we reach M(2⁰):

    M(2^m)/2^m = M(2^{m−1})/2^{m−1} + 1 = M(2^{m−2})/2^{m−2} + 2 = M(2^{m−3})/2^{m−3} + 3 = · · · = M(2⁰)/2⁰ + m = m

So M(2^m) = m·2^m and since m = log₂ n it follows that M(n) = n log₂ n, but only if n is a power of two. How far is n log₂ n from M(n)? After writing another small program, we can plot Figure 50: M(n) appears closer to n log₂ n than M_floor and M_ceil are. From the same figure, we also easily see (this is not a proof) that all three functions satisfy (1/2) n log₂ n ≤ M(n) ≤ 2 n log₂ n, which means that they are all in Θ(n log n).

N Figure 50: The ratio between the three M functions and n log₂ n.

We will see later⁸¹ that as long as all we want is a complexity class (such as Θ(n log n)), we can usually ignore the ⌈·⌉ or ⌊·⌋ functions in this type of recurrence equations. However, if we need an exact solution these ⌈·⌉ or ⌊·⌋ functions do matter. Figure 49 leaves no doubt about that. On the next page, we show how to compute an exact solution for M(n).

⁸¹ cf. page 33
    M(n) = (n − 1) + ∑_{1≤i<n} (1 + ⌊log₂ i⌋)

Using the formula established on page 15,

    ∑_{i=1}^{n−1} (1 + ⌊log₂ i⌋) = n + 1 + n⌊log₂(n − 1)⌋ − 2^{⌊log₂(n−1)⌋+1}

So we can conclude⁸² that

    M(n) = 0                                                 if n = 1
    M(n) = 2n + n⌊log₂(n − 1)⌋ − 2^{⌊log₂(n−1)⌋+1}           if n ≥ 2

⁸² It is very easy to forget a ⌊·⌋ or a little −1 somewhere while making this kind of calculation. To detect such mistakes, I usually evaluate both formulas (here, the definition of M(n) at the top of the page, and the one at the bottom) on a handful of values, and check that they are the same (here, the values should be those given by Table 3).
N Figure 51: If T(n) = 2T(n/3) + Θ(n) we are in the third case of the theorem: the work performed by recursive calls diminishes exponentially fast, so only the initial Θ(n) matters.

• T(n) = 2T(n/2) + Θ(n): this is the recurrence satisfied by the complexity of MergeSort. So we have a = b = 2 and f(n) = Θ(n). We compute n^{log_b a} = n^{log₂ 2} = n¹, and we now have to check which case of the theorem applies. Is f(n) in O(n^{1−ε}) (case 1), in Θ(n¹) (case 2), or in Ω(n^{1+ε}) (case 3)? Obviously we are in the second case since f(n) = Θ(n) = Θ(n¹). We therefore conclude immediately that MergeSort has complexity T(n) = Θ(n log n).

• T(n) = T(⌊n/2⌋) + Θ(1). This is the worst case complexity of BinarySearch (page 24). We have a = 1, b = 2, and f(n) = Θ(1). n^{log₂ 1} = n⁰ = 1. Again in case 2, we conclude that T(n) = Θ(log n).
Establishing Upper Bounds by Mathematical Induction

Example: Assume we have

    T(n) = Θ(1)                 if n = 1
    T(n) = T(⌊n/2⌋) + Θ(1)      if n > 1

and we want to prove that T(n) = O(log n).

Our inductive hypothesis H_n is that T(n) ≤ c log₂ n. Clearly we cannot prove H₁ because T(1) is a constant necessarily larger than c log₂ 1 = 0. However T(2) = T(1) + Θ(1) = Θ(1) is a constant as well, so we can choose c large enough so that T(2) ≤ c log₂ 2 = c, and H₂ holds. Similarly, T(3) = T(2) + Θ(1) = Θ(1) is also a constant, so clearly H₃ holds if we keep c larger than these two constants.

Now for any n ≥ 4 let us assume that H₂, H₃, . . . , H_{n−1} hold and let us prove H_n. By definition of T(n) we have

    T(n) = T(⌊n/2⌋) + Θ(1) ≤ c log₂⌊n/2⌋ + Θ(1) ≤ c log₂(n/2) + Θ(1) = c log₂ n − c + Θ(1) ≤ c log₂ n

where the last step holds provided c is larger than the constant hidden in the Θ(1). Hence H_n holds, and T(n) = O(log n).
When Mathematical Induction on Recurrence Fails

From our experience, we guess that T(n) = O(n), so let's try to prove the hypothesis H_n: T(n) ≤ cn for some constant c. Clearly H₁ is true. So for any n > 1, let us assume that H₁, . . . , H_{n−1} hold, and use that to prove H_n:

    T(n) = 2T(⌊n/2⌋) + 1
    T(n) ≤ 2c⌊n/2⌋ + 1
    T(n) ≤ 2cn/2 + 1
    T(n) ≤ cn + 1

Unfortunately cn + 1 is not bounded by cn, so this does not establish H_n: the induction fails even though T(n) = O(n) is the right guess. The trick is to strengthen the hypothesis and prove H′_n: T(n) ≤ cn − 1 instead. Assuming H′₁, . . . , H′_{n−1} hold, we now get:

    T(n) = 2T(⌊n/2⌋) + 1
    T(n) ≤ 2(c⌊n/2⌋ − 1) + 1
    T(n) ≤ 2c⌊n/2⌋ − 1
    T(n) ≤ 2cn/2 − 1
    T(n) ≤ cn − 1

which proves H′_n, and a fortiori T(n) ≤ cn, i.e., T(n) = O(n).
Heapify and BuildHeap

The Heapify function is the main building block for the BuildHeap algorithm. Let A be an array of size m storing a nearly complete binary tree. Heapify takes the index i of a node whose left and right children are already known to be subtrees that satisfy the heap property, and it rearranges the values of i and its children so that the subtree rooted in i has the heap property. These conditions are illustrated by Figure 57.

N Figure 57: Pre- and post-conditions of Heapify. The input is a node i whose children subtrees are already known to satisfy the heap property. In the output the entire subtree rooted in i satisfies the heap property. This implies that A[i] in the output should be equal to max(A[i], A[ℓ], A[r]) in the input.

Note that if the left child ℓ of i satisfies the heap property, its value A[ℓ] is necessarily the maximum of the left subtree. Similarly, A[r] is the maximum of the right subtree. If A[i] is already greater than A[ℓ] and A[r], then the subtree rooted in i already satisfies the heap property. Otherwise, two of these three values have to be swapped: bringing the maximum at the top, and possibly destroying the heap property of one of the children (but this can be fixed recursively).

Heapify(A, i, m)
1  ℓ ← LeftChild(i)                       Θ(1)
2  r ← RightChild(i)                      Θ(1)
3  if ℓ < m and A[ℓ] > A[i]               Θ(1)
4      g ← ℓ                              O(1)
5  else
6      g ← i                              O(1)
7  if r < m and A[r] > A[g]               Θ(1)
8      g ← r                              O(1)
9  if g ≠ i                               Θ(1)
10     A[i] ↔ A[g]                        O(1)
11     Heapify(A, g, m)                   ?

N Figure 58: Execution of Heapify(A, 1, 11) on an example. Nodes colored in blue are roots of subtrees with the heap property. When running Heapify(A, 1, 11), A[1] is swapped with A[4] on line 10; the subtree of node 4 is then corrected by calling Heapify(A, 4, 11) recursively.

Figure 58 illustrates this algorithm on an example. Using Heapify to turn a complete binary tree into a heap is now quite easy: notice that all leaves already satisfy the heap property, so all we need is to call Heapify on the internal nodes, in a bottom-up way. Remember that the first leaf is at position ⌊n/2⌋ in the array, so the last internal node is just before.

BuildHeap(A, n)
1  for i from ⌊n/2⌋ − 1 down to 0:        Θ(n)
2      Heapify(A, i, n)                   ?
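In C, the two procedures might look like this (a sketch, not code taken from the notes), assuming the usual array layout where the children of node i sit at indices 2i + 1 and 2i + 2, which is consistent with the first leaf being at position ⌊n/2⌋:

/* Restore the heap property of the subtree rooted at i, assuming the
   subtrees of its two children already satisfy it.  The heap occupies
   A[0..m-1]. */
void heapify(int A[], unsigned i, unsigned m)
{
    unsigned l = 2 * i + 1;            /* LeftChild(i)  */
    unsigned r = 2 * i + 2;            /* RightChild(i) */
    unsigned g = i;                    /* index of the greatest value */
    if (l < m && A[l] > A[g])
        g = l;
    if (r < m && A[r] > A[g])
        g = r;
    if (g != i) {
        int tmp = A[i]; A[i] = A[g]; A[g] = tmp;
        heapify(A, g, m);              /* fix the child we disturbed */
    }
}

/* Turn A[0..n-1] into a max-heap by calling heapify on all internal
   nodes, bottom-up. */
void build_heap(int A[], unsigned n)
{
    for (unsigned i = n / 2; i-- > 0; )
        heapify(A, i, n);
}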
The Complexity of Heapify

Page 37 presents Heapify and BuildHeap, but does not give their complexity.

Heapify contains different execution branches. The most efficient scenario is obviously when g = i on line 9, because then no recursion occurs. In this case, Heapify executes in constant time.

For the recursive case, it is instructive to consider different ways to measure the size of the input.

• Heapify(A, i, m) will only work on nodes that belong to the subtree rooted in i. So we could use T_H(s) to denote the time complexity of Heapify on a subtree of s nodes. When Heapify recurses into one of the two children of i, how many nodes are left in the worst case? To answer that, look at Figure 60: because the last level of a heap is not necessarily full, the left subtree can actually have twice the number of nodes of the right one. The left subtree can actually have up to ⌈(2s − 1)/3⌉ = 2s/3 + O(1) nodes. We therefore have the following recurrence:

    T_H(s) ≤ Θ(1)                             if s = 1
    T_H(s) ≤ T_H(2s/3 + O(1)) + Θ(1)          if s > 1

from which it follows (for instance with the master theorem of page 33) that T_H(s) = O(log s).

N Figure 60: Worst case for the recursion of Heapify: the left subtree has slightly more than twice the number of nodes of the right subtree. If s = 3x + 2, the left subtree has 2x + 1 = (2s − 1)/3 nodes.

• Alternatively, we can use T_H(h) to denote the time complexity of Heapify on a subtree of height h. Each recursive call works on a subtree whose height is at least one smaller, and adds a constant amount of work, so:

    T_H(h) ≤ T_H(h − 1) + Θ(1)
    T_H(h) ≤ T_H(h − 2) + Θ(1) + Θ(1)
    ...
    T_H(h) ≤ T_H(0) + Θ(1) + · · · + Θ(1)    (h terms)
    T_H(h) ≤ (h + 1)Θ(1)
    T_H(h) ≤ Θ(h), i.e., T_H(h) = O(h)

Note that these two results, T_H(s) = O(log s) and T_H(h) = O(h), are compatible because h = Θ(log s) for complete binary trees.⁹⁴ We will use both expressions for T_H on the next page, to compute the complexity of BuildHeap.

⁹⁴ Exercise: Prove that any complete binary tree of s nodes has a height of exactly h = ⌊log₂ s⌋.
The Complexity of BuildHeap

BuildHeap(A, n)
1  for i from ⌊n/2⌋ − 1 down to 0:        Θ(n)
2      Heapify(A, i, n)                   ?

Having established the complexity of Heapify on page 38, we only need to answer one question before we can give the complexity T_BH(n) of running BuildHeap: ‘‘what is the cost of line 2?’’

• We can consider that in the worst case, Heapify runs on a subtree of n nodes. This is the case when it is called with i = 0, and the Heapify call then costs T_H(n) = O(log n). It costs less in the other iterations, but O(log n) already gives an upper bound anyway. Since there are ⌊n/2⌋ iterations, the total complexity can be expressed as follows:

    T_BH(n) = Θ(n) + ⌊n/2⌋ · O(log n)        (line 1 + line 2)
    T_BH(n) = Θ(n) + Θ(n) · O(log n)
    T_BH(n) = O(n log n)

However, that is a crude upper bound, because we considered that all calls to Heapify cost as much as the last one.

• In practice, Heapify is called on many small subtrees where it has constant cost. For instance, on all subtrees of height 1, Heapify costs T_H(1) = Θ(1). A more precise evaluation of line 2 would therefore account for the different sizes of each subtree considered. Let S(h, n) be the number of subtrees of height h in a heap of size n. We can express the complexity of BuildHeap as:

    T_BH(n) = Θ(n) + ∑_{h=1}^{⌊log n⌋} S(h, n) T_H(h)        (10)
              (line 1)     (line 2)

Indeed: we have S(h, n) subtrees of height h, the call to Heapify costs T_H(h) for each of them, and we are running Heapify on all subtrees with heights ranging from 1 (the nodes just above the leaves) to ⌊log n⌋ (for the root⁹⁵). Since S(h, n) ≤ n/2^h (Figure 61) and T_H(h) = O(h), the sum in (10) is bounded by a constant multiple of n ∑_{h≥1} h/2^h = 2n, so T_BH(n) = Θ(n) + O(n) = Θ(n).

⁹⁵ See the remark 94 on p. 38.

N Figure 61: The number of subtrees of height h in a complete binary tree of n nodes without missing nodes on the last level can be expressed as the number of nodes at depth d = ⌊log₂ n⌋ − h, that is 2^{⌊log₂ n⌋−h}. This value is smaller or equal to 2^{log₂(n)−h} = n/2^h. Now if the binary tree is nearly complete (i.e., it has missing nodes), n/2^h is still an upper bound of the number of subtrees with height h. So we conclude that S(h, n) ≤ n/2^h.
HeapSort

Sorting an array in ascending order using a max-heap is easy: once the heap has been built, its topmost value (i.e., the first value of the array) is the maximum. This maximum should therefore be moved to the end of the array. If we do that with an exchange, and now consider only the first n − 1 values to be part of the tree, we are in the situation depicted on Figure 62: calling Heapify on the root of this (restricted) tree is all we need to sift up its maximum value. This can be iterated to sort the entire array: each iteration places one new value at its correct place, and reorders the remaining heap.

HeapSort(A, n)
1  BuildHeap(A, n)                        Θ(n)
2  for i from n − 1 down to 1             Θ(n)
3      A[0] ↔ A[i]                        Θ(n)
4      Heapify(A, 0, i)                   O(n log n)?

The complexity of the first three lines of HeapSort should be quite obvious: we know the cost of BuildHeap from page 39, and line 3 is a constant-time operation repeated n − 1 times. That leaves us with the cost of line 4.

The first call to Heapify is done on an array of size n − 1, so its cost should be T_H(n − 1) = O(log(n − 1)) = O(log n) according to what we established on page 38. The following iterations will call Heapify on smaller arrays, so we can still use O(log n) as an upper bound, and claim that the sum of all these calls will cost (n − 1)O(log n) = O(n log n).
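Putting the pieces together in C (a sketch reusing the heapify and build_heap functions sketched on page 37, not code taken from the notes):

void heapify(int A[], unsigned i, unsigned m);   /* as sketched on page 37 */
void build_heap(int A[], unsigned n);

/* Sort A[0..n-1] in ascending order using a max-heap. */
void heap_sort(int A[], unsigned n)
{
    build_heap(A, n);
    for (unsigned i = n; i-- > 1; ) {            /* i = n-1 down to 1 */
        int tmp = A[0]; A[0] = A[i]; A[i] = tmp; /* move the maximum to the end */
        heapify(A, 0, i);                        /* heap now occupies A[0..i-1] */
    }
}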
It would be legitimate to ask whether we could get a better com- 4 7 4 7 4 3 0 2 8 9
plexity bound by being more precise when summing the costs of the
After Heapify( A, 0, 8)
different calls to Heapify like we did for BuildHeap on page 39.
0 7
Here the total work performed by all iterations of line 4 is
1 7 2 4
! !
n −1 n −1 n −1 n −1 3 4 4 4 5 3 6 0
∑ TH (i) = ∑ O(log i) = O ∑ log i = O log ∏ i
7 2 8 8 9 9
i =1 i =1 i =1 i =1
0 1 2 3 4 5 6 7 8 9
= O log((n − 1)!) (11) 7 7 4 4 4 3 0 2 8 9
Stirling’s formula is a powerful tool to simplify expressions involv- 2 7 4 4 4 3 0 7 8 9
ing factorials, if you can remember it. We have 7 4 4 2 4 3 0 7 8 9
√ n n 0 4 4 2 4 3 7 7 8 9
n! ∼ 2πn hence log2 (n!) ∼ n log2 n. 4 4 4 2 0 3 7 7 8 9
e
3 4 4 2 0 4 7 7 8 9
(For another way to obtain the equivalence on the right, see page 14.) 4 3 4 2 0 4 7 7 8 9
We can therefore return to equation (11) and simplify it: 0 3 4 2 4 4 7 7 8 9
n −1 4 3 0 2 4 4 7 7 8 9
∑ TH (i) = O
(n − 1) log(n − 1) = O(n log n) 2 3 0 4 4 4 7 7 8 9
i =1 3 2 0 4 4 4 7 7 8 9
Unfortunately, this result is not better than our original approxima- 0 2 3 4 4 4 7 7 8 9
2 0 3 4 4 4 7 7 8 9
tion. We conclude that HeapSort( A, n) runs in O(n log n).
Can you explain the fundamental difference between the loops of BuildHeap and HeapSort? Why is one O(n) and the other O(n log n)?

N Figure 62: Progression of HeapSort, starting from the entire heap (the margin shows the tree and the array after BuildHeap, then after each exchange A[0] ↔ A[i] and each subsequent call to Heapify).
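Continuing the earlier C++ sketch (same illustrative naming), HeapSort itself is a direct transcription of the pseudocode above.

// Sort A in ascending order: build a max-heap, then repeatedly move the
// current maximum to the end of the shrinking heap and repair the heap.
void heap_sort(std::vector<int>& A) {
  int n = static_cast<int>(A.size());
  build_heap(A, n);                   // Θ(n)
  for (int i = n - 1; i >= 1; --i) {  // n − 1 iterations
    std::swap(A[0], A[i]);            // Θ(1)
    heapify(A, 0, i);                 // O(log i) ⊆ O(log n)
  }
}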
Partition

The Partition algorithm is a building block for QuickSort. Partition reorders a given range of elements in an array, such that all the elements in the left-hand side of the range are smaller than those in the right-hand side as pictured by Figure 63. The resulting range does not need to be sorted.97

N Figure 63: Overview of the Partition algorithm. The range A[ℓ..r − 1] is reordered so that any value in A[ℓ..p − 1] is less than or equal to any value in A[p..r − 1]. The value p should be such that ℓ < p < r, ensuring that each part is non-empty. Note that the two parts may have different lengths.

97 Sorting A[ℓ..r − 1] would be one way to implement Partition(A, ℓ, r), but it would be less efficient.

One way to implement Partition is to choose a value, let's say x ← A[ℓ], and use it as a threshold to decide whether an element A[v] can belong to the left-hand part (if A[v] ≤ x) or to the right-hand part (if A[v] ≥ x).98 The following implementation of the reordering is often described as the ‘‘collapse the walls’’ technique. The walls are in fact two indices i and j starting at both ends of the range, and moving towards each other, exchanging values along the way.
Partition( A, ℓ, r )
1  x ← A[ℓ]                                                 Θ(1)
2  i ← ℓ − 1; j ← r                                         Θ(1)
3  repeat forever                                           O(n)
       // find a value that can go to the right-hand side
4      do i ← i + 1 until A[i] ≥ x                          Θ(n)
       // find a value that can go to the left-hand side
5      do j ← j − 1 until A[j] ≤ x                          Θ(n)
       // swap the two values unless the walls collapsed
6      if j ≤ i                                             O(n)
7          return i + (i = ℓ)                               Θ(1)
8      A[i] ↔ A[j]                                          O(n)

98 Note that elements equal to x can go to either side; this is on purpose.

N Figure 64: Execution of Partition on an example. In this case, the index returned is i, and the algorithm has (by chance!) reordered the range in two equal partitions.

The ‘‘repeat forever’’ loop might look daunting, but since lines 4 and 5 necessarily update i and j at each iteration of the main loop, it is guaranteed that eventually j ≤ i and the algorithm will terminate.

What is less obvious is that there are exactly two ways in which the algorithm may terminate. Either i = j (in this case A[i] = x), or i = j + 1 as in Figure 64. It is not possible for i to be larger than j + 1, because all the values to the left of i are less than or equal to x, so the loop decrementing j will stop as soon as it passes i.
The algorithm assumes that the range contains at least two values (r − ℓ ≥ 2). To argue that the returned value p satisfies ℓ < p < r, consider what it would take for this to be violated: to have p = ℓ, line 4 should be executed only once, which means that line 5 will execute until j = i = ℓ. However, in this case line 7 will return i + 1, not ℓ.
Finally, the Θ(n) complexity of Partition should be obvious after
we realize that because of the ‘‘collapsing walls’’ strategy, the sum of
the executions of lines 4 and 5 is at least n + 1 (if we end with i = j)
and at most n + 2 (if we end with i = j + 1).
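A possible C++ transcription of this procedure follows; the name partition_range and the use of a vector are illustrative, but the logic mirrors the pseudocode above.

#include <utility>   // std::swap
#include <vector>

// ‘‘Collapse the walls’’ partition of A[l..r-1]; assumes r - l >= 2.
// Returns an index p with l < p < r such that every value in A[l..p-1] is
// less than or equal to every value in A[p..r-1].
int partition_range(std::vector<int>& A, int l, int r) {
  int x = A[l];                   // pivot value
  int i = l - 1;
  int j = r;
  for (;;) {
    do ++i; while (A[i] < x);     // stop on a value that can go to the right
    do --j; while (A[j] > x);     // stop on a value that can go to the left
    if (j <= i)
      return i + (i == l);        // make sure the left part is not empty
    std::swap(A[i], A[j]);
  }
}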
QuickSort

QuickSort consists in calling Partition, and then recursively sorting the two parts it created, until we reach an array of length 1 (that does not need to be sorted).

QuickSort( A, ℓ, r )                         TQS(1)    TQS(n) for n = r − ℓ > 1
1  if r − ℓ > 1                               Θ(1)      Θ(1)
2      p ← Partition( A, ℓ, r )                         Θ(n)
3      QuickSort( A, ℓ, p)                              TQS(L)?      for L = p − ℓ
4      QuickSort( A, p, r )                             TQS(n − L)?

Figure 65 shows the effects of the different calls to Partition occurring while sorting an example array with QuickSort.

N Figure 65: The (ℓ, r) ranges and the state of the array before each call to Partition while sorting the example array [4, 2, 8, 7, 3, 4, 0, 7, 9, 4].
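For reference, a direct C++ transcription of this recursive scheme, reusing the partition_range sketch given with Partition (a top-level call would be quick_sort(A, 0, (int)A.size())):

// Sort A[l..r-1]: partition, then recursively sort both parts.
void quick_sort(std::vector<int>& A, int l, int r) {
  if (r - l > 1) {
    int p = partition_range(A, l, r);
    quick_sort(A, l, p);
    quick_sort(A, p, r);
  }
}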
The proof that QuickSort actually sorts the array can be done by induction on the length of the considered range. The induction hypothesis Hn is ‘‘for any range [ℓ..r − 1] of length r − ℓ = n, calling QuickSort( A, ℓ, r ) will sort all the elements in A[ℓ..r − 1]’’.

N Figure 66: Let us assume that Partition always puts L elements in the left part, and n − L elements in the right one.

It would therefore be tempting to express the complexity as the solution of

    TQS(n) = Θ(1)                               if n = 1
    TQS(n) = Θ(n) + TQS(L) + TQS(n − L)         if n > 1
• What if L = n/2? The equation becomes TQS(n) = Θ(n) + 2 TQS(n/2), the same recurrence as for MergeSort, and we know the solution is TQS(n) = Θ(n log n).
• What if L = n/10? For this 10%-90% scenario the equation is TQS(n) = Θ(n) + TQS(⌊n/10⌋) + TQS(⌈9n/10⌉).

  Figure 67 shows the shape of the recursion tree: each node is labeled by the length of the array passed to Partition. The shortest branch of the tree is the left one, where the range is always divided by 10: the height of this branch is log10 n. The longest branch is the right one, with height log10/9 n since the range is (slowly) divided by 10/9 at each recursive call. The work performed by Partition is proportional to the value displayed on each node of this tree, therefore the total cost of QuickSort is proportional to the sum of all the nodes of this tree. The sum of each of the first log10 n lines of this tree is necessarily n, so these lines sum to n log10 n. But the algorithm processes more than that. The total for each remaining line is less than n, so the sum of the whole tree is less than n log10/9 n. We therefore have

    Θ(n log10 n) ≤ TQS(n) ≤ Θ(n log10/9 n)    hence    TQS(n) = Θ(n log n).

  The same result holds if L = n/10000 or any other ratio.101

N Figure 67: Shape of the tree of the recursive calls to QuickSort in a scenario where Partition always makes a 10%-90% split.

101 The difference between the bad cases and the good cases discussed on this page is whether L is constant or whether it is proportional to n. The actual constant or ratio does not affect the resulting complexity class.
Average Complexity of QuickSort

Let us start again from the equation102

    T̂QS(n) = (1/(n−1)) ∑_{L=1}^{n−1} ( Θ(n) + T̂QS(L) + T̂QS(n − L) )

    T̂QS(n) = Θ(n) + (1/(n−1)) ( ∑_{L=1}^{n−1} T̂QS(L) + ∑_{L=1}^{n−1} T̂QS(n − L) )

    T̂QS(n) = Θ(n) + (2/(n−1)) ∑_{L=1}^{n−1} T̂QS(L)

To get rid of the Θ(n) notation, let us replace it by cn for some constant c, and write F(n) for the function satisfying the resulting recurrence:

    F(n) = cn + (2/(n−1)) ∑_{L=1}^{n−1} F(L)

102 The fact that T̂QS(1) = Θ(1) is
To get rid of the sum, we first multiply both sides by n − 1 to get rid
of the non-constant factor in front of the sum, and then subtract the
same expression for F (n − 1):
    (n − 1) F(n) = (n − 1)cn + 2 ∑_{L=1}^{n−1} F(L)

    (n − 2) F(n − 1) = (n − 2)c(n − 1) + 2 ∑_{L=1}^{n−2} F(L)

    (n − 1) F(n) − (n − 2) F(n − 1) = 2c(n − 1) + 2F(n − 1)

    (n − 1) F(n) = 2c(n − 1) + n F(n − 1)
Let's divide both sides by n(n − 1) and then set Y(n) = F(n)/n:

    F(n)/n = 2c/n + F(n − 1)/(n − 1)

    Y(n) = 2c/n + Y(n − 1) = 2c ∑_{i=1}^{n} 1/i

From this harmonic series104, we conclude that Y(n) = Θ(log n), hence F(n) = Θ(n log n). The average complexity of QuickSort is therefore T̂QS(n) = Θ(n log n).

104 The fact that ∑_{i=1}^{n} 1/i = Θ(log n) can be derived from Euler's formula (∑_{i=1}^{n} 1/i = ln n + γ + o(1)), or easily proven by bounding the sum with integrals as done on page 14:

    ∫_2^{n+1} di/i ≤ ∑_{i=2}^{n} 1/i ≤ ∫_1^{n} di/i
    ln(n + 1) − ln(2) ≤ ∑_{i=2}^{n} 1/i ≤ ln(n)
    Θ(log n) ≤ ∑_{i=2}^{n} 1/i ≤ Θ(log n)
    Θ(log n) ≤ ∑_{i=1}^{n} 1/i ≤ Θ(log n)
QuickSort Optimizations

Typical QuickSort optimizations include:

• Selecting a different pivot value in the Partition procedure from page 41. The ideal value would be the median of the range as it would ensure equal size for both sides. However, the median is not really easy to compute without sorting the range already.105 The usual strategy is to pick the median of the three values A[ℓ], A[r − 1] and A[⌊(r + ℓ)/2⌋]. Line 1 of Partition is replaced by x ← MedianOf3(A[ℓ], A[r − 1], A[⌊(r + ℓ)/2⌋]). With this change, QuickSort deals nicely with nearly-sorted arrays.106

105 It is possible to find the median of an array with only Θ(n) operations using an algorithm sometimes called ‘‘median of medians’’ (cf. page 49). However this would be very inconvenient here: firstly the constant hidden behind the Θ(n) notation is quite large, and secondly this algorithm is itself based on a recursive procedure similar to QuickSort.

106 Input arrays that trigger the worst-case Θ(n²) complexity still exist (see Fig. 68 page 46), but are harder to come by.

• The last recursive call to QuickSort is a tail call, so it can be optimized as a loop. Compare these equivalent implementations:

  QuickSort( A, ℓ, r )                  QuickSort( A, ℓ, r )
  1  if r − ℓ > 1                       1  while r − ℓ > 1
  2      p ← Partition( A, ℓ, r )       2      p ← Partition( A, ℓ, r )
  3      QuickSort( A, ℓ, p)            3      QuickSort( A, ℓ, p)
  4      QuickSort( A, p, r )           4      ℓ ← p

  Any decent compiler would already do this kind of tail-call elimination automatically: this saves memory, because the values of the local variables have to be saved on the stack before each call. However, what the compiler cannot guess is that the order of the two recursive calls to QuickSort does not matter: we can actually choose which of the two calls should be turned into a loop. Here, we want to always recurse on the smaller part, to keep the recursion as shallow as possible.
  QuickSort( A, ℓ, r )
  1  while r − ℓ > 1
  2      p ← Partition( A, ℓ, r )
  3      if p − ℓ ≤ r − p
  4          QuickSort( A, ℓ, p)
  5          ℓ ← p
  6      else
  7          QuickSort( A, p, r )
  8          r ← p

  While this does not change the time complexity of the algorithm, it changes its memory complexity107. Indeed the memory complexity was O(n) in our first implementation of QuickSort because the recursion could be n-deep in the worst case; it is now O(log n) because there is no way to recurse on a sub-array larger than n/2.

107 I.e., the amount of additional memory an algorithm requires to process its input; this includes the stack in case of recursive algorithms.
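A C++ sketch of this last variant, again reusing the partition_range function sketched earlier (the MedianOf3 pivot selection discussed above would go inside the partition procedure and is omitted here):

// QuickSort that recurses only on the smaller part and loops on the larger
// one, so the recursion depth is bounded by log2(n).
void quick_sort_opt(std::vector<int>& A, int l, int r) {
  while (r - l > 1) {
    int p = partition_range(A, l, r);
    if (p - l <= r - p) {
      quick_sort_opt(A, l, p);    // smaller part: recurse
      l = p;                      // larger part: iterate
    } else {
      quick_sort_opt(A, p, r);    // smaller part: recurse
      r = p;                      // larger part: iterate
    }
  }
}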
IntroSort, or How to Avoid QuickSort's Worst Case

Even with the usual optimizations described on page 45, it is possible to construct an input that triggers QuickSort's worst case. Figure 68 shows an example where the pivot selected by MedianOf3 is the second smallest value, so that Partition does a single swap and creates a left part with L = 2 elements, and a right part with the remaining n − 2 elements.
QuickSelect

Given an array A of n values, the value of rank k is the kth smallest value of A. An algorithm that takes A and k as input and returns the rank-k value is called a selection algorithm.111 In what follows, we assume 0 ≤ k < n. The value of rank k = 0 is the minimal value, and the value of rank n − 1 is the maximal value.

111 One easy (and inefficient) way to build a selection algorithm is to first sort A in increasing order, and then pick its kth value. What sorting algorithm would you use, and what would be the complexity of that selection algorithm?

QuickSelect can be seen as a modification of QuickSort to implement a selection without sorting the entire array. The trick is that after Partition has run, we know (from the sizes of the two parts) in which part we need to search our value recursively.
QuickSelect( A, n, k)
1  return QuickSelectRec( A, 0, n, k)

QuickSelectRec( A, ℓ, r, k )                             T(1)    T(n)
1  if r − ℓ = 1 then return A[ℓ]                         Θ(1)    Θ(1)
2  p ← Partition( A, ℓ, r )                                      Θ(n)
3  L ← p − ℓ                                                     Θ(1)
4  if k < L                                                      Θ(1)
5      then return QuickSelectRec( A, ℓ, p, k )                  T(L)
6      else return QuickSelectRec( A, p, r, k − L )              T(n − L)

Figure 69 shows an example.

N Figure 69: An example execution of QuickSelect(A, 10, 6). The figure shows the state of the array before each recursive call to QuickSelectRec: successively QuickSelectRec(A, 0, 10, 6), QuickSelectRec(A, 5, 10, 1), QuickSelectRec(A, 5, 7, 1), and finally QuickSelectRec(A, 6, 7, 0) = 7.
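In C++, and written as a loop since the recursion is a tail call, QuickSelect could look as follows (illustrative sketch reusing the partition_range function from the Partition section):

// Return the value of rank k (0-based) in A, reordering A along the way.
int quick_select(std::vector<int>& A, int k) {
  int l = 0;
  int r = static_cast<int>(A.size());
  while (r - l > 1) {
    int p = partition_range(A, l, r);
    int L = p - l;               // size of the left part
    if (k < L) {
      r = p;                     // the rank-k value is in A[l..p-1]
    } else {
      l = p;                     // it is in A[p..r-1], where its rank is k - L
      k -= L;
    }
  }
  return A[l];
}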
The complexity differs from QuickSort, because at most one of the two possible recursive calls on lines 5–6 may be executed. In the worst case, we have to assume that Partition performed poorly (L = 1), and that we always have to recurse into the largest part of n − 1 elements. This gives us the following recursive equation:

    T(1) = Θ(1),
    T(n) = T(n − 1) + Θ(n)    for n > 1

whose solution is T(n) = Θ(n²). Conversely, if Partition always splits the range into two parts of equal size, we get

    T(1) = Θ(1),
    T(n) = T(n/2) + Θ(n)    for n > 1

The Master Theorem (p. 33) tells us the solution to this equation is T(n) = Θ(n). While this is not the best case scenario112, this seems like a favorable case. If we imagine a similar scenario where we recurse into 90% of the array (or any other ratio), the complexity will remain linear.

112 Exercise: Devise a scenario where calling QuickSelect on an array of size n runs in Θ(1).
On page 48 we will prove that the average complexity of QuickSelect is actually Θ(n), despite the quadratic worst case.113

Do you believe it is possible to write a selection algorithm that would always run in Θ(n)? Jump to page 49 for an answer.

113 Exercise: Using the idea presented on page 46, write another selection algorithm, called IntroSelect, with a worst case of Θ(n log n).
Average complexity of QuickSelect

To study the average complexity T̂(n) for QuickSelect, we consider (as we did on page 44) that the value of L can be anything between 1 and n − 1 with equal probability, and we average over all these possibilities. The problem in our case is that we do not know if the algorithm will recurse in a part of size L or n − L, so let us compute an upper bound of T̂ by assuming we recurse into the largest of these two parts.

    T̂(1) = Θ(1),
    T̂(n) ≤ (1/(n−1)) ∑_{i=1}^{n−1} T̂(max(i, n − i)) + Θ(n)    if n > 1

We can simplify this a bit because max(i, n − i) = i if i ≥ ⌈n/2⌉, and n − i if i ≤ ⌊n/2⌋. If n is odd, all the terms between T̂(⌈n/2⌉) and T̂(n − 1) appear twice in the sum. If n is even, T̂(⌊n/2⌋) additionally occurs once, but it is OK to count it twice since we are establishing an upper bound.

    T̂(n) ≤ (2/(n−1)) ∑_{i=⌊n/2⌋}^{n−1} T̂(i) + Θ(n)
It is not clear how to simplify this, however from the last scenario of page 47, it seems likely that T̂(n) = Θ(n). Since we are working on an upper bound, let us prove that T̂(n) = O(n) by induction114.

114 cf. pp. 34–35

Our hypothesis H(n) is that T̂(n) ≤ cn for some c we can pick arbitrarily large. Clearly H(1) is true, because T̂(1) = Θ(1), so we can find some c larger than this Θ(1) constant. Now let us assume that H(i) holds for 1 ≤ i < n and let us establish H(n):115

    T̂(n) ≤ (2/(n−1)) ∑_{i=⌊n/2⌋}^{n−1} ci + Θ(n) ≤ (2c/(n−1)) · (3n²/8) + Θ(n) = (3c/4) · n²/(n−1) + Θ(n)

Since n²/(n−1) ≤ n + 2 for n ≥ 2, this gives T̂(n) ≤ (3/4)cn + (3/2)c + Θ(n) = cn − (cn/4 − (3/2)c − Θ(n)), and we can choose c large enough so that cn/4 dominates (3/2)c + Θ(n). Then T̂(n) ≤ cn, so H(n) holds and T̂(n) = O(n). Moreover, any call to QuickSelect necessarily performs at least Θ(n) operations, so we can claim that its average complexity is in fact T̂(n) = Θ(n).

115 In the first line, the sum over ci can be removed by showing that ∑_{i=⌊n/2⌋}^{n−1} i ≤ (4n² − 4n + 2(n − 1) − (n − 1)²)/8 = (3n² − 3)/8 ≤ 3n²/8.
Linear Selection

While QuickSelect has an average complexity of Θ(n), it still has a Θ(n²) worst case when we arrange the data so that Partition behaves badly (for instance using an arrangement similar to Fig. 68 on page 46). However if we could make sure that Partition would always allow us to eliminate a number of values that is a fraction of n (as opposed to a constant), then the selection would be linear.

This is actually the key of the LinearSelect algorithm116: at any step of the recursion, it removes a quarter of the values. LinearSelect can be seen as a variant of QuickSelect where Partition is changed to select its pivot as the median of medians (steps 1–3).

116 This algorithm is also known as ‘‘median of medians’’ for reasons that will be soon obvious.
LinearSelect( A, n, k )                                                                 T(n)
1  consider the n input values as ⌈n/5⌉ groups of 5 values (with maybe the last
   group having less than 5 values) and compute the median of each group               Θ(n)
2  compute the median x of all those ⌈n/5⌉ medians                                      T(⌈n/5⌉)
3  use x as the pivot of Partition (i.e., replacing line 1 of the algorithm on
   page 41) to reorganize A                                                             Θ(n)
4  recursively call LinearSelect on one of the two parts (depending on k), as
   done in QuickSelect                                                                  ≤ T(3n/4)

N Figure 70: The circles represent an array of 34 values organized in 7 groups (6 groups of 5 values, and one group of 4 values). We interpret u → v as meaning u ≤ v. Inside each group (i.e., each column), values have been moved so that the median is on the center line, values above it are greater, and values below it are lesser. Similarly, columns have been ordered such that the median x of all medians has three greater medians to its left, and three lesser medians to its right. We can see that there are more than n/4 values that are greater than x (the top left quarter), and more than n/4 values that are lesser than x. Therefore, if x is used as a pivot in Partition, both parts are guaranteed to have more than n/4 values.

Computing the median of 5 values can be done in constant time, so repeating this operation ⌈n/5⌉ times in step 1 requires Θ(n) operations. Computing the median of these ⌈n/5⌉ medians however cannot be done in constant time since the number of values depends on n; however it corresponds to the value of rank ⌊⌈n/5⌉/2⌋ among these ⌈n/5⌉ values, so we can compute it by calling LinearSelect recursively on an array of size ⌈n/5⌉. Hence, assuming that running LinearSelect on n values takes T(n) operations, step 2 needs T(⌈n/5⌉) operations. Calling Partition costs Θ(n), as seen on page 41, but thanks to our pivot, we are sure that each of the two parts contains at least 25% of the array (see Fig. 70), so in the worst case, the recursive call of step 4 will cost T(3n/4). Thus, we have:
    T(1) = Θ(1)
    T(n) ≤ Θ(n) + T(⌈n/5⌉) + T(3n/4)
Let us prove the following induction hypothesis117, Hn: ‘‘T(n) ≤ cn’’.

Clearly H1 holds, because T(1) = Θ(1) and we can find c large enough to dominate that Θ(1). Let us assume that H1, H2, …, Hn−1 hold, and inject this knowledge into our recursive inequality:

    T(n) ≤ Θ(n) + c⌈n/5⌉ + 3cn/4 ≤ Θ(n) + c(n + 4)/5 + 3cn/4
    T(n) ≤ Θ(n) + 19cn/20 + 4c/5 = cn + Θ(n) + 4c/5 − cn/20

We can choose c large enough so that cn/20 dominates Θ(n) + 4c/5. Then T(n) ≤ cn, which means that Hn holds for any n ≥ 1, and this allows us to conclude that T(n) = O(n).

We can strengthen this result to T(n) = Θ(n) because of step 3.

117 See pp. 34–35 for the technique. One way to guess a probable solution for this equation is to consider that T(⌈n/5⌉) seems much smaller than T(3n/4). So as a first guess, we neglect it and solve T(n) ≤ Θ(n) + T(3n/4). In this simplified case, we find using the master theorem that T(n) = O(n). Then we use induction to verify if this solution is still correct for the complete problem.
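The following C++ sketch implements this strategy; it is illustrative only (it uses std::sort on each group of 5 and a three-way std::partition around the pivot value instead of the Partition procedure of page 41).

#include <algorithm>
#include <utility>
#include <vector>

// Sketch of LinearSelect ("median of medians") on v[lo..hi): returns the
// value of rank k (0-based) in that range.
int linear_select(std::vector<int>& v, int lo, int hi, int k) {
  int n = hi - lo;
  if (n <= 5) {                        // small range: sort and pick directly
    std::sort(v.begin() + lo, v.begin() + hi);
    return v[lo + k];
  }
  // Step 1: compute the median of each group of 5, and gather those medians
  // at the front of the range.
  int m = 0;
  for (int i = lo; i < hi; i += 5) {
    int end = std::min(i + 5, hi);
    std::sort(v.begin() + i, v.begin() + end);
    std::swap(v[lo + m], v[i + (end - i) / 2]);
    ++m;
  }
  // Step 2: the median x of those m medians, found recursively.
  int x = linear_select(v, lo, lo + m, m / 2);
  // Step 3: partition the whole range around x.
  auto mid = std::partition(v.begin() + lo, v.begin() + hi,
                            [x](int a) { return a < x; });
  auto high = std::partition(mid, v.begin() + hi,
                             [x](int a) { return a == x; });
  int lt = static_cast<int>(mid - (v.begin() + lo));   // values < x
  int eq = static_cast<int>(high - mid);               // values == x
  // Step 4: recurse only into the part that contains the rank-k value.
  if (k < lt)
    return linear_select(v, lo, lo + lt, k);
  if (k < lt + eq)
    return x;
  return linear_select(v, lo + lt + eq, hi, k - lt - eq);
}

Each call sorts the ⌈n/5⌉ groups in place, recurses once to find the median of their medians, and then recurses once more on a part that, by the argument of Figure 70, contains at most about 3n/4 values.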
Space Complexity
As mentioned on page 20, the space complexity of an algorithm,
often noted S(n), is the amount of additional memory required to
process an input of size n.
The space complexity can be computed using the same mathe-
matical tools we used for computing time complexities. We simply
sum the sizes of all local variables, and if there are function calls
(even recursive calls), we add the maximum space complexities of
all these calls.
For a simple iterative algorithm like SelectionSort (p. 21) or InsertionSort (p. 22), there are only a couple of local variables used to store values or indices, so S(n) = Θ(1).118

118 Here, and in all our examples, we are making the practical assumption that values and indices are integers stored in a fixed amount of memory. Some people who are interested in bit-level complexity may have different expectations. For instance they could say that if we want to work with arrays that are arbitrarily large, we need Θ(log n) bits to store an index. Similarly, if we want to be able to store n distinct values in the input array, it needs Θ(n log n) bits.

A recursive algorithm like BinarySearch (p. 24) has to store one local variable m per recursive call. Its space complexity is the solution of S(n) ≤ Θ(1) + S(n/2), which is S(n) = O(log n). However if we account for the fact that BinarySearch is tail recursive and assume the compiler will perform tail-call elimination, changing the recursion into a simple loop, then the space complexity drops to Θ(1).

Similarly, the space complexity of HeapSort (p. 40) satisfies S(n) ≤ S(2n/3 + O(1)) + Θ(1), and this can be solved as for TH(n) on page 38: S(n) = O(log n).
For an algorithm like MergeSort (p. 30), we can assume that
the array B of size Θ(n) used in Merge (p. 29) is allocated once
before the recursion, and the space requirement of the recursion is
Θ(log n). We conclude that S(n) = Θ(n).
Finally, the space complexity of QuickSort depends on how it
has been implemented. If we use the implementation described on
pages 41–42, each call to QuickSort requires Θ(1) local variables
(call to Partition included) but we can have n − 1 recursive calls
in the worst case. Therefore S(n) = O(n). However if the trick of
page 45 is used to only recurse on the smaller of the two sides of the
partition, then the recursive depth is at most log2 n and the space
complexity drops to S(n) = O(log n).
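To make the BinarySearch remark concrete, here is the loop that tail-call elimination effectively produces; only a few integer variables are live, hence S(n) = Θ(1). This is illustrative code, not the pseudocode of page 24.

#include <vector>

// Iterative binary search in a sorted vector: returns the index of v,
// or -1 if v is absent.  The search range [lo, hi) is half-open.
int binary_search_iter(const std::vector<int>& A, int v) {
  int lo = 0;
  int hi = static_cast<int>(A.size());
  while (lo < hi) {
    int m = lo + (hi - lo) / 2;
    if (A[m] == v)
      return m;
    if (A[m] < v)
      lo = m + 1;
    else
      hi = m;
  }
  return -1;
}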
In-place Algorithms
An algorithm is in-place if its space complexity belongs to O(log n). In
other words, the additional memory it requires to process an input
of size n is at most logarithmic in n.
This definition is a bit more practical than what we would intu-
itively allow: limiting the memory consumption to O(log n) instead
of O(1) allows some recursive algorithms like HeapSort to qualify
as in-place while still disallowing algorithms that would use recur-
sion to allocate enough local variables to duplicate the input on the
stack.
Finally, we can turn any unstable sort into a stable one by appending the input order to the key (Fig. 73), or equivalently, by adding a new column with the input order and using it for tie breaking.

H Figure 73: Making any sort stable: (1) modify the key to include the input order, (2) sort against the new key, (3) revert to old keys.
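A small C++ illustration of this trick: pair each value with its input position, sort the pairs (the position breaks ties), then drop the positions. The names are illustrative; with plain integer keys the effect is invisible, but in practice each key would carry satellite data whose relative order we want to preserve.

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Make any (possibly unstable) sort stable by sorting (key, input order) pairs.
void sort_stably(std::vector<int>& A) {
  std::vector<std::pair<int, std::size_t>> keyed(A.size());
  for (std::size_t i = 0; i < A.size(); ++i)
    keyed[i] = {A[i], i};                   // (1) extend the key with the input order
  std::sort(keyed.begin(), keyed.end());    // (2) sort against the new key
  for (std::size_t i = 0; i < A.size(); ++i)
    A[i] = keyed[i].first;                  // (3) revert to the old keys
}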
Worst Case for Comparison Sort

Let us build a decision tree tracing all the comparisons performed by SelectionSort (Figure 74) on the array [a, b, c]. Each branch of this tree (Figure 75) corresponds to a possible execution of the algorithm. Each internal node corresponds to a comparison of the values to sort (line 4 of Figure 74): the first two lines of comparisons correspond to the first iteration with i = 0, after which the smallest element of the array is swapped with the first element, and the third line of the tree corresponds to iteration i = 1 which orders the last two elements of the array. The state of the array is displayed above the first comparison of each iteration.

N Figure 74: SelectionSort, for reference. See page 21 for more details. We focus on the comparison of line 4.

SelectionSort( A, n)
1  for i ← 0 to n − 2 do
2      min ← i
3      for j ← i + 1 to n − 1 do
4          if A[j] < A[min]
5              min ← j
6      A[min] ↔ A[i]

N Figure 75: The decision tree of SelectionSort on [a, b, c]. Its internal nodes are comparisons (c < b, a < b, c < a, …); its leaves are the resulting arrays [a, b, c], [a, c, b], [c, b, a], [c, a, b], [b, a, c], [b, c, a], [c, b, a]. Branches drawn with plain lines are cases where the comparison is successful.
Such a decision tree can be built for an array of any size, and for any comparison sort, i.e., any sorting procedure that compares the elements to sort.120 For some algorithms, the branches may not all have the same length. As branches correspond to possible executions of the algorithm, the height of the tree gives the number of comparisons necessary in the worst case.

120 Can you think of sorting procedures that do not?
If we construct such a decision tree for an array of n elements, the number of leaves has to be at least n! to account for all possible orderings of those elements (and possibly more since we may have duplicates). As this is a binary tree, the number of comparisons necessary to sort an array in the worst case is at least log2(n!).
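Even without Stirling's formula (page 40), it is easy to check that this lower bound is Ω(n log n): half of the factors of n! are at least n/2, so

    log2(n!) = ∑_{i=2}^{n} log2 i ≥ ∑_{i=⌈n/2⌉}^{n} log2(n/2) ≥ (n/2)(log2 n − 1) = Ω(n log n).

Hence any comparison sort must perform Ω(n log n) comparisons in the worst case.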
Counting Sort

Let us study a first non-comparative sort. That's right: CountingSort is able to sort an array without comparing its elements. This algorithm is quite limited: it can only sort non-negative integers that span a small range. But it can be used as a building block to sort more complex values, as we shall see later. Also, this is a stable sort, and that will prove important to build upon.

Figure 76 illustrates how CountingSort works on an example.

J Figure 76: CountingSort steps: (1) build a histogram of all values (count), (2) cumulate it, (3) copy each value to its final place. In the example, the array [1, 0, 2, 1, 0, 1, 1, 0, 2, 1] yields the histogram C = [3, 5, 2], cumulated into C = [3, 8, 10], and the sorted output [0, 0, 0, 1, 1, 1, 1, 1, 2, 2].
The steps of Figure 76 are highlighted on the left. After step (2), C[x] contains the number of occurrences of values ≤ x, so when scanning the array from right to left123, we know that the first x we encounter should go at position C[x] − 1, and we can then decrement C[x] for the next occurrence of x (these two operations are done in the opposite order, on lines 8–9).

123 It is also possible to do it from left-to-right, but preserving stability will require some tweaks to lines 5–9.

The time complexity is easily seen to be Θ(n + k), and this can be reduced to Θ(n) whenever it is known that k = O(n). In practice, k is usually a small constant like 10 or 256 to keep the histogram short.
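The pseudocode referred to above (lines 5–9) is not reproduced here, but a possible C++ sketch of the three steps of Figure 76 looks like this (counting_sort and its interface are illustrative):

#include <vector>

// Stable counting sort of non-negative integers smaller than k.
// E.g., counting_sort({1,0,2,1,0,1,1,0,2,1}, 3) returns {0,0,0,1,1,1,1,1,2,2}.
std::vector<int> counting_sort(const std::vector<int>& A, int k) {
  int n = static_cast<int>(A.size());
  std::vector<int> C(k, 0);
  for (int v : A)                    // (1) histogram of the values
    ++C[v];
  for (int x = 1; x < k; ++x)        // (2) C[x] = number of values <= x
    C[x] += C[x - 1];
  std::vector<int> B(n);
  for (int i = n - 1; i >= 0; --i)   // (3) right-to-left scan preserves stability
    B[--C[A[i]]] = A[i];             //     decrement first, then place at C[A[i]]
  return B;
}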
J Figure 77: Sorting a list of 5-digit numbers, one digit at a time using CountingSort. Each of the five arrows represents an application of CountingSort using the key highlighted to its left. This iterative implementation of radix sort starts from the least significant digit (LSD).
Clearly if we have n 5-digit numbers to sort, each of these 5 passes takes Θ(n) time. Therefore the entire process is linear. But this does not necessarily have to be done in base 10. Sorting a list of 32-bit integers, we could apply RadixSort in base 256 to process 8 bits at a time, and require only four passes.
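A possible C++ sketch of this base-256 strategy: four stable counting-sort passes, one per byte, starting from the least significant (the function name and interface are illustrative).

#include <cstdint>
#include <vector>

// LSD radix sort of unsigned 32-bit integers, processing 8 bits per pass.
void radix_sort_lsd(std::vector<std::uint32_t>& A) {
  std::vector<std::uint32_t> B(A.size());
  for (int pass = 0; pass < 4; ++pass) {
    int shift = 8 * pass;
    int C[256] = {};                            // histogram of the current byte
    for (std::uint32_t v : A)
      ++C[(v >> shift) & 0xFF];
    for (int x = 1; x < 256; ++x)               // cumulate
      C[x] += C[x - 1];
    for (int i = static_cast<int>(A.size()) - 1; i >= 0; --i)
      B[--C[(A[i] >> shift) & 0xFF]] = A[i];    // stable right-to-left placement
    A.swap(B);
  }
}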
If the recursive calls are laid out as in Figure 78, where the recursive calls in each ‘‘layer’’ sort their range according to the same digit, it is clear that the number of values that need to be moved can be less than in the LSD version (because recursion stops on ranges of size 1), and that the total amount of work is still Θ(n) because of the first pass.
Further Reading