Lec1 - AnalysisOfAlgorithms
Lec1 - AnalysisOfAlgorithms
A NALYSIS OF A LGORITHMS
http://algs4.cs.princeton.edu
A NALYSIS OF A LGORITHMS
‣ introduction
‣ mathematical models
‣ order-of-growth classifications
‣ theory of algorithms
‣ memory
http://algs4.cs.princeton.edu
Reasons to analyze algorithms
Predict performance.
Compare algorithms.
Provide guarantees.
4
The challenge
5
Scientific method applied to analysis of algorithms
Scientific method.
・Observe some feature of the natural world.
・Hypothesize a model that is consistent with the observations.
・Predict events using the hypothesis.
・Verify the predictions by making further observations.
・Validate by repeating until the hypothesis and observations agree.
6
Example: 3-SUM
3-SUM. Given N distinct integers, how many triples sum to exactly zero?
% more 8ints.txt
8 a[i] a[j] a[k] sum
30 -40 -20 -10 40 0 10 5 1
30 -40 10 0
4 -40 40 0 0
3
-10 0 10 0
4
7
3-SUM: brute-force algorithm
9
Empirical analysis
Run the program for various input sizes and measure running time.
N time (seconds) †
250 0
500 0
1,000 0.1
2,000 0.8
4,000 6.4
8,000 51.1
16,000 ?
10
Data analysis
11
Data analysis
Log-log plot. Plot running time T (N) vs. input size N using log-log scale.
lg(T (N)) = b lg N + c
b = 2.999
c = -33.2103
T (N) = a N b, where a = 2 c
3 orders
of magnitude
power law
Predictions.
・51.0 seconds for N = 8,000.
・408.1 seconds for N = 16,000.
8,000 51.1
8,000 51
8,000 51.1
16,000 410.8
validates hypothesis!
13
Experimental algorithmics
14
A NALYSIS OF A LGORITHMS
‣ Introduction
‣ mathematical models
‣ order-of-growth classifications
‣ theory of algorithms
‣ memory
http://algs4.cs.princeton.edu
Mathematical models for running time
Donald Knuth
1974 Turing Award
17
Cost of basic operations
assignment statement a = b c2
18
Example: 1-SUM
int count = 0;
for (int i = 0; i < N; i++)
if (a[i] == 0)
count++;
N array accesses
operation frequency
variable declaration 2
assignment statement 2
equal to compare N
array access N
increment N to 2 N
19
Example: 2-SUM
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
if (a[i] + a[j] == 0)
count++;
operation frequency
equal to compare ½ N (N − 1)
tedious to count exactly
array access N (N − 1)
increment ½ N (N − 1) to N (N − 1)
20
Simplification 1: cost model
Cost model. Use some basic operation as a proxy for running time.
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
if (a[i] + a[j] == 0)
count++;
operation frequency
equal to compare ½ N (N − 1)
increment ½ N (N − 1) to N (N − 1)
22
Simplification 2: tilde notation
Ex 1. ⅙ N 3 + 20 N + 16 ~ ⅙N3
Ex 2. ⅙ N 3 + 100 N 4/3 + 56 ~ ⅙N3
Ex 3. ⅙N3 - ½N 2 + ⅓ N ~ ⅙N3
23
Simplification 2: tilde notation
increment ½ N (N − 1) to N (N − 1) ~ ½ N 2 to ~ N 2
24
Example: 2-SUM
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
"inner loop"
if (a[i] + a[j] == 0)
count++;
A. ~ N 2 array accesses.
Bottom line. Use cost model and tilde notation to simplify counts.
25
Example: 3-SUM
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
for (int k = j+1; k < N; k++) "inner loop"
if (a[i] + a[j] + a[k] == 0)
count++;
A. ~ ½ N 3 array accesses.
Bottom line. Use cost model and tilde notation to simplify counts.
26
A NALYSIS OF A LGORITHMS
‣ introduction
‣ mathematical models
‣ order-of-growth classifications
‣ theory of algorithms
‣ memory
http://algs4.cs.princeton.edu
Common order-of-growth classifications
Definition. If f (N) ~ c g(N) for some constant c > 0, then the order of growth
of f (N) is g(N).
・Ignores leading coefficient.
・Ignores lower-order terms.
Ex. The order of growth of the running time of this code is N 3.
int count = 0;
for (int i = 0; i < N; i++)
for (int j = i+1; j < N; j++)
for (int k = j+1; k < N; k++)
if (a[i] + a[j] + a[k] == 0)
count++;
30
Common order-of-growth classifications
order of
name typical code framework description example T(2N) / T(N)
growth
add two
1 constant a = b + c; statement 1
numbers
while (N > 1)
log N logarithmic divide in half binary search ~1
{ N = N / 2; ... }
divide
N log N linearithmic [see mergesort lecture] mergesort ~2
and conquer
31
Practical implications of order-of-growth
tens of hundreds of
N millions billions
millions millions
hundreds of hundreds of
N log N millions millions
thousands millions
tens of
N2 hundreds thousand thousands
thousands
2N 20 20s 20s 30
Bottom line. Need linear or linearithmic alg to keep pace with Moore's law.
32
Binary search demo
Goal. Given a sorted array and a key, find index of the key in the array?
6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
lo hi
33
Binary search demo
Goal. Given a sorted array and a key, find index of the key in the array?
6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
lo mid hi
Binary search demo
Goal. Given a sorted array and a key, find index of the key in the array?
6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
lo mid hi
Binary search demo
Goal. Given a sorted array and a key, find index of the key in the array?
6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
lo mid hi
Binary search demo
Goal. Given a sorted array and a key, find index of the key in the array?
lo = hi
successful search for 33
6 13 14 25 33 43 51 53 64 72 84 93 95 96 97
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
mid
return 4
Binary search: Java implementation
Trivial to implement?
・First binary search published in 1946.
・First bug-free one in 1962.
・Bug in Java's Arrays.binarySearch() discovered in 2006.
public static int binarySearch(int[] a, int key)
{
int lo = 0, hi = a.length-1;
while (lo <= hi)
{
int mid = lo + (hi - lo) / 2;
if (key < a[mid]) hi = mid - 1;
one "3-way compare"
else if (key > a[mid]) lo = mid + 1;
else return mid;
}
return -1;
}
Invariant. If key appears in the array a[], then a[lo] key a[hi].
38
Binary search: mathematical analysis
T (N) ≤ T (N / 2) + 1 [ given ]
= 1 + log N 39
An N2 log N algorithm for 3-SUM
Algorithm.
input
・Step 1: Sort the N (distinct) numbers. 30 -40 -20 -10 40 0 10 5
・Step 2: For each pair of numbers a[i]
sort
and a[j], binary search for -(a[i] + a[j]). -40 -20 -10 0 5 10 30 40
binary search
(-40, -20) 60
Analysis. Order of growth is N2 log N.
(-40, -10) 50
・Step 1: N2 with insertion sort.
(-40, 0) 40
・Step 2: N 2 log N with binary search.
(-40, 5) 35
(-40, 10) 30
⋮ ⋮
(-20, -10) 30
⋮ ⋮
only count if
(-10, 0) 10
a[i] < a[j] < a[k]
⋮ ⋮ to avoid
double counting
( 10, 30) -40
32,000 14.88
64,000 59.16
ThreeSumDeluxe.java
develop
Big Oh Θ(N2) and smaller O(N2)
upper bounds
develop
Big Omega Θ(N2) and larger Ω(N2)
lower bounds
Theory of algorithms: example 1
Goals.
・Establish “difficulty” of a problem and develop “optimal” algorithms.
・Ex. 1-SUM = “Is there a 0 in the array? ”
Upper bound.
・Ex. Brute-force algorithm for 1-SUM: Look at every array entry.
・Running time of the optimal algorithm for 1-SUM is O(N).
Lower bound.
・Ex. Have to examine all N entries.
・Running time of the algorithm for 1-SUM is Ω(N).
Optimal algorithm.
・Lower bound equals upper bound (to within a constant factor).
・Ex. Brute-force algorithm for 1-SUM is Θ(N).
46
Theory of algorithms: example 2
Goals.
・Establish “difficulty” of a problem and develop “optimal” algorithms.
・Ex. 3-SUM.
Upper bound. A specific algorithm.
・Ex. Brute-force algorithm for 3-SUM.
・Running time of the optimal algorithm for 3-SUM is O(N 3).
47
Theory of algorithms: example 2
Goals.
・Establish “difficulty” of a problem and develop “optimal” algorithms.
・Ex. 3-SUM.
Upper bound. A specific algorithm.
・Ex. Improved algorithm for 3-SUM.
・Running time of the improved algorithm for 3-SUM is O(N 2 log N ).
Lower bound.
・Running time of the improved algorithm for 3-SUM is Ω(N ).2
Open problems.
・Optimal algorithm for 3-SUM?
・Subquadratic algorithm for 3-SUM?
48
Algorithm design approach
Start.
・Develop an algorithm.
・Prove a lower bound.
Gap?
・Lower the upper bound (discover a new algorithm).
・Raise the lower bound (more difficult).
Golden Age of Algorithm Design.
・1970s-.
・Steadily decreasing upper bounds for many important problems.
・Many known optimal algorithms.
49
A NALYSIS OF A LGORITHMS
‣ introduction
‣ mathematical models
‣ order-of-growth classifications
‣ theory of algorithms
‣ memory
http://algs4.cs.princeton.edu
Basics
51
Typical memory usage for primitive types and arrays
boolean 1 char[] 2 N + 24
byte 1 int[] 4 N + 24
char 2 double[] 8 N + 24
int 4
one-dimensional arrays
float 4
long 8
type bytes
double 8
char[][] ~2MN
primitive types
int[][] ~4MN
double[][] ~8MN
two-dimensional arrays
52
Typical memory usage for objects in Java
4 bytes (int)
4 bytes (int)
4 bytes (int)
4 bytes (padding)
32 bytes
53
Typical memory usage for objects in Java
2N + 64 bytes
54
Typical memory usage summary
55