Algorithm Analysis
Outline
• Run time analysis
• Growth rate of functions
• Big-O notation
• Examples
• Worst-case, best-case and average-case analysis of algorithms
Algorithm
• An algorithm is a set of instructions to be followed to
solve a problem.
– There can be more than one solution (more than one
algorithm) to solve a given problem.
– An algorithm can be implemented using different
programming languages on different platforms.
• An algorithm must be correct. It should correctly solve
the problem.
– e.g. for sorting, the algorithm must still produce correct output even if (1) the
input is already sorted, or (2) it contains repeated elements.
• Once we have a correct algorithm for a problem, we
have to determine the efficiency of that algorithm.
Algorithmic Performance
There are two aspects of algorithmic performance:
• Time
• Instructions take time.
• How fast does the algorithm perform?
• What affects its runtime?
• Space
• Data structures take space
• What kind of data structures can be used?
• How does choice of data structure affect the runtime?
We will focus on time:
– How to estimate the time required for an algorithm
– How to reduce the time required
Analysis of Algorithms
• When we analyze algorithms, we employ
mathematical techniques that analyze algorithms
independently of specific implementations,
computers, or data.
• To analyze algorithms:
– First, we count the number of significant operations in a particular
solution to assess its efficiency.
– Then, we will express the efficiency of algorithms
using growth functions.
The Running Time of Algorithms
• Each operation in an algorithm (or a program) has a cost.
Each operation takes a certain amount of time.
count = count + 1; takes a certain amount of time, but that time is constant
A sequence of operations:
count = count + 1; Cost: c1
sum = sum + count; Cost: c2
Total Cost = c1 + c2
Run Time Analysis
Example: Simple If-Statement
Cost Times
if (n < 0) c1 1
absval = -n c2 1
else
absval = n; c3 1
Total Cost <= c1 + max(c2,c3)
Run Time Analysis (cont.)
Example: Simple Loop
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}
Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5
The time required for this algorithm is proportional to n
Exercise
Which of these loops takes constant time regardless of
the value of n?
(a) for (i=n/10; i<n; i++) sum+=i;
(b) for (i=0; i<n; i+=n/10) sum++;
(c) for (i=n; i<2*n; i++) sum--;
(d) for (i=0; i<n; i+=10) sum++;
(e) for (i=0; i<n/10; i+=10) sum*=2;
Run Time Analysis (cont.)
Example: Nested Loop
Cost Times
i=1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
j=1; c4 n
while (j <= n) { c5 n*(n+1)
sum = sum + i; c6 n*n
j = j + 1; c7 n*n
}
i = i +1; c8 n
}
Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5+n*n*c6+n*n*c7+n*c8
The time required for this algorithm is proportional to n²
Example: Nested Loop
Problem: given n numbers in an array A, calculate
the sum of all distinct pairwise multiplications.
// A is an array of integers of size n
double sum = 0.0;
for (int i=0; i < n; i++) {
for (int j=i; j < n; j++) {
sum += A[i]*A[j];
}
}
The time required for this algorithm is proportional to n²
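As a quick sanity check, the inner loop above runs n, n-1, …, 1 times, for a total of n(n+1)/2 iterations. A minimal counting sketch (`pairwiseIterations` is an illustrative name chosen here):

```cpp
#include <cassert>

// Count the inner-loop iterations of the pairwise-multiplication code.
// The inner loop starts at j = i, so the total is
// n + (n-1) + ... + 1 = n * (n+1) / 2, which is proportional to n^2.
long long pairwiseIterations(int n) {
    long long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i; j < n; j++)
            ++count;
    return count;
}
```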
Algorithm Growth Rates
• We measure an algorithm’s time requirement as a function of the
problem size.
– Problem size depends on the application: e.g. the number of elements in a list for a
sorting algorithm, or the number of disks for Towers of Hanoi.
• The most important thing to learn is how quickly the algorithm’s
time requirement grows as a function of the problem size.
– Algorithm A requires time proportional to n2.
– Algorithm B requires time proportional to n.
• An algorithm’s proportional time requirement is known as its growth
rate.
• We can compare the efficiency of two algorithms by comparing
their growth rates.
Algorithm Growth Rates (cont.)
[Figure: time requirements as a function of the problem size n]
Running Times for Small Inputs
[Graph: running time vs. input size n for small inputs]
Running Times for Large Inputs
[Graph: running time vs. input size n for large inputs]
Big-O Notation
• Big O notation is a mathematical notation that describes
the limiting behavior of a function when the argument tends
towards a particular value or infinity.
• We use Big O notation to describe the computation time
(complexity) of algorithms using algebraic terms.
• O stands for ‘order’, as in ‘order of magnitude’.
O – Notation (Formally)
f(n) = O(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
Big-O Example
• If an algorithm requires 2n²–3n+10 seconds to solve a problem of size n,
then constants c and n0 exist such that
2n²–3n+10 ≤ cn² for all n ≥ n0.
• In fact, for c = 3 and n0 = 3:
2n²–3n+10 ≤ 3n² for all n ≥ 3.
• Thus, we say that the algorithm requires no more than 3n² steps
for n ≥ 3, so it is O(n²)
– The fastest growing term is 2n².
– The constant 2 can be ignored
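The claimed bound can be checked mechanically. A minimal sketch (`boundHolds` is an illustrative name chosen here):

```cpp
#include <cassert>

// Verify the bound 2n^2 - 3n + 10 <= 3n^2 for n >= n0 = 3.
bool boundHolds(long long n) {
    long long f = 2 * n * n - 3 * n + 10;   // the running time
    long long bound = 3 * n * n;            // c * n^2 with c = 3
    return f <= bound;
}
```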
Order of Terms
• If we graph 0.0001n2 against 10000n, the linear term would be
larger for a long time, but the quadratic one would eventually
catch up (here at n = 108).
• In calculus we know that:
lim_{n→∞} 10000n / (0.0001n²) = lim_{n→∞} 10⁸ / n = 0
• As you can see, any quadratic (with a positive leading coefficient)
will eventually beat any linear. So the linear term in a quadratic
function eventually does not matter.
Order of Terms
• Consider the function f(n) = n⁴ + 100n² + 500 = O(n⁴)

n       n⁴                     100n²          500    f(n)
1       1                      100            500    601
10      10,000                 10,000         500    20,500
100     100,000,000            1,000,000      500    101,000,500
1000    1,000,000,000,000      100,000,000    500    1,000,100,000,500
• The growth of a polynomial in n, as n increases, depends
primarily on the degree (the highest order term), and not on the
leading constant or the low-order terms
Big-O Summary
• Write down the cost function (i.e. number of instructions in
terms of the problem size n)
– Specifically, focus on the loops and find out how many
iterations the loops run
• Find the highest order term
• Ignore the constant scaling factor.
• Now you have a Big-O notation.
Common Growth Rates
Function    Growth Rate Name
c           Constant
log N       Logarithmic
log²N       Log-squared
N           Linear
N log N     Log-linear (linearithmic)
N²          Quadratic
N³          Cubic
2^N         Exponential
Growth-Rate Functions
• If an algorithm takes 1 second to run with the problem size 8,
what is the time requirement (approximately) for that algorithm
with the problem size 16?
• If its order is:
O(1)        T(n) = 1 second
O(log₂n)    T(n) = (1·log₂16) / log₂8 = 4/3 seconds
O(n)        T(n) = (1·16) / 8 = 2 seconds
O(n·log₂n)  T(n) = (1·16·log₂16) / (8·log₂8) = 8/3 seconds
O(n²)       T(n) = (1·16²) / 8² = 4 seconds
O(n³)       T(n) = (1·16³) / 8³ = 8 seconds
O(2ⁿ)       T(n) = (1·2¹⁶) / 2⁸ = 2⁸ seconds = 256 seconds
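The table's arithmetic can be reproduced directly. A minimal sketch (`scaledTime` is an illustrative helper: if T(8) = 1 second and T is proportional to g, then T(16) = g(16)/g(8) seconds):

```cpp
#include <cassert>
#include <cmath>

// Predicted time when the problem size doubles from 8 to 16,
// given that T(8) = 1 second and T(n) is proportional to g(n).
double scaledTime(double g16, double g8) {
    return g16 / g8;
}
```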
Logarithmic Cost O(log n)
for(int i=1; i < n; i*=2) {…}
base 2
for(int i=1; i < n; i<<=1) {…}
for(int i=n; i>0; i/=3) {…} base 3
for(int i=n; i>0; i>>=2) {…} base 4
The base does not matter: log_a n = log_b n / log_b a, so changing the base only
multiplies by a constant factor (1 / log_b a), which Big-O ignores.
Hence O(log₂ n) = O(ln n) = O(log n).
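The iteration count of a doubling loop can be verified empirically. A minimal sketch (`doublingIterations` is an illustrative name chosen here):

```cpp
#include <cassert>

// Count iterations of: for (int i = 1; i < n; i *= 2) { ... }
// The counter doubles each time, so the loop runs about log2(n) times.
int doublingIterations(int n) {
    int count = 0;
    for (int i = 1; i < n; i *= 2)
        ++count;
    return count;
}
```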
Motivation for Other Asymptotic Bounds
Could algorithm bar be better, say O(n) rather than O(n²)? Big-O gives only an
upper bound, so to rule this out we also need lower bounds.
Big Omega (Ω) – Notation (Formally)
f(n) = Ω(g(n)) if and only if there exist positive constants c and n0 such that
f(n) ≥ c·g(n) for all n ≥ n0. Ω gives an asymptotic lower bound.
Big Theta (Θ) Notation (Formally)
f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)); that is, g(n)
bounds f(n) both above and below, up to constant factors.
Outline
• Run time analysis
• Growth rate of functions
• Big-O notation
• Examples
• Worst-case, best-case and average case analysis of algorithms
What to Analyze
• An algorithm can require different times to solve
different problems of the same size.
• e.g. searching for an item in an array of n elements using sequential search.
• Cost: 1, 2, 3, …, n comparisons, depending on where the item is found.
What to Analyze
• Worst-Case Analysis – The maximum amount of time that an
algorithm requires to solve a problem of size n.
– This gives an upper bound on the time complexity of an algorithm.
– Normally, we try to find the worst-case behavior of an algorithm.
• Best-Case Analysis – The minimum amount of time that an
algorithm requires to solve a problem of size n.
– The best-case behavior of an algorithm is NOT very useful.
• Average-Case Analysis – The average amount of time that an
algorithm requires to solve a problem of size n.
– Sometimes, it is difficult to find the average-case behavior of an algorithm.
– We have to look at all possible inputs of a given size n and the
probability distribution of those inputs.
– Worst-case analysis is more common than average-case analysis.
Sequential Search
int sequentialSearch(const int a[], int item, int n)
{
    int i;
    for (i = 0; i < n && a[i] != item; i++)
        ;   // empty body: the loop just advances i
    if (i == n)
        return -1;
    return i;
}
Unsuccessful Search: O(n)
Successful Search:
Best-Case: item is in the first location of the array O(1)
Worst-Case: item is in the last location of the array O(n)
Average-Case: the number of key comparisons is 1, 2, …, n depending on the
item's position, so the average is
(Σ_{i=1}^{n} i) / n = (n² + n) / (2n) = (n + 1)/2 → O(n)
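The average-case count can be checked by brute force. A minimal sketch (`averageComparisons` is an illustrative name: it averages the pos + 1 comparisons over every possible hit position):

```cpp
#include <cassert>

// Sequential search makes pos + 1 key comparisons when the item sits at
// index pos. Averaging over all n equally likely positions gives (n+1)/2.
double averageComparisons(int n) {
    double total = 0;
    for (int pos = 0; pos < n; ++pos)
        total += pos + 1;
    return total / n;
}
```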
Binary Search
int binarySearch(int a[], int size, int x)
{
    int low = 0;
    int high = size - 1;
    int mid;    // mid will be the index of the target when it's found

    while (low <= high) {
        mid = (low + high) / 2;
        if (a[mid] < x)
            low = mid + 1;
        else if (a[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}
Binary Search – Analysis
• For an unsuccessful search:
– The number of iterations in the loop is ⌊log₂n⌋ + 1 → O(log₂n)
• For a successful search:
– Best-Case: the number of iterations is 1 → O(1)
– Worst-Case: the number of iterations is ⌊log₂n⌋ + 1 → O(log₂n)
– Average-Case: the average number of iterations < log₂n → O(log₂n)
Index:           0  1  2  3  4  5  6  7    (an array of size 8)
# of iterations: 3  2  3  1  3  2  3  4
The average number of iterations = 21/8 < log₂8
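The iteration counts above can be reproduced by instrumenting the loop. A minimal sketch (`binarySearchIterations` is an illustrative name chosen here):

```cpp
#include <cassert>
#include <vector>

// Count how many loop iterations binary search needs to find x in a.
int binarySearchIterations(const std::vector<int>& a, int x) {
    int low = 0, high = (int)a.size() - 1, iterations = 0;
    while (low <= high) {
        ++iterations;
        int mid = (low + high) / 2;
        if (a[mid] < x)      low = mid + 1;
        else if (a[mid] > x) high = mid - 1;
        else                 return iterations;    // found
    }
    return iterations;                             // not found
}
```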
How much better is O(log₂n)?
n              O(log₂n)
16 4
64 6
256 8
1024 (1KB) 10
16,384 14
131,072 17
262,144 18
524,288 19
1,048,576 (1MB) 20
1,073,741,824 (1GB) 30
Recap: Algorithm Analysis
• Run-time analysis should be:
– Independent of the platform
– Independent of the programmer’s skill
– Independent of specific test cases (content and size!)
– Theoretically rigorous
Recap: So what do we do?
• Theoretical analysis of algorithms is used to estimate
the run-time of an algorithm as a function of the size of
the input
• This run-time is given in terms of the number of
primitive operations required (e.g., arithmetic
operations)
Recap: Big O notation
f(x) = O(g(x)) means that
• There exists:
– some positive constant C
– some minimum x-value x0
• Such that for all x > x0:
– f(x) ≤ C * g(x)
Important note: O(g(x)) is actually a set! When we say “f(x) = O(g(x))”, we are
actually saying that f(x) ∈ O(g(x)).
Recap: Estimating Running time
int sumArray(const vector<int> &arr) {
int tot_sum = 0;
int tot_product_sum = 0;
for (int i = 0; i < arr.size(); ++i) {
tot_sum += arr[i];
}
for (int i = 0; i < arr.size(); ++i) {
for (int j = 0; j < arr.size(); ++j) {
tot_product_sum += arr[i] * arr[j];
}
}
return tot_sum + tot_product_sum;
}
Modeling Complex Loops
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {   // runs 0, 1, 2, 3, …, i-1, … times
        cout << "Hello!";
    }
    cout << endl;
}
T(n) = 0 + 1 + 2 + 3 + … + (n-1)
What is the Big-O?
Complex loops
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {
        cout << "Hello!";
    }
    cout << endl;
}
T(n) = Σ_{i=0}^{n-1} Σ_{j=0}^{i-1} c    (summation of a constant: the inner sum is c·i)
     = c · Σ_{i=0}^{n-1} i              (factoring out the constant)
     = c · n(n-1)/2                     (Gauss’s identity)
So T(n) = O(n²).
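The closed form can be confirmed by counting prints directly. A minimal sketch (`helloCount` is an illustrative name: it counts inner-loop iterations instead of printing):

```cpp
#include <cassert>

// Total inner-loop iterations of the nested loop above:
// 0 + 1 + ... + (n-1) = n * (n-1) / 2, by Gauss's identity.
long long helloCount(int n) {
    long long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)
            ++count;
    return count;
}
```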
Some Running Time Functions that Computer
Scientists Like
• Polynomial time: O(n⁴), O(n²)
• Logarithmic: O(log n), O(log log n)
• Quasi-linear: O(n log n) – fairly common
• Sublinear: O(√n)
• Exponential: O(2ⁿ) – very bad! Often indicates something
that is basically brute force
Analysis of Recursive Functions
Recursive functions
int naivePower(int x, int n) {
    if (n == 0)
        return 1;
    else
        return x * naivePower(x, n - 1);
}
How can we write the running time?
=> Recurrence relations
Recurrence Relation
• A recurrence relation is an equation that
recursively defines a function’s values in terms
of earlier values
• Very useful for analyzing an algorithm’s
running time!
Recurrence relations for the naivePower code
T(0) = c1
T(n) = c2 + T(n – 1)
If only we had an expression for T(n – 1)…
Recurrence relations for code
T(0) = c1
T(n) = c2 + T(n - 1)
Expanded:
T(n) = c2 + (c2 + T(n - 2))
     = c2 + (c2 + (c2 + T(n - 3)))
     = c2 + (c2 + (c2 + (c2 + T(n - 4))))
     …
     = k·c2 + T(n - k)
After n expansions: T(n) = n·c2 + T(0) = n·c2 + c1
=> O(n)
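The O(n) solution can be checked by counting calls. A minimal sketch (the global `callCount` is added here only for instrumentation):

```cpp
#include <cassert>

int callCount = 0;   // instrumentation: number of naivePower calls

// naivePower(x, n) decrements n by 1 per call, so it makes exactly
// n + 1 calls, matching the n*c2 + c1 solution of the recurrence.
int naivePower(int x, int n) {
    ++callCount;
    if (n == 0)
        return 1;
    return x * naivePower(x, n - 1);
}
```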
Another Example
int betterPower(int x, int n) {
    if (n == 0)
        return 1;
    else if (n == 1)
        return x;
    else
        return betterPower(x * x, n / 2);
}
(assume for simplicity that n is a power of 2)
How can we write the running time?
Recurrence relation for code
T(0) = c1
T(1) = c2
T(n) = c3 + T(n/2)
(for simplification we’ll assume n is a power of 2)
If only we had an expression for T(n/2)…
Recurrence relation for code
T(0) = c1, T(1) = c2, …, T(n) = c3 + T(n/2)
Expanded:
T(n) = c3 + T(n/2)
     = c3 + c3 + T(n/4)
     = c3 + c3 + c3 + T(n/8)
     …
     = k·c3 + T(n/2^k)
What should k be in order for us to get down to T(1)?
2^k = n, so k = log₂n:
T(n) = c3·log₂n + T(1) = c3·log₂n + c2
=> O(log n)
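Likewise for betterPower, counting calls confirms the bound. A minimal sketch (the global `betterCalls` is added here only for instrumentation; n is assumed to be a power of 2):

```cpp
#include <cassert>

int betterCalls = 0;   // instrumentation: number of betterPower calls

// betterPower halves n on every call, so for n a power of 2 it makes
// log2(n) + 1 calls, matching the O(log n) solution of the recurrence.
int betterPower(int x, int n) {
    ++betterCalls;
    if (n == 0) return 1;
    if (n == 1) return x;
    return betterPower(x * x, n / 2);
}
```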
Recursive Patterns
• Modeling and analyzing recursive code is all about finding
patterns in how the input changes between calls and how much
work is done within each call
• Some of the more common recursive patterns
• Pattern #1: Halving the Input
• e.g. Binary Search, betterPower
• Pattern #2: Constant size input and doing work
• e.g. MergeSort (we will study it later)
• Pattern #3: Doubling the Input
• e.g. Computing Fibonacci
Recursive Binary Search
int binarySearch(const vector<int>& arr, int target, int left, int right)
{
if (left > right) {
return -1;
}
int mid = (left + right) / 2;
if (arr[mid] == target) {
return mid;
} else if (arr[mid] > target) {
return binarySearch(arr, target, left, mid - 1);
} else {
return binarySearch(arr, target, mid + 1, right);
}
}
Calculating Fibonacci
• What is the running time of the recursive Fibonacci function?
int Fib(int n){
if (n <= 1) return 1;
if (n == 2) return 1;
return Fib(n-2) + Fib(n-1);
}
• Each call creates 2 more calls
• Each new call receives almost the same input
• So the number of calls almost doubles at each level of recursion
Pattern #3 – Doubling the Input
Calculating Fibonacci
[Recursion tree, modeled pessimistically as two calls on n-1:
f(4) → two f(3) calls → four f(2) calls → eight f(1) calls.
The number of calls doubles at every level.]
Pattern #3 – Doubling the Input
Recurrence Relation for Fib
int Fib(int n){
if (n <= 1) return 1;
if (n == 2) return 1;
return Fib(n-2) + Fib(n-1);
}
T(1) = T(2) = c1
T(n) = T(n - 1) + T(n - 2) + c ≤ 2·T(n - 1) + c
Solving the Recurrence
T(n) = 2·T(n-1) + c
     = 2·(2·T(n-2) + c) + c
     = 2·(2·(2·T(n-3) + c) + c) + c
     = 2³·T(n-3) + (2² + 2¹ + 2⁰)·c        (assuming n > 2)
After i-1 substitutions:
     = 2^i·T(n-i) + (2^(i-1) + … + 2¹ + 2⁰)·c
When i = n:
     = 2ⁿ·T(0) + (Σ_{i=0}^{n-1} 2^i)·c     (finite geometric series)
     = 2ⁿ·c1 + (2ⁿ - 1)·c = 2ⁿ·(c1 + c) - c
So, the growth-rate function is O(2ⁿ)
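The exponential blow-up can be observed by counting calls. A minimal sketch (the global `fibCalls` is added here only for instrumentation; the call count works out to 2·Fib(n) − 1, which grows exponentially):

```cpp
#include <cassert>

long long fibCalls = 0;   // instrumentation: number of Fib calls

// Recursive Fibonacci; the recursion tree nearly doubles at each level,
// so the call count grows exponentially in n.
int Fib(int n) {
    ++fibCalls;
    if (n <= 2) return 1;
    return Fib(n - 2) + Fib(n - 1);
}
```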
CENG 213 Data Structures
Hanoi Towers Problem
void hanoi(int n, char source, char dest, char spare) {    // Cost
    if (n > 0) {                                           // c1
        hanoi(n-1, source, spare, dest);                   // c2
        cout << "Move top disk from pole " << source       // c3
             << " to pole " << dest << endl;
        hanoi(n-1, spare, dest, source);                   // c4
    }
}
T(0) = c1 when n=0
T(n) = c1 + c2 + T(n-1) + c3 + c4 + T(n-1) when n>0
= 2*T(n-1) + (c1+c2+c3+c4)
= 2*T(n-1) + c recurrence equation for the growth-rate
function of hanoi-towers algorithm
• The same recurrence as for Fibonacci, so hanoi is also O(2ⁿ)
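The same recurrence also predicts the number of disk moves: M(n) = 2·M(n−1) + 1 with M(0) = 0 solves to 2ⁿ − 1. A minimal sketch (`hanoiMoves` is an illustrative helper that returns the move count instead of printing):

```cpp
#include <cassert>

// Number of moves hanoi makes for n disks: M(n) = 2*M(n-1) + 1, M(0) = 0,
// which solves to 2^n - 1.
long long hanoiMoves(int n) {
    if (n == 0)
        return 0;
    return 2 * hanoiMoves(n - 1) + 1;
}
```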
Recurrence Practice