Chapter 1 - Introduction
Design and Analysis of Algorithms
This course is about the theoretical study of the design and analysis of computer
algorithms
• Analysis: predict the cost of an algorithm in terms of resources
and performance
• Design: design algorithms that minimize the cost
Cont…
• Why do we need algorithm analysis?
– Writing a working program is not good enough
– The program may be inefficient!
– If the program is run on a large data set, the
running time becomes an issue
Algorithm Analysis
• We know:
– Experimental approach – problems
– Low-level analysis – count operations
• Characterize an algorithm as a function of the “problem size”
E.g.
– Input data = array → problem size is N (length of array)
– Input data = matrix → problem size is N × M
Algorithm Analysis Cont…
• We only analyze correct algorithms
• An algorithm is correct
– If, for every input instance, it halts with the correct output
• Incorrect algorithms
– Might not halt at all on some input instances
– Might halt with other than the desired answer
• Analyzing an algorithm
– Predicting the resources that the algorithm requires
– Resources include
• Memory
• Communication bandwidth
• Computational time (usually most important)
Algorithm Analysis Cont…
• Factors affecting the running time
– computer
– compiler
– algorithm used
– input to the algorithm
• The content of the input affects the running time
• typically, the input size (number of items in the input) is
the main consideration
– E.g. sorting problem → the number of items to be
sorted
– E.g. multiplying two matrices together → the total
number of elements in the two matrices
Running time
• The running time depends on the input: an already sorted sequence
is easier to sort.
T_A(n) = time of A on inputs of length n
• Generally, we seek upper bounds on the running time, to have a
guarantee of performance.
• Look at the growth of T(n) as n → ∞, i.e. “Asymptotic Analysis”
We need Machine-independent time
Machine-Independent time
The RAM Model
Machine-independent algorithm design depends on a hypothetical
computer called the Random Access Machine (RAM).
Assumptions:
• Each simple operation (such as +, −, if, etc.) takes exactly one time
step.
• Loops and subroutines are not considered simple operations.
• Each memory access takes exactly one time step.
Running Time Calculations
We assume an arbitrary time unit.
General Rules
1. FOR loop
• The number of iterations times the time of the inside
statements.
2. Nested loops
• The product of the number of iterations times the time of the
inside statements.
3. Consecutive Statements
• The sum of running time of each segment.
4. If/Else
• The testing time plus the larger running time of the cases.
• Running time of a selection statement (if, switch) is the time for
the condition evaluation + the maximum of the running times for the
individual clauses in the selection.
Analysis Rules:
5. Execution of one of the following operations takes time 1:
Assignment Operation
Single Input/output Operation
Single Boolean Operations
Single Arithmetic Operations
Function Return
Rules 1 and 2. Loops: Running time for a loop is equal to the running time for the
statements inside the loop × the number of iterations.
• The total running time of a statement inside a group of nested loops is the
running time of the statements multiplied by the product of the sizes of all the
loops.
• For nested loops, analyze inside out.
• Always assume that the loop executes the maximum number of iterations
possible.
6. Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.
Example
• Calculate the sum of cubes Σ (i = 1 to N) of i³:

1    int partialSum = 0;
2    for (int i = 1; i <= N; i++)
3        partialSum += i * i * i;
4    return partialSum;

• Lines 1 and 4 count for one unit each
• Line 3: executed N times, four units each time (two multiplications,
one addition, one assignment), for 4N in total
• Line 2: 1 for initialization, N+1 for all the tests, N for all the
increments; total 2N + 2
• Total cost: 6N + 4 = O(N)
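Such hand counts can be checked by instrumenting the code with an explicit counter. The sketch below (the counter placement and the name sumCubes are my own, chosen to mirror the unit costs above) tallies exactly 6N + 4 units:

```cpp
#include <cassert>

// Sum of cubes with a unit counter matching the hand analysis:
// 1 (init) + (2N + 2) (loop overhead) + 4N (body) + 1 (return) = 6N + 4.
long long sumCubes(int N, long long &units) {
    long long partialSum = 0;
    units += 1;                       // line 1: one assignment
    units += 2LL * N + 2;             // line 2: 1 init + (N+1) tests + N increments
    for (int i = 1; i <= N; i++) {
        partialSum += 1LL * i * i * i;
        units += 4;                   // line 3: two mults, one add, one assign
    }
    units += 1;                       // line 4: the return
    return partialSum;
}
```

For N = 10 the counter ends at 6·10 + 4 = 64, and the sum is (10·11/2)² = 3025.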
Some Examples
Case 1:
    for (i=0; i<n; i++)
        for (j=0; j<n; j++)
            k++;
→ O(n²)
Case 2:
    for (i=0; i<n; i++)
        k++;
    for (i=0; i<n; i++)
        for (j=0; j<n; j++)
            k++;
→ O(n²)
Case 3:
    for (int i=0; i<n-1; i++)
        for (int j=0; j<i; j++)
            k += 1;
→ O(n²)
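Counting how often the innermost statement executes confirms these bounds; this sketch (function names are my own) counts Case 1 and Case 3 directly:

```cpp
#include <cassert>

// Case 1: full n x n nest -> the body runs exactly n*n times, O(n^2).
long long case1Count(int n) {
    long long k = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            k++;
    return k;
}

// Case 3: triangular nest -> the body runs 0+1+...+(n-2) = (n-1)(n-2)/2
// times, still O(n^2) despite doing roughly half the work of Case 1.
long long case3Count(int n) {
    long long k = 0;
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < i; j++)
            k++;
    return k;
}
```

For n = 10: Case 1 runs 100 times, Case 3 runs 9·8/2 = 36 times.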
Kinds of analyses
Worst-case: (usually)
• T(n) = maximum time of algorithm on any input of size n.
Average-case: (sometimes)
• T(n) = expected time of algorithm over all inputs of size n.
• Need assumption of statistical distribution of inputs.
Best-case: (NEVER)
• Cheat with a slow algorithm that works fast on some input.
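Linear search illustrates the gap between the cases: the target may sit in the first slot (best case, 1 comparison) or be absent entirely (worst case, n comparisons). A minimal sketch, with the helper name comparisons being my own choice:

```cpp
#include <cassert>
#include <vector>

// Returns the number of element comparisons a linear search performs,
// stopping early when the target is found.
int comparisons(const std::vector<int> &a, int target) {
    int count = 0;
    for (int x : a) {
        count++;
        if (x == target) break;   // best case: hit on the first element
    }
    return count;                 // worst case: scanned all n elements
}
```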
Analysis of Algorithms or Performance Analysis
• Program performance is the amount of computer memory and
time needed to run a program.
• Algorithm analysis – determining how long an algorithm will take
to solve a problem
Criteria for Measurement
Two criteria are used to judge algorithms:
(i) time complexity
(ii) space complexity
Space Complexity
• Space Complexity of an algorithm is the amount of memory it
needs to run to completion.
• Memory space S(P) needed by a program P consists of two
components:
– A fixed part c: needed for instruction space (byte code),
simple variable space, constant space, etc.
– A variable part Sp(instance): dependent on a particular
instance of the input and output data.
• S(P) = c + Sp(instance)
Time Complexity
• Time Complexity of an algorithm is the amount of CPU time it
needs to run to completion.
• Time T(P) required to run a program P also consists of two
components:
– A fixed part c: compile time, which is independent of the
problem instance.
– A variable part tp(instance): run time, which depends on the
problem instance.
• T(P) = c + tp(instance)
• The time T(P) taken by a program P, is the sum of its compile
time and its run time.
• The compile time is similar to the fixed space component since
it does not depend upon instance characteristics.
• The two tables below show the iterative and recursive step counts,
respectively
STATEMENT                              S/E   FREQUENCY   TOTAL STEPS
float sum(float list[], int n)          0        0            0
{                                       0        0            0
  float tempsum = 0;                    1        1            1
  int i;                                0        0            0
  for (i = 0; i < n; i++)               1       n+1          n+1
    tempsum += list[i];                 1        n            n
  return tempsum;                       1        1            1
}                                       0        0            0
TOTAL                                   -        -          2n+3
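The 2n+3 in the TOTAL row can be reproduced by instrumenting the function; in this sketch the counter increments are my own placement, grouped to match the table's rows:

```cpp
#include <cassert>

// Iterative sum from the table, with an explicit step counter:
// 1 (assignment) + (n+1) (loop tests) + n (body) + 1 (return) = 2n + 3.
float sumSteps(const float list[], int n, int &steps) {
    float tempsum = 0;
    steps += 1;                 // float tempsum = 0;  -> 1 step
    for (int i = 0; i < n; i++) {
        tempsum += list[i];
        steps += 1;             // loop body executes n times
    }
    steps += n + 1;             // the n+1 loop-condition tests
    steps += 1;                 // return tempsum;
    return tempsum;
}
```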
STATEMENT                                    S/E   FREQUENCY   TOTAL STEPS
float rsum(float list[], int n)               0        0            0
{                                             0        0            0
  if (n)                                      1       n+1          n+1
    return rsum(list, n-1) + list[n-1];       1        n            n
  return 0;                                   1        1            1
}                                             0        0            0
TOTAL                                         -        -          2n+2
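The recursive count can be checked the same way; this sketch (counter placement my own) accumulates the n+1 condition tests, the n recursive returns, and the single base-case return, for 2n+2 in total:

```cpp
#include <cassert>

// Recursive sum from the table, with a step counter:
// each of the n recursive levels contributes 1 test + 1 return (2 steps),
// and the base case contributes 1 test + 1 return, so 2n + 2 overall.
float rsumSteps(const float list[], int n, int &steps) {
    steps += 1;                   // the if (n) test
    if (n) {
        steps += 1;               // the return containing the recursive call
        return rsumSteps(list, n - 1, steps) + list[n - 1];
    }
    steps += 1;                   // return 0;  (base case)
    return 0;
}
```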
Asymptotic Notation
• O-notation: asymptotic “less than”:
– f(n) = O(g(n)) implies: f(n) “≤” g(n)
• Ω-notation: asymptotic “greater than”:
– f(n) = Ω(g(n)) implies: f(n) “≥” g(n)
• Θ-notation: asymptotic “equality”:
– f(n) = Θ(g(n)) implies: f(n) “=” g(n)
Asymptotic Notation Cont…
The notations we use to describe the asymptotic running time of
an algorithm are defined in terms of functions whose domains
are the set of natural numbers
N = {0, 1, 2, ...}
Asymptotic notation
Graphic examples of Θ, O, and Ω.
Asymptotic notations
• O-notation
Asymptotic notation: Big-Oh
• f(N) = O(g(N))
• There are positive constants c and n0 such that
f(N) ≤ c·g(N) when N ≥ n0
• The growth rate of f(N) is less than or equal to the growth rate
of g(N)
• g(N) is an upper bound on f(N)
Big-Oh: example 1
• Let f(N) = 2N2. Then
– f(N) = O(N4)
– f(N) = O(N3)
– f(N) = O(N2) (best answer, asymptotically tight)
• O(N2): reads “order N-squared” or “Big-Oh N-squared”
Examples 2
• n⁴ + 100n² + 10n + 50 is O(n⁴)
• 10n³ + 2n² is O(n³)
• n³ − n² is O(n³)
• Constants
– 10 is O(1)
– 1273 is O(1)
• Show that 30n + 8 is O(n).
Show ∃ c, n0: 30n + 8 ≤ cn, ∀ n > n0.
• Let c = 31, n0 = 8. Assume n > n0 = 8. Then
cn = 31n = 30n + n > 30n + 8, so 30n + 8 < cn
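The witness c = 31, n0 = 8 can also be checked numerically over a range; a small sketch (function name my own, and a finite check is of course only a sanity test, not a proof):

```cpp
#include <cassert>

// Checks 30n + 8 <= c*n for every n in (n0, upTo].
bool bigOhWitnessHolds(long long c, long long n0, long long upTo) {
    for (long long n = n0 + 1; n <= upTo; n++)
        if (30 * n + 8 > c * n) return false;
    return true;
}
```

Note that the bound genuinely needs n > 8: at n = 7, 30·7 + 8 = 218 exceeds 31·7 = 217.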
Examples
• There is no unique set of values for n0 and c in proving the
asymptotic bounds
• Prove that 100n + 5 = O(n²)
– 100n + 5 ≤ 100n + n = 101n ≤ 101n²
for all n ≥ 5
⇒ n0 = 5 and c = 101 is a solution
– 100n + 5 ≤ 100n + 5n = 105n ≤ 105n²
for all n ≥ 1
⇒ n0 = 1 and c = 105 is also a solution
• We must find SOME constants c and n0 that satisfy the definition of the asymptotic bound
Asymptotic notations (cont.)
• Ω-notation
Ω(g(n)) is the set of functions
with larger or the same order of
growth as g(n)
Big-Omega
• f(N) = Ω(g(N))
• There are positive constants c and n0 such that
f(N) ≥ c·g(N) when N ≥ n0
• The growth rate of f(N) is greater than or equal to the growth
rate of g(N).
Big-Omega: example 1
• Let f(N) = 2N². Then
– f(N) = Ω(N)
– f(N) = Ω(N²) (best answer)
Examples 2
– 5n² = Ω(n)
∃ c, n0 such that: 0 ≤ cn ≤ 5n²
⇒ c = 1 and n0 = 1
– 100n + 5 ≠ Ω(n²)
There are no c, n0 such that: 0 ≤ cn² ≤ 100n + 5
100n + 5 ≤ 100n + 5n (∀ n ≥ 1) = 105n
cn² ≤ 105n ⇒ n(cn − 105) ≤ 0
Since n is positive, cn − 105 ≤ 0 ⇒ n ≤ 105/c
Contradiction: n cannot be smaller than a constant
– n = Ω(2n), n³ = Ω(n²), n = Ω(log n)
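The positive witness for 5n² = Ω(n) can be sanity-checked the same way as the Big-Oh witness earlier; the function name below is my own, and again a finite scan only corroborates the proof:

```cpp
#include <cassert>

// Checks the Omega(n) witness for 5n^2: with c and n0 given,
// c*n <= 5*n^2 must hold for every n in [n0, upTo].
bool omegaWitnessHolds(long long c, long long n0, long long upTo) {
    for (long long n = n0; n <= upTo; n++)
        if (c * n > 5 * n * n) return false;
    return true;
}
```

With c = 1 and n0 = 1 the inequality n ≤ 5n² clearly holds for all positive n.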
Asymptotic notations (cont.)
• Θ-notation
Θ(g(n)) is the set of functions
with the same order of growth as
g(n)
Big-Theta
• f(N) = Θ(g(N)) iff
f(N) = O(g(N)) and f(N) = Ω(g(N))
• The growth rate of f(N) equals the growth rate of
g(N)
• Example: Let f(N) = N², g(N) = 2N²
– Since f(N) = O(g(N)) and f(N) = Ω(g(N)),
thus f(N) = Θ(g(N)).
• Big-Theta means the bound is the tightest
possible.
Some rules
• If T(N) is a polynomial of degree k, then
T(N) = Θ(Nᵏ).
• For logarithmic functions,
T(log_m N) = Θ(log N).
Example 1.
Show that f(n) = ½n² − 3n = Θ(n²).
We must find c1 and c2 such that
c1·n² ≤ ½n² − 3n ≤ c2·n²
Dividing both sides by n² yields
c1 ≤ ½ − 3/n ≤ c2
which holds, e.g., for c2 = ½ and c1 = 1/14 once n ≥ 7; thus for n ≥ 7, ½n² − 3n = Θ(n²).
Theorem
• For any two functions f(n) and g(n), we have
f(n) = Θ(g(n))
if and only if
f(n) = O(g(n)) and f(n) = Ω(g(n)).
Example 2.
f(n) = 3n² + 2n + 5 = Θ(n²)
Because:
3n² + 2n + 5 = Ω(n²)
3n² + 2n + 5 = O(n²)
Example 3.
3n² − 100n + 6 = O(n²)   since for c = 3, 3n² > 3n² − 100n + 6
3n² − 100n + 6 = O(n³)   since for c = 1, n³ > 3n² − 100n + 6 when n > 3
3n² − 100n + 6 ≠ O(n)    since for any c, cn < 3n² when n > c
3n² − 100n + 6 = Ω(n²)   since for c = 2, 2n² < 3n² − 100n + 6 when n > 100
3n² − 100n + 6 ≠ Ω(n³)   since for c = 1, 3n² − 100n + 6 < n³ when n > 3
3n² − 100n + 6 = Ω(n)    since for any c, cn < 3n² − 100n + 6 once n is large enough
3n² − 100n + 6 = Θ(n²)   since both O and Ω apply.
3n² − 100n + 6 ≠ Θ(n³)   since only O applies.
3n² − 100n + 6 ≠ Θ(n)    since only Ω applies.
Example Relative of Big-Oh
5n² is Ω(n²)
f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1
such that f(n) ≥ c·g(n) for n ≥ n0
Let c = 5 and n0 = 1
5n² is Ω(n)
f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 ≥ 1
such that f(n) ≥ c·g(n) for n ≥ n0
Let c = 1 and n0 = 1
5n² is Θ(n²)
f(n) is Θ(g(n)) if it is Ω(n²) and O(n²). We have already seen the former;
for the latter, recall that f(n) is O(g(n)) if there is a constant c > 0 and
an integer constant n0 ≥ 1 such that f(n) ≤ c·g(n) for n ≥ n0.
Let c = 5 and n0 = 1
Algorithm 1
int MaxSubSum1(const vector<int> & a) {
    int maxSum = 0;
    for (int i = 0; i < a.size(); i++)
        for (int j = i; j < a.size(); j++) {
            int thisSum = 0;
            for (int k = i; k <= j; k++)
                thisSum += a[k];
            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    return maxSum;
}                                        // O(n³)
Algorithm 2
int MaxSubSum2(const vector<int> & a) {
    int maxSum = 0;
    for (int i = 0; i < a.size(); i++) {
        int thisSum = 0;
        for (int j = i; j < a.size(); j++) {
            thisSum += a[j];
            if (thisSum > maxSum)
                maxSum = thisSum;
        }
    }
    return maxSum;
}                                        // O(n²)
Algorithm 3
int MaxSubSum4(const vector<int> & a) {
    int maxSum = 0, thisSum = 0;
    for (int j = 0; j < a.size(); j++) {
        thisSum += a[j];
        if (thisSum > maxSum)
            maxSum = thisSum;
        else if (thisSum < 0)
            thisSum = 0;
    }
    return maxSum;
}                                        // O(n)
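All three versions compute the same maximum subarray sum. The linear version can be exercised standalone; this self-contained sketch repeats its body (the test vector in the usage note is my own choice, a common textbook example):

```cpp
#include <cassert>
#include <vector>

// Linear-time maximum subarray sum (Algorithm 3's strategy):
// extend the running sum, record new maxima, and reset the running
// sum to zero whenever it goes negative (it can't help any later sum).
int maxSubSum(const std::vector<int> &a) {
    int maxSum = 0, thisSum = 0;
    for (std::size_t j = 0; j < a.size(); j++) {
        thisSum += a[j];
        if (thisSum > maxSum)
            maxSum = thisSum;
        else if (thisSum < 0)
            thisSum = 0;                 // a negative prefix is discarded
    }
    return maxSum;                       // 0 if every subarray sum is negative
}
```

For {-2, 11, -4, 13, -5, -2} the answer is 11 − 4 + 13 = 20; for an all-negative input the (empty-subarray) answer is 0.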
Comparison of running times
Searches
– Linear: n steps
– Binary: log2 n steps
– Binary search is about as fast as you can get
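Binary search reaches log₂ n steps because each probe halves the remaining range; a minimal sketch over a sorted vector:

```cpp
#include <cassert>
#include <vector>

// Binary search over a sorted vector; returns the index of target,
// or -1 if absent. Each iteration halves [lo, hi], so the loop runs
// at most about log2(n) + 1 times.
int binarySearch(const std::vector<int> &a, int target) {
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // avoids overflow of (lo + hi)
        if (a[mid] == target) return mid;
        if (a[mid] < target) lo = mid + 1;
        else                 hi = mid - 1;
    }
    return -1;
}
```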
Sorts
– Bubble: n2 steps
– Insertion: n2 steps
– There are other, more efficient, sorting techniques
• In principle, the fastest are heap sort, quick sort, and merge
sort
• These each take n * log2 n steps
• In practice, quick sort is the fastest, followed by merge sort
Complexity Time