University of Buea
Department of Computer Science
Lecture Notes for CSC207
Algorithm Analysis and Complexity
Theory
By Nyanga Bernard Y.
CSC207 2022/2023 1
Problem Solving: Main Steps
1. Problem definition
2. Algorithm design / Algorithm specification
3. Algorithm analysis
4. Implementation
5. Testing
6. [Maintenance]
10/29/2022 2
1. Problem Definition
• What is the task to be accomplished?
– Calculate the average of the grades for a given
student
– Understand the talks given out by politicians and
translate them in Chinese
• What are the time / space / speed /
performance requirements ?
10/29/2022 3
2. Algorithm Design / Specifications
• Algorithm: Finite set of instructions that, if followed,
accomplishes a particular task.
• Describe: in natural language / pseudo-code / diagrams /
etc.
• Criteria to follow:
– Input: Zero or more quantities (externally produced)
– Output: One or more quantities
– Definiteness: Clarity, precision of each instruction
– Finiteness: The algorithm has to stop after a finite (may be very
large) number of steps
– Effectiveness: Each instruction has to be basic enough and feasible
10/29/2022 4
4,5,6: Implementation, Testing,
Maintainance
• Implementation
– Decide on the programming language to use
• C, C++, Lisp, Java, Perl, Prolog, assembly, etc. , etc.
– Write clean, well documented code
• Test, test, test
• Integrate feedback from users, fix bugs, ensure compatibility
across different versions Maintenance
10/29/2022 5
Algorithm Analysis and Complexity
• The performances of algorithms can be measured on the
scales of Time and Space.
• The Time Complexity of an algorithm or a program is a
function of the running time of the algorithm or a
program.
• The Space Complexity of an algorithm or aprogram is a
function of the space needed by the algorithm or
program to run to completion.
10/29/2022 6
Algorithm Analysis and Complexity
• The Time Complexity of an algorithm can be computedeither
by an
– Empirical or Posteriori Testing
– Theoretical or AprioriApproach
• The Empirical or Posteriori Testing approach calls for
implementing the complete algorithm and executes them on
a computer for various instances of the problem.
• The Theoretical or Apriori Approach calls for mathematically
determining the resources such as time and space needed by
the algorithm, as a function of parameter related to the
instances of the problem considered.
10/29/2022 7
Algorithm Analysis and Complexity
• Apriori analysis computed the efficiency of the program as a
function of the total frequency count of the statements
comprising the program.
• Example: Let us estimate the frequency count of the
statement x = x+2 occurring in the following three program
segments A, B and C.
10/29/2022 8
Experimental Approach
• Write a program that implements the
algorithm
• Run the program with data sets of varying
size.
• Determine the actual running time using a
system call to measure time (e.g. system
(date) );
• Problems?
10/29/2022 9
Experimental Approach
• It is necessary to implement and test the
algorithm in order to determine its running
time.
• Experiments can be done only on a limited set
of inputs, and may not be indicative of the
running time for other inputs.
• The same hardware and software should be
used in order to compare two algorithms. –
condition very hard to achieve!
10/29/2022 10
Use a Theoretical Approach
• Based on high-level description of the
algorithms, rather than language dependent
implementations
• Makes possible an evaluation of the
algorithms that is independent of the
hardware and software environments
Generality
10/29/2022 11
Space Complexity
• Space complexity = The amount of memory required by an
algorithm to run to completion
– *Core dumps = the most often encountered cause is “memory leaks” –
the amount of memory required larger than the memory available on
a given system]
• Some algorithms may be more efficient if data completely
loaded into memory
– Need to look also at system limitations
– E.g. Classify 2GB of text in various categories [politics, tourism, sport,
natural disasters, etc.] – can I afford to load the entire collection?
10/29/2022 12
Space Complexity (cont’d)
1. Fixed part: The size required to store certain data/variables,
that is independent of the size of the problem:
- e.g. name of the data collection
- same size for classifying 2GB or 1MB of texts
2. Variable part: Space needed by variables, whose size is
dependent on the size of the problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
10/29/2022 13
Space Complexity (cont’d)
• S(P) = c + S(instance characteristics)
– c = constant
• Example:
void float sum (float* a, int n)
{
float s = 0;
for(int i = 0; i<n; i++) {
s+ = a[i];
}
return s;
}
Space? one word for n, one for a [passed by reference!], one for i
constant space!
10/29/2022 14
Time Complexity
• Often more important than space complexity
– space available (for computer programs!) tends to be larger and larger
– time is still a problem for all of us
• 3-4GHz processors on the market
– still …
– researchers estimate that the computation of various transformations
for 1 single DNA chain for one single protein on 1 TerraHZ computer
would take about 1 year to run to completion
• Algorithms running time is an important issue
10/29/2022 15
Running Time
• Problem: prefix averages
– Given an array X
– Compute the array A such that A[i] is the average of elements X[0]
… X*i+, for i=0..n-1
• Sol 1
– At each step i, compute the element X[i] by traversing the array A
and determining the sum of its elements, respectively the average
• Sol 2
– At each step i update a sum of the elements in the array A
– Compute the element X[i] as sum/I
Big question: Which solution to choose?
10/29/2022 16
Running time
5 ms worst-case
4 ms
3 ms
}
average-case?
best-case
2 ms
1 ms
A B C D E F G
Input
Suppose the program includes an if-then statement that may
execute or not: variable running time
Typically algorithms are measured by their worst case
10/29/2022 17
Algorithm Description
• How to describe algorithms independent of a programming
language
• Pseudo-Code = a description of an algorithm that is
– more structured than usual prose but
– less formal than a programming language
• (Or diagrams)
• Example: find the maximum element of an array.
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax A[0]
for i 1 to n -1 do
if currentMax < A[i] then currentMax A[i]
return currentMax
10/29/2022 18
Pseudo Code
• Expressions: use standard mathematical symbols
– use for assignment ( ? in C/C++)
– use = for the equality relationship (? in C/C++)
• Method Declarations: -Algorithm name(param1, param2)
• Programming Constructs:
– decision structures: if ... then ... [else ..]
– while-loops while ... do
– repeat-loops: repeat ... until ...
– for-loop: for ... do
– array indexing: A[i]
• Methods
– calls: object method(args)
– returns: return value
• Use comments
• Instructions have to be basic enough and feasible!
10/29/2022 19
Low Level Algorithm Analysis
• Based on primitive operations (low-level computations
independent from the programming language)
• E.g.:
– Make an addition = 1 operation
– Calling a method or returning from a method = 1 operation
– Index in an array = 1 operation
– Comparison = 1 operation etc.
• Method: Inspect the pseudo-code and count the number of
primitive operations executed by the algorithm
10/29/2022 20
Example
Algorithm arrayMax(A, n):
Input: An array A storing n integers.
Output: The maximum element in A.
currentMax A[0]
for i 1 to n -1 do
if currentMax < A[i] then
currentMax A[i]
return currentMax
How many operations ?
10/29/2022 21
Asymptotic Notation
• Need to abstract further
• Give an “idea” of how the algorithm performs
• n steps vs. n+5 steps
• n steps vs. n2 steps
10/29/2022 22
Problem
• Fibonacci numbers
– F[0] = 0
– F[1] = 1
– F[i] = F[i-1] + F[i-2] for i 2
• Pseudo-code
• Number of operations
10/29/2022 23
Algorithm Analysis
• Last time:
– Experimental approach – problems
– Low level analysis – count operations
• Abstract even further
• Characterize an algorithm as a function of the
“problem size”
• E.g.
– Input data = array problem size is N (length of
array)
– Input data = matrix problem size is N x M
10/29/2022 24
Asymptotic Notation
• Goal: to simplify analysis by getting rid of
unneeded information (like “rounding”
1,000,001≈1,000,000)
• We want to say in a formal way 3n2 ≈ n2
• The “Big-Oh” Notation:
– given functions f(n) and g(n), we say that f(n) is
O(g(n)) if and only if there are positive constants c
and n0 such that f(n)≤ c g(n) for n ≥ n0
10/29/2022 25
Graphic Illustration
• f(n) = 2n+6 f(n) = 2n + 6
• Conf. def:
– Need to find a
function g(n) and c g (n ) = 4 n
a const. c such as
f(n) < cg(n)
• g(n) = n and c = 4
• f(n) is O(n)
g (n ) = n
• The order of f(n)
is n n
10/29/2022 26
More examples
• What about f(n) = 4n2 ? Is it O(n)?
– Find a c such that 4n2 < cn for any n > n0
• 50n3 + 20n + 4 is O(n3)
– Would be correct to say is O(n3+n)
• Not useful, as n3 exceeds by far n, for large values
– Would be correct to say is O(n5)
• OK, but g(n) should be as closed as possible to f(n)
• 3log(n) + log (log (n)) = O( ? )
•Simple Rule: Drop lower order terms and constant factors
10/29/2022 27
Asymptotic analysis - terminology
• Special classes of algorithms:
logarithmic: O(log n)
linear: O(n)
quadratic: O(n2)
polynomial: O(nk), k ≥ 1
exponential: O(an), n > 1
• Polynomial vs. exponential ?
• Logarithmic vs. polynomial ?
10/29/2022 28
Some Numbers
log n n n log n n2 n3 2n
0 1 0 1 1 2
1 2 2 4 8 4
2 4 8 16 64 16
3 8 24 64 512 256
4 16 64 256 4096 65536
5 32 160 1024 32768 4294967296
10/29/2022 29
Example
Remember the algorithm for computing prefix averages
- compute an array A starting with an array X
- every element A[i] is the average of all elements X[j] with j < I
Remember some pseudo-code … Solution 1
Algorithm prefixAverages1(X):
Input: An n-element array X of numbers.
Output: An n -element array A of numbers such that A[i] is the average of
elements X[0], ... , X[i].
Let A be an array of n numbers.
for i 0 to n - 1 do
a0
for j 0 to i do
a a + X[j]
Analyze this
A[i] a/(i+ 1)
return array A
10/29/2022 30
Example (cont’d)
Algorithm prefixAverages2(X):
Input: An n-element array X of numbers.
Output: An n -element array A of numbers such that A[i] is the
average of elements X[0], ... , X[i].
Let A be an array of n numbers.
s 0
for i 0 to n do
s s + X[i]
A[i] s/(i+ 1)
return array A
10/29/2022 31
Back to the original question
• Which solution would you choose?
– O(n2) vs. O(n)
• Some math …
– properties of logarithms:
logb(xy) = logbx + logby logb (x/y) = logbx - logby
Logbx a = alogbx logba= logxa/logxb
– properties of exponentials:
a(b+c) = aba c abc = (ab)c ab /ac = a(b-c)
b = a logab bc = a c*logab
10/29/2022 32
Mathematical Background
• Is it always possible to have definite results?
NO !
The running times of algorithms can change
because of the platform, the properties of the
computer, etc.
We use asymptotic notations (O, Ω, θ, o)
• compare relative growth
• compare only algorithms
33
Big Oh Notation (O)
Provides an “upper bound” for the function f
• Definition :
T(N) = O (f(N)) if there are positive constants c and n0
such that
T(N) ≤ cf(N) when N ≥ n0
– T(N) grows no faster than f(N)
– growth rate of T(N) is less than or equal to growth rate of
f(N) for large N
– f(N) is an upper bound on T(N)
• not fully correct !
34
Big Oh Notation (O)
• Analysis of Algorithm A
TA (N) = 1000 N = O(N)
1000 N ≤ cN N n0
if c= 2000 and n0 = 1 for all N
TA (N) = 1000 N = O(N) is right
35
Examples
• 7n+5 = O(n)
for c=8 and n0 =5
7n+5 ≤ 8n n>5 = n0
• 7n+5 = O(n2)
for c=7 and n0=2
7n+5 ≤ 7n2 n≥n0
• 7n2+3n = O(n) ?
36
Advantages of O Notation
• It is possible to compare of two algorithms
with running times
• Constants can be ignored.
– Units are not important
O(7n2) = O(n2)
• Lower order terms are ignored
– O(n3+7n2+3) = O(n3)
37
Running Times of Algorithm A and B
TA(N) = 1000 N = O(N)
TB(N) = N2 = O(N2)
A is asymptotically faster than B !
38
Properties of Big-Oh
• If f(n) is O(g(n)) then af(n) is O(g(n)) for any a.
• If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n)+h(n) is O(g(n)+g’(n))
• If f(n) is O(g(n)) and h(n) is O(g’(n)) then f(n)h(n) is O(g(n)g’(n))
• If f(n) is O(g(n)) and g(n) is O(h(n)) then f(n) is O(h(n))
• If f(n) is a polynomial of degree d , then f(n) is O(nd)
• nx = O(an), for any fixed x > 0 and a > 1
– An algorithm of order n to a certain power is better than an algorithm of order a ( >
1) to the power of n
• log nx is O(log n), fox x > 0 – how?
• log x n is O(ny) for x > 0 and y > 0
– An algorithm of order log n (to a certain power) is better than an algorithm of n
raised to a power y.
10/29/2022 39
Omega Notation (Ω)
• Definition :
T(N) = Ω (f(N)) if there are positive constants c and n0
such that T(N) ≥ c f(N) when N≥ n0
– T(N) grows no slower than f(N)
– growth rate of T(N) is greater than or equal to growth rate
of f(N) for large N
– f(N) is a lower bound on T(N)
• not fully correct !
40
Omega Notation
Example:
• n 1/2 = W (lg n) .
for c = 1 and n0 = 16
Let n > 16
c*(lg n) ≤ n 1/2
41
Omega Notation
• Theorem:
f(N) = O(g(n)) <=> g(n) = Ω(f(N))
Proof:
f(N) ≤ c1g(n) <=> g(n) ≥ c2f(N)
divide the left side with c1
1/c1f(N) ≤ g(n) <=> g(n) ≥ c2f(N)
if we choose c2 as 1/c1 then theorem is right.
42
Omega Notation
• 7n2 + 3n + 5 = O(n4)
• 7n2 + 3n + 5 = O(n3)
• 7n2 + 3n + 5 = O(n2)
• 7n2 + 3n + 5 = Ω(n2)
• 7n2 + 3n + 5 = Ω(n)
• 7n2 + 3n + 5 = Ω(1)
n2 and 7n2 + 3n + 5 grows at the same rate
7n2 + 3n + 5 = O(n2) = Ω(n2) = θ (n2)
43
Theta Notation (θ)
• Definition :
T(N) = θ (h(N)) if and only if
T(N) = O(h(N)) and T(N) = Ω(h(N))
– T(N) grows as fast as h(N)
– growth rate of T(N) and h(N) are equal for large N
– h(N) is a tight bound on T(N)
• not fully correct !
44
Theta Notation (θ)
• Example :
T(N) = 3N2
T(N) = O(N4)
T(N) = O(N3)
T(N) = θ(N2) best
45
Little O Notation (o)
• Definition :
T(N) = o(p(N)) if
T(N) = O(p(N)) and T(N)≠θ(p(N))
– f(N) grows strictly faster than T(N)
– growth rate of T(N) is less than the growth rate of
f(N) for large N
– f(N) is an upperbound on T(N) (but not tight)
• not fully correct !
46
Little O Notation (o)
• Example :
T(N) = 3N2
T(N) = o(N4)
T(N) = o(N3)
T(N) = θ(N2)
47
RULES
• RULE 1:
if T1(N) = O(f(N)) and T2(N) = O(g(N)) then
a) T1(N) + T2(N) = max (O(f(N)), O(g(N)))
b) T1(N) * T2(N) = O(f(N) * g(N))
You can prove these ?
Is it true for θ notation ?
What about Ω notation?
48
RULES
• RULE 2:
if T(N) is a polynomial of degree k
T(N) = akNk + ak-1Nk-1 + … + a1N + a0
then
T(N) = θ(Nk)
49
RULES
• RULE 3:
logk N = o(N) for any constant k
logarithm grows very slowly !
50
Some Common Functions
• c = o (log N) => c=O(log N) but c≠Ω(logN)
• log N = o(log2 N)
• log2 N = o(N)
• N = o(N log N)
• N = o (N2)
• N2 = o (N3)
• N3 = o (2N)
51
• Example :
T(N) = 4N2
• T(N) = O(2N2)
correct but bad style
T(N) = O(N2)
drop the constants
• T(N) = O(N2+N)
correct but bad style
T(N) = O(N2)
ignore low order terms
52
Another Way to Compute Growth Rates
f (N )
lim = 0 f ( N ) = o( g ( N ))
N g ( N )
= c 0 f ( N ) = ( g ( N ))
= g ( N ) = o( f ( N ))
= oscilate there is no relation
53
• Example :
f(N) = 7N2 g(N) = N2 + N
54
• Example :
f(N) = N logN g(N) = N1.5
compare logN with N0.5
compare log2N with N
compare log2N with o(N)
N logN = o(N1.5)
55
Important Series
N
S ( N ) = 1 2 N = i = N (1 N ) / 2
i =1
• Sum of squares:
N
N ( N 1)(2 N 1) N 3
i =1
i =
2
6
3
for large N
• Sum of exponents: N
N k 1
i =1
i
k
| k 1|
for large N and k -1
• Geometric series:
N
AN 1 1
– Special case when A = 2
i =0
A =
i
A 1
• 20 + 21 + 22 + … + 2N = 2N+1 - 1
10/29/2022 56
Analyzing recursive algorithms
function foo (param A, param B) {
statement 1;
statement 2;
if (termination condition) {
return;
foo(A’, B’);
}
10/29/2022 57
Solving recursive equations by repeated
substitution
T(n) = T(n/2) + c substitute for T(n/2)
= T(n/4) + c + c substitute for T(n/4)
= T(n/8) + c + c + c
= T(n/23) + 3c in more compact form
= …
= T(n/2k) + kc “inductive leap”
T(n) = T(n/2logn) + clogn “choose k = logn”
= T(n/n) + clogn
= T(1) + clogn = b + clogn = θ(logn)
10/29/2022 58
Solving recursive equations by telescoping
T(n) = T(n/2) + c initial equation
T(n/2) = T(n/4) + c so this holds
T(n/4) = T(n/8) + c and this …
T(n/8) = T(n/16) + c and this …
…
T(4) = T(2) + c eventually …
T(2) = T(1) + c and this …
T(n) = T(1) + clogn sum equations, canceling the
terms appearing on both sides
T(n) = θ(logn)
10/29/2022 59