CS3230 • solution: knockout tournament ⇒ n + dlg ne − 2 2.1.
t tournament ⇒ n + dlg ne − 2 2.1. if the resulting graph is connected, M replies 0 • f (n) = O(g(n)) ⇐⇒ g(n) = Ω(f (n))
AY21/22 SEM 2 (i.e. edge does not exist) • f (n) = o(g(n)) ⇐⇒ g(n) = ω(f (n))
github/jovyntls 2.2. else:replies 1 (edge exists) • misc
n
3. after < 2 queries, at least one entry of the adjacency • if f (n) = ω(g(n)), then f (n) = Ω(g(n))
matrix is unqueried. • if f (n) = o(g(n)), then f (n) = O(g(n))
01. COMPUTATIONAL MODELS log log n < log n < (log n)k < nk < kn
• algorithm → a well-defined procedure for finding the 02. ASYMPTOTIC ANALYSIS insertion sort: O(n2 ) with worst case Θ(n2 )
correct solution to the input • algorithm → a finite sequence of well-defined instructions
• correctness 1. bracket system: n − 1 matches to solve a given computational problem
• worst-case correctness → correct on every valid input • word-RAM model → runtime is the total number of
03. ITERATION, RECURSION,
• every non-winner has lost exactly once
• other types of correctness: correct on random input/with 2. then compare the elements that have lost to the largest instructions executed DIVIDE-AND-CONQUER
high probability/approximately correct • the 2nd largest element must have lost to the winner • operators, comparisons, if, return, etc
• efficiency / running time → measures the number of steps • compares dlg ne elements that have lost to the • each instruction operates on a word of data (limited size) Iterative Algorithms
executed by an algorithm as a function of the input size winner using dlg ne − 1 comparisons ⇒ fixed constant amount of time • iterative → loop(s), sequentially processing input elements
(depends on computational model used) • loop invariant implies correctness if
• number input: typically the length of binary representation Sorting Asymptotic Notations • initialisation - true before the first iteration of the loop
• worst-case running time → max number of steps Claim. there is a sorting algorithm that requires upper bound (≤): f (n) = O(g(n)) • maintenance - if true before an iteration, it remains true at
executed when run on an input of size n ≤ n lg n − n + 1 comparisons. if ∃c > 0, n0 > 0 such that ∀n ≥ n0 , the beginning of the next iteration
0 ≤ f (n) ≤ cg(n) • termination - true when the algorithm terminates
adversary argument → Proof. every sorting algorithm must make ≥ lg(n!)
inputs are decided such that they have different solutions comparisons. lower bound (≥): f (n) = Ω(g(n)) examples
1. let set U be the set of all permutations of the set if ∃c > 0, n0 > 0 such that ∀n ≥ n0 ,
Comparison Model 0 ≤ cg(n) ≤ f (n) • insertionSort: with loop variable as j , A[1..J − 1] is sorted.
{1, . . . , n} that the adversary could choose as array A.
• algorithm can compare any two elements in one time unit • selectionSort: with loop variable as j , the array A[1..j − 1]
|U | = n!
(x > y , x < y , x = y ) tight bound: f (n) = Θ(g(n)) is sorted and contains the j − 1 smallest elements of A.
2. for each query "is Ai > Aj ?",
• running time = number of pairwise comparisons made if Uyes = {A ∈ U : Ai > Aj } is of size ≥ |U |/2, set if ∃c1 , c2 , n0 > 0 such that ∀n ≥ n0 , • Misra-Gries algorithm (determines which bit occurs more in
• array can be manipulated at no cost 0 ≤ c1 g(n) ≤ f (n) ≤ c2 g(n) an n-bit array A):
U := Uyes . else: U := U \Uyes
• if there is an equal number of 0’s and 1’s, then id = ⊥
3. the size of U decreases by at most half with each
o-notation (<): f (n) = o(g(n)) and count = 0
Decision Tree comparison
if ∀c > 0, ∃n0 > 0 such that ∀n ≥ n0 , • if z ∈ {0, 1} is the majority element, then id = z and
4. with < lg(n!) comparisons, U will still contain at least 2
• each node is a comparison 0 ≤ f (n) < cg(n) count equals the difference between the count of the bits.
permutations
• each branch is an outcome of the comparison
• each leaf is a class label (decision after all comparisons) n! ≥ ( n e
)n ω -notation (>): f (n) = ω(g(n)) Divide-and-Conquer
• worst-case runtime = height of tree ⇒ lg(n!) ≥ n lg( n
e
) = − n lg e
n lg n if ∀c > 0, ∃n0 > 0 such that ∀n ≥ n0 ,
• # of leaves = # of permutations ⇒ lg(n!) = Θ(n lg n) ≈ n lg n − 1.44n 0 ≤ cg(n) < f (n) powering a number
• any decision tree that can sort n elements must have height ⇒ roughly n lg n comparisons are required and sufficient for
Ω(n lg n). sorting n numbers Limits problem: compute f (n, m) = an (mod m) for all n, m ∈ Z
Assume f (n), g(n) > 0. • observation: f (x + y, m) = f (x, m) ∗ f (y, m) (mod m)
String Model
Max Problem • naive solution: recursively compute and combine
input string of n bits f (n)
lim =0 ⇒ f (n) = o(g(n)) f (n − 1, m) ∗ f (1, m) (mod m)
each query find out one bit of the string n→∞ g(n)
problem: find largest element in array A of n distinct elements • T (n) = T (n − 1) + T (1) + Θ(1) ⇒ T (n) = Θ(n)
• n queries are necessary and sufficient to check if the input f (n) • better solution: divide and conquer
string is all 0s. lim <∞ ⇒ f (n) = O(g(n))
n→∞g(n) • divide: trivial
Proof. n − 1 comparisons are needed
• query complexity → number of bits of the input string • conquer: recursively compute f (bn/2c, m)
fix an algorithm M that solves the Max problem on all f (n)
queried by the algorithm 0 < lim <∞ ⇒ f (n) = Θ(g(n)) • combine:
inputs using < n − 1 comparisons. construct graph G n→∞ g(n)
• evasive → a problem requiring n query complexity • f (n, m) = f (bn/2c, m)2 (mod m) if n is even
where nodes i and j are adjacent iff M compares i & j . f (n) • f (n, m) = f (1, m) ∗ f (bn/2c, m)2 (mod m) if odd
Graph Model lim >0 ⇒ f (n) = Ω(g(n))
n→∞ g(n) • T (n) = T (n/2) + Θ(1) ⇒ Θ(log n)
input (symmetric) adjacency matrix of an n-node f (n)
lim =∞ ⇒ f (n) = ω(g(n)) Solving Recurrences
undirected graph n→∞ g(n)
each query find out if an edge is present between two Proof. using delta epsilon definition for a sub-problems of size n
b
where f (n) is the time to divide
chosen nodes (one entry of G) and combine,
n
• evasive → requires 2 queries Properties of Big O T (n) = aT ( nb ) + f (n)
M cannot differentiate A and A0 . • Proof. determining whether the graph is connected is evasive Θ(g(n)) = O(g(n)) ∩ Ω(g(n))
n
(requires 2 queries) • transitivity - applies for O, Θ, Ω, o, ω Recursion tree
n
Second Largest Problem 1. suppose M is an algorithm making ≤ 2 queries. f (n) = O(g(n)) ∧ g(n) = O(h(n)) ⇒ f (n) = O(h(n))
total = height × number of leaves
2. whenever M makes a query, the algorithm tries not • reflexivity - for O, Ω, Θ, f (n) = O(f (n))
problem: find the second largest element in < 2n − 3 adding this edge, but adding all remaining unqueried • symmetry - f (n) = Θ(g(n)) ⇐⇒ g(n) = Θ(f (n)) • each node represents the cost of a single subproblem
comparisons (2x Maximum ⇒ (n−1)+((n−1)−1)=2n−3 ) edges. • complementarity - • height of the tree = longest path from root to leaf
lg
Pn n2 • reliable (as n ↑, chances of deviation from avg case ↓) Linearity of Expectation
= lg n−k
k=1 • issues with quicksort
For any two events X, Y and a constant a,
n2 lg lg n by approx. of harmonic series ( k1 ) • distribution-sensitive → time taken depends on the
P
=
E[X + Y ] = E[X] + E[Y ]
initial (input) permutation
Proof. T (n) = 4T (n/2) + n ⇒ O(n2 ) E[aX] = aE[X]
To show that for all n ≥ n0 , T (n) ≤ c1 n2 − c2 n
Randomised Algorithms
Coupon Collector Problem
• randomised algorithms → output and running time are
1. Set c1 = q + 1, c2 = 1, n0 = 1.
functions of the input and random bits chosen n types of coupon are put into a box and randomly drawn with
2. Base case (n = 1): subbing into c1 n2 − c2 n, • vs non-randomised: output & running time are functions of replacement. What is the expected number of draws needed to
T (1) = q ≤ (q + 1)(1)2 − (1)(1) the input only collect at least one of each type of coupon?
3. Recursive case (n > 1): • expected running time = worst-case running time = • let Ti be the time to collect the i-th coupon after the i − 1
• by strong induction, assume T (k) ≤ c1 · k2 − c2 · k E(n) = max E[Runtime of RandAlg on x] coupon has been collected.
input x of size n (n−(i−1))
for all n > k ≥ 1 • randomised quicksort: choose pivot at random • Probability of collecting a new coupon, pi = n
• T (n) = 4T (n/2) + n • probability that the runtime of randomised quicksort • Ti has a geometric distribution
= 4(c1 (n/2)2 − c2 (n/2)) + n x • E[Ti ] = 1/pi
exceeds average by x% = n− 100 ln ln n
= c1 n2 − 2c2 n + n n
• P(time takes at least double of the average) = 10−15
P
• total number of draws, T = Ti
= c1 n2 − c2 n + (1 − c2 )n i=1
• distribution insensitive
Master method = c1 n2 − c2 n since c2 = 1 ⇒ 1 − c2 = 0 n
P n
P
• E[T ] = E[ Ti ] = E[Ti ] by linearity of expectation
a ≥ 1, b > 1, and f is asymptotically positive Randomised Quicksort Analysis
i=1 i=1
T (n) = aT ( nb ) + f (n) = T (n) = n − 1 + T (q − 1) + T (n − q) n
P n Pn
1
log a 04. AVERAGE-CASE ANALYSIS & = =n· = Θ(n lg n)
Θ(n b )
if f (n) < nlogb a polynomially Let A(n) = E[T (n)] where the expectation is over the i=1
n−(i−1)
i=1
i
Θ(nlogb a log n) if f (n) = nlogb a RANDOMISED ALGORITHMS randomness in expectation.
Taking expectations and
P applying linearity of expectation:
05. HASHING
if f (n) > nlogb a polynomially
Θ(f (n)) • average case A(n) → expected running time when the 1 n
input is chosen uniformly at random from the set of all n! A(n) = n − 1 + n q=1 (A(q − 1) + A(n − q))
Dictionary ADT
three common cases n−1
permutations P 2 P
1 =n−1+ A(q) • different types:
1. If f (n) = O(nlogb a− ) for some constant > 0, • A(n) = n! π Q(π) where Q(π) is the time
n
q=1 • static - fixed set of inserted items; only care about queries
• f (n) grows polynomially slower than nlogb a by n complexity when the input is permutation π . A(n) = n log n ⇒ same as average case quicksort • insertion-only - only insertions and queries
factor. • A(n) = E [Runtime of Alg on x] • dynamic - insertions, deletions, queries
x∼Dn
• then T (n) = Θ(nlogb a ). Randomised Quickselect
• implementations
• Ex∼Dn is a probability distribution on U restricted to
2. If f (n) = Θ(nlogb a logk n) for some k ≥ 0, • O(n) to find the kth smallest element • sorted list (static) - O(log N ) query
inputs of size n.
• f (n) and nlogb a grow at similar rates. • randomisation: unlikely to keep getting a bad split • balanced search tree (dynamic) - O(log N ) all operations
• then T (n) = Θ(nlogb a log n) Quicksort Analysis • direct access table
3. If f (n) = Ω(nlogb a+ ) for some constant > 0, Types of Randomised Algorithms
• divide & conquer, linear-time Θ(n) partitioning subroutine • × needs items to be represented as non-negative
• and f (n) satisfies the regularity condition • randomised Las Vegas algorithms integers (prehashing)
• assume we select the first array element as pivot
• af (n/b) ≤ cf (n) for some constant c < 1 and • output is always correct • × huge space requirement
• T (n) = T (j) + T (n − j − 1) + Θ(n)
all sufficiently large n • runtime is a random variable • using H for dictionaries: need to store both the hash table
• if the pivot produces subarrays of size j and (n − j − 1)
• this guarantees that the sum of subproblems is • e.g. randomised quicksort, randomised quickselect and the matrix A.
• worst-case: T (n) = T (0) + T (n − 1) + Θ(n) ⇒ Θ(n2 )
smaller than f (n). • randomised Monte Carlo algorithms • additional storage overhead = Θ(log N · log |U |), if
• f (n) grows polynomially faster than nlogb a by n factor Proof. for quicksort, A(n) = O(n log n) • output may be incorrect with some small probability M = Θ(N )
• then T (n) = Θ(f (n)). let P (i) be the set of all those permutations of elements • runtime is deterministic • other universal hashing constructions may have more
{e1 , e2 , . . . , en } that begins with ei . examples efficient hash function evaluation
Substitution method • smallest enclosing circle: given n points in a plane, compute
Let G(n, i) be the average running time of quicksort • associative array - has both key and value (dictionary in this
1. guess that T (n) = O(f (n)). the smallest radius circle that encloses all n points
over P (i). Then context has only key)
2. verify by induction: • best deterministic algorithm: O(n), but complex
2.1. to show that for n ≥ n0 , T (n) ≤ c · f (n) G(n) = A(i
1 P− 1) + A(n − i) + (n − 1).
n • Las Vegas: average O(n), simple solution Hashing
2.2. set c = max{2, q} and n0 = 1 A(n) = n i=1 G(n, i)
1 Pn
• minimum cut: given a connected graph G with n vertices • hash function, h : U → {1, . . . , M } gives the location of
2.3. verify base case(s): T (n0 ) = q = n i=1 (A(i − 1) + A(n − i) + (n − 1)) and m edges, compute the smallest set of edges whose
2 Pn where to store in the hash table
2.4. recursive case (n > n0 ): = n i=1 A(i − 1) + n − 1 removal would disconnect G. • notation: [M ] = {1, . . . , M }[M ] = {1, . . . , M }
• by strong induction, assume T (k) ≤ c · f (k) for = O(n log n) by taking it as area under integration • best deterministic algorithm: O(mn) • storing N items in hash table of size M
n > k ≥ n0 • Monte Carlo: O(m log n), error probability n−c for any c • collision → for two different keys x and y , h(x) = h(y)
• T(n) = <recurrence> ... ≤ c · f (n) quicksort vs mergesort • primality testing: determine if an n bit integer is prime • resolve by chaining, open addressing, etc
2.5. hence T (n) = O(f (n)). average best worst • best deterministic algorithm: O(n6 ) • desired properties
! may not be a tight bound! quicksort 1.39n lg n n lg n n(n − 1) • Monte Carlo: O(kn2 ), error probability 2−k for any k • X minimise collisions - query(x) and delete(x) take
example mergesort n lg n n lg n n lg n time Θ(|h(x)|)
Geometric Distribution
• X minimise storage space - aim to have M = O(N )
Proof. T (n) = 4T (n/2) + n2 / lg n ⇒ Θ(n2 lg lg n) Let X be the number of trials repeated until success.
• disadvantages of mergesort: • X function h is easy to compute (assume constant time)
n2 X is a random variable and follows a geometric distribution
T (n) = 4T (n/2) + lg n • overhead of temporary storage • if |U | ≥ (N − 1)M + 1, for any h : U → [M ], there is a
with probability p.
(n/2)2 n2 • cache misses set of N elements having the same hash value.
= 4(4T (n/4) + lg n−lg2 ) + lg n
1
Expected number of trials, E[X] = p
• advantages of quicksort • Proof : pigeonhole principle
n2 n2
= 16T (n/4) + lg n−lg 2
+ lg n
• in place P r[X = k] = q k−1 p • use randomisation to overcome the adversary
• e.g. randomly choose between two deterministic hash Special case - Suppose z is 1 at the i-th coordinate but 06. FINGERPRINTING & STREAMING Using Hash Table
functions h1 and h2 0 everywhere else. Then Az is the i-th column of A.
fi ≤ E[fˆi ] ≤ fi + M/k
⇒ for any pair of keys, with probability ≥ 21 , there will be Since the i-th column is uniformly random, String Pattern Matching
no collision P (Az = 0) = 21m = M 1
. • increment/decrement A[h(j)] on an empty table A of size k
problem: does the pattern string P occur as a substring of the • collision ⇒ false
General case - Suppose z is 1 at the i-th coordinate. text string T ?
P positives ⇒ may give overestimate of fi
Universal Hashing • A[h(i)] = j:h(j)=h(i) fj ≥ fi
Let z = [z1 z2 . . . zu ]T . A = [A1 A2 . . . Au ]
Suppose H is a set of hash functions mapping U to [M ]. m = length of P , n = length of T , ` = size of alphabet • if h is drawn from a universal family,
hence Ak is the k-th column of A.
|h∈H:h(x)=h(y)| 1 • assumption: operations on strings of length O(log n) can be overestimate, E[A[h(i)] − fi ] ≤ M/k
H is universal if ∀ x 6= y , ≤ Then Az = z1 A1 + z2 A2 + · · · + zu Au .
|H| M
executed in O(1) time. (word-RAM model) • space: O( 1 · lg M + lg U · lg M )
or P r [h(x) = h(y)] ≤ M1 Az = 0 ⇒ z1 A1 = −(z2 A2 + · · · + zu Au ) (∗)
h∼H We fix z1 A1 to be an arbitrary m × 1 matrix of 1s and • naive solution: Θ(n2 ) let k = 1 for some > 0.
• aka: for any x 6= y , if h is chosen uniformly at random from 0s. The probability that (∗) holds is 21m . • number of rows = O( 1 )
1
a universal H, then there is at most M probability that Fingerprinting approach (Karp-Rabin) • size of each row = O(lg M )
h(x) = h(y) Perfect Hashing • faster string equality check:
• size of hash function (using universal hash family from
• probability where h is sampled uniformly from H ch.05) = O(lg U · lg M )
static case - N fixed items in the dictionary x1 , x2 , . . . , xN • for substring X , check h(X) == h(P ) for a hash
• aka: for any x 6= y , the fraction of hash functions with • Count-Min Sketch → gives a bound on the probability that
1 To perform Query in O(1) worst-case time. function h ⇒ Θ(1) + cost of hashing instead of Θ(|X|)
collisions is at most M . • Rolling Hash: O(m + n) fˆi deviates from fi instead of a bound on the expectation of
Properties of universal hashing Quadratic Space: M = N 2 • update the hash from what we already have from the the gap
previous hash - O(1)
Collision Analysis if H is universal and M = N 2 , and h is sampled uniformly 07. AMORTIZED ANALYSIS
• compute n − m + 1 hashes in O(n) time
• for any N elements x1 , . . . , xN ∈ U , the expected from H, then the expected number of collisions is < 1.
• Monte Carlo algorithm • amortized analysis → guarantees the average
number of collisions between xN and other elements is
Proof. for i 6= j , let indicator r.v. Aij be equal to 1 if performance of each operation in the worst case.
< N/M .
h(xi ) = h(xj ), or 0 otherwise. Division Hash • total amortized cost provides an upper bound on the total
• it follows that for K operations, the expected cost of the
true cost
last operation is < K/M = O(1) if M > K . By universality, E[Aij ] = P (Aij = 1) ≤ 1/N 2 Choose a random prime number p in the range {1, . . . , K}.
• For a sequence of n operations o1 , o2 , . . . , on ,
E[# collisions] =
P
E[Aij ] ≤ N 1
<1 For integer x, hp (x) = x (mod p)
Proof. by definition of Universal Hashing, each element 2 N2 • let t(i) be the time complexity of the i-th operation oi
1 i<j • if p is small and x is b-bits long in binary, hashing ⇒ O(b)
x1 , . . . , xN −1 ∈ U has at most M probability of • let f (n) be the worst-case time complexity for any of the
collision with xN (over random choice of h). • hash family {hp } is approximately universal n operations
It follows that there exists h ∈ H causing no collisions
1 . expected
by indicator r.v., E[Ai ]=P (Ai =1)≤ M (because if not, E[#collisions] would be ≥ 1). • if 0 ≤ x < y < 2b , then P r [hp (x) = hp (y)] < b ln K
K
• let T (n) be the time complexity of all n operations
h Pn
1 N T (n) = t(i) = nf (n)
number of collisions = (N − 1) · M < M . i=1
2-Level Scheme: M = N Proof. hp (x) = hp (y) when y − x = 0 (mod p).
• if x1 , . . . , xN are added to the hash table, and M > N , the Types of Amortized Analysis
expected number of pairs (i, j) with collisions is < 2N . • No collision and less space needed Let z = y − x.
Proof. let Aij be an indicator r.v. for collision.
Construction Since z < 2b , then z can have at most b distinct Aggregate method
N
Choose h : U → [N ] from a universal hash family. prime factors. • look at the whole sequence, sum up the cost of operations
P P P
E[ Aij ] = E[Aii ] + E[Aij ] • Let Lk be the number of xi ’s for which h(xi ) = k. p divides z if p is one of these ≤ b prime factors. and take the average - simpler but less precise
1≤i,j≤N i=1 i6=j • Choose h1 , . . . , hN second-level hash functions • e.g. binary counter - amortized O(1)
≤ N · 1 + N (N − 1) · 1
< 2N hk : [N ] → [(Lk )2 ] s.t. there are no collisions among the number of primes in range {1, . . . , K} is > lnKK ,
M • e.g. queues (with INSERT and EMPTY ) - amortized O(1)
Lk elements mapped to k by h. hence the probability is b/ lnKK = b lnK
K
Expected Cost
• for any sequence of N operations, if M > N , then the • quadratic second-level table → ensures no collisions Accounting method
values of K
expected total cost for executing the sequence is O(N ). using quadratic space • charge the i-th operation a fictitious amortized cost c(i)
• higher K = lower probability of false positive
Analysis
1 • amortized cost c(i) is a fixed cost for each operation
Proof. linearity of expectation; sum up expected costs • for δ = 100n , P(false positive) < 1%.
if H is universal and his sampled • true cost t(i) depends on when the operation is called
uniformly from H, then
Construction of Universal Family ∀δ > 0, if X 6= Y and K = 2m · lg ` · lg( 2m lg `), then • amortized cost c(i) must satisfy:
L2k < 2N
P
E δ δ
Obtain a universal family of hash functions with M = O(N ). k P r[h(X) = h(Y )] < δ Pn
t(i) ≤
Pn
c(i) for all n
i=1 i=1
• Suppose U is indexed by u-bit strings and M = 2m . • take the extra amount for cheap operations early on as
Proof. For i, j ∈ [1, N ], define indicator r.v. Aij = 1 if Streaming
• For any m × u binary matrix A, hA (x) = Ax (mod 2) "credit" paid in advance for expensive operations
h(xi ) = h(xj ), or 0 otherwise.
• each element x => x % 2 • invariant: bank balance never drops below 0
Aij = P collisions = # pairs * 2 =
# possibleP L2k problem: Consider a sequence of insertions or deletions of
• x is a u × 1 matrix ⇒ Ax is m × 1
L2k = Aij items from a large universe U . At the end of the stream, the • the total amortized cost provides an upper bound on the
• Claim: {hA : A ∈ {0, 1}m×u } is universal Hence
k i,j frequency fi of item i is its net count. total true cost
• e.g. U = {00, 01, 10, 11}, M = 2 P P P
• hab means A = [a b] E[ Aij ] = E[Aii ] + E[Aij ] Let M be the sum of all frequencies at the end of stream. Potential method
i,j i i6=j
00 01 10 11 1
naive solutions • φ : potential function associated with the algo/DS
≤ N · 1 + N (N − 1) ·
h00 0 0 0 0 N • direct access table - Ω(U ) space • φ(i): potential at the end of the i-th operation
< 2N
h01 0 1 0 1 • sorted list - Ω(M ) space, no O(1) update • ci : amortized cost of the i-th operation
h10 0 0 1 1 • binary search tree - O(M ) space • ti : true cost of the i-th operation
Hash Table Resizing
h11 0 1 1 0
• when number of inserted items, N is not known
Frequency Estimation Pn ci = ti + φ(i) − φ(i −P
1)
n
Proof. Let x 6= y . Let z = x − y . We know z 6= 0. • rehashing - choose a new hash function of a larger size i=1 ci = φ(n) − φ(0) + i=1 ti
Collision: P (Ax=Ay)=P [A(x−y)=0]=P (Az=0). and re-hash all elements an approximation fˆi is -approximate if • hence as long as φ(n) ≥ 0, then amortized cost is an upper
1
To show P (Az = 0) ≤ M . • costly but infrequent ⇒ amortize fi − M ≤ fˆi ≤ fi + M bound of the true cost.
Pn Pn
i=1 ci ≥ i=1 ti brute force solution Changing Coins Binary Coding
• usually take φ(0) = 0 • check all possible subsequences of A to see if it is also a problem: use the fewest number of coins to make up n cents
• e.g. for queue: subsequence of B , then output the longest one. using denominations d1 , d2 , . . . , dn . Let M [j] be the fewest Given an alphabet set A : {a1 , a2 , . . . , an } and a text file F
• let φ(i) = # of elements in queue after the i-th operation • analysis: O(m2n ) number of coins needed to change j cents. (sequence of alphabets), how many bits are needed to encode
• amortized cost for insert: • checking each subsequence takes O(m) • optimal substructure:
a text file with m characters?
ci = ti + φ(i) − φ(i − 1) = 1 + 1 = 2 • 2n possible subsequences 1 + min M [j − di ], j > 0
• fixed length encoding: m · dlog2 ne
i∈[k]
• amortized cost for empty (for k elements):
recursive solution • M [j] = 0, j=0 • encode each alphabet to unique binary string of length
ci = ti + φ(i) − φ(i − 1) = k + 0 − k = 0
• try to keep c(i) small: using c(i) = t(i) + ∆φi let LCS(i, j): longest common subsequence of A[1..i] and
∞, j<0 dlog2 ne
B[1..j] • total bits needed for m characters = m · dlog2 ne
• if t(i) is small, we want ∆φi to be positive and small Proof. Suppose M [j] = t, meaning
• base case: LCS(i, 0) = ∅ for all i, LCS(0, j) = ∅ for all j • variable length encoding
• if t(i) is large, we want ∆φi to be negative and large j = di1 + di2 + · · · + dit for some
• general case: • different characters occur with different frequency ⇒ use
e.g. Dynamic Table (insertion only) i1 , . . . , it ∈ {1, . . . , k}. fewer bits for more frequent alphabets
• if last characters of A, B are an = bm , then
Then, if j 0 = di1 + di2 + · · · + dit −1 ,
P
Aggregate method LCS(n, m) must terminate with an = bm • average bit length, ABL(γ) = f (x) · |γ(x)|
M [j 0 ] = t − 1, because otherwise if M [j 0 ] < t − 1, x∈A
cost of nP
insertions = • the optimal solution will match an with bm • BUT overlapping prefixes cause indistinguishable
blog(n−1)c
by cut-and-paste argument, M [j] < t.
Pn
t(i) ≤ n + 2j ≤ 3n • if an 6= bm , then either an or bm is not the last symbol characters
i=1 j=1 • runtime: O(nk) for n cents, k denominations
• optimal substructure: (general case)
• if an = bm ,
LCS(n, m) = LCS(n − 1, m − 1) :: an 09. GREEDY ALGORITHMS
Prefix coding
• if an 6= bm , • solve only one subproblem at each step
LCS(n, m) = LCS(n − 1, m) || LCS(n, m − 1) • beats DP and divide-and-conquer when it works • a coding γ(A) is a prefix coding if 6 ∃x, y ∈ A such that
• simplified problem: • greedy-choice property → a locally optimal choice is γ(x) is a prefix of γ(y).
Accounting method • L(n, m) = 0 if n = 0 or m = 0 globally optimal • labelled binary tree: γ(A) = label of path from root
• charge $3 per insertion • if an = bm , then L(n, m) = L(n − 1, m − 1) + 1 Examples
• $1 for insertion itself • if an 6= bm , then
• $1 for moving itself when the table expands L(n, m) = max(L(n, m − 1), L(n − 1, m)) Fractional Knapsack
• $1 for moving one of the existing items when the table analysis • O(n log n)
expands • number of distinct subproblems = (n + 1) × (m + 1) • greedy-choice property: let j ∗ be the item with maximum
Potential method • to use O(min{m, n}) space: bottom-up approach, value/kg, vj /wi . Then there exists an optimal knapsack
let φ(i) = 2i − size(T ) column by column containing min(wj ∗ , W ) kg of item j ∗ .
• memoize for DP ⇒ makes it O(mn) instead of • optimal substructure: if we remove w kg of item j from the
exponential time optimal knapsack, then the remaining load must be the
optimal knapsack weighing at most W − w kgs that one can
Knapsack Problem • for each prefix code A of n alphabets, there exists a binary
take from n − 1 original items and wj − w kg of item j .
• input: (w1 , v1 ), (w2 , v2 ), . . . , (wn , vn ) and capacity tree T on n leaves such that there is a bijective mapping
P W Proof. cut-and-paste argument
P S ⊆ {1, 2, . . . , n} that maximises i∈S vi
• output: subset between the alphabets
P and the leaves
P
Suppose the remaining load after removing w kgs of • ABL(γ) = f (x) · |γ(x)| = f (x) · |depthT (x)|
such that i∈S wi ≤ W
item j was not the optimal knapsack weighing ... x∈A x∈A
• the binary tree corresponding to an optimal prefix coding
Then there is a knapsack of value > X − vj · ww with
j must be a full binary tree.
weight ... • every internal node has degree exactly 2
Combining this knapsack with w kg of item j gives a • multiple possible optimal trees - most optimal depends on
• show that SUM of amortized cost ≥ SUM of actual cost knapsack of value > X ⇒ contradiction! alphabet frequencies
• conclude that sum of amortized cost is O(f (n)) ⇒ sum of
Minimum Spanning Trees • accounting for alphabet frequencies:
actual cost is O(f (n))
• let a1 , a2 , . . . , an be the alphabets of A in
• 2n subsets ⇒ naive algorithm is costly for a connected, undirected graph G = (V, E), find a non-decreasing order of their frequencies.
08. DYNAMIC PROGRAMMING • recursive solution: spanning tree T that connects all vertices with minimum • a1 must be a leaf node; a2 can be a sibling of a1 .
• cut-and-paste proof → proof by contradiction - suppose
P
• let m[i, j] be the maximum value that can be obtained weight. Weight of spanning tree T , w(T ) = w(u, v). • there exists an optimal prefix coding in which a1 and a2
you have an optimal solution. Replacing ("cut") subproblem using a subset of items {1, 2, . . . , i} with total weight no (u,v)∈T
are siblings
solutions with this subproblem solution ("paste" in) should more than j . • optimal substructure: let T be a MST. remove any edge
• derivation of optimal prefix coding: Huffman’s algorithm
• m[i, (u, v) ∈ T . then T is partitioned into T1 , T2 which are
j] =
improve the solution. If the solution doesn’t improve, then it’s
• keep merging the two least frequent items
not optimal (contradiction). MSTs of G1 = (V1 , E1 ) and G2 = (V2 , E2 ).
0,
if i = 0 or j = 0
• overlapping subproblems - recursive solution contains a Proof. cut-and-paste: w(T ) = w(u, v) + w(T1 ) + w(T2 )
max{m[i−1,j−wi ]+vi ,m[i−1,j]}, if wi ≤ j Huffman(C):
small number of distinct subproblems repeated many times
m[i − 1, j], otherwise if w(T10 ) < w(T1 ) for G1 , then Q = new PriorityQueue(C)
T 0 = {(u, v)} ∪ T10 ∪ T2 would be a lower-weight while Q:
Longest Common Subsequence • analysis: O(nW )
spanning tree than T for G. allocate a new node z
• for sequence A : a1 , a2 , . . . , an stored in array • ! O(nW ) is not a polynomial time algorithm
⇒ contradiction, T is the MST z.left = x = extractMin(Q)
• C is a subsequence of A → if we can obtain C by • not polynomial in input bitsize
• W can be represented in O(lg W ) bits z.right = y = extractMin(Q)
removing zero or more elements from A. • Prim’s algorithm - at each step, add the least-weight edge
• n can be represented in O(lg n) bits z.val = x.val + y.val
problem: given two sequences A[1..n] and B[1..m], from the tree to some vertex outside the tree
Q.add(z)
compute the longest sequence C such that C is a • polynomial time is strictly in terms of the number of bits for • Kruskal’s algorithm - at each step, add the least-weight
edge that does not cause a cycle to form return extractMin(Q) // root
subsequence of A and B . the input
10. REDUCTIONS & INTRACTABILITY Polynomial Time Proof. If Vertex-Cover, then Independent-Set.
Reduction • polynomial time → runtime is polynomial in the length of Same as above, but flip IS and VC
Consider two problems A and B , A can be solved as follows: the encoding of the problem instance
1. convert instance α of A to an instance of β in B • "standard" encodings e.g. Set-Cover
2. solve β to obtain a solution • binary encoding of integers
3. based on the solution of β , obtain the solution of α. • list of parameters enclosed in braces (graphs/matrices) Given integers k and n, and collection S of subsets of
4. ⇒ then we say A reduces B . • pseudo-polynomial algorithm → runs in time polynomial in {1, . . . , n}, are there ≤ k of these subsets whose union
the numeric value if the input but is exponential in the equals {1, . . . , n}?
length of the input
Claim: Vertex-Cover ≤P Set-Cover
• e.g. DP algo for Knapsack since W is in numeric value
• Knapsack is NOT polynomial time: O(nW log M ) but W is Reduction: given (G, k) instance of Vertex-Cover, generate
not the number of bits an instance (n, k0 , S) of Set-Cover. NP-Hard and NP-Complete
• Fractional Knapsack is polynomial time: • a problem A is said to be NP-Hard if for every problem
instance → another word for input O(n log n log W log M ) B ∈ N P , B ≤P A.
Proof. For each node v in G, construct a set Sv containing all
e.g. Matrix Multiplication & Squaring its outgoing edges. (Number each edge) • aka A is at least as hard as every problem in NP.
Decision Problems • a problem A is said to be NP-Complete if it is in NP and is
• Mat-Multi: matrix multiplication • decision problem → a function that maps an instance also NP-Hard
• input: two N × N matrices A and B . space I to the solution set {Y ES, N O} • aka the hardest problems in NP.
• output: A × B • decision vs optimisation problem: • Cook-Levin Theorem → every problem in NP-Hard can be
• Mat-Sqr: matrix squaring • decision problem: given a directed graph G, is there a poly-time reduced to 3-SAT. Hence, 3-SAT is NP-Hard and
• input: one N × N matrix C . output: C × C path from vertex u to v of length ≤ k? NP-Complete.
• Mat-Sqr can be reduced to Mat-Multi • optimisation problem: given ..., what is the length of the • NP-Complete problems can still be approximated in
• Proof. Given input matrix C for Mat-Sqr, let A = C and shortest path ... ? poly-time! (e.g. greedy algorithm gives a 2-approximation for
B = C be inputs for Mat-Multi. Then AB = C 2 . • convert from decision → optimisation: given an e.g. 3-SAT Vertex-Cover)
• Mat-Multi can also 0beAreduced to Mat-Sqr! instance of the optimisation problem and a number k, is
• SAT: given a CNF formula Φ, does it have a satisfying truth
• Proof. let C = B 0 there a solution with value ≤ k? showing NP-Completeness
assignment?
⇒ C2 = 0 A 0 A
AB 0
B 0 B 0 = 0 BA • the decision problem is no harder than the optimisation 1. show that X is in NP. ⇒ a YES-instance has a certificate
• literal: a boolean variable or its negation x, x̄
problem. that can be verified in polynomial time
T-Sum • clause: a disjunction (OR) of literals
• given the optimal solution, check that it is ≤ k. 2. show that X is NP-hard
• 0-Sum: given array A, output i, j ∈ (1, n) such that • conjunctive normal form (CNF): formula Φ that is a
• if we cannot solve the decision problem quickly ⇒ then • by giving a poly-time reduction from another NP-hard
A[i] + A[j] = 0 conjunction (AND) of clauses
we cannot solve the optimisation problem quickly problem A to X . ⇒ X is at least as hard as A
• T-Sum: given array B , output i, j ∈ (1, n) such that • 3-SAT → SAT where each clause contains exactly 3 literals
• decision ≤P optimisation • reduction should not depend on whether the instance of
B[i] + B[j] = T • 3-SAT ≤P Independent-Set
• Reduction: Construct an instance (G, k) of Indep-Set s.t.
A is a YES- or NO-instance
• reduce T-Sum to 0-Sum: Reductions between Decision Problems
3. show that the reduction is valid
• given array B , define array A s.t. A[i] = B[i] − T /2. given two decision problems A and B , a polynomial-time G has an independent set of size k ⇐⇒ Φ is satisfiable
3.1. reduction runs in poly time
• if i, j satisfy A[i] + A[j] = 0, then B[i] + B[j] = T . reduction from A to B denoted A ≤P B is a transformation • node: each literal term
3.2. if the instance of A is a YES-instance, then the
from instances α of A and β of B such that • edge: connect 3 literals in a clause in a triangle
p(n)-time Reduction instance of X is also a YES-instance
1. α is a YES-instance of A ⇐⇒ β is a YES-instance of B • edge: connect literal to all its negations
3.3. if the instance of A is a NO-instance, then the
• p(n)-time Reduction → if for any instance α of problem A 2. the transformation takes polynomial time in the size of α • reduction runs in polynomial time
instance of X is also a NO-instance
of size n, • ⇒ for k clauses, connecting k vertices form an
• an instance β for B can be constructed in p(n) time independent set in G.
• a solution to problem A for input α can be recovered from
a solution to problem B for input β in time p(n).
• ! n is in bits! Examples
• if there is a p(n)-time reduction from problem A to B and a • Independent-Set: given a graph G = (V, E) and an
T (n)-time algorithm to solve problem B , then there is a integer k, is there a subset of ≤ k vertices such that no 2 are
T (O(p(n))) + O(p(n)) time algorithm to solve A. adjacent?
• Vertex-Cover: given a graph G = (V, E) and an integer
k, is there a subset of ≤ k vertices such that each edge is
11. NP-COMPLETENESS
incident to at least one vertex in this subset? • P → the class of decision problems solvable in showing NP-HARD
• Independent-Set ≤P Vertex-Cover (deterministic) polynomial time 1. take any NP-Complete problem A
• Reduction: to check whether G has an independent set of • NP → the class of decision problems for which 2. show that A ≤P X
size k, we check whether G has vertex cover of size polynomial-time verifiable certificates of YES-instances
n − k. exist.
• aka non-deterministic polynomial
• A ≤P B → if there is a p(n)-time reduction from A to B Proof. If Independent-Set, then Vertex-Cover.
• i.e. no poly-time algo, but verification can be poly-time
for some polynomial function p(n) = O(nc ) for some Suppose (G, k) is a YES-instance of Indep-Set. Then • certificate → result that can be checked in poly-time to
constant c. ("A is a special case of B ") there is subset S of size ≥ k that is an independent set. verify correctness
• if B has a polynomial time algorithm, then so does A V − S is a vertex cover of size ≤ n − k. Proof: Let • P ⊆ N P : any problem in P is in NP.
• "polynomial time" ≈ reasonably efficient (u, v) ∈ E . Then u 6∈ S or v 6∈ S . • if P = N P , then all these algos can be solved in poly
• A ≤P B, B ≤P C ⇒ A ≤P C So either u or v is in V − S , the vertex cover. time
helpful approximations
n
P Qn
stirling’s approximation: T (n) = log(n − i) = log i=0 (n − i) = Θ(n log n)
i=0
n
P 1
harmonic number, Hn = k
= Θ(lg n)
k=1
N
P 1 1 N →∞
basel problem: n2
≤2− N −−−−→ 2
n=1
PN 1 Plog3 n 1 PN 1 1 1 1
because n=1 N 2 ≤ 1 + x=2 (x−1)x =1+ n=2 ( n−1 − n
) =1+1− N
=2− N
number of primes in range {1, . . . , K} is > lnKK
asymptotic bounds
√
1 < log n < n < n < n log n < n2 < n3 < 2n < 22n
loga n < n < an < n! < nn
a
for any a, b > 0, loga n < nb
multiple parameters
for two functions f (m, n) and g(m, n), we say that f (m, n) = O(g(m, n)) if there exists constants c, m0 , n0 such that 0 ≤ f (m, n) ≤ c · g(m, n) for all m ≥ m0 or n ≥ n0 .
set notation
O(g(n)) is actually a set of functions. f (n) = O(g(n)) means f (n) ∈ O(g(n))
• O(g(n)) = {f (n) : ∃c, n0 > 0 | ∀n ≥ n0 , 0 ≤ f (n) ≤ cg(n)}
• Ω(g(n)) = {f (n) : ∃c, n0 > 0 | ∀n ≥ n0 , 0 ≤ cg(n) ≤ f (n)}
• Θ(g(n)) = {f (n) : ∃c1 , c2 , n0 > 0 | ∀n ≥ n0 , 0 ≤ c1 · g(n) ≤ f (n) ≤ c2 · g(n)} = O(g(n)) ∩ Ω(g(n))
• o(g(n)) = {f (n) : ∀c > 0, ∃n0 > 0 | ∀n ≥ n0 , 0 ≤ f (n) < cg(n)}
• ω(g(n)) = {f (n) : ∀c > 0, ∃n0 > 0 | ∀n ≥ n0 , 0 ≤ cg(n) < f (n)}
example proofs
Proof. that 2n2 = O(n3 )
let f (n) = 2n2 . then f (n) = 2n2 ≤ n3 when n ≥ 2.
set c = 1 and n0 = 2.
we have f (n) = 2n2 ≤ c · n3 for n ≥ n0 .
Proof. n = o(n2 )
For any c > 0, use n0 = 2/c.
Proof. n2 − n = ω(n)
For any c > 0, use n0 = 2(c + 1).
Example. let f (n) = n and g(n) = n1+sin(n) .
Because of the oscillating behaviour of the sine function, there is no n0 for which f dominates g or vice versa.
Hence, we cannot compare f and g using asymptotic notation.
Example. let f (n) = n and g(n) = n(2 + sin(n)).
1
Since 3 g(n) ≤ f (n) ≤ g(n) for all n ≥ 0, then f (n) = Θ(g(n)). (note that limit rules will not work here)
mentioned algorithms
• ch.3 - Misra Gries - space-efficient computation of the majority bit in array A
• ch.3 - Euclidean - efficient computation of GCD of two integers
• ch.3 - Tower of Hanoi - T (n) = 2n − 1
1. move the top n − 1 discs from the first to the second peg using the third as temporary storage.
2. move the biggest disc directly to the empty third peg.
3. move the n − 1 discs from the second peg to the third using the first peg for temporary storage.
• ch.3 - MergeSort - T (n) = T (bn/2c) + T (dn/2e) + Θ(n)
• ch.3 - Karatsuba Multiplication - multiply two n-digit numbers x and y in O(nlog2 3 )
• worst-case runtime: T (n) = 3T (dn/2e) + Θ(n)
uncommon notations
• ⊥ - false