Greedy Algorithm
Greedy Algorithm
Greedy Algorithm
Hyoungshick Kim
Department of Software
College of Software
Sungkyunkwan University
Greedy Algorithms
• A greedy algorithm is a simple, intuitive algorithm
strategy that is used in optimization problems
• The algorithm always makes the choice that looks
best at each step
– Greedy algorithms have only one shot to compute the
optimal solution so that they never go back and reverse
the decision.
– Greedy algorithms do not always yield optimal solutions,
but for many problems they do
• The greedy method is quite powerful and works well for
a wide range of problems.
– Activity selection problem
– Minimum spanning tree
– Fractional knapsack problem
– Huffman codes
– Dijkstra’s algorithm for shortest paths from a single source
– Set-cover heuristic
Assume that f1 f2 … fn
Greedy Approach for Activity-
Selection
A A B A
A A B A
Huffman Code Example
Symbol A B C D
Frequency 12.5% 25% 50% 12.5%
Original 00 01 10 11
Encoding 2 bits 2 bits 2 bits 2 bits
Huffman 110 10 0 111
Encoding 3 bits 2 bits 1 bit 3 bits
• Expected size
– Original 1/82 + 1/42 + 1/22 + 1/82 = 2 bits / symbol
– Huffman 1/83 + 1/42 + 1/21 + 1/83 = 1.75 bits / symbol
Huffman Code Data Structures
• Binary (Huffman) tree D A
– Represents Huffman code
– Edge code (0 or 1)
B
– Leaf symbol 1 0
– Path to leaf encoding
– Example C
1 0
• A = “110”, B = “10”, C = “0”
• Priority queue 1 0
– To efficiently build a binary tree
Huffman Code Algorithm Overview
• Encoding
– Calculate frequencies of symbols in file
– Create a binary tree representing “best” encoding
– Use the binary tree to encode a compressed file
• For each symbol, output each path from the root to a
leaf
• Size of encoding = length of the path
– Save the binary tree
Huffman Code – Creating Tree
• Algorithm
– Place each symbol in a leaf
• Weight of leaf = symbol frequency
– Select two trees L and R (initially leaves)
• Such that L and R have lowest frequencies in tree
– Create a new (internal) node
• Left child L
• Right child R
• New frequency frequency( L ) + frequency( R )
– Repeat until all nodes merged into one tree
Huffman Tree Construction 1
A C E H I
3 5 8 2 7
Huffman Tree Construction 2
A H C E I
3 2 5 8 7
5
Huffman Tree Construction 3
A E I
H
8 7
3 2
C
5 5
10
Huffman Tree Construction 4
E I
A H
8 7
3 2
C
15
5 5
10
Huffman Tree Construction 5
A H
I = 00
3 2 E = 01
C E I C = 10
1 0
5 5 8 7 H = 110
1 0 1 0 A = 111
10 15
1 0
25
Huffman Coding Example
I = 00
• Huffman code E = 01
C = 10
H = 110
A = 111
• Input
– ACE
• Output
– (111)(10)(01) = 1111001
Huffman Code Algorithm Overview
• Decoding
– Read a compressed file & binary tree
– Use the binary tree to the decode file
• Follow a path from the root to leaf
Huffman Decoding 1
A H
1111001
3 2
C E I
1 0
5 5 8 7
1 0 1 0
10 15
1 0
25
Huffman Decoding 2
A H
1111001
3 2
C E I
1 0
5 5 8 7
1 0 1 0
10 15
1 0
25
Huffman Decoding 3
A H
1111001
3 2
C E I
1 0
5 5 8 7 A
1 0 1 0
10 15
1 0
25
Huffman Decoding 4
A H
1111001
3 2
C E I
1 0
5 5 8 7 A
1 0 1 0
10 15
1 0
25
Huffman Decoding 5
A H
1111001
3 2
C E I
1 0
5 5 8 7 AC
1 0 1 0
10 15
1 0
25
Huffman Decoding 6
A H
1111001
3 2
C E I
1 0
5 5 8 7 AC
1 0 1 0
10 15
1 0
25
Huffman Decoding 7
A H
1111001
3 2
C E I
1 0
5 5 8 7 ACE
1 0 1 0
10 15
1 0
25
Huffman Code Properties
• Prefix code
– No code is a prefix of another code
– Example
• Huffman(“I”) 00
• Huffman(“X”) 001 // not legal prefix code
– We can stop as soon as the complete code is found
– No need for an end-of-code marker
• Nondeterministic
– Multiple Huffman codes can be generated for the same input
– There can exist more than two trees with the same minimal weight
Huffman Code Properties
• Greedy algorithm
– Chooses the best local solution at each step
– Combines 2 trees with the lowest frequencies
n = 4,
W = 50
45 50
30
20
10
$60 $100 $120 $135 knapsack
($6/pound) ($5/pound) ($4/pound) ($3/pound)
Knapsack Problem:
Greedy Vs. Dynamic
• The fractional problem can be solved greedily
– Greedy strategy: take in the order of
dollars/pound
$220
10 $135
$60 $160 30
($6/pound) 30
45 50 50 50 50
20 45
20 $120
($4/pound) 20
10
$100 $135
($5/pound) ($3/pound) value value optimal
per unit
0-1 Knapsack Problem
$220
$135
Difficult to get the
optimal solution with a
greedy strategy. $160 30
50 50 50 50
Dynamic 20 45
Programming :
20
10
value value optimal
per unit
Greedy Algorithm vs
Dynamic Programming