Principles of Communications
Meixia Tao
Shanghai Jiao Tong University
Chapter 10: Channel Coding
Selected from Chapters 13.1–13.3 of Fundamentals of Communication Systems, Pearson Prentice Hall, 2005, by Proakis & Salehi
Topics to be Covered
[Block diagram of a digital communication system: Source → A/D converter → Source encoder → Channel encoder → Modulator → Channel (noise added) → Detector → Channel decoder → Source decoder → D/A converter → User; the A/D and D/A converters are absent if the source is digital]
This chapter covers:
- Linear block codes
- Convolutional codes
Information Theory and Channel Coding
Shannon's noisy channel coding theorem states that adding controlled redundancy allows transmission at an arbitrarily low bit error rate (BER), as long as R ≤ C
Error control coding (ECC) uses this controlled redundancy to detect and correct errors
The choice of ECC depends on the system requirements and the nature of the channel
The key in ECC is to add redundancy in such a way that the receiver can fully utilize it to detect and correct the errors, and thereby reduce the required transmit power (coding gain)
Example
We want to transmit data over a telephone link using a modem under the following conditions:
- Link bandwidth = 3 kHz
- The modem can operate at speeds up to 3600 bits/sec at a given error probability
Target: transmit the data at a rate of 1200 bits/sec at maximum output SNR = 13 dB, with a smaller probability of error
Solution: Shannon Theorem
Channel capacity is C = B log2(1 + S/N)
Since B = 3000 Hz and S/N = 20 (13 dB = 10 log10 20), C = 3000 log2(21) ≈ 13,000 bits/sec, which exceeds 1200 bits/sec
Thus, by Shannon's theorem, we can transmit the data with an arbitrarily small error probability
Note that without coding, the error probability of the given modem does not meet the target criterion, so coding is needed
Solution: A Simple Code Design
Repetition code: every bit is transmitted 3 times
- when bk = 0 or 1, transmit codeword 000 or 111, respectively
- Based on the received codewords, the decoder extracts the transmitted bits using a majority-logic decoding scheme
- Clearly, the transmitted bits will be recovered correctly as long as no more than one of the bits in each codeword is affected by noise
[Table of Tx bits bk → codewords → Rx bits omitted from the figure]
With this simple error control coding, a decoding error occurs only when 2 or 3 bits of a codeword are corrupted, so the probability of error is
Pe = 3p^2(1 − p) + p^3
where p is the error probability of a single transmitted bit
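To make the scheme concrete, here is a minimal Python sketch of the rate-1/3 repetition encoder and majority-logic decoder (the function names are ours, for illustration only):

```python
import numpy as np

def repetition_encode(bits, n=3):
    """Repeat each information bit n times (rate-1/n repetition code)."""
    return np.repeat(np.asarray(bits), n)

def majority_decode(coded, n=3):
    """Majority-logic decoding: each n-bit codeword decodes to its majority bit."""
    blocks = np.asarray(coded).reshape(-1, n)
    return (blocks.sum(axis=1) > n // 2).astype(int)

tx = repetition_encode([1, 0, 1])        # -> 111 000 111
rx = tx.copy(); rx[4] = 1                # one bit error in the middle codeword
print(majority_decode(rx))               # [1 0 1]: the single error is corrected
```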
Channel Coding
Coding techniques are classified as either block codes or
convolutional codes, depending on the presence or
absence of memory
A block code has no memory
Information sequence is broken into blocks of length k
Each block of k infor. bits is encoded into a block of n coded bits
No memory from one block to another block
A convolutional code has memory
- A shift register of length kL is used
- Information bits enter the shift register k bits at a time; then n coded bits are generated
- These n bits depend not only on the most recent k bits that just entered the shift register, but also on the (L−1)k previous bits
Block Codes
An (n,k) block code is a collection of M = 2k codewords of length n
Each codeword has a block of k information bits followed by a
group of r = n-k check bits that are derived from the k information
bits in the block preceding the check bits
[Figure: the channel encoder maps each k-bit message block into an n-bit codeword]
The code is said to be linear if any linear combination of 2 codewords is also a codeword, i.e. if c1 and c2 are codewords, then c1 + c2 is also a codeword (where the addition is always modulo-2)
Code rate (rate efficiency) = k/n
Matrix description
- Each block code can be generated using a generator matrix G (dimension k × n)
- Given G, the codeword for message m is c = mG
Generator Matrix G
G = [Ik | P]
- Ik is the identity matrix of order k
- P is a matrix of order k × (n − k), which is selected so that the code will have certain desirable properties
Systematic Codes
The form of G implies that the 1st k components of any
codeword are precisely the information symbols
This form of linear encoding is called systematic
encoding
Systematic-form codes allow easy implementation and
quick look-up features for decoding
For linear codes, any code is equivalent to a code in systematic form (with the same performance). Thus we can restrict our study to systematic codes
Example: Hamming Code
A family of (n, k) linear block codes with the following parameters (for any integer m ≥ 3):
- Codeword length: n = 2^m − 1
- Number of message bits: k = 2^m − m − 1
- Number of parity check bits: n − k = m
- Capable of providing single-error correction capability, with dmin = 3
e.g. m = 3 gives the (7,4) Hamming code
(7, 4) Hamming Code
Consider a (7,4) Hamming code with generator matrix
G = [ 1 0 0 0 1 1 0
      0 1 0 0 0 1 1
      0 0 1 0 1 1 1
      0 0 0 1 1 0 1 ]
Find all codewords
Solution
Let m = (m1 m2 m3 m4). Then c = mG = (m1, m2, m3, m4, m1 + m3 + m4, m1 + m2 + m3, m2 + m3 + m4)
List of all Codewords
n = 7, k = 4
There are 2^4 = 16 message blocks, each mapped to one codeword:

Message    Codeword
0000       0000000
0001       0001101
0010       0010111
0011       0011010
0100       0100011
0101       0101110
0110       0110100
0111       0111001
1000       1000110
1001       1001011
1010       1010001
1011       1011100
1100       1100101
1101       1101000
1110       1110010
1111       1111111
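As a sanity check, a short numpy sketch (ours, not from the text) that regenerates this table from G:

```python
import numpy as np
from itertools import product

# generator matrix of the (7,4) Hamming code, G = [I4 | P]
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

# all 16 codewords: c = mG (mod 2)
for m in product([0, 1], repeat=4):
    c = np.mod(np.array(m) @ G, 2)
    print(''.join(map(str, m)), ''.join(map(str, c)))
```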
Parity Check Matrix
For each G, it is possible to find a corresponding parity check matrix H of dimension (n − k) × n
H can be used to verify whether a word c is a codeword generated by G:
Let c = mG be a codeword generated by G; then cH^T = 0
For G = [Ik | P] in systematic form, H = [P^T | I(n−k)]
Example: the parity check matrix of the (7,4) Hamming code above is
H = [ 1 0 1 1 1 0 0
      1 1 1 0 0 1 0
      0 1 1 1 0 0 1 ]
Error Syndrome
Received word r = c + e
where e is the error vector or error pattern: it has a 1 in every position where the received word is in error
Example: e = (0 0 1 0 0 0 0) means the 3rd bit of r is in error
Error Syndrome (cont'd)
s = rH^T = error syndrome
But s = (c + e)H^T = cH^T + eH^T = eH^T
1. If s = 0, then r = c and m is the 1st k bits of r
2. If s ≠ 0 and s equals the jth row of H^T, then a single error occurred in the jth position of r
Consider the (7,4) Hamming code example
- How many error syndromes are there in total? 2^(n−k) = 8
- If the error took place in the last bit, e = (0 0 0 0 0 0 1), then the syndrome s = eH^T is the last row of H^T, i.e. s = (0 0 1)
- Similarly, any other single-bit error pattern yields the corresponding row of H^T
=> The syndrome indicates the position of the error
Cyclic Codes
A code C is cyclic if every cyclic shift of a codeword is also a codeword, i.e. if (c1, c2, …, cn) is a codeword, so is (cn, c1, …, cn−1)
The (7,4) Hamming code is cyclic, e.g.
message    codeword
0001       0001101
1000       1000110
0100       0100011
(each of these codewords is a cyclic shift of the previous one)
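A quick numpy check (again our own sketch) that the codebook above is closed under cyclic shifts:

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

codebook = {tuple(np.mod(np.array(m) @ G, 2)) for m in product([0, 1], repeat=4)}

# every cyclic shift of every codeword must again be a codeword
assert all(tuple(np.roll(np.array(c), s)) in codebook
           for c in codebook for s in range(7))
print("the (7,4) Hamming code is cyclic")
```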
Important Parameters
Hamming distance between codewords ci and cj:
d(ci, cj) = number of components at which the 2 codewords differ
Hamming weight of a codeword ci:
w(ci) = number of non-zero components in the codeword
Minimum Hamming distance of a code:
dmin = min d(ci, cj) over all i ≠ j
Minimum weight of a code:
wmin = min w(ci) over all ci ≠ 0
Theorem: In any linear code, dmin = wmin
Exercise: Find dmin for the (7,4) Hamming code
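The exercise can be verified numerically; a sketch (ours) using the theorem dmin = wmin:

```python
import numpy as np
from itertools import product

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

codewords = [np.mod(np.array(m) @ G, 2) for m in product([0, 1], repeat=4)]

# for a linear code dmin = wmin, the minimum weight over nonzero codewords
dmin = min(int(c.sum()) for c in codewords if c.any())
print(dmin)   # 3, consistent with single-error correction: t = (3 - 1) // 2 = 1
```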
Soft-Decision and Hard-Decision Decoding
A soft-decision decoder operates directly on the (unquantized) decision statistics
A hard-decision decoder first makes hard decisions (0 or 1) on the individual bits
Here we focus only on hard-decision decoding
Hard-Decision Decoding
Minimum Hamming Distance Decoding
- Given the received word r, choose the codeword c which is closest to r in terms of Hamming distance
- To do so, one can perform an exhaustive search over all 2^k codewords; this is too much if k is large
Syndrome Decoding
- Syndrome testing: s = rH^T = eH^T, with r = c + e
- This implies that the corrupted word r and the error pattern e have the same syndrome
- A simplified decoding procedure based on this observation can be used
Standard Array
Let the codewords be denoted as {c1, c2, …, cM}, with c1 = 0 being the all-zero codeword
A standard array (2^(n−k) rows, 2^k columns) is constructed as:

Coset leader                                             Syndrome s
c1 = 0           c2                …  cM                 0
e1               e1 + c2           …  e1 + cM            s = e1 H^T
e2               e2 + c2           …  e2 + cM            s = e2 H^T
…
e(2^(n−k)−1)     e(2^(n−k)−1) + c2 …  e(2^(n−k)−1) + cM  s = e(2^(n−k)−1) H^T

Each row is a coset; its first entry is the coset leader (the correctable error pattern), and all words in the same row have the same syndrome
Hard-Decoding Procedure
1. Compute the syndrome of r using s = rH^T
2. Find the coset corresponding to s by using the standard array
3. Find the coset leader ê and decode as ĉ = r + ê
Exercise: try the (7,4) Hamming code (see the sketch below)
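A minimal syndrome decoder for the (7,4) Hamming code (our own sketch, assuming the G and H given earlier; it corrects single-bit errors):

```python
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])
H = np.hstack([G[:, 4:].T, np.eye(3, dtype=int)])   # H = [P^T | I3]

# syndrome table: syndrome -> single-bit coset leader
table = {tuple(np.mod(e @ H.T, 2)): e for e in np.eye(7, dtype=int)}
table[(0, 0, 0)] = np.zeros(7, dtype=int)

def hamming_decode(r):
    s = tuple(np.mod(r @ H.T, 2))        # error syndrome s = rH^T
    c_hat = np.mod(r + table[s], 2)      # add the coset leader to correct r
    return c_hat[:4]                     # systematic code: message = 1st k bits

c = np.mod(np.array([1, 0, 1, 1]) @ G, 2)   # encode message 1011
r = c.copy(); r[2] ^= 1                      # inject a single bit error
print(hamming_decode(r))                     # [1 0 1 1]
```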
Error Correction Capability
A linear block code with a minimum distance dmin can
- Detect up to dmin − 1 errors in each codeword
- Correct up to t = ⌊(dmin − 1)/2⌋ errors in each codeword
t is known as the error correction capability of the code
Probability of Codeword Error for Hard-Decision Decoding
Consider an (n, k) linear block code with error correction capability t. The decoder can correct all combinations of errors up to and including t errors.
Assume that the error probability of each individual coded bit is p, and that bit errors occur independently since the channel is memoryless.
If we send an n-bit block, the probability of receiving a specific pattern of m errors and (n − m) correct bits is p^m (1 − p)^(n−m)
The total number of distinct patterns of n bits with m errors and (n − m) correct bits is C(n, m) = n! / (m!(n − m)!)
The total probability of receiving a pattern with m errors is therefore
P(m, n) = C(n, m) p^m (1 − p)^(n−m)
Thus, the codeword error probability is upper-bounded by
P_E ≤ Σ(m = t+1 to n) C(n, m) p^m (1 − p)^(n−m)
(with equality for perfect codes)
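The bound is easy to evaluate numerically; a small sketch (ours), using the fact that the (7,4) Hamming code is a perfect single-error-correcting code (t = 1):

```python
from math import comb

def block_error_bound(n, t, p):
    """Upper bound on codeword error probability for a t-error-correcting
    (n, k) block code over a BSC with bit error probability p
    (exact for perfect codes)."""
    return sum(comb(n, m) * p**m * (1 - p)**(n - m) for m in range(t + 1, n + 1))

# for the (7,4) Hamming code with p = 0.01, P_E is about 2e-3
print(block_error_bound(7, 1, 0.01))
```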
Error Detection vs. Error Correction
[Figure: codewords as points separated by dmin; panels (a) and (b) show error-detection regions, panel (c) shows decoding spheres of radius t around codewords A and B]
To detect e bit errors, we require dmin ≥ e + 1
To correct t bit errors, we require dmin ≥ 2t + 1
Major Classes of Block Codes
Repetition Code
Hamming Code
Golay Code
BCH Code
Reed-Solomon Codes
Walsh Codes
LDPC Codes: invented by Robert Gallager in his PhD thesis in 1960, now proven to be capacity-approaching
Convolutional Codes
A convolutional code has memory
It is described by 3 integers: n, k, and L
- Maps k input bits into n output bits, using also the previous (L−1)k bits
- The n bits emitted by the encoder are a function not only of the current input k bits, but also of the previous (L−1)k bits
- R = k/n is the code rate (information bits/coded bit)
- L is the constraint length and is a measure of the code memory
- n does not define a block or codeword length
Convolutional Encoding
A rate k/n convolutional encoder with constraint length L consists of a kL-stage shift register and n modulo-2 adders
At each unit of time:
- k bits are shifted into the 1st k stages of the register
- All bits in the register are shifted k stages to the right
- The outputs of the n adders are sequentially sampled to give the coded bits
There are n coded bits for each input group of k information or message bits; hence R = k/n information bits/coded bit is the code rate (k < n)
Encoder Structure
(rate k/n, constraint length L)
[Figure: the input bit sequence m is shifted through an L-stage register; n modulo-2 adders tap selected register stages and are sampled in turn to produce the encoded sequence U]
Typically, k=1 for binary codes. Hence, consider rate 1/n codes
Convolutional Code Representation
Encoding function: characterizes the relationship
between the information sequence m and the output
coded sequence U
Four popular methods of representation:
- Connection pictorial and connection polynomials (usually for the encoder)
- State diagram, tree diagram, and trellis diagram (usually for the decoder)
Connection Representation
Specify n connection vectors g1, …, gn, one for each of the n modulo-2 adders
Each vector has kL dimensions and describes the connections of the shift register stages to that modulo-2 adder
- A 1 in the ith position of the connection vector implies that the ith shift register stage is connected to the adder
- A 0 implies that no connection exists
Example: L = 3, Rate 1/2
g1 = (1 1 1), g2 = (1 0 1), or in octal notation, (7, 5)
If the initial register contents are 0 0 0 and the input sequence is 1 0 0, then the output sequence is 11 10 11
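A minimal encoder sketch (ours) for this L = 3, rate-1/2 code:

```python
def conv_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Rate-1/2, constraint-length-3 convolutional encoder (register starts at 000)."""
    reg = [0, 0, 0]
    out = []
    for b in bits:
        reg = [b] + reg[:-1]                               # shift the new bit in
        out += [sum(r * g for r, g in zip(reg, g1)) % 2,   # first mod-2 adder
                sum(r * g for r, g in zip(reg, g2)) % 2]   # second mod-2 adder
    return out

print(conv_encode([1, 0, 0]))   # [1,1, 1,0, 1,1] -> 11 10 11, as on the slide
```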
State Diagram Representation
The contents of the rightmost L−1 stages (i.e. the previous L−1 input bits) are considered the current state => there are 2^(L−1) states
Knowledge of the current state and the next input is
necessary and sufficient to determine the next output and
next state
For each state, there are only 2 transitions (to the next
state) corresponding to the 2 possible input bits
The transitions are represented by paths on which we write
the output word associated with the state transition
A solid line path corresponds to an input bit 0
A dashed line path corresponds to an input bit 1
Example: L = 3, Rate = 1/2
[State diagram, branches labeled input/output: a = 00: 0/00 stays at a, 1/11 goes to b; b = 10: 0/10 goes to c, 1/01 goes to d; c = 01: 0/11 goes to a, 1/00 goes to b; d = 11: 0/01 goes to c, 1/10 stays at d]

Current State   Input   Next State   Output
00              0       00           00
00              1       10           11
10              0       01           10
10              1       11           01
01              0       00           11
01              1       10           00
11              0       01           01
11              1       11           10
Example
Assume that m = 11011 is the input, followed by L−1 = 2 zeros to flush the register. Also assume that the initial register contents are all zero. Find the output sequence U
Input bit mi   Register contents   State at ti   State at ti+1   Branch word u1 u2
-              000                 00            00              -
1              100                 00            10              11
1              110                 10            11              01
0              011                 11            01              01
1              101                 01            10              00
1              110                 10            11              01
0              011                 11            01              01
0              001                 01            00              11
Output sequence: U = 11 01 01 00 01 01 11
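Running the encoder sketch from the connection-representation example on m plus the two flush zeros reproduces this output:

```python
print(conv_encode([1, 1, 0, 1, 1, 0, 0]))
# [1,1, 0,1, 0,1, 0,0, 0,1, 0,1, 1,1]  ->  U = 11 01 01 00 01 01 11
```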
Trellis Diagram
The trellis diagram is similar to the state diagram,
except that it adds the dimension of time
The code is represented by a trellis where each
trellis branch describes an output word
Trellis Diagram
[Trellis diagram of the L = 3, rate-1/2 code: states a = 00, b = 10, c = 01, d = 11 at each time step; solid branches correspond to input 0 and dashed branches to input 1, each labeled with its output word]
Trellis structure repeats itself after depth L = 3
Every input sequence (m1, m2, …) corresponds to
- a path in the trellis
- a state transition sequence (s0, s1, …) (assume s0 = 0 is fixed)
- an output sequence ((u1, u2), (u3, u4), …)
Example: Let s0 = 00. Then
- m = 000 gives output 00 00 00 and states a a a a
- m = 100 gives output 11 10 11 and states a b c a
[Trellis figure with these two paths highlighted]
So far we have introduced convolutional codes:
- Constraint length L and rate R = 1/n
- Polynomial (connection) representation
- State diagram representation
- Trellis diagram representation
Next we will talk about decoding of convolutional codes:
- Maximum likelihood decoding
- Viterbi algorithm
- Transfer function
Maximum Likelihood Decoding
Transmit a coded sequence U(m) (corresponding to message sequence m) using a digital modulation scheme (e.g. BPSK or QPSK)
Received sequence: Z
Maximum likelihood decoder: find the sequence U(m') such that p(Z | U(m')) is maximized
This will minimize the probability of error if all messages m are equally likely
Maximum Likelihood Metric
Assume a memoryless channel, i.e. the noise components are independent. Then, for a rate 1/n code,
p(Z | U(m)) = ∏i p(Zi | Ui(m)) = ∏i ∏(j=1 to n) p(zji | uji(m))
where Zi is the i-th branch of Z and zji is its j-th symbol
The problem is then to find a path through the trellis that maximizes the log-likelihood path metric, obtained by taking the log:
γ = ln p(Z | U(m)) = Σi Σ(j=1 to n) ln p(zji | uji(m))
where the inner sum Σj ln p(zji | uji(m)) is the i-th branch metric
Decoding Algorithm: Log-Likelihood
For the AWGN channel (soft decision), p(zji | uji) is Gaussian with mean uji and variance σ². Hence
ln p(zji | uji) = −(zji − uji)² / (2σ²) + constant
Note that the objective is only to compare ln p(Z | U) across different U, so constants and positive scaling do not affect the result. We can therefore take the log-likelihood metrics to be
branch metric: −Σj (zji − uji)², path metric: −Σi Σj (zji − uji)²
Thus, the soft-decision ML decoder chooses the path whose corresponding sequence is at the minimum Euclidean distance to the received sequence
For the binary symmetric channel (hard decision), with crossover probability p:
[BSC diagram: u = 0 → z = 0 and u = 1 → z = 1 each with probability 1 − p; crossovers occur with probability p]
p(z | u) = p if z ≠ u, and 1 − p if z = u
The per-symbol log-likelihood is
LL(zji | uji) = ln p(zji | uji) = ln p if zji ≠ uji, and ln(1 − p) if zji = uji
Let dm be the Hamming distance between Z and U(m), i.e. the number of positions in which they differ. Then
ln p(Z | U(m)) = dm ln p + (N − dm) ln(1 − p) = −dm ln((1 − p)/p) + N ln(1 − p)
where N is the total number of coded bits. Since p < 0.5, ln((1 − p)/p) > 0, so maximizing the likelihood is equivalent to minimizing dm:
Hard-decision ML decoder = minimum Hamming distance decoder
Maximum Likelihood Decoding Procedure
Compute, for each branch, the branch metric using the output bits Ui associated with that branch and the received symbols Zi
Compute, for each valid path through the trellis (i.e. each valid codeword sequence U), the sum of the branch metrics along that path
The path with the maximum path metric is the decoded path
Comparing all possible valid paths requires an exhaustive (brute-force) search, which is not practical since the number of paths grows exponentially with the path length
The optimum algorithm for solving this problem is the Viterbi decoding algorithm (Viterbi decoder)
Andrew Viterbi (1935– )
- BS & MS from MIT
- PhD from the University of Southern California
- Invented the Viterbi algorithm in 1967
- Co-founded Qualcomm Inc. in 1983
Viterbi Decoding (R = 1/2, L = 3)
Input data sequence m:    1  1  0  1  1
Coded sequence U:         11 01 01 00 01
Received sequence Z:      11 01 01 10 01  (one channel error, in the 4th branch word)
[Trellis figure with the Hamming-distance branch metrics between Z and the branch words marked on each branch]
Viterbi Decoder
Basic idea:
- If any 2 paths in the trellis merge to a single state, one of them can always be eliminated in the search
- Let the cumulative path metric of a given path at ti be the sum of the branch metrics along that path up to time ti
- Consider t5: the upper path metric is 4, the lower path metric is 1. The upper path CANNOT be part of the optimum path, since the lower path entering the same state has a lower metric
- This is because future output branches depend only on the current state and not on the previous states
Path Metrics for 2 Merging Paths
[Trellis figure: two paths merge into state a = 00 at time t5; the upper path has cumulative metric 4 and the lower path has metric 1, so the upper path is eliminated]
Viterbi Decoding
At time ti, there are 2^(L−1) states in the trellis
Each state can be entered by means of 2 paths
Viterbi decoding consists of computing the metrics for the 2 paths entering each state and eliminating one of them
This is done for each of the 2^(L−1) nodes at time ti
The decoder then moves to time ti+1 and repeats the process (a decoder sketch follows the example below)
Example
[Worked example: the Viterbi algorithm applied step by step on the trellis of the running code; figure-only slides]
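For concreteness, a compact hard-decision Viterbi decoder for the running L = 3, rate-1/2 code (our own sketch; Hamming-distance branch metrics, register assumed to start and end in state 00):

```python
def viterbi_decode(z, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Hard-decision Viterbi decoder for the rate-1/2, L = 3 code.
    z: flat list of received coded bits; returns the ML input bit sequence."""
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]   # state = last two input bits
    metric = {s: 0 if s == (0, 0) else float('inf') for s in states}
    paths = {s: [] for s in states}
    for i in range(0, len(z), 2):
        rx = z[i:i + 2]
        new_metric, new_paths = {}, {}
        for ns in states:
            bit, s1 = ns                        # ns is entered with input `bit`
            cands = []
            for s2 in (0, 1):
                s = (s1, s2)                    # predecessor state
                reg = (bit, s1, s2)             # register contents after the shift
                out = (sum(r * g for r, g in zip(reg, g1)) % 2,
                       sum(r * g for r, g in zip(reg, g2)) % 2)
                d = sum(a != b for a, b in zip(out, rx))   # Hamming branch metric
                cands.append((metric[s] + d, s))
            m, s = min(cands)                   # keep the survivor path
            new_metric[ns], new_paths[ns] = m, paths[s] + [bit]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)]                        # trellis terminates in state 00

# received sequence from the slide (one error in the 4th branch word),
# extended by the last two branch words of U for the flush bits:
z = [1,1, 0,1, 0,1, 1,0, 0,1, 0,1, 1,1]
print(viterbi_decode(z))   # [1, 1, 0, 1, 1, 0, 0] -> m = 11011 recovered
```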
Distance Properties
dfree = minimum free distance = the minimum Hamming distance between any pair of arbitrarily long paths that diverge and remerge
A convolutional code can correct any t channel errors, where t = ⌊(dfree − 1)/2⌋ (this is an approximation)
[Trellis diagram of the running code: the minimum-weight path diverging from and remerging with the all-zero path is a → b → c → a, with output words 11, 10, 11, so dfree = 2 + 1 + 2 = 5]
Transfer Function
The distance properties and the error rate performance of
a convolutional code can be obtained from its transfer
function
Since a convolutional code is linear, the set of Hamming
distances of the code sequences generated up to some
stages in the trellis, from the all-zero code sequence, is the
same as the set of distances of the code sequences with
respect to any other code sequence
Thus, we assume that the all-zero path is the input to the
encoder
State Diagram Labeled according to
distance from all-zero path
[Modified state diagram: the all-zero state is split into an input node a = 00 and an output node e = 00; each branch is labeled with a factor of the form D^m N L]
- D^m denotes m non-zero output bits on the branch
- N appears if the input bit is non-zero
- L denotes a branch in the path (one factor of L per branch)
The transfer function T(D, N, L), also called the weight enumerating function of the code, sums D^d N^w L^l over all paths that diverge from and remerge with the all-zero state, where d is the output weight, w the input weight, and l the length of a path
By solving the state equations we get
T(D, N, L) = D^5 N L^3 / (1 − D N L (1 + L)) = D^5 N L^3 + D^6 N^2 L^4 (1 + L) + …
The transfer function indicates that:
- There is one path at distance 5 and length 3, which differs in 1 input bit from the correct all-zeros path
- There are 2 paths at distance 6, one of which is of length 4, the other of length 5, and both differ in 2 input bits from the all-zero path
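A quick sympy check (ours) that expanding this T(D, N, L) reproduces the terms above (assuming the closed form given on the slide):

```python
import sympy as sp

D, N, L = sp.symbols('D N L')
q = D*N*L*(1 + L)            # feedback factor from the state equations
# truncated geometric-series expansion of T = D^5 N L^3 / (1 - q)
T_trunc = sp.expand(sum(D**5*N*L**3 * q**k for k in range(3)))
print(T_trunc)
# D**5*N*L**3  +  D**6*N**2*L**4 + D**6*N**2*L**5  +  (distance-7 terms)
```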
Known Good Convolutional Codes
Good convolutional codes can only be found in general by
computer search
They are listed in tables and classified by their constraint length, code rate, and their generator polynomials or vectors (typically in octal notation)
The error-correction capability of a convolutional code increases as n increases, i.e. as the code rate decreases; the price is an increase in channel bandwidth and decoder complexity
Good Codes with Rate 1/2
Constraint Length   Generator Polynomials (octal)   dfree
3                   (5, 7)                          5
4                   (15, 17)                        6
5                   (23, 35)                        7
6                   (53, 75)                        8
7                   (133, 171)                      10
8                   (247, 371)                      10
9                   (561, 753)                      12
10                  (1167, 1545)                    12
Good Codes with Rate 1/3
Constraint Length   Generator Polynomials (octal)   dfree
3                   (5, 7, 7)                       8
4                   (13, 15, 17)                    10
5                   (25, 33, 37)                    12
6                   (47, 53, 75)                    13
7                   (133, 145, 175)                 15
8                   (225, 331, 367)                 16
9                   (557, 663, 711)                 18
10                  (1117, 1365, 1633)              20
Basic Channel Coding for Wideband CDMA
[Figure: service-specific coding chains]
- Services with BER = 10^-3: inner coding (convolutional) + inner interleaving
- Services with BER = 10^-6: outer coding (RS) + outer interleaving, followed by inner coding (convolutional) + inner interleaving
The convolutional codes are rate 1/3 and rate 1/2, all with constraint length 9
Channel Coding for Wireless LAN (IEEE 802.11a)
[Transmitter chain: input bits → convolutional encoder (r = 1/2, K = 7) → puncturing → baseband modulator → OFDM → TX signals]
Source: 802.11 Wireless Networks: The Definitive Guide, by M. Gast, O'Reilly
Other Advanced Channel Coding
- Low-density parity-check codes: Robert Gallager, 1960
- Turbo codes: Berrou et al., 1993
- Trellis-coded modulation: Ungerboeck, 1982
- Space-time coding: Vahid Tarokh et al., 1998
- Polar codes: Erdal Arikan, 2009
Exercise
Find out the coding techniques adopted in LTE