
COMS3100/COMS7100

Introduction to Communications
Lecture 31: Source Coding
This lecture:
1. Information Theory.
2. Entropy.
3. Source Coding.
4. Huffman Coding.
Ref: CCR pp. 697–709, A Mathematical Theory of Communication.



Information Theory

Claude Shannon’s paper, A Mathematical Theory of Communication, showed that reliable communication is possible at a non-zero rate.
- Shannon proposed a general model for point-to-point communication.



Information Theory (2)
- Shannon showed that there is a very widely applicable, quantitative definition of information.
- The source generates information at a certain rate.
- The noisy channel can be shown to have a capacity at which it can reliably transmit information.
- Reliable transmission is possible if and only if the source’s rate is not greater than the channel’s capacity.



The Mathematical Model

The source produces symbols from an alphabet.
- The alphabet is the (discrete) set of all possible symbols.
  E.g., the Roman alphabet, the Unicode character set or the range of possible pixel intensities from a camera sensor.
- The source produces these symbols in a discrete-time sequence.
- The channel has its own alphabet or, rather, alphabets: one at the input and one at the output.
  E.g., consider a channel including a polar NRZ modulator and demodulator, so the input and output alphabets are {−A, +A}.



The Mathematical Model (2)
- At each use of the channel, the channel output is random (because of noise) but dependent on the input.
- The rate at which the source generates symbols may be different to the rate at which the channel is used.
- Some mechanism is needed to translate between the source and the channel alphabets and back again.
- Shannon found that this can always be broken into two independent processes at the transmitter: source and channel coding.
- Similarly, at the receiver, source and channel decoding.
The Mathematical Model (3)

- Without loss of generality, the common language of the source and channel (de)coder is a bitstream.
- Source symbol generation, intermediate bitstream and channel use may all be at different rates.


Information & Entropy
We measure information and entropy in terms of probability and random variables.
Self-Information
- In order to use random variables, map the source alphabet to the numbers {0, . . . , M − 1}, where M is the size of the alphabet.
- Let the source symbol selected for transmission at a certain time instant be represented as a discrete r.v. X.
- Let i represent one possible value for X and define pi = P(X = i).


Self-Information
The amount of information (or surprise) at learning that X = i is log(1/pi) = − log pi.
- Hence, the rarer the event, i.e., the less probable, the more surprising and the more informative it is.
- Shannon calls this self-information.
- The base of the logarithm has not been specified, but it is usually taken to be 2, in which case the unit is bits.
  - If we take the natural logarithm, the unit is nats.
  - (Technically, self-information is dimensionless.)
- Measurement in bits is natural: if eight symbols are equally likely, it makes sense that they have 3 bits of information each.
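As a quick numerical check, here is a minimal Python snippet (my illustration, not part of the lecture notes) that evaluates the self-information −log2 p, confirming the eight-equally-likely-symbols example above.

```python
import math

def self_information(p: float) -> float:
    """Self-information -log2(p), in bits, of an event with probability p."""
    return -math.log2(p)

# Eight equally likely symbols: each carries log2(8) = 3 bits.
print(self_information(1 / 8))   # 3.0
# A rarer event is more surprising, hence more informative.
print(self_information(0.01))    # ~6.64 bits
```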
Entropy
The expected self-information is called the entropy of X:

    H(X) = E[log(1/pi)] = − Σ_{i=0}^{M−1} pi log pi.

- Shannon used this name because of the similar expression that arises in statistical thermodynamics.
- It can be regarded as a measure of randomness or disorder.
- Degenerate r.v.s have zero entropy.
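To make the formula concrete, here is a small Python helper (my own sketch, not from the lecture); the second call illustrates the uniform-distribution bound discussed on the next slide.

```python
import math

def entropy(probs):
    """Entropy H(X) = -sum(p * log2 p) in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The distribution used in the Huffman example later in this lecture:
print(entropy([0.25, 0.25, 0.2, 0.15, 0.15]))   # ~2.29 bits
# A uniform distribution over M = 8 symbols attains the maximum, log2(8) = 3 bits:
print(entropy([1 / 8] * 8))                     # 3.0
```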
Entropy (2)

- It can be shown that uniformly distributed r.v.s have the highest entropy for a given M, so that

      0 ≤ H(X) ≤ log M.

- We’ll use Hb(X) when we need to be explicit about using a logarithm to the base b.



The Source Coding Theorem
If, each time the source emits a symbol, it is independent of previous symbols and identically distributed, we call it a discrete memoryless source (DMS).
- Suppose the DMS emits r symbols per second.
- Shannon showed that it is possible to use source coding to encode the symbols into a bitstream at any rate arbitrarily close to (but above) rH2(X) bits per second.
  - Conversely, he showed it is impossible to have a uniquely decodable bitstream at a lower rate.
- This is Shannon’s source coding theorem (and its converse).
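As an illustrative number (mine, not from the lecture): a DMS emitting r = 1000 symbols per second with entropy H2(X) ≈ 2.29 bits, as in the Huffman example later, can be represented losslessly by a bitstream at any rate just above 1000 × 2.29 ≈ 2290 bits per second, but not below.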



Shannon’s Source Coding Procedure

In proving the source coding theorem, Shannon devised a simple but impractical source coding scheme.
- We group N symbols together into a block.
- All likely sequences (in a certain sense) are identified.
- These sequences are enumerated using binary words of NH2(X) bits (rounded down; ignoring some sequences if too many).
- This constitutes the codebook.



Coding Procedure (2)

- To perform source coding, compare a given symbol sequence against those in the codebook & output the code, if there is one.
- The probability that this scheme doesn’t work → 0 as N → ∞.
- This scheme is impractical because there is not necessarily much structure in the codebook.
  ⇒ The codebook may require massive storage space.
  ⇒ The codebook may need to be exhaustively searched.
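The following Python sketch (my own illustration of the idea, not Shannon’s construction verbatim) builds such a codebook for a tiny alphabet and block length, interpreting “likely sequences” simply as the most probable ones; it also shows why encoding amounts to an exhaustive lookup.

```python
import math
from itertools import product

def block_codebook(probs, N):
    """Toy version of Shannon's block-coding scheme: index up to 2**floor(N*H2(X))
    of the most probable length-N sequences with fixed-length binary words."""
    H = -sum(p * math.log2(p) for p in probs if p > 0)
    n_bits = math.floor(N * H)
    # Rank all M**N sequences by probability (only feasible for tiny M and N).
    seqs = sorted(product(range(len(probs)), repeat=N),
                  key=lambda s: math.prod(probs[i] for i in s),
                  reverse=True)
    # Keep at most 2**n_bits of the likeliest sequences; the rest get no code.
    return {seq: format(idx, f"0{n_bits}b")
            for idx, seq in enumerate(seqs[:2 ** n_bits])}

probs = [0.25, 0.25, 0.2, 0.15, 0.15]
book = block_codebook(probs, N=4)
print(len(book))               # 512 codewords of 9 bits each (floor(4 * 2.29) = 9)
print(book.get((0, 0, 1, 2)))  # a 9-bit word, or None if the sequence was dropped
# Encoding a block = searching the codebook; unlikely blocks simply have no code.
```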



Variable-Length Codes

An ideal source-coding scheme is theoretically easy but practically difficult.
- Also, for finite N, the scheme is unreliable (not all sequences have codes!).
- How do we make the best possible codes for finite block sizes?
- We’ll start with codes for a single symbol, i.e., N = 1.
- Consider a variable-length source code, where each symbol maps to a variable number of bits.



Variable-Length Codes (2)

- Suppose symbol i is assigned a code of ni bits.
- It turns out that the code can be made uniquely decodable if and only if the Kraft inequality is satisfied:

      Σ_{i=0}^{M−1} 2^(−ni) ≤ 1.
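A quick numerical check of the inequality (my own illustration), using the codeword lengths that the Huffman example on the following slides will produce:

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality for codeword lengths n_i."""
    return sum(2 ** -n for n in lengths)

# Lengths {2, 2, 2, 3, 3} from the Huffman example: satisfied with equality.
print(kraft_sum([2, 2, 2, 3, 3]))   # 1.0
# Lengths {1, 2, 2, 2} sum to 1.25 > 1, so no uniquely decodable code exists for them.
print(kraft_sum([1, 2, 2, 2]))      # 1.25
```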



Huffman Coding

In 1952, David Huffman discovered a simple method of constructing an optimal variable-length, uniquely decodable source code.
- The method constructs a binary tree, i.e., a tree in which each node has at most two children.
- It proceeds from the bottom up, combining leaves into ‘twigs’, twigs into ‘branches’ and so on until the tree is built.
- Let’s call any partially assembled portion of the tree a twig.



Huffman Coding Technique

- To start with, there are M twigs, which consist only of the leaves themselves, the symbols.
- The probability of a twig is the sum of the probabilities of all of its leaves.
- The code construction algorithm is simply the following step, iterated (M − 1 times) until only one twig remains:
  - Choose the two twigs with the least probability and assemble them together to make a larger twig.
- To read off the codes, descend through the tree towards the symbol’s leaf.
  - Each time we take a left branch, output a ‘0’; otherwise, a ‘1’.



Huffman Code Example
Consider a source with M = 5 for which the probabilities are p0 = p1 = 0.25, p2 = 0.2, p3 = p4 = 0.15.

[Figure: Huffman tree construction in four steps. Step 1 merges p3 and p4 into a twig of probability 0.3; Step 2 merges p1 and p2 (probability 0.45); Step 3 merges p0 with the 0.3 twig (probability 0.55); Step 4 joins the 0.55 and 0.45 twigs at the root (probability 1). Symbols 0, 1 and 2 receive 2-bit codes; symbols 3 and 4 receive 3-bit codes.]

- Average code length is 2.3 bits and the entropy is 2.29 bits.
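Below is a minimal Python sketch of the bottom-up construction just described (my own code, not from the lecture), applied to the example distribution. Tie-breaking is an arbitrary choice here, so the exact bit patterns may differ from the figure, but the code lengths and the 2.3-bit average match.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a Huffman code for symbols 0..M-1 with the given probabilities.
    Returns a dict mapping symbol -> bit string. Minimal sketch: no special
    handling of edge cases such as M == 1."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # Each heap entry is (twig probability, tiebreak, {symbol: partial code}).
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable twigs into a larger twig.
        p_left, _, left = heapq.heappop(heap)
        p_right, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p_left + p_right, next(tiebreak), merged))
    return heap[0][2]

probs = [0.25, 0.25, 0.2, 0.15, 0.15]
code = huffman_code(probs)
print(code)  # one valid code; lengths are 2, 2, 2, 3 and 3 bits
avg_len = sum(p * len(code[sym]) for sym, p in enumerate(probs))
print(round(avg_len, 2))  # 2.3, matching the slide
```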



Developments of Source Coding

Source coding is also known as data compression.
- More particularly, lossless data compression, since the input and output symbols are identical.
- Our exposition required that the probability distribution is known in advance.
- If we don’t know it, we can use universal source coding.
  - Examples: the Lempel-Ziv (LZ77) & Lempel-Ziv-Welch (LZW) algorithms & derivatives such as DEFLATE in ZIP & gzip software.
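As a concrete illustration of universal coding in everyday software, Python’s standard zlib module exposes DEFLATE; this short snippet (my example) compresses repetitive text losslessly without being given any probability model.

```python
import zlib

# DEFLATE (used in ZIP and gzip) combines an LZ77-style dictionary coder with Huffman coding.
text = ("the quick brown fox jumps over the lazy dog " * 50).encode()
compressed = zlib.compress(text)
print(len(text), "->", len(compressed), "bytes")  # repetitive input shrinks dramatically
assert zlib.decompress(compressed) == text        # lossless: exact reconstruction
```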



Source Coding Applications

- For sources like English text, lossless coding is very important.
- In other applications, like audio, images and video, we may be able to put up with some distortion for a lower bit rate.
- In 1963, Shannon developed rate-distortion theory, the basis of modern lossy data compression.
  - Examples: voice coding in mobile phones, MP3 for music, JPEG for images, MPEG for video.

