
Unit – 4

Image Compression
• Images usually require an enormous amount of data to represent.
Example: A standard 8.5” by 11” sheet of paper scanned at 100
samples per inch (dpi) and quantized to two gray levels (binary
image) would require more than 100k bytes to represent:
(8.5×100) × (11×100) / 8 = 116,875 bytes.
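• The same figure can be reproduced with a few lines of Python (a minimal sketch using the page size, resolution, and bit depth from the example above; variable names are illustrative):

```python
# Storage needed for a binary (1 bit per pixel) scan of an 8.5" x 11" page at 100 dpi.
width_px = int(8.5 * 100)   # 850 samples across
height_px = 11 * 100        # 1100 samples down
bits_per_pixel = 1          # two gray levels (binary image)

total_bits = width_px * height_px * bits_per_pixel
total_bytes = total_bits // 8
print(total_bytes)          # 116875 bytes, i.e. more than 100k bytes
```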

• Image compression involves reducing the amount of data (bits)


required to represent a digital image. This is done by removing
redundant information.

• Image compression is crucial for efficient storage, retrieval,


transmission, and manipulation of images.

• More generally, data compression involves the efficient


representation of data in digital format.

• Information theory --- a field pioneered by Claude E. Shannon in


the 1940s --- is the theoretical basis for most data compression
techniques.

• Compression can be either lossless (information preserving) or


lossy. In lossy compression, one can achieve higher levels of data
reduction with less than perfect reproduction of the original image.

• Data compression refers to the process of reducing the amount of


data required to represent a given quantity of information.

• Various amounts of data may be used to represent/describe the
same information. Some representations may be less efficient in the
sense that they have data redundancy.
• If n1 and n2 are the number of information-carrying units (e.g., bits)
in two datasets that represent the same information, the relative
redundancy RD of the first dataset is defined as

RD = 1 − 1/CR, where CR = n1/n2

is called the compression ratio.
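• A minimal Python sketch of these two definitions (the dataset sizes used below are illustrative values, not taken from the notes):

```python
def compression_ratio(n1, n2):
    # CR = n1 / n2, where n1 and n2 are the sizes (e.g., in bits)
    # of two representations of the same information.
    return n1 / n2

def relative_redundancy(n1, n2):
    # RD = 1 - 1/CR
    return 1.0 - 1.0 / compression_ratio(n1, n2)

# Illustrative example: an image stored in 100,000 bits is recoded in 25,000 bits.
print(compression_ratio(100_000, 25_000))    # 4.0  -> a 4:1 compression ratio
print(relative_redundancy(100_000, 25_000))  # 0.75 -> 75% of the first dataset is redundant
```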

• For images, data redundancy can be of three types:


Coding redundancy: This refers to the binary codewords
used to represent grayvalues.
Interpixel redundancy: This refers to the correlation
between adjacent pixels in an image.
Psychovisual redundancy: This refers to the unequal
sensitivity of the human eye to different visual information.

• A fidelity criterion or error criterion is required to quantify the


loss of information (if any) due to a compression scheme.

• Objective fidelity criteria are based on some quantitative function of
the original input image and the compressed and subsequently
decompressed image.
Example: Root-mean-square (RMS) error, which is defined as the
square-root of the mean-square error (MSE). For an M×N image:

e(m,n) = f̂(m,n) − f(m,n)

f(m,n): Original image
f̂(m,n): Reconstructed image
e(m,n): Error image

erms = [ (1/MN) Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} e(m,n)² ]^(1/2)

• A related measure is the mean-square signal-to-noise ratio (SNRms):

SNRms = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} f̂(m,n)² / Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} [ f̂(m,n) − f(m,n) ]²

• The rms value of SNR, denoted SNR rms, is the square-root of


SNRrms.
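• A small NumPy sketch of these objective fidelity measures (function and array names are illustrative):

```python
import numpy as np

def fidelity_measures(f, f_hat):
    # Return (erms, SNRms, SNRrms) for an original image f and its
    # reconstruction f_hat, following the definitions above.
    f = np.asarray(f, dtype=np.float64)
    f_hat = np.asarray(f_hat, dtype=np.float64)
    error = f_hat - f                                 # e(m,n)
    erms = np.sqrt(np.mean(error ** 2))               # root-mean-square error
    snr_ms = np.sum(f_hat ** 2) / np.sum(error ** 2)  # mean-square SNR (undefined if f_hat == f)
    return erms, snr_ms, np.sqrt(snr_ms)              # SNRrms is the square root of SNRms

# Tiny illustrative 2x2 example.
f = np.array([[52, 55], [61, 59]])
f_hat = np.array([[50, 56], [60, 60]])
print(fidelity_measures(f, f_hat))
```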

• When the ultimate image is to be viewed by a human, subjective


fidelity criterion may be more appropriate.

• Here, image quality is measured by subjective evaluations by a


human observer.

• Ratings by a number of human observers, based on “typical”


decompressed images, are averaged to obtain this subjective
fidelity criterion.

• Example of an absolute comparison scale:


Value Rating Description
1 Excellent An image of extremely high quality --- as good
as desired.
2 Fine An image of high quality, providing enjoyable
viewing. Interference is not objectionable.
3 Passable An image of acceptable quality. Interference is
not objectionable.
4 Marginal An image of poor quality; you wish you could
improve it. Interference is somewhat
objectionable.
5 Inferior A very poor image, but you could watch it.
Objectionable interference is definitely
present.
6 Unusable An image so bad that you could not watch it.

• Example of a relative comparison scale, based on a “side-by-side”


comparison of the original image and the decompressed image:
Value    −3           −2      −1               0      1                  2        3
Rating   Much worse   Worse   Slightly worse   Same   Slightly better    Better   Much better

Image Compression model

f(m,n) → Source Encoder → Channel Encoder → Channel → Channel Decoder → Source Decoder → f̂(m,n)
• Source Encoder is used to remove redundancy in the input
image.
• Channel Encoder is used to introduce redundancy in a controlled
fashion to help combat noise. Example: a parity bit (a short sketch
follows this list).
• This provides a certain level of immunity from the noise that is
inherent in any storage/transmission system. If the channel is not
prone to noise, this block may be eliminated.
• The Channel could be a communication link or a storage/retrieval
system.
• Channel Decoder and Source Decoder invert the operations of the
corresponding encoder blocks.
• We will mainly concentrate on the source encoder/decoder blocks
and not on the channel encoder/decoder steps.
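• As a small illustration of redundancy added for error detection, an even-parity bit can be appended to each codeword (a minimal sketch, not the channel code of any particular standard):

```python
def add_even_parity(bits):
    # Append one parity bit so that the total number of 1s is even.
    # 'bits' is a string such as '1011001'.
    parity = str(bits.count('1') % 2)
    return bits + parity

def check_even_parity(bits_with_parity):
    # Return True if no single-bit error is detected.
    return bits_with_parity.count('1') % 2 == 0

print(add_even_parity('1011001'))     # '10110010'
print(check_even_parity('10110010'))  # True
print(check_even_parity('10100010'))  # False (one bit flipped by channel noise)
```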

Source Encoder and Decoder


• Source encoder is responsible for reducing or eliminating any
coding, interpixel, or psychovisual redundancy.

f(m,n) → Mapper → Quantizer → Symbol Encoder → To Channel (Compressed Image)

Source Encoder

• The first block “Mapper” transforms the input data into a (usually
nonvisual) format, designed to reduce interpixel redundancy. This
block is reversible and may or may not reduce the amount of data.
Example: run-length encoding, image transform (a short run-length
sketch follows this list of blocks).
• The Quantizer reduces the accuracy of the mapper output in
accordance with some fidelity criterion. This block reduces
psychovisual redundancy and is usually not invertible.
• The Symbol Encoder creates a fixed or variable length codeword to
represent the quantizer output and maps the output in accordance
with this code. This block is reversible and reduces coding
redundancy.
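• As an illustration of a reversible mapping, run-length encoding of a single row of pixels can be sketched as follows (a simple example, not the exact mapper of any particular standard):

```python
def run_length_encode(row):
    # Map a row of pixel values to (value, run_length) pairs.
    runs = []
    current, count = row[0], 1
    for pixel in row[1:]:
        if pixel == current:
            count += 1
        else:
            runs.append((current, count))
            current, count = pixel, 1
    runs.append((current, count))
    return runs

def run_length_decode(runs):
    # Invert the mapping exactly -- no information is lost.
    return [value for value, count in runs for _ in range(count)]

row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
print(run_length_encode(row))   # [(0, 3), (1, 2), (0, 4), (1, 1)]
assert run_length_decode(run_length_encode(row)) == row
```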

From Channel (Compressed Image) → Symbol Decoder → Inverse Mapper → f̂(m,n)

Source Decoder

• The decoder blocks are inverse operations of the corresponding


encoder blocks (except the quantizer block, which is not
invertible).

Elements of Information Theory


• What is information --- how to quantify it?
• What is the minimum amount of data that is sufficient to represent
an image without loss of information?

• What is theoretically the best compression possible?


• What is the theoretically best possible transmission rate for
reliable communication over a noisy channel?

• Information theory provides answers to these and other related


fundamental questions.

• The fundamental premise of information theory is that the


generation of information can be modeled as a probabilistic
process.

• A discrete source of information generates one of N possible
symbols from a source alphabet set A = {a0, a1, …, aN−1} in unit time.

Example: A = {a, b, c, …, z}, {0, 1}, {0, 1, 2, …, 255}.


• The source output can be modeled as a discrete random variable E,
which can take values in the set A = {a0, a1, …, aN−1}, with corresponding
probabilities {p0, p1, …, pN−1}.

• We will denote the symbol probabilities by the vector
z = [P(a0), P(a1), …, P(aN−1)]^T = [p0, p1, …, pN−1]^T.
• Naturally, pi ≥ 0 and Σ_{i=0}^{N−1} pi = 1.
• The information source is characterized by the pair (A,z).
Observing an occurrence (or realization) of the random variable E
results in some gain of information denoted by I(E). This gain of
information was defined to be (Shannon):
I(E) = log(1/P(E)) = −log P(E)

• The base of the logarithm depends on the units used to measure
information. Usually, we use base 2, which gives the information
in units of “binary digits” or “bits.” Using a base-10 logarithm
would give the information in units of decimal digits.

• The amount of information attributed to an event E is inversely


related to the probability of that event.

• Examples:

Certain event: P(E) = 1.0. In this case I(E) = log(1/1) = 0. This


agrees with intuition, since if the event E is certain to occur (has
probability 1), knowing that it has occurred has not led to any gain
of information.

Coin toss: P(E = Heads) = 0.5. In this case

I(E) = log(1/0.5) = log 2 = 1 bit. This again agrees with intuition.

Rare event: P(E) = 0.001. In this case

I(E) = log(1/0.001) = log 1000 ≈ 9.97 bits. This again agrees
with intuition, since knowing that a rare event has occurred leads
to a significant gain of information.
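• These three values follow directly from I(E) = log(1/P(E)) with a base-2 logarithm (a minimal Python sketch):

```python
import math

def self_information(p):
    # I(E) = log2(1 / P(E)), in bits
    return math.log2(1.0 / p)

print(self_information(1.0))    # 0.0 bits    (certain event)
print(self_information(0.5))    # 1.0 bit     (coin toss)
print(self_information(0.001))  # ~9.97 bits  (rare event)
```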

• The entropy H(z) of a source is defined as the average amount of
information gained by observing a single source symbol:

H(z) = − Σ_{i=0}^{N−1} pi log pi

By convention, in the above formula, we set 0 log 0 = 0.

• The entropy of a source quantifies the “randomness” of that source.
The higher the source entropy, the greater the uncertainty associated
with a source output, and the more information associated with the
source.

• For a fixed number of source symbols, the entropy is maximized if all


the symbols are equally likely (recall uniform histogram).

Example:

Symbol ai   Probability pi   Information (in bits) I(ai) = −log pi
0           1/2              1
1           1/4              2
2           1/8              3
3           1/16             4
4           1/32             5
5           1/64             6
6           1/64             6

Source entropy:
H(z) = − Σ_{i=0}^{6} pi log pi = −[(1/2) log(1/2) + (1/4) log(1/4) + … + (1/64) log(1/64)]
     = [1/2 + 1/2 + … + 3/32] = 63/32 = 1.96875 bits
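• A short Python sketch that reproduces this value:

```python
import math

# Symbol probabilities from the table above.
probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]

def entropy(p):
    # H(z) = -sum(p_i * log2(p_i)), with 0*log 0 taken as 0.
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(entropy(probs))   # 1.96875 bits/symbol (= 63/32)
```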

Given that a source produces the above symbols with indicated


probabilities, how do we represent them using binary strings?

Symbol ai   Probability pi   Binary String (Codeword)   Codeword length li
0 ½ 000 3
1 ¼ 001 3
2 1/8 010 3
3 1/16 011 3
4 1/32 100 3
5 1/64 101 3
6 1/64 110 3

Average length of a codeword:

Lavg = Σ_{i=0}^{6} pi li = Σ_{i=0}^{6} pi (3) = 3 Σ_{i=0}^{6} pi = 3 bits
• Is this the best we can do (in terms of Lavg)? For a fixed-length
codeword scheme, yes. How about if we employ a variable-length
scheme?

• Idea: Since the symbols are not all equally likely, assign shorter
codewords to symbols with higher probability and longer codewords
to symbols with lower probability, such that the average length is
smaller.

Consider the following scheme

Symbol ai   Probability pi   Binary String (Codeword)   Codeword length li
0 ½ 0 1
1 ¼ 10 2
2 1/8 110 3
3 1/16 1110 4
4 1/32 11110 5
5 1/64 111110 6
6 1/64 111111 6

Lavg = Σ_{i=0}^{6} pi li = [1/2 + 1/2 + … + 3/32] = 63/32 = 1.96875 bits

• Notice that this is the same as the source entropy!
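• The two averages can be checked with a short sketch (the probabilities and codeword lengths are taken from the tables above):

```python
probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/64]

fixed_lengths    = [3, 3, 3, 3, 3, 3, 3]   # fixed-length code
variable_lengths = [1, 2, 3, 4, 5, 6, 6]   # variable-length code above

def average_length(p, lengths):
    # Lavg = sum(p_i * l_i)
    return sum(pi * li for pi, li in zip(p, lengths))

print(average_length(probs, fixed_lengths))     # 3.0 bits/symbol
print(average_length(probs, variable_lengths))  # 1.96875 bits/symbol = H(z)
```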


Shannon’s noiseless coding theorem

Let (A,z) be a discrete source with probability vector z and entropy


H(z). The average codeword length of any distortionless (uniquely
decodable) coding is bounded by
Lavg ≥ H(z)
In other words, no codes exist that can losslessly represent the source
if Lavg <H(z).

Note that Shannon’s theorem is quite general in that it refers to any


code, not a particular coding scheme.

• Also, it does not specify a scheme to construct codes whose average
length approaches H(z), nor does it claim that a code satisfying
Lavg = H(z) exists.

• Indeed, the Huffman code (to be studied later) is a particular


algorithm for assigning codewords, with average codeword length
satisfying

H(z) ≤ Lavg < H(z) +1


• Using a block coding scheme (code a block of n symbols at a time
instead of a single symbol), one can obtain codewords with average
codeword length satisfying

H(z) ≤ L′avg / n ≡ Lavg < H(z) + 1/n

• The efficiency of an encoding scheme is defined as

η = H(z) / Lavg ≤ 1
Error free (lossless) compression
• Applications: archiving of medical or business documents where lossy
compression is prohibited for legal reasons.

• Typical compression ratio: 2 to 10


• Two operations are involved:

1. Reduce the interpixel redundancies using a mapping or
transformation.
2. Code the resulting representation to eliminate coding redundancies.

• Variable Length Coding (VLC) – Assigns the shortest possible code


words to the most probable gray levels. – The input to VLC could be
the gray levels of an image or the output of a mapping operation.

Lossy compression
• In a lossless compression, the reconstructed signal is identical to the
original signal

• Only a limited amount of compression can be obtained with lossless


compression

• There is a floor (entropy of the source) below which we cannot drive


the size of the compressed signal

• In some applications, the consequences of any loss of information
prohibit lossy compression (bank records, medical images).

• If resources are limited and we do not require absolute integrity,
we can improve the amount of compression by accepting a certain
degree of loss during compression.

• Lossless compression: rate is the only performance measure

• In lossy compression rate by itself is not enough

• A distortion measure (difference between original and reconstructed


data) is also required

• Goal in lossy compression: incur minimum amount of distortion while


compressing to the lowest possible rate

• There is a trade-off between minimizing the rate and keeping the
distortion small.

Image compression standards

• The standards have been developed under the supervision of ISO and
CCITT.

• Binary image compression standards


• The most widely used standards for bilevel image compression are
CCITT Group 3 and Group 4.

• They were originally developed for FAX coding.

• Group 3 includes two coding schemes:
1. 1-D: coding of each line is performed independently of any other line.
2. 2-D: coding of one line is performed using line-to-line correlation.

G3

• 1-D coding is a run-length coding scheme in which each line is
represented as alternating white and black runs from left to right.

• The first run is always a white run. So, if the first pixel is black, the
white run has a length of zero.

• Runs of different lengths occur with different probabilities, therefore


they are coded using VLC.

• CCITT uses Huffman coding

• The number of possible run lengths is huge, and it is not feasible to
build a code book that large.

• The run length rl is expressed as rl = 64·m + t, where
t = 0, 1, …, 63 and m = 1, 2, …, 27.

• To represent rl, we use the corresponding codes for m and t.
• The codes for t are called terminating codes and the codes for m
are called make-up codes.

• A unique EOL (end of line) is used to terminate each line.
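• A sketch of how a run length is decomposed under rl = 64·m + t (the Huffman code tables for the make-up and terminating codes themselves are not reproduced here):

```python
def split_run_length(rl):
    # Split a run length into (m, t) with rl = 64*m + t and t in 0..63.
    # Runs shorter than 64 need only a terminating code (m = 0, no make-up code).
    m, t = divmod(rl, 64)
    return m, t

print(split_run_length(30))   # (0, 30) -> terminating code for 30 only
print(split_run_length(200))  # (3, 8)  -> make-up code for 192, then terminating code for 8
```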

• 2-D coding: rows of a facsimile image are heavily correlated.


Therefore it would be easier to code the transition points with
reference to the previous line.
• a0: the last pixel whose value is known to both encoder and decoder.
At the beginning of encoding each line, a0 refers to an imaginary
white pixel to the left of the first actual pixel. While it is often a
transition pixel, it does not have to be.

• a1: the first transition point to the right of a0. Its color is opposite
to that of a0.

• a2: the second transition pixel to the right of a0. Its color is opposite
to that of a1.

• b1: the first transition pixel on the line above the one currently being
coded, to the right of a0, whose color is opposite to that of a0.

• b2: the first transition pixel to the right of b1 in the line above the
current line

• Pass mode: if b1 and b2 lie between a0 and a1, there is no transition
on the current line before b2. Then a0 moves to the position below b2
and the coding continues.

• If a1 is detected before b2:
– Vertical mode: if the distance between a1 and b1 (in pixels) is less
than or equal to 3, we send the location of a1 with respect to b1,
move a0 to a1, and the coding continues.
– Horizontal mode: if the distance between a1 and b1 is larger than 3,
we fall back to 1-D run-length coding and send the distance between
a0 and a1 and the distance between a1 and a2.
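• A sketch of this mode decision, assuming the transition positions a0, a1, b1, and b2 have already been located as column indices (this shows only the selection logic, not a complete 2-D encoder):

```python
def select_coding_mode(a0, a1, b1, b2):
    # Choose the 2-D coding mode from the transition positions.
    # a0, a1: reference pixel and next transition on the current line
    # b1, b2: transitions on the previous line, to the right of a0
    # (a0 is included for context only; the decision uses a1, b1, b2.)
    if b2 < a1:
        # b1 and b2 both lie between a0 and a1: pass mode (a0 moves below b2).
        return "pass"
    if abs(a1 - b1) <= 3:
        # a1 is close to b1: code a1 relative to b1 (vertical mode).
        return "vertical"
    # Otherwise send the run lengths a1 - a0 and a2 - a1 (horizontal mode).
    return "horizontal"

print(select_coding_mode(a0=4, a1=20, b1=10, b2=15))  # pass
print(select_coding_mode(a0=4, a1=12, b1=10, b2=25))  # vertical
print(select_coding_mode(a0=4, a1=30, b1=10, b2=35))  # horizontal
```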

-------------------------THE END------------------------------
