
Image Processing
Chapter 8
Image Compression
Image Compression
• Digital images require a huge amount of data
• Storage, processing and communications requirements might
be impractical
• More efficient representation of digital images is necessary
• Image compression: reduces the amount of data required to
represent a digital image by removing redundant data
• Theoretical work (1940s, Shannon): information, its
representation, transmission and compression
• Image compression is an enabling technology: multimedia,
video conferencing, remote sensing, Fax, ….
Fundamentals
• Data compression: process of reducing the amount of data
required to represent a given quantity of information
• Data and information are not the same.
• Data is used to convey information.
• Various amounts of data may be used to represent the same
information
Fundamentals
• Suppose that two representations carry the same information, one using n1 units of data and the other n2 units of data.
• Compression ratio: CR = n1/n2
• Relative data redundancy of the first set is defined as:
• RD = 1 − 1/CR
• If n2 << n1, CR approaches infinity and RD approaches one: the first representation (n1) is highly redundant.
• If n2 >> n1, CR approaches zero and RD approaches minus infinity: the second representation contains more data than the first (expansion instead of compression).
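As a quick numerical illustration of these two formulas, the short Python sketch below (the data sizes are made up for the example) computes CR and RD:

```python
def compression_stats(n1, n2):
    """Compression ratio CR = n1/n2 and relative data redundancy RD = 1 - 1/CR."""
    cr = n1 / n2
    rd = 1 - 1 / cr
    return cr, rd

# Example: 10,000 units of data re-represented with 1,000 units
print(compression_stats(10_000, 1_000))   # (10.0, 0.9) -> 90% of the first set is redundant
```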
Fundamentals
• A typical compression ratio: 10
• CR=10, RD=1-1/ 10=0.9
• 90% of the data in the first set is redundant

• Three forms of data redundancy exist in digital images:
1. Coding redundancy
2. Spatial and temporal redundancy
3. Irrelevant information
Coding redundancy
• The gray-level histogram of an image can provide a great
deal of insight into the construction of codes to reduce
the amount of data used to represent the image
pr(rk) = nk / MN,   k = 0, 1, …, L−1

(normalized histogram: the probability of the k-th gray level, where nk is the number of pixels with gray level rk and MN is the total number of pixels)
Coding redundancy
• Suppose that gray-level rk is represented by l(rk ) bits
• Average length of the code words used for the image:

Lavg = Σk l(rk) pr(rk)

• Total bits = MN · Lavg
• The average length of a code is sometimes called the rate of that code
Coding redundancy
• If the same number of bits is used for all gray levels (natural m-bit binary code), then Lavg = m.
• If fewer bits are assigned to the more probable gray levels than to the less probable ones, compression can be achieved.
• This is called Variable Length Coding (VLC).
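A minimal sketch of how Lavg would be estimated from an image histogram, assuming an 8-bit image and a 256-entry list of code-word lengths (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def average_code_length(image, code_lengths):
    """Lavg = sum_k l(r_k) * p_r(r_k), estimated from the normalized histogram."""
    hist = np.bincount(image.ravel(), minlength=len(code_lengths)).astype(float)
    p = hist / hist.sum()                                  # p_r(r_k) = n_k / MN
    return float(np.sum(np.asarray(code_lengths) * p))

# A natural 8-bit code uses l(r_k) = 8 for every level, so Lavg = 8 bits/pixel;
# a VLC that gives frequent gray levels shorter code words lowers Lavg.
```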
Spatial and temporal redundancy
• Codes designed based on the histogram do not exploit the correlation between pixels.
• These correlations result from the structural or geometrical
relationships between the objects in the image.
• The value of any pixel is highly correlated to its neighbors
and the information carried by individual pixels is small.
Spatial and temporal redundancy
• How to reduce spatial and temporal redundancy?
• The image (which is a 2-D array) is transformed into a more
efficient format.
• This process is called mapping or transformations.
• The transformation should reduce the redundancy.
• Example: the transform could be the difference between
adjacent pixels
• The transform (or mapping) should be reversible. The original
image should be recoverable from the transform.
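A sketch of such a reversible mapping for one image row, using the adjacent-pixel difference mentioned above (NumPy assumed; the function names are illustrative):

```python
import numpy as np

def difference_map(row):
    """Reversible mapping: keep the first pixel, then store neighbor differences."""
    row = np.asarray(row, dtype=int)
    return np.concatenate(([row[0]], np.diff(row)))

def inverse_difference_map(mapped):
    """Recover the original row by cumulative summation."""
    return np.cumsum(mapped)

row = [52, 53, 53, 55, 90, 91]
assert list(inverse_difference_map(difference_map(row))) == row
# The differences are mostly small numbers, which a VLC can then code with few bits.
```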
Irrelevant Information

 The eye does not respond with equal sensitivity to all visual information.
 Certain information has less relative importance than other information.
 This information is said to be psychovisually
redundant. It can be eliminated without
significantly impairing the quality of image
perception.
 Since the elimination of irrelevant information
results in a loss of quantitative information, it is
commonly referred to as quantization.
Measuring Image Information
• A random event E that occurs with probability P(E) is said to contain

I(E) = log(1/P(E)) = −log P(E)

units of information. I(E) is called the self-information of E.
• If P(E) = 1 then I(E) = 0: no uncertainty is associated with the event, so no information is conveyed by communicating that E has occurred.
Elements of Information Theory
• An information source generates a random sequence of symbols.
• The set of source symbols (source alphabet) is A = {a1, a2, …, aJ}, with corresponding probabilities P(a1), P(a2), …, P(aJ).
Elements of Information Theory
• Average information per source output:

H = −Σj P(aj) log P(aj)

• H is called the uncertainty or entropy of the source.
• H is the average amount of information obtained by observing a single source output.
• If the source symbols are equally probable, the entropy is maximized.
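A small sketch of the entropy computation (base-2 logarithm, so H is in bits/symbol; NumPy assumed):

```python
import numpy as np

def entropy(probabilities):
    """H = -sum_j P(a_j) * log2 P(a_j); zero-probability symbols are skipped."""
    p = np.asarray(probabilities, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

print(entropy([0.5, 0.5]))    # 1.0 bit/symbol (maximum for two equally probable symbols)
print(entropy([2/3, 1/3]))    # about 0.918 bits/symbol
```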
Noiseless coding
theorem
 What is the minimum average code word length
that can be achieved?
 Noiseless coding theorem or Shannon’s first
theorem.
 The minimum average code word length that can
be achieved is equal to the entropy of the source.
 One of the ways to get closer and closer to the entropy is to group the source outputs together and build n-tuples (the nth extension of the source).
Noiseless coding
theorem
Entropy: H=0.918 bits/symbol
Average code word length per symbol (first extension) = 1
Average code word length per symbol (second extension) =
1.89/2=0.945
Fidelity Criteria
• Since information may be lost during compression, a means of quantifying the information loss is desirable.
• Two types of criteria:
1. Objective fidelity criteria
2. Subjective fidelity criteria
• Objective fidelity criteria: the information loss is expressed
as a function of input image (original image) and output
image (compressed and decompressed)
Fidelity Criteria
• Root mean square (rms) error:

erms = [ (1/MN) Σx Σy ( f̂(x,y) − f(x,y) )² ]^(1/2)

• Mean square signal-to-noise ratio:

SNRms = Σx Σy f̂(x,y)² / Σx Σy ( f̂(x,y) − f(x,y) )²
Fidelity Criteria
• Objective fidelity criteria: simple and convenient, sometimes
misleading
• Subjective fidelity criteria: measuring image quality by the
subjective evaluation of a group of viewers and averaging
their evaluations.
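A direct NumPy translation of the two objective measures defined above (f and f_hat are equal-size arrays; the SNR assumes the reconstruction differs from the original somewhere):

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between the original f and the reconstruction f_hat."""
    diff = f_hat.astype(float) - f.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the reconstructed image."""
    diff = f_hat.astype(float) - f.astype(float)
    return float(np.sum(f_hat.astype(float) ** 2) / np.sum(diff ** 2))
```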
Image Compression Models
• A compression system consists of two distinct structural blocks: an encoder and a decoder.
• Encoder: creates a set of symbols from the input image
• Decoder: reconstructs an output image
Encoder
• Encoder: responsible for reducing or eliminating any coding, interpixel or psychovisual redundancies in the input image.
• Encoding can be modeled by three independent operations.
• Each operation is designed to reduce one of the three redundancies.

Mapper → Quantizer → Symbol Encoder
Encoder
• Mapper (transform): transforms the input data into a format designed to reduce spatial and temporal redundancies.
– This operation generally is reversible.
– It may or may not reduce the amount of data required to represent the image.
• Quantizer: reduces the accuracy of the mapper's output.
– This stage removes the irrelevant information.
– This operation is irreversible. It should be omitted when error-free (lossless) compression is desired.

Mapper → Quantizer → Symbol Encoder
Encoder
• Symbol coder: creates a fixed or variable length code to
represent the quantizer’s output.
– In most cases a VLC is used (shortest code words to the most
frequently occurring values of the quantizer’s output)
– The operation is reversible.

Mapper → Quantizer → Symbol Encoder
Decoder
• Contains two components:
• Symbol decoder
• Inverse mapper
• They perform the inverse operations of symbol encoder and
mapper.

Symbol Decoder → Inverse Mapper
Image Formats,
Compression Standards
Huffman coding
• Huffman coding is the most popular technique for VLC
coding and removing the coding redundancies.
• The first step in Huffman coding is to create a series of source
reductions by ordering the probabilities of the symbols and
combining the lowest probabilities into a single symbol that
replaces them in the next source reduction.
• The second step is to code each reduced source.
Huffman coding
Symbol   Prob.   Code     Reduction 1    Reduction 2    Reduction 3    Reduction 4
a2       0.4     1        0.4  1         0.4  1         0.4  1         0.6  0
a6       0.3     00       0.3  00        0.3  00        0.3  00        0.4  1
a1       0.1     011      0.1  011       0.2  010       0.3  01
a4       0.1     0100     0.1  0100      0.1  011
a3       0.06    01010    0.1  0101
a5       0.04    01011

Lavg=0.4+0.3x2+0.1x3+0.1x4+0.06x5+0.04x5=2.2 bits/symbol
H=2.14 bits/symbol
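A compact sketch of the two Huffman steps (source reduction via a min-heap, then code assignment). The exact code words may differ from the table above when ties are broken differently, but the average length is still 2.2 bits/symbol:

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} mapping (sketch)."""
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    tie = len(heap)                          # tie-breaker so the dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # combine the two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, tie, merged))
        tie += 1
    return heap[0][2]

probs = {"a2": 0.4, "a6": 0.3, "a1": 0.1, "a4": 0.1, "a3": 0.06, "a5": 0.04}
codes = huffman_code(probs)
L_avg = sum(probs[s] * len(codes[s]) for s in probs)   # 2.2 bits/symbol
```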
Huffman coding

 Huffman code is an instantaneous, uniquely decodable, block code:
 Instantaneous: each code word in a string of code
symbols can be decoded without referencing
following symbols
 Block code: each source symbol is mapped into a
fixed code word.
 Unique: each string of code symbols can be
decoded in only one way.
Golomb Coding
• Coding of nonnegative integers with an exponentially decaying probability distribution
• n: nonnegative integer to be coded; m: parameter of the Golomb code
1. Form the unary code of the quotient ⌊n/m⌋ (the unary code of an integer q is q 1s followed by a 0)
2. Let k = ⌈log2 m⌉, c = 2^k − m, r = n mod m, and compute

r′ = r truncated to k−1 bits,     if 0 ≤ r < c
r′ = r + c truncated to k bits,   otherwise

3. Concatenate the results of steps 1 and 2 (a sketch of these steps follows)
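A minimal Python sketch of the three steps, assuming m > 1 (the m = 1 corner case is ignored):

```python
import math

def golomb_encode(n, m):
    """Golomb code of a nonnegative integer n with parameter m > 1 (sketch)."""
    q, r = divmod(n, m)
    unary = "1" * q + "0"                    # step 1: unary code of the quotient
    k = math.ceil(math.log2(m))
    c = (1 << k) - m
    if r < c:                                # step 2: truncated binary code of the remainder
        binary = format(r, "b").zfill(k - 1)
    else:
        binary = format(r + c, "b").zfill(k)
    return unary + binary                    # step 3: concatenate

# G_4(9): q = 2, r = 1, k = 2, c = 0  ->  "110" + "01" = "11001"
print(golomb_encode(9, 4))
```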
Golomb Coding
• Golomb codes are seldom used for coding image intensities directly
• Intensity differences can be coded using Golomb codes after converting negative values to nonnegative ones:

M(n) = 2n,          n ≥ 0
M(n) = 2|n| − 1,    n < 0
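A direct translation of this mapping, so the differences 0, −1, 1, −2, 2, … become 0, 1, 2, 3, 4, …:

```python
def to_nonnegative(n):
    """M(n) = 2n for n >= 0 and 2|n| - 1 for n < 0, so small differences stay small."""
    return 2 * n if n >= 0 else 2 * abs(n) - 1

print([to_nonnegative(d) for d in (0, -1, 1, -2, 2)])   # [0, 1, 2, 3, 4]
```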
Arithmetic Coding
• There is no one to one correspondence between source
symbols and code words.
• An entire sequence of source symbols is assigned a single
code word.
• Each source symbol is represented by an interval in [0,1). As
the number of symbols increases the size of the interval
reduces in accordance with the probability of occurrence of
symbols.
Arithmetic Coding
Symbol   Probability   Initial interval
a1       0.2           [0.0, 0.2)
a2       0.2           [0.2, 0.4)
a3       0.4           [0.4, 0.8)
a4       0.2           [0.8, 1.0)
Sequence to be coded: a1 a2 a3 a3 a4
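A sketch of the interval-narrowing step for this example; only the final interval is computed, and turning it into an actual binary code word is omitted:

```python
def arithmetic_intervals(sequence, model):
    """Narrow [0, 1) symbol by symbol; model maps each symbol to its (low, high) subinterval."""
    low, high = 0.0, 1.0
    for s in sequence:
        s_low, s_high = model[s]
        width = high - low
        high = low + width * s_high
        low = low + width * s_low
    return low, high

model = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}
print(arithmetic_intervals(["a1", "a2", "a3", "a3", "a4"], model))
# (0.06752, 0.0688): any number in this interval, e.g. 0.068, identifies the whole sequence
```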
Arithmetic Coding
• Inaccurate probability models can lead to non-optimal results
• Solution: use an adaptive, context dependent probability
model
• Adaptive: symbols probabilities are updated as the symbols
are coded.
• Context dependent: probabilities are based on a predefined
neighborhood of pixels around the symbol being coded.
LZW
• At the beginning of the coding process, a codebook or
dictionary containing the source symbols to be coded is
constructed.
• The encoder examines each symbol in the message and
concatenates it with the symbols following it to find
the longest sequence which is in the dictionary.
• If after concatenating, the sequence is not in the dictionary,
the encoder adds the sequence as the new element to the
dictionary.
LZW: Example
Encode “wabba_wabba_wabba_wabba_woo_woo_woo”
Output: 5 2 3 3 2 1 6 8 10 12 9 11 7 16 5 4 4 11 21 23 4
Dictionary:
1. _ 6. wa 11. _w 16. ba_ 21. oo
2. a 7. ab 12. wab 17. _wa 22. o_
3. b 8. bb 13. bba 18. abb 23. _wo
4. o 9. ba 14. a_w 19. ba_w 24. oo_
5. w 10. a_ 15. wabb 20. wo 25. _woo
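The sketch below reproduces this example: starting from the five-symbol dictionary, it emits exactly the output sequence listed above:

```python
def lzw_encode(message, alphabet):
    """LZW encoding sketch; dictionary indices start at 1, as in the example above."""
    dictionary = {sym: i + 1 for i, sym in enumerate(alphabet)}
    next_code = len(dictionary) + 1
    output, current = [], ""
    for ch in message:
        candidate = current + ch
        if candidate in dictionary:              # keep growing the recognized sequence
            current = candidate
        else:
            output.append(dictionary[current])   # emit the longest dictionary match
            dictionary[candidate] = next_code    # and add the new sequence
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

codes = lzw_encode("wabba_wabba_wabba_wabba_woo_woo_woo", ["_", "a", "b", "o", "w"])
# codes == [5, 2, 3, 3, 2, 1, 6, 8, 10, 12, 9, 11, 7, 16, 5, 4, 4, 11, 21, 23, 4]
```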
LZW
• LZW compression has been integrated in many imaging file
formats including GIF, TIFF and PDF.
• Most practical applications require a strategy for handling
dictionary overflow
• A simple solution is to flush or reinitialize the dictionary
when it becomes full
• A more complex solution is to monitor compression
performance and flush the dictionary when compression
becomes poor or unacceptable.
Run Length Coding
• Run-length coding: coding the length of runs instead of coding
individual values
– Example: 190 white pixels, followed by 30 black, followed by 210 white
– Instead of coding 430 pixels individually, we code the sequence of
190,30, 210 along with an indication of the color of the first string
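A minimal run-length extractor for one row (binary pixels assumed, 1 = white; the function name is illustrative):

```python
def run_lengths(row):
    """Return the value of the first pixel and the list of run lengths along the row."""
    runs, current, count = [], row[0], 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return row[0], runs

# 190 white pixels, 30 black, 210 white -> first value 1, runs [190, 30, 210]
print(run_lengths([1] * 190 + [0] * 30 + [1] * 210))
```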
G3
• Group 3: includes two coding schemes:
1. 1-D: coding of each line is performed independently of any other line
2. 2-D: coding of one line is performed using line to line correlation.
• 1-D coding: is a run-length coding scheme in which each line
is represented as alternative white and black runs from left
to right.
• The first run is always a white run. So, if the first pixel is
black, the white run has a length of zero.
• Runs of different lengths occur with different probabilities,
therefore they are coded using VLC (variable length codes).
• CCITT uses Huffman coding
G3
• The number of possible run lengths is huge and it is not feasible to build a code book that large.
• The run length rl is expressed as:
– rl=64*m+t t=0,1,..,63 m=1,2,…, 27
• To represent rl, we use the corresponding codes for m and
t.
• The code for t are called terminating codes and for m make-up
codes.
• If rl < 64, only a terminating code is used
• Otherwise both a make-up code and a terminating code are
used
• A unique EOL (end of line) codeword 000000000001 is used
to terminate each line.
G3

 2-D coding (Modified READ, MR): rows of a facsimile image are heavily correlated. Therefore it is easier to code the transition points with reference to the previous line.
 a0: the last pixel known to both encoder and
decoder. At the beginning of encoding each line
a0 refers to an imaginary white pixel to the left of
actual pixel. While it is often a transition pixel it
does not have to be.
 a1: the first transition point to the right of
a0. It has an opposite color of a0.
 a2: the second transition pixel to the right of a0.
Its color is opposite of a1.
G3
• b1: the first transition pixel on the line above currently being
coded, to the right of a0 whose color is opposite of a0.
• b2: the first transition pixel to the right of b1 in the line above
the current line
G3
• If b1 and b2 lie between a0 and a1 (pass mode): no transition
until b2. Then a0 moves to b2 and the coding continues.
Transmitter transmits code 0001.
• If a1 is detected before b2
– If distance between a1 and b1 (number of pixels) is less than or equal
to 3, we send the location of a1 with respect to b1, move a0
to a1 and the coding continues (vertical mode)
– If the distance between a1 and b1 is larger than 3, we go back to 1-D
run length coding and encode the distance between a0 and a1
and a1 and a2 (horizontal mode) using run-length coding. a0 is
moved to a2 and the coding continues.
Symbol Coding
• Symbol or token coding: image is represented as a collection
of frequently occurring sub-images called symbol
• Symbols are stored in the symbol dictionary
• Image is coded as a set of triplets {(x1, y1, t1), (x2, y2, t2), …}
• Storing repeated symbols only once can compress images
Bit-plane coding
• This method is based on decomposing a multilevel
(monochrome or color) image into a series of binary images
and compressing each binary image.
• We review different approaches for decomposition and
compression of binary images.
Bit plane decomposition
[Figure: an 8-bit pixel value decomposed into bit planes 7 (most significant) through 0 (least significant)]
Bit plane slicing
[Figure: a small block of 8-bit values (e.g. 17, 64, 128, 15, 63, 132, 11, 60, 142) sliced into binary planes from the MSB down to the LSB]
Bit-plane decomposition
• Disadvantage: a small change in gray level can have a significant impact on the bit planes.
• Example: two adjacent pixels with values of 127 and 128.
– 127: 01111111
– 128: 10000000
– There is a transition (0 to 1 or 1 to 0) in every bit plane at the location of these pixels.
– This transition reduces the coding efficiency.
Bit-plane decomposition
• Alternative method: first represent the image by an m-bit Gray code.
• Property of the Gray code: successive code words differ in only one bit position.
• Small changes in gray level are less likely to affect all bit planes.
• Example: two adjacent pixels with values of 127 and 128.
– 127: 01000000
– 128: 11000000
• To convert a(m−1) a(m−2) … a0 to Gray code:

g(m−1) = a(m−1),   gi = ai ⊕ a(i+1) for 0 ≤ i ≤ m−2   (⊕ denotes exclusive OR)
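The same conversion in its usual shift-and-XOR form (equivalent to the bitwise rule above):

```python
def to_gray(value):
    """Binary-to-Gray conversion: g_i = a_i XOR a_{i+1}, computed as value XOR (value >> 1)."""
    return value ^ (value >> 1)

print(format(to_gray(127), "08b"))   # 01000000
print(format(to_gray(128), "08b"))   # 11000000 -> only one bit plane changes
```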


Compressing the bit planes
• A simple but effective method of compressing a binary image (or bit plane) is to use special code words to identify large areas of contiguous 1s or 0s.
• Constant Area Coding:
– The image is divided into mxn blocks.
– The blocks are classified as all white, all black or mixed.
– The most probable category is assigned the one bit code word 0.
– The other two categories are assigned the code words 10 and 11.
– The mixed intensity category code is followed by the mn bit patterns
of the block.
Compressing the bit planes
• Binary tree based method: the binary image is decomposed successively into smaller and smaller subblocks.
• A white block is coded as a 0 and all other blocks are assigned
a prefix of 1 and divided into smaller subblocks.
• If a subblock is white it is represented by 0 so the code will be
10.
• If the subblock is not white, the decomposing process is
repeated until a predefined subblock size is reached and coded
as either a 0 (if it is all white) or a 1 followed by the block bit
pattern.
Compressing the bit planes
• One-dimensional run-length coding:
• Each row of an image or bit plane is represented by a
sequence of lengths that describe successive runs of black
and white pixels.
• Example: 0000111111000 -> 4 6 3
• The problem is to determine the value of the first run.
• The most common approaches for determining the value of
first run:
1. To specify the value of the first run in each row
2. To assume that each row begins with a white run whose length may
be zero
Transform Coding
• The modification is performed on the transform of an image
• A reversible, linear transform is used to map the image into a
set of transform coefficients.
• The transform coefficients are then quantized and coded.
• A significant number of the coefficients have small magnitude
and can be coarsely quantized (or discarded entirely) with
little image distortion.
Transform Coding
• The goal of the transformation process is to decorrelate the pixels, or to pack as much information as possible into the smallest number of coefficients.
• Note that compression is achieved during quantization of the transformed coefficients, not during the transform itself.
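A sketch of the whole idea on one 8x8 block: an orthonormal DCT decorrelates the pixels, only the few largest-magnitude coefficients are retained (a threshold-style mask), and the inverse transform gives the approximation. NumPy only; the block contents here are arbitrary test data and the function names are illustrative:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (8x8 blocks assumed, as in common codecs)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2)
    return C * np.sqrt(2 / n)

def transform_code_block(block, keep=10):
    """Forward 2-D DCT, keep only the `keep` largest-magnitude coefficients, invert."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T                              # decorrelated coefficients
    threshold = np.sort(np.abs(coeffs).ravel())[-keep]
    retained = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
    return C.T @ retained @ C                             # approximate reconstruction

block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
approx = transform_code_block(block, keep=10)             # 54 of 64 coefficients discarded
```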
Transform Coding
• The choice of particular transform depends on the amount of
reconstruction error that can be tolerated and computational
resources available.

• Commonly used transforms: Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Walsh-Hadamard Transform (WHT)

 For the WHT, bk(z) is the k-th bit (from right to left) in the binary representation of z, and

p0(u) = b(m−1)(u)
p1(u) = b(m−1)(u) + b(m−2)(u)
…
p(m−1)(u) = b1(u) + b0(u)
Transform Coding
• Transformations that pack the most information into the
fewest coefficients provide the best approximation,
and consequently the smallest reconstruction error.
• Karhunen-Loeve Transform (KLT) is the optimal
transformation in an information packing sense.
• It minimizes the mean-square-error for any input image and
any number of retained coefficients.
• KLT is data dependent: obtaining the KLT basis images is
computationally complex.
Transform Coding
• KLT is seldom used in practice, instead DCT whose bases
images are fixed, normally is selected.
• DCT has been used in international standards for image
compression.
• It has been implemented in a single IC.
• Subimage size: affects transform coding error and
computational complexity.
– Normally it is selected as a power of 2 (simplifies
the computations)
Bit allocation
• Bit allocation: which transform coefficients are kept and
what is the precision that is used to represent them.
• Retaining coefficients is equivalent to applying a mask to the
coefficients. The mask is one where the coefficients are
kept and zero where the coefficients are discarded.
• Precision of representation: number of levels of scalar
quantizer used for a coefficient
• Retaining the coefficients:
1. Maximum variance (zonal coding)
2. Maximum magnitude (threshold coding)
Zonal Coding
• Uses the information theory concept of viewing information
as uncertainty.
• Transform coefficients of maximum variance carry the most
picture information and should be retained.
• How to calculate the variance:
1. From the (N/n)×(N/n) transformed subimage arrays
2. From an assumed image model
• Coefficients of maximum variance are located around the
origin of the transform
• The retained coefficients must be quantized and coded.
Threshold coding
• In zonal coding a single, fixed mask is used for all subimages.
• Threshold coding is adaptive: the location of the transform
coefficients retained for each subimage vary from one
subimage to another one.
• For each subimage, the transform coefficients of largest
magnitude make the most significant contribution to
reconstructed subimage quality.
• When the mask is applied to the subimage for which it was
derived, the resulting nxn array is reordered (zigzag scanning)
and the 1-D sequence which contains long runs of zeros is
run-length coded.
Threshold coding
• What is the threshold and how is it obtained?
1. A single global threshold for all subimages
2. A single threshold for each subimage
3. The threshold can be varied as a function of the location of each
coefficient within the subimage.
Lossless predictive
coding
 In this technique there is no need to decompose
the image into bit planes.
 This technique eliminates the interpixel
redundancies of closely spaced pixels.
 How? By extracting and coding only the new
information in each pixel
 New information: the difference between actual
and predicted value of that pixel
Lossless predictive coding
• Predictors in the encoder and decoder are the same.
• Various local, global and adaptive methods can be used to
generate the prediction
Lossless predictive coding
• A linear predictor is common:
• Previous pixels are used to estimate the value of the current pixel
• The previous pixels could be on the same row (column) as the current pixel (1-D prediction) or around the current pixel (2-D)

f̂(n) = round( Σ(i=1..m) αi f(n − i) )
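A 1-D sketch of the encoder/decoder pair; with the default weight (1.0,) the predictor is simply the previous pixel, so only small prediction errors are transmitted (NumPy assumed; function names illustrative):

```python
import numpy as np

def predictive_encode(row, alphas=(1.0,)):
    """Lossless predictive coding sketch: the first len(alphas) pixels are sent as-is,
    every later pixel is replaced by its rounded prediction error."""
    m = len(alphas)
    row = np.asarray(row, dtype=int)
    out = row.copy()
    for n in range(m, len(row)):
        pred = round(sum(a * row[n - i - 1] for i, a in enumerate(alphas)))
        out[n] = row[n] - pred
    return out

def predictive_decode(errors, alphas=(1.0,)):
    """Rebuild the row by adding each error to the same prediction the encoder used."""
    m = len(alphas)
    row = np.asarray(errors, dtype=int).copy()
    for n in range(m, len(row)):
        pred = round(sum(a * row[n - i - 1] for i, a in enumerate(alphas)))
        row[n] = errors[n] + pred
    return row

row = [100, 102, 103, 103, 180, 181]
assert list(predictive_decode(predictive_encode(row))) == row
```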
Lossy compression
• Lossy compression: compromising the accuracy of the
reconstructed image in exchange for increased compression
• Good reconstructed images even with a compression factor of
30:1.
• The principal difference between lossy and lossless compression is the presence or absence of the quantizer.
Lossy predictive coding
• The prediction at the decoder and encoder should be the same.
• This prevents error build-up at the decoder output.
Delta modulator
Optimal quantizer

 t = q(s)
 The quantizer is completely described by the ti and si in the first quadrant.
 si: decision points
 ti: reconstruction levels
 Design problem: select si and ti for a particular
optimization criterion and input probability
density function p(s)
Optimal quantizer
If mean square error is used and p(s) is assumed to
be even then (Lloyd-Max quantizer):
Wavelet coding
• Image is wavelet transformed
• Many of the computed coefficients carry little visual information; they can be quantized.
• The quantization can exploit positional correlation across
decomposition levels.
Wavelet coding
• The wavelet functions selected for compression affect the performance of the wavelet compression system.
• The most widely used wavelet functions for compression are
Daubechies wavelets.
• Another factor affecting wavelet coding computational
complexity and reconstruction error is the number of levels.
• In many applications, the resolution of image and the scale of
the lowest useful approximation normally determine the
number of transform levels.
Wavelet coding
• The largest factor affecting wavelet compression and reconstruction error is coefficient quantization
• The effectiveness of the quantization can be improved significantly by
1. introducing an enlarged quantization interval around zero, called a dead zone (see the sketch below)
2. adapting the size of the quantization interval from scale to scale
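A sketch of point 1 above: a uniform quantizer whose zero bin is twice as wide as the others, so small coefficients are zeroed out; reconstruction is taken at the middle of each bin (NumPy assumed):

```python
import numpy as np

def dead_zone_quantize(coeffs, step):
    """Uniform quantizer with a dead zone: the zero bin spans (-step, step),
    twice the width of every other bin, so small coefficients collapse to zero."""
    return np.sign(coeffs) * np.floor(np.abs(coeffs) / step)

def dead_zone_dequantize(indices, step):
    """Reconstruct at the centre of each bin (zero stays zero)."""
    return np.sign(indices) * (np.abs(indices) + 0.5) * step

c = np.array([-7.2, 0.4, 0.3, 1.8, 9.6])
q = dead_zone_quantize(c, step=2.0)          # bin indices -3, 0, 0, 0, 4
print(dead_zone_dequantize(q, step=2.0))     # [-7.  0.  0.  0.  9.]
```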
JPEG 2000
• JPEG2000 extends JPEG to provide increased flexibility in
compression and access to compressed data
• The first step of the encoding process is to DC level shift the samples of the image by subtracting 2^(Ssiz−1) (Ssiz is the number of bits representing each pixel)
• After the image has been level shifted it is optionally divided
into tiles
• Tiles are rectangular arrays of pixels
• One dimensional discrete wavelet transform of rows and
columns of each tile is then computed
• For lossless compression, transform is based on a
biorthogonal 5-3 coefficient scaling and wavelet vector
JPEG 2000
• A rounding procedure is defined for non-integer valued
transform coefficients
• In lossy compression, a 9-7 coefficient scaling and wavelet
vector is employed
• The transform produces four subbands, an approximation,
horizontal, vertical and diagonal
• Transform might be repeated on the approximation
component
• Coefficients in subbands are quantized
• The final steps of encoding are coefficient bit modeling,
arithmetic coding, bit stream layering and packetizing.
JPEG 2000
• Coefficients of each transformed tile subbands are arranged
into rectangular blocks called code blocks
• Code blocks are individually coded a bit plane at a time
• Each bit plane is processed in three passes: significance propagation, magnitude refinement, and cleanup
• Outputs are arithmetically coded and grouped with similar
passes from other code blocks to form layers
Video compression standards
• Video compression standards can be grouped into two categories:
1. video teleconferencing standards (H.261, H.262, H.263)
2. multimedia standards (MPEG-1, MPEG-2, MPEG-4)
• Both groups are built around a hybrid block-based DPCM/
DCT coding scheme
• There are three basic types of encoded output frames
1. Frames that are coded without any reference to past frames (I
frames)
Video compression
standards
– Because I frames do not use temporal correlation, the compression
rate is low
– I frames provide the highest degree of random access, ease of editing,
and greatest resistance to errors
2. Predictive coded (P frames): are coded using motion
compensation prediction from the last I or P frame,
whichever happens to be closest.
– Compression efficiency of P frames is substantially higher than I
frames.
Video compression standards
3. Bidirectionally predictive coded (B frames): achieve a high level of compression by using motion compensated prediction from the most recent frame and the closest future frame.
– By using both past and future for prediction, generally we can get
better compression than if we only use prediction based on the past.
Video compression
standards
Lossy compression
• In a lossless compression, the reconstructed signal is identical
to the original signal
• Only a limited amount of compression can be obtained with
lossless compression
• There is a floor (entropy of the source) below which we
cannot drive the size of the compressed signal
• In some applications the consequences of information loss prohibit any loss of information (bank records, medical images)
Lossy compression
• If the resources are limited and we do not require absolute
integrity, we can improve the amount of compression by
accepting certain degree of loss during compression
• Lossless compression: rate is the only performance
measure
• In lossy compression rate by itself is not enough
• A distortion measure (difference between original and
reconstructed data) is also required
• Goal in lossy compression: incur minimum amount of
distortion while compressing to the lowest possible rate
• There is a tradeoff between minimizing rate and keeping
distortion small
Source Coding Theorems
• What is the smallest rate, subject to a given distortion, at which the information of the source can be represented?
• For every source there is a Rate-Distortion (RD) function that gives the
minimum rate achievable for a given distortion.
• RD function can be computed analytically for simple sources and
distortion measures and iteratively for more complex situation.
• RD functions have typical form shown below.
• The RD function is a lower bound, and the source coding theorem tells us that we can get as close to this lower bound as we want.
Error-free (lossless) compression
• Applications: archiving of medical or business documents where lossy compression is prohibited for legal reasons.
• Typical compression ratio: 2 to 10
• 2 operations are involved:
1. Reduce the interpixel redundancies using a mapping or
transformation
2. Coding the representation to eliminate coding
redundancies

• Variable Length Coding (VLC)


– Assigns the shortest possible code words to the most probable gray
levels.
– The input to VLC could be the gray levels of an image or the output
of a mapping operation.
Contour tracing and coding
• Different approaches for representing the contour of a binary image:
1. Set of boundary points
2. A single boundary point and a set of directions (directional contour tracking)
3. Predictive differential quantization (PDQ)

 Δ′: difference between the starting coordinates of the front contour on adjacent lines
 Δ′′: difference between the front-to-back contour lengths (d1 − d2)
• New start: a new contour begins
• Merge: an old contour ends
• If Δ′′ is replaced by Δ′′′ (the difference between the back contour coordinates of adjacent lines), the code is called double delta coding (DDC).
Image compression standards
• The standards have been developed under the supervision of ISO and CCITT.
• Binary image compression standards
• The most widely used standards for bilevel image
compression are CCITT Group 3 and Group 4.
• They were originally developed for FAX coding.
• Group 3: includes two coding schemes:
1. 1-D: coding of each line is performed independently of
any other line
2. 2-D: coding of one line is performed using line to line
correlation.
G3
• 1-D coding: is a run-length coding scheme in which each line
is represented as alternative white and black runs from left
to right.
• The first run is always a white run. So, if the first pixel is
black, the white run has a length of zero.
• Runs of different lengths occur with different probabilities,
therefore they are coded using VLC.
• CCITT uses Huffman coding
G3

 The number of possible run lengths is huge and it is not feasible to build a code book that large.
 The run length rl is expressed as:
 – rl=64*m+t t=0,1,..,63 m=1,2,…, 27
 To represent rl, we use the corresponding codes
for m and t.
 The code for t are called terminating codes and for
m make-up codes.
 A unique EOL (end of line) is used to terminate
each line.
G3
• 2-D coding: rows of a facsimile image are heavily correlated.
Therefore it would be easier to code the transition points
with reference to the previous line.
• a0: the last pixel known to both encoder and decoder. At the
beginning of encoding each line a0 refers to an imaginary
white pixel to the left of actual pixel. While it is often a
transition pixel it does not have to be.
• a1: the first transition point to the right of a0. It has an
opposite color of a0.
• a2: the second transition pixel to the right of a0. Its color is
opposite of a1.
G3
• b1: the first transition pixel on the line above currently being
coded, to the right of a0 whose color is opposite of a0.
• b2: the first transition pixel to the right of b1 in the line above
the current line

b1 b2

a0 a1 a2
G3
• If b1 and b2 lie between a0 and a1 (pass mode): no transition
until b2. Then a0 moves to b2 and the coding continues.
• If a1 is detected before b2
– If distance between a1 and b1 (number of pixels) is less than or equal
to 3, we send the location of a1 with respect to b1, move a0
to a1 and the coding continues (vertical mode)
– If the distance between a1 and b1 is larger than 3, we go back to 1-D
run length coding and send the distance between a0 and a1 and
a1 and a2 (horizontal mode)
Still image compression
• JPEG: the most popular and comprehensive continuous-tone still image compression standard.
• JPEG defines 3 different coding systems:
1. Baseline coding (lossy, very common)
2. Extended coding, for greater compression, higher precision or
progressive reconstruction applications
3. Lossless coding

• Baseline system (sequential baseline system)


• Input and output precision is limited to 8 bits.
• Quantized DCT values are restricted to 11 bits.
JPEG

 The image is divided into 8x8 blocks, which are processed from left to right and top to bottom.
 The 64 pixel values in each block are first level
shifted by subtracting 128.
 The 2-D DCT of the block is then computed and
quantized and reordered using the zigzag
pattern.
 The DC coefficient ( first DCT coefficient) is
differentially coded relative to the DC coefficient of
the previous block, and then Huffman coded.
 The rest of coefficients are coded using a variable
length code that uses the value of the coefficients
and number of zeros before the coefficient.
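One common way to implement the zigzag reordering mentioned above, sketched for an n x n block of quantized coefficients (this is the usual JPEG scan order, shown on a 4x4 block for brevity; the function name is illustrative):

```python
import numpy as np

def zigzag(block):
    """Reorder an n x n coefficient block along anti-diagonals, alternating direction."""
    n = block.shape[0]
    def key(ij):
        i, j = ij
        d = i + j
        return (d, i if d % 2 else j)      # odd diagonals go down-left, even ones up-right
    order = sorted(((i, j) for i in range(n) for j in range(n)), key=key)
    return np.array([block[i, j] for i, j in order])

demo = np.arange(16).reshape(4, 4)
print(zigzag(demo))   # [ 0  1  4  8  5  2  3  6  9 12 13 10  7 11 14 15]
```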
JPEG

 Since the number of values that the quantized coefficients can assume is large, a Huffman code for such an alphabet is unmanageable.
 The values are divided into ranges
(categories) and the categories are Huffman
coded.
 The elements within each category are specified by tacking on some extra bits.
