JPEG COMPRESSION
https://medium.com/breaktheloop/jpeg-compression-algorithm-969af03773da
OVERALL JPEG SCHEMATIC (Transmitter side)
• JPEG stands for Joint Photographic Experts Group. JPEG compression
reduces the size of the file while keeping the loss in quality small
enough that it is hard to notice.
• Each image is divided into 8*8 pixel blocks called MCUs (Minimum
Coded Units). Thus each MCU has 64 pixels.
JPEG Compression Algorithm
• The JPEG compression algorithm has five main steps.
1. RGB color space to YCbCr color space Conversion
2. Preprocessing for DCT transformation
3. DCT Transformation
4. Co-efficient Quantization
5. Lossless Encoding
• 1. RGB color space to YCbCr color space conversion
• A digital image in RGB format, which combines Red, Green and Blue
color channels, is converted to the YCbCr color channels. Y is the
brightness (luma) of the image, Cb is the blue difference relative to
the luma (B - Y) and Cr is the red difference relative to the luma (R - Y).
• YCbCr is another color space, just like RGB.
• What is meant by color space?
• When colors need to be used in digital media like cameras and laptops,
they need to be represented as numbers, because digital media can only
understand numbers. Therefore a color space is a set of rules that allows
describing colors with numbers.
• RGB (Red-Green-Blue) is a color space that describes each color as a
mix of red, green and blue intensities.
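The RGB to YCbCr conversion can be sketched for a single pixel as follows. This is a minimal illustration, assuming 8-bit samples and the full-range coefficients commonly used in JFIF files; the function name rgb_to_ycbcr is hypothetical and not part of any library.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style weights)."""
    # Luma: weighted sum of R, G, B (green contributes most to brightness).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    # Chroma: blue and red differences from luma, offset by 128 so they fit 0..255.
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp every component to the 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128): white has neutral chroma
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128): black has neutral chroma
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255): red pushes Cr to its maximum
```

Any gray pixel (R = G = B) gives Cb = Cr = 128, which shows that all brightness information ends up in Y.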
Q: Why is color converted from RGB to YCbCr in JPEG Compression?
• Here Y is the luma or luminance component of the color, that is,
its brightness or light intensity. The human eye is more sensitive
to this component.
• Cb and Cr are the chroma components. Cb is the blue component
relative to the luma and Cr is the red component relative to the
luma. The human eye is less sensitive to these components.
• Since the eye is more sensitive to Y, Y needs to be stored
accurately. The eye is less sensitive to Cb and Cr, so they need not
be as accurate. JPEG compression uses these sensitivities of the
human eye to eliminate the unnecessary details of the image.
• The high frequency components, especially of colour, carry the finer
details of the image, which the eye can barely detect. So, these
frequencies are suppressed or eliminated during coding.
• SO THE FIRST STEP CONVERTS RGB COLOR SPACE TO YCbCr COLOR
SPACE SO THAT BRIGHTNESS AND COLOUR ARE SEPARATED AND HIGH
FREQUENCY COLOUR DETAIL CAN BE ELIMINATED.
STEP 2: PREPROCESSING FOR DCT
• First the image needs to be separated into 8*8 pixel blocks. That means
each block has 8*8 = 64 pixels. Let’s assume that the dimensions of
this image are 240*320. That means this image has 76800 pixels. If we
divide this by 64 we get the number of blocks: 76800 = 64 * 1200.
That means we have 1200 blocks in this image. Following image is the
partitioned image.
This 1 block is an MCU which has 8*8 = 64 pixels. These
pixels will have certain pixel values in grayscale.
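The partitioning and the block count can be checked with a short sketch. It assumes the image is a plain list of pixel rows and that both dimensions are exact multiples of 8 (real encoders pad the image when they are not); split_into_blocks is a hypothetical helper and the image data is synthetic.

```python
def split_into_blocks(image, size=8):
    """Split a 2-D list of pixel rows into size*size blocks, row by row."""
    height, width = len(image), len(image[0])
    # Assumption: both dimensions are exact multiples of the block size.
    assert height % size == 0 and width % size == 0
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Take `size` rows, and from each row `size` consecutive pixels.
            blocks.append([row[left:left + size] for row in image[top:top + size]])
    return blocks


# Synthetic 240*320 grayscale image (240 rows of 320 mid-gray pixels).
image = [[128] * 320 for _ in range(240)]
blocks = split_into_blocks(image)
print(len(blocks))                          # 1200
print(len(blocks[0]), len(blocks[0][0]))    # 8 8
```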
To prepare for the DCT, subtract 128 from each of these values. The pixel
values are shifted from the range [0, 255] into the range [-128, 127], with
zero as the centre, which helps us perform the next steps.
[Figure: the new (level-shifted) matrix]
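The level shift is a single subtraction per sample. A minimal sketch, assuming 8-bit samples (0 to 255), for which subtracting 128 gives exactly the range -128 to 127; level_shift is a hypothetical helper and the sample values are made up.

```python
def level_shift(block):
    """Subtract 128 from every 8-bit sample so the values are centred on zero."""
    return [[pixel - 128 for pixel in row] for row in block]


# Made-up sample values covering the extremes and the mid-point.
print(level_shift([[0, 128, 255]]))  # [[-128, 0, 127]]
```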
STEP 3: DCT TRANSFORM
• The DCT is applied to the new matrix, which holds the level-shifted
gray scale intensities. The Forward Discrete Cosine Transform or FDCT
is applied.
• Till now, the pixel values were spatial domain values, that is,
each value belongs to a position in the picture. With the FDCT, the
block is converted to the frequency domain. This helps in encoding the
picture, because the high frequency components can then be suppressed,
and our eyes are less sensitive to them. Suppressing high frequency
components does not affect the quality of the picture much, as they
carry very fine details of the image which our eyes can barely resolve.
JPEG PROCESS
• With the Discrete Cosine Transform we represent the image pixels as
a sum of cosine waves of different frequencies. The transform itself
loses nothing; it separates out the high frequencies so that they can
be eliminated in the next step, because the human eye is not sensitive
to very high frequency changes in the image. That is the graphical
explanation of this transformation. Numerically, the transformation
creates a new matrix that has its large values in the upper left corner,
while the values at other places are close to zero. Following is the
new matrix after the transformation.
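The forward transform can be sketched directly from the textbook 2-D DCT-II formula. This is a slow, unoptimised illustration (real encoders use fast factorised versions); dct_2d is a hypothetical helper and both test blocks are made-up data.

```python
import math

N = 8  # block size


def dct_2d(block):
    """Forward 2-D DCT-II of an 8*8 block, computed from the direct formula."""
    def c(k):
        # Normalisation factor: 1/sqrt(2) for the zero-frequency term, else 1.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):        # vertical frequency index
        for v in range(N):    # horizontal frequency index
            total = 0.0
            for x in range(N):        # row inside the block
                for y in range(N):    # column inside the block
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# A flat block: every pixel is 50, so only the DC term is non-zero.
flat = dct_2d([[50] * N for _ in range(N)])
print(round(flat[0][0], 1))  # 400.0, i.e. 8 times the block average

# A smooth left-to-right ramp: the energy stays in the first row of the result.
ramp = dct_2d([[8 * col - 28 for col in range(N)] for _ in range(N)])
print(abs(ramp[0][1]) > abs(ramp[0][7]))  # True
```

The flat block shows why the top-left value is called the DC value, and the ramp shows how a smooth block puts most of its energy into the low-frequency corner.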
In each pixel block, i.e. 1 MCU (Minimum Coded Unit), there are 64 DCT
values after applying the DCT. These values are divided by the
corresponding entries of a standard JPEG Quantisation Table and rounded,
forming a matrix of quantised values. When DCT values are quantised,
noise or loss is introduced.
STEP 4: COEFFICIENT QUANTIZATION
• In this step each DCT value is divided by the corresponding entry of
another matrix, called the standard JPEG quantization table, and the
result is rounded to the nearest integer. Values near 0 become 0 and
the other elements also shrink towards zero. Quantization helps to
reduce the number of bits needed to encode the image in JPEG
compression. However, quantization introduces loss or noise in
compression.
• Quantisation converts continuous values to discrete values by
division and rounding off, referring to a Q-table (Quantisation table).
[Figure: the matrix after applying quantization]
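Division and rounding against a Q-table can be sketched as follows. The table is the example luminance table from Annex K of the JPEG standard; the function name quantize and the DCT values are assumptions made for illustration. Python's round() sends exact .5 ties to the nearest even integer, which may differ from a particular encoder's rounding rule.

```python
# Example luminance quantisation table (JPEG standard, Annex K).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=Q_LUMA):
    """Divide each DCT value by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


# Made-up DCT values: large in the top-left corner, small elsewhere.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = -415.4   # DC value
coeffs[0][1] = -30.2
coeffs[1][0] = 20.0
coeffs[7][7] = 3.9      # small high-frequency value

q = quantize(coeffs)
print(q[0][0], q[0][1], q[1][0], q[7][7])  # -26 -3 2 0
```

The small high-frequency value becomes 0 because its table entry (99) is much larger than the value itself; this is where the loss is introduced.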
• LOSSLESS run length encoding is applied to the ZIGZAG ORDERED
quantised DCT values. The zigzag ordering groups the zeros together,
which helps run length encoding.
JPEG STEPS:
1. The RGB picture is converted to YCbCr space.
2. The picture is pre-processed by dividing it
into 8*8 pixel Minimum Coded Units.
3. Each of the 64 pixels in an MCU is converted
from the spatial domain to the frequency domain
by applying the Discrete Cosine Transform (DCT)
• The DCT values of a block indicate how strongly each frequency is
present. These values are divided by the corresponding standard JPEG
quantisation table values and rounded to generate a matrix of 64
quantised values. Quantization introduces loss or noise. However, our
visual senses often overlook the loss, and this property is used for
compression. Those frequencies (especially the high frequencies of
colour) to which our eye is not very sensitive are not transmitted. The
quality of the picture is not affected much. Quantisation makes it
possible to encode the signal with fewer bits, thus increasing the CR
(compression ratio).
• The quantized DCT values are stored in a zig-zag
manner. The top left corner of the 64 valued matrix
has some values, but most of the high frequency DCT
values (middle to bottom right) are zero. Reading the
quantised DCT values in zig-zag order places these
zeros in long runs, so that run length encoding can be
applied to them. So, run length encoding (LOSSLESS
ENCODING) is applied, which makes the encoding
efficient. The image now is JPEG Compressed.
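Zig-zag ordering followed by run length encoding can be sketched as follows. This is a simplified illustration: it stores (zeros skipped, value) pairs and an "EOB" end-of-block marker, whereas a real JPEG encoder also codes the DC value as a difference from the previous block and Huffman-codes the resulting symbols. All names and the sample block are hypothetical.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs of an n*n block in zig-zag order."""
    # Walk the anti-diagonals (row + col constant); odd diagonals run
    # top-to-bottom, even diagonals run bottom-to-top.
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def run_length_encode(seq):
    """Encode a sequence as (zeros skipped, non-zero value) pairs plus 'EOB'."""
    pairs, zeros = [], 0
    for value in seq:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    pairs.append("EOB")  # everything after the last pair is zero
    return pairs


# Made-up quantised block: a few values in the top-left corner, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = -26, -3, 2, 1

order = zigzag_indices()
sequence = [block[i][j] for i, j in order]
print(order[:6])                    # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(run_length_encode(sequence))  # [(0, -26), (0, -3), (0, 2), (1, 1), 'EOB']
```

The 64 values shrink to four pairs and a marker because the zig-zag scan pushes all 60 zeros into one run at the end.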
• The top leftmost DCT value is called the DC value of the block. It is
proportional to the average value of the block, so it shows the
fundamental colour; in these notes it is also called the AC00 value.
This is the lowest frequency. All other DCT values in the matrix are AC
values like AC01, AC02, …. AC07 or AC10, AC20, ……AC70, which form a
matrix of DCT AC values. AC00 is the lowest frequency (fundamental
colour), AC77 is the highest frequency. (Refer the picture). After
quantisation, the significant DCT values sit in the top leftmost corner
of the matrix and the other values are zero or near zero.
So RUN LENGTH ENCODING CAN BE APPLIED FOR A GOOD
COMPRESSION RATIO (CR)
• Now the DCT AC values are stored in a zig-zag pattern. The top
leftmost part of the matrix holds the lower frequencies, with some
values, but the higher frequencies have zero or near zero DCT AC
values. These long runs of zeros can be coded very compactly by applying
Run length encoding/Huffman encoding, which are lossless encodings.
• Explain JPEG Compression with neat diagram
• Why RGB is converted to YCbCr space in JPEG Compression?
• What is an MCU? Minimum Coded Unit. It is an 8*8 pixel block. During
JPEG compression, a picture is divided into MCUs when pre-processing
the image.
• How many MCU blocks are there in a picture of 480*640 pixels?
• Ans: (480*640)/(8*8) = 4800 blocks
• Why are the JPEG DCT (Discrete Cosine) quantised values ordered in a
Zig-Zag manner? To enable Run Length Encoding
• What is the DCT AC00 value of an 8*8 pixel block? Each of the 64
pixel blocks is transformed to frequency domain values called DCT
values. AC00 is the DCT DC value, which is the fundamental colour
(lowest frequency) component.
• Q: What does the AC77 value indicate?
• AC77 is the highest frequency component, usually with a very small
DCT value.
• Q: We have applied lossless RUN LENGTH ENCODING or HUFFMAN
ENCODING to compress the DCT Quantised values in JPEG. Why then
is JPEG compression lossy?
• The loss is introduced earlier, when the DCT coefficients are
quantized: rounding discards information that cannot be recovered.
This loss remains when we apply lossless encoding at the last stage.
So, JPEG is lossy.
• At the decoding end, the Inverse Discrete Cosine Transform (IDCT) is
applied to get back the image, which carries the loss of the JPEG
compression
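The point that the loss comes from quantization, and not from the transform, can be checked with a small round trip. This is a sketch under simplifying assumptions: it uses a single uniform step size (STEP, hypothetical) in place of a full Q-table, a made-up level-shifted block, and slow direct formulas for the DCT and IDCT.

```python
import math

N = 8  # block size


def _c(k):
    # Normalisation factor: 1/sqrt(2) for the zero-frequency term, else 1.
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _basis(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT-II of an 8*8 block (direct formula)."""
    return [[0.25 * _c(u) * _c(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT of an 8*8 coefficient block (direct formula)."""
    return [[0.25 * sum(_c(u) * _c(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def max_error(a, b):
    """Largest absolute difference between two blocks."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Made-up level-shifted block with values between -32 and 31.
block = [[(7 * x + 13 * y) % 64 - 32 for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

# Without quantization the inverse transform recovers the block.
exact = idct_2d(coeffs)

# With quantization (divide, round, multiply back) information is lost.
STEP = 16  # hypothetical uniform step size, standing in for a Q-table
dequantised = [[round(c / STEP) * STEP for c in row] for row in coeffs]
lossy = idct_2d(dequantised)

print(max_error(block, exact) < 1e-9)  # True: DCT followed by IDCT is lossless
print(max_error(block, lossy) > 0)     # True: quantization introduced the loss
```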
What are the compression standards and CRs
• Q: What are the 4 modes of JPEG compression that JPEG should
compulsorily support?
• OR
• What 4 modes of operation must JPEG compulsorily support?
JPEG REQUIREMENTS
• JPEG implementation should be independent of image size.
• JPEG implementation should be applicable to any image & pixel
aspect ratio.
• Color representation should be independent of any special
implementation.
• Image content may be of any complexity.
• JPEG standard specifications should be state-of-the-art w.r.t.
compression factor & achieved image quality.
• Processing complexity must permit a software solution that runs on
as many available standard processors as possible.
• Sequential line by line decoding & progressive decoding MUST be
POSSIBLE. A lossless hierarchical coding of the same image MUST BE
SUPPORTED
Q:JPEG is used to compress: Text, audio, video, image. Ans: Image
Q: In JPEG, during preprocessing of image, The MCUs consist of
• 8*8 pixels, 4*4 pixels, 256*256 pixels, none of the above
• Ans: 8*8 pixels.
• A picture/image of 320*240 pixels has how many blocks or MCUs?
• Ans: (320*240)/(8*8) = 1200
• The no. of DCT components of each 8*8 block in JPEG is: 32, 64, 128:
Ans: 64
• AC00 value is called: (i) DCT DC value, (ii) fundamental colour
component, (iii) none of the above, (iv) both (i) & (ii) Ans: Both (i)
and (ii)
• The quantized DCT components are arranged in zig zag fashion for
• (i) Efficient Run length encoding (ii) for quantisation (iii) for scaling
• Ans: (i) Efficient run length encoding.
• Our eyes are more sensitive to luminance/intensity values than
______
• Chrominance, hue, saturation
• Ans: Chrominance
• In JPEG, the color image is converted from RGB space to
• (i) YMCK space (ii) YUV space (iii) YCbCr space
• Ans: (iii) YCbCr space