A Lecture Series on
DATA COMPRESSION
Lossy Compression | Transforms
Abdou Youssef (Speaker)
Anastase Nakassis (MDVTG Manager)
Motivation for Transforms
Why transform the data?
- To decorrelate the data so that fast scalar (rather than slow vector) quantization can be used
- To better exploit the characteristics of the human visual system (HVS) by separating the data into vision-sensitive parts and vision-insensitive parts
- To compact most of the "energy" into a few coefficients, so that most of the coefficients can be discarded, thus achieving compression
Desirable Transforms
Desirable properties of transforms
- Data decorrelation, exploitation of the HVS, and energy compaction
- Data-independence (same transform for all data)
- Speed
- Separability (for fast transforms of multidimensional data)
Various transforms achieve those properties to various extents
- Fourier Transform
- Discrete Cosine Transform (DCT)
- Other Fourier-like transforms: Haar, Walsh, Hadamard
- Wavelet transforms
The Karhunen-Loève (KL) Transform
- Optimal w.r.t. data decorrelation and energy compaction
- But it is data-dependent
- And slow, because the transform matrix has to be computed every time
- Therefore, KL is only of theoretical interest to data compression
Different Perspectives of Transforms
Statistical perspective
Frequency perspective
Vector space perspective
End-use perspective (matrix formulation)
Matrix Formulation of Transforms
Simply stated, a transform is a matrix multiplication of the input signal by the transform matrix
Each of the standard transforms mentioned earlier is defined by an $N \times N$ square non-singular matrix $A_N$
Transform of a 1D discrete input signal (a column vector $x$ of $N$ components) is the computation of
$$y = A_N\, x$$
Transform of an $N \times M$ image $I$ is the transform of each column followed by the transform of each row. In matrix form, the transform of image $I$ is the computation of
$$J = A_N\, I\, A_M^t$$
The inverse transform is simply $x = A_N^{-1}\, y$ for 1D signals, and $I = A_N^{-1}\, J\, (A_M^{-1})^t$ for images
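To make the matrix formulation concrete, here is a minimal numpy sketch (an illustration added to these notes, not part of the original slides; the function names are arbitrary):

    import numpy as np

    def transform_1d(A, x):
        # y = A x : forward transform of a length-N signal x
        return A @ x

    def transform_2d(A_N, A_M, I):
        # J = A_N I A_M^t : transform each column, then each row
        return A_N @ I @ A_M.T

    def inverse_2d(A_N, A_M, J):
        # I = A_N^{-1} J (A_M^{-1})^t
        return np.linalg.inv(A_N) @ J @ np.linalg.inv(A_M).T

For the orthogonal transforms defined later, np.linalg.inv(A) can simply be replaced by A.T.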
Transform-Based Lossy Compression
Compression of an image $I$:
1. Transform $I$, yielding $J = A_N\, I\, A_M^t$
2. Scalar-quantize $J$, yielding $J'$
3. Losslessly compress $J'$, yielding a bit stream $B$
Image reconstruction:
1. Losslessly decompress $B$ back to $J'$
2. Dequantize $J'$, yielding an approximation $\hat{J}$ of $J$
3. Inverse-transform $\hat{J}$, yielding a reconstructed image $\hat{I} = A_N^{-1}\, \hat{J}\, (A_M^{-1})^t$
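A minimal sketch of this pipeline, assuming an orthogonal transform matrix A and a uniform scalar quantizer with step size q (both are illustrative choices not prescribed by the slides; the lossless coding step is only indicated):

    import numpy as np

    def compress(I, A, q):
        J  = A @ I @ A.T                    # 1. transform
        Jq = np.round(J / q).astype(int)    # 2. scalar-quantize (uniform, step q)
        return Jq                           # 3. losslessly compress Jq -> bit stream B (omitted)

    def reconstruct(Jq, A, q):
        Jhat = Jq * q                       # 2. dequantize (approximation of J)
        return A.T @ Jhat @ A               # 3. inverse transform (A orthogonal: A^{-1} = A^t)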
Definition of the Matrices of the Standard Transforms
Except for the case of the KL transform, the characterizing matrix $A_N$ of each of the standard transforms is independent of the input; that is, $A_N$ is the same for all 1D signals and images.
In the following, the matrix $A_N$ of each transform will be defined for arbitrary $N$, and then $A_2$, $A_4$ and $A_8$ will be shown.
The Matrix of the Fourier Transform
The matrix $A_N = (a_{kl})$ for the Fourier Transform:
$$a_{kl} = \sqrt{\frac{1}{N}}\; e^{-\frac{2\pi i}{N}kl}, \qquad k, l = 0, 1, \ldots, N-1$$
Remark: $A_N^{-1} = \bar{A}_N = \bar{A}_N^{\,t}$
$$A_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}
\qquad
A_4 = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & -i & -1 & i\\ 1 & -1 & 1 & -1\\ 1 & i & -1 & -i \end{pmatrix}$$
Let $a = \frac{\sqrt{2}}{2}(1+i)$ and $\bar{a} = \frac{\sqrt{2}}{2}(1-i)$
$$A_8 = \frac{1}{\sqrt{8}}\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & \bar{a} & -i & -a & -1 & -\bar{a} & i & a\\
1 & -i & -1 & i & 1 & -i & -1 & i\\
1 & -a & i & \bar{a} & -1 & a & -i & -\bar{a}\\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1\\
1 & -\bar{a} & -i & a & -1 & \bar{a} & i & -a\\
1 & i & -1 & -i & 1 & i & -1 & -i\\
1 & a & i & -\bar{a} & -1 & -a & -i & \bar{a}
\end{pmatrix}$$
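As a quick sanity check of this definition (a sketch added here, not part of the slides), one can build $A_N$ numerically and verify the remark that it is symmetric and that its inverse is its complex conjugate:

    import numpy as np

    def fourier_matrix(N):
        k = np.arange(N).reshape(-1, 1)
        l = np.arange(N).reshape(1, -1)
        return np.sqrt(1.0 / N) * np.exp(-2j * np.pi * k * l / N)

    A4 = fourier_matrix(4)
    assert np.allclose(A4, A4.T)                      # symmetric
    assert np.allclose(A4 @ np.conj(A4), np.eye(4))   # A^{-1} = conj(A)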
The Matrix of
the Discrete Cosine Transform (DCT)
The matrix $A_N = (a_{kl})$ for the DCT:
$$a_{0l} = \sqrt{\frac{1}{N}}, \qquad l = 0, 1, \ldots, N-1$$
$$a_{kl} = \sqrt{\frac{2}{N}}\,\cos\frac{\left(l+\frac{1}{2}\right)k\pi}{N}, \qquad 1 \le k \le N-1,\quad 0 \le l \le N-1$$
Remark: $A_N^{-1} = A_N^t$
$$A_4 = \frac{1}{\sqrt{2}}\begin{pmatrix}
\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[4pt]
\sqrt{\frac{1+\frac{\sqrt{2}}{2}}{2}} & \sqrt{\frac{1-\frac{\sqrt{2}}{2}}{2}} & -\sqrt{\frac{1-\frac{\sqrt{2}}{2}}{2}} & -\sqrt{\frac{1+\frac{\sqrt{2}}{2}}{2}}\\[4pt]
\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\[4pt]
\sqrt{\frac{1-\frac{\sqrt{2}}{2}}{2}} & -\sqrt{\frac{1+\frac{\sqrt{2}}{2}}{2}} & \sqrt{\frac{1+\frac{\sqrt{2}}{2}}{2}} & -\sqrt{\frac{1-\frac{\sqrt{2}}{2}}{2}}
\end{pmatrix}$$
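A corresponding numerical check for the DCT matrix (a sketch; the function name dct_matrix is arbitrary):

    import numpy as np

    def dct_matrix(N):
        A = np.zeros((N, N))
        A[0, :] = np.sqrt(1.0 / N)
        for k in range(1, N):
            for l in range(N):
                A[k, l] = np.sqrt(2.0 / N) * np.cos((l + 0.5) * k * np.pi / N)
        return A

    A8 = dct_matrix(8)
    assert np.allclose(A8 @ A8.T, np.eye(8))   # A^{-1} = A^t (orthogonal)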
The Matrix of the Hadamard Transform
The matrix $A_N = (a_{kl})$ for the Hadamard Transform is defined recursively:
- $A_1 = (1)$
- $A_N = \frac{1}{\sqrt{2}}\begin{pmatrix} A_{N/2} & A_{N/2}\\ A_{N/2} & -A_{N/2} \end{pmatrix}$
Remark: $A_N^{-1} = A_N^t = A_N$
$$A_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}
\qquad
A_4 = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1\\ 1 & 1 & -1 & -1\\ 1 & -1 & -1 & 1 \end{pmatrix}$$
$$A_8 = \frac{1}{\sqrt{8}}\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1\\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1\\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1\\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1\\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1\\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1\\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{pmatrix}$$
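The recursion translates directly into code; a small sketch (an added illustration, not from the slides):

    import numpy as np

    def hadamard_matrix(N):
        # N must be a power of 2
        if N == 1:
            return np.array([[1.0]])
        H = hadamard_matrix(N // 2)
        return np.sqrt(0.5) * np.block([[H, H], [H, -H]])

    A8 = hadamard_matrix(8)
    assert np.allclose(A8, A8.T)             # symmetric
    assert np.allclose(A8 @ A8, np.eye(8))   # A^{-1} = A^t = A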
The Matrix of the Walsh Transform
The matrix $A_N$ of the Walsh Transform is derived from the Hadamard matrix by permuting the rows of the latter in a certain way
Remark: $A_N^{-1} = A_N^t = A_N$
$$A_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}
\qquad
A_4 = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & 1 & -1 & -1\\ 1 & -1 & 1 & -1\\ 1 & -1 & -1 & 1 \end{pmatrix}$$
$$A_8 = \frac{1}{\sqrt{8}}\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1\\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1\\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1\\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1\\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1\\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1\\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{pmatrix}$$
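The slides do not spell out the row permutation; the 8×8 matrix above is consistent with taking the Hadamard rows in bit-reversed index order, so here is a sketch under that assumption:

    import numpy as np
    from scipy.linalg import hadamard      # Sylvester-ordered +/-1 Hadamard matrix

    def walsh_matrix(N):
        H = hadamard(N) / np.sqrt(N)       # normalized, natural (Hadamard) row order
        nbits = N.bit_length() - 1
        perm = [int(format(k, f'0{nbits}b')[::-1], 2) for k in range(N)]   # bit-reversal
        return H[perm, :]

    A8 = walsh_matrix(8)
    assert np.allclose(A8, A8.T)
    assert np.allclose(A8 @ A8, np.eye(8))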
The Matrix of the Haar Transform
The matrix $A_N = (a_{kl})$ for the Haar Transform, where $N = 2^n$:
- $a_{0l} = \frac{1}{\sqrt{N}}$ for $l = 0, 1, \ldots, N-1$
- for $k \ge 1$, write $k = 2^p + q$ with $0 \le q \le 2^p - 1$ and $0 \le p \le n-1$:
$$a_{kl} = \frac{1}{\sqrt{N}}\begin{cases} 2^{\frac{p}{2}} & \text{if } q\,2^{n-p} \le l < \left(q+\tfrac{1}{2}\right)2^{n-p}\\[2pt] -2^{\frac{p}{2}} & \text{if } \left(q+\tfrac{1}{2}\right)2^{n-p} \le l < (q+1)\,2^{n-p}\\[2pt] 0 & \text{otherwise} \end{cases}$$
Remark: $A_N^{-1} = A_N^t$
$$A_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}
\qquad
A_4 = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & 1 & -1 & -1\\ \sqrt{2} & -\sqrt{2} & 0 & 0\\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{pmatrix}$$
$$A_8 = \frac{1}{\sqrt{8}}\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1\\
\sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2}\\
2 & -2 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 2 & -2 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 2 & -2 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 2 & -2
\end{pmatrix}$$
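The closed-form definition above can be checked numerically; a sketch (illustrative only, with N = 8):

    import numpy as np

    def haar_matrix(N):
        n = int(np.log2(N))                 # N = 2^n
        A = np.zeros((N, N))
        A[0, :] = 1.0 / np.sqrt(N)
        for k in range(1, N):
            p = int(np.log2(k))             # k = 2^p + q
            q = k - 2**p
            width = 2**(n - p)              # support length of row k
            A[k, q*width : q*width + width//2]     =  2**(p / 2) / np.sqrt(N)
            A[k, q*width + width//2 : (q+1)*width] = -2**(p / 2) / np.sqrt(N)
        return A

    A8 = haar_matrix(8)
    assert np.allclose(A8 @ A8.T, np.eye(8))   # A^{-1} = A^t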
Vector Space Perspective
Analog signals are treated as an infinite-dimensional functional vector space
Finite discrete signals are treated as finite-dimensional vector spaces
In either case, the vector space has a basis $\{e_k \mid k = 0, 1, \ldots\}$
A transform of a signal $x$ is a linear decomposition of $x$ along the basis $\{e_k\}$:
- $x = \sum_k y_k e_k$, where the $y_k$ are real/complex numbers
- Transform: $x \longrightarrow (y_k)_k$
- $(y_k)_k$ is a representation of $x$
Compression-related desirable properties of a vector-space basis:
- Correspondence with the human visual system
- Specifically, only a very small number of basis vectors are relevant to (i.e., visible by) the HVS, while the majority of the basis vectors are invisible to the HVS
- Uncorrelated decomposition coefficients $(y_k)_k$
Relationship between the Vector Basis
and the Matrix Formulation of Transforms
Consider finite 1D discrete signals of $N$ components
- They form an $N$-dimensional vector space $R^N$
- Any basis consists of $N$ linearly independent column vectors $e_0, e_1, \ldots, e_{N-1}$
- For any signal $x = (x_0\ x_1\ \ldots\ x_{N-1})^t$, $\;x = \sum_{k=0}^{N-1} y_k e_k$
- That is, $\begin{pmatrix} x_0\\ x_1\\ \vdots\\ x_{N-1} \end{pmatrix} = (e_0\ e_1\ \ldots\ e_{N-1}) \begin{pmatrix} y_0\\ y_1\\ \vdots\\ y_{N-1} \end{pmatrix}$
- Equivalently, $y = Ax$, where the columns of $A^{-1}$ are the basis column vectors $e_0, e_1, \ldots, e_{N-1}$
Consider now $N \times M$ images
- They form an $NM$-dimensional vector space $R^{N \times M}$
- Any basis consists of $NM$ matrices $\{E_{kl}\}$ of dimensions $N \times M$
- Following the same analysis as above, a transform $I \longrightarrow J = A_N I A_M^t$ corresponds to the basis
$$E_{kl} = (\text{column } k \text{ of } A_N^{-1}) \cdot (\text{column } l \text{ of } A_M^{-1})^t = e_k\, e_l^t$$
for $k = 0, 1, \ldots, N-1$ and $l = 0, 1, \ldots, M-1$
- $I = \sum_{k,l} J_{kl}\, E_{kl}$
Remark: For analog signals, the vector space is infinite-dimensional, and its basis is the infinite set of sine and cosine waves, to be addressed later
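A small numerical check of this correspondence (a sketch; the random orthogonal matrix and the 4×4 size are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    A, _ = np.linalg.qr(rng.standard_normal((4, 4)))       # any orthogonal A: A^{-1} = A^t
    I = rng.standard_normal((4, 4))                        # a toy 4x4 "image"

    J = A @ I @ A.T                                        # forward transform
    B = np.linalg.inv(A)                                   # columns of A^{-1} = basis vectors e_k
    I_rebuilt = sum(J[k, l] * np.outer(B[:, k], B[:, l])
                    for k in range(4) for l in range(4))   # I = sum_{k,l} J_kl e_k e_l^t
    assert np.allclose(I, I_rebuilt)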
Visualization of DCT Basis Images
Fourier Transform
Consider a function $x(t)$ that is either
- of finite support $[0, T]$, or
- periodic of period $T$
Assume $x(t)$ to be square-integrable over $[0, T]$
The Fourier series of $x(t)$ is:
$$\sum_{k=0}^{\infty} a_k \cos\frac{2\pi}{T}kt + \sum_{k=0}^{\infty} b_k \sin\frac{2\pi}{T}kt, \quad\text{or}$$
in (more elegant) complex form:
$$\sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}, \quad\text{where}\quad y_k = \frac{1}{T}\int_0^T x(t)\, e^{-\frac{2\pi i}{T}kt}\, dt$$
Fourier Transform: $x(t) \longrightarrow (y_k)_k$
Theorem: For all natural signals,
- $\frac{x(t^+) + x(t^-)}{2} = \sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}$
- If $x$ is continuous at $t$, then $x(t) = \sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}$
Therefore, $x(t)$ is largely representable by $(y_k)_k$
Theorem: $y_k \longrightarrow 0$ as $|k| \longrightarrow \infty$
The representation $x(t) = \sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}$ is periodic of period $T$. Thus, even if $x(t)$ is defined over $[0, T]$ only, the Fourier series "periodizes" $x(t)$
Connection with The Human Visual System
$e^{\frac{2\pi i}{T}kt}$ is periodic of period $\frac{T}{k}$; thus, its frequency is $\frac{k}{T}$
The higher $k$, the higher the frequency
In $x(t) = \sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}$, $y_k$ is the $k$-th frequency content of $x(t)$
Experiments have shown that
- suppressing $y_k$ (along with $y_{-k}$) for any high frequency $k$ causes HARDLY VISIBLE or NO VISIBLE change to $x(t)$
- suppressing $y_k$ (along with $y_{-k}$) for some low frequency $k$ causes VISIBLE changes to $x(t)$
Thus, the HVS is sensitive to low-frequency data but insensitive to high-frequency data
Connection to Compression
("First Cut")
Facts:
- $x(t) = \sum_{k=-\infty}^{\infty} y_k\, e^{\frac{2\pi i}{T}kt}$
- $y_k \longrightarrow 0$ as $|k| \longrightarrow \infty$
$\hat{x}(t) = \sum_{k=-r}^{r} y_k\, e^{\frac{2\pi i}{T}kt}$ is a good mathematical and visual approximation of $x(t)$
The faster the decay of $y_k$, the smaller $r$ can be
Thus, $(y_k)_{|k| \le r}$ is a very small representation of $x$ ($\hat{x}$ to be precise), leading to high compression
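A small numerical illustration of this truncation (a sketch; the test signal, sample grid, and cutoff r are arbitrary choices made for this example):

    import numpy as np

    T, r = 1.0, 8
    t = np.linspace(0.0, T, 512, endpoint=False)
    x = np.exp(-5 * t) + 0.3 * np.sin(7 * np.pi * t)       # arbitrary test signal on [0, T]

    def coeff(k):
        # y_k = (1/T) * integral_0^T x(t) exp(-2*pi*i*k*t/T) dt, approximated by a Riemann sum
        return np.mean(x * np.exp(-2j * np.pi * k * t / T))

    xhat = sum(coeff(k) * np.exp(2j * np.pi * k * t / T) for k in range(-r, r + 1))
    print("max truncation error:", np.max(np.abs(x - xhat.real)))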
Treatment of Discrete Signals
(Discrete Fourier Transform)
Sample $N$ values $(x_l)$ of $x(t)$ at the $N$ points $l\frac{T}{N}$, that is, $x_l = x\!\left(l\frac{T}{N}\right)$ for $l = 0, 1, \ldots, N-1$
$$x_l = x\!\left(l\frac{T}{N}\right) = \sum_k y_k\, e^{\frac{2\pi i}{T}k\,l\frac{T}{N}} = \sum_k y_k\, e^{\frac{2\pi i}{N}kl}$$
Since $(x_l)$ is discrete and finite, there is no need to keep an infinity of $y_k$'s; rather, $y_0, y_1, \ldots, y_{N-1}$ are sufficient. That is,
$$x_l = \sum_{k=0}^{N-1} y_k\, e^{\frac{2\pi i}{N}kl}, \qquad l = 0, 1, \ldots, N-1$$
$$y_k = \frac{1}{N}\sum_{l=0}^{N-1} x_l\, e^{-\frac{2\pi i}{N}kl}, \qquad k = 0, 1, \ldots, N-1$$
DFT: $(x_l)_l \longrightarrow (y_k)_k$
Put in matrix form: $y = A_N\, x$, where $A_N = \frac{1}{N}\left(e^{-\frac{2\pi i}{N}kl}\right)_{kl}$
Again, for large $k$, $y_k$ can be suppressed or broadly quantized
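A short numpy illustration of suppressing the high-frequency $y_k$ (a sketch with an arbitrary toy signal; note that np.fft.fft computes the unnormalized sum, so it is divided by N to match the definition above):

    import numpy as np

    N = 64
    l = np.arange(N)
    x = np.cos(2 * np.pi * 3 * l / N) + 0.05 * np.cos(2 * np.pi * 25 * l / N)   # toy signal

    y = np.fft.fft(x) / N               # y_k = (1/N) sum_l x_l exp(-2*pi*i*k*l/N)
    keep = 8                            # keep only the lowest-frequency coefficients
    y[keep : N - keep] = 0              # indices near N/2 correspond to the high frequencies
    x_rec = N * np.fft.ifft(y)          # x_l = sum_k y_k exp(+2*pi*i*k*l/N)
    print("max error:", np.max(np.abs(x - x_rec.real)))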
Why Use DCT rather than DFT
(Boundary Problems of the Fourier Transforms)
Discontinuities at the boundaries cause large high-frequency content
Eliminating that frequency content causes boundary artifacts (known as the Gibbs phenomenon, ringing, echoing, etc.)
Consider the function
$$x(t) = \begin{cases} \dfrac{t}{T} & \text{if } 0 \le t \le \dfrac{T}{2}\\[6pt] (2\alpha - 1)\dfrac{t}{T} - \alpha + 1 & \text{if } \dfrac{T}{2} \le t \le T\\[6pt] 0 & \text{otherwise} \end{cases}$$
Its Fourier series is:
$$x(t) = \frac{\alpha + 1}{4} + \sum_{k \neq 0} \left[ (1-\alpha)\frac{(-1)^k - 1}{2(\pi k)^2} + \frac{i\alpha}{2\pi k} \right] e^{\frac{2\pi i}{T}kt}$$
Special case $\alpha = 0$:
$$x(t) = \frac{1}{4} + \sum_{k \neq 0} \frac{(-1)^k - 1}{2(\pi k)^2}\, e^{\frac{2\pi i}{T}kt}$$
Special case $\alpha = 1$:
$$x(t) = \frac{1}{2} + \sum_{k \neq 0} \frac{i}{2\pi k}\, e^{\frac{2\pi i}{T}kt}$$
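A quick numerical look at the decay of the two coefficient sequences (a sketch added here): the $\alpha = 0$ coefficients fall off like $1/k^2$ (continuous periodization), while the $\alpha = 1$ coefficients fall off only like $1/k$ (boundary discontinuity).

    import numpy as np

    ks = np.arange(1, 9)
    y_alpha0 = ((-1.0)**ks - 1) / (2 * (np.pi * ks)**2)   # continuous periodization: ~ 1/k^2
    y_alpha1 = 1j / (2 * np.pi * ks)                      # boundary discontinuity:   ~ 1/k
    print(np.abs(y_alpha0))
    print(np.abs(y_alpha1))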
Relation of DCT to FFT
Let $(x_l)$ be an original signal, and $(y_k)$ its DCT transform, $l, k = 0, 1, \ldots, N-1$
Shuffle $x$ to become almost symmetric; that is, create a new signal $(x'_l)$ by taking the even-indexed terms followed by the reverse of the odd-indexed terms:
- $x'_l = x_{2l}$ and $x'_{N-l-1} = x_{2l+1}$, for $0 \le l \le N/2 - 1$
$y' = \mathrm{DFT}(x')$ (the unnormalized DFT, $y'_k = \sum_l x'_l\, e^{-\frac{2\pi i}{N}kl}$)
$$y_0 = \sqrt{\frac{1}{N}}\,\mathrm{Re}(y'_0), \qquad y_k = \sqrt{\frac{2}{N}}\,\mathrm{Re}\!\left(e^{-\frac{\pi i k}{2N}}\, y'_k\right), \quad k = 1, 2, \ldots, N-1$$
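A sketch of this computation, checked against scipy's orthonormal DCT-II (the scipy comparison is an addition for verification, not part of the slides):

    import numpy as np
    from scipy.fft import dct                           # reference orthonormal DCT-II

    x = np.random.default_rng(1).standard_normal(16)
    N = len(x)

    xp = np.concatenate([x[0::2], x[1::2][::-1]])       # even-indexed terms, then reversed odds
    yp = np.fft.fft(xp)                                 # y' = DFT(x'), unnormalized
    k = np.arange(N)
    y = np.sqrt(2.0 / N) * np.real(np.exp(-1j * np.pi * k / (2 * N)) * yp)
    y[0] = np.sqrt(1.0 / N) * np.real(yp[0])

    assert np.allclose(y, dct(x, type=2, norm='ortho'))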
Quantization in DCT-based Compression
DCT vs. KL
For most natural signals, the KL basis and the DCT basis
are almost identical
Therefore, DCT is near optimal (in decorrelation, energy compaction, and mean-squared-error distortion) because KL is optimal
Unlike KL, DCT is not signal-dependent
Hence the popularity of DCT
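The near-coincidence of the two bases can be seen numerically; a sketch, assuming a first-order autoregressive (AR(1)) covariance model with correlation 0.95 as a stand-in for "natural signals" (this model choice is an assumption, not from the slides):

    import numpy as np

    N, rho = 8, 0.95
    idx = np.arange(N)
    R = rho ** np.abs(idx[:, None] - idx[None, :])      # AR(1) covariance matrix
    _, V = np.linalg.eigh(R)
    kl = V[:, ::-1]                                     # KL basis, by decreasing eigenvalue

    A = np.zeros((N, N))                                # orthonormal DCT-II matrix
    A[0, :] = np.sqrt(1.0 / N)
    for k in range(1, N):
        A[k, :] = np.sqrt(2.0 / N) * np.cos((idx + 0.5) * k * np.pi / N)

    for j in range(N):
        print(j, abs(kl[:, j] @ A[j, :]))               # values near 1: bases nearly identical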