Wavelets and Subband Coding
Martin Vetterli
University of California at Berkeley
Jelena Kovačević
AT&T Bell Laboratories
For my parents.
To Marie-Laure.
— MV
To Giovanni.
To my little star, and to Mom and Dad.
— JK
Preface
A central goal of signal processing is to describe real life signals, be it for com-
putation, compression, or understanding. In that context, transforms or linear ex-
pansions have always played a key role. Linear expansions are present in Fourier’s
original work and in Haar’s construction of the first wavelet, as well as in Gabor’s
work on time-frequency analysis. Today, transforms are central in fast algorithms
such as the FFT as well as in applications such as image and video compression.
Over the years, depending on open problems or specific applications, theoreti-
cians and practitioners have added more and more tools to the toolbox called signal
processing. Two of the newest additions have been wavelets and their discrete-
time cousins, filter banks or subband coding. From work in harmonic analysis and
mathematical physics, and from applications such as speech/image compression
and computer vision, various disciplines built up methods and tools with a similar
flavor, which can now be cast into the common framework of wavelets.
This unified view, as well as the number of applications where this framework
is useful, are motivations for writing this book. The unification has given a new
understanding and a fresh view of some classic signal processing problems. Another
motivation is that the subject is exciting and the results are cute!
The aim of the book is to present this unified view of wavelets and subband
coding. It will be done from a signal processing perspective, but with sufficient
background material such that people without signal processing knowledge will
find it useful as well. The level is that of a first year graduate engineering book
(typically electrical engineering and computer sciences), but elementary Fourier
analysis and some knowledge of linear systems in discrete time are enough to follow
most of the book.
After the introduction (Chapter 1) and a review of the basics of vector spaces,
linear algebra, Fourier theory and signal processing (Chapter 2), the book covers
the five main topics in as many chapters. The discrete-time case, or filter banks,
is thoroughly developed in Chapter 3. This is the basis for most applications, as
well as for some of the wavelet constructions. The concept of wavelets is developed
in Chapter 4, both with direct approaches and based on filter banks. This chapter
describes wavelet series and their computation, as well as the construction of mod-
ified local Fourier transforms. Chapter 5 discusses continuous wavelet and local
Fourier transforms, which are used in signal analysis, while Chapter 6 addresses
efficient algorithms for filter banks and wavelet computations. Finally, Chapter 7
describes signal compression, where filter banks and wavelets play an important
role. Speech/audio, image and video compression using transforms, quantization
and entropy coding are discussed in detail. Throughout the book we give examples
to illustrate the concepts, and more technical parts are left to appendices.
This book evolved from class notes used at Columbia University and the Uni-
versity of California at Berkeley. Parts of the manuscript have also been used at the
University of Illinois at Urbana-Champaign and the University of Southern Cali-
fornia. The material was covered in a semester, but it would also be easy to carve
out a subset or skip some of the more mathematical subparts when developing a
curriculum. For example, Chapters 3, 4 and 7 can form a good core for a course in
Wavelets and Subband Coding. Homework problems are included in all chapters,
complemented with project suggestions in Chapter 7. Since there is a detailed re-
view chapter that makes the material as self-contained as possible, we think that
the book is useful for self-study as well.
The subjects covered in this book have recently been the focus of books, special
issues of journals, special conference proceedings, numerous articles and even new
journals! To us, the book by I. Daubechies [73] has been invaluable, and Chapters 4
and 5 have been substantially influenced by it. Like the standard book by Meyer
[194] and a recent book by Chui [49], it is a more mathematically oriented book
than the present text. Another, more recent, tutorial book by Meyer gives an
excellent overview of the history of the subject, its mathematical implications and
current applications [195]. On the engineering side, the book by Vaidyanathan
[308] is an excellent reference on filter banks, as is Malvar’s book [188] for lapped
orthogonal transforms and compression. Several other texts, including edited books,
have appeared on wavelets [27, 51, 251], as well as on subband coding [335] and
multiresolution signal decompositions [3]. Recent tutorials on wavelets can be found in the signal processing literature.
ACKNOWLEDGEMENTS
Some of the work described in this book resulted from research supported by the
National Science Foundation, whose support is gratefully acknowledged. We would
also like to thank Columbia University, in particular the Center for Telecommu-
nications Research, the University of California at Berkeley and AT&T Bell Lab-
oratories for providing support and a pleasant work environment. We take this
opportunity to thank A. Oppenheim for his support and for including this book in
his distinguished series. We thank K. Gettman and S. Papanikolau of Prentice-Hall
for their patience and help, and K. Fortgang of bookworks for her expert help in
the production stage of the book.
To us, one of the attractions of the topic of Wavelets and Subband Coding is
its interdisciplinary nature. This allowed us to interact with people from many
different disciplines, and this was an enrichment in itself. The present book is the
result of this interaction and the help of many people.
Our gratitude goes to I. Daubechies, whose work and help have been invaluable, to
C. Herley, whose research, collaboration and help have directly influenced this book,
and to O. Rioul, who first taught us about wavelets and has always been helpful.
We would like to thank M.J.T. Smith and P.P. Vaidyanathan for a continuing
and fruitful interaction on the topic of filter banks, and S. Mallat for his insights
and interaction on the topic of wavelets.
Over the years, discussions and interactions with many experts have contributed
to our understanding of the various fields relevant to this book, and we would
like to acknowledge in particular the contributions of E. Adelson, T. Barnwell,
P. Burt, A. Cohen, R. Coifman, R. Crochiere, P. Duhamel, C. Galand, W. Lawton,
D. LeGall, Y. Meyer, T. Ramstad, G. Strang, M. Unser and V. Wickerhauser.
Many people have commented on several versions of the present text. We thank
I. Daubechies, P. Heller, M. Unser, P.P. Vaidyanathan, and G. Wornell for go-
ing through a complete draft and making many helpful suggestions. Comments
on parts of the manuscript were provided by C. Chan, G. Chang, Z. Cvetković,
V. Goyal, C. Herley, T. Kalker, M. Khansari, M. Kobayashi, H. Malvar, P. Moulin,
A. Ortega, A. Park, J. Princen, K. Ramchandran, J. Shapiro and G. Strang, and
are acknowledged with many thanks.
1 Wavelets, Filter Banks and Multiresolution Signal Processing

The topic of this book is very old and very new. Fourier series, or expansion of
periodic functions in terms of harmonic sines and cosines, date back to the early
part of the 19th century when Fourier proposed harmonic trigonometric series [100].
The first wavelet (the only example for a long time!) was found by Haar early in
this century [126]. But the construction of more general wavelets to form bases
for square-integrable functions was investigated in the 1980’s, along with efficient
algorithms to compute the expansion. At the same time, applications of these
techniques in signal processing have blossomed.
While linear expansions of functions are a classic subject, the recent construc-
tions contain interesting new features. For example, wavelets allow good resolution
in time and frequency, and should thus allow one to see “the forest and the trees.”
This feature is important for nonstationary signal analysis. While Fourier basis
functions are given in closed form, many wavelets can only be obtained through a
computational procedure (and even then, only at specific rational points). While
this might seem to be a drawback, it turns out that if one is interested in imple-
menting a signal expansion on real data, then a computational procedure is better
than a closed-form expression!
The recent surge of interest in the types of expansions discussed here is due
to the convergence of ideas from several different fields, and the recognition that
techniques developed independently in these fields could be cast into a common
framework.
The name “wavelet” had been used before in the literature,1 but its current
meaning is due to J. Goupillaud, J. Morlet and A. Grossman [119, 125]. In the
context of geophysical signal processing they investigated an alternative to local
Fourier analysis based on a single prototype function, and its scales and shifts.
The modulation by complex exponentials in the Fourier transform is replaced by a
scaling operation, and the notion of scale2 replaces that of frequency. The simplicity
and elegance of the wavelet scheme was appealing and mathematicians started
studying wavelet analysis as an alternative to Fourier analysis. This led to the
discovery of wavelets which form orthonormal bases for square-integrable and other
function spaces by Meyer [194], Daubechies [71], Battle [21, 22], Lemarié [175],
and others. A formalization of such constructions by Mallat [180] and Meyer [194]
created a framework for wavelet expansions called multiresolution analysis, and
established links with methods used in other fields. Also, the wavelet construction
by Daubechies is closely connected to filter bank methods used in digital signal
processing as we shall see.
Of course, these achievements were preceded by a long-term evolution from the
1910 Haar wavelet (which, of course, was not called a wavelet back then) to work
using octave division of the Fourier spectrum (Littlewood-Paley) and results in
harmonic analysis (Calderon-Zygmund operators). Other constructions were not
recognized as leading to wavelets initially (for example, Stromberg’s work [283]).
Paralleling the advances in pure and applied mathematics were those in signal
processing, but in the context of discrete-time signals. Driven by applications such
as speech and image compression, a method called subband coding was proposed by
Croisier, Esteban, and Galand [69] using a special class of filters called quadrature
mirror filters (QMF) in the late 1970’s, and by Crochiere, Webber and Flanagan
[68]. This led to the study of perfect reconstruction filter banks, a problem solved
in the 1980’s by several people, including Smith and Barnwell [270, 271], Mintzer
[196], Vetterli [315], and Vaidyanathan [306].
In a particular configuration, namely when the filter bank has octave bands,
one obtains a discrete-time wavelet series. Such a configuration has been popular
in signal processing less for its mathematical properties than because an octave
band or logarithmic spectrum is more natural for certain applications such as audio
compression since it emulates the hearing process.

1 For example, for the impulse response of a layer in geophysical signal processing by Ricker [237] and for a causal finite-energy function by Robinson [248].
2 For a beautiful illustration of the notion of scale, and an argument for geometric spacing of scale in natural imagery, see [197].

Such an octave-band filter bank
can be used, under certain conditions, to generate wavelet bases, as shown by
Daubechies [71].
In computer vision, multiresolution techniques have been used for various prob-
lems, ranging from motion estimation to object recognition [249]. Images are suc-
cessively approximated starting from a coarse version and going to a fine-resolution
version. In particular, Burt and Adelson proposed such a scheme for image coding
in the early 1980’s [41], calling it pyramid coding.3 This method turns out to be
similar to subband coding. Moreover, the successive approximation view is similar
to the multiresolution framework used in the analysis of wavelet schemes.
In computer graphics, a method called successive refinement iteratively inter-
polates curves or surfaces, and the study of such interpolators is related to wavelet
constructions from filter banks [45, 92].
Finally, many computational procedures use the concept of successive approxi-
mation, sometimes alternating between fine and coarse resolutions. The multigrid
methods used for the solution of partial differential equations [39] are an example.
While these interconnections are now clarified, this has not always been the
case. In fact, maybe one of the biggest contributions of wavelets has been to bring
people from different fields together, and from that cross fertilization and exchange
of ideas and methods, progress has been achieved in various fields.
In what follows, we will take mostly a signal processing point of view of the
subject. Also, most applications discussed later are from signal processing.
1.1 SERIES EXPANSIONS OF SIGNALS

The set {ϕ_i} is complete for the space S if all signals x ∈ S can be expanded as in
(1.1.1). In that case, there will also exist a dual set {ϕ̃i }i∈Z such that the expansion
coefficients in (1.1.1) can be computed as
$$\alpha_i = \sum_n \tilde{\varphi}_i[n]\, x[n],$$
3 The importance of the pyramid algorithm was not immediately recognized. One of the reviewers of the original Burt and Adelson paper said, “I suspect that no one will ever use this algorithm again.”
Figure 1.1 Examples of possible sets of vectors for the expansion of R². (a)
Orthonormal case. (b) Biorthogonal case. (c) Overcomplete case.
when they are real continuous-time functions. The above expressions are the inner
products of the ϕ̃_i's with the signal x, denoted by ⟨ϕ̃_i, x⟩. An important particular
case is when the set {ϕ_i} is orthonormal and complete, since then we have an
orthonormal basis for S and the basis and its dual are the same, that is, ϕ_i = ϕ̃_i.
Then

$$\langle \varphi_i, \varphi_j\rangle = \delta[i - j],$$

where δ[i] equals 1 if i = 0, and 0 otherwise. If the set is complete and the vectors
ϕ_i are linearly independent but not orthonormal, then we have a biorthogonal basis,
and the basis and its dual satisfy

$$\langle \varphi_i, \tilde{\varphi}_j\rangle = \delta[i - j].$$
If the set is complete but redundant (the ϕ_i's are not linearly independent), then we
do not have a basis but an overcomplete representation called a frame. To illustrate
these concepts, consider the overcomplete set of Figure 1.1(c): three unit vectors in R²
spaced 120° apart. For this set, a possible reconstruction basis is identical (up to a scale factor),
namely, ϕ̃_i = (2/3) ϕ_i (the reconstruction basis is not unique). This set behaves as
an orthonormal basis, even though the vectors are linearly dependent.
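To make the overcomplete case concrete, here is a small numerical sketch (an illustrative addition, not from the original text; it assumes NumPy is available). It expands a vector of R² onto three unit vectors spaced 120° apart and reconstructs it with the dual set ϕ̃_i = (2/3)ϕ_i:

```python
import numpy as np

# Three unit vectors 120 degrees apart in R^2, as in Figure 1.1(c).
phi = np.array([[np.cos(a), np.sin(a)]
                for a in 2 * np.pi * np.arange(3) / 3])

x = np.array([0.7, -1.2])          # arbitrary signal in R^2

alpha = phi @ x                    # expansion coefficients <phi_i, x>
x_rec = (2 / 3) * alpha @ phi      # reconstruct with dual phi~_i = (2/3) phi_i

print(np.allclose(x, x_rec))       # True: behaves like an orthonormal basis
```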
4 The Fourier transform of nonperiodic signals is also possible. It is an integral transform rather than a series expansion and lacks any time locality.
Figure 1.2 Musical notation and orthonormal wavelet bases. (a) The western
musical notation uses a logarithmic frequency scale with twelve halftones per
octave. In this example, notes are chosen as in an orthonormal wavelet basis,
with long low-pitched notes, and short high-pitched ones. (b) Corresponding
time-domain functions.
A popular alternative to the STFT is the wavelet transform. Using scales and
shifts of a prototype wavelet, a linear expansion of a signal is obtained. Because the
scales used are powers of an elementary scale factor (typically 2), the analysis uses
a constant relative bandwidth (or, the frequency axis is logarithmic). The sampling
of the time-frequency plane is now very different from the rectangular grid used in
the STFT. Lower frequencies, where the bandwidth is narrow (that is, the basis
functions are stretched in time) are sampled with a large time step, while high
frequencies (which correspond to short basis functions) are sampled more often. In
Figure 1.2, we give an intuitive illustration of this time-frequency trade-off, and
relate it to musical notation which also uses a logarithmic frequency scale.5 What
is particularly interesting is that such a wavelet scheme allows good orthonormal
bases whereas the STFT does not.
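As an illustrative sketch of scales and shifts (an addition, not from the original text; the Haar wavelet used here is introduced formally in later chapters), the following code generates dyadic scalings and shiftings of a prototype wavelet, with long low-frequency atoms and short high-frequency ones, all of unit energy:

```python
import numpy as np

def haar(t):
    """Prototype Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    return np.where((0 <= t) & (t < 0.5), 1.0,
                    np.where((0.5 <= t) & (t < 1.0), -1.0, 0.0))

def wavelet(t, m, n):
    """psi_{m,n}(t) = 2^{-m/2} psi(2^{-m} t - n), of unit L2 norm.
    Shifts are proportional to the scale 2^m, so low frequencies
    (large m) are sampled with a large time step."""
    return 2 ** (-m / 2) * haar(2 ** (-m) * t - n)

t = np.linspace(0, 8, 4096)
atom_long = wavelet(t, m=3, n=0)    # stretched, low-pitched "note"
atom_short = wavelet(t, m=0, n=5)   # short, high-pitched "note"

dt = t[1] - t[0]
print(np.sum(atom_long ** 2) * dt)  # ~1: unit energy at every scale
```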
In the discussions above, we implicitly assumed continuous-time signals. Of
course there are discrete-time equivalents to all these results. A local analysis
can be achieved using a block transform, where the sequence is segmented into
adjacent blocks of N samples, and each block is individually transformed. As is to be
expected, such a scheme is plagued by boundary effects, also called blocking effects.
A more general expansion relies on filter banks, and can achieve both STFT-like
analysis (rectangular sampling of the time-frequency plane) or wavelet-like analysis
(constant relative bandwidth in frequency). Discrete-time expansions based on
filter banks are not arbitrary; rather, they are structured expansions.
5 This is the standard western musical notation based on J.S. Bach’s “Well-Tempered Clavier”. Thus one could argue that wavelets were actually invented by J.S. Bach!
Figure 1.3 Time-frequency tilings for a simple discrete-time signal [130]. (a)
Sine wave plus impulse. (b) Expansion onto the identity basis. (c) Discrete-
time Fourier series. (d) Local discrete-time Fourier series. (e) Discrete-time
wavelet series.
Note that the local Fourier transform and the wavelet transform can be used
for signal analysis purposes. In that case, the goal is not to obtain orthonormal
bases, but rather to characterize the signal from the transform. The local Fourier
transform retains many of the characteristics of the usual Fourier transform with a
localization given by the window function, which is thus constant at all frequencies
(this phenomenon can be seen already in Figure 1.3(d)). The wavelet, on the
other hand, acts as a microscope, focusing on smaller time phenomena as the
scale becomes small (see Figure 1.3(e) to see how the impulse gets better localized
at high frequencies). This behavior permits a local characterization of functions,
which the Fourier transform does not.8

8 For example, in [137], this mathematical microscope is used to analyze a famous lacunary Fourier series proposed over a century ago.

The above example is just one among many schemes where multiresolution de-
compositions are useful in communications problems. Others include transmission
over error-prone channels, where the coarse resolution can be better protected to
guarantee some minimum level of quality.

[Figure 1.4: A multiresolution (MR) encoder and decoder. The encoder forms a coarse version of the input x by decimation (D), interpolates (I) it back, and transmits the coarse version together with the residual (the difference between x and the interpolation); the decoder interpolates the coarse version and adds the residual to recover x.]
Multiresolution decompositions are also important for computer vision tasks
such as image segmentation or object recognition: the task is performed in a suc-
cessive approximation manner, starting on the coarse version and then using this
result as an initial guess for the full task. However, this is a greedy approach which
is sometimes suboptimal. Figure 1.5 shows a famous counter-example, where a
multiresolution approach would be seriously misleading . . .
Interestingly, the multiresolution concept, besides being intuitive and useful in
practice, forms the basis of a mathematical framework for wavelets [181, 194]. As
in the pyramid example shown in Figure 1.4, one can decompose a function into a
coarse version plus a residual, and then iterate this to infinity. If properly done,
this can be used to analyze wavelet schemes and derive wavelet bases.
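A minimal sketch of this coarse-plus-residual idea (an addition for illustration, not the book's scheme; a simple pairwise-averaging decimator and piecewise-constant interpolator stand in for the operators D and I of Figure 1.4):

```python
import numpy as np

def coarse(x):
    """Coarse version: average pairs of samples (a simple choice of D)."""
    return 0.5 * (x[0::2] + x[1::2])

def interpolate(c):
    """Interpolation I: piecewise-constant upsampling to full length."""
    return np.repeat(c, 2)

x = np.random.randn(64)
c = coarse(x)                  # coarse approximation (half the samples)
r = x - interpolate(c)         # residual, as in the MR encoder

# MR decoder: interpolate the coarse version and add the residual.
x_rec = interpolate(c) + r
print(np.allclose(x, x_rec))   # True: perfect reconstruction by construction

# Iterating the same decomposition on c yields a multiresolution pyramid.
```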
Chapter 3 starts with two simple expansions which will reappear throughout the book as a recurring theme: the Haar
and the sinc bases. They are limit cases of orthonormal expansions with good time
localization (Haar) and good frequency localization (sinc). This naturally leads to
an in-depth study of two-channel filter banks, including analytical tools for their
analysis as well as design methods. The construction of orthonormal and linear
phase filter banks is described. Multichannel filter banks are developed next, first
through tree structures and then in the general case. Modulated filter banks, cor-
responding conceptually to a discrete-time local Fourier analysis, are addressed as
well. Next, pyramid schemes and overcomplete representations are explored. Such
schemes, while not critically sampled, have some other attractive features, such
as time invariance. Then, the multidimensional case is discussed both for simple
separable systems, as well as for general nonseparable ones. The latter systems
involve lattice sampling which is detailed in an appendix. Finally, filter banks for
telecommunications, namely transmultiplexers and adaptive subband filtering, are
presented briefly. The appendix details factorizations of orthonormal filter banks
(corresponding to paraunitary matrices).
Chapter 4 is devoted to the construction of bases for continuous-time signals,
in particular wavelets and local cosine bases. Again, the Haar and sinc cases play
illustrative roles as extremes of wavelet constructions. After an introduction to
series expansions, we develop multiresolution analysis as a framework for wavelet
constructions. This naturally leads to the classic wavelets of Meyer and Battle-
Lemarié or Stromberg. These are based on Fourier-domain analysis. This is followed
by Daubechies’ construction of wavelets from iterated filter banks. This is a time-
domain construction based on the iteration of a multirate filter. Study of the
iteration leads to the notion of regularity of the discrete-time filter. Then, the
wavelet series expansion is considered both in terms of properties and computation
of the expansion coefficients. Some generalizations of wavelet constructions are
considered next, first in one dimension (including biorthogonal and multichannel
wavelets) and then in multiple dimensions, where nonseparable wavelets are shown.
Finally, local cosine bases are derived and they can be seen as a real-valued local
Fourier transform.
Chapter 5 is concerned with continuous wavelet and Fourier transforms. Unlike
the series expansions in Chapters 3 and 4, these are very redundant representa-
tions useful for signal analysis. Both transforms are analyzed, inverses are derived,
and their main properties are given. These transforms can be sampled, that is,
scale/frequency and time shift can be discretized. This leads to redundant series
representations called frames. In particular, reconstruction or inversion is discussed,
and the case of wavelet and local Fourier frames is considered in some detail.
Chapter 6 treats algorithmic and computational aspects of series expansions.
First, a review of classic fast algorithms for signal processing is given since they
form the ingredients used in subsequent algorithms. The key role of the fast Fourier
transform (FFT) is pointed out. The complexity of computing filter banks, that is,
discrete-time expansions, is studied in detail. Important cases include the discrete-
time wavelet series or transform and modulated filter banks. The latter corresponds
to a local discrete-time Fourier series or transform, and uses FFT’s for efficient com-
putation. These filter bank algorithms have direct applications in the computation
of wavelet series. Overcomplete expansions are considered next, in particular for
the computation of a sampled continuous wavelet transform. The chapter concludes
with a discussion of special topics related to efficient convolution algorithms and
also application of wavelet ideas to numerical algorithms.
The last chapter is devoted to one of the main applications of wavelets and
filter banks in signal processing, namely signal compression. The technique is often
called subband coding because signals are considered in spectral bands for com-
pression purposes. First comes a review of transform based compression, including
quantization and entropy coding. Then follow specific discussions of one-, two- and
three-dimensional signal compression methods based on transforms. Speech and
audio compression, where subband coding was first invented, is discussed. The
success of subband coding in current audio coding algorithms is shown on spe-
cific examples such as the MUSICAM standard. A thorough discussion of image
compression follows. While current standards such as JPEG are block transform
based, some innovative subband or wavelet schemes are very promising and are
described in detail. Video compression is considered next. Besides expansions,
motion estimation/compensation methods play a key role and are discussed. The
multiresolution feature inherent in pyramid and subband coding is pointed out as
an attractive feature for video compression, just as it is for image coding. The final
section discusses the interaction of source coding, particularly the multiresolution
type, and channel coding or transmission. This joint source-channel coding is key
to new applications of image and video compression, as in transmission over packet
networks. An appendix gives a brief review of statistical signal processing which
underlies coding methods.
2 Fundamentals of Signal Decompositions
2.1 NOTATIONS
Let C, R, Z and N denote the sets of complex, real, integer and natural numbers,
respectively. Then, C^n and R^n will be the sets of all n-tuples (x_1, . . . , x_n) of
complex and real numbers, respectively.
The superscript ∗ denotes complex conjugation, or, (a + jb)∗ = (a − jb), where
the symbol j is used for the square root of −1 and a, b ∈ R. The subscript ∗ is used
to denote complex conjugation of the constants but not the complex variable, for
example, (az)∗ = a∗ z where z is a complex variable. The superscript T denotes the
transposition of a vector or a matrix, while the superscript ∗ on a vector or matrix
denotes hermitian transpose, or transposition and complex conjugation. Re(z) and
Im(z) denote the real and imaginary parts of the complex number z.
We define the Nth root of unity as W_N = e^{−j2π/N}. It satisfies the following:

$$W_N^N = 1, \qquad (2.1.1)$$

$$W_N^{kN+i} = W_N^i, \quad k, i \in \mathbb{Z}, \qquad (2.1.2)$$

$$\sum_{k=0}^{N-1} W_N^{kn} = \begin{cases} N & n = lN,\ l \in \mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1.3)$$
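A quick numerical check of the orthogonality relation (2.1.3) (an illustrative addition, assuming NumPy):

```python
import numpy as np

N = 8
W = np.exp(-2j * np.pi / N)          # W_N as defined above

# Sum_{k=0}^{N-1} W_N^{kn} equals N when n is a multiple of N, else 0.
for n in range(-N, 2 * N):
    s = sum(W ** (k * n) for k in range(N))
    expected = N if n % N == 0 else 0
    assert abs(s - expected) < 1e-9
print("(2.1.3) verified for a range of n")
```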
A sequence x[n] obtained by sampling a continuous-time function f(t) with sampling period T will be written

$$x[n] = f(nT), \quad n \in \mathbb{Z},\ T \in \mathbb{R}.$$
In particular, δ(t) and δ[n] denote continuous-time and discrete-time Dirac func-
tions, which are very different indeed. The former is a generalized function (see
Section 2.4.4) while the latter is the sequence which is 1 for n = 0 and 0 otherwise
(the Dirac functions are also called delta or impulse functions).
2.2 HILBERT SPACES

Given a set of vectors {v_k} in R^n or C^n, the standard questions of linear algebra are the following:

(a) Does the set {v_k} span the space R^n or C^n, that is, can every vector in R^n or
C^n be written as a linear combination of vectors from {v_k}?
(b) Are the vectors linearly independent, that is, is it true that no vector from
{vk } can be written as a linear combination of the others?
(c) How can we find bases for the space to be spanned, in particular, orthonormal
bases?
(b) The orthogonality of a vector with respect to another vector (or set of vectors),
for example,

$$\langle x, y\rangle = 0,$$

with an appropriately defined scalar product,

$$\langle x, y\rangle = \sum_{i=1}^{n} x_i y_i.$$
So far, we relied on the fact that the spaces were finite-dimensional. Now, the idea
is to generalize our familiar notion of a vector space to infinite dimensions. It is
necessary to restrict the vectors to have finite length or norm (even though they
are infinite-dimensional). This leads naturally to Hilbert spaces. For example, the
space of square-summable sequences, denoted by l_2(Z), is the vector space “C^∞”
with a norm constraint. An example of a set of vectors spanning l2 (Z) is the set
{δ[n − k]}, k ∈ Z. A further extension with respect to linear algebra is that vectors
can be generalized from n-tuples of real or complex values to include functions of
a continuous variable. The notions of norm and orthogonality can be extended to
functions using a suitable inner product between functions, which are thus viewed
as vectors. A classic example of such orthogonal vectors is the set of harmonic sine
and cosine functions, sin(nt) and cos(nt), n = 0, 1, . . ., on the interval [−π, π].

1 Unless otherwise specified, we will assume a squared norm.
The classic questions from linear algebra apply here as well. In particular, the
question of completeness, that is, whether the span of the set of vectors {vk } covers
the whole space, becomes more involved than in the finite-dimensional case. The
norm plays a central role, since any vector in the space must be expressed by a
linear combination of vk ’s such that the norm of the difference between the vector
and the linear combination of vk ’s is zero. For l2 (Z), {δ[n − k]}, k ∈ Z, constitute
a complete set which is actually an orthonormal basis. For the space of square-
integrable functions over the interval [−π, π], denoted by L2 ([−π, π]), the harmonic
sines and cosines are complete since they form the basis used in the Fourier series
expansion.
If only a subset of the complete set of vectors {vk } is used, one is interested in
the best approximation of a general element of the space by an element from the
subspace spanned by the vectors in the subset. This question has a particularly
easy answer when the set {vk } is orthonormal and the goal is least-squares approx-
imation (that is, the norm of the difference is minimized). Because the geometry
of Hilbert spaces is similar to Euclidean geometry, the solution is the orthogonal
projection onto the approximation subspace, since this minimizes the distance or
approximation error.
In the following, we formally introduce vector spaces and in particular Hilbert
spaces. We discuss orthogonal and general bases and their properties. We often use
the finite-dimensional case for intuition and examples. The treatment is not very
detailed, but sufficient for the remainder of the book. For a thorough treatment,
we refer the reader to [113].
(a) Commutativity: x + y = y + x.
(e) Additive inverse: for all x in E, there exists a (−x) in E, such that
x + (−x) = 0.
An example is the infinite set {δ[n − k]}_{k∈Z}. Since they are linearly independent, the space is infinite-
dimensional.
Next, we equip the vector space with an inner product that is a complex function
fundamental for defining norms and orthogonality.
DEFINITION 2.2

An inner product on a vector space E over C (or R) is a complex-valued
function ⟨·, ·⟩, defined on E × E, with the following properties:

(a) ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩.

(b) ⟨x, αy⟩ = α⟨x, y⟩.

(c) ⟨x, y⟩* = ⟨y, x⟩.

(d) ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 if and only if x = 0.
Note that (b) and (c) imply ⟨ax, y⟩ = a*⟨x, y⟩. From (a) and (b), it is clear
that the inner product is linear. Note that we choose the definition of the inner
product which takes the complex conjugate of the first vector (this follows from (b)).
For illustration, the standard inner products for complex-valued functions over R
and sequences over Z are

$$\langle f, g\rangle = \int_{-\infty}^{\infty} f^*(t)\, g(t)\, dt,$$

and

$$\langle x, y\rangle = \sum_{n=-\infty}^{\infty} x^*[n]\, y[n],$$
respectively (if they exist). The norm of a vector is defined from the inner product
as

$$\|x\| = \langle x, x\rangle^{1/2}.$$

Finally, the inner product can be used to define orthogonality of two vectors x and
y, that is, vectors x and y are orthogonal if and only if

$$\langle x, y\rangle = 0.$$

If two vectors are orthogonal, which is denoted by x ⊥ y, then they satisfy the
Pythagorean theorem,

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2,$$
D EFINITION 2.3
A complete inner product space is called a Hilbert space.
We are particularly interested in those Hilbert spaces which are separable because a
Hilbert space contains a countable orthonormal basis if and only if it is separable.
Since all Hilbert spaces with which we are going to deal are separable, we implicitly
assume that this property is satisfied (refer to [113] for details on separability).
Note that a closed subspace of a separable Hilbert space is separable, that is, it also
contains a countable orthonormal basis.
Given a Hilbert space E and a subspace S, we call the orthogonal complement
of S in E, denoted S ⊥ , the set {x ∈ E | x ⊥ S}. Assume further that S is closed,
that is, it contains all limits of sequences of vectors in S. Then, given a vector y in
E, there exists a unique v in S and a unique w in S ⊥ such that y = v + w. We can
thus write
E = S ⊕ S⊥,
or, E is the direct sum of the subspace and its orthogonal complement.
Let us consider a few examples of Hilbert spaces.
The above holds for the real space R^n as well (note that then y_i* = y_i).

Thus, l_2(Z) is the space of all sequences such that ‖x‖ < ∞. This is obviously an
infinite-dimensional space, and a possible orthonormal basis is {δ[n − k]}k∈Z .
For the completeness of l2 (Z), one has to show that if xn [k] is a sequence of
vectors in l_2(Z) such that ‖x_n − x_m‖ → 0 as n, m → ∞ (that is, a Cauchy sequence),
then there exists a limit x in l_2(Z) such that ‖x_n − x‖ → 0. The proof can be found,
for example, in [113].
y_k = (x_k − v_k)/‖x_k − v_k‖, where

$$v_k = \sum_{i=1}^{k-1} \langle y_i, x_k\rangle\, y_i.$$
As will be seen shortly, the vector vk is the orthogonal projection of xk onto the
subspace spanned by the previous orthogonalized vectors and this is subtracted
from xk , followed by normalization.
A standard example of such an orthogonalization procedure is the Legendre
polynomials over the interval [−1, 1]. Start with xk (t) = tk , k = 0, 1, . . . and apply
the Gram-Schmidt procedure to get yk (t), of degree k, norm 1 and orthogonal to
yi (t), i < k (see Problem 2.1).
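The following sketch (an illustrative addition, assuming NumPy) runs the Gram-Schmidt procedure numerically on the monomials over [−1, 1], recovering the normalized Legendre polynomials as in Problem 2.1:

```python
import numpy as np

# Numerical inner product <f, g> = integral of f(t) g(t) over [-1, 1].
t = np.linspace(-1, 1, 20001)
dt = t[1] - t[0]

def inner(f, g):
    return np.sum(f * g) * dt

basis = []
for k in range(4):
    x = t ** k
    v = sum(inner(y, x) * y for y in basis)   # projection onto previous y_i
    y = x - v                                 # subtract, then normalize
    basis.append(y / np.sqrt(inner(y, y)))

# The k = 2 element is the normalized Legendre polynomial
# sqrt(5/8) * (3 t^2 - 1), up to numerical integration error.
p2 = np.sqrt(5 / 8) * (3 * t ** 2 - 1)
print(np.allclose(basis[2], p2, atol=1e-3))
```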
The coefficients α_k of the expansion are called the Fourier coefficients of y (with
respect to {x_i}) and are given by

$$\alpha_k = \langle x_k, y\rangle.$$

This can be shown by using the continuity of the inner product (that is, if x_n → x
and y_n → y, then ⟨x_n, y_n⟩ → ⟨x, y⟩) as well as the orthogonality of the x_k's. Given
y = Σ_i α_i x_i,

$$\langle x_k, y\rangle = \lim_{n\to\infty} \Big\langle x_k, \sum_{i=0}^{n} \alpha_i x_i \Big\rangle = \alpha_k.$$
[Figures 2.1 and 2.2 (diagrams): the orthogonal projection ŷ of a vector y onto the subspace spanned by x₁, x₂ minimizes the distance d = ‖y − ŷ‖; (a) with an orthonormal basis, ŷ = ⟨x₁, y⟩x₁ + ⟨x₂, y⟩x₂, while (b) with a biorthogonal basis the expansion uses the dual vectors x̃₁, x̃₂.]
The minimum of ‖y − x‖ over x in S is attained for x = Σ_i α_i x_i with

$$\alpha_i = \langle x_i, y\rangle,$$
that is, the Fourier coefficients. An immediate consequence of this result is the
successive approximation property of orthogonal expansions. Call ŷ^{(k)} the best
approximation of y on the subspace spanned by {x₁, x₂, . . . , x_k} and given by the
coefficients {α₁, α₂, . . . , α_k} where α_i = ⟨x_i, y⟩. Then, the approximation ŷ^{(k+1)} is
given by

$$\hat{y}^{(k+1)} = \hat{y}^{(k)} + \langle x_{k+1}, y\rangle\, x_{k+1},$$
that is, the previous approximation plus the projection along the added vector xk+1 .
While this is obvious, it is worth pointing out that this successive approximation
property does not hold for nonorthogonal bases. When calculating the approxima-
tion ŷ (k+1) , one cannot simply add one term to the previous approximation, but has
to recalculate the whole approximation (see Figure 2.2). For a further discussion
of projection operators, see Appendix 2.A.
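A small numerical illustration of the successive approximation property (an addition, not from the original text; it assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(8)

# An orthonormal set in R^8 (rows of Q^T from a QR decomposition).
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
x = Q.T                                  # rows x[k] are orthonormal

alpha = x @ y                            # Fourier coefficients <x_k, y>

y2 = alpha[:2] @ x[:2]                   # y_hat^(2): best approx. on 2 vectors
y3 = y2 + alpha[2] * x[2]                # y_hat^(3): just add one more term

# Adding a term does not change earlier coefficients (orthonormal case only).
print(np.allclose(x[:2] @ y3, alpha[:2]))   # True
```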
(b) There exist strictly positive constants A, B, Ã, B̃ such that, for all y in E,

$$A\, \|y\|^2 \le \sum_k |\langle x_k, y\rangle|^2 \le B\, \|y\|^2, \qquad (2.2.8)$$

$$\tilde{A}\, \|y\|^2 \le \sum_k |\langle \tilde{x}_k, y\rangle|^2 \le \tilde{B}\, \|y\|^2. \qquad (2.2.9)$$
Compare these inequalities with (2.2.5) in the orthonormal case. Bases which satisfy
(2.2.8) or (2.2.9) are called Riesz bases [73]. Then, the signal expansion formula
becomes
$$y = \sum_k \langle x_k, y\rangle\, \tilde{x}_k = \sum_k \langle \tilde{x}_k, y\rangle\, x_k. \qquad (2.2.10)$$
It is clear why the term biorthogonal is used, since to the (nonorthogonal) basis
{xi } corresponds a dual basis {x̃i } which satisfies the biorthogonality constraint
(2.2.7). If the basis {xi } is orthogonal, then it is its own dual, and the expansion
formula (2.2.10) becomes the usual orthogonal expansion given by (2.2.3–2.2.4).
Equivalences similar to Theorem 2.4 hold in the biorthogonal case as well, and
we give Parseval's relations, which become

$$\|y\|^2 = \sum_i \langle x_i, y\rangle^* \langle \tilde{x}_i, y\rangle, \qquad (2.2.11)$$

and

$$\langle y_1, y_2\rangle = \sum_i \langle x_i, y_1\rangle^* \langle \tilde{x}_i, y_2\rangle \qquad (2.2.12)$$

$$\qquad\quad\; = \sum_i \langle \tilde{x}_i, y_1\rangle^* \langle x_i, y_2\rangle. \qquad (2.2.13)$$
A, B are called frame bounds, and when they are equal, we call the frame tight. In
a tight frame we have
$$\sum_k |\langle x_k, y\rangle|^2 = A\, \|y\|^2,$$

and the signal can be expanded as

$$y = \frac{1}{A} \sum_k \langle x_k, y\rangle\, x_k. \qquad (2.2.14)$$
While this last equation resembles the expansion formula in the case of an or-
thonormal basis, a frame does not constitute an orthonormal basis in general. In
particular, the vectors may be linearly dependent and thus not form a basis. If all
the vectors in a tight frame have unit norm, then the constant A gives the redun-
dancy ratio (for example, A = 2 means there are twice as many vectors as needed
to cover the space). Note that if A = B = 1, and xk = 1 for all k, then {xk }
constitutes an orthonormal basis.
Because of the linear dependence which exists among the vectors used in the
expansion, the expansion is not unique anymore. Consider the set {x₁, x₂, . . .}
where Σ_i β_i x_i = 0 (and not all β_i's are zero) because of linear dependence. If y
can be written as

$$y = \sum_i \alpha_i x_i, \qquad (2.2.15)$$
then one can add βi to each αi without changing the validity of the expansion
(2.2.15). The expansion (2.2.14) is unique in the sense that it minimizes the norm
of the expansion among all valid expansions. Similarly, for general frames, there
exists a unique dual frame which is discussed in Section 5.3.2 (in the tight frame
case, the frame and its dual are equal).
This concludes for now our brief introduction of signal expansions. Later, more
specific expansions will be discussed, such as Fourier and wavelet expansions. The
fundamental properties seen above will reappear in more specialized forms (for
example, Parseval’s equality).
While we have only discussed Hilbert spaces, there are of course many other
spaces of functions which are of interest. For example, Lp (R) spaces are those
containing functions f for which |f |p is integrable [113]. The norm on these spaces
is defined as

$$\|f\|_p = \left( \int_{-\infty}^{\infty} |f(t)|^p\, dt \right)^{1/p}, \qquad (2.2.16)$$
2.3 ELEMENTS OF LINEAR ALGEBRA

Standard reference texts exist on the subject; see [106, 280]. Good reviews can also be found
in [150] and [308]. We give only a brief account here, focusing on basic concepts
and topics which are needed later, such as polynomial matrices.
Note that the matrix product is not commutative in general, that is, AB ≠ BA.
It can be shown that (AB)^T = B^T A^T.

The inner product of two (column) vectors from R^n is ⟨v₁, v₂⟩ = v₁^T v₂, and if
the vectors are from C^n, then ⟨v₁, v₂⟩ = v₁^* v₂. The outer product of two vectors
from R^n and R^m is an n × m matrix given by v₁ v₂^T.
To define the notion of a determinant, we first need to define a minor. A minor
M ij is a submatrix of the matrix A obtained by deleting its ith row and jth column.
More generally, a minor can be any submatrix of the matrix A obtained by deleting
some of its rows and columns. Then the determinant of an n × n matrix can be
defined recursively as
$$\det(A) = \sum_{i=1}^{n} A_{ij}\, (-1)^{i+j} \det(M_{ij}),$$
where j is fixed and belongs to {1, . . . , n}. The cofactor C ij is (−1)i+j det(M ij ).
A square matrix is said to be singular if det(A) = 0. The product of two matrices
is nonsingular only if both matrices are nonsingular. Some properties of interest
include the following:
$$(A\alpha)^T (y - A\hat{x}) = 0,$$

or

$$A^T A \hat{x} = A^T y,$$

which are called the normal equations of the least-squares problem. If the columns
of A are linearly independent, then A^T A is invertible. The unique least-squares
solution is

$$\hat{x} = (A^T A)^{-1} A^T y \qquad (2.3.4)$$

(recall that A is either rectangular or rank deficient, and does not have a proper
inverse) and the orthogonal projection ŷ is equal to

$$\hat{y} = A\hat{x} = A (A^T A)^{-1} A^T y.$$
An eigenvector p of a square matrix A satisfies

$$A p = \lambda p,$$

where the scalar λ is the associated eigenvalue. A hermitian matrix A can be
diagonalized by a unitary matrix U whose columns are its eigenvectors,

$$A = U \Lambda U^*,$$

where Λ is the diagonal matrix of eigenvalues.
This result constitutes the spectral theorem for hermitian matrices. Hermitian
symmetric matrices commute with their hermitian transpose. More generally, a
matrix N that commutes with its hermitian transpose is called normal, that is, it
satisfies N ∗ N = N N ∗ . Normal matrices are exactly those that have a complete
set of orthogonal eigenvectors.
The importance of eigenvectors in the study of linear operators comes from the
following fact: Assuming a full set of eigenvectors, a vector x can be written as a
linear combination of eigenvectors, x = Σ_i α_i v_i. Then,

$$A x = A \sum_i \alpha_i v_i = \sum_i \alpha_i (A v_i) = \sum_i \alpha_i \lambda_i v_i.$$
A unitary matrix U preserves norms,

$$\|U x\| = \|x\|, \quad \forall x \in \mathbb{C}^n,$$

as well as inner products,

$$\langle U x, U y\rangle = \langle x, y\rangle, \quad \forall x, y \in \mathbb{C}^n,$$
Sometimes, the elements ti are matrices themselves, in which case the matrix is
called block Toeplitz. Another important matrix is the DFT (Discrete Fourier
Transform) matrix. The (i, k)th element of the DFT matrix of size n × n is
W_n^{ik} = e^{−j2πik/n}. The DFT matrix diagonalizes circulant matrices, that is, its
columns and rows are the eigenvectors of circulant matrices (see Section 2.4.8 and
Problem 2.18).
A real symmetric matrix A is called positive definite if all its eigenvalues are
greater than 0. Equivalently, for all nonzero vectors x, the following is satisfied:
$$x^T A x > 0.$$

Finally, for a positive definite matrix A, there exists a nonsingular matrix W such
that

$$A = W^T W,$$

where W is intuitively a “square root” of A. One possible way to choose such a
square root is to diagonalize A as A = QΛQ^T and then, since all the eigenvalues
are positive, choose W = Λ^{1/2} Q^T (the square root is applied to each eigenvalue in
the diagonal matrix Λ). The above discussion carries over to hermitian symmetric
matrices by using hermitian transposes.
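The eigendecomposition-based “square root” can be checked numerically (an illustrative addition, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B.T @ B + 4 * np.eye(4)          # symmetric positive definite

# Diagonalize A = Q Lambda Q^T and take the square root of each eigenvalue.
lam, Q = np.linalg.eigh(A)
W = np.diag(np.sqrt(lam)) @ Q.T      # one possible "square root" of A

print(np.allclose(W.T @ W, A))       # True: A = W^T W
```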
that is, it can be written either as a matrix containing polynomials as its entries,
or a polynomial having matrices as its coefficients.
The question of the rank in polynomial matrices is more subtle. For example,
the matrix

$$\begin{pmatrix} a + bx & 3(a + bx) \\ c + dx & \lambda(c + dx) \end{pmatrix},$$

with λ = 3, always has rank less than 2, since the two columns are proportional
to each other. On the other hand, if λ = 2, then the matrix would have rank
less than 2 only if x = −a/b or x = −c/d. This leads to the notion of normal rank.
First, note that H(x) is nonsingular only if det(H(x)) is different from 0 for some
x. Then, the normal rank of H(x) is the largest of the orders of minors that have
a determinant not identically zero. In the above example, for λ = 3, the normal
rank is 1, while for λ = 2, the normal rank is 2.
An important class of polynomial matrices are unimodular matrices, whose de-
terminant is not a function of x. An example is the following matrix:
$$H(x) = \begin{pmatrix} 1 + x & x \\ 2 + x & 1 + x \end{pmatrix},$$
whose determinant is equal to 1. There are several useful properties pertaining
to unimodular matrices. For example, the product of two unimodular matrices
is again unimodular. The inverse of a unimodular matrix is unimodular as well.
Also, one can prove that a polynomial matrix H(x) is unimodular, if and only if
its inverse is a polynomial matrix. All these facts can be proven using properties
of determinants (see, for example, [308]).
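These properties are easy to verify symbolically; the following sketch (an addition, using the SymPy package) checks that the example H(x) above is unimodular and that its inverse is again a polynomial matrix:

```python
import sympy as sp

x = sp.symbols('x')
H = sp.Matrix([[1 + x, x],
               [2 + x, 1 + x]])

print(sp.simplify(H.det()))   # 1: determinant independent of x (unimodular)

# The inverse of a unimodular matrix is again a polynomial matrix:
# here [[1 + x, -x], [-(2 + x), 1 + x]].
print(sp.simplify(H.inv()))
```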
The extension of the concept of unitary matrices to polynomial matrices leads
to paraunitary matrices [308] as studied in circuit theory. In fact, these matrices
are unitary on the unit circle or the imaginary axis, depending if they correspond
to discrete-time or continuous-time linear operators (z-transforms or Laplace trans-
forms). Consider the discrete-time case and x = ejω . Then, a square matrix U (x)
is unitary on the unit circle if
$$[U(e^{j\omega})]^*\, U(e^{j\omega}) = U(e^{j\omega})\, [U(e^{j\omega})]^* = I.$$

Extending this beyond the unit circle leads to

$$[U(x^{-1})]^T\, U(x) = U(x)\, [U(x^{-1})]^T = I, \qquad (2.3.7)$$

since (e^{jω})* = e^{−jω}. If the coefficients of the polynomials are complex, the coefficients
need to be conjugated in (2.3.7), which is usually written [U_*(x^{−1})]^T. This
will be studied in Chapter 3.
As a generalization of polynomial matrices, one can consider the case of rational
matrices. In that case, each entry is a ratio of two polynomials. As will be shown
in Chapter 3, polynomial matrices in z correspond to finite impulse response (FIR)
discrete-time filters, while rational matrices can be associated with infinite impulse
response (IIR) filters. Unimodular and unitary matrices can be defined in the
rational case, as in the polynomial case.
2.4 FOURIER THEORY AND SAMPLING

Depending on whether the time and frequency variables are continuous or discrete, we distinguish four types of expansions onto complex exponentials:

(a) The continuous-time Fourier transform (CTFT), often simply called the Fourier
transform.

(b) The continuous-time Fourier series (CTFS), or simply Fourier series.

(c) The discrete-time Fourier transform (DTFT).

(d) The discrete-time Fourier series (DTFS).
In all the Fourier cases, {ψ} = {ψ̃}. The above transforms and series will be
discussed in this section. Later, more general expansions will be introduced, in par-
ticular, series expansions of discrete-time signals using filter banks in Chapter 3,
series expansions of continuous-time signals using wavelets in Chapter 4, and in-
tegral expansions of continuous-time signals using wavelets and short-time Fourier
bases in Chapter 5.
The Fourier transform of a function f(t) is given by

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt, \qquad (2.4.1)$$

which is called the Fourier analysis formula. The inverse Fourier transform is given
by

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega, \qquad (2.4.2)$$
or, the Fourier synthesis formula. Note that ejωt is not in L2 (R), and that the set
{ejωt } is not countable. The exact conditions under which (2.4.2) is the inverse
of (2.4.1) depend on the behavior of f (t) and are discussed in standard texts on
Fourier theory [46, 326]. For example, the inversion is exact if f (t) is continuous
(or if f (t) is defined as (f (t+ ) + f (t− ))/2 at a point of discontinuity).6
When f (t) is square-integrable, then the formulas above hold in the L2 sense
(see Appendix 2.C), that is, calling fˆ(t) the result of the analysis followed by the
synthesis formula,
$$\|f(t) - \hat{f}(t)\| = 0.$$
Assuming that the Fourier transform and its inverse exist, we will denote by
f (t) ←→ F (ω)
6 We assume that f(t) is of bounded variation. That is, for f(t) defined on a closed interval [a, b], there exists a constant A such that $\sum_{n=1}^{N} |f(t_n) - f(t_{n-1})| < A$ for any finite set {t_i} satisfying a ≤ t₀ < t₁ < . . . < t_N ≤ b. Roughly speaking, the graph of f(t) cannot oscillate over an infinite distance as t goes over a finite interval.
Linearity Since the Fourier transform is an inner product (see (2.4.1)), it follows
immediately from the linearity of the inner product that

$$\alpha f(t) + \beta g(t) \longleftrightarrow \alpha F(\omega) + \beta G(\omega).$$

Symmetry If f(t) has Fourier transform F(ω), then

$$F(t) \longleftrightarrow 2\pi f(-\omega), \qquad (2.4.3)$$

which indicates the essential symmetry of the Fourier analysis and synthesis formulas.

Modulation A shift in frequency corresponds to modulation in time,

$$e^{j\omega_0 t} f(t) \longleftrightarrow F(\omega - \omega_0).$$

Differentiation in frequency

$$(-jt)^n f(t) \longleftrightarrow \frac{\partial^n F(\omega)}{\partial \omega^n}.$$

Convolution The convolution of f(t) and g(t) is defined as

$$h(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau, \qquad (2.4.9)$$
and is denoted h(t) = f (t) ∗ g(t) = g(t) ∗ f (t) since (2.4.9) is symmetric in f (t)
and g(t). Denoting by F (ω) and G(ω) the Fourier transforms of f (t) and g(t),
respectively, the convolution theorem states that

$$f(t) * g(t) \longleftrightarrow F(\omega)\, G(\omega), \qquad f(t)\, g(t) \longleftrightarrow \frac{1}{2\pi}\, F(\omega) * G(\omega). \qquad (2.4.10)$$

This result is fundamental, and we will prove it for f(t) and g(t) being in L¹(R).
Taking the Fourier transform of f (t) ∗ g(t),
$$\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(\tau)\, g(t-\tau)\, d\tau \right) e^{-j\omega t}\, dt,$$

changing the order of integration (which is allowed when f(t) and g(t) are in L¹(R);
see Fubini's theorem in [73, 250]) and using the shift property, we get

$$\int_{-\infty}^{\infty} f(\tau) \left( \int_{-\infty}^{\infty} g(t-\tau)\, e^{-j\omega t}\, dt \right) d\tau = \int_{-\infty}^{\infty} f(\tau)\, e^{-j\omega\tau}\, G(\omega)\, d\tau = F(\omega)\, G(\omega).$$
The result holds as well when f (t) and g(t) are square-integrable, but requires a
different proof [108].
An alternative view of the convolution theorem is to identify the complex ex-
ponentials ejωt as the eigenfunctions of the convolution operator, since
$$\int_{-\infty}^{\infty} e^{j\omega(t-\tau)}\, g(\tau)\, d\tau = e^{j\omega t} \int_{-\infty}^{\infty} e^{-j\omega\tau}\, g(\tau)\, d\tau = e^{j\omega t}\, G(\omega).$$
The associated eigenvalue G(ω) is simply the Fourier transform of the impulse
response g(τ ) at frequency ω.
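The discrete analogue of the convolution theorem is easily verified numerically (an illustrative addition, assuming NumPy; the FFT is discussed later in this chapter and in Chapter 6):

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal(32)
g = rng.standard_normal(32)

# Zero-pad so circular convolution via the FFT equals linear convolution.
L = 64
conv_direct = np.convolve(f, g)                              # length 63
conv_fft = np.fft.ifft(np.fft.fft(f, L) * np.fft.fft(g, L)).real[:63]

print(np.allclose(conv_direct, conv_fft))    # True: product of transforms
```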
Differentiation The derivative of a convolution equals the convolution with the derivative of either function, that is,

$$h'(t) = f'(t) * g(t) = f(t) * g'(t).$$
This is useful when convolving a signal with a filter which is known to be the
derivative of a given function such as a Gaussian, since one can think of the result
as being the convolution of the derivative of the signal with a Gaussian.
Parseval's Formula The Fourier transform conserves inner products up to a factor of 2π,

$$\int_{-\infty}^{\infty} f^*(t)\, g(t)\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} F^*(\omega)\, G(\omega)\, d\omega, \qquad (2.4.11)$$

and in particular, choosing g(t) = f(t),

$$\int_{-\infty}^{\infty} |f(t)|^2\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(\omega)|^2\, d\omega. \qquad (2.4.12)$$

Note that the factor 1/2π comes from our definition of the Fourier transform (2.4.1–2.4.2). A symmetric definition, with a factor 1/√(2π) in both the analysis and
synthesis formulas (see, for example, [73]), would remove the scale factor in (2.4.12).
The proof of (2.4.11) uses the fact that
$$f^*(t) \longleftrightarrow F^*(-\omega),$$

and the frequency-domain convolution relation (2.4.10). That is, since f*(t) · g(t)
has Fourier transform (1/2π)(F*(−ω) ∗ G(ω)), we have

$$\int_{-\infty}^{\infty} f^*(t)\, g(t)\, e^{-j\omega t}\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} F^*(-\Omega)\, G(\omega - \Omega)\, d\Omega,$$
Consider a periodic function f(t) with period T,

$$f(t + T) = f(t).$$

It can be expanded into a Fourier series⁷

$$f(t) = \sum_{k=-\infty}^{\infty} F[k]\, e^{jk\omega_0 t}, \qquad \omega_0 = \frac{2\pi}{T}, \qquad (2.4.13)$$

with

$$F[k] = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-jk\omega_0 t}\, dt. \qquad (2.4.14)$$

7 Again, we consider nonpathological functions (that is, of bounded variation).
That the set {ϕk } is complete is shown in [326] and means that there exists no
periodic function f (t) with L2 norm greater than zero that has all its Fourier series
coefficients equal to zero. Actually, there is equivalence between norms, as shown
below.
Best Approximation Property While the following result is true in a more gen-
eral setting (see Section 2.2.3), it is sufficiently important to be restated for Fourier
series, namely
$$\left\| f(t) - \sum_{k=-N}^{N} \langle \varphi_k, f\rangle\, \varphi_k(t) \right\| \le \left\| f(t) - \sum_{k=-N}^{N} a_k\, \varphi_k(t) \right\|,$$
where {ak } is an arbitrary set of coefficients. That is, the Fourier series coefficients
are the best ones for an approximation in the span of {ϕk (t)}, k = −N, . . . , N .
Moreover, if N is increased, new coefficients are added without affecting the previous
ones.
Fourier series, besides their obvious use for characterizing periodic signals, are
useful for problems of finite size through periodization. The immediate concern,
however, is the introduction of a discontinuity at the boundary, since periodization
of a continuous signal on an interval results, in general, in a discontinuous periodic
signal.
Fourier series can be related to the Fourier transform seen earlier by using
sequences of Dirac functions which are also used in sampling. We will turn our
attention to these functions next.
then δ(t) = lim_{ε→0} δ_ε(t). More generally, one can use any smooth function ψ(t)
with integral 1 and define [278]

$$\delta(t) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\, \psi\!\left(\frac{t}{\varepsilon}\right).$$
Any operation involving a Dirac function requires a limiting operation. Since we are
reviewing standard results, and for notational convenience, we will skip the limiting
process. However, let us emphasize that Dirac functions have to be handled with
care in order to get meaningful results. When in doubt, it is best to go back to the
definition and the limiting process. For details see, for example, [215]. It follows
from (2.4.15) that

$$\int_{-\infty}^{\infty} \delta(t)\, dt = 1, \qquad (2.4.16)$$

as well as

$$\int_{-\infty}^{\infty} f(t - t_0)\, \delta(t)\, dt = \int_{-\infty}^{\infty} f(t)\, \delta(t - t_0)\, dt = f(t_0). \qquad (2.4.17)$$
One more standard relation useful for the Dirac function is [215]

$$f(t) * \delta(t - t_0) = f(t - t_0). \qquad (2.4.18)$$

The Fourier transform of δ(t − t₀) is, from (2.4.1) and (2.4.17), equal to

$$\delta(t - t_0) \longleftrightarrow e^{-j\omega t_0}.$$

Using the symmetry property (2.4.3) and the previous results, we see that

$$e^{j\omega_0 t} \longleftrightarrow 2\pi\, \delta(\omega - \omega_0).$$

According to the above and using the modulation theorem (2.4.10),

$$e^{j\omega_0 t} f(t) \longleftrightarrow F(\omega - \omega_0). \qquad (2.4.19)$$
Next, we introduce the train of Dirac functions spaced T > 0 apart, denoted
sT (t) and given by
$$s_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT). \qquad (2.4.20)$$
Before getting its Fourier transform, we derive the Poisson sum formula. Note that,
given a function f (t) and using (2.4.18),
$$\int_{-\infty}^{\infty} f(\tau)\, s_T(t - \tau)\, d\tau = \sum_{n=-\infty}^{\infty} f(t - nT). \qquad (2.4.21)$$
Call the above T -periodic function f0 (t). Further assume that f (t) is sufficiently
smooth and decaying rapidly such that the above series converges uniformly to
f0 (t). We can then expand f0 (t) into a uniformly convergent Fourier series
$$f_0(t) = \sum_{k=-\infty}^{\infty} \left( \frac{1}{T} \int_{-T/2}^{T/2} f_0(\tau)\, e^{-j2\pi k\tau/T}\, d\tau \right) e^{j2\pi kt/T}.$$
Consider the Fourier series coefficient in the above formula, using the expression
for f0 (t) in (2.4.21)
$$\int_{-T/2}^{T/2} f_0(\tau)\, e^{-j2\pi k\tau/T}\, d\tau = \sum_{n=-\infty}^{\infty} \int_{(2n-1)T/2}^{(2n+1)T/2} f(\tau)\, e^{-j2\pi k\tau/T}\, d\tau = F\!\left(\frac{2\pi k}{T}\right).$$

Substituting this back yields the Poisson sum formula,

$$\sum_{n=-\infty}^{\infty} f(t - nT) = \frac{1}{T} \sum_{k=-\infty}^{\infty} F\!\left(\frac{2\pi k}{T}\right) e^{j2\pi kt/T}. \qquad (2.4.22)$$
One can use the Poisson formula to derive the Fourier transform of the impulse
train s_T(t) in (2.4.20). It can be shown that

$$S_T(\omega) = \frac{2\pi}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(\omega - \frac{2\pi k}{T}\right). \qquad (2.4.23)$$
We have explained that sampling the spectrum and periodizing the time-domain
function are equivalent. We will see the dual situation, when sampling the time-
domain function leads to a periodized spectrum. This is also an immediate appli-
cation of the Poisson formula.
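As a numerical sanity check of the Poisson sum formula (2.4.22) (an illustrative addition, not from the original text), one can compare both sides for a Gaussian, whose Fourier transform is known in closed form:

```python
import numpy as np

# f(t) = exp(-t^2) has Fourier transform F(w) = sqrt(pi) exp(-w^2 / 4)
# with the convention (2.4.1).
T = 1.0
t = 0.3                                  # arbitrary evaluation point
n = np.arange(-50, 51)
k = np.arange(-50, 51)

lhs = np.sum(np.exp(-(t - n * T) ** 2))  # periodized time-domain sum

F = lambda w: np.sqrt(np.pi) * np.exp(-w ** 2 / 4)
rhs = np.sum(F(2 * np.pi * k / T) *
             np.exp(2j * np.pi * k * t / T)).real / T

print(np.isclose(lhs, rhs))              # True
```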
2.4.5 Sampling
The process of sampling is central to discrete-time signal processing, since it pro-
vides the link with the continuous-time domain. Call fT (t) the sampled version of
f (t), obtained as
$$f_T(t) = f(t)\, s_T(t) = \sum_{n=-\infty}^{\infty} f(nT)\, \delta(t - nT). \qquad (2.4.24)$$
Using the modulation theorem of the Fourier transform (2.4.10) and the transform
of sT (t) given in (2.4.23), we get
$$F_T(\omega) = \frac{1}{T}\, F(\omega) * \sum_{k=-\infty}^{\infty} \delta\!\left(\omega - \frac{2\pi k}{T}\right) = \frac{1}{T} \sum_{k=-\infty}^{\infty} F\!\left(\omega - \frac{2\pi k}{T}\right), \qquad (2.4.25)$$
where we used (2.4.18). Thus, FT (ω) is periodic with period 2π/T , and is obtained
by overlapping copies of F (ω) at every multiple of 2π/T . Another way to prove
(2.4.25) is to use the Poisson formula. Taking the Fourier transform of (2.4.24)
results in
$$F_T(\omega) = \sum_{n=-\infty}^{\infty} f(nT)\, e^{-jnT\omega},$$
since fT (t) is a weighted sequence of Dirac functions with weights f (nT ) and shifts
of nT . To use the Poisson formula, consider the function gΩ (t) = f (t) e−jtΩ , which
has Fourier transform GΩ (ω) = F (ω + Ω) according to (2.4.19). Now, applying
(2.4.22) to g_Ω(t), we find

$$\sum_{n=-\infty}^{\infty} g_\Omega(nT) = \frac{1}{T} \sum_{k=-\infty}^{\infty} G_\Omega\!\left(\frac{2\pi k}{T}\right),$$

that is,

$$\sum_{n=-\infty}^{\infty} f(nT)\, e^{-jnT\Omega} = \frac{1}{T} \sum_{k=-\infty}^{\infty} F\!\left(\Omega + \frac{2\pi k}{T}\right), \qquad (2.4.26)$$

which is precisely (2.4.25).
We can now state the sampling theorem: if f(t) is bandlimited9 to ω_m, it can be recovered from its samples taken with period T = π/ω_m, as

$$f(t) = \sum_{n=-\infty}^{\infty} f(nT)\, \mathrm{sinc}_T(t - nT), \qquad (2.4.27)$$

where

$$\mathrm{sinc}_T(t) = \frac{\sin(\pi t/T)}{\pi t/T}.$$
Note that sincT (nT ) = δ[n], that is, it has the interpolation property since it is 1
at the origin but 0 at nonzero multiples of T . It follows immediately that (2.4.27)
holds at the sampling instants t = nT .
PROOF
The proof that (2.4.27) is valid for all t goes as follows: Consider the sampled version of
f (t), fT (t), consisting of weighted Dirac functions (2.4.24). We showed that its Fourier
transform is given by (2.4.25). The sampling frequency ωs equals 2ωm , where ωm is the
bandlimiting frequency of F (ω). Thus, F (ω − kωs ) and F (ω − lωs ) do not overlap for k ≠ l.
To recover F (ω), it suffices to keep the term with k = 0 in (2.4.25) and normalize it by
T . This is accomplished with a function that has a Fourier transform which is equal to T
from −ωm to ωm and 0 elsewhere. This is called an ideal lowpass filter. Its time-domain
impulse response, denoted sincT (t) where T = π/ωm , is equal to (taking the inverse Fourier
transform)
$$\mathrm{sinc}_T(t) = \frac{1}{2\pi} \int_{-\omega_m}^{\omega_m} T\, e^{j\omega t}\, d\omega = \frac{T}{2\pi j t} \left( e^{j\pi t/T} - e^{-j\pi t/T} \right) = \frac{\sin(\pi t/T)}{\pi t/T}. \qquad (2.4.28)$$
9 We will say that a function f(t) is bandlimited to ω_m if its Fourier transform F(ω) = 0 for |ω| ≥ ω_m.
Convolving f_T(t) with sinc_T(t) filters out the repeated spectra (terms with k ≠ 0 in
(2.4.25)) and recovers f(t), as is clear in the frequency domain. Because f_T(t) is a sequence
of Dirac functions of weights f (nT ), the convolution results in a weighted sum of shifted
impulse responses,
$$\left( \sum_{n=-\infty}^{\infty} f(nT)\, \delta(t - nT) \right) * \mathrm{sinc}_T(t) = \sum_{n=-\infty}^{\infty} f(nT)\, \mathrm{sinc}_T(t - nT),$$

proving (2.4.27).
Now, assume a bandlimited signal f(t) and consider the inner product ⟨ϕ_{n,T}, f⟩.
Again using Parseval's relation,

$$\langle \varphi_{n,T}, f\rangle = \frac{\sqrt{T}}{2\pi} \int_{-\omega_m}^{\omega_m} e^{j\omega nT}\, F(\omega)\, d\omega = \sqrt{T}\, f(nT),$$

and thus

$$f(t) = \sum_{n=-\infty}^{\infty} \langle \varphi_{n,T}, f\rangle\, \varphi_{n,T}(t) \qquad (2.4.30)$$
(the only change is that we normalized the sinc basis functions to have unit norm).
What happens if f (t) is not bandlimited? Because {ϕn,T } is an orthogonal set,
the interpolation formula (2.4.30) represents the orthogonal projection of the input
signal onto the subspace of bandlimited signals. Another way to write the inner
product in (2.4.30) is
$$\langle \varphi_{n,T}, f\rangle = \int_{-\infty}^{\infty} \varphi_{0,T}(\tau - nT)\, f(\tau)\, d\tau = \varphi_{0,T}(-t) * f(t)\big|_{t=nT},$$
which equals ϕ0,T (t)∗f (t) since ϕ0,T (t) is real and symmetric in t. That is, the inner
products, or coefficients, in the interpolation formula are simply the outputs of an
ideal lowpass filter with cutoff π/T sampled at multiples of T . This is the usual
view of the sampling theorem as a bandlimiting convolution followed by sampling
and reinterpolation.
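A numerical sketch of this reconstruction (an illustrative addition, assuming NumPy): a signal bandlimited below ω_m is recovered at an off-grid point from its samples by the interpolation formula (2.4.27), with the infinite sum truncated:

```python
import numpy as np

wm = np.pi                     # bandlimit, so the critical period is T = 1
T = np.pi / wm
n = np.arange(-200, 201)       # truncated sample index range

# Two sinusoids with frequencies 0.4*pi and 0.7*pi, both below wm.
samples = np.sin(0.4 * np.pi * n * T) + 0.5 * np.cos(0.7 * np.pi * n * T)

def sinc_T(t):
    return np.sinc(t / T)      # np.sinc(x) = sin(pi x) / (pi x)

t = 0.37                       # off-grid evaluation point
f_rec = np.sum(samples * sinc_T(t - n * T))
f_true = np.sin(0.4 * np.pi * t) + 0.5 * np.cos(0.7 * np.pi * t)
print(np.isclose(f_rec, f_true, atol=1e-2))   # True, up to truncation error
```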
To conclude this section, we will demonstrate a fact that will be used in Chap-
ter 4. It states that the following can be seen as a Fourier transform pair:
$$\langle f(t), f(t + n)\rangle = \delta[n] \;\longleftrightarrow\; \sum_{k\in\mathbb{Z}} |F(\omega + 2k\pi)|^2 = 1. \qquad (2.4.31)$$
The left side of the equation is simply the deterministic autocorrelation of f(t)
evaluated at integers, that is, the sampled autocorrelation. If we denote the auto-
correlation of f(t) as p(τ) = ⟨f(t), f(t + τ)⟩, then the left side of (2.4.31) is
p₁(τ) = p(τ) s₁(τ), where s₁(τ) is as defined in (2.4.20) with T = 1. The Fourier
transform of p₁(τ) is (apply (2.4.25))

$$P_1(\omega) = \sum_{k\in\mathbb{Z}} P(\omega - 2k\pi).$$

Since the Fourier transform of p(t) is P(ω) = |F(ω)|², the Fourier transform of the
left side of (2.4.31) is exactly the sum on the right side, which establishes the pair.
Comparing (2.4.32–2.4.33) with the equivalent expressions for Fourier series (2.4.13–
2.4.14), one can see that they are duals of each other (within scale factors). Fur-
thermore, if the sequence f [n] is obtained by sampling a continuous-time function
f (t) at instants nT ,
f [n] = f (nT ), (2.4.34)
then the discrete-time Fourier transform is related to the Fourier transform of f (t).
Denoting the latter by Fc (ω), the Fourier transform of its sampled version is equal
to (see (2.4.26))
$$F_T(\omega) = \sum_{n=-\infty}^{\infty} f(nT)\, e^{-jnT\omega} = \frac{1}{T} \sum_{k=-\infty}^{\infty} F_c\!\left(\omega - k\,\frac{2\pi}{T}\right). \qquad (2.4.35)$$
Because of these close relationships with the Fourier transform and Fourier series,
it follows that all properties seen earlier carry over and we will only repeat two of
the most important ones (for others, see [211]).
Convolution Given two sequences f [n] and g[n] and their discrete-time Fourier
transforms F (ejω ) and G(ejω ), then
f[n] ∗ g[n] = Σ_{l=−∞}^{∞} f[n − l] g[l] = Σ_{l=−∞}^{∞} f[l] g[n − l]  ←→  F(e^{jω}) G(e^{jω}).
The discrete-time Fourier series (DTFS) of a periodic sequence f[n] with period N is given by

F[k] = Σ_{n=0}^{N−1} f[n] W_N^{nk},  k ∈ Z,   (2.4.38)

f[n] = (1/N) Σ_{k=0}^{N−1} F[k] W_N^{−nk},  n ∈ Z.   (2.4.39)

The convolution of two periodic sequences is a periodic, or circular, convolution,

f[n] ∗ g[n] = Σ_{l=0}^{N−1} f[n − l] g[l] = Σ_{l=0}^{N−1} f_0[(n − l) mod N] g_0[l],   (2.4.40)

where f_0[·] and g_0[·] are equal to one period of f[·] and g[·], respectively. That is, f_0[n] = f[n], n = 0, . . . , N − 1, and 0 otherwise, and similarly for g_0[n]. Then, the
convolution property is given by

f[n] ∗ g[n]  ←→  F[k] G[k],

and Parseval's relation becomes

Σ_{n=0}^{N−1} f[n] g*[n] = (1/N) Σ_{k=0}^{N−1} F[k] G*[k].
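These relations are easy to check numerically; below is a minimal Python sketch (the random length-N sequences are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 16
    f, g = rng.standard_normal(N), rng.standard_normal(N)
    # circular convolution (2.4.40), computed directly from the definition
    circ = np.array([sum(f[l] * g[(n - l) % N] for l in range(N)) for n in range(N)])
    # convolution property: it equals the inverse DFT of the product of the DFTs
    assert np.allclose(circ, np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real)
    # Parseval's relation
    assert np.isclose(np.sum(f * g), np.sum(np.fft.fft(f) * np.conj(np.fft.fft(g))).real / N)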
Just as the Fourier series coefficients were related to the Fourier transform of one
period (see (2.4.14)), the coefficients of the discrete-time Fourier series can be ob-
tained from the discrete-time Fourier transform of one period. If we call F0 (ejω )
the discrete-time Fourier transform of f0 [n], (2.4.32) and (2.4.38) imply that
F_0(e^{jω}) = Σ_{n=−∞}^{∞} f_0[n] e^{−jωn} = Σ_{n=0}^{N−1} f[n] e^{−jωn},

leading to

F[k] = F_0(e^{jω})|_{ω=2πk/N}.
The sampling of F0 (ejω ) simply repeats copies of f0 [n] at integer multiples of N ,
and thus we have
f[n] = Σ_{l=−∞}^{∞} f_0[n − lN] = (1/N) Σ_{k=0}^{N−1} F[k] e^{j2πnk/N} = (1/N) Σ_{k=0}^{N−1} F_0(e^{j2πk/N}) e^{j2πnk/N},   (2.4.42)
which is the discrete-time version of the Poisson sum formula. It actually holds
for f0 [·] with support larger than 0, . . . , N − 1, as long as the first sum in (2.4.42)
converges. For n = 0, (2.4.42) yields
Σ_{l=−∞}^{∞} f_0[lN] = (1/N) Σ_{k=0}^{N−1} F_0(e^{j2πk/N}).
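A quick numerical check of this discrete-time Poisson sum formula (a Python sketch; the decaying sequence is an arbitrary choice, treated as zero outside the chosen support):

    import numpy as np

    N = 8
    n0 = np.arange(-40, 41)
    f0 = 0.8 ** np.abs(n0)                   # finitely supported f0 with fast decay
    lhs = np.sum(f0[n0 % N == 0])            # sum_l f0[lN]
    F0 = np.array([np.sum(f0 * np.exp(-2j * np.pi * k / N * n0)) for k in range(N)])
    rhs = np.mean(F0).real                   # (1/N) sum_k F0(e^{j 2 pi k / N})
    print(lhs, rhs)                          # the two values agree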
The discrete Fourier transform (DFT) of a length-N sequence and its inverse are given by

F[k] = Σ_{n=0}^{N−1} f[n] W_N^{nk},   (2.4.43)

f[n] = (1/N) Σ_{k=0}^{N−1} F[k] W_N^{−nk},   (2.4.44)

where W_N = e^{−j2π/N}. These are the same formulas as (2.4.38–2.4.39), except that f[n] and F[k] are defined only for n, k ∈ {0, . . . , N − 1}. Recall that the discrete-time
Fourier series can be written in matrix notation, using the Fourier matrix F with entries

F_{n,k} = W_N^{nk},  n, k = 0, . . . , N − 1.

Then circular convolution with a filter f, which corresponds to multiplication by a circulant matrix C, is diagonalized by the DFT,

f ∗_p g = C g = F^{−1} Λ F g,

where Λ is the diagonal matrix with entries given by the DFT coefficients f̂ = F f, and where we used (2.4.45), that is, the fact that F* is the inverse of F up to a scale factor of N.
Other properties of the DFT follow from their counterparts for the discrete-time
Fourier transform, bearing in mind the underlying circular structure implied by the
discrete-time Fourier series (for example, a shift is a circular shift).
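The diagonalization of circular convolution by the DFT can be verified in a few lines of Python (a sketch; the filter and input are random choices):

    import numpy as np

    rng = np.random.default_rng(1)
    N = 8
    f, g = rng.standard_normal(N), rng.standard_normal(N)
    # circulant matrix C with first column f: (C g)[n] = sum_l f[(n - l) mod N] g[l]
    C = np.array([[f[(i - j) % N] for j in range(N)] for i in range(N)])
    # Fourier matrix F with entries W_N^{nk}
    F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
    Lam = np.diag(np.fft.fft(f))             # Lambda = diag(F f)
    assert np.allclose(C, (np.linalg.inv(F) @ Lam @ F).real)
    assert np.allclose(C @ g, np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real)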
[Figure 2.3: Signals and their Fourier transforms for the four combinations of continuous/discrete time and frequency: (a) Fourier transform, (b) Fourier series, (c) discrete-time Fourier transform, (d) discrete-time Fourier series.]
Between the Fourier transform, where both time and frequency variables are con-
tinuous, and the discrete-time Fourier series (DTFS), where both variables are
discrete, there are a number of intermediate cases.
First, in Table 2.1 and Figure 2.3, we compare the Fourier transform, Fourier
[Figure 2.4: The same four transforms for signals that are limited in time or in frequency, so that the continuous variable can be sampled: panels (a)–(d).]
series, discrete-time Fourier transform and discrete-time Fourier series. The table
shows four combinations of continuous versus discrete variables in time and fre-
quency. As defined in Section 2.4.1, we use a short-hand CT or DT for continuous-
versus discrete-time variable, and we call it a Fourier transform or series if the
synthesis formula involves an integral or a summation.
Then, in Table 2.2 and Figure 2.4, we consider the same transforms but when
Table 2.1 Fourier transforms with various combinations of continuous/discrete time and frequency variables. CT and DT stand for continuous and discrete time, while FT and FS stand for Fourier transform (integral synthesis) and Fourier series (summation synthesis). P stands for a periodic signal. The relation between sampling period T and sampling frequency ωs is ωs = 2π/T. Note that in the DTFT case, ωs is usually equal to 2π (T = 1).

(a) Fourier transform (CTFT). Time: C; frequency: C. Self-dual.
    Analysis:  F(ω) = ∫_t f(t) e^{−jωt} dt
    Synthesis: f(t) = (1/2π) ∫_ω F(ω) e^{jωt} dω

(b) Fourier series (CTFS). Time: C, P; frequency: D. Dual with DTFT.
    Analysis:  F[k] = (1/T) ∫_{−T/2}^{T/2} f(t) e^{−j2πkt/T} dt
    Synthesis: f(t) = Σ_k F[k] e^{j2πkt/T}

(c) Discrete-time Fourier transform (DTFT). Time: D; frequency: C, P. Dual with CTFS.
    Analysis:  F(e^{jω}) = Σ_n f[n] e^{−j2πωn/ωs}
    Synthesis: f[n] = (1/ωs) ∫_{−ωs/2}^{ωs/2} F(e^{jω}) e^{j2πωn/ωs} dω

(d) Discrete-time Fourier series (DTFS). Time: D, P; frequency: D, P. Self-dual.
    Analysis:  F[k] = Σ_{n=0}^{N−1} f[n] e^{−j2πnk/N}
    Synthesis: f[n] = (1/N) Σ_{k=0}^{N−1} F[k] e^{j2πnk/N}
the signal satisfies some additional restrictions, that is, when it is limited either in time or in frequency. In that case, the continuous function (of time or frequency) can be sampled without loss of information.

Table 2.2 Various Fourier transforms with restrictions on the signals involved. Either the signal is of finite length (FL) or the Fourier transform is bandlimited (BL).
The Laplace transform of a finite-length signal has the whole complex plane as its ROC (assuming it converges anywhere), since such a signal is both left- and right-sided and the ROC is connected.
If a signal is two-sided, that is, neither left- nor right-sided, then its ROC is the
intersection of the ROC’s of its left- and right-sided parts. This ROC is therefore
either empty or of the form of a vertical strip.
Given a Laplace transform (such as a rational expression), different ROC’s lead
to different time-domain signals. Let us illustrate this with an example.
Example 2.1
Assume F(s) = 1/((s + 1)(s + 2)) = 1/(s + 1) − 1/(s + 2). The ROC {Re(s) < −2} corresponds to a left-sided signal

f(t) = −(e^{−t} − e^{−2t}) u(−t).

The ROC {Re(s) > −1} corresponds to a right-sided signal

f(t) = (e^{−t} − e^{−2t}) u(t).

Finally, the ROC {−2 < Re(s) < −1} corresponds to a two-sided signal

f(t) = −e^{−t} u(−t) − e^{−2t} u(t).

Note that only the right-sided signal would also have a Fourier transform (since its ROC includes the jω-axis).
For the inversion of the Laplace transform, recall its relation to the Fourier transform of an exponentially weighted signal. Then, it can be shown that its inverse is

f(t) = (1/2πj) ∫_{σ−j∞}^{σ+j∞} F(s) e^{st} ds,

where σ is chosen inside the ROC. We will denote a Laplace transform pair by f(t) ←→ F(s).
For a review of Laplace transform properties, see [212]. Next, we will concentrate on filtering only. The convolution property holds just as in the Fourier case, that is,

h(t) ∗ g(t) ←→ H(s) G(s),

with an ROC containing the intersection of the ROC's of H(s) and G(s).
The differentiation property of the Laplace transform says that

∂f(t)/∂t ←→ s F(s),

with ROC containing the ROC of F(s). Then, it follows that linear constant-coefficient differential equations can be characterized by a Laplace transform called the transfer function H(s). Linear, time-invariant differential equations, given by

Σ_{k=0}^{N} a_k ∂^k y(t)/∂t^k = Σ_{k=0}^{M} b_k ∂^k x(t)/∂t^k,   (2.5.1)

lead to the rational transfer function

H(s) = (Σ_{k=0}^{M} b_k s^k) / (Σ_{k=0}^{N} a_k s^k),

that is, the input and the output are related by a convolution with a filter having impulse response h(t), where h(t) is the inverse Laplace transform of H(s).
To take this inverse Laplace transform, we need to specify the ROC. Typically,
we look for a causal solution, where we solve the differential equation forward
in time. Then, the ROC extends to the right of the vertical line which passes
through the rightmost pole. Stability11 of the filter corresponding to the transfer
function requires that the ROC include the jω-axis. This leads to the well-known
requirement that a causal system with rational transfer function is stable if and
only if all the poles are in the left half-plane (the real part of the pole location is
smaller than zero). In the above discussion, we have assumed initial rest conditions,
that is, the homogeneous solution of differential Equation (2.5.1) is zero (otherwise,
the system is neither linear nor time-invariant).
As an example, the N-th order Butterworth lowpass filter has the squared magnitude response

|H_N(jω)|² = 1 / (1 + (jω/jωc)^{2N}),   (2.5.2)

where ωc is a parameter which specifies the cutoff frequency beyond which sinusoids are substantially attenuated. Thus, ωc defines the bandwidth of the lowpass Butterworth filter.
11 Stability of a filter means that a bounded input produces a bounded output.
Since |H_N(jω)|² = H(jω) H*(jω) = H(jω) H(−jω) when the filter is real, and noting that (2.5.2) is the Laplace transform for s = jω, we get

H(s) H(−s) = 1 / (1 + (s/jωc)^{2N}).   (2.5.3)

The poles of H(s)H(−s) satisfy

|s_k| = ωc,  arg[s_k] = π(2k + 1)/(2N) + π/2,

with k = 0, . . . , 2N − 1. The poles thus lie on a circle, and they appear in pairs at ±s_k.
To get a stable and causal filter, one simply chooses the N poles which lie on the left-hand
side half-circle. Since pole locations specify the filter only up to a scale factor, set s = 0
in (2.5.3) which leads to H(0) = 1. For example, a second-order Butterworth filter has the
following Laplace transform:
H_2(s) = ωc² / ((s + ωc e^{jπ/4})(s + ωc e^{−jπ/4})).   (2.5.4)
One can find its “physical” implementation by going back, through the inverse Laplace
transform, to the equivalent linear constant-coefficient differential equation. See also Ex-
ample 3.6 in Chapter 3, for discrete-time Butterworth filters.
The z-transform of a sequence f[n] is defined as

F(z) = Σ_{n=−∞}^{∞} f[n] z^{−n},   (2.5.5)

where z ∈ C. On the unit circle z = e^{jω}, this is the discrete-time Fourier transform (2.4.32), and for z = ρe^{jω}, it is the discrete-time Fourier transform of the sequence f[n] · ρ^{−n}. Similarly to the Laplace transform, there is a region of convergence
(ROC) associated with the z-transform F (z), namely a region of the complex plane
where F (z) converges. Consider the case where the z-transform is rational and
the sequence is bounded in amplitude. The ROC does not contain any pole. If the
sequence is right-sided (left-sided), the ROC extends outward (inward) from a circle
with the radius corresponding to the modulus of the outermost (innermost) pole. If
the sequence is two-sided, the ROC is a ring. The discrete-time Fourier transform
converges absolutely if and only if the ROC contains the unit circle. From the
above discussion, it is clear that the unit circle in the z-plane of the z-transform
and the jω-axis in the s-plane of the Laplace transform play equivalent roles.
Also, just as in the Laplace transform, a given z-transform corresponds to dif-
ferent signals, depending on the ROC attached to it.
The inverse z-transform involves contour integration in the ROC and Cauchy’s
integral theorem [211]. If the contour of integration is the unit circle, the inver-
sion formula reduces to the discrete-time Fourier transform inversion (2.4.33). On
circles centered at the origin but of radius ρ different from 1, one can think of forward and inverse z-transforms as the Fourier analysis and synthesis of a sequence f′[n] = ρ^{−n} f[n]. Thus, convergence properties are as for the Fourier transform of the
exponentially weighted sequence. In the ROC, we can write formally a z-transform
pair as
f [n] ←→ F (z), z ∈ ROC.
When z-transforms are rational functions, the inversion is best done by partial frac-
tion expansion followed by term-wise inversion. Then, the z-transform pairs

a^n u[n] ←→ 1/(1 − a z^{−1}),  |z| > |a|,   (2.5.6)

and

−a^n u[−n − 1] ←→ 1/(1 − a z^{−1}),  |z| < |a|,   (2.5.7)
are useful, where u[n] is the unit-step function (u[n] = 1, n ≥ 0, and 0 otherwise).
The above transforms follow from the definition (2.5.5) and the sum of geometric
series, and they are a good example of identical z-transforms with different ROC’s
corresponding to different signals.
As a simple example, consider the sequence

f[n] = a^{|n|},

whose z-transform has the ROC |a| < |z| < 1/|a|, that is, a nonempty ROC only if |a| < 1. For more z-transform properties, see [211].
The z-transform G(z) is thus the eigenvalue of the convolution operator for that particular value of z. The convolution theorem follows as

h[n] ∗ g[n] ←→ H(z) G(z),

with an ROC containing the intersection of the ROC's of H(z) and G(z). Convolution with a time-reversed filter can be expressed as an inner product,

f[n] = Σ_k x[k] h[n − k] = Σ_k x[k] h̃[k − n] = ⟨x[k], h̃[k − n]⟩,

where h̃[n] = h[−n] denotes the time-reversed impulse response. The shift property follows from the definition,

x[n − k] ←→ z^{−k} X(z).
Consider the linear, constant-coefficient difference equation

Σ_{k=0}^{N} a_k y[n − k] = Σ_{k=0}^{M} b_k x[n − k].   (2.5.8)

Taking its z-transform using the delay property, we get the transfer function as the ratio of the output and input z-transforms,

H(z) = Y(z)/X(z) = (Σ_{k=0}^{M} b_k z^{−k}) / (Σ_{k=0}^{N} a_k z^{−k}).
The output is related to the input by a convolution with a discrete-time filter having
as impulse response h[n], the inverse z-transform of H(z). Again, the ROC depends on whether a causal (right-sided) or another solution is desired.
that is, P(e^{jω}) is a nonnegative function on the unit circle. In other words, the following is a Fourier-transform pair:

p[n] = ⟨h[k], h[k + n]⟩  ←→  P(e^{jω}) = |H(e^{jω})|²,

with P(z) = H(z) H_*(z^{−1}) (recall that the subscript * implies conjugation of the coefficients but not of z).
Note that from the above, it is obvious that if zk is a zero of P (z), so is 1/zk∗ (that
also means that zeros on the unit circle are of even multiplicity). When h[n] is
real, and zk is a zero of H(z), then zk∗ , 1/zk , 1/zk∗ are zeros as well (they are not
necessarily different).
Suppose now that we are given an autocorrelation function P (z) and we want
to find H(z). Here, H(z) is called a spectral factor of P (z) and the technique of
extracting it, spectral factorization. These spectral factors are not unique, and are
obtained by assigning one zero out of each zero pair to H(z) (we assume here that
p[m] is FIR, otherwise allpass functions (2.5.10) can be involved). The choice of
which zeros to assign to H(z) leads to different spectral factors. To obtain a spectral
factor, first factor P(z) into its zeros as follows:

P(z) = α Π_{i=1}^{Nu} (1 − z_{1i} z^{−1})(1 − z*_{1i} z) · Π_{i=1}^{N} (1 − z_{2i} z^{−1})(1 − z*_{2i} z),
where the first product contains the zeros on the unit circle, and thus |z1i | = 1,
and the last two contain pairs of zeros inside/outside the unit circle, respectively.
In that case, |z2i | < 1. To obtain various H(z), one has to take one zero out of
each zero pair on the unit circle, as well as one of two zeros inside/outside the
unit circle. Note that all these solutions have the same magnitude response but
different phase behavior. An important case is the minimum phase solution which
is the one, among all causal spectral factors, that has the smallest phase term. To
get a minimum phase solution, we will consistently choose the zeros inside the unit
circle. Thus, H(z) would be of the form
H(z) = √α Π_{i=1}^{Nu} (1 − z_{1i} z^{−1}) Π_{i=1}^{N} (1 − z_{2i} z^{−1}).
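As an illustration, spectral factorization can be carried out numerically by rooting the autocorrelation polynomial (a Python sketch assuming no zeros exactly on the unit circle and P(1) > 0; the filter h is an arbitrary choice):

    import numpy as np

    h = np.array([1.0, 0.5, -0.25])              # some real FIR filter
    p = np.convolve(h, h[::-1])                  # deterministic autocorrelation, i.e., P(z)
    r = np.roots(p)                              # zeros of P come in reciprocal pairs
    hmin = np.real(np.poly(r[np.abs(r) < 1]))    # keep the zeros inside the unit circle
    hmin *= np.sqrt(np.sum(p)) / abs(np.sum(hmin))   # set the gain using P(1) = |H(1)|^2
    # the minimum phase factor has the same magnitude response as h
    w = np.linspace(0, np.pi, 256)
    E = np.exp(-1j * np.outer(w, np.arange(len(h))))
    assert np.allclose(np.abs(E @ h), np.abs(E @ hmin))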
Consider now linear phase FIR filters of length L. When the impulse response is symmetric, that is, h[n] = h[L − 1 − n], one can write

H(e^{jω}) = e^{−jω(L−1)/2} A(ω),

where L is the length of the filter, and A(ω) is a real function of ω. Thus, the phase is a linear function of ω. Similarly, when the impulse response is antisymmetric, h[n] = −h[L − 1 − n], one can write

H(e^{jω}) = j e^{−jω(L−1)/2} B(ω),
where B(ω) is a real function of ω. Here, the phase is an affine function of ω (but
usually called linear phase).
One way to design discrete-time filters is by transformation of an analog filter.
For example, one can sample the impulse response of the analog filter if its magni-
tude frequency response is close enough to being bandlimited. Another approach
consists of mapping the s-plane of the Laplace transform into the z-plane. From
our previous discussion of the relationship between the two planes, it is clear that
the jω-axis should map into the unit circle and the left half-plane should become
the inside of the unit circle in order to preserve stability. Such a mapping is given
by the bilinear transformation [211]
B(z) = β (1 − z^{−1}) / (1 + z^{−1}).
Then, the discrete-time filter Hd is obtained from a continuous-time filter Hc by
setting
Hd (z) = Hc (B(z)).
Considering what happens on the jω-axis and the unit circle, it can be verified that
the bilinear transform warps the frequency axis as ω = 2 arctan(ωc /β), where ω
and ωc are the discrete and continuous frequency variables, respectively.
As an example, the discrete-time Butterworth filter has a magnitude frequency
response equal to
|H(e^{jω})|² = 1 / (1 + (tan(ω/2)/tan(ω0/2))^{2N}).   (2.5.9)
This squared magnitude is flat at the origin, in the sense that its first 2N − 1
derivatives are zero at ω = 0. Note that since we have a closed-form factorization of
the continuous-time Butterworth filter (see (2.5.4)), it is best to apply the bilinear
transform to the factored form rather than factoring (2.5.9) in order to obtain
H(ejω ) in its cascade form.
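A short scipy sketch of this indirect design (the cutoff value is an arbitrary choice; note the prewarping of the cutoff frequency, since scipy's bilinear map uses β = 2fs):

    import numpy as np
    from scipy import signal

    fs = 1.0                          # sampling rate (normalized)
    w0 = 0.3 * np.pi                  # desired discrete-time cutoff, rad/sample
    Wc = 2 * fs * np.tan(w0 / 2)      # prewarped analog cutoff
    b_a, a_a = signal.butter(2, Wc, analog=True)   # analog design, cf. (2.5.4)
    b_d, a_d = signal.bilinear(b_a, a_a, fs)       # map the s-plane design to the z-plane
    w, H = signal.freqz(b_d, a_d, worN=1024)
    # |H(e^{jw})|^2 should be 1/2 at the cutoff, as in (2.5.9)
    print(np.abs(H[np.argmin(np.abs(w - w0))]) ** 2)   # approximately 0.5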
Instead of the above indirect construction, one can design discrete-time filters
directly. This leads to better designs at a given complexity of the filter or, con-
versely, to lower-complexity filters for a given filtering performance.
In the particular case of FIR linear phase filters (that is, a finite-length sym-
metric or antisymmetric impulse response), a powerful design method called the
Parks-McClellan algorithm [211] leads to optimal filters in the minimax sense (the
maximum deviation from the desired Fourier transform magnitude is minimized).
The resulting approximation of the desired frequency response becomes equiripple
both in the passband and stopband (the approximation error is evenly spread out).
It is thus very different from a monotonically decreasing approximation as achieved
by a Butterworth filter.
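For example, an equiripple lowpass filter can be designed in a few lines with scipy's implementation of the Parks-McClellan algorithm (the band edges and filter length below are arbitrary choices):

    from scipy import signal

    # 32-tap linear phase lowpass: passband [0, 0.2], stopband [0.3, 0.5] (fs = 1)
    h = signal.remez(32, [0, 0.2, 0.3, 0.5], [1, 0], fs=1.0)
    w, H = signal.freqz(h, worN=1024, fs=1.0)
    # the approximation error is equiripple in both bands; h is symmetric (linear phase)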
Finally, we discuss the allpass filter, which is an example of what could be called
a unitary filter. An allpass filter has the property that

|H_ap(e^{jω})| = 1   (2.5.10)

for all ω. Calling y[n] the output of the allpass when x[n] is input, we have

‖y‖² = (1/2π)‖Y(e^{jω})‖² = (1/2π)‖H_ap(e^{jω}) X(e^{jω})‖² = (1/2π)‖X(e^{jω})‖² = ‖x‖²,
which means it conserves the energy of the signal it filters. An elementary single-
pole/zero allpass filter is of the following form (see also Appendix 3.A in Chapter
3):
H_ap(z) = (z^{−1} − a*) / (1 − a z^{−1}).   (2.5.11)
Writing the pole location as a = ρejθ , the zero is at 1/a∗ = (1/ρ)ejθ . A general
allpass filter is made up of elementary sections as in (2.5.11)
"
N
z −1 − a∗i P̃ (z)
Hap (z) = = , (2.5.12)
1 − ai z −1 P (z)
i=1
P ∗ (ejω )
Hap (ejω ) = e−jωN ,
P (ejω )
and property (2.5.10) follows easily. That all rational functions satisfying (2.5.10)
can be factored as in (2.5.12) is shown in [308].
While one could in principle go back to the underlying continuous-time signal and resample it at a different rate, most often the rate changes are done directly in the discrete-time domain. We review some of the key results. For further details, see [67] and [308].
Downsampling^{13} a signal x[n] by an integer N is defined as

y[n] = x[nN],

that is, all samples with indexes modulo N different from zero are discarded. In the Fourier domain, we get

Y(e^{jω}) = (1/N) Σ_{k=0}^{N−1} X(e^{j(ω−2πk)/N}),   (2.5.13)

or, in the z-domain,

Y(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^k z^{1/N}),   (2.5.14)
where W_N = e^{−j2π/N} as usual. To prove (2.5.14), consider first a signal x′[n] which equals x[n] at multiples of N, and 0 elsewhere. If x[n] has z-transform X(z), then X′(z) equals

X′(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^k z),   (2.5.15)
as can be shown by using the orthogonality of the roots of unity (2.1.3). To obtain y[n] from x′[n], one has to drop the extra zeros between the nonzero terms, that is, contract the signal by a factor of N. This is obtained by substituting z^{1/N} for z in (2.5.15), leading to (2.5.14). Note that (2.5.15) contains the signal X as well as its
13 Sometimes, the term decimation is used even though it historically stands for "keep 9 out of 10" in reference to a Roman practice of killing every tenth soldier of a defeated army.
[Figure 2.5: Downsampling by N = 3 in the frequency domain: (a) the original spectrum X(e^{jω}); (b) the spectrum of the downsampled signal.]
N − 1 modulated versions (on the unit circle, X(WNk z) = X(ej(ω−k2π/N ) )). This
is the reason why in Chapter 3, we will call the analysis dealing with X(WNk z),
modulation-domain analysis.
An alternative proof of (2.5.13) (which is (2.5.14) on the unit circle) consists
of going back to the underlying continuous-time signal and resampling with an
N -times larger sampling period. This is considered in Problem 2.10.
By way of an example, we show the case N = 3 in Figure 2.5. It is obvious
that in order to avoid aliasing, downsampling by N should be preceded by an ideal
lowpass filter with cutoff frequency π/N (see Figure 2.6(a)). Its impulse response
h[n] is given by

h[n] = (1/2π) ∫_{−π/N}^{π/N} e^{jωn} dω = sin(πn/N)/(πn).   (2.5.16)
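Relation (2.5.13) can also be checked on the DFT grid (a Python sketch with a random signal whose length is a multiple of N, so that the DFT frequencies line up):

    import numpy as np

    rng = np.random.default_rng(0)
    N, M = 3, 300                     # downsampling factor; signal length, a multiple of N
    x = rng.standard_normal(M)
    y = x[::N]                        # y[n] = x[nN]
    X, Y = np.fft.fft(x), np.fft.fft(y)
    # (2.5.13) sampled at the DFT frequencies of y
    for m in range(M // N):
        rhs = np.mean([X[(m + k * (M // N)) % M] for k in range(N)])
        assert np.isclose(Y[m], rhs)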
Upsampling x[n] by N (inserting N − 1 zeros between consecutive samples) followed by an ideal lowpass filter with cutoff π/N achieves perfect interpolation of x[n], in the sense that the missing samples are filled in without disturbing the original ones.
A rational sampling-rate change by M/N is obtained by cascading upsampling by M and downsampling by N, with an interpolation filter in the middle, as shown in Figure 2.6(c). The interpolation filter is the cascade of the ideal lowpass filters for the upsampling and the downsampling, that is, the narrower of the two in the ideal filter case.
Finally, we demonstrate a fact that will be extensively used in Chapter 3. It
can be seen as an application of downsampling followed by upsampling to the de-
terministic autocorrelation of g[n]. This is the discrete-time equivalent of (2.4.31).
We want to show that the following holds:

⟨g[n], g[n + Nl]⟩ = δ[l]  ←→  Σ_{k=0}^{N−1} G(W_N^k z) G(W_N^{−k} z^{−1}) = N.   (2.5.19)
The left side of the above equation is simply the autocorrelation of g[n] evaluated at every Nth index m = Nl. If we denote the autocorrelation of g[n] by p[n], then the left side of (2.5.19) is p′[n] = p[Nn]. The z-transform of p′[n] is (apply (2.5.14))

P′(z) = (1/N) Σ_{k=0}^{N−1} P(W_N^k z^{1/N}).
Replace now z^{1/N} by z; since the z-transform of p[n] is P(z) = G(z) G(z^{−1}), we get that the z-transform of the left side of (2.5.19) is the right side of (2.5.19).
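For instance, the condition is met by the Haar lowpass filter g = (1/√2, 1/√2); a short numerical check in Python (N = 2, evaluated on the unit circle, where for real g the factor G(W_2^{−k} z^{−1}) is the conjugate of G(W_2^k z)):

    import numpy as np

    g = np.array([1.0, 1.0]) / np.sqrt(2)    # satisfies <g[n], g[n + 2l]> = delta[l]
    w = np.linspace(0, 2 * np.pi, 64, endpoint=False)
    G = lambda ww: g[0] + g[1] * np.exp(-1j * ww)
    # (2.5.19) with N = 2: |G(e^{jw})|^2 + |G(e^{j(w - pi)})|^2 = 2
    assert np.allclose(np.abs(G(w)) ** 2 + np.abs(G(w - np.pi)) ** 2, 2.0)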
Multirate Identities
Interchange of Downsampling and Upsampling Downsampling by N followed by upsampling by M is equivalent to upsampling by M followed by downsampling by N if and only if M and N are coprime (see Figure 2.7(a)). For the two expressions to be equal, kM mod N has to be a permutation, that is, kM mod N = l has to have a unique solution for all l ∈ {0, . . . , N − 1}. If M and N have a common factor L > 1, then M = M′L and N = N′L. Note that (kM mod N) mod L is zero, that is, kM mod N is always a multiple of L and thus not a permutation. If M and N are coprime, then Bezout's identity [209] guarantees that there exist two integers m and n such that mM + nN = 1. It follows that mM mod N = 1; thus, k = ml mod N is the desired solution to the equation kM mod N = l. This property has an interesting generalization in multiple dimensions (see for example [152]).
Interchange of Filtering and Downsampling Filtering with H(z^N) followed by downsampling by N is equivalent to downsampling by N followed by filtering with H(z). In the z-transform domain,

Σ_{k=0}^{N−1} X(W_N^k z^{1/N}) H((W_N^k z^{1/N})^N) = H(z) Σ_{k=0}^{N−1} X(W_N^k z^{1/N}),

since (W_N^k z^{1/N})^N = z.
Interchange of Filtering and Upsampling Filtering with a filter having the z-transform
H(z), followed by upsampling by N , is equivalent to upsampling followed by filtering
with H(z N ).
Using (2.5.18), it is immediate that both systems lead to an output with z-transform X(z^N) H(z^N) when the input is X(z).
In short, the last two properties simply say that filtering in the downsampled
domain can always be realized by filtering in the upsampled domain, but then with
Figure 2.8 Polyphase transform (forward and inverse transforms for the case N = 3 are shown).
the upsampled filter (down and upsampled stand for low versus high sampling rate
domain). The last two relations are shown in Figures 2.7(b) and (c).
A size-N polyphase transform splits a signal x[n] into the N subsequences x_i[n] = x[nN + i], i = 0, . . . , N − 1. These are called signal polyphase components. In the z-transform domain, we can write X(z) as the sum of shifted and upsampled polyphase components. That is,

X(z) = Σ_{i=0}^{N−1} z^{−i} X_i(z^N),   (2.5.20)
where

X_i(z) = Σ_{n=−∞}^{∞} x[nN + i] z^{−n}.   (2.5.21)
Figure 2.8 shows the signal polyphase transform and its inverse (for the case N = 3).
Because the forward shift requires advance operators which are noncausal, a causal
version would produce a total delay of N − 1 samples between forward and inverse
polyphase transform. Such a causal version is obtained by multiplying the noncausal
forward polyphase transform by z −N +1 .
Later we will need to express the output of filtering with H followed by down-
sampling in terms of the polyphase components of the input signal. That is, we
need the 0th polyphase component of H(z)X(z). This is easiest if we define a
polyphase decomposition of the filter to have the reverse phase of the one used for
the signal, or
H(z) = Σ_{i=0}^{N−1} z^i H_i(z^N),   (2.5.22)

with

H_i(z) = Σ_{n=−∞}^{∞} h[nN − i] z^{−n},  i = 0, . . . , N − 1.   (2.5.23)
The 0th polyphase component of H(z)X(z), that is, the z-transform of the output of filtering by H(z) followed by downsampling by N, is then

Y(z) = Σ_{i=0}^{N−1} H_i(z) X_i(z).
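A Python sketch of this identity follows (the helper names are made up for illustration; the phase bookkeeping below follows the direct-form convolution rather than the exact index conventions of (2.5.22–2.5.23), but implements the same idea of running each branch at the low rate):

    import numpy as np

    def filter_downsample_direct(x, h, N):
        return np.convolve(x, h)[::N]           # filter, then keep every N-th sample

    def filter_downsample_polyphase(x, h, N):
        M, L = len(x), len(h)
        out_len = (M + L - 1 + N - 1) // N      # length of the direct-form output
        y = np.zeros(out_len)
        for r in range(N):
            hr = h[r::N]                        # h_r[n] = h[nN + r]
            if r == 0:
                xr = x[::N]                     # x_0[n] = x[nN]
            else:
                xr = np.concatenate(([0.0], x[N - r::N]))   # x_r[n] = x[nN - r]
            c = np.convolve(hr, xr)[:out_len]   # each branch runs at the low rate
            y[:len(c)] += c
        return y

    rng = np.random.default_rng(0)
    x, h = rng.standard_normal(50), rng.standard_normal(7)
    assert np.allclose(filter_downsample_direct(x, h, 3),
                       filter_downsample_polyphase(x, h, 3))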
In the time domain, filtering with h[n] followed by downsampling by N corresponds to a matrix operator acting on the signal,

    ( ⋮    )   ( ⋱                                            ) ( ⋮    )
    ( y[0] ) = ( · · ·  h[L−1]  · · ·  h[L−N]  h[L−N−1]  · · · ) ( x[0] )
    ( y[1] )   ( · · ·    0     · · ·    0      h[L−1]   · · · ) ( x[1] )
    ( ⋮    )   (                                           ⋱ ) ( ⋮    )
where L is the filter length, successive rows are shifted by N positions, and the matrix operator will be denoted by H. Similarly, upsampling by N followed by filtering with g[n] corresponds to a matrix operator acting on the subsampled signal; here the matrix operator is denoted by G. Note that if h[n] = g[−n], then H = G^T, a fact that will be important when analyzing orthonormal filter banks in Chapter 3.
[Figure 2.9: Tiles in the time-frequency plane: the intervals I_t and I_ω contain most of the energy of |f(t)|² and |F(ω)|², and are centered around the centers of gravity; panels (a) and (b).]
For example, one can define intervals I_t and I_ω which contain 90% of the energy of
the time- and frequency-domain functions, respectively, and are centered around
the center of gravity of |f (t)|2 and |F (ω)|2 (see Figure 2.9). This defines what we
call a tile in the time-frequency domain, as shown in Figure 2.9. For simplicity, we
assumed a complex basis function. A real basis function would be represented by
two mirror tiles at positive and negative frequencies.
Consider now elementary operations on a basis function and their effects on the
tile. Obviously, a shift in time by τ results in shifting of the tile by τ . Similarly,
modulation by e^{jω0 t} shifts the tile by ω0 in frequency (vertically). This is shown in Figure 2.10.
In wavelet analysis, one uses scaled versions ψ(t/a) of a prototype function, where the function ψ(t) is usually a bandpass filter. Thus, large a's (a ≫ 1) correspond to long basis functions, and will identify long-term trends in the signal to be analyzed. Small a's (0 < a < 1) lead to short basis functions, which will follow short-term behavior of the signal. This leads to the following: Scale is proportional to the duration of the basis functions used in the signal expansion.
Because of this, and assuming that a basis function is a bandpass filter as in
wavelet analysis, high-frequency basis functions are obtained by going to small
scales, and therefore, scale is loosely related to inverse frequency. This is only
a qualitative statement, since scaling and modulation are fundamentally different
operations as was seen in Figure 2.10. The discussed scale is similar to those in
geographical maps, where large means a coarse, global view, and small corresponds
to a fine, detailed view.
Scale changes can be inverted if the function is continuous-time. In discrete
time, the situation is more complicated. From the discussion of multirate signal
processing in Section 2.5.3, we can see that upsampling (that is, a stretching of the
sequence) can be undone by downsampling by the same factor, and this with no
loss of information if done properly. Downsampling (or contraction of a sequence)
involves loss of information in general, since either a bandlimitation precedes the
downsampling, or aliasing occurs. This naturally leads to the notion of resolution of
a signal. We will thus say that the resolution of a finite-length signal is the minimum
number of samples required to represent it. It is thus related to the information
content of the signal. For infinite-length signals having finite energy and sufficient
decay, one can define the length as the essential support (for example, where 99%
of the energy is).
In continuous time, scaling does not change the resolution, since a scale change
affects both the sampling rate and the length of the signal, thus keeping the number
of samples constant. In discrete time, upsampling followed by interpolation does
not affect the resolution, since the interpolated samples are redundant. Downsam-
pling by N decreases the resolution by N , and cannot be undone. Figure 2.11 shows
the interplay of scale and resolution on simple discrete-time examples. Note that
the notion of resolution is central to multiresolution analysis developed in Chap-
ters 3 and 4. There, the key idea is to split a signal into several lower-resolution
components, from which the original, full-resolution signal can be recovered.
PROOF
Consider the integral of t f(t) f′(t). Using the Cauchy-Schwarz inequality (2.2.2),

|∫_R t f(t) f′(t) dt|² ≤ ∫_R |t f(t)|² dt · ∫_R |f′(t)|² dt.   (2.6.4)
The first integral on the right side is equal to Δ²_t. Because f′(t) has the Fourier transform jωF(ω), and using Parseval's formula, we find that the second integral is equal to (1/2π)Δ²_ω. Thus, the integral on the left side of (2.6.4) is bounded from above by (1/2π)Δ²_t Δ²_ω. Using integration by parts, and noting that f(t) f′(t) = (1/2) ∂f²(t)/∂t,

∫_R t f(t) f′(t) dt = (1/2) ∫_R t ∂f²(t)/∂t dt = (1/2) t f²(t)|_{−∞}^{∞} − (1/2) ∫_R f²(t) dt.
By assumption, the limit of t f²(t) is zero at infinity, and, because the function is of unit norm, the above equals −1/2. Replacing this into (2.6.4), we obtain

1/4 ≤ (1/2π) Δ²_t Δ²_ω,

or (2.6.2). To find a function that meets the lower bound, note that the Cauchy-Schwarz inequality is an equality when the two functions involved are equal within a multiplicative factor, that is, from (2.6.4),

f′(t) = k t f(t),

which is satisfied by Gaussian functions.
Consider now the problem of finding which function, bandlimited to (−ω0, ω0), has its energy concentration in an interval [−T, T] maximized. It can be shown [216, 268] that the solution f(t) is the eigenfunction with the largest eigenvalue satisfying

∫_{−T}^{T} [sin ω0(t − τ) / (π(t − τ))] f(τ) dτ = λ f(t).   (2.6.6)
Call f_n(t) the eigenfunction of (2.6.6) with eigenvalue λ_n. Then (i) each f_n(t) is unique (up to a scale factor), (ii) f_n(t) and f_m(t) are orthogonal for n ≠ m, and (iii) with proper normalization the set {f_n(t)} forms an orthonormal basis for functions bandlimited to (−ω0, ω0) [216]. These functions are called prolate spheroidal wave functions. Note that while (2.6.6) seems to depend on both T and ω0, the solution depends only on the product T · ω0.
That is, one measures the similarity between the signal and shifts and modulates of an elementary window, or

STFT_f(ω, τ) = ⟨g_{ω,τ}, f⟩ = ∫_{−∞}^{∞} w*(t − τ) e^{−jωt} f(t) dt,

where

g_{ω,τ}(t) = w(t − τ) e^{jωt}.
Thus, each elementary function used in the expansion has the same time and fre-
quency resolution, simply a different location in the time-frequency plane. It is
Figure 2.12 The short-time Fourier and wavelet transforms. (a) Modulates and shifts of a Gaussian window used in the expansion. (b) Tiling of the time-frequency plane. (c) Shifts and scales of the prototype bandpass wavelet. (d) Tiling of the time-frequency plane.
thus natural to discretize the STFT on a rectangular grid (mω0 , nτ0 ). If the win-
dow function is a lowpass filter with a cutoff frequency of ωb , or a bandwidth of
2ωb , then ω0 is chosen smaller than 2ωb and τ0 smaller than π/ωb in order to get an
adequate sampling. Typically, the STFT is actually oversampled. A more detailed
discussion of the sampling of the STFT is given in Section 5.2, where the inversion
formula is also given. A real-valued version of the STFT, using cosine modulation
and an appropriate window, leads to orthonormal bases, which are discussed in
Section 4.8.
Examples of STFT basis functions and the tiling of the time-frequency plane
are given in Figures 2.12(a) and (b). To achieve good time-frequency resolution, a
Gaussian window (see (2.6.5)) can be used, as originally proposed by Gabor [102].
Thus, the STFT is often called Gabor transform as well.
The spectrogram is the energy distribution associated with the STFT, that is,

SPEC_f(ω, τ) = |STFT_f(ω, τ)|².

Because the STFT can be thought of as a bank of filters with impulse responses g_{ω,τ}(−t) = w(−t − τ) e^{−jωt}, the spectrogram is the magnitude squared of the filter outputs.
The continuous wavelet transform is

CWT_f(a, b) = ∫_{−∞}^{∞} (1/√a) ψ*((t − b)/a) f(t) dt,   (2.6.8)

where a ∈ R⁺ and b ∈ R. That is, we measure the similarity between the signal f(t) and shifts and scales of an elementary function, since

CWT_f(a, b) = ⟨ψ_{a,b}, f⟩,

where

ψ_{a,b}(t) = (1/√a) ψ((t − b)/a),

and the factor 1/√a is used to conserve the norm. Now, the functions used in
the expansion have changing time-frequency tiles because of the scaling. For small
a (a < 1), ψa,b (t) will be short and of high frequency, while for large a (a > 1),
ψ_{a,b}(t) will be long and of low frequency. Thus, a natural discretization will use large time steps for large a, and conversely, choose fine time steps for small a. The discretization of (a, b) is then of the form (a_0^n, a_0^n · τ_0), and leads to functions for the expansion as shown in Figure 2.12(c). The resulting tiling of the time-frequency plane is shown in Figure 2.12(d) (the case a_0 = 2 is shown). Special choices for
ψ(t) and the discretization lead to orthonormal bases or wavelet series as studied
in Chapter 4, while the overcomplete, continuous wavelet transform in (2.6.8) is
discussed in Section 5.1.
One possibility is to take as window the indicator function of the interval [nT, (n+1)T), periodizing each windowed signal with period
T and applying an expansion such as the Fourier series on each periodized signal (see
Section 4.1.2). Of course, the arbitrary segmentation at points nT creates artificial
boundary problems. Yet, such transforms are used due to their simplicity. For
example, in discrete time, block transforms such as the Karhunen-Loève transform
(see Section 7.1.1) and its approximations are quite popular.
For example, for g(t) = e^{jω0 t} f(t − τ0), a shift-covariant time-frequency distribution satisfies TFD_g(ω, τ) = TFD_f(ω − ω0, τ − τ0).
DEFINITION 2.8

An operator A which maps one Hilbert space H1 into another Hilbert space H2 (which may be the same) is called a linear operator if, for all x, y in H1 and α in C,

A(x + y) = Ax + Ay,  A(αx) = α Ax.

The operator A^{−1} is called the inverse of A. An important result is the following: Suppose A is a bounded linear operator mapping H onto itself, and ‖A‖ < 1. Then I − A is invertible, and for every y in H,

(I − A)^{−1} y = Σ_{k=0}^{∞} A^k y.   (2.A.1)
Note that although the above expansion has the same form for a scalar as well as an operator, one should not forget the distinction between the two. Another important notion is that of an adjoint operator.^{15} It can be shown that for every x in H1 and y in H2, there exists a unique y* from H1, such that

⟨Ax, y⟩ = ⟨x, y*⟩,

and the operator A* defined by A* y = y* is called the adjoint of A.
[Figure: Factorization of an n × n unitary matrix into a cascade of rotation stages U_1, U_2, . . . , U_n and sign changes ±1: (a) the overall cascade; (b) the structure of a single stage U_i.]
An elementary rotation by an angle α, or Givens rotation, is given by

G_α = ( cos α   −sin α
        sin α    cos α ).   (2.B.1)
The way to demonstrate this is to show that any real, unitary n × n matrix U_n can be expressed as

U_n = R_{n−2} · · · R_0 ( U_{n−1}   0
                          0        ±1 ),   (2.B.2)
where D is diagonal with d_ii = e^{jθ_i}, and the H_i are Householder blocks of the form I − 2u_i u_i^T.
The fact that we mention the Householder factorization here is because we will
use its polynomial version to factor lossless matrices in Chapter 3.
Note that the Householder building block is unitary, and that the factorization
in (2.B.3) can be proved similarly to the factorization using Givens rotations. That
is, we can first show that

(1/√c) H_1 U = ( e^{jα}   0
                 0        U_1 ),
In Section 2.4.3, when discussing Fourier series, we pointed out possible convergence
problems such as the Gibbs phenomenon. In this appendix, we first review different
types of convergence and then discuss briefly some convergence properties of Fourier
series and transforms. Then, we discuss regularity of functions and the associated
decay of the Fourier series and transforms. More details on these topics can be
found for example in [46, 326].
2.C.1 Convergence

A sequence of functions f_n(t) converges pointwise to f(t) if, for every t, f_n(t) → f(t) as n → ∞. This is a relatively weak form of convergence, since certain properties of f_n(t), such as continuity, are not passed on to the limit. Consider the truncated Fourier series, that is (from (2.4.13)),

f_n(t) = Σ_{k=−n}^{n} F[k] e^{jkω0 t}.   (2.C.1)
This Fourier series converges pointwise for all t when F [k] are the Fourier coefficients
(see (2.4.14)) of a piecewise smooth17 function f (t). Note that while each fn (t) is
continuous, the limit need not be.
17 A piecewise smooth function on an interval is piecewise continuous (finite number of discontinuities) and its derivative is also piecewise continuous.
A sequence of functions f_n(t) converges to f(t) in the mean square sense if

lim_{n→∞} ‖f − f_n‖² = 0.
Note that this does not mean that limn→∞ fn = f for all t, but only almost ev-
erywhere. For example, the truncated Fourier series (2.C.1) of a piecewise smooth
function converges in the mean square sense to f (t) when F [k] are the Fourier se-
ries coefficients of f (t), even though at a point of discontinuity t0 , f (t0 ) might be
different from limn→∞ fn (t0 ) which equals the mean of the right and left limits.
In the case of the Fourier transform, the concept analogous to the truncated
Fourier series (2.C.1) is the truncated integral defined from the Fourier inversion
formula (2.4.2) as
f_c(t) = (1/2π) ∫_{−c}^{c} F(ω) e^{jωt} dω,
where F (ω) is the Fourier transform of f (t) (see (2.4.1)). The convergence of the
above integral as c → ∞ is an important question, since the limit limc→∞ fc (t)
might not equal f (t). Under suitable restrictions on f (t), equality will hold. As an
example, if f (t) is piecewise smooth and absolutely integrable, then limc→∞ fc (t0 ) =
f (t0 ) at each point of continuity and is equal to the mean of the left and right limits
at discontinuity points [326].
2.C.2 Regularity
So far, we have mostly discussed functions satisfying some integral conditions (abso-
lutely or square-integrable functions for example). Instead, regularity is concerned
with differentiability. The space of continuous functions is called C 0 , and similarly,
C n is the space of functions having n continuous derivatives.
A finer analysis is obtained using Lipschitz (or Hölder) exponents. A function f is called Lipschitz of order α, 0 < α ≤ 1, if for any t and some small ε, we have

|f(t + ε) − f(t)| ≤ c |ε|^α.
It can be shown (see [216]) that if a function f (t) and all its derivatives up
to order n exist and are of bounded variation, then the Fourier transform can be
bounded by
|F(ω)| ≤ c / (1 + |ω|^{n+1}),   (2.C.3)
that is, it decays as O(1/|ω|n+1 ) for large ω. Conversely, if F (ω) has a decay as in
(2.C.3), then f (t) has n−1 continuous derivatives, and the nth derivative exists but
might be discontinuous. A finer analysis of regularity and associated localization in
Fourier domain can be found in [241], in particular for functions in Hölder spaces
and using different norms in Fourier domain.
PROBLEMS
2.1 Legendre polynomials: Consider the interval [−1, 1] and the vectors 1, t, t2 , t3 , . . .. Using
Gram-Schmidt orthogonalization, find an equivalent orthonormal set.
2.2 Prove Theorem 2.4, parts (a), (b), (d), (e), for finite-dimensional Hilbert spaces, Rn or C n .
2.3 Orthogonal transforms and l∞ norm: Orthogonal transforms conserve the l2 norm, but not others, in general. The l∞ norm of a vector is defined as (assume v ∈ R^n)

l∞[v] = max_i |v_i|.
(a) Consider n = 2 and the set of real orthogonal transforms T2 , that is, plane rotations.
Given the set of vectors v with unit l2 norm (that is, vectors on the unit circle), give
lower and upper bounds such that
a2 ≤ l∞ [T2 · v] ≤ b2 .
(b) Give the lower and upper bounds for the general case n > 2, that is, an and bn .
2.4 Norm of operators: Consider operators that map l2 (Z) to itself, and indicate their norm,
or bounds on their norm.
2.6 Least-squares solution: Show that for the least-squares solution obtained in Section 2.3.2,
the partial derivatives ∂(|y − ŷ|2 )/∂ x̂i are all zero.
2.7 Least-squares solution to a linear system of equations: The general solution was given in
Equation (2.3.4–2.3.5).
2.8 Parseval’s formulas can be proven by using orthogonality and biorthogonality relations of
the basis vectors.
(a) Show relations (2.2.5–2.2.6) using the orthogonality of the basis vectors.
(b) Show relations (2.2.11–2.2.13) using the biorthogonality of the basis vectors.
2.9 Consider the space of square-integrable real functions on the interval [−π, π], L2 ([−π, π]),
and the associated orthonormal basis given by
&
1 cos nx sin nx
√ , √ , √ , n = 1, 2, . . .
2π π π
Consider the following two subspaces: S – space of symmetric functions, that is, f (x) =
f (−x), on [−π, π], and A – space of antisymmetric functions, f (x) = −f (−x), on [−π, π].
(a) Show how any function f (x) from L2 ([−π, π]) can be written as f (x) = fs (x) + fa (x),
where fs (x) ∈ S and fa (x) ∈ A.
(b) Give orthonormal bases for S and A.
(c) Verify that L2 ([−π, π]) = S ⊕ A.
2.10 Downsampling by N : Prove (2.5.13) by going back to the underlying time-domain signal
and resampling it with an N -times longer sampling period. That is, consider x[n] and
y[n] = x[nN ] as two sampled versions of the same continuous-time signal, with sampling
periods T and N T , respectively. Hint: Recall that the discrete-time Fourier transform
X(ejω ) of x[n] is (see (2.4.36))
X(e^{jω}) = X_T(ω/T) = (1/T) Σ_{k=−∞}^{∞} X_C(ω/T − k 2π/T),
where T is the sampling period. Then Y(e^{jω}) = X_{NT}(ω/(NT)) (since the sampling period is now NT), where X_{NT}(ω/(NT)) can be written similarly to the above equation. Finally, split the sum involved in X_{NT}(ω/(NT)) using k = nN + l and gather terms; (2.5.13) will follow.
2.11 Downsampling and aliasing: If an arbitrary discrete-time sequence x[n] is input to a filter
followed by downsampling by 2, we know that an ideal half-band lowpass filter (that is,
|H(ejω )| = 1, |ω| < π/2, and H(ejω ) = 0, π/2 ≤ |ω| ≤ π) will avoid aliasing.
2.12 In pattern recognition, it is sometimes useful to expand a signal using the desired pattern,
or template, and its shifts, as basis functions. For simplicity, consider a signal of length N ,
x[n], n = 0, . . . , N − 1, and a pattern p[n], n = 0, . . . , N − 1. Then, choose as basis functions the circular shifts of the pattern, ϕ_k[n] = p[(n − k) mod N], k = 0, . . . , N − 1.

(a) Derive a simple condition on p[n], so that any x[n] can be written as a linear combination of {ϕ_k}.
(b) Assuming the previous condition is met, give the coefficients α_k of the expansion

x[n] = Σ_{k=0}^{N−1} α_k ϕ_k[n].
2.13 Show that a linear, periodically time-varying system of period N can be implemented with
a polyphase transform followed by upsampling by N , N filter operations and a summation.
(a) Give the expression for h2 (t), and verify that it decays as 1/t2 .
(b) Same for h3 (t), which decays as 1/t3 . Show that H3 (ω) has a continuous derivative.
(c) By generalizing the construction above of H2 (ω) and H3 (ω), show that one can obtain
hi (t) with decay 1/ti . Also, show that Hi (ω) has a continuous (i − 2)th derivative.
However, the filters involved become spread out in time, and the result is only inter-
esting asymptotically.
2.15 Uncertainty relation: Consider the uncertainty relation Δ²_ω Δ²_t ≥ π/2.

(a) Show that scaling does not change the product Δ²_ω · Δ²_t. Either use scaling that conserves the L2 norm (f′(t) = √a f(at)) or be sure to renormalize Δ²_ω, Δ²_t.
(b) Can you give the time-bandwidth product of a rectangular pulse, p(t) = 1, −1/2 ≤
t ≤ 1/2, and 0 otherwise?
(c) Same as above, but for a triangular pulse.
(d) What can you say about the time-bandwidth product as the time-domain function is obtained from convolving more and more rectangular pulses with themselves?
2.16 Allpass filters: Consider a stable allpass filter as given in (2.5.12).

(a) Assume the filter has real coefficients. Show the pole-zero locations, and show that the numerator and denominator polynomials are mirrors of each other.

(b) Given h[n], the causal, real-coefficient impulse response of a stable allpass filter, give its autocorrelation a[k] = Σ_n h[n] h[n − k]. Show that the set {h[n − k]}, k ∈ Z, is an orthonormal basis for l2(Z). Hint: Use Theorem 2.4.
(c) Show that the set {h[n − 2k]} is an orthonormal set but not a basis for l2 (Z).
2.17 Parseval's relation for nonorthogonal bases: Consider the space V = R^n and a biorthogonal basis, that is, two sets {α_i} and {β_i} such that

⟨α_i, β_j⟩ = δ[i − j],  i, j = 0, . . . , n − 1.
(a) Show that any vector v ∈ V can be written in the following two ways:

v = Σ_{i=0}^{n−1} ⟨α_i, v⟩ β_i = Σ_{i=0}^{n−1} ⟨β_i, v⟩ α_i.
(b) Call v_α the vector with entries ⟨α_i, v⟩ and similarly v_β with entries ⟨β_i, v⟩. Given v, what can you say about v_α and v_β?

(c) Show that the generalization of Parseval's identity to biorthogonal systems is

‖v‖² = ⟨v, v⟩ = ⟨v_α, v_β⟩

and

⟨v, g⟩ = ⟨v_α, g_β⟩.
2.18 Circulant matrices: An N × N circulant matrix C is defined by its first line, since subse-
quent lines are obtained by a right circular shift. Denote the first line by {c0 , cN−1 , . . . , c1 }
so that C corresponds to a circular convolution with a filter having impulse response
{c0 , c1 , c2 , . . . , cN−1 }.
2.19 Walsh basis: To define the Walsh basis, we need the Kronecker product of matrices defined
in (2.3.2). Then, the matrix W_k, of size 2^k × 2^k, is

W_k = ( 1   1
        1  −1 ) ⊗ W_{k−1},   W_0 = [1],   W_1 = ( 1   1
                                                  1  −1 ).
Therefore, we would like to construct orthonormal sets of basis functions, {ϕk [n]},
which are complete in the space of square-summable sequences, l2(Z). More general sets, namely biorthogonal and overcomplete ones, will be considered as well.
The discrete-time Fourier series, seen in Chapter 2, is an example of such an
orthogonal series expansion, but it has a number of shortcomings. Discrete-time
bases better suited for signal processing tasks will try to satisfy two conflicting
requirements, namely to achieve good frequency resolution while keeping good time
locality as well. Additionally, for both practical and computational reasons, the set
of basis functions has to be structured. Typically, the infinite set of basis functions
{ϕk } is obtained from a finite number of prototype sequences and their shifted
versions in time. This leads to discrete-time filter banks for the implementation of
such structured expansions. This filter bank point of view has been central to the
developments in the digital signal processing community, and to the design of good
basis functions or filters in particular. While the expansion is not time-invariant,
it will at least be periodically time-invariant. Also, the expansions will often have
a successive approximation property. This means that a reconstruction based on
an appropriate subset of the basis functions leads to a good approximation of the
signal, which is an important feature for applications such as signal compression.
Linear signal expansions have been used in digital signal processing since at
least the 1960’s, mainly as block transforms, such as piecewise Fourier series and
Karhunen-Loève transforms [143]. They have also been used as overcomplete ex-
pansions, such as the short-time Fourier transform (STFT) for signal analysis and
synthesis [8, 226] and in transmultiplexers [25]. Increased interest in the subject,
especially in orthogonal and biorthogonal bases, arose with work on compression,
where redundancy of the expansion such as in the STFT is avoided. In particular,
subband coding of speech [68, 69] spurred a detailed study of critically sampled
filter banks. The discovery of quadrature mirror filters (QMF) by Croisier, Esteban
and Galand in 1976 [69], which allows a signal to be split into two downsampled
subband signals and then reconstructed without aliasing (spectral foldbacks) even
though nonideal filters are used, was a key step forward.
Perfect reconstruction filter banks, that is, subband decompositions, where the
signal is a perfect replica of the input, followed soon. The first orthogonal solution
was discovered by Smith and Barnwell [270, 271] and Mintzer [196] for the two-
channel case. Fettweis and coworkers [98] gave an orthogonal solution related
to wave digital filters [97]. Vaidyanathan, who established the relation between
these results and certain unitary operators (paraunitary matrices of polynomials)
studied in circuit theory [23], gave more general orthogonal solutions [305, 306]
as well as lattice factorizations for orthogonal filter banks [308, 310]. Biorthogonal
solutions were given by Vetterli [315], as well as multidimensional quadrature mirror
filters [314]. Biorthogonal filter banks, in particular with linear phase filters, were
investigated in [208, 321] and multidimensional filter banks were further studied in
[155, 163, 257, 264, 325]. Recent work includes filter banks with rational sampling
factors [166, 206] and filter banks with block sampling [158]. Additional work on
the design of filter banks has been done in [144, 205] among others.
In parallel to this work on filter banks, a generalization of block transforms
called lapped orthogonal transforms (LOT’s) was derived by Cassereau [43] and
Malvar [186, 188, 189]. An attractive feature of a subclass of LOT’s is the existence
of fast algorithms for their implementation since they are modulated filter banks
(similar to a "real" STFT). The connection of LOT's with filter banks was shown in [321].
Multidimensional expansions and filter banks are derived in Section 3.6. Both
separable and nonseparable systems are considered. In the nonseparable case, the
focus is mostly on two-channel decompositions, while more general cases are indi-
cated as well.
Section 3.7 discusses a scheme that has received less attention in the filter bank
literature, but is nonetheless very important in applications, and is called a trans-
multiplexer. It is dual to the analysis/synthesis scheme used in compression appli-
cations, and is used in telecommunications.
The two appendices contain more details on orthogonal solutions and their fac-
torizations as well as on multidimensional sampling.
The material in this chapter covers filter banks at a level of detail which is
adequate for the remainder of the book. For a more exhaustive treatment of filter
banks, we refer the reader to the text by Vaidyanathan [308]. Discussions of fil-
ter banks and multiresolution signal processing are also contained in the book by
Akansu and Haddad [3].
An orthonormal series expansion of a signal x[n] from l2(Z) takes the form

x[n] = Σ_{k∈Z} X[k] ϕ_k[n],   (3.1.1)

where

X[k] = ⟨ϕ_k[l], x[l]⟩ = Σ_l ϕ*_k[l] x[l]   (3.1.2)

is the transform of x[n]. The basis functions ϕ_k satisfy the orthonormality^1 constraint

⟨ϕ_k[n], ϕ_l[n]⟩ = δ[k − l],
1 The first constraint is orthogonality between basis vectors. Then, normalization leads to orthonormality. The terms "orthogonal" and "orthonormal" will often be used interchangeably, unless we want to insist on the normalization and then use the latter.
and the set of basis functions is complete, so that every signal from l2(Z) can be expressed using (3.1.1). An important property of orthonormal expansions is conservation of energy,

‖x‖² = ‖X‖².
Biorthogonal expansions, on the other hand, are given as

x[n] = Σ_{k∈Z} ⟨ϕ_k[l], x[l]⟩ ϕ̃_k[n] = Σ_{k∈Z} X̃[k] ϕ̃_k[n]   (3.1.3)
     = Σ_{k∈Z} ⟨ϕ̃_k[l], x[l]⟩ ϕ_k[n] = Σ_{k∈Z} X[k] ϕ_k[n],

where

X̃[k] = ⟨ϕ_k[l], x[l]⟩  and  X[k] = ⟨ϕ̃_k[l], x[l]⟩

are the transform coefficients of x[n] with respect to {ϕ̃_k} and {ϕ_k}. The dual bases {ϕ_k} and {ϕ̃_k} satisfy the biorthogonality constraint

⟨ϕ_k[n], ϕ̃_l[n]⟩ = δ[k − l].

Note that in this case, conservation of energy does not hold. For stability of the expansion, the transform coefficients have to satisfy

A Σ_k |X[k]|² ≤ ‖x‖² ≤ B Σ_k |X[k]|²,
with a similar relation for the coefficients X̃[k]. In the biorthogonal case, conservation of energy can be expressed as

⟨x, x⟩ = ⟨X, X̃⟩ = Σ_{k∈Z} X*[k] X̃[k].
Finally, overcomplete expansions can be of the form (3.1.1) or (3.1.3), but with
redundant sets of functions, that is, the functions ϕk [n] used in the expansions are
not linearly independent.
X^{(i)}[k] = Σ_{l=0}^{N−1} x^{(i)}[iN + l] e^{−j2πkl/N},  k = 0, 1, . . . , N − 1.   (3.1.8)
Reconstruction of x[n] from X (i) [k] is obvious. Recover x(i) [n] by inverting (3.1.8)
(see also (3.1.6)) and then get x[n] following (3.1.7) by juxtaposing the various
x(i) [n]. This leads to
x[n] = Σ_{i=−∞}^{∞} Σ_{k=0}^{N−1} X^{(i)}[k] ϕ^{(i)}_k[n],
where

ϕ^{(i)}_k[n] = { (1/N) e^{j2πkn/N}   for n = iN + l, l = 0, 1, . . . , N − 1,
               { 0                  otherwise.
(i)
The ϕk [n] are simply the basis functions of the DFT shifted to the appropriate
interval [iN, . . . , (i + 1)N − 1].
The above expansion is called a block discrete-time Fourier series, since the
signal is divided into blocks of size N , which are then Fourier transformed. In
matrix notation, the overall expansion of the transform is given by a block diagonal
matrix, where each block is an N × N Fourier matrix F N ,
    ( ⋮        )   ( ⋱              ) ( ⋮        )
    ( X^{(−1)} )   (    F_N          ) ( x^{(−1)} )
    ( X^{(0)}  ) = (       F_N       ) ( x^{(0)}  )
    ( X^{(1)}  )   (          F_N    ) ( x^{(1)}  )
    ( ⋮        )   (             ⋱  ) ( ⋮        )
and X^{(i)}, x^{(i)} are size-N vectors. Up to a scale factor of 1/√N (see (3.1.6)), this is a unitary transform. This transform is not shift-invariant in general, that is, if x[n] has transform X[k], then x[n − l] does not necessarily have the transform X[k − l]. However, it can be seen that

x[n − lN]  ←→  X[k − lN],   (3.1.9)

that is, the transform is periodically time-varying with period N.^2 Note that we
have achieved a certain time locality. Components of the signal that exist only in
an interval [iN . . . (i + 1)N − 1] will only influence transform coefficients in the same
interval. Finally, the basis functions in this block transform are naturally divided
into size-N subsets, with no overlaps between subsets, that is
⟨ϕ^{(i)}_k[n], ϕ^{(m)}_l[n]⟩ = 0,  i ≠ m,
simply because the supports of the basis functions are disjoint. This abrupt change
between intervals, and the fact that the interval length and position are arbitrary,
are the drawbacks of this block DTFS.
In this chapter, we will extend the idea of block transforms in order to address
these drawbacks, and this will be done using filter banks. But first, we turn our
attention to the simplest block transform case, when N = 2. This is followed by
the simplest filter bank case, when the filters are ideal sinc filters. The general case,
to which these are a prelude, lies between these extremes.
2 Another way to say this is that the "shift by N" and the size-N block transform operators commute.
The Haar basis functions (3.1.10) are given by ϕ_{2k}[n] = 1/√2 for n = 2k, 2k + 1 (and 0 otherwise), and ϕ_{2k+1}[n] = 1/√2 for n = 2k, −1/√2 for n = 2k + 1 (and 0 otherwise). It follows that the even-indexed basis functions are translates of each other, and so are the odd-indexed ones, or

ϕ_{2k}[n] = ϕ_0[n − 2k],  ϕ_{2k+1}[n] = ϕ_1[n − 2k].   (3.1.11)

The transform is
X[2k] = ⟨ϕ_{2k}, x⟩ = (1/√2)(x[2k] + x[2k + 1]),   (3.1.12)

X[2k + 1] = ⟨ϕ_{2k+1}, x⟩ = (1/√2)(x[2k] − x[2k + 1]).   (3.1.13)
The reconstruction is obtained from

x[n] = Σ_{k∈Z} X[k] ϕ_k[n],   (3.1.14)
as usual for an orthonormal basis. Let us prove that the set ϕk [n] given in (3.1.10)
is an orthonormal basis for l2 (Z). While the proof is straightforward in this simple
case, we indicate it for two reasons. First, it is easy to extend it to any block
transform, and second, the method of the proof can be used in more general cases
as well.
PROPOSITION 3.1
The set of functions as given in (3.1.10) is an orthonormal basis for signals
from l2 (Z).
PROOF
To check that the set of basis functions {ϕ_k}_{k∈Z} indeed constitutes an orthonormal basis for signals from l2(Z), we have to verify that: (a) the functions form an orthonormal set, and (b) the set is complete in l2(Z).

Consider (a). We want to show that ⟨ϕ_k, ϕ_l⟩ = δ[k − l]. Take k even, k = 2i. Then, for l smaller than 2i or larger than 2i + 1, the inner product is automatically zero since the basis functions do not overlap. For l = 2i, we have

⟨ϕ_{2i}, ϕ_{2i}⟩ = ϕ²_{2i}[2i] + ϕ²_{2i}[2i + 1] = 1/2 + 1/2 = 1.

For l = 2i + 1, we get

⟨ϕ_{2i}, ϕ_{2i+1}⟩ = ϕ_{2i}[2i] · ϕ_{2i+1}[2i] + ϕ_{2i}[2i + 1] · ϕ_{2i+1}[2i + 1] = 0.
A similar argument can be followed for odd l’s, and thus, orthonormality is proven. Now
consider (b). We have to demonstrate that any signal belonging to l2(Z) can be expanded using (3.1.14). This is equivalent to showing that there exists no x[n] with ‖x‖ > 0, such that it has a zero expansion, that is, such that ⟨ϕ_k, x⟩ = 0 for all k. To prove this, suppose it is not true, that is, suppose that there exists an x[n] with ‖x‖ > 0, such that ⟨ϕ_k, x⟩ = 0 for all k. Thus

⟨ϕ_k, x⟩ = 0 for all k  ⟺  ‖X‖² = Σ_{k∈Z} |⟨ϕ_k[n], x[n]⟩|² = 0.   (3.1.15)

Since the last sum consists of strictly nonnegative terms, (3.1.15) is possible if and only if every coefficient X[k] = ⟨ϕ_k[n], x[n]⟩ equals zero.

First, take k even, and consider X[2k] = 0. Because of (3.1.12), it means that x[2k] =
−x[2k + 1] for all k. Now take the odd k’s, and look at X[2k + 1] = 0. From (3.1.13), it
follows that x[2k] = x[2k+1] for all k. Thus, the only solution to the above two requirements
is x[2k] = x[2k + 1] = 0, or a contradiction with our assumption. This shows that there is no sequence x[n] with ‖x‖ > 0 such that ‖X‖ = 0, and proves completeness.
Now, we would like to show how the expansion (3.1.12–3.1.14) can be implemented
using convolutions, thus leading to filter banks. Consider the filter h0 [n] with the
following impulse response:
h_0[n] = { 1/√2   n = −1, 0,
         { 0      otherwise.   (3.1.16)
Note that this is a noncausal filter. Then, X[2k] in (3.1.12) is the result of the
convolution of h_0[n] with x[n] at instant 2k, since

h_0[n] ∗ x[n]|_{n=2k} = Σ_{l∈Z} h_0[2k − l] x[l] = (1/√2) x[2k] + (1/√2) x[2k + 1] = X[2k].
Figure 3.1 Two-channel filter bank with analysis filters h_0[n], h_1[n] and synthesis filters g_0[n], g_1[n]. If the filter bank implements an orthonormal transform, then g_0[n] = h_0[−n] and g_1[n] = h_1[−n]. (a) Block diagram. (b) Spectrum splitting performed by the filter bank.
It is important to note that the impulse responses of the analysis filters are time-reversed versions of the basis functions,

h_i[n] = ϕ_i[−n],  i = 0, 1,

since convolution is an inner product involving time reversal. Also, the filters we defined in (3.1.16) and (3.1.17) are noncausal, which is to be expected since, for example, the computation of X[2k] in (3.1.12) involves x[2k + 1], that is, a future sample. To summarize this discussion, it is easiest to visualize the analysis in matrix notation as
    ( ⋮      )   ( ⋮    )   ( ⋱                                      ) ( ⋮    )
    ( y_0[0] )   ( X[0] )   ( · · ·  h_0[0]  h_0[−1]                  ) ( x[0] )
    ( y_1[0] ) = ( X[1] ) = ( · · ·  h_1[0]  h_1[−1]                  ) ( x[1] )
    ( y_0[1] )   ( X[2] )   (                h_0[0]  h_0[−1]  · · ·   ) ( x[2] )
    ( y_1[1] )   ( X[3] )   (                h_1[0]  h_1[−1]  · · ·   ) ( x[3] )
    ( ⋮      )   ( ⋮    )   (                                     ⋱ ) ( ⋮    )
                                                                  (3.1.18)

with successive rows of the matrix being the basis functions ϕ_0[n], ϕ_1[n], ϕ_2[n], ϕ_3[n], . . . ,
where we again see the shift property of the basis functions (see (3.1.11)). We can verify the shift invariance of the analysis with respect to even shifts. If x′[n] = x[n − 2l], then

X′[2k] = (1/√2)(x′[2k] + x′[2k + 1]) = (1/√2)(x[2k − 2l] + x[2k + 1 − 2l]) = X[2k − 2l],

and similarly for X′[2k + 1], which equals X[2k + 1 − 2l], thus verifying (3.1.9). This does not hold for odd shifts, however. For example, δ[n] has the transform (δ[n] + δ[n − 1])/√2 while δ[n − 1] leads to (δ[n] − δ[n − 1])/√2.
What about the synthesis or reconstruction given by (3.1.14)? Define two filters g_0 and g_1 with impulse responses equal to the basis functions ϕ_0 and ϕ_1, that is, g_0[n] = ϕ_0[n] and g_1[n] = ϕ_1[n].
Therefore
ϕ2k [n] = g0 [n − 2k], ϕ2k+1 [n] = g1 [n − 2k], (3.1.20)
That is, each sample from yi [k] adds a copy of the impulse response of gi [n] shifted by
2k. This can be implemented by an upsampling by 2 (inserting a zero between every
two samples of yi [k]) followed by a convolution with gi [n] (see also Section 2.5.3).
This is shown in the right side of Figure 3.1(a), and is called a synthesis filter bank.
What we have just explained is a way of implementing a structured orthogonal
expansion by means of filter banks. We summarize two characteristics of the filters
which will hold in general orthogonal cases as well.
(a) The impulse responses of the synthesis filters equal the first set of basis func-
tions
gi [n] = ϕi [n], i = 0, 1.
(b) The impulse responses of the analysis filters are the time-reversed versions of
the synthesis ones
hi [n] = gi [−n], i = 0, 1.
What about the signal processing properties of our decomposition? From (3.1.12)
and (3.1.13), we recall that one channel computes the average and the other the
difference of two successive samples. While these are not the "best possible" lowpass and highpass filters (they have, however, good time localization), they lead to
an important interpretation. The reconstruction from y0 [k] (that is, the first sum
in (3.1.21)) is the orthogonal projection of the input onto the subspace spanned by
ϕ2k [n], that is, an average or coarse version of x[n]. Calling it x0 , it equals
$$x_0[2k] = x_0[2k+1] = \frac{1}{2}\left(x[2k] + x[2k+1]\right).$$
The other sum in (3.1.21), which is the reconstruction from y1 [k], is the orthogonal
projection onto the subspace spanned by ϕ2k+1 [n]. Denoting it by x1 , it is given by
$$x_1[2k] = \frac{1}{2}\left(x[2k] - x[2k+1]\right), \qquad x_1[2k+1] = -x_1[2k].$$
This is the difference or added detail necessary to reconstruct x[n] from its coarse
version x0 [n]. The two subspaces spanned by {ϕ2k } and {ϕ2k+1 } are orthogonal
and the sum of the two projections recovers x[n] perfectly, since summing (x0 [2k] +
x1 [2k]) yields x[2k] and similarly (x0 [2k + 1] + x1 [2k + 1]) gives x[2k + 1].
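The Haar analysis/synthesis chain above is easy to verify numerically. The following is a minimal sketch (our own, not from the text), assuming only numpy; the names x, X_even, X_odd, x0, x1 are ours. It checks that the coarse and detail projections add up to x, and that the expansion conserves energy.

```python
# Minimal numerical check of the Haar expansion (3.1.12)-(3.1.14); a sketch.
import numpy as np

x = np.random.default_rng(0).standard_normal(8)   # any finite-energy signal

s = np.sqrt(2.0)
X_even = (x[0::2] + x[1::2]) / s                  # y0[k] = X[2k], the averages
X_odd  = (x[0::2] - x[1::2]) / s                  # y1[k] = X[2k+1], the differences

# Orthogonal projections onto the coarse and detail subspaces.
x0 = np.repeat((x[0::2] + x[1::2]) / 2.0, 2)      # x0[2k] = x0[2k+1]
x1 = np.empty_like(x)
x1[0::2] = (x[0::2] - x[1::2]) / 2.0
x1[1::2] = -x1[0::2]                              # x1[2k+1] = -x1[2k]

assert np.allclose(x0 + x1, x)                    # perfect reconstruction
assert np.isclose(x @ x, X_even @ X_even + X_odd @ X_odd)  # energy conservation
```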
While the time reversal is only formal here (since g0 [n] is symmetric in n), the
shift by one is important for the completeness of the highpass and lowpass impulse
responses in the space of square-summable sequences.
Just as in the Haar case, the basis functions are obtained from the filter impulse responses and their even shifts,
$$\varphi_{2k}[n] = g_0[n-2k], \qquad \varphi_{2k+1}[n] = g_1[n-2k], \qquad (3.1.25)$$
and the coefficients of the expansion, ⟨ϕ_{2k}, x⟩ and ⟨ϕ_{2k+1}, x⟩, are obtained by filtering with h_0[n] and h_1[n] followed by downsampling by 2, with h_i[n] = g_i[−n].
PROPOSITION 3.2
The set of functions as given in (3.1.25) is an orthonormal basis for signals
from l2 (Z).
PROOF
To prove that the set of functions ϕ_k[n] is indeed an orthonormal basis, again we would have to demonstrate orthonormality of the set as well as completeness. Let us demonstrate orthonormality of the basis functions. We will do that only for
$$\langle \varphi_{2k}[n], \varphi_{2l}[n] \rangle = \delta[k-l], \qquad (3.1.26)$$
leaving the remaining cases as an exercise (Problem 3.1). First, because ϕ_{2k}[n] = ϕ_0[n−2k], it suffices to show (3.1.26) for k = 0, or equivalently, to prove that ⟨ϕ_0[n], ϕ_0[n−2l]⟩ = δ[l].
As we said, the filters in this case have perfect frequency resolution. However,
the decay of the filters in time is rather poor, being of the order of 1/n. The
multiresolution interpretation we gave for the Haar case holds here as well. The
perfect lowpass filter h0 , followed by downsampling, upsampling and interpolation
by g0 , leads to a projection of the signal onto the subspace of sequences bandlimited
to [−π/2, π/2], given by x0 . Similarly, the other path in Figure 3.1 leads to a
projection onto the subspace of half-band highpass signals given by x1 . The two
subspaces are orthogonal and their sum is l2 (Z). It is also clear that x0 is a coarse,
lowpass approximation to x, while x1 contains the additional frequencies necessary
to reconstruct x from x0 .
An example describing the decomposition of a signal into downsampled lowpass
and highpass components, with subsequent reconstruction using upsampling and
interpolation, is shown in Figure 3.2. Ideal half-band filters are assumed. The
reader is encouraged to verify this spectral decomposition using the downsampling
and upsampling formulas (see (2.5.13) and (2.5.17)) from Section 2.5.3.
3.1.4 Discussion
In both the Haar and sinc cases above, we noticed that the expansion was not
time-invariant, but periodically time-varying. We show below that time invariance
in orthonormal expansions leads only to trivial solutions, and thus, any meaningful
orthonormal expansion of l2 (Z) will be time-varying.
PROPOSITION 3.3
An orthonormal time-invariant signal decomposition will have no frequency
resolution.
Figure 3.2 Two-channel decomposition of a signal using ideal filters. The left side depicts the process in the lowpass channel, while the right side depicts the process in the highpass channel. (a) Original spectrum. (b) Spectra after filtering. (c) Spectra after downsampling. (d) Spectra after upsampling. (e) Spectra after interpolation filtering. (f) Reconstructed spectrum.
PROOF
An expansion is time-invariant if x[n] ⟷ X[k] implies x[n−m] ⟷ X[k−m] for all x[n] in l²(Z) and all m. Thus, we have that
$$\langle \varphi_k[n], x[n-m] \rangle = \langle \varphi_{k-m}[n], x[n] \rangle.$$
By a change of variable, the left side is equal to ⟨ϕ_k[n+m], x[n]⟩, and then, using k' = k−m, we find that
$$\varphi_{k'+m}[n+m] = \varphi_{k'}[n], \qquad (3.1.29)$$
that is, the expansion operator is Toeplitz; all basis functions are shifts of a single prototype ϕ[n]. Now, we want the expansion to be orthonormal. Using (3.1.29), orthonormality with respect to all shifts, ⟨ϕ[n], ϕ[n−k]⟩ = δ[k], becomes in the Fourier domain
$$|\Phi(e^{j\omega})|^2 = 1,$$
showing that the basis functions have no frequency selectivity, since they are allpass functions.
                 Haar                            Sinc
g0[n]            (δ[n] + δ[n−1])/√2              (1/√2) · sin(πn/2)/(πn/2)
g1[n]            (δ[n] − δ[n−1])/√2              (−1)^n g0[−n+1]
G0(e^{jω})       √2 e^{−jω/2} cos(ω/2)           √2 for ω ∈ [−π/2, π/2], 0 otherwise
G1(e^{jω})       √2 j e^{−jω/2} sin(ω/2)         −e^{−jω} G0(−e^{−jω})
We start with tools for analyzing general filter banks. Then, we examine orthonormal and linear phase two-channel filter banks in more detail. We then present results valid for general two-channel filter banks and examine some special cases, such as IIR solutions.
Time-Domain Analysis Recall that in the Haar case (see (3.1.18)), in order to vi-
sualize block time invariance, we expressed the transform coefficients via an infinite
matrix, that is
$$\underbrace{\begin{pmatrix} \vdots \\ y_0[0] \\ y_1[0] \\ y_0[1] \\ y_1[1] \\ \vdots \end{pmatrix}}_{y} = \underbrace{\begin{pmatrix} \vdots \\ X[0] \\ X[1] \\ X[2] \\ X[3] \\ \vdots \end{pmatrix}}_{X} = T_a \cdot \underbrace{\begin{pmatrix} \vdots \\ x[0] \\ x[1] \\ x[2] \\ x[3] \\ \vdots \end{pmatrix}}_{x}. \qquad (3.2.1)$$
Here, the transform coefficients X[k] are expressed in another form as well. In
the filter bank literature, it is more common to write X[k] as outputs of the two
branches in Figure 3.1(a), that is, as two subband outputs denoted by y0 [k] = X[2k],
and y1 [k] = X[2k + 1]. Also, in (3.2.1), T a · x represents the inner products, where
T a is the analysis matrix and can be expressed as
$$T_a = \begin{pmatrix} & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \\ \cdots & h_0[L-1] & h_0[L-2] & h_0[L-3] & \cdots & h_0[0] & 0 & 0 & \cdots \\ \cdots & h_1[L-1] & h_1[L-2] & h_1[L-3] & \cdots & h_1[0] & 0 & 0 & \cdots \\ \cdots & 0 & 0 & h_0[L-1] & \cdots & h_0[2] & h_0[1] & h_0[0] & \cdots \\ \cdots & 0 & 0 & h_1[L-1] & \cdots & h_1[2] & h_1[1] & h_1[0] & \cdots \\ & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \end{pmatrix},$$
where we assume that the analysis filters hi [n] are finite impulse response (FIR)
filters of length L = 2K. To make the block Toeplitz structure of T a more explicit,
we can write
$$T_a = \begin{pmatrix} & & \ddots & & & & \\ \cdots & A_0 & A_1 & \cdots & A_{K-1} & 0 & \cdots \\ \cdots & 0 & A_0 & \cdots & A_{K-2} & A_{K-1} & \cdots \\ & & & \ddots & & & \end{pmatrix}. \qquad (3.2.2)$$
$$x = T_s\, y = T_s\, X = T_s\, T_a\, x. \qquad (3.2.6)$$
Here T_s is the synthesis matrix given in (3.2.7), which is block Toeplitz with blocks S_i of size 2 × 2, when the FIR synthesis filters are of length L' = 2K'. The block S_i is
$$S_i = \begin{pmatrix} g_0[2i] & g_1[2i] \\ g_0[2i+1] & g_1[2i+1] \end{pmatrix},$$
where g_0[n] and g_1[n] are the synthesis filters. The dual synthesis basis functions are then ϕ̃_{2k}[n] = g_0[n−2k] and ϕ̃_{2k+1}[n] = g_1[n−2k].
Let us go back for a moment to (3.2.6). The requirement that {h_0[2k−n], h_1[2k−n]} and {g_0[n−2k], g_1[n−2k]} form a pair of dual bases is equivalent to
$$T_s\, T_a = T_a\, T_s = I. \qquad (3.2.8)$$
This is the biorthogonality condition or, in the filter bank literature, the perfect reconstruction condition. In other words,
$$\langle h_i[2k-n],\, g_j[n-2l] \rangle = \delta[k-l]\,\delta[i-j], \qquad i, j = 0, 1.$$
Consider the two branches in Figure 3.1(a) which produce y_0 and y_1. Call H_i the operator corresponding to filtering by h_i[n] followed by downsampling by 2, so that y_i = H_i x. Similarly, call G_i the operator corresponding to upsampling by 2 followed by filtering by g_i[n]. The reconstructed signal is then
$$(G_0 H_0 + G_1 H_1)\, x.$$
Thus, to resynthesize the signal (the condition for perfect reconstruction), we must have
$$G_0 H_0 + G_1 H_1 = I.$$
Of course, by interleaving the rows of H 0 and H 1 , we get T a , and similarly, T s
corresponds to interleaving the columns of G0 and G1 .
To summarize this part on time-domain analysis, let us stress once more that a biorthogonal expansion of discrete-time signals, where the basis functions are obtained from two prototype functions and their even shifts (for both dual bases), is implemented using a perfect reconstruction, two-channel multirate filter bank. In other words, perfect reconstruction is equivalent to the biorthogonality condition (3.2.8).
Completeness is also automatically satisfied. To prove it, we show that there exists no x[n] with ‖x‖ > 0 that has a zero expansion, that is, such that X = 0. Suppose the contrary, namely that there exists an x[n] with ‖x‖ > 0 such that X = 0. But, since X = T_a x, we have that
$$T_a\, x = 0 \qquad (3.2.10)$$
(since in a Hilbert space, l²(Z) in this case, ‖v‖² = ⟨v, v⟩ = 0 if and only if v ≡ 0). We know that (3.2.10) has a nontrivial solution if and only if T_a is singular. However, due to (3.2.8), T_a is nonsingular, and thus (3.2.10) has only the trivial solution x ≡ 0, violating our assumption and proving completeness.
In the above, H m (z) is the analysis modulation matrix containing the modulated
versions of the analysis filters and xm (z) contains the modulated versions of X(z).
Relation (3.2.14) is illustrated in Figure 3.3, where the time-varying part is in
the lower channel. If the channel signals Y0 (z) and Y1 (z) are desired, that is, the
downsampled domain signals, it follows from (3.2.11) and (3.2.14) that
$$\begin{pmatrix} Y_0(z) \\ Y_1(z) \end{pmatrix} = \frac{1}{2}\begin{pmatrix} H_0(z^{1/2}) & H_0(-z^{1/2}) \\ H_1(z^{1/2}) & H_1(-z^{1/2}) \end{pmatrix}\begin{pmatrix} X(z^{1/2}) \\ X(-z^{1/2}) \end{pmatrix},$$
[Figure 3.3: Implementation of (3.2.14). The input x is processed by the modulation matrix H_m and the synthesis filters G_0 and G_1; the time-varying part, a modulation by (−1)^n together with the scaling by 1/2, sits in the lower channel, producing x̂.]
The above two conditions then ensure perfect reconstruction. Expressing (3.2.15) and (3.2.16) in matrix notation, we get
$$\begin{pmatrix} G_0(z) & G_1(z) \end{pmatrix} H_m(z) = \begin{pmatrix} 2 & 0 \end{pmatrix}. \qquad (3.2.17)$$
We can now solve for G_0(z) and G_1(z) (transpose (3.2.17) and multiply by (H_m^T(z))^{-1} from the left):
$$\begin{pmatrix} G_0(z) \\ G_1(z) \end{pmatrix} = \frac{2}{\det(H_m(z))}\begin{pmatrix} H_1(-z) \\ -H_0(-z) \end{pmatrix}. \qquad (3.2.18)$$
In the above, we assumed that H m (z) is nonsingular; that is, its normal rank is
equal to 2. Define P(z) as
$$P(z) = G_0(z)\,H_0(z) = \frac{2}{\det(H_m(z))}\,H_0(z)\,H_1(-z), \qquad (3.2.19)$$
where we used (3.2.18). Observe that det(H m (z)) = − det(H m (−z)). Then, we
can express the product G1 (z)H1 (z) as
$$G_1(z)\,H_1(z) = \frac{-2}{\det(H_m(z))}\,H_0(-z)\,H_1(z) = P(-z).$$
We will show later that the function P(z) plays a crucial role in analyzing and designing filter banks. It suffices to note at this point that, due to (3.2.20), all even-indexed coefficients of P(z) equal 0, except for p[0] = 1. Thus, P(z) is of the following form:
$$P(z) = 1 + \sum_{k\in\mathbb{Z}} p[2k+1]\, z^{-(2k+1)},$$
or, equivalently,
$$\sum_{k\in\mathbb{Z}} g_0[k]\, h_0[2n-k] = \delta[n].$$
Similarly, starting from (3.2.15) or (3.2.16) and expressing G0 (z) and H0 (z) as
a function of G1 (z) and H1 (z) would lead to the other biorthogonality relations,
namely
Note that we obtained these relations for ϕ̃0 and ϕ̃1 but they hold also for ϕ̃2l and
ϕ̃2l+1 , respectively. This shows once again that perfect reconstruction implies the
biorthogonality conditions. The converse can be shown as well, demonstrating the
equivalence of the two conditions.
[Figure 3.4: (a) A trivial perfect reconstruction system: a forward polyphase transform (splitting x into its even and odd samples by a delay z and downsampling by 2) followed directly by the inverse polyphase transform (upsampling by 2, a delay z^{-1} in the lower branch, and summation to give x̂). (b) Analysis filter bank implemented with the polyphase matrix H_p. (c) Synthesis filter bank implemented with the polyphase matrix G_p.]
where Hij is the jth polyphase component of the ith filter, or, following (2.5.22–
2.5.23),
$$H_i(z) = H_{i0}(z^2) + z\,H_{i1}(z^2).$$
In (3.2.22) y(z) contains the signals in the middle of the system in Figure 3.1(a).
H p (z) contains the polyphase components of the analysis filters, and is conse-
quently denoted the analysis polyphase matrix, while xp (z) contains the polyphase
components of the input signal or, following (2.5.20),
$$X(z) = X_0(z^2) + z^{-1}\,X_1(z^2).$$
where
$$G_i(z) = G_{i0}(z^2) + z^{-1}\,G_{i1}(z^2). \qquad (3.2.24)$$
The synthesis filter polyphase components are defined such as those of the signal
(2.5.20–2.5.21), or in reverse order of those of the analysis filters. In Figure 3.4(c),
we show how the output signal is synthesized from the channel signals Y0 and Y1 as
$$\hat{X}(z) = \begin{pmatrix} 1 & z^{-1} \end{pmatrix} \underbrace{\begin{pmatrix} G_{00}(z^2) & G_{10}(z^2) \\ G_{01}(z^2) & G_{11}(z^2) \end{pmatrix}}_{G_p(z^2)} \underbrace{\begin{pmatrix} Y_0(z^2) \\ Y_1(z^2) \end{pmatrix}}_{y(z^2)}. \qquad (3.2.25)$$
This equation reflects that the channel signals are first upsampled by 2 (leading to
Yi (z 2 )) and then filtered by filters Gi (z) which can be written as in (3.2.24). Note
that the matrix-vector product in (3.2.25) is in z 2 and can thus be implemented
before the upsampler by 2 (replacing z 2 by z) as shown in the figure.
Note the duality between the analysis and synthesis filter banks. The former
uses a forward, the latter an inverse polyphase transform, and Gp (z) is a transpose
of H p (z). The phase reversal in the definition of the polyphase components in
analysis and synthesis comes from the fact that z and z −1 are dual operators, or,
on the unit circle, ejω = (e−jω )∗ .
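As a concrete illustration of (3.2.22), the following sketch (ours, not from the text; the D2 coefficients used are approximate, and the highpass is simply a modulated, time-reversed lowpass) checks numerically that filtering by h_i followed by downsampling by 2 gives the same subband samples as convolving the polyphase components, under the convention H_i(z) = H_{i0}(z²) + zH_{i1}(z²) for the analysis filters.

```python
# Filter-then-downsample versus polyphase computation; a sketch assuming numpy.
import numpy as np

h0 = np.array([0.4830, 0.8365, 0.2241, -0.1294])   # approximate D2 lowpass
h1 = ((-1.0) ** np.arange(4)) * h0[::-1]           # a matching highpass
x = np.random.default_rng(7).standard_normal(64)

for h in (h0, h1):
    direct = np.convolve(h, x)[::2]                # y_i[k] = (h_i * x)[2k]
    he = h[0::2]                                   # H_i0: taps h[0], h[2], ...
    ho = np.concatenate(([0.0], h[1::2]))          # H_i1: taps h[1], h[3], ... delayed
    poly = np.pad(np.convolve(he, x[0::2]), (0, 1)) + np.convolve(ho, x[1::2])
    assert np.allclose(direct, poly)
```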
Obviously the transfer function between the forward and inverse polyphase
transforms defines the analysis/synthesis filter bank. This transfer polyphase matrix
is given by
T p (z) = Gp (z) H p (z).
In order to find the input-output relationship, we use (3.2.22) as input to (3.2.25),
which yields
$$\hat{X}(z) = \begin{pmatrix} 1 & z^{-1} \end{pmatrix} G_p(z^2)\, H_p(z^2)\, x_p(z^2) = \begin{pmatrix} 1 & z^{-1} \end{pmatrix} T_p(z^2)\, x_p(z^2). \qquad (3.2.26)$$
In particular, if T_p(z) = I, then X̂(z) = X_0(z²) + z^{−1}X_1(z²) = X(z) following (2.5.20); that is, the analysis/synthesis filter bank achieves perfect reconstruction with no delay and is equivalent to Figure 3.4(a).
thus relating polyphase and modulation representations of the signal, that is, xp (z)
and xm (z). For the analysis filter bank, we have that
$$\begin{pmatrix} H_{00}(z^2) & H_{01}(z^2) \\ H_{10}(z^2) & H_{11}(z^2) \end{pmatrix} = \frac{1}{2}\begin{pmatrix} H_0(z) & H_0(-z) \\ H_1(z) & H_1(-z) \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}, \qquad (3.2.28)$$
establishing the relationship between H p (z) and H m (z). Finally, following the
definition of Gp (z) in (3.2.23) and similarly to (3.2.28) we have
$$\begin{pmatrix} G_{00}(z^2) & G_{10}(z^2) \\ G_{01}(z^2) & G_{11}(z^2) \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & z \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} G_0(z) & G_1(z) \\ G_0(-z) & G_1(-z) \end{pmatrix}. \qquad (3.2.29)$$
Again, note that (3.2.28) is the transpose of (3.2.29), with a phase change in the
diagonal matrix. The change from the polyphase to the modulation representation
(and vice versa) involves not only a diagonal matrix with a delay (or phase factor),
but also a sum and/or a difference operation (see the middle matrix in (3.2.27–
3.2.29)). This is actually a size-2 Fourier transform, as will become clear in cases
of higher dimension.
The relation between time domain and polyphase domain is most obvious for
the synthesis filters gi , since their impulse responses correspond to the first basis
functions ϕi . Consider the time-domain synthesis matrix, and create a matrix T s (z)
$$T_s(z) = \sum_{i=0}^{K-1} S_i\, z^{-i},$$
where S i are the successive 2×2 blocks along a column of the block Toeplitz matrix
(there are K of them for length 2K filters), or
$$S_i = \begin{pmatrix} g_0[2i] & g_1[2i] \\ g_0[2i+1] & g_1[2i+1] \end{pmatrix}.$$
Similarly, on the analysis side, one can form
$$T_a(z) = \sum_{i=0}^{K-1} A_i\, z^{-i},$$
with
$$A_i = \begin{pmatrix} h_0[2(K-i)-1] & h_0[2(K-i)-2] \\ h_1[2(K-i)-1] & h_1[2(K-i)-2] \end{pmatrix},$$
K being the number of 2 × 2 blocks in a row of the block Toeplitz matrix. The
above relations can be used to establish equivalences between results in the various
representations (see also Theorem 3.7 below).
In the filter bank language, perfect reconstruction means that the output is a
delayed and possibly scaled version of the input,
$$\hat{X}(z) = c\,z^{-k}\,X(z).$$
This is equivalent to saying that, up to a shift and scale, the impulse responses of the
analysis filters (with time reversal) and of the synthesis filters form a biorthogonal
basis.
Among approximate reconstructions, the most important one is alias-free re-
construction. Remember that because of the periodic time-variance of analy-
sis/synthesis filter banks, the output is both a function of x[n] and its modulated
version (−1)n x[n], or X(z) and X(−z) in the z-transform domain. The aliased
component X(−z) can be very disturbing in applications and thus cancellation of
aliasing is of prime importance. In particular, aliasing represents a nonharmonic
distortion (new sinusoidal components appear which are not harmonically related
to the input) and this is particularly disturbing in audio applications.
What follows now, are results on alias cancellation and perfect reconstruction
for the two-channel case. Note that all the results are valid for a general, N -channel
case as well (substitute N for 2 in statements and proofs).
For the first result, we need to introduce pseudocirculant matrices [311]. These
are N × N circulant matrices with elements Fij (z), except that the lower triangular
elements are multiplied by z, that is
$$F_{ij}(z) = \begin{cases} F_{0,j-i}(z), & j \ge i, \\ z\cdot F_{0,N+j-i}(z), & j < i. \end{cases}$$
PROPOSITION 3.4
Aliasing in a one-dimensional subband coding system will be cancelled if and
only if the transfer polyphase matrix T p is pseudocirculant [311].
PROOF
Consider a 2 × 2 pseudocirculant matrix
$$T_p(z) = \begin{pmatrix} F_0(z) & F_1(z) \\ z\,F_1(z) & F_0(z) \end{pmatrix},$$
A corollary to Proposition 3.4 is that for perfect reconstruction, the transfer function matrix has to be a pseudocirculant delay; that is, for an even delay 2k,
$$T_p(z) = z^{-k}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
while for an odd delay 2k+1,
$$T_p(z) = z^{-k-1}\begin{pmatrix} 0 & 1 \\ z & 0 \end{pmatrix}.$$
The next result indicates when aliasing can be cancelled for a given analysis filter
bank. Since the analysis and synthesis filter banks play dual roles, the result that
we will discuss holds for synthesis filter banks as well.
PROPOSITION 3.5
Given a two-channel filter bank downsampled by 2 with the polyphase matrix
H p (z), then alias-free reconstruction is possible if and only if the determinant
of H p (z) is not identically zero, that is, H p (z) has normal rank 2.
PROOF
Choose the synthesis polyphase matrix as the adjugate of H_p(z),
$$G_p(z) = \begin{pmatrix} H_{11}(z) & -H_{01}(z) \\ -H_{10}(z) & H_{00}(z) \end{pmatrix},$$
resulting in
$$T_p(z) = G_p(z)\, H_p(z) = \det(H_p(z)) \cdot I,$$
which is pseudocirculant, and thus cancels aliasing. If, on the other hand, the system is alias-free, then we know (see Proposition 3.4) that T_p(z) is pseudocirculant and therefore has full rank 2. Since the rank of a matrix product is bounded above by the ranks of its factors, H_p(z) has rank 2.⁴
Often, one is interested in perfect reconstruction filter banks where all filters
involved have a finite impulse response (FIR). Again, analysis and synthesis filter
banks play the same role.
⁴ Note that we excluded the case of zero reconstruction, even though technically it is also alias-free (but of zero interest!).
PROPOSITION 3.6
Given a critically sampled FIR analysis filter bank, perfect reconstruction
with FIR filters is possible if and only if det(H p (z)) is a pure delay.
PROOF
Suppose that the determinant of H_p(z) is a pure delay, and choose G_p(z) to be the adjugate of H_p(z), as in the proof of Proposition 3.5 (an FIR matrix). It is obvious that this choice leads to perfect reconstruction with FIR filters, since then T_p(z) = det(H_p(z))·I = c·z^{-l}·I. Suppose, on the other hand, that we have perfect reconstruction with FIR filters. Then, T_p(z) has to be a pseudocirculant shift (corollary below Proposition 3.4), or
$$\det(G_p(z))\,\det(H_p(z)) = c\,z^{-l},$$
meaning that the product of the determinants has only poles at z = 0 (l of them). Since the synthesis has to be FIR as well, det(G_p(z)) has only zeros (or poles at the origin). Therefore, det(H_p(z)) cannot have any zeros (except possibly at the origin or ∞).
If det(H_p(z)) has no zeros, neither does det(H_m(z)) (because of (3.2.28), and assuming FIR filters). Since det(H_m(z)) is an odd function of z with no zeros, it is of the form
$$\det(H_m(z)) = c\,z^{-2k-1}.$$
Following (3.2.17), the product ( G_0(z)  G_1(z) ) H_m(z) must be a row-vector with only the first component different from zero. One could expand ( G_0(z)  G_1(z) ) into a matrix G_m(z) by modulation, that is,
$$G_m(z) = \begin{pmatrix} G_0(z) & G_1(z) \\ G_0(-z) & G_1(-z) \end{pmatrix}. \qquad (3.2.32)$$
The matrix T m (z) is sometimes called the aliasing cancellation matrix [272].
Let us for a moment return to (3.2.14). As we said, X(−z) is the aliased version of the signal. A necessary and sufficient condition for aliasing cancellation is that
$$G_0(z)\,H_0(-z) + G_1(z)\,H_1(-z) = 0. \qquad (3.2.33)$$
The solution proposed by Croisier, Esteban and Galand [69] is known under the name QMF (quadrature mirror filters), and cancels aliasing in a two-channel filter bank:
$$H_1(z) = H_0(-z), \qquad G_0(z) = H_0(z), \qquad G_1(z) = -H_0(-z).$$
Substituting the above into (3.2.33) leads to H_0(z)H_0(−z) − H_0(−z)H_0(z) = 0, and aliasing is indeed cancelled. In order to achieve perfect reconstruction, the following has to be satisfied:
$$H_0^2(z) - H_0^2(-z) = 2z^{-l}. \qquad (3.2.37)$$
Note that the left side is an odd function of z, and thus l has to be odd. The above relation explains the name QMF: on the unit circle, H_0(−z) = H_0(e^{j(ω+π)}) is the mirror image of H_0(z) about ω = π/2, and both the filter and its mirror image appear squared. For FIR filters, the condition (3.2.37) cannot be satisfied exactly except for the Haar filters introduced in Section 3.1. Taking a causal Haar filter, H_0(z) = (1 + z^{−1})/√2, (3.2.37) becomes
$$\frac{1}{2}\left(1 + 2z^{-1} + z^{-2}\right) - \frac{1}{2}\left(1 - 2z^{-1} + z^{-2}\right) = 2z^{-1}.$$
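This little identity is quick to confirm with polynomial convolution; the snippet below (ours) represents H_0(z) by its coefficient sequence.

```python
# Haar QMF check: H0(z)^2 - H0(-z)^2 = 2 z^-1; a sketch assuming numpy.
import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2.0)            # causal Haar lowpass
h0m = h0 * np.array([1.0, -1.0])                    # coefficients of H0(-z)
print(np.convolve(h0, h0) - np.convolve(h0m, h0m))  # [0. 2. 0.], i.e. 2 z^-1
```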
For longer linear phase filters, (3.2.37) can only be approximated (see Section 3.2.4).
THEOREM 3.7
In a two-channel, biorthogonal, real-coefficient filter bank, the following are
equivalent:
(b) G0 (z)H0 (z) + G1 (z)H1 (z) = 2, and G0 (z)H0 (−z) + G1 (z)H1 (−z) = 0.
(c) T s · T a = T a · T s = I.
The proof follows from the equivalences between the various representations intro-
duced in this section and is left as an exercise (see Problem 3.4). Note that we are
assuming a critically sampled filter bank. Thus, the matrices in points (c)–(e) are
square, and left inverses are also right inverses.
or, the even shifts of synthesis filters (even shifts of time-reversed analysis filters).
We will show here that (3.2.38–3.2.40) describe orthonormal expansions, in the
general case.
⁵ The term orthogonal is often used, especially for the associated filters or filter banks. For filter banks, the terms unitary or paraunitary are also often used, as well as the notion of losslessness (see Appendix 3.A).
Orthonormality in Time Domain Start with a general filter bank as given in Fig-
ure 3.1(a). Impose orthonormality on the expansion, that is, the dual basis {ϕ̃k [n]}
becomes identical to {ϕk [n]}. In filter bank terms, the dual basis — synthesis filters
— now becomes
{g0 [n−2k], g1 [n−2k]} = {ϕ̃k [n]} = {ϕk [n]} = {h0 [2k −n], h1 [2k −n]}, (3.2.41)
or,
gi [n] = hi [−n], i = 0, 1. (3.2.42)
Thus, we have encountered the first important consequence of orthonormality: The
synthesis filters are the time-reversed versions of the analysis filters. Also, since
(3.2.41) holds and {ϕ_k} is an orthonormal set, the following are the orthogonality relations for the synthesis filters:
$$\langle g_i[n-2k],\, g_j[n-2l] \rangle = \delta[i-j]\,\delta[k-l], \qquad i, j = 0, 1, \qquad (3.2.43)$$
with a similar relation for the analysis filters. We call this an orthonormal filter bank.
Let us now see how orthonormality can be expressed using matrix notation.
First, substituting the expression for gi [n] given by (3.2.42) into the synthesis matrix
T s given in (3.2.7), we see that
T_s = T_a^T, and the biorthogonality condition (3.2.8) becomes
$$T_s\, T_a = T_a^T\, T_a = I. \qquad (3.2.44)$$
That is, the above condition means that the matrix T a is unitary. Because it is
full rank, the product commutes and we have also T a T Ta = I. Thus, having an
orthonormal basis, or perfect reconstruction with an orthonormal filter bank, is
equivalent to the analysis matrix T a being unitary.
If we now separate the outputs as was done in (3.2.9), and note that
$$G_i = H_i^T,$$
the orthogonality relations become
$$H_i\, H_j^T = \delta[i-j]\, I, \qquad i, j = 0, 1.$$
Now, the output of one channel in Figure 3.1(a) (filtering, downsampling, upsampling and filtering) is equal to M_i x, where
$$M_i = H_i^T\, H_i.$$
In terms of the blocks A_i introduced in (3.2.2), the orthonormality condition T_a^T T_a = I is equivalent to
$$\sum_{i=0}^{K-1} A_i^T A_i = I, \qquad \sum_{i=0}^{K-1} A_{i+j}^T A_i = 0, \quad j = 1, \ldots, K-1,$$
with A_i = 0 for i outside {0, …, K−1}.
Using the same arguments for the other cases in (3.2.43), we also have, in the z-transform domain,
$$G_0(z)G_0(z^{-1}) + G_0(-z)G_0(-z^{-1}) = 2, \qquad (3.2.46)$$
$$G_1(z)G_1(z^{-1}) + G_1(-z)G_1(-z^{-1}) = 2, \qquad (3.2.47)$$
$$G_0(z)G_1(z^{-1}) + G_0(-z)G_1(-z^{-1}) = 0. \qquad (3.2.48)$$
On the unit circle, (3.2.46–3.2.47) become (use G(e^{−jω}) = G*(e^{jω}) since the filters have real coefficients)
$$|G_i(e^{j\omega})|^2 + |G_i(e^{j(\omega+\pi)})|^2 = 2, \qquad i = 0, 1,$$
that is, the filter and its modulated version are power complementary (their magnitudes squared sum up to a constant). Since this condition was used in [270] for designing the first orthogonal filter banks, it is also called the Smith-Barnwell condition. Writing (3.2.46–3.2.48) in matrix form,
$$\begin{pmatrix} G_0(z^{-1}) & G_0(-z^{-1}) \\ G_1(z^{-1}) & G_1(-z^{-1}) \end{pmatrix}\begin{pmatrix} G_0(z) & G_1(z) \\ G_0(-z) & G_1(-z) \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \qquad (3.2.50)$$
that is, using the synthesis modulation matrix G_m(z) (see (3.2.32)),
$$G_m^T(z^{-1})\, G_m(z) = 2I. \qquad (3.2.51)$$
Since g_i and h_i are identical up to time reversal, a similar relation holds for the analysis modulation matrix H_m(z) (up to a transpose), or H_m(z^{-1}) H_m^T(z) = 2I.
A matrix satisfying (3.2.51) is called paraunitary (note that we have assumed
that the filter coefficients are real). If all its entries are stable (which they are in this
case, since we assumed the filters to be FIR), then such a matrix is called lossless.
The concept of losslessness comes from classical circuit theory [23, 308] and is
discussed in more detail in Appendix 3.A. It suffices to say at this point that having
a lossless transfer matrix is equivalent to the filter bank implementing an orthogonal
transform. Concentrating on lossless modulation matrices, we can continue our
analysis of orthogonal systems in the modulation domain. First, from (3.2.50) we
can see that ( G_1(z^{-1})  G_1(-z^{-1}) )^T has to be orthogonal to ( G_0(z)  G_0(-z) )^T. It will be proven in Appendix 3.A (although in the polyphase domain) that this implies that the two filters G_0(z) and G_1(z) are related as follows:
$$G_1(z) = -z^{-2K+1}\,G_0(-z^{-1}), \qquad (3.2.52)$$
where 2K is the filter length. In the polyphase domain, the orthonormality condition reads
$$G_p^T(z^{-1})\,G_p(z) = I, \qquad (3.2.53)$$
where we used (3.2.51). Since (3.2.53) also implies G_p(z) G_p^T(z^{-1}) = I (a left inverse is also a right inverse), it is clear that given a paraunitary G_p(z) corresponding to an orthogonal synthesis filter bank, we can choose the analysis filter bank with a polyphase matrix H_p(z) = G_p^T(z^{-1}) and get perfect reconstruction with no delay.
(c) T_s^T T_s = T_s T_s^T = I, with T_a = T_s^T.
Again, we used the fact that the left inverse is also the right inverse in a square
matrix in relations (c), (d) and (e). The proof follows from the relations between
the various representations, and is left as an exercise (see Problem 3.7). Note that
the theorem holds in more general cases as well. In particular, the filters do not have
to be restricted to be FIR, and if their coefficients are complex valued, transposes
have to be hermitian transposes (in the case of Gm and Gp , only the coefficients of
the filters have to be conjugated, not z since z −1 plays that role).
Because all filters are related to a single prototype satisfying (a) or (b), the
other filter in the synthesis filter bank follows by modulation, time reversal and an
odd shift (see (3.2.52)). The filters in the analysis are simply time-reversed versions
of the synthesis filters. In the FIR case, the length of the filters is even. Let us
formalize these statements:
COROLLARY 3.9
In a two-channel, orthonormal, FIR, real-coefficient filter bank, the following hold:
(a) The lowpass filter satisfies
$$|G_0(e^{j\omega})|^2 + |G_0(e^{j(\omega+\pi)})|^2 = 2.$$
(b) The lowpass and highpass filters are power complementary,
$$|G_0(e^{j\omega})|^2 + |G_1(e^{j\omega})|^2 = 2. \qquad (3.2.54)$$
(c) The highpass filter is specified (up to an even shift and a sign change) by the lowpass filter as
$$G_1(z) = -z^{-2K+1}\,G_0(-z^{-1}).$$
(d) If the lowpass filter has a zero at π, that is, G_0(−1) = 0, then
$$G_0(1) = \sqrt{2}. \qquad (3.2.55)$$
Also, an orthogonal filter bank has, as any orthogonal transform, an energy conser-
vation property:
PROPOSITION 3.10
In an orthonormal filter bank, that is, a filter bank with a unitary polyphase or modulation matrix, the energy is conserved between the input and the channel signals,
$$\|x\|^2 = \|y_0\|^2 + \|y_1\|^2. \qquad (3.2.56)$$
PROOF
The energy of the subband signals equals
$$\|y_0\|^2 + \|y_1\|^2 = \frac{1}{2\pi}\int_0^{2\pi} \left(|Y_0(e^{j\omega})|^2 + |Y_1(e^{j\omega})|^2\right) d\omega,$$
by Parseval's relation (2.4.37). Using the fact that y(z) = H_p(z) x_p(z), the right side can be written as
$$\frac{1}{2\pi}\int_0^{2\pi} y^*(e^{j\omega})\, y(e^{j\omega})\, d\omega = \frac{1}{2\pi}\int_0^{2\pi} x_p^*(e^{j\omega})\, H_p^*(e^{j\omega})\, H_p(e^{j\omega})\, x_p(e^{j\omega})\, d\omega = \frac{1}{2\pi}\int_0^{2\pi} x_p^*(e^{j\omega})\, x_p(e^{j\omega})\, d\omega = \|x_0\|^2 + \|x_1\|^2.$$
We used the fact that H_p(e^{jω}) is unitary, and Parseval's relation. Finally, (3.2.56) follows from the fact that the energy of the signal is equal to the sum of the energies of its polyphase components, ‖x‖² = ‖x_0‖² + ‖x_1‖².
Designing Orthogonal Filter Banks Now, we give two design procedures: the
first, based on spectral factorization, and the second, based on lattice structures.
Let us just note that most of the methods in the literature design analysis filters.
We will give designs for synthesis filters so as to be consistent with our approach;
however, analysis filters are easily obtained by time reversing the synthesis ones.
Designs Based on Spectral Factorizations The first solution we will show is due to
Smith and Barnwell [271]. The approach here is to find an autocorrelation se-
quence P (z) = G0 (z)G0 (z −1 ) that satisfies (3.2.46) and then to perform spectral
factorization as explained in Section 2.5.2. However, factorization becomes numeri-
cally ill-conditioned as the filter size grows, and thus, the resulting filters are usually
only approximately orthogonal.
Example 3.1
Choose p[n] as a windowed version of a perfect half-band lowpass filter,
$$p[n] = \begin{cases} w[n]\,\dfrac{\sin(\pi n/2)}{\pi n/2}, & n = -2K+1, \ldots, 2K-1, \\[4pt] 0, & \text{otherwise}, \end{cases}$$
where w[n] is a symmetric window function with w[0] = 1. Because p[2n] = δ[n], the
z-transform of p[n] satisfies
P (z) + P (−z) = 2. (3.2.57)
Also since P (z) is an approximation to a half-band lowpass filter, its spectral factor will be
such an approximation as well. Now, P (ejω ) might not be positive everywhere, in which
case it is not an autocorrelation and has to be modified. The following trick can be used
to find an autocorrelation sequence p [n] close to p[n] [271]. Find the minimum of P (ejω ),
δmin = minω [P (ejω )]. If δmin > 0, we need not do anything, otherwise, subtract it from
p[0] to get the sequence p [n] . Now,
and P (z) still satisfies (3.2.57) up to a scale factor (1 − δmin ) which can be divided out.
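A numerical rendition of this procedure might look as follows. This is a sketch under stated assumptions: we pick a Hamming window for the unspecified w[n], and split the roots of P(z) by magnitude for the spectral factorization; near-unit-circle root pairs would need more care in a production design.

```python
# Smith-Barnwell-style design by spectral factorization; a sketch assuming numpy.
import numpy as np

K = 4
n = np.arange(-2 * K + 1, 2 * K)                  # support n = -7, ..., 7
p = np.hamming(n.size) * np.sinc(n / 2.0)         # p[2k] = delta[k] survives windowing

om = np.linspace(0.0, 2.0 * np.pi, 2048, endpoint=False)
Pw = (np.exp(-1j * np.outer(om, n)) @ p).real     # P(e^jw), real by symmetry
dmin = Pw.min()
if dmin < 0:                                      # force P to be an autocorrelation
    p[n == 0] -= dmin
    p /= 1.0 - dmin                               # renormalize so that p[0] = 1

r = np.roots(p)                                   # symmetric p: roots pair as (z, 1/z)
g0 = np.real(np.poly(r[np.abs(r) < 1.0]))         # spectral factor from inside roots
g0 /= np.linalg.norm(g0)                          # so that <g0, g0> = p[0] = 1

acf = np.convolve(g0, g0[::-1])                   # should approximate p
print(np.max(np.abs(acf - p)))                    # small, but not exactly zero
```

Consistent with the remark above about ill-conditioning, the recovered autocorrelation only approximates p[n], and the approximation degrades as the filter length grows.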
Figure 3.5 Orthogonal filter designs. Magnitude responses of: (a) Smith and Barnwell filter of length 8 [271], (b) Daubechies' filter of length 8 (D4) [71], (c) Vaidyanathan and Hoang filter of length 8 [310], (d) Butterworth filter for N = 4 [133].
One family of designs starts from a product filter of the form P(z) = (1+z)^k (1+z^{-1})^k R(z), constrained to satisfy (3.2.57), where R(z) is symmetric (R(z^{-1}) = R(z)) and positive on the unit circle, R(e^{jω}) ≥ 0. Of particular interest is the case when R(z) is
of minimal degree, which turns out to be when R(z) has powers of z going from
(−k+1) to (k−1). Once the solution to this constrained problem is found, a spectral
factorization of R(z) yields the desired filter G0 (z), which has automatically k zeros
at π. As always with spectral factorization, there is a choice of taking zeros either
inside or outside the unit circle. Taking them systematically from inside the unit
circle, leads to Daubechies’ family of minimum-phase filters.
The function R(z) required so that P(z) satisfies (3.2.57) can be found by solving a system of linear equations; a closed form is also possible in the minimum-degree case [71]. Let us indicate a straightforward approach leading to a system of
linear equations. Assume the minimum-degree solution. Then P (z) has powers of
z going from (−2k + 1) to (2k − 1) and (3.2.57) puts 2k − 1 constraints on P (z).
But because P (z) is symmetric, k − 1 of them are redundant, leaving k active
constraints. Because R(z) is symmetric, it has k degrees of freedom (out of its
2k − 1 nonzero coefficients). Since P (z) is the convolution of (1 + z −1 )k (1 + z)k with
R(z), it can be written as a matrix-vector product, where the matrix contains the
impulse response of (1 + z −1 )k (1 + z)k and its shifts. Gathering the even terms of
this matrix-vector product (which correspond to the k constraints) and expressing
them in terms of the k free parameters of R(z) leads to the desired k × k system of equations. It is interesting to note that the matrix involved is never singular, and
the R(z) obtained by solving the system of equations is positive on the unit circle.
Therefore, this method automatically leads to an autocorrelation, and by spectral
factorization, to an orthogonal filter bank with filters of length 2k having k zeros
at π and 0 for the lowpass and highpass, respectively.
As an example, we will construct Daubechies' D2 filter, that is, a length-4 orthogonal filter with two zeros at ω = π (the maximum number possible for length 4).
Example 3.2
Let us choose k = 2 and construct length-4 filters. This means that
$$P(z) = (1+z)^2 (1+z^{-1})^2 R(z).$$
Now, recall that since P(z) + P(−z) = 2, all even-indexed coefficients in P(z) equal 0, except for p[0] = 1. To obtain a length-4 filter, the highest-degree term has to be z^{-3}, and thus R(z) is of the form
$$R(z) = az + b + az^{-1}. \qquad (3.2.58)$$
Multiplying out,
$$P(z) = az^3 + (4a+b)z^2 + (7a+4b)z + (8a+6b) + (7a+4b)z^{-1} + (4a+b)z^{-2} + az^{-3}.$$
Equating the coefficients of z² and z^{-2} with 0, and the one of z⁰ with 1, yields
$$4a + b = 0, \qquad 8a + 6b = 1,$$
that is,
$$a = -\frac{1}{16}, \qquad b = \frac{1}{4}, \qquad R(z) = -\frac{1}{16}z + \frac{1}{4} - \frac{1}{16}z^{-1}.$$
Factoring R(z) and assigning its root inside the unit circle, together with the two zeros at z = −1, to the causal spectral factor gives
$$G_0(z) = \frac{1}{4\sqrt{2}}(1+z^{-1})^2\left(1+\sqrt{3} + (1-\sqrt{3})z^{-1}\right) = \frac{1}{4\sqrt{2}}\left((1+\sqrt{3}) + (3+\sqrt{3})z^{-1} + (3-\sqrt{3})z^{-2} + (1-\sqrt{3})z^{-3}\right). \qquad (3.2.59)$$
Note that this lowpass filter has a double zero at z = −1 (important for constructing wavelet
bases, as will be seen in Section 4.4). A longer filter with four zeros at ω = π is shown in
Figure 3.5(b) (magnitude responses of the lowpass/highpass pair) while the impulse response
coefficients are given in Table 3.2 [71].
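The properties claimed for (3.2.59) are easy to confirm numerically; the short check below (our addition) tests the orthonormality relation (3.2.46) in the time domain and the double zero at z = −1.

```python
# Sanity checks for the D2 lowpass filter of (3.2.59); a sketch assuming numpy.
import numpy as np

s3 = np.sqrt(3.0)
g0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))

acf = np.convolve(g0, g0[::-1])                 # coefficients of G0(z) G0(1/z)
assert np.isclose(acf[3], 1.0)                  # lag 0: unit norm
assert np.allclose(acf[[1, 5]], 0.0)            # lags +-2: orthogonal to even shifts

nn = np.arange(4)
assert np.isclose(np.sum((-1.0) ** nn * g0), 0.0)        # zero at z = -1 ...
assert np.isclose(np.sum(nn * (-1.0) ** nn * g0), 0.0)   # ... of second order
```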
[Figure 3.6: Lattice structure built from 2×2 rotation blocks U_{K−1}, U_{K−2}, …, U_0 separated by delays, mapping the input polyphase component x_0 (and its companion) to the channel signals.]
That the resulting structure is paraunitary is easy to check (it is the product of
paraunitary elementary blocks). What is much more interesting is that all pa-
raunitary matrices of a given degree can be written in this form [310] (see also
Appendix 3.A.1). The lattice factorization is given in Figure 3.6.
As an example of this approach, we construct the D2 filter from the previous
example, using the lattice factorization.
Example 3.3
We construct the D2 filter, which is of length 4; thus L = 2K = 4. This means that
$$G_p(z) = \begin{pmatrix} \cos\alpha_0 & -\sin\alpha_0 \\ \sin\alpha_0 & \cos\alpha_0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}\begin{pmatrix} \cos\alpha_1 & -\sin\alpha_1 \\ \sin\alpha_1 & \cos\alpha_1 \end{pmatrix} = \begin{pmatrix} \cos\alpha_0\cos\alpha_1 - \sin\alpha_0\sin\alpha_1\, z^{-1} & -\cos\alpha_0\sin\alpha_1 - \sin\alpha_0\cos\alpha_1\, z^{-1} \\ \sin\alpha_0\cos\alpha_1 + \cos\alpha_0\sin\alpha_1\, z^{-1} & -\sin\alpha_0\sin\alpha_1 + \cos\alpha_0\cos\alpha_1\, z^{-1} \end{pmatrix}. \qquad (3.2.61)$$
⁶ By canonical we mean complete factorizations with a minimum number of free parameters. However, such factorizations are not unique in general.
We now obtain the D2 filter by imposing a second-order zero at z = −1. The first equation is G_0(−1) = 0, or
$$\cos(\alpha_0+\alpha_1) - \sin(\alpha_0+\alpha_1) = 0.$$
This equation implies that
$$\alpha_0 + \alpha_1 = k\pi + \frac{\pi}{4}.$$
Since we also know that G_0(1) = √2 (see (3.2.55)),
$$\cos(\alpha_0+\alpha_1) + \sin(\alpha_0+\alpha_1) = \sqrt{2},$$
we get that
$$\alpha_0 + \alpha_1 = \frac{\pi}{4}. \qquad (3.2.62)$$
Imposing now a zero on the derivative of G_0(e^{jω}) at ω = π, we obtain
$$\left.\frac{dG_0(e^{j\omega})}{d\omega}\right|_{\omega=\pi} = 0 \iff \cos\alpha_1\sin\alpha_0 + 2\sin\alpha_1\sin\alpha_0 + 3\sin\alpha_1\cos\alpha_0 = 0. \qquad (3.2.63)$$
Solving (3.2.62) and (3.2.63) gives
$$\alpha_0 = \frac{\pi}{3}, \qquad \alpha_1 = -\frac{\pi}{12}.$$
Substituting the angles α0 , α1 into the expression for G0 (z) (3.2.61) and comparing it to
(3.2.59), we can see that we have indeed obtained the D2 filter.
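For completeness, here is a short numerical confirmation (ours) that the angles α_0 = π/3 and α_1 = −π/12, pushed through (3.2.61), give back exactly the D2 coefficients of (3.2.59).

```python
# Lattice angles -> D2 impulse response; a sketch assuming numpy.
import numpy as np

a0, a1 = np.pi / 3.0, -np.pi / 12.0
c0, s0, c1, s1 = np.cos(a0), np.sin(a0), np.cos(a1), np.sin(a1)

# G0(z) = G00(z^2) + z^-1 G01(z^2), with G00 and G01 read off from (3.2.61).
g0 = np.array([c0 * c1, s0 * c1, -s0 * s1, c0 * s1])

s3 = np.sqrt(3.0)
d2 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
assert np.allclose(g0, d2)
```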
In the linear phase case, we design the filters H_0 and H_1 first (that is, we design the analysis part of the system, or, one of the two biorthogonal bases).
First, note that if a filter is linear phase, then it can be written as
$$H(z) = \pm z^{-L+1}\,H(z^{-1}),$$
where L is the filter length.
The right-hand side of (3.2.65) is the determinant of the polyphase matrix H p (z),
while the right-hand side of (3.2.66) is the determinant of the modulation matrix
H m (z). The synthesis filters are then equal to (see (3.2.30–3.2.31))
PROPOSITION 3.11
In a two-channel, perfect reconstruction filter bank where all filters are linear phase, the analysis filters have one of the following forms:
(a) Both filters are symmetric and of odd lengths, differing by an odd multiple of 2.
(b) One filter is symmetric and the other is antisymmetric; both lengths are even, and are equal or differ by an even multiple of 2.
(c) One filter is of odd length, the other one of even length; both have all zeros on the unit circle. Either both filters are symmetric, or one is symmetric and the other one is antisymmetric (this is a degenerate case).
The proof can be found in [319] and is left as an exercise (see Problem 3.8).
We will discuss it briefly. The idea is to consider the product polynomial P (z) =
H0 (z)H1 (−z) that has to satisfy (3.2.66). Because H0 (z) and H1 (z) (as well as
H1 (−z)) are linear phase, so is P (z). Because of (3.2.66), when P (z) has more
than two nonzero coefficients, it has to be symmetric with one central coefficient
at 2l − 1. Also, the end terms of P (z) have to be of an even index, so they cancel
in P (z) − P (−z). The above two requirements lead to the symmetry and length
constraints for cases (a) and (b). In addition, there is a degenerate case (c), of little
practical interest, when P (z) has only two nonzero coefficients,
$$P(z) = z^{-j}\left(1 \pm z^{2N-1-2j}\right),$$
which leads to zeros at odd roots of ±1. Because these are distributed among H0 (z)
and H1 (−z) (rather than H1 (z)), the resulting filters will be a poor set of lowpass
and highpass filters.
Another result that we mentioned at the beginning of this section is:
PROPOSITION 3.12
There are no two-channel perfect reconstruction, orthogonal filter banks, with
filters being FIR, linear phase, and with real coefficients (except for the Haar
filters).
PROOF
We know from Theorem 3.8 that orthonormality implies that
$$H_p(z)\,H_p^T(z^{-1}) = I.$$
We also know that in orthogonal filter banks the filters are of even length. Therefore, following Proposition 3.11, one filter is symmetric and the other one is antisymmetric. Take the symmetric one, H_0(z) for example, and use (3.2.64) to obtain
$$H_{00}(z)\,H_{00}(z^{-1}) = \frac{1}{2}.$$
Since H_{00}(z) is FIR, this forces
$$H_{00}(z) = \frac{1}{\sqrt{2}}\,z^{-l}.$$
Performing a similar analysis for H_{01}(z), we obtain that H_{01}(z) = (1/√2) z^{-k}, which, in turn, means that
$$H_0(z) = \frac{1}{\sqrt{2}}\left(z^{-2l} + z^{-2k-1}\right), \qquad H_1(z) = H_0(-z),$$
or, the only solution yields Haar filters (l = k = 0) or trivial variations thereof.
Lattice Structure for Linear Phase Filters  Unlike in the paraunitary case, there are no canonical factorizations for general polynomial matrices.⁷ But there are lattice structures that will produce, for example, linear phase perfect reconstruction filters [208, 321]. To obtain them, note that H_p(z) has to satisfy (if the filters are of the same length)
$$H_p(z) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \cdot z^{-k} \cdot H_p(z^{-1}) \cdot \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (3.2.69)$$
Here, we assume that H_i(z) = H_{i0}(z²) + z^{-1}H_{i1}(z²) in order to have causal filters. This is referred to as the linear phase testing condition (see Problem 3.9). Then, assume that H_p(z) satisfies (3.2.69) and construct H'_p(z) as
$$H'_p(z) = H_p(z)\begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}\begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}.$$
It is then easy to show that H'_p(z) satisfies (3.2.69) as well. The lattice
$$H_p(z) = C\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\prod_{i=1}^{K-1}\begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}\begin{pmatrix} 1 & \alpha_i \\ \alpha_i & 1 \end{pmatrix}, \qquad (3.2.70)$$
with
$$C = -\frac{1}{2}\prod_{i=1}^{K-1}\frac{1}{1-\alpha_i^2},$$
produces length L = 2K symmetric (lowpass) and antisymmetric (highpass) filters leading to perfect reconstruction filter banks.
Note that the structure is incomplete [321] and that |α_i| ≠ 1. Again, just as in the paraunitary lattice, perfect reconstruction is structurally guaranteed within a scale factor (in the synthesis, simply replace α_i by −α_i and pick C = 1).
⁷ There exist factorizations of polynomial matrices based on ladder steps [151], but they are not canonical like the lattice structure in (3.2.60).
Example 3.4
Let us construct filters of length 4 where the lowpass has a maximum number of zeros at z = −1 (that is, the linear phase counterpart of the D2 filter). From the cascade structure,
$$H_p(z) = \frac{-1}{2(1-\alpha^2)}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}\begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix} = \frac{-1}{2(1-\alpha^2)}\begin{pmatrix} 1+\alpha z^{-1} & \alpha+z^{-1} \\ -1+\alpha z^{-1} & -\alpha+z^{-1} \end{pmatrix},$$
and thus
$$H_0(z) = H_{00}(z^2) + z^{-1}H_{01}(z^2) = \frac{1+\alpha z^{-1}+\alpha z^{-2}+z^{-3}}{-2(1-\alpha^2)}.$$
A triple zero at z = −1 calls for α = 3, which gives
$$H_0(z) = \frac{1}{16}\left(1 + 3z^{-1} + 3z^{-2} + z^{-3}\right) = \frac{1}{16}(1+z^{-1})^3, \qquad (3.2.71)$$
which means that H_0(z) has a triple zero at z = −1. The highpass filter is equal to
$$H_1(z) = \frac{1}{16}\left(-1 - 3z^{-1} + 3z^{-2} + z^{-3}\right). \qquad (3.2.72)$$
Note that det(H_m(z)) = (1/8) z^{-3}. Following (3.2.30–3.2.31), G_0(z) = 16z³H_1(−z) and G_1(z) = −16z³H_0(−z). A causal version simply skips the z³ factor. Recall that the key to perfect reconstruction is the product P(z) = H_0(z)·H_1(−z) in (3.2.66), which equals in this case (using (3.2.71–3.2.72))
$$P(z) = \frac{1}{256}\left(-1 + 9z^{-2} + 16z^{-3} + 9z^{-4} - z^{-6}\right) = \frac{1}{256}(1+z^{-1})^4\left(-1 + 4z^{-1} - z^{-2}\right),$$
that is, the same P(z) as in Example 3.2 (within a shift and a scale factor). One can refactor this P(z) into a different set {H_0(z), H_1(−z)}, such as, for example,
$$H_0(z) = \frac{1}{16}\left(1 + 2z^{-1} + z^{-2}\right), \qquad H_1(z) = \frac{1}{16}\left(-1 - 2z^{-1} + 6z^{-2} - 2z^{-3} - z^{-4}\right),$$
that is, odd-length linear phase lowpass and highpass filters with impulse responses 1/16 [1, 2, 1] and 1/16 [−1, −2, 6, −2, −1], respectively. Table 3.3 gives the impulse response coefficients of both analysis and synthesis filters for the two cases given above.
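Both factorizations can be plugged into a complete analysis/synthesis chain. The sketch below (ours) uses the first pair (3.2.71)-(3.2.72) with the causal synthesis filters mentioned above, and confirms perfect reconstruction with a delay of three samples.

```python
# Perfect reconstruction for the linear phase pair of Example 3.4; a sketch
# assuming numpy.
import numpy as np

h0 = np.array([1.0, 3.0, 3.0, 1.0]) / 16.0       # (3.2.71)
h1 = np.array([-1.0, -3.0, 3.0, 1.0]) / 16.0     # (3.2.72)
g0 = 16.0 * ((-1.0) ** np.arange(4)) * h1        # causal 16 H1(-z): z^3 dropped
g1 = -16.0 * ((-1.0) ** np.arange(4)) * h0       # causal -16 H0(-z): z^3 dropped

x = np.random.default_rng(3).standard_normal(32)
y0 = np.convolve(h0, x)[::2]                     # analysis: filter, downsample
y1 = np.convolve(h1, x)[::2]

def up2(y):                                      # upsampling by 2
    u = np.zeros(2 * y.size)
    u[0::2] = y
    return u

xh = np.convolve(g0, up2(y0)) + np.convolve(g1, up2(y1))
assert np.allclose(xh[3:3 + x.size], x)          # x recovered, delayed by 3
```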
The above example showed again the central role played by P (z) = H0 (z) · H1 (−z).
In some sense, designing two-channel filter banks boils down to designing P (z)’s
with particular properties, and factoring them in a particular way.
If one relaxes the perfect reconstruction constraint, one can obtain some desir-
able properties at the cost of some small reconstruction error. For example, popular
QMF filters have been designed by Johnston [144], which have linear phase and “al-
most” perfect reconstruction. The idea is to approximate perfect reconstruction in
a QMF solution (see (3.2.37)) as well as possible, while obtaining a good lowpass
filter (the highpass filter H1 (z) being equal to H0 (−z), is automatically as good as
the lowpass). Therefore, define an objective function depending on two quantities:
(a) stopband attenuation error of H_0(z),
$$S = \int_{\omega_s}^{\pi} |H_0(e^{j\omega})|^2\, d\omega.$$
Note that the solution H_1(z) is not unique [32, 319]. Also, coprimeness of H_{00}(z), H_{01}(z) is equivalent to H_0(z) not having any pair of zeros at locations α and −α. This can be used to prove that the filter H_0(z) = (1+z^{−1})^N always has a complementary filter (see Problem 3.12).
Example 3.5
Consider the filter H_0(z) = (1+z^{-1})^4 = 1 + 4z^{-1} + 6z^{-2} + 4z^{-3} + z^{-4}. It can be verified that its two polyphase components are coprime, and thus, there is a complementary filter. We will find a solution to the equation
$$\det(H_p(z)) = H_{00}(z)\cdot H_{11}(z) - H_{01}(z)\cdot H_{10}(z) = z^{-1}, \qquad (3.2.73)$$
with H_{00}(z) = 1 + 6z^{-1} + z^{-2} and H_{01}(z) = 4 + 4z^{-1}. The right side of (3.2.73) was chosen so that there is a linear phase solution. For example,
$$H_{10}(z) = \frac{1}{16}\left(1 + z^{-1}\right), \qquad H_{11}(z) = \frac{1}{4},$$
is a solution to (3.2.73), that is, H_1(z) = (1 + 4z^{-1} + z^{-2})/16. This of course leads to the same P(z) as in Examples 3.3 and 3.4.
Filter banks with IIR filters have also been studied, and have found use in, for example, image coding. The main advantage of such filter banks is good frequency selectivity and low computational complexity, just as in regular IIR filtering. However, this advantage comes at a cost. Recall that in orthogonal filter banks, the synthesis filter impulse response is the time-reversed version of the analysis filter. Now, if the analysis uses causal filters (with impulse response going from 0 to +∞), then the synthesis has anticausal filters. This is a drawback from the point of view of implementation, since in general anticausal IIR filters cannot be implemented unless their impulse responses are truncated. However, a case where anticausal IIR filters can be implemented appears when the signal to be filtered is of finite length, a case encountered in image processing [234, 269]. IIR filter banks have been less popular because of this drawback, but their attractive features justify a brief treatment, given below. For more details, the reader is referred to [133].
First, return to the lattice factorization for FIR orthogonal filter banks (see (3.2.60)). If one substitutes an allpass section⁸ for the delay z^{-1} in (3.2.60), the factorization is still paraunitary. For example, instead of the diagonal matrix used in (3.2.60), take a diagonal matrix D(z) such that
$$D(z)\,D(z^{-1}) = \begin{pmatrix} F_0(z) & 0 \\ 0 & F_1(z) \end{pmatrix}\begin{pmatrix} F_0(z^{-1}) & 0 \\ 0 & F_1(z^{-1}) \end{pmatrix} = I,$$
where we have assumed that the coefficients are real, and have used two allpass sections (instead of 1 and z^{-1}). What is even more interesting is that such a factorization is complete [84].
Alternatively, recall that one of the ways to design orthogonal filter banks is to
find an autocorrelation function P (z) which is valid, that is, which satisfies
P (z) + P (−z) = 2, (3.2.74)
and then factor it into P (z) = H0 (z)H0 (z −1 ). This approach is used in [133] to
construct all possible orthogonal filter banks with rational filters. The method goes
as follows:
First, one chooses an arbitrary polynomial R(z) and forms P(z) as
$$P(z) = \frac{2R(z)R(z^{-1})}{R(z)R(z^{-1}) + R(-z)R(-z^{-1})}. \qquad (3.2.75)$$
It is easy to see that this P (z) satisfies (3.2.74). Since both the numerator and the
denominator are autocorrelations (the latter being the sum of two autocorrelations),
P (z) is as well. It can be shown that any valid autocorrelation can be written as
in (3.2.75) [133]. Then factor P(z) as H(z)H(z^{-1}) and form the lowpass filter
$$H_0(z) = A_{H_0}(z)\,H(z),$$
where A_{H_0}(z) is an arbitrary allpass; the highpass filter H_1(z) follows from H_0(z) as in the orthogonal FIR case, up to another allpass factor A_{H_1}(z),
⁸ Remember that a filter H(e^{jω}) is allpass if |H(e^{jω})| = c, c > 0, for all ω. Here we choose c = 1.
where A_{H_1}(z) is again an arbitrary allpass. The synthesis filters are then, as always in the orthonormal case, the time-reversed analysis filters, G_i(z) = H_i(z^{-1}), i = 0, 1.
ple, if R(z)R(z −1 ) is in itself a valid function, then
R(z)R(z −1 ) + R(−z)R(−z −1 ) = 2,
and by choosing AH0 , AH1 to be pure delays, the solutions obtained by the above
construction are FIR.
Example 3.6
Choose R(z) = (1+z^{-1})^N. Then (3.2.75) becomes
$$P(z) = \frac{2(1+z^{-1})^N(1+z)^N}{(z^{-1}+2+z)^N + (-z^{-1}+2-z)^N} = H(z)\,H(z^{-1}). \qquad (3.2.78)$$
These filters are the IIR counterparts of the Daubechies' filters given in Example 3.2. They are, in fact, the Nth order half-band digital Butterworth filters [211] (see also Example 2.2). That these particular filters satisfy the conditions for orthogonality was also pointed out in [269]. The Butterworth filters are known to be the maximally flat IIR filters of a given order.
Choose N = 5; then P(z) equals
$$P(z) = \frac{(1+z)^5(1+z^{-1})^5}{10z^4 + 120z^2 + 252 + 120z^{-2} + 10z^{-4}}.$$
In this case, we can obtain a closed form spectral factorization of P (z), which leads to
1 + 5z −1 + 10z −2 + 10z −3 + 5z −4 + z −5
H0 (z) = √ , (3.2.79)
2(1 + 10z −2 + 5z −4 )
1 − 5z + 10z 2 − 10z 3 + 5z 4 − z 5
H1 (z) = z −1 √ . (3.2.80)
2(1 + 10z 2 + 5z 4 )
For the purposes of implementation, it is necessary to factor H_i(z) into stable causal (poles inside the unit circle) and anticausal (poles outside the unit circle) parts. For comparison with earlier designs, where length-8 FIR filters were designed, we show in Figure 3.5(d) the magnitude responses of H_0(e^{jω}) and H_1(e^{jω}) for N = 4. The form of P(z) is then
$$P(z) = \frac{z^{-4}(1+z)^4(1+z^{-1})^4}{1 + 28z^{-2} + 70z^{-4} + 28z^{-6} + z^{-8}}.$$
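The power complementarity of these Butterworth solutions can be checked directly on a frequency grid; the snippet below (ours) evaluates (3.2.79) as a ratio of polynomials in e^{−jω}.

```python
# |H0(e^jw)|^2 + |H0(e^j(w+pi))|^2 = 2 for the N = 5 Butterworth half-band
# filter (3.2.79); a sketch assuming numpy.
import numpy as np

b = np.array([1.0, 5.0, 10.0, 10.0, 5.0, 1.0]) / np.sqrt(2.0)  # numerator taps
a = np.array([1.0, 0.0, 10.0, 0.0, 5.0])                       # 1 + 10 z^-2 + 5 z^-4

def H0(w):
    zi = np.exp(-1j * w)                           # z^-1 on the unit circle
    return np.polyval(b[::-1], zi) / np.polyval(a[::-1], zi)

w = np.linspace(0.0, np.pi, 513)
assert np.allclose(np.abs(H0(w))**2 + np.abs(H0(w + np.pi))**2, 2.0)
```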
As we pointed out in Proposition 3.12, there are no real FIR orthogonal sym-
metric/antisymmetric filter banks. However, if we allow IIR filters instead, then
solutions do exist. There are two cases, depending on whether the center of symmetry/antisymmetry is at a half integer (such as in an even-length FIR linear phase filter) or at an integer (such as in the odd-length FIR case). We will only consider the former case. For a discussion of the latter case, as well as further details, see [133].
It can be shown that the polyphase matrix for an orthogonal, half-integer symmetric/antisymmetric filter bank is necessarily of the form
$$H_p(z) = \begin{pmatrix} A(z) & z^{-l}A(z^{-1}) \\ -z^{l-n}A(z) & z^{-n}A(z^{-1}) \end{pmatrix},$$
where A(z) is an allpass filter; one such solution is
$$A(z) = \frac{1 + 6z^{-1} + (15/7)z^{-2}}{(15/7) + 6z^{-1} + z^{-2}}. \qquad (3.2.82)$$
This particular solution will prove useful in the construction of wavelets (see Sec-
tion 4.6.2). Again, for the purposes of implementation, one has to implement stable
causal and anticausal parts separately.
Remarks The main advantage of IIR filters is their good frequency selectivity and
low computational complexity. The price one pays, however, is the fact that the
filters become noncausal. For the sake of discussion, assume a finite-length signal,
and a causal analysis filter, which will be followed by an anticausal synthesis filter.
The output will be infinite even though the input is of finite length. One can take
care of this problem in two ways. Either one stores the state of the filters after
the end of the input signal and uses this as an initial state for the synthesis filters
[269], or one takes advantage of the fact that the outputs of the analysis filter bank
decay rapidly after the input is zero, and stores only a finite extension of these
signals. While the former technique is exact, the latter is usually a good enough
approximation. This short discussion indicates that the implementation of IIR filter
banks is less straightforward than that of their FIR counterparts, and explains their
lesser popularity.
[Figure 3.7: Octave-band, tree-structured filter bank over J stages. (a) Analysis: at each stage, the lowpass output is split again by the filters H_0, H_1, each followed by downsampling by 2. (b) Synthesis: each stage upsamples by 2 and interpolates with G_0, G_1, combining the highpass channels W_1, W_2, …, W_J with the final lowpass channel V_J to form x̂.]
Example 3.7
Consider what happens if the filters gi [n] from Figure 3.7(a)-(b) are Haar filters defined in
z-transform domain as
$$G_0(z) = \frac{1}{\sqrt{2}}\left(1 + z^{-1}\right), \qquad G_1(z) = \frac{1}{\sqrt{2}}\left(1 - z^{-1}\right).$$
Take, for example, J = 3, that is, we will use three two-channel filter banks. Then, using
the multirate identity which says that G(z) followed by upsampling by 2 is equivalent to
upsampling by 2 followed by G(z 2 ) (see Section 2.5.3), we can transform this filter bank
into a four-channel one as given in Figure 3.8. The equivalent filters are
$$G_1^{(1)}(z) = G_1(z) = \frac{1}{\sqrt{2}}\left(1 - z^{-1}\right),$$
$$G_1^{(2)}(z) = G_0(z)\,G_1(z^2) = \frac{1}{2}\left(1 + z^{-1} - z^{-2} - z^{-3}\right),$$
$$G_1^{(3)}(z) = G_0(z)\,G_0(z^2)\,G_1(z^4) = \frac{1}{2\sqrt{2}}\left(1 + z^{-1} + z^{-2} + z^{-3} - z^{-4} - z^{-5} - z^{-6} - z^{-7}\right),$$
$$G_0^{(3)}(z) = G_0(z)\,G_0(z^2)\,G_0(z^4) = \frac{1}{2\sqrt{2}}\left(1 + z^{-1} + z^{-2} + z^{-3} + z^{-4} + z^{-5} + z^{-6} + z^{-7}\right).$$
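The multirate identity makes these equivalent filters one-line computations; the sketch below (ours) builds them by convolving with upsampled impulse responses.

```python
# Equivalent octave-band Haar filters via the multirate identity; a sketch
# assuming numpy. up(h, m) realizes h(z) -> h(z^m).
import numpy as np

g0 = np.array([1.0, 1.0]) / np.sqrt(2.0)           # Haar lowpass
g1 = np.array([1.0, -1.0]) / np.sqrt(2.0)          # Haar highpass

def up(h, m):
    u = np.zeros(m * (h.size - 1) + 1)
    u[::m] = h
    return u

geq2 = np.convolve(g0, up(g1, 2))                  # G0(z) G1(z^2)
geq3 = np.convolve(np.convolve(g0, up(g0, 2)), up(g1, 4))
low3 = np.convolve(np.convolve(g0, up(g0, 2)), up(g0, 4))

print(2.0 * geq2)                 # [ 1.  1. -1. -1.]
print(2.0 * np.sqrt(2.0) * geq3)  # [ 1.  1.  1.  1. -1. -1. -1. -1.]
print(2.0 * np.sqrt(2.0) * low3)  # eight ones
```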
⁹ This is also sometimes called a discrete-time wavelet transform in the literature.
Figure 3.8 Octave-band synthesis filter bank with Haar filters and three stages, obtained by transforming the filter bank from Figure 3.7(b) using the multirate identity for filtering followed by upsampling. The channels y_0(n), y_1(n), y_2(n), y_3(n) are upsampled by 2, 4, 8 and 8, respectively, filtered by (1/√2)(1, −1), (1/2)(1, 1, −1, −1), (1/(2√2))(1, 1, 1, 1, −1, −1, −1, −1) and (1/(2√2))(1, 1, 1, 1, 1, 1, 1, 1), and summed to form x(n).
The equivalent filter g_0^{(3)}[n] corresponds to three stages of lowpass filters g_0[n], each preceded by upsampling by 2. It can be defined recursively as (we give it in the z-domain for simplicity)
$$G_0^{(3)}(z) = G_0(z)\,G_0^{(2)}(z^2) = \prod_{k=0}^{2} G_0(z^{2^k}).$$
Note that this implies that G_0^{(1)}(z) = G_0(z). On the other hand, we denote by g_1^{(i)}[n] the equivalent filter corresponding to highpass filtering followed by (i−1) stages of lowpass filtering, each again preceded by upsampling by 2. It can be defined recursively as
$$G_1^{(i)}(z) = G_1(z^{2^{i-1}})\prod_{k=0}^{i-2} G_0(z^{2^k}).$$
Since this is an orthonormal system, the time-domain matrices representing analysis and synthesis are just transposes of each other. Thus the analysis matrix T_a representing the actions of the filters h_1^{(1)}[n], h_1^{(2)}[n], h_1^{(3)}[n], h_0^{(3)}[n] contains as lines the impulse responses of g_1^{(1)}[n], g_1^{(2)}[n], g_1^{(3)}[n], and g_0^{(3)}[n], or of h_i^{(j)}[−n], since analysis and synthesis filters are linked by time reversal. The matrix T_a is block-diagonal,
$$T_a = \begin{pmatrix} \ddots & & & \\ & A_0 & & \\ & & A_0 & \\ & & & \ddots \end{pmatrix}, \qquad (3.3.1)$$
Now that we have seen how it works in a simple case, we take more general
filters gi [n], and a number of stages J. We concentrate on the orthonormal case
(the biorthogonal one would follow similarly). In an orthonormal octave-band filter
bank with J stages, the equivalent filters (basis functions) are given by (again we
give them in the z-domain for simplicity)
$$G_0^{(J)}(z) = G_0(z)\,G_0^{(J-1)}(z^2) = \prod_{K=0}^{J-1} G_0(z^{2^K}), \qquad (3.3.3)$$
$$G_1^{(j)}(z) = G_0(z)\,G_1^{(j-1)}(z^2) = G_1(z^{2^{j-1}})\prod_{K=0}^{j-2} G_0(z^{2^K}), \qquad j = 1, \ldots, J. \qquad (3.3.4)$$
In the time domain, each of the outputs in Figure 3.7(a) can be described as
$$H_1\,H_0^{j-1}\,x, \qquad j = 1, \ldots, J,$$
except for the last, which is obtained by
$$H_0^J\,x.$$
Here, the time-domain matrices H 0 , H 1 are as defined in Section 3.2.1, that is,
each line is an even shift of the impulse response of gi [n], or equivalently, of hi [−n].
Since each stage in the analysis bank is orthonormal and invertible, the overall
scheme is as well. Thus, we get a unitary analysis matrix T a by interleaving the
rows of H_1, H_1H_0, …, H_1H_0^{J−1}, H_0^J, as was done in (3.3.1–3.3.2). A formal
0 , H J0 , as was done in (3.3.1–3.3.2). A formal
proof of this statement will be given in Section 3.3.2 under orthogonality of basis
functions.
Example 3.8
Let us go back to the Haar case and three stages. We can form the matrices H_1, H_1H_0, H_1H_0², H_0³ as
$$H_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} & \vdots & \vdots & \vdots & \vdots & \\ \cdots & 1 & -1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 1 & -1 & \cdots \\ & \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \qquad (3.3.5)$$
$$H_0 = \frac{1}{\sqrt{2}}\begin{pmatrix} & \vdots & \vdots & \vdots & \vdots & \\ \cdots & 1 & 1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 1 & 1 & \cdots \\ & \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \qquad (3.3.6)$$
$$H_1H_0 = \frac{1}{2}\begin{pmatrix} & \vdots & & & & & & & \vdots & \\ \cdots & 1 & 1 & -1 & -1 & 0 & 0 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 0 & 0 & 1 & 1 & -1 & -1 & \cdots \\ & \vdots & & & & & & & \vdots & \end{pmatrix}, \qquad (3.3.7)$$
$$H_1H_0^2 = \frac{1}{2\sqrt{2}}\begin{pmatrix} & \vdots & & & & & & & & & \vdots & \\ \cdots & 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & \cdots \\ & \vdots & & & & & & & & & \vdots & \end{pmatrix}, \qquad (3.3.8)$$
$$H_0^3 = \frac{1}{2\sqrt{2}}\begin{pmatrix} & \vdots & & & & & & & & & \vdots & \\ \cdots & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & \cdots \\ \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & \cdots \\ & \vdots & & & & & & & & & \vdots & \end{pmatrix}. \qquad (3.3.9)$$
Now, it is easy to see that by interleaving (3.3.5–3.3.9) we obtain the matrix T a as in (3.3.1–
3.3.2). To check that it is unitary, it is enough to check that A_0 is unitary (which it is; just compute the product A_0 A_0^T).
Until now, we have concentrated on the orthonormal case. If one relaxes the orthonormality constraint, one obtains a biorthogonal tree-structured filter bank. Now, h_i[n] and g_i[n] are not related by simple time reversal, but are impulse responses of a biorthogonal perfect reconstruction filter bank. We therefore have both equivalent synthesis filters g_1^{(j)}[n − 2^j k], g_0^{(J)}[n − 2^J k] as given in (3.3.3–3.3.4), and analysis filters h_1^{(j)}[n − 2^j k], h_0^{(J)}[n − 2^J k], which are defined similarly. Therefore, if the individual two-channel filter banks are biorthogonal (perfect reconstruction), then the overall scheme is as well. The proof of this statement follows the proof for the orthonormal case (see Section 3.3.2 for the discrete-time wavelet series case), and is left as an exercise to the reader.
where
$$X^{(1)}[2k] = \langle h_0^{(1)}[2^1 k - l],\, x[l] \rangle, \qquad X^{(1)}[2k+1] = \langle h_1^{(1)}[2^1 k - l],\, x[l] \rangle,$$
are the convolutions of the input with h_0[n] and h_1[n] evaluated at even indexes 2k. In these equations, h_i^{(1)}[n] = h_i[n] and g_i^{(1)}[n] = g_i[n]. In an octave-band filter bank or discrete-time wavelet series, the lowpass channel is further split by lowpass/highpass filtering and downsampling. Then, the first term on the right side of (3.3.10) remains unchanged, while the second can be expressed as
$$\sum_{k\in\mathbb{Z}} X^{(1)}[2k]\, h_0^{(1)}[2^1 k - n] = \sum_{k\in\mathbb{Z}} X^{(2)}[2k+1]\, g_1^{(2)}[n - 2^2 k] + \sum_{k\in\mathbb{Z}} X^{(2)}[2k]\, g_0^{(2)}[n - 2^2 k], \qquad (3.3.11)$$
where
$$X^{(2)}[2k] = \langle h_0^{(2)}[2^2 k - l],\, x[l] \rangle, \qquad X^{(2)}[2k+1] = \langle h_1^{(2)}[2^2 k - l],\, x[l] \rangle,$$
that is, we applied (3.3.10) once more. In the above, the basis functions g^{(i)}[n] are as defined in (3.3.3) and (3.3.4). In other words, g_0^{(2)}[n] is the time-domain version of
$$G_0^{(2)}(z) = G_0(z)\,G_0(z^2),$$
while g_1^{(2)}[n] is the time-domain version of
$$G_1^{(2)}(z) = G_0(z)\,G_1(z^2).$$
[Figure 3.9: Dyadic grid of the discrete-time wavelet series, indicating the time locations (n = 0, 1, …, 16) of the basis functions g_1^{(1)}, g_1^{(2)}, g_1^{(3)}, g_1^{(4)}, and g_0^{(4)} and their shifts.]
Repeating the process in (3.3.12) $J$ times, one obtains the discrete-time wavelet series over $J$ octaves, plus the final octave containing the lowpass version. Thus, (3.3.12) becomes
$$x[n] = \sum_{j=1}^{J} \sum_{k\in\mathbb{Z}} X^{(j)}[2k+1]\, g_1^{(j)}[n-2^j k] + \sum_{k\in\mathbb{Z}} X^{(J)}[2k]\, g_0^{(J)}[n-2^J k], \qquad (3.3.13)$$
where
$$X^{(j)}[2k+1] = \langle h_1^{(j)}[2^j k - l],\, x[l] \rangle, \qquad j = 1, \ldots, J, \qquad (3.3.14)$$
$$X^{(J)}[2k] = \langle h_0^{(J)}[2^J k - l],\, x[l] \rangle.$$
In (3.3.13) the sequence $g_1^{(j)}[n]$ is the time-domain version of (3.3.4), while $g_0^{(J)}[n]$ is the time-domain version of (3.3.3), and $h_i^{(j)}[n] = g_i^{(j)}[-n]$. Because any input sequence can be decomposed as in (3.3.13), the family of functions $\{g_1^{(j)}[2^j k - n],\, g_0^{(J)}[2^J k - n]\}$, $j = 1, \ldots, J$, and $k, n \in \mathbb{Z}$, is an orthonormal basis for $l_2(\mathbb{Z})$.
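As an illustration, the following sketch implements the analysis side of (3.3.13)–(3.3.14) for the Haar filters, with circular extension at the borders (an assumption of ours for finite-length inputs), and checks the energy conservation that orthonormality implies:

import numpy as np

def dwt(x, g0, g1, J):
    # octave-band analysis over J octaves; h_i[n] = g_i[-n] is implemented
    # by correlating with g_i; circular extension at the border
    coeffs, low = [], x
    for _ in range(J):
        ext = np.concatenate([low, low[:len(g0) - 1]])
        half = len(low) // 2
        coeffs.append(np.array([g1 @ ext[2*k:2*k+len(g1)] for k in range(half)]))
        low = np.array([g0 @ ext[2*k:2*k+len(g0)] for k in range(half)])
    coeffs.append(low)                     # final lowpass channel X^(J)
    return coeffs

g0 = np.array([1.0, 1.0]) / np.sqrt(2)     # Haar filters
g1 = np.array([1.0, -1.0]) / np.sqrt(2)
x = np.random.default_rng(0).standard_normal(32)
c = dwt(x, g0, g1, J=3)
# orthonormality implies energy conservation across the J+1 channels
assert np.isclose(sum(float(np.sum(b ** 2)) for b in c), float(np.sum(x ** 2)))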
Note the special sampling used in the discrete-time wavelet series. Each subsequent channel is downsampled by 2 with respect to the previous one and has half its sampling rate, resulting in the dyadic grid shown in Figure 3.9.
Linearity Since the discrete-time wavelet series involves inner products or convolutions (which are linear operators), it is obviously linear.
Shift Recall that multirate systems are not shift-invariant in general, and two-channel filter banks downsampled by 2 are shift-invariant with respect to even shifts only. Therefore, it is intuitive that a $J$-octave discrete-time wavelet series will be invariant under shifts by multiples of $2^J$. A visual interpretation follows from the fact that the dyadic grid in Figure 3.9, when moved by $k2^J$, will overlap with itself, whereas it will not if the shift is a noninteger multiple of $2^J$.
PROPOSITION 3.14
In a discrete-time wavelet series expansion over $J$ octaves, if
$$x[l] \longleftrightarrow X^{(j)}[2k+1],$$
then
$$x[l - m2^J] \longleftrightarrow X^{(j)}[2(k - m2^{J-j}) + 1].$$
PROOF
If $y[l] = x[l - m2^J]$, then its transform is, following (3.3.14),
$$Y^{(j)}[2k+1] = \langle h_1^{(j)}[2^j k - l],\, x[l - m2^J] \rangle = \langle h_1^{(j)}[2^j k - m2^J - l],\, x[l] \rangle = X^{(j)}[2(k - m2^{J-j}) + 1].$$
Very similarly, one proves for the lowpass channel that, when $x[l]$ produces $X^{(J)}[2k]$, then $x[l - m2^J]$ leads to $X^{(J)}[2(k - m)]$.
Orthogonality We have mentioned before that $g_0^{(J)}[n]$ and $g_1^{(j)}[n]$, $j = 1, \ldots, J$, with appropriate shifts, form an orthonormal family of functions (see [274]). This stems from the fact that we have used two-channel orthogonal filter banks, for which we know that
$$\langle g_i[n - 2k],\, g_j[n - 2l] \rangle = \delta[i - j]\, \delta[k - l].$$
PROPOSITION 3.15
In a discrete-time wavelet series expansion, the following orthogonality relations hold:
$$\langle g_0^{(J)}[n - 2^J k],\, g_0^{(J)}[n - 2^J l] \rangle = \delta[k - l], \qquad (3.3.15)$$
$$\langle g_1^{(j)}[n - 2^j k],\, g_1^{(i)}[n - 2^i l] \rangle = \delta[i - j]\, \delta[k - l], \qquad (3.3.16)$$
$$\langle g_0^{(J)}[n - 2^J k],\, g_1^{(j)}[n - 2^j l] \rangle = 0. \qquad (3.3.17)$$
PROOF
We will prove only (3.3.15) here, while (3.3.16) and (3.3.17) are left as an exercise to the reader (see Problem 3.15). We prove (3.3.15) by induction.
It will be convenient to work with the z-transform of the autocorrelation of the filter $G_0^{(j)}(z)$, which we call $P^{(j)}(z)$ and which equals
$$P^{(j)}(z) = G_0^{(j)}(z)\, G_0^{(j)}(z^{-1}).$$
Recall that because of the orthogonality of $g_0[n]$ with respect to even shifts, we have that
$$P^{(1)}(z) + P^{(1)}(-z) = 2,$$
or, equivalently, that the polyphase decomposition of $P^{(1)}(z)$ is of the form
$$P^{(1)}(z) = 1 + z P_1^{(1)}(z^2).$$
This is the initial step for our induction. Now, assume that $g_0^{(j)}[n]$ is orthogonal to its translates by $2^j$. Therefore, the polyphase decomposition of its autocorrelation can be written as
$$P^{(j)}(z) = 1 + \sum_{i=1}^{2^j - 1} z^i P_i^{(j)}(z^{2^j}).$$
Now, because of the recursion (3.3.3), the autocorrelation of $G_0^{(j+1)}(z)$ equals
$$P^{(j+1)}(z) = P^{(j)}(z)\, P^{(1)}(z^{2^j}).$$
We need to verify that the 0th polyphase component of $P^{(j+1)}(z)$ is equal to 1, or that the coefficients of all powers of $z$ that are nonzero multiples of $2^{j+1}$ vanish. Out of the four products that appear when multiplying out the above right-hand side, only the product involving the two polyphase-component sums needs to be considered,
$$\sum_{i=1}^{2^j - 1} z^i P_i^{(j)}(z^{2^j}) \cdot z^{2^j} P_1^{(1)}(z^{2^{j+1}}).$$
The powers of $z$ appearing in the above product are of the form $l = i + k2^j + 2^j + m2^{j+1}$, where $i = 1, \ldots, 2^j - 1$ and $k, m \in \mathbb{Z}$. Thus, $l$ cannot be a multiple of $2^{j+1}$, and we have shown that
$$P^{(j+1)}(z) = 1 + \sum_{i=1}^{2^{j+1} - 1} z^i P_i^{(j+1)}(z^{2^{j+1}}),$$
which completes the induction.
Since the expansion is orthonormal, energy is conserved (Parseval's formula):
$$\|x[n]\|^2 = \sum_{k\in\mathbb{Z}} \Big( |X^{(J)}[2k]|^2 + \sum_{j=1}^{J} |X^{(j)}[2k+1]|^2 \Big).$$
A multiresolution interpretation follows. Start with the space of all square-summable sequences,
$$V_0 = l_2(\mathbb{Z}), \qquad (3.3.18)$$
and build the embedded lowpass approximation spaces
$$V_J \subset \cdots \subset V_2 \subset V_1 \subset V_0. \qquad (3.3.19)$$
Since every $V_j$ is contained in $V_0$, we trivially have
$$\bigcup_{j=0}^{J} V_j = V_0 = l_2(\mathbb{Z}).$$
Each approximation space is split into a coarser approximation space plus a detail space,
$$V_j = V_{j+1} \oplus W_{j+1},$$
with $V_{j+1} \perp W_{j+1}$, where $\oplus$ denotes the direct sum (see Section 2.2.2). Assume that there exists a sequence $g_0[n] \in V_0$ such that
$$\{g_0[n - 2k]\}_{k\in\mathbb{Z}}$$
is a basis for $V_1$. Then, it can be shown that there exists a sequence $g_1[n] \in V_0$ such that
$$\{g_1[n - 2k]\}_{k\in\mathbb{Z}}$$
is a basis for $W_1$; such a sequence is given by the orthogonal highpass construction of Section 3.2.3. Iterating the split over $J$ octaves yields
$$V_0 = W_1 \oplus W_2 \oplus \cdots \oplus W_J \oplus V_J, \qquad (3.3.22)$$
Figure 3.10 Ideal division of the spectrum by the discrete-time wavelet series using sinc filters. Note that the spectra are symmetric around zero. Division into $V_i$ spaces (note how $V_i \subset V_{i-1}$) and resulting $W_i$ spaces: on $[0, \pi]$, $V_J$ occupies $[0, \pi/2^J]$, $W_J$ occupies $[\pi/2^J, \pi/2^{J-1}]$, and so on, up to $W_1 = [\pi/2, \pi]$. (Actually, $V_j$ and $W_j$ are of height $2^{j/2}$, so they have unit norm.)
Because we deal with ideal filters, there is an obvious frequency interpretation. However, one has to be careful with the boundaries between intervals. With our definition of $g_0[n]$ and $g_1[n]$, $\cos((\pi/2)n)$ belongs to $V_1$ while $\sin((\pi/2)n)$ belongs to $W_1$.
Denote the equivalent filters by $g_i^{(j)}[n]$, $i = 0, \ldots, 2^j - 1$. In other words, $g_i^{(j)}$ is the $i$th equivalent filter going through one of the possible paths of length $j$. The ordering is somewhat arbitrary, and we will choose the one corresponding to a full tree with a lowpass in the lower branch of each fork, and start numbering from the bottom.
Example 3.10
Let us find all equivalent filters in Figure 3.11, that is, the filters corresponding to depth-1 and depth-2 trees. Since we will be interested in the basis functions, we consider the synthesis filter banks. For simplicity, we work in the z-domain:
$$G_0^{(1)}(z) = G_0(z), \qquad G_1^{(1)}(z) = G_1(z),$$
$$G_0^{(2)}(z) = G_0(z)\, G_0(z^2), \qquad G_1^{(2)}(z) = G_0(z)\, G_1(z^2), \qquad (3.3.23)$$
$$G_2^{(2)}(z) = G_1(z)\, G_0(z^2), \qquad G_3^{(2)}(z) = G_1(z)\, G_1(z^2). \qquad (3.3.24)$$
Note that with the ordering chosen in (3.3.23–3.3.24), increasing index does not always correspond to increasing frequency. It can be verified that for ideal filters, $G_2^{(2)}(e^{j\omega})$ covers the range $[3\pi/4, \pi]$, while $G_3^{(2)}(e^{j\omega})$ covers the range $[\pi/2, 3\pi/4]$ (see Problem 3.16). Besides the identity basis, which corresponds to the no-split situation, we have four possible orthonormal bases, corresponding to the four trees in Figure 3.11. Thus, we have a family $\mathcal{W} = \{W_0, W_1, W_2, W_3, W_4\}$, where $W_4$ is simply $\{\delta[n-k]\}_{k\in\mathbb{Z}}$. For example, the basis generated by the full depth-2 tree is
$$W_0 = \{g_0^{(2)}[n - 2^2 k],\, g_1^{(2)}[n - 2^2 k],\, g_2^{(2)}[n - 2^2 k],\, g_3^{(2)}[n - 2^2 k]\}_{k\in\mathbb{Z}},$$
and $W_1$, $W_2$, $W_3$ are obtained similarly from the other trees.
This small example should have given the intuition behind orthonormal bases generated from tree-structured filter banks. In the general case, with filter banks of depth $J$, it can be shown that, counting the no-split tree, the number of orthonormal bases satisfies
$$M_J = M_{J-1}^2 + 1. \qquad (3.3.25)$$
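For instance, starting from $M_1 = 2$ (split or no split at depth 1), the recursion (3.3.25) grows extremely fast:

# counting bases via (3.3.25), starting from M_1 = 2 (split or no split)
M = 2
for J in range(2, 7):
    M = M * M + 1
    print(J, M)      # 2 5, 3 26, 4 677, 5 458330, 6 210066388901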
Among this myriad of bases, there is the STFT-like basis, given by
$$W_0 = \{g_0^{(J)}[n - 2^J k], \ldots, g_{2^J - 1}^{(J)}[n - 2^J k]\}_{k\in\mathbb{Z}}, \qquad (3.3.26)$$
and the discrete-time wavelet series basis, given by
$$W_1 = \{g_1^{(1)}[n - 2k],\, g_1^{(2)}[n - 2^2 k], \ldots,\, g_1^{(J)}[n - 2^J k],\, g_0^{(J)}[n - 2^J k]\}_{k\in\mathbb{Z}}. \qquad (3.3.27)$$
It can be shown that the sets of basis functions in (3.3.26) and (3.3.27), as well as
in all other bases generated by the filter bank tree, are orthonormal (for example,
along the lines of the proof in the discrete-time wavelet series case). However, this
would be quite cumbersome. A more immediate proof is sketched here. Note that
we have a perfect reconstruction system by construction, and that the synthesis
and the analysis filters are related by time reversal. That is, the inverse operator
of the analysis filter bank (whatever its particular structure) is its transpose, or
equivalently, the overall filter bank is orthonormal. Therefore, the impulse responses
of all equivalent filters and their appropriate shifts form an orthonormal basis for
l2 (Z).
It is interesting to consider the time-frequency analysis performed by various
filter banks. This is shown schematically in Figure 3.12 for three particular cases
of binary trees. Note the different trade-offs in time and frequency resolutions.
Figure 3.13 shows a dynamic time-frequency analysis, where the time and fre-
quency resolutions are modified as time evolves. This is achieved by modifying the
frequency split on the fly [132], and can be used for signal compression as discussed
in Section 7.3.4.
Figure 3.12 Time-frequency analysis achieved by different binary subband trees; the trees are on the bottom, the time-frequency tilings (time $t$ versus frequency $f$) on top. (a) Full tree or STFT. (b) Octave-band tree or wavelet series. (c) Arbitrary tree or one possible wavelet packet.
We can analyze such filter banks in a manner similar to Section 3.2. Therefore, the channel signals, after filtering and sampling, can be expressed as
$$\begin{pmatrix} \vdots \\ y_0[0] \\ \vdots \\ y_{N-1}[0] \\ y_0[1] \\ \vdots \\ y_{N-1}[1] \\ \vdots \end{pmatrix} = \begin{pmatrix} \ddots & \vdots & \vdots & \\ \cdots & A_0 & 0 & \cdots \\ \cdots & 0 & A_0 & \cdots \\ & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} \vdots \\ x[0] \\ x[1] \\ \vdots \end{pmatrix}, \qquad (3.4.1)$$
[Figure 3.14: $N$-channel filter bank. Each analysis filter $H_i$ is followed by downsampling by $N$, yielding the channel signal $y_i$; each channel is upsampled by $N$, filtered by the synthesis filter $G_i$, and the channels are summed to give $\hat{x}$.]
that is, we obtained the orthonormality relations for this case. Denoting by $\varphi_{kN+i}[n] = g_i[n - kN]$, we have that the set of basis functions $\{\varphi_{kN+i}[n]\} = \{g_0[n - kN], g_1[n - kN], \ldots, g_{N-1}[n - kN]\}$, with $i = 0, \ldots, N-1$ and $k \in \mathbb{Z}$, is an orthonormal basis for $l_2(\mathbb{Z})$.
Generalizations What we have seen in these two simple cases is how to obtain $N$-channel filter banks with filters of length $N$ (block transforms) and filters of length $2N$ (lapped orthogonal transforms). It is obvious that by allowing longer filters, or more blocks $A_i$ in (3.4.4), we can obtain general $N$-channel filter banks. Defining the synthesis matrix as in (3.2.7), we obtain the basis functions of the dual basis,
$$\tilde{\varphi}_{Nk+i}[n] = g_i[n - Nk],$$
and they satisfy the following biorthogonality relation:
$$T_s\, T_a = I.$$
As was done in Section 3.2, we can define single operators for each branch. If the operator $H_i$ represents filtering by $h_i$ followed by downsampling by $N$, its matrix representation is
$$H_i = \begin{pmatrix} \ddots & & & & & \\ \cdots & h_i[L-1] & \cdots & h_i[L-N] & h_i[L-N-1] & \cdots \\ \cdots & 0 & \cdots & 0 & h_i[L-1] & \cdots \\ & & & & & \ddots \end{pmatrix},$$
and perfect reconstruction can be written as
$$\sum_{i=0}^{N-1} G_i^T\, H_i = I.$$
We leave the details and proofs of the above relationships as an exercise (Problem
3.21), since they are simple extensions of the two-channel case seen in Section 3.2.
Recall that downsampling by $N$ followed by upsampling by $N$ transforms $X(z)$ into
$$Y(z) = \frac{1}{N} \sum_{i=0}^{N-1} X(W_N^i z), \qquad W_N = e^{-j2\pi/N}, \quad j = \sqrt{-1},$$
because of the orthogonality of the roots of unity.
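This modulation-domain identity is easy to verify numerically on a finite circular signal using the DFT (a sketch of our own; downsample-then-upsample averages the $N$ modulated spectra):

import numpy as np

rng = np.random.default_rng(1)
N, L = 4, 64                       # downsampling factor; signal length (N divides L)
x = rng.standard_normal(L)

v = np.zeros(L)
v[::N] = x[::N]                    # downsample by N, then upsample by N

# modulation-domain identity on the length-L DFT:
# V[k] = (1/N) * sum_i X[(k - i*L/N) mod L]
X = np.fft.fft(x)
V_pred = sum(np.roll(X, i * L // N) for i in range(N)) / N
assert np.allclose(np.fft.fft(v), V_pred)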
Then, the output of the system in Figure 3.14 becomes, in a similar fashion to (3.2.14),
$$\hat{X}(z) = \frac{1}{N}\, \boldsymbol{g}^T(z)\, H_m(z)\, \boldsymbol{x}_m(z),$$
the first one. To obtain perfect reconstruction, this single nonzero element has to be equal to a scaled pure delay. As in the two-channel case, it can be shown that the perfect reconstruction condition is equivalent to the system being biorthogonal, as given earlier. The proof is left as an exercise for the reader (Problem 3.21). For completeness, let us define $G_m(z)$ as the matrix with the $i$th row equal to $(\,G_i(z) \;\; G_i(W_N z) \;\; \cdots \;\; G_i(W_N^{N-1} z)\,)$.
Next, consider the polyphase representation. The input can be decomposed into its $N$ polyphase components,
$$X(z) = \sum_{j=0}^{N-1} z^{-j} X_j(z^N), \qquad \text{where} \quad X_j(z) = \sum_{n=-\infty}^{\infty} x[nN + j]\, z^{-n}.$$
The polyphase components of the synthesis filter $g_i$ are defined similarly, that is,
$$G_i(z) = \sum_{j=0}^{N-1} z^{-j} G_{ij}(z^N), \qquad \text{where} \quad G_{ij}(z) = \sum_{n=-\infty}^{\infty} g_i[nN + j]\, z^{-n}.$$
For the analysis filters, the phases are defined in reverse order,
$$H_i(z) = \sum_{j=0}^{N-1} z^{j} H_{ij}(z^N), \qquad (3.4.7)$$
where
$$H_{ij}(z) = \sum_{n=-\infty}^{\infty} h_i[nN - j]\, z^{-n}. \qquad (3.4.8)$$
Putting it all together, the output of the analysis/synthesis filter bank in Figure 3.14 can be written as
$$\hat{X}(z) = (\,1 \;\; z^{-1} \;\; z^{-2} \;\; \ldots \;\; z^{-N+1}\,) \cdot G_p(z^N) \cdot H_p(z^N) \cdot \boldsymbol{x}_p(z^N).$$
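As a check on the polyphase conventions (3.4.7)–(3.4.8), the sketch below (our own; filters and signal assumed to start at index 0) computes the channel outputs of an arbitrary FIR analysis bank both directly and through the polyphase components:

import numpy as np

rng = np.random.default_rng(0)
N = 3                                  # number of channels = downsampling factor
h = rng.standard_normal((N, 3 * N))    # arbitrary FIR analysis filters h_i[n]
x = rng.standard_normal(30)

# direct route: convolve, then keep every Nth sample: y_i[k] = (h_i * x)[kN]
direct = [np.convolve(h[i], x)[::N] for i in range(N)]

# polyphase route: x_j[n] = x[nN + j], and H_ij has impulse response
# h_i[nN - j] (cf. (3.4.8)); then y_i[k] = sum_j (h_ij * x_j)[k]
xp = [x[j::N] for j in range(N)]
poly = []
for i in range(N):
    acc = np.zeros(len(x) // N + len(h[i]) // N + 2)
    for j in range(N):
        # taps h_i[nN - j], n = 0, 1, ...; for j > 0 the n = 0 tap is h_i[-j] = 0
        hij = h[i][::N] if j == 0 else np.concatenate(([0.0], h[i][N - j::N]))
        c = np.convolve(hij, xp[j])
        acc[:len(c)] += c
    poly.append(acc)

for i in range(N):
    m = min(len(direct[i]), len(poly[i]))
    assert np.allclose(direct[i][:m], poly[i][:m])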
Similarly to the two-channel case, we can define the transfer function matrix $T_p(z) = G_p(z) H_p(z)$. Then, the same results hold as in the two-channel case; here, we just state them (the proofs are $N$-channel counterparts of the two-channel ones). In particular (Theorem 3.16), given a critically sampled FIR analysis filter bank, perfect reconstruction with FIR filters is possible if and only if $\det(H_p(z))$ is a pure delay.
Note that the modulation and polyphase representations are related via the Fourier matrix. For example, one can verify that
$$\boldsymbol{x}_p(z^N) = \frac{1}{N} \begin{pmatrix} 1 & & & \\ & z & & \\ & & \ddots & \\ & & & z^{N-1} \end{pmatrix} F\, \boldsymbol{x}_m(z), \qquad (3.4.9)$$
where $F_{kl} = W_N^{kl} = e^{-j(2\pi/N)kl}$. Similar relationships hold between $H_m(z)$, $G_m(z)$ and $H_p(z)$, $G_p(z)$, respectively (see Problem 3.22). The important point to note is that modulation and polyphase matrices are related by unitary operations (such as $F$ and delays, as in (3.4.9)).
Orthogonal Multichannel FIR Filter Banks Let us now consider the particular but important case when the filter bank is unitary or orthogonal. This is an extension of the discussion in Section 3.2.3 to the $N$-channel case. The idea is to implement an orthogonal transform using an $N$-channel filter bank; in other words, we want the set
$$\{g_0[n - Nk], \ldots, g_{N-1}[n - Nk]\}, \qquad k \in \mathbb{Z},$$
to be an orthonormal basis for $l_2(\mathbb{Z})$. Then
$$\langle g_i[n - Nk],\, g_j[n - Nl] \rangle = \delta[i - j]\, \delta[l - k]. \qquad (3.4.10)$$
Since in the orthogonal case analysis and synthesis filters are identical up to a time reversal, (3.4.10) holds for $h_i[Nk - l]$ as well. By using (2.5.19), (3.4.10) can be expressed in the z-domain as
$$\sum_{k=0}^{N-1} G_i(W_N^k z)\, G_j(W_N^{-k} z^{-1}) = N\, \delta[i - j], \qquad (3.4.11)$$
or
$$G_{m*}^T(z^{-1})\, G_m(z) = N I,$$
where the subscript $*$ stands for conjugation of the coefficients but not of $z$ (this is necessary since $G_m(z)$ has complex coefficients). Thus, as in the two-channel case, having an orthogonal transform is equivalent to having a paraunitary modulation matrix. Unlike the two-channel case, however, not all of the filters are obtained from a single prototype filter.
Since modulation and polyphase matrices are related, it is easy to check that having a paraunitary modulation matrix is equivalent to having a paraunitary polyphase matrix, that is,
$$G_{m*}^T(z^{-1})\, G_m(z) = N I \iff G_{p*}^T(z^{-1})\, G_p(z) = I. \qquad (3.4.12)$$
In the time domain, these relations become
$$G_i\, G_j^T = \delta[i - j]\, I, \qquad i, j = 0, \ldots, N-1,$$
or
$$T_a^T\, T_a = I.$$
The above relations lead to a direct extension of Theorem 3.8, where the particular case $N = 2$ was considered.
Thus, according to (3.4.12), designing an orthogonal filter bank with N channels
reduces to finding N × N paraunitary matrices. Just as in the two-channel case,
where we saw a lattice realization of orthogonal filter banks (see (3.2.60)), N ×
N paraunitary matrices can be parametrized in terms of cascades of elementary
matrices (2×2 rotations and delays). Such parametrizations have been investigated
by Vaidyanathan, and we refer to his book [308] for a thorough treatment. An
overview can be found in Appendix 3.A.2. As an example, we will see how to
construct three-channel paraunitary filter banks.
Example 3.11
We use the factorization given in Appendix 3.A.2, (3.A.8). Thus, we can express the $3 \times 3$ polyphase matrix as
$$G_p(z) = U_0 \prod_{i=1}^{K-1} \left[ \begin{pmatrix} z^{-1} & & \\ & 1 & \\ & & 1 \end{pmatrix} U_i \right],$$
where each $U_i$ is a product of three plane rotations; for example,
$$U_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_{00} & -\sin\alpha_{00} \\ 0 & \sin\alpha_{00} & \cos\alpha_{00} \end{pmatrix} \begin{pmatrix} \cos\alpha_{01} & 0 & -\sin\alpha_{01} \\ 0 & 1 & 0 \\ \sin\alpha_{01} & 0 & \cos\alpha_{01} \end{pmatrix} \begin{pmatrix} \cos\alpha_{02} & -\sin\alpha_{02} & 0 \\ \sin\alpha_{02} & \cos\alpha_{02} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
The degrees of freedom are given by the angles $\alpha_{ij}$. To obtain the three analysis filters, we upsample the polyphase matrix and recombine the polyphase components with the appropriate delays, as in (3.4.7). To design actual filters, one could minimize an objective function such as the one given in [306], where the sum of all the stopbands was minimized.
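A numerical version of this construction (our own sketch; angles chosen at random) builds $G_p(z)$ from the cascade and spot-checks paraunitarity, $G_p^T(z^{-1})\, G_p(z) = I$, on the unit circle:

import numpy as np

def rot3(a0, a1, a2):
    # product of three plane rotations, as in U_0 above
    c, s = np.cos, np.sin
    R0 = np.array([[1, 0, 0], [0, c(a0), -s(a0)], [0, s(a0), c(a0)]])
    R1 = np.array([[c(a1), 0, -s(a1)], [0, 1, 0], [s(a1), 0, c(a1)]])
    R2 = np.array([[c(a2), -s(a2), 0], [s(a2), c(a2), 0], [0, 0, 1]])
    return R0 @ R1 @ R2

def Gp(z, Us):
    # cascade U_0 [diag(z^-1, 1, 1) U_1] ... [diag(z^-1, 1, 1) U_{K-1}]
    D = np.diag([1.0 / z, 1.0, 1.0])
    M = Us[0]
    for U in Us[1:]:
        M = M @ D @ U
    return M

rng = np.random.default_rng(0)
Us = [rot3(*rng.uniform(0, 2 * np.pi, 3)) for _ in range(3)]
for w in rng.uniform(0, 2 * np.pi, 5):
    z = np.exp(1j * w)
    assert np.allclose(Gp(1 / z, Us).T @ Gp(z, Us), np.eye(3))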
[Figure: $N$ analysis filters $H_0, \ldots, H_{N-1}$, each followed by downsampling by $M$ with $N > M$, producing the channel signals $y_0, \ldots, y_{N-1}$.]
THEOREM 3.17
There are no finite-support bases with filters as in (3.4.13) (except trivial ones with only $N$ nonzero coefficients).
PROOF
The proof consists in analyzing the polyphase matrix $H_p(z)$. Write the prototype filter $H_{pr}(z)$ in terms of its polyphase components (see (3.4.7–3.4.8)),
$$H_{pr}(z) = \sum_{j=0}^{N-1} z^j H_{pr_j}(z^N).$$
Because the filters are modulates of the prototype, the polyphase matrix factors as
$$H_p(z) = F \cdot \operatorname{diag}\big(H_{pr_0}(z), \ldots, H_{pr_{N-1}}(z)\big), \qquad (3.4.15)$$
where $F_{kl} = W_N^{kl} = e^{-j(2\pi/N)kl}$. For FIR perfect reconstruction, the determinant of $H_p(z)$ has to be a delay (by Theorem 3.16). Now,
$$\det(H_p(z)) = c \prod_{j=0}^{N-1} H_{pr_j}(z),$$
where $c$ is a complex number equal to $\det(F)$. Therefore, for perfect FIR reconstruction, each $H_{pr_j}(z)$ has to be of the form $\alpha_j \cdot z^{-m_j}$; that is, the prototype filter has exactly $N$ nonzero coefficients. For an orthogonal solution, the $\alpha_j$'s have to be unit-norm constants.
What happens if we relax the FIR requirement? For example, one can choose the following prototype:
$$H_{pr}(z) = \sum_{i=0}^{N-1} P_i(z^N)\, z^i, \qquad (3.4.16)$$
where the $P_i(z)$ are allpass filters. The factorization (3.4.15) still holds, with $H_{pr_i}(z) = P_i(z)$, and since $P_i(z^{-1}) \cdot P_i(z) = 1$, $H_p(z)$ is paraunitary. While this gives an orthogonal modulated filter bank, it is IIR (either analysis or synthesis will be noncausal), and the quality of the filters in (3.4.16) can be poor.
Cosine Modulated Filter Banks The problems linked to complex modulated fil-
ter banks can be solved by using appropriate cosine modulation. Such cosine-
modulated filter banks are very important in practice, for example in audio com-
pression (see Section 7.2.2). Since they are often of length L = 2N (where N is the
downsampling rate), they are sometimes referred to as modulated LOT’s, or MLT’s.
A popular version was proposed in [229] and thus called the Princen-Bradley filter
bank. We will study one class of cosine modulated filter banks in some depth, and
refer to [188, 308] for a more general and detailed treatment. The cosine modulated
filter banks we consider here are a particular case of pseudoquadrature mirror filter
banks (PQMF) when the filter length is restricted to twice the number of channels
L = 2N . Pseudo QMF filters have been proposed as an extension to N channels
of the classical two-channel QMF filters. Pseudo QMF analysis/synthesis systems
achieve in general only cancellation of the main aliasing term (aliasing from neigh-
boring channels). However, when the filter length is restricted to L = 2N , they
can achieve perfect reconstruction. Due to the modulated structure and just as in
the STFT case, there are fast computational algorithms, making such filter banks
attractive for implementations.
A family of PQMF filter banks that achieves cancellation of the main aliasing terms is obtained by cosine modulation of a single lowpass prototype window.
Figure 3.16 (a) Impulse responses of the first four modulated filters, $k = 0, 1, 2, 3$ (length 16, $N = 8$). (b) Magnitude responses (in dB) of the modulated filters.
Example 3.12
Consider the case N = 8. The center frequency of the modulated filter hk [n] is (2k+1)2π/32,
and since this is a cosine modulation and the filters are real, there is a mirror lobe at
(32 − 2k − 1)2π/32. For the filters h0 [n] and h7 [n], these two lobes overlap to form a single
lowpass and highpass, respectively, while h1 [n], . . . , h6 [n] are bandpass filters. A possible
symmetric window of length 16 and satisfying (3.4.24) is given in Table 3.4, while the impulse
responses of the first four filters as well as the magnitude responses of all the modulated
filters are given in Figure 3.16.
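The following sketch builds one such length-$2N$ cosine modulated filter bank and verifies orthonormality of the shifted filters numerically. It uses the sine window, assuming (3.4.24) is the usual power-complementarity condition $w[n]^2 + w[n+N]^2 = 1$, and the MLT phase convention, which is one of several in use (our choice here, not necessarily the one behind Table 3.4):

import numpy as np

N = 8                    # number of channels; filter length L = 2N
L = 2 * N
n = np.arange(L)

# sine window: a symmetric length-2N window with w[n]^2 + w[n+N]^2 = 1
w = np.sin(np.pi * (n + 0.5) / L)
assert np.allclose(w[:N]**2 + w[N:]**2, 1.0)

# cosine modulation (MLT phase convention)
k = np.arange(N)[:, None]
g = np.sqrt(2.0 / N) * w * np.cos(np.pi / N * (k + 0.5) * (n + 0.5 + N / 2))

# orthonormality of {g_k[n - Nm]}: stack shifted filters as rows
M = 6
T = np.zeros((M * N, (M + 1) * N))
for m in range(M):
    T[m * N:(m + 1) * N, m * N:m * N + L] = g
assert np.allclose(T @ T.T, np.eye(M * N))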
Note that orthogonal cosine modulated filter banks have recently been generalized to lengths $L = KN$, where $K$ can be larger than 2. For more details, refer to [159, 188, 235, 308].
and perfect reconstruction is easily achievable. For example, in the FIR case, if $H_0(z)$ and $H_1(z)$ have no zeros in common (that is, the polynomials in $z^{-1}$ are coprime), then one can use Euclid's algorithm [32] to find $G_0(z)$ and $G_1(z)$ such that
$$G_0(z)\, H_0(z) + G_1(z)\, H_1(z) = 1$$
is satisfied, leading to $\hat{X}(z) = X(z)$ in (3.5.1). Note how coprimeness of $H_0(z)$ and $H_1(z)$, used in Euclid's algorithm, is also a very natural requirement in terms of signal processing. A common zero would prohibit FIR reconstruction, or even IIR reconstruction (if the common zero is on the unit circle).
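As a small illustration of the Euclidean step (a sketch of ours using sympy's extended polynomial GCD, with x standing for $z^{-1}$):

from sympy import symbols, gcdex

x = symbols('x')              # x stands for z^{-1}
H0 = 1 + x                    # lowpass with no zeros in common with H1
H1 = 1 - x
# extended Euclid: G0*H0 + G1*H1 = g, with g = gcd(H0, H1) = 1 here
G0, G1, g = gcdex(H0, H1, x)
assert (G0 * H0 + G1 * H1 - g).expand() == 0 and g == 1
print(G0, G1)                 # 1/2 and 1/2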
Another case appears when we have two filters $G_0(z)$ and $G_1(z)$ which have unit norm and satisfy
$$G_0(z)\, G_0(z^{-1}) + G_1(z)\, G_1(z^{-1}) = 2. \qquad (3.5.2)$$
Writing this in the time domain (see Example 5.2), we realize that the set $\{g_i[n-k]\}$, $i = 0, 1$, and $k \in \mathbb{Z}$, forms a tight frame for $l_2(\mathbb{Z})$ with a redundancy factor $R = 2$. The fact that $\{g_i[n-k]\}$ form a tight frame simply means that they can uniquely represent any sequence from $l_2(\mathbb{Z})$ (see also Section 5.3). However, the basis vectors are not linearly independent and thus they do not form an orthonormal basis. The redundancy factor indicates the oversampling rate; we can indeed check that it is two in this case, that is, there are twice as many basis functions as actually needed to represent sequences from $l_2(\mathbb{Z})$. This is easily seen if we remember that until now we needed only the even shifts of $g_i[n]$ as basis functions, while now we use the odd shifts as well. Also, the expansion formula in a tight frame is similar to that in the orthogonal case, except for the redundancy (which means the functions in the expansion are not linearly independent). There is an energy conservation relation, or Parseval's formula, which says that the energy of the expansion coefficients equals $R$ times the energy of the original. In our case, calling $y_i[n]$ the output of the filter $h_i[n]$, we can verify (Problem 3.26) that
$$\|y_0\|^2 + \|y_1\|^2 = 2\, \|x\|^2.$$
To design such a tight frame for $l_2(\mathbb{Z})$ based on filter banks, that is, to find solutions to (3.5.2), one can pick a unit-norm$^{12}$ filter $G_0(z)$ and then choose $G_1(z)$ such that (3.5.2) is met, that is, such that $G_0(z)$ and $G_1(z)$ are power complementary. Note that (3.5.2) is less restrictive than the usual orthogonal solutions we have seen in Section 3.2.3. For example, odd-length filters are possible.
Of course, one can iterate such nondownsampled two-channel filter banks and get more general solutions. In particular, by adding two-channel nondownsampled filter banks with filters $H_0(z^2)$, $H_1(z^2)$ to the lowpass analysis channel and iterating (raising $z$ to the appropriate power), one can devise a discrete-time wavelet series without downsampling.
$^{12}$Note that the unit norm requirement is not necessary for constructing a tight frame.
Figure 3.17 Pyramid with perfect reconstruction: the original signal (in $V_0$) is filtered by $\tilde{H}_0$ and downsampled by 2, yielding the coarse version (in $V_1$); this is upsampled by 2 and interpolated by $H_0$, and the difference between the original and the prediction is the difference signal (in $W_1$).
Note that the difference signal is of the same scale as the original. Also, a successive approximation flavor is easily
seen: One could start with the coarse version at level J, and by adding difference
signals, obtain versions at levels J − 1, . . . , 1, 0, (that is, the original).
An advantage of the pyramid scheme in image coding is that nonlinear inter-
polation and decimation operators can be used. A disadvantage, however, as we
have already mentioned, is that the scheme is oversampled, although the overhead
in number of samples decreases as the dimensionality increases. In n dimensions,
oversampling s as a function of the number of levels L in the pyramid is given by
$$s = \sum_{i=0}^{L-1} \left(\frac{1}{2^n}\right)^i < \frac{2^n}{2^n - 1}, \qquad (3.5.4)$$
Let us verify that, with orthogonal filters, the difference signal carries no redundancy. The difference between the original and the prediction is
$$d = (I - H_0^T H_0)\, x.$$
For an orthogonal two-channel filter bank,
$$I - H_0^T H_0 = H_1^T H_1,$$
that is, $d$ is the projection onto the space spanned by $\{h_1[2k-n]\}$, $k \in \mathbb{Z}$. Therefore, we can filter and downsample $d$ by 2, since
$$H_1\, H_1^T H_1 = H_1.$$
In that case, the redundancy of $d$ is removed ($d$ is now critically sampled) and the pyramid is equivalent to an orthogonal subband coding system. The signal $d$ can be reconstructed by upsampling by 2 and filtering with $h_1[n]$. Then we have
$$H_1^T (H_1 H_1^T H_1)\, x = H_1^T H_1\, x = d,$$
and this, added to $\bar{x} = H_0^T H_0\, x$, is indeed equal to $x$. In the notation of the multiresolution scheme, the prediction $\bar{x}$ is the projection onto the space $V_1$ and $d$ is the projection onto $W_1$. This is indicated in Figure 3.17. We have thus shown that pyramidal schemes can be critically sampled as well; that is, in Figure 3.17 the difference signal can be followed by a filter $h_1[n]$ and a downsampler by 2 without any loss of information.
Note that we assumed an orthogonal filter and no quantization of the coarse
version. The benefit of the oversampled pyramid comes from the fact that arbitrary
filters (including nonlinear ones) can be used, and that quantization of the coarse
version does not influence perfect reconstruction (see Section 7.3.2).
This scheme is very popular in computer vision, not so much because perfect reconstruction is desired, but because it is a computationally efficient way to obtain multiple resolutions of an image. As a lowpass filter, an approximation to a Gaussian, bell-shaped filter is often used, and because the difference signal resembles the original filtered by the Laplace operator, such a scheme is usually called a Laplacian pyramid.
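A one-dimensional sketch of the Laplacian pyramid (our own illustration; the 5-tap binomial kernel approximates a Gaussian, and circular borders are assumed). Perfect reconstruction holds by construction, whatever the filter:

import numpy as np
from scipy.ndimage import convolve1d

w = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0      # Gaussian-like kernel

def blur_down(x):
    return convolve1d(x, w, mode='wrap')[::2]        # lowpass, then downsample

def up_blur(c, n):
    u = np.zeros(n)
    u[::2] = c
    return 2.0 * convolve1d(u, w, mode='wrap')       # upsample, then interpolate

x = np.random.default_rng(0).standard_normal(64)
coarse = blur_down(x)
diff = x - up_blur(coarse, len(x))                   # difference signal
# reconstruction is exact by construction, for any filter choice
assert np.allclose(up_blur(coarse, len(x)) + diff, x)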
[Figure: overcomplete expansion computing a running convolution. Each channel filters the input with $H_i$ and downsamples by $M$, applies a pointwise multiplication $C_i$, then upsamples by $M$ and filters with $G_i$; the channel outputs are summed to give $\hat{x}$.]
Generalizations The above two schemes are examples from a general class of oversampled filter banks which compute a running convolution. For example, the pointwise multiplication in the above schemes can be replaced by a true convolution, resulting in a longer overall convolution if adequately chosen. Another possibility is to use analysis and synthesis filters based on fast convolution algorithms other than Fourier ones. For more details, see [276, 317] and Section 6.5.1.
are obtained in this way, leading to fairly constrained designs (nonseparable filters of size $N_1 \times N_2$ would offer $N_1 \cdot N_2$ free design variables, versus $N_1 + N_2$ in the separable case). Also, only rectangular divisions of the spectrum are possible, though one might need divisions that would better capture the signal's energy concentration (for example, close to circular).
Choosing nonseparable solutions, while solving some of these problems, comes at a price: the design is more difficult, and the complexity is substantially higher.
The first step toward using multidimensional techniques on multidimensional
signals is to use the same kind of sampling as before (that is, in the case of an im-
age, sample first along the horizontal and then along the vertical dimension), but use
nonseparable filters. A second step consists in using nonseparable sampling as well
as nonseparable filters. This calls for the development of a new theory that starts
by pointing out the major difference between one- and multidimensional cases —
sampling. Sampling in multiple dimensions is represented by lattices. An excellent
presentation of lattice sampling can be found in the tutorial by Dubois [86] (Ap-
pendix 3.B gives a brief overview). Filter banks using nonseparable downsampling
were studied in [11, 314]. The generalization of one-dimensional analysis methods
to multidimensional filter banks using lattice downsampling was done in [155, 325].
The topic has been quite active recently (see [19, 47, 48, 160, 257, 264, 288]).
In this section, we will give an overview of the field of multidimensional filter
banks. We will concentrate mostly on two cases: the separable case with down-
sampling by 2 in two dimensions, and the quincunx case, that is, the simplest
multidimensional nonseparable case with overall sampling density of 2. Both of
these cases are of considerable practical interest, since these are the ones mostly
used in image processing applications.
Figure 3.19 Separable filter bank in two dimensions, with separable downsampling by 2. (a) Cascade of horizontal and vertical decompositions: the input is first split by $H_0$/$H_1$ and downsampled by 2 horizontally, then each branch is split by $H_0$/$H_1$ and downsampled by 2 vertically, yielding the LL, LH, HL and HH channels. (b) Division of the frequency spectrum over $(f_1, f_2) \in [-\pi, \pi]^2$.
Figure 3.20 Two often-used lattices in the $(n_1, n_2)$ plane. (a) Separable sampling by 2 in two dimensions. (b) Quincunx sampling.
nonseparable filters. In other words, one could have a direct four-channel implementation of Figure 3.19 where the four filters could be $H_0$, $H_1$, $H_2$, $H_3$. While before $H_i(z_1, z_2) = H_{i1}(z_1) H_{i2}(z_2)$, where each $H_{ij}(z)$ is a one-dimensional filter, $H_i(z_1, z_2)$ is now a true two-dimensional filter. This solution, while more general, is more complex to design and implement. It is possible to obtain an orthogonal linear phase FIR solution [155, 156], which cannot be achieved using separable filters (see Example 3.15 below).
In the separable case, the modulation vector contains all four modulated versions of the input,
$$\boldsymbol{x}_m(z_1, z_2) = (\,X(z_1, z_2) \;\; X(-z_1, z_2) \;\; X(z_1, -z_2) \;\; X(-z_1, -z_2)\,)^T.$$
In the quincunx case, only the single modulated version $X(-z_1, -z_2)$ appears, and the output of the two-channel analysis/synthesis system is
$$Y(z_1, z_2) = \frac{1}{2}\, (\,G_0(z_1, z_2) \;\; G_1(z_1, z_2)\,) \begin{pmatrix} H_0(z_1, z_2) & H_0(-z_1, -z_2) \\ H_1(z_1, z_2) & H_1(-z_1, -z_2) \end{pmatrix} \begin{pmatrix} X(z_1, z_2) \\ X(-z_1, -z_2) \end{pmatrix}.$$
Similarly to the one-dimensional case, it can be verified that orthogonality of the system is achieved when the lowpass filter satisfies
$$H_0(z_1, z_2) H_0(z_1^{-1}, z_2^{-1}) + H_0(-z_1, -z_2) H_0(-z_1^{-1}, -z_2^{-1}) = 2, \qquad (3.6.4)$$
that is, the lowpass filter is orthogonal to its shifts on the quincunx lattice. Then, a possible highpass filter is given by
$$H_1(z_1, z_2) = z_1^{-1} H_0(-z_1^{-1}, -z_2^{-1}).$$
The synthesis filters are the same (within shift reversal, or $G_i(z_1, z_2) = H_i(z_1^{-1}, z_2^{-1})$). In the polyphase domain, define the two polyphase components of the filters as
$$H_{i0}(z_1, z_2) = \sum_{(n_1, n_2) \in \mathbb{Z}^2} h_i[n_1 + n_2,\, n_1 - n_2]\, z_1^{-n_1} z_2^{-n_2},$$
$$H_{i1}(z_1, z_2) = \sum_{(n_1, n_2) \in \mathbb{Z}^2} h_i[n_1 + n_2 + 1,\, n_1 - n_2]\, z_1^{-n_1} z_2^{-n_2},$$
with
$$H_i(z_1, z_2) = H_{i0}(z_1 z_2,\, z_1 z_2^{-1}) + z_1^{-1} H_{i1}(z_1 z_2,\, z_1 z_2^{-1}).$$
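The quincunx lattice and its two cosets (which carry the polyphase components $H_{i0}$ and $H_{i1}$ above) can be visualized in a few lines (our own sketch):

import numpy as np

x = np.arange(36).reshape(6, 6)
n1, n2 = np.indices(x.shape)
coset0 = (n1 + n2) % 2 == 0        # quincunx lattice: n1 + n2 even
coset1 = ~coset0                   # its shifted coset

# overall sampling density is 2: each coset holds half the samples
assert coset0.sum() == x.size // 2
# the split into the two cosets (the polyphase components) is lossless
y = np.zeros_like(x)
y[coset0], y[coset1] = x[coset0], x[coset1]
assert np.array_equal(y, x)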
The results on alias cancellation and perfect reconstruction are very similar to their one-dimensional counterparts. For example, perfect reconstruction with FIR filters is achieved if and only if the determinant of the analysis polyphase matrix is a monomial, that is,
$$\det(H_p(z_1, z_2)) = c\, z_1^{-k_1} z_2^{-k_2}.$$