
Wavelets and

Subband Coding

Martin Vetterli & Jelena Kovačević


Originally published 1995 by Prentice Hall PTR, Englewood Cliffs, New Jersey.
Reissued by the authors 2007.

This work is licensed under the Creative Commons Attribution-Noncommercial-
No Derivative Works 3.0 License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-nd/3.0/ or send a letter to
Creative Commons, 171 Second Street, Suite 300, San Francisco, CA 94105 USA.
Wavelets and Subband Coding

Martin Vetterli
University of California at Berkeley

Jelena Kovačević
AT&T Bell Laboratories
For my parents.
To Marie-Laure.
— MV

To Giovanni.
To my little star, Mom and Dad.
— JK
Contents

Preface xiii

1 Wavelets, Filter Banks and Multiresolution Signal Processing 1


1.1 Series Expansions of Signals . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Multiresolution Concept . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Overview of the Book . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Fundamentals of Signal Decompositions 15


2.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Vector Spaces and Inner Products . . . . . . . . . . . . . . . 18
2.2.2 Complete Inner Product Spaces . . . . . . . . . . . . . . . . . 21
2.2.3 Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2.4 General Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.5 Overcomplete Expansions . . . . . . . . . . . . . . . . . . . . 28
2.3 Elements of Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.1 Basic Definitions and Properties . . . . . . . . . . . . . . . . 30
2.3.2 Linear Systems of Equations and Least Squares . . . . . . . . 32
2.3.3 Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . 33
2.3.4 Unitary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3.5 Special Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.6 Polynomial Matrices . . . . . . . . . . . . . . . . . . . . . . . 36
2.4 Fourier Theory and Sampling . . . . . . . . . . . . . . . . . . . . . . 37


2.4.1 Signal Expansions and Nomenclature . . . . . . . . . . . . . . 38


2.4.2 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4.3 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.4 Dirac Function, Impulse Trains and Poisson Sum Formula . . 45
2.4.5 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.4.6 Discrete-Time Fourier Transform . . . . . . . . . . . . . . . . 50
2.4.7 Discrete-Time Fourier Series . . . . . . . . . . . . . . . . . . 52
2.4.8 Discrete Fourier Transform . . . . . . . . . . . . . . . . . . . 53
2.4.9 Summary of Various Flavors of Fourier Transforms . . . . . . 55
2.5 Signal Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.5.1 Continuous-Time Signal Processing . . . . . . . . . . . . . . . 59
2.5.2 Discrete-Time Signal Processing . . . . . . . . . . . . . . . . 62
2.5.3 Multirate Discrete-Time Signal Processing . . . . . . . . . . . 68
2.6 Time-Frequency Representations . . . . . . . . . . . . . . . . . . . . 76
2.6.1 Frequency, Scale and Resolution . . . . . . . . . . . . . . . . 76
2.6.2 Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . 79
2.6.3 Short-Time Fourier Transform . . . . . . . . . . . . . . . . . 81
2.6.4 Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . 83
2.6.5 Block Transforms . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.6.6 Wigner-Ville Distribution . . . . . . . . . . . . . . . . . . . . 84
2.A Bounded Linear Operators on Hilbert Spaces . . . . . . . . . . . . . 85
2.B Parametrization of Unitary Matrices . . . . . . . . . . . . . . . . . . 86
2.B.1 Givens Rotations . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.B.2 Householder Building Blocks . . . . . . . . . . . . . . . . . . 88
2.C Convergence and Regularity of Functions . . . . . . . . . . . . . . . 89
2.C.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2.C.2 Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

3 Discrete-Time Bases and Filter Banks 97


3.1 Series Expansions of Discrete-Time Signals . . . . . . . . . . . . . . 100
3.1.1 Discrete-Time Fourier Series . . . . . . . . . . . . . . . . . . 101
3.1.2 Haar Expansion of Discrete-Time Signals . . . . . . . . . . . 104
3.1.3 Sinc Expansion of Discrete-Time Signals . . . . . . . . . . . . 109
3.1.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2 Two-Channel Filter Banks . . . . . . . . . . . . . . . . . . . . . . . . 112
3.2.1 Analysis of Filter Banks . . . . . . . . . . . . . . . . . . . . . 113
3.2.2 Results on Filter Banks . . . . . . . . . . . . . . . . . . . . . 123
3.2.3 Analysis and Design of Orthogonal FIR Filter Banks . . . . . 128
3.2.4 Linear Phase FIR Filter Banks . . . . . . . . . . . . . . . . . 139
3.2.5 Filter Banks with IIR Filters . . . . . . . . . . . . . . . . . . 145

3.3 Tree-Structured Filter Banks . . . . . . . . . . . . . . . . . . . . . . 148


3.3.1 Octave-Band Filter Bank and Discrete-Time Wavelet Series . 150
3.3.2 Discrete-Time Wavelet Series and Its Properties . . . . . . . 154
3.3.3 Multiresolution Interpretation of Octave-Band Filter Banks . 158
3.3.4 General Tree-Structured Filter Banks and Wavelet Packets . 161
3.4 Multichannel Filter Banks . . . . . . . . . . . . . . . . . . . . . . . . 163
3.4.1 Block and Lapped Orthogonal Transforms . . . . . . . . . . . 163
3.4.2 Analysis of Multichannel Filter Banks . . . . . . . . . . . . . 167
3.4.3 Modulated Filter Banks . . . . . . . . . . . . . . . . . . . . . 173
3.5 Pyramids and Overcomplete Expansions . . . . . . . . . . . . . . . . 179
3.5.1 Oversampled Filter Banks . . . . . . . . . . . . . . . . . . . . 179
3.5.2 Pyramid Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 181
3.5.3 Overlap-Save/Add Convolution and Filter Bank Implementations . . . . . . 183
3.6 Multidimensional Filter Banks . . . . . . . . . . . . . . . . . . . . . 184
3.6.1 Analysis of Multidimensional Filter Banks . . . . . . . . . . . 185
3.6.2 Synthesis of Multidimensional Filter Banks . . . . . . . . . . 189
3.7 Transmultiplexers and Adaptive Filtering in Subbands . . . . . . . . 192
3.7.1 Synthesis of Signals and Transmultiplexers . . . . . . . . . . 192
3.7.2 Adaptive Filtering in Subbands . . . . . . . . . . . . . . . . . 195
3.A Lossless Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
3.A.1 Two-Channel Factorizations . . . . . . . . . . . . . . . . . . . 197
3.A.2 Multichannel Factorizations . . . . . . . . . . . . . . . . . . . 198
3.B Sampling in Multiple Dimensions and Multirate Operations . . . . . 202

4 Series Expansions Using Wavelets and Modulated Bases 209


4.1 Definition of the Problem . . . . . . . . . . . . . . . . . . . . . . . . 211
4.1.1 Series Expansions of Continuous-Time Signals . . . . . . . . . 211
4.1.2 Time and Frequency Resolution of Expansions . . . . . . . . 214
4.1.3 Haar Expansion . . . . . . . . . . . . . . . . . . . . . . . . . 216
4.1.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
4.2 Multiresolution Concept and Analysis . . . . . . . . . . . . . . . . . 222
4.2.1 Axiomatic Definition of Multiresolution Analysis . . . . . . . 223
4.2.2 Construction of the Wavelet . . . . . . . . . . . . . . . . . . . 226
4.2.3 Examples of Multiresolution Analyses . . . . . . . . . . . . . 228
4.3 Construction of Wavelets Using Fourier Techniques . . . . . . . . . . 232
4.3.1 Meyer’s Wavelet . . . . . . . . . . . . . . . . . . . . . . . . . 233
4.3.2 Wavelet Bases for Piecewise Polynomial Spaces . . . . . . . . 238
4.4 Wavelets Derived from Iterated Filter Banks and Regularity . . . . . 246
4.4.1 Haar and Sinc Cases Revisited . . . . . . . . . . . . . . . . . 247

4.4.2 Iterated Filter Banks . . . . . . . . . . . . . . . . . . . . . . . 252


4.4.3 Regularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
4.4.4 Daubechies’ Family of Regular Filters and Wavelets . . . . . 267
4.5 Wavelet Series and Its Properties . . . . . . . . . . . . . . . . . . . . 270
4.5.1 Definition and Properties . . . . . . . . . . . . . . . . . . . . 271
4.5.2 Properties of Basis Functions . . . . . . . . . . . . . . . . . . 276
4.5.3 Computation of the Wavelet Series and Mallat’s Algorithm . 280
4.6 Generalizations in One Dimension . . . . . . . . . . . . . . . . . . . 282
4.6.1 Biorthogonal Wavelets . . . . . . . . . . . . . . . . . . . . . . 282
4.6.2 Recursive Filter Banks and Wavelets with Exponential Decay 288
4.6.3 Multichannel Filter Banks and Wavelet Packets . . . . . . . . 289
4.7 Multidimensional Wavelets . . . . . . . . . . . . . . . . . . . . . . . 293
4.7.1 Multiresolution Analysis and Two-Scale Equation . . . . . . 293
4.7.2 Construction of Wavelets Using Iterated Filter Banks . . . . 295
4.7.3 Generalization of Haar Basis to Multiple Dimensions . . . . . 297
4.7.4 Design of Multidimensional Wavelets . . . . . . . . . . . . . . 298
4.8 Local Cosine Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
4.8.1 Rectangular Window . . . . . . . . . . . . . . . . . . . . . . . 302
4.8.2 Smooth Window . . . . . . . . . . . . . . . . . . . . . . . . . 303
4.8.3 General Window . . . . . . . . . . . . . . . . . . . . . . . . . 304
4.A Proof of Theorem 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

5 Continuous Wavelet and Short-Time Fourier Transforms and Frames 311
5.1 Continuous Wavelet Transform . . . . . . . . . . . . . . . . . . . . . 313
5.1.1 Analysis and Synthesis . . . . . . . . . . . . . . . . . . . . . . 313
5.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
5.1.3 Morlet Wavelet . . . . . . . . . . . . . . . . . . . . . . . . . . 324
5.2 Continuous Short-Time Fourier Transform . . . . . . . . . . . . . . . 325
5.2.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
5.2.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
5.3 Frames of Wavelet and Short-Time Fourier Transforms . . . . . . . . 328
5.3.1 Discretization of Continuous-Time Wavelet and Short-Time Fourier Transforms . . . . . . 329
5.3.2 Reconstruction in Frames . . . . . . . . . . . . . . . . . . . . 332
5.3.3 Frames of Wavelets and STFT . . . . . . . . . . . . . . . . . 337
5.3.4 Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342

6 Algorithms and Complexity 347


6.1 Classic Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
6.1.1 Fast Convolution . . . . . . . . . . . . . . . . . . . . . . . . . 348
6.1.2 Fast Fourier Transform Computation . . . . . . . . . . . . . . 352
6.1.3 Complexity of Multirate Discrete-Time Signal Processing . . 355
6.2 Complexity of Discrete Bases Computation . . . . . . . . . . . . . . 360
6.2.1 Two-Channel Filter Banks . . . . . . . . . . . . . . . . . . . . 360
6.2.2 Filter Bank Trees and Discrete-Time Wavelet Transforms . . 363
6.2.3 Parallel and Modulated Filter Banks . . . . . . . . . . . . . . 366
6.2.4 Multidimensional Filter Banks . . . . . . . . . . . . . . . . . 368
6.3 Complexity of Wavelet Series Computation . . . . . . . . . . . . . . 369
6.3.1 Expansion into Wavelet Bases . . . . . . . . . . . . . . . . . . 369
6.3.2 Iterated Filters . . . . . . . . . . . . . . . . . . . . . . . . . . 370
6.4 Complexity of Overcomplete Expansions . . . . . . . . . . . . . . . . 371
6.4.1 Short-Time Fourier Transform . . . . . . . . . . . . . . . . . 371
6.4.2 “Algorithme à Trous” . . . . . . . . . . . . . . . . . . . . . . 372
6.4.3 Multiple Voices Per Octave . . . . . . . . . . . . . . . . . . . 374
6.5 Special Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
6.5.1 Computing Convolutions Using Multirate Filter Banks . . . . 375
6.5.2 Numerical Algorithms . . . . . . . . . . . . . . . . . . . . . . 379

7 Signal Compression and Subband Coding 383


7.1 Compression Systems Based on Linear Transforms . . . . . . . . . . 385
7.1.1 Linear Transformations . . . . . . . . . . . . . . . . . . . . . 386
7.1.2 Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
7.1.3 Entropy Coding . . . . . . . . . . . . . . . . . . . . . . . . . 403
7.1.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
7.2 Speech and Audio Compression . . . . . . . . . . . . . . . . . . . . . 407
7.2.1 Speech Compression . . . . . . . . . . . . . . . . . . . . . . . 407
7.2.2 High-Quality Audio Compression . . . . . . . . . . . . . . . . 408
7.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
7.3 Image Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
7.3.1 Transform and Lapped Transform Coding of Images . . . . . 415
7.3.2 Pyramid Coding of Images . . . . . . . . . . . . . . . . . . . 421
7.3.3 Subband and Wavelet Coding of Images . . . . . . . . . . . . 425
7.3.4 Advanced Methods in Subband and Wavelet Compression . . 438
7.4 Video Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
7.4.1 Key Problems in Video Compression . . . . . . . . . . . . . . 447
7.4.2 Motion-Compensated Video Coding . . . . . . . . . . . . . . 453
7.4.3 Pyramid Coding of Video . . . . . . . . . . . . . . . . . . . . 454

7.4.4 Subband Decompositions for Video Representation and Compression . . . . . . 456
7.4.5 Example: MPEG Video Compression Standard . . . . . . . . 463
7.5 Joint Source-Channel Coding . . . . . . . . . . . . . . . . . . . . . . 464
7.5.1 Digital Broadcast . . . . . . . . . . . . . . . . . . . . . . . . . 465
7.5.2 Packet Video . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
7.A Statistical Signal Processing . . . . . . . . . . . . . . . . . . . . . . . 467

Bibliography 476

Index 499
Preface

A central goal of signal processing is to describe real life signals, be it for com-
putation, compression, or understanding. In that context, transforms or linear ex-
pansions have always played a key role. Linear expansions are present in Fourier’s
original work and in Haar’s construction of the first wavelet, as well as in Gabor’s
work on time-frequency analysis. Today, transforms are central in fast algorithms
such as the FFT as well as in applications such as image and video compression.
Over the years, depending on open problems or specific applications, theoreti-
cians and practitioners have added more and more tools to the toolbox called signal
processing. Two of the newest additions have been wavelets and their discrete-
time cousins, filter banks or subband coding. From work in harmonic analysis and
mathematical physics, and from applications such as speech/image compression
and computer vision, various disciplines built up methods and tools with a similar
flavor, which can now be cast into the common framework of wavelets.
This unified view, as well as the number of applications where this framework
is useful, are motivations for writing this book. The unification has given a new
understanding and a fresh view of some classic signal processing problems. Another
motivation is that the subject is exciting and the results are cute!
The aim of the book is to present this unified view of wavelets and subband
coding. It will be done from a signal processing perspective, but with sufficient
background material such that people without signal processing knowledge will

find it useful as well. The level is that of a first year graduate engineering book
(typically electrical engineering and computer sciences), but elementary Fourier
analysis and some knowledge of linear systems in discrete time are enough to follow
most of the book.
After the introduction (Chapter 1) and a review of the basics of vector spaces,
linear algebra, Fourier theory and signal processing (Chapter 2), the book covers
the five main topics in as many chapters. The discrete-time case, or filter banks,
is thoroughly developed in Chapter 3. This is the basis for most applications, as
well as for some of the wavelet constructions. The concept of wavelets is developed
in Chapter 4, both with direct approaches and based on filter banks. This chapter
describes wavelet series and their computation, as well as the construction of mod-
ified local Fourier transforms. Chapter 5 discusses continuous wavelet and local
Fourier transforms, which are used in signal analysis, while Chapter 6 addresses
efficient algorithms for filter banks and wavelet computations. Finally, Chapter 7
describes signal compression, where filter banks and wavelets play an important
role. Speech/audio, image and video compression using transforms, quantization
and entropy coding are discussed in detail. Throughout the book we give examples
to illustrate the concepts, and more technical parts are left to appendices.
This book evolved from class notes used at Columbia University and the Uni-
versity of California at Berkeley. Parts of the manuscript have also been used at the
University of Illinois at Urbana-Champaign and the University of Southern Cali-
fornia. The material was covered in a semester, but it would also be easy to carve
out a subset or skip some of the more mathematical subparts when developing a
curriculum. For example, Chapters 3, 4 and 7 can form a good core for a course in
Wavelets and Subband Coding. Homework problems are included in all chapters,
complemented with project suggestions in Chapter 7. Since there is a detailed re-
view chapter that makes the material as self-contained as possible, we think that
the book is useful for self-study as well.
The subjects covered in this book have recently been the focus of books, special
issues of journals, special conference proceedings, numerous articles and even new
journals! To us, the book by I. Daubechies [73] has been invaluable, and Chapters 4
and 5 have been substantially influenced by it. Like the standard book by Meyer
[194] and a recent book by Chui [49], it is a more mathematically oriented book
than the present text. Another, more recent, tutorial book by Meyer gives an
excellent overview of the history of the subject, its mathematical implications and
current applications [195]. On the engineering side, the book by Vaidyanathan
[308] is an excellent reference on filter banks, as is Malvar’s book [188] for lapped
orthogonal transforms and compression. Several other texts, including edited books,
have appeared on wavelets [27, 51, 251], as well as on subband coding [335] and
multiresolution signal decompositions [3]. Recent tutorials on wavelets can be found
in [128, 140, 247, 281], and on filter banks in [305, 307].


From the above, it is obvious that there is no lack of literature, yet we hope
to provide a text with a broad coverage of theory and applications and a different
perspective based on signal processing. We enjoyed preparing this material, and
simply hope that the reader will find some pleasure in this exciting subject, and
share some of our enthusiasm!

ACKNOWLEDGEMENTS

Some of the work described in this book resulted from research supported by the
National Science Foundation, whose support is gratefully acknowledged. We would
also like to thank Columbia University, in particular the Center for Telecommu-
nications Research, the University of California at Berkeley and AT&T Bell Lab-
oratories for providing support and a pleasant work environment. We take this
opportunity to thank A. Oppenheim for his support and for including this book in
his distinguished series. We thank K. Gettman and S. Papanikolau of Prentice-Hall
for their patience and help, and K. Fortgang of bookworks for her expert help in
the production stage of the book.
To us, one of the attractions of the topic of Wavelets and Subband Coding is
its interdisciplinary nature. This allowed us to interact with people from many
different disciplines, and this was an enrichment in itself. The present book is the
result of this interaction and the help of many people.
Our gratitude goes to I. Daubechies, whose work and help have been invaluable, to
C. Herley, whose research, collaboration and help have directly influenced this book,
and O. Rioul, who first taught us about wavelets and has always been helpful.
We would like to thank M.J.T. Smith and P.P. Vaidyanathan for a continuing
and fruitful interaction on the topic of filter banks, and S. Mallat for his insights
and interaction on the topic of wavelets.
Over the years, discussions and interactions with many experts have contributed
to our understanding of the various fields relevant to this book, and we would
like to acknowledge in particular the contributions of E. Adelson, T. Barnwell,
P. Burt, A. Cohen, R. Coifman, R. Crochiere, P. Duhamel, C. Galand, W. Lawton,
D. LeGall, Y. Meyer, T. Ramstad, G. Strang, M. Unser and V. Wickerhauser.
Many people have commented on several versions of the present text. We thank
I. Daubechies, P. Heller, M. Unser, P.P. Vaidyanathan, and G. Wornell for go-
ing through a complete draft and making many helpful suggestions. Comments
on parts of the manuscript were provided by C. Chan, G. Chang, Z. Cvetković,
V. Goyal, C. Herley, T. Kalker, M. Khansari, M. Kobayashi, H. Malvar, P. Moulin,
A. Ortega, A. Park, J. Princen, K. Ramchandran, J. Shapiro and G. Strang, and
are acknowledged with many thanks.

Coding experiments and associated figures were prepared by S. Levine (audio
compression) and J. Smith (image compression), with guidance from A. Ortega and
K. Ramchandran, and we thank them for their expert work. The images used in
the experiments were made available by the Independent Broadcasting Association
(UK).
The preparation of the manuscript relied on the help of many people. D. Heap is
thanked for his invaluable contributions in the overall process, and in preparing the
final version, and we thank C. Colbert, S. Elby, T. Judson, M. Karabatur, B. Lim,
S. McCanne and T. Sharp for help at various stages of the manuscript.
The first author would like to acknowledge, with many thanks, the fruitful
collaborations with current and former graduate students whose research has influ-
enced this text, in particular Z. Cvetković, M. Garrett, C. Herley, J. Hong, G. Karls-
son, E. Linzer, A. Ortega, H. Radha, K. Ramchandran, I. Shah, N.T. Thao and
K.M. Uz. The early guidance by H.J. Nussbaumer, and the support of M. Kunt
and G. Moschytz is gratefully acknowledged.
The second author would like to acknowledge friends and colleagues who con-
tributed to the book, in particular C. Herley, G. Karlsson, A. Ortega and K. Ram-
chandran. Internal reviewers at Bell Labs are thanked for their efforts, in particular
A. Reibman, G. Daryanani, P. Crouch, and T. Restaino.
1

Wavelets, Filter Banks and Multiresolution Signal Processing

“It is with logic that one proves;
it is with intuition that one invents.”
— Henri Poincaré

The topic of this book is very old and very new. Fourier series, or expansion of
periodic functions in terms of harmonic sines and cosines, date back to the early
part of the 19th century when Fourier proposed harmonic trigonometric series [100].
The first wavelet (the only example for a long time!) was found by Haar early in
this century [126]. But the construction of more general wavelets to form bases
for square-integrable functions was investigated in the 1980’s, along with efficient
algorithms to compute the expansion. At the same time, applications of these
techniques in signal processing have blossomed.
While linear expansions of functions are a classic subject, the recent construc-
tions contain interesting new features. For example, wavelets allow good resolution
in time and frequency, and should thus allow one to see “the forest and the trees.”
This feature is important for nonstationary signal analysis. While Fourier basis
functions are given in closed form, many wavelets can only be obtained through a
computational procedure (and even then, only at specific rational points). While
this might seem to be a drawback, it turns out that if one is interested in imple-
menting a signal expansion on real data, then a computational procedure is better
than a closed-form expression!


The recent surge of interest in the types of expansions discussed here is due
to the convergence of ideas from several different fields, and the recognition that
techniques developed independently in these fields could be cast into a common
framework.
The name “wavelet” had been used before in the literature,1 but its current
meaning is due to J. Goupillaud, J. Morlet and A. Grossman [119, 125]. In the
context of geophysical signal processing they investigated an alternative to local
Fourier analysis based on a single prototype function, and its scales and shifts.
The modulation by complex exponentials in the Fourier transform is replaced by a
scaling operation, and the notion of scale2 replaces that of frequency. The simplicity
and elegance of the wavelet scheme was appealing and mathematicians started
studying wavelet analysis as an alternative to Fourier analysis. This led to the
discovery of wavelets which form orthonormal bases for square-integrable and other
function spaces by Meyer [194], Daubechies [71], Battle [21, 22], Lemarié [175],
and others. A formalization of such constructions by Mallat [180] and Meyer [194]
created a framework for wavelet expansions called multiresolution analysis, and
established links with methods used in other fields. Also, the wavelet construction
by Daubechies is closely connected to filter bank methods used in digital signal
processing as we shall see.
Of course, these achievements were preceded by a long-term evolution from the
1910 Haar wavelet (which, of course, was not called a wavelet back then) to work
using octave division of the Fourier spectrum (Littlewood-Paley) and results in
harmonic analysis (Calderon-Zygmund operators). Other constructions were not
recognized as leading to wavelets initially (for example, Stromberg’s work [283]).
Paralleling the advances in pure and applied mathematics were those in signal
processing, but in the context of discrete-time signals. Driven by applications such
as speech and image compression, a method called subband coding was proposed by
Croisier, Esteban, and Galand [69] using a special class of filters called quadrature
mirror filters (QMF) in the late 1970’s, and by Crochiere, Webber and Flanagan
[68]. This led to the study of perfect reconstruction filter banks, a problem solved
in the 1980’s by several people, including Smith and Barnwell [270, 271], Mintzer
[196], Vetterli [315], and Vaidyanathan [306].
In a particular configuration, namely when the filter bank has octave bands,
one obtains a discrete-time wavelet series. Such a configuration has been popular
in signal processing less for its mathematical properties than because an octave
band or logarithmic spectrum is more natural for certain applications such as audio
1 For example, for the impulse response of a layer in geophysical signal processing by Ricker [237] and for a causal finite-energy function by Robinson [248].
2 For a beautiful illustration of the notion of scale, and an argument for geometric spacing of scale in natural imagery, see [197].

compression since it emulates the hearing process. Such an octave-band filter bank
can be used, under certain conditions, to generate wavelet bases, as shown by
Daubechies [71].
In computer vision, multiresolution techniques have been used for various prob-
lems, ranging from motion estimation to object recognition [249]. Images are suc-
cessively approximated starting from a coarse version and going to a fine-resolution
version. In particular, Burt and Adelson proposed such a scheme for image coding
in the early 1980’s [41], calling it pyramid coding.3 This method turns out to be
similar to subband coding. Moreover, the successive approximation view is similar
to the multiresolution framework used in the analysis of wavelet schemes.
In computer graphics, a method called successive refinement iteratively inter-
polates curves or surfaces, and the study of such interpolators is related to wavelet
constructions from filter banks [45, 92].
Finally, many computational procedures use the concept of successive approxi-
mation, sometimes alternating between fine and coarse resolutions. The multigrid
methods used for the solution of partial differential equations [39] are an example.
While these interconnections are now clarified, this has not always been the
case. In fact, maybe one of the biggest contributions of wavelets has been to bring
people from different fields together, and from that cross fertilization and exchange
of ideas and methods, progress has been achieved in various fields.
In what follows, we will take mostly a signal processing point of view of the
subject. Also, most applications discussed later are from signal processing.

1.1 Series Expansions of Signals


We are considering linear expansions of signals or functions. That is, given any
signal x from some space S, where S can be finite-dimensional (for example, R^n,
C^n) or infinite-dimensional (for example, l^2(Z), L^2(R)), we want to find a set
of elementary signals {ϕi}i∈Z for that space so that we can write x as a linear
combination
\[
x = \sum_{i} \alpha_i \varphi_i . \tag{1.1.1}
\]

The set {ϕi} is complete for the space S if all signals x ∈ S can be expanded as in
(1.1.1). In that case, there will also exist a dual set {ϕ̃i}i∈Z such that the expansion
coefficients in (1.1.1) can be computed as

\[
\alpha_i = \sum_{n} \tilde{\varphi}_i[n] \, x[n],
\]
3 The importance of the pyramid algorithm was not immediately recognized. One of the reviewers of the original Burt and Adelson paper said, “I suspect that no one will ever use this algorithm again.”

Figure 1.1 Examples of possible sets of vectors for the expansion of R^2. (a)
Orthonormal case. (b) Biorthogonal case. (c) Overcomplete case.

when x and ϕ̃i are real discrete-time sequences, and
\[
\alpha_i = \int \tilde{\varphi}_i(t) \, x(t) \, dt,
\]

when they are real continuous-time functions. The above expressions are the inner
products of the ϕ̃i's with the signal x, denoted by ⟨ϕ̃i, x⟩. An important particular
case is when the set {ϕi} is orthonormal and complete, since then we have an
orthonormal basis for S and the basis and its dual are the same, that is, ϕi = ϕ̃i.
Then
\[
\langle \varphi_i, \varphi_j \rangle = \delta[i - j],
\]
where δ[i] equals 1 if i = 0, and 0 otherwise. If the set is complete and the vectors
ϕi are linearly independent but not orthonormal, then we have a biorthogonal basis,
and the basis and its dual satisfy
\[
\langle \varphi_i, \tilde{\varphi}_j \rangle = \delta[i - j].
\]
If the set is complete but redundant (the ϕi's are not linearly independent), then we
do not have a basis but an overcomplete representation called a frame. To illustrate
these concepts, consider the following example.

Example 1.1 Set of Vectors for the Plane


We show in Figure 1.1 some possible sets of vectors for the expansion of the plane, or R^2.
The standard Euclidean basis is given by e0 and e1. In part (a), an orthonormal basis is
given by ϕ0 = [1, 1]^T/√2 and ϕ1 = [1, −1]^T/√2. The dual basis is identical, or ϕ̃i = ϕi. In
part (b), a biorthogonal basis is given, with ϕ0 = e0 and ϕ1 = [1, 1]^T. The dual basis is now
ϕ̃0 = [1, −1]^T and ϕ̃1 = [0, 1]^T. Finally, in part (c), an overcomplete set is given, namely
ϕ0 = [1, 0]^T, ϕ1 = [−1/2, √3/2]^T and ϕ2 = [−1/2, −√3/2]^T. Then, it can be verified that
a possible reconstruction basis is identical (up to a scale factor), namely, ϕ̃i = (2/3)ϕi (the
reconstruction basis is not unique). This set behaves as an orthonormal basis, even though
the vectors are linearly dependent.
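These reconstructions are easy to verify numerically; the following is a minimal sketch (ours, assuming numpy), checking that each set above, paired with its dual, recovers an arbitrary x ∈ R^2 via x = Σi ⟨ϕ̃i, x⟩ϕi.

import numpy as np

x = np.array([0.3, -1.2])                      # an arbitrary test vector

# (a) Orthonormal basis: the dual is the basis itself.
ortho = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# (b) Biorthogonal basis and its dual: <phi_i, dual_j> = delta[i - j].
biorth = np.array([[1.0, 0.0], [1.0, 1.0]])
biorth_dual = np.array([[1.0, -1.0], [0.0, 1.0]])

# (c) Overcomplete set (a tight frame): three unit vectors 120 degrees apart,
#     with (2/3) phi_i as one possible reconstruction set.
s3 = np.sqrt(3.0)
frame = np.array([[1.0, 0.0], [-0.5, s3 / 2], [-0.5, -s3 / 2]])

for vecs, duals in [(ortho, ortho), (biorth, biorth_dual), (frame, 2 / 3 * frame)]:
    alpha = duals @ x                          # expansion coefficients <dual_i, x>
    assert np.allclose(x, alpha @ vecs)        # x is recovered exactly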

The representation in (1.1.1) is a change of basis, or, conceptually, a change
of point of view. The obvious question is, what is a good basis {ϕi} for S? The
answer depends on the class of signals we want to represent, and on the choice
of a criterion for quality. However, in general, a good basis is one that allows
compact representation or less complex processing. For example, the Karhunen-
Loève transform concentrates as much energy in as few coefficients as possible, and
is thus good for compression, while, for the implementation of convolution, the
Fourier basis is computationally more efficient than the standard basis.
We will be interested mostly in expansions with some structure, that is, expan-
sions where the various basis vectors are related to each other by some elementary
operations such as shifting in time, scaling, and modulation (which is shifting in
frequency). Because we are concerned with expansions for very high-dimensional
spaces (possibly infinite), bases without such structure are useless for complexity
reasons.
Historically, the Fourier series for periodic signals is the first example of a signal
expansion. The basis functions are harmonic sines and cosines. Is this a good set
of basis functions for signal processing? Besides its obvious limitation to periodic
signals, it has very useful properties, such as the convolution property which comes
from the fact that the basis functions are eigenfunctions of linear time-invariant
systems. The extension of the scheme to nonperiodic signals,4 by segmentation and
piecewise Fourier series expansion of each segment, suffers from artificial boundary
effects and poor convergence at these boundaries (due to the Gibbs phenomenon).
An attempt to create local Fourier bases is the Gabor transform or short-time
Fourier transform (STFT). A smooth window is applied to the signal centered
around t = nT0 (where T0 is some basic time step), and a Fourier expansion is
applied to the windowed signal. This leads to a time-frequency representation since
we get an approximate information about the frequency content of the signal around
the location nT0 . Usually, frequency points spaced 2π/T0 apart are used and we
get a sampling of the time-frequency plane on a rectangular grid. The spectrogram
is related to such a time-frequency analysis. Note that the functions used in the
expansion are related to each other by shift in time and modulation, and that we
obtain a linear frequency analysis. While the STFT has proven useful in signal
analysis, there are no good orthonormal bases based on this construction. Also,
a logarithmic frequency scale, or constant relative bandwidth, is often preferable
to the linear frequency scale obtained with the STFT. For example, the human
auditory system uses constant relative bandwidth channels (critical bands), and
therefore, audio compression systems use a similar decomposition.

4 The Fourier transform of nonperiodic signals is also possible. It is an integral transform rather than a series expansion and lacks any time locality.
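This rectangular sampling of the time-frequency plane is easy to write down; the following is a minimal STFT sketch (ours, assuming numpy), with non-overlapping Hann windows of length T0 so that the frequency points come out spaced 2π/T0 apart.

import numpy as np

def stft(x, T0):
    # windowed DFTs on a rectangular time-frequency grid (hop size = T0)
    w = np.hanning(T0)
    hops = range(0, len(x) - T0 + 1, T0)
    return np.array([np.fft.fft(w * x[n:n + T0]) for n in hops])

X = stft(np.random.randn(1024), T0=64)         # rows: time, columns: frequency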


Figure 1.2 Musical notation and orthonormal wavelet bases. (a) The western
musical notation uses a logarithmic frequency scale with twelve halftones per
octave. In this example, notes are chosen as in an orthonormal wavelet basis,
with long low-pitched notes, and short high-pitched ones. (b) Corresponding
time-domain functions.

A popular alternative to the STFT is the wavelet transform. Using scales and
shifts of a prototype wavelet, a linear expansion of a signal is obtained. Because the
scales used are powers of an elementary scale factor (typically 2), the analysis uses
a constant relative bandwidth (or, the frequency axis is logarithmic). The sampling
of the time-frequency plane is now very different from the rectangular grid used in
the STFT. Lower frequencies, where the bandwidth is narrow (that is, the basis
functions are stretched in time) are sampled with a large time step, while high
frequencies (which correspond to short basis functions) are sampled more often. In
Figure 1.2, we give an intuitive illustration of this time-frequency trade-off, and
relate it to musical notation which also uses a logarithmic frequency scale.5 What
is particularly interesting is that such a wavelet scheme allows good orthonormal
bases whereas the STFT does not.
In the discussions above, we implicitly assumed continuous-time signals. Of
course there are discrete-time equivalents to all these results. A local analysis
can be achieved using a block transform, where the sequence is segmented into
adjacent blocks of N samples, and each block is individually transformed. As is to be
expected, such a scheme is plagued by boundary effects, also called blocking effects.
A more general expansion relies on filter banks, and can achieve both STFT-like
analysis (rectangular sampling of the time-frequency plane) or wavelet-like analysis
(constant relative bandwidth in frequency). Discrete-time expansions based on
filter banks are not arbitrary, rather they are structured expansions. Again, for

5 This is the standard western musical notation, based on J.S. Bach's “Well-Tempered Clavier”. Thus one could argue that wavelets were actually invented by J.S. Bach!

complexity reasons, it is useful to impose such a structure on the basis chosen
for the expansion. For example, filter banks correspond to basis sequences which
satisfy a block shift invariance property. Sometimes, a modulation constraint can
also be added, in particular in STFT-like discrete-time bases. Because we are in
discrete time, scaling cannot be done exactly (unlike in continuous time), but an
approximate scaling property between basis functions holds for the discrete-time
wavelet series.
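As a concrete instance of such structured discrete-time expansions, the following is a minimal sketch (ours, assuming numpy) of the simplest orthonormal two-channel filter bank, built on the Haar filters studied in Chapter 3; iterating the analysis step on the lowpass output yields the octave-band tree of the discrete-time wavelet series.

import numpy as np

def haar_analysis(x):
    # split x (even length) into half-rate lowpass and highpass subbands
    x = np.asarray(x, dtype=float)
    low = (x[0::2] + x[1::2]) / np.sqrt(2)
    high = (x[0::2] - x[1::2]) / np.sqrt(2)
    return low, high

def haar_synthesis(low, high):
    # invert haar_analysis exactly
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x

x = np.random.randn(16)
low, high = haar_analysis(x)
assert np.allclose(x, haar_synthesis(low, high))   # perfect reconstruction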
Interestingly, the relationship between continuous- and discrete-time bases runs
deeper than just these conceptual similarities. One of the most interesting con-
structions of wavelets is the one by Daubechies [71]. It relies on the iteration
of a discrete-time filter bank so that, under certain conditions, it converges to a
continuous-time wavelet basis. Furthermore, the multiresolution framework used
in the analysis of wavelet decompositions automatically associates a discrete-time
perfect reconstruction filter bank to any wavelet decomposition. Finally, the wave-
let series decomposition can be computed with a filter bank algorithm. Therefore,
especially in the wavelet type of a signal expansion, there is a very close interaction
between discrete and continuous time.
It is to be noted that we have focused on STFT and wavelet type of expansions
mainly because they are now quite standard. However, there are many alternatives,
for example the wavelet packet expansion introduced by Coifman and coworkers
[62, 64], and generalizations thereof. The main ingredients remain the same: they
are structured bases in discrete or continuous time, and they permit different time
versus frequency resolution trade-offs. An easy way to interpret such expansions
is in terms of their time-frequency tiling: each basis function has a region in the
time-frequency plane where most of its energy is concentrated. Then, given a basis
and the expansion coefficients of a signal, one can draw a tiling where the shading
corresponds to the value of the expansion coefficient.6

Example 1.2 Different Time-Frequency Tilings


Figure 1.3 shows schematically different possible expansions of a very simple discrete-time
signal, namely a sine wave plus an impulse (see part (a)). It would be desirable to have
an expansion that captures both the isolated impulse (or Dirac in time) and the isolated
frequency component (or Dirac in frequency). The first two expansions, namely the identity
transform in part (b) and the discrete-time Fourier series7 in part (c), isolate the time and
frequency impulse, respectively, but not both. The local discrete-time Fourier series in part
(d) achieves a compromise, by locating both impulses to a certain degree. The discrete-time
wavelet series in part (e) achieves better localization of the time-domain impulse, without
sacrificing too much of the frequency localization. However, a high-frequency sinusoid would
not be well localized. This simple example indicates some of the trade-offs involved.
6 Such tiling diagrams were used by Gabor [102], and he called an elementary tile a “logon.”
7 Discrete-time series expansions are often called discrete-time transforms, both in the Fourier and in the wavelet case.
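A rough numerical rendering of this example (our own construction, reusing haar_analysis from the sketch above) compares how compactly each expansion represents the sine-plus-impulse signal.

import numpy as np

N = 64
x = np.sin(2 * np.pi * 4 * np.arange(N) / N)
x[19] += 4.0                                   # the time-domain impulse at t0

fourier = np.fft.fft(x) / np.sqrt(N)           # orthonormal Fourier coefficients

low, bands = x.copy(), []
for _ in range(3):                             # three-level octave-band Haar tree
    low, high = haar_analysis(low)
    bands.append(high)
wavelet = np.concatenate(bands + [low])

def n95(c):
    # how many largest-magnitude coefficients carry 95% of the energy
    e = np.sort(np.abs(c)) ** 2
    return int(np.sum(np.cumsum(e) > 0.05 * e.sum()))

print(n95(x), n95(fourier), n95(wavelet))      # identity vs. Fourier vs. wavelet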


Figure 1.3 Time-frequency tilings for a simple discrete-time signal [130]. (a)
Sine wave plus impulse. (b) Expansion onto the identity basis. (c) Discrete-
time Fourier series. (d) Local discrete-time Fourier series. (e) Discrete-time
wavelet series.

Note that the local Fourier transform and the wavelet transform can be used
for signal analysis purposes. In that case, the goal is not to obtain orthonormal
bases, but rather to characterize the signal from the transform. The local Fourier
transform retains many of the characteristics of the usual Fourier transform with a
localization given by the window function, which is thus constant at all frequencies
(this phenomenon can be seen already in Figure 1.3(d)). The wavelet, on the
other hand, acts as a microscope, focusing on smaller time phenomena as the
scale becomes small (see Figure 1.3(e) to see how the impulse gets better localized
at high frequencies). This behavior permits a local characterization of functions,
which the Fourier transform does not.8

1.2 Multiresolution Concept


A slightly different expansion is obtained with multiresolution pyramids since the
expansion is actually redundant (the number of samples in the expansion is big-
ger than in the original signal). However, conceptually, it is intimately related to
subband and wavelet decompositions. The basic idea is successive approximation.
A signal is written as a coarse approximation (typically a lowpass, subsampled
version) plus a prediction error which is the difference between the original signal
and a prediction based on the coarse version. Reconstruction is immediate: simply
add back the prediction to the prediction error. The scheme can be iterated on the
coarse version. It can be shown that if the lowpass filter meets certain constraints of
orthogonality, then this scheme is identical to an oversampled discrete-time wavelet
series. Otherwise, the successive approximation approach is still at least concep-
tually identical to the wavelet decomposition since it performs a multiresolution
analysis of the signal.
A schematic diagram of a pyramid decomposition, with attached resulting im-
ages, is shown in Figure 1.4. After the encoding, we have a coarse resolution image
of half size, as well as an error image of full size (thus the redundancy). For appli-
cations, the decomposition into a coarse resolution which gives an approximate but
adequate version of the full image, plus a difference or detail image, is conceptually
very important.
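In one dimension and for a single level, the scheme of Figure 1.4 takes only a few lines; the sketch below is ours, with pairwise averaging standing in for D and sample repetition for I (any decimation/interpolation pair would do, at different residual energies).

import numpy as np

def D(x):
    # decimation: average pairs, producing the half-length coarse version
    return 0.5 * (x[0::2] + x[1::2])

def I(c):
    # interpolation: predict the full-length signal from the coarse version
    return np.repeat(c, 2)

x = np.random.randn(8)
coarse = D(x)
residual = x - I(coarse)                      # encoder output: coarse + residual
assert np.allclose(x, I(coarse) + residual)   # reconstruction is immediate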

Example 1.3 Multiresolution Image Database


Let us consider the following practical problem: Users want to access and retrieve electronic
images from an image database using a computer network with limited bandwidth. Because
the users have an approximate idea of which image they want, they will first browse through
some images before settling on a target image [214]. Given the limited bandwidth, browsing
is best done on coarse versions of the images which can be transmitted faster. Once an image
is chosen, the residual can be sent. Thus, the scheme shown in Figure 1.4 can be used, where
the coarse and residual images are further compressed to diminish the transmission time.

The above example is just one among many schemes where multiresolution de-
compositions are useful in communications problems. Others include transmission
8 For example, in [137], this mathematical microscope is used to analyze a famous lacunary Fourier series that was proposed over a century ago.
Figure 1.4 Pyramid decomposition of an image where encoding is shown on the
left and decoding is shown on the right. The operators D and I correspond to
decimation and interpolation operators, respectively. For example, D produces
an N/2 × N/2 image from an N × N original, while I interpolates an N × N
image based on an N/2 × N/2 original.

over error-prone channels, where the coarse resolution can be better protected to
guarantee some minimum level of quality.
Multiresolution decompositions are also important for computer vision tasks
such as image segmentation or object recognition: the task is performed in a suc-
cessive approximation manner, starting on the coarse version and then using this
result as an initial guess for the full task. However, this is a greedy approach which
is sometimes suboptimal. Figure 1.5 shows a famous counter-example, where a
multiresolution approach would be seriously misleading . . .
Interestingly, the multiresolution concept, besides being intuitive and useful in
practice, forms the basis of a mathematical framework for wavelets [181, 194]. As
in the pyramid example shown in Figure 1.4, one can decompose a function into a
coarse version plus a residual, and then iterate this to infinity. If properly done,
this can be used to analyze wavelet schemes and derive wavelet bases.

1.3 Overview of the Book


We start with a review of fundamentals in Chapter 2. This chapter should make
the book as self-contained as possible. It reviews Hilbert spaces at an elementary
but sufficient level, linear algebra (including matrix polynomials) and Fourier the-

Figure 1.5 Counter-example to multiresolution technique. The coarse approx-
imation is unrelated to the full-resolution image (Comet Photo AG).

ory, with material on sampling and discrete-time Fourier transforms in particular.
The review of continuous-time and discrete-time signal processing is followed by
a discussion of multirate signal processing, which is a topic central to later chap-
ters. Finally, a short introduction to time-frequency distributions discusses the
local Fourier transform and the wavelet transform, and shows the uncertainty prin-
ciple. The appendix gives factorizations of unitary matrices, and reviews results on
convergence and regularity of functions.
Chapter 3 focuses on discrete-time bases and filter banks. This topic is impor-
tant for several later chapters as well as for applications. We start with two simple

expansions which will reappear throughout the book as a recurring theme: the Haar
and the sinc bases. They are limit cases of orthonormal expansions with good time
localization (Haar) and good frequency localization (sinc). This naturally leads to
an in-depth study of two-channel filter banks, including analytical tools for their
analysis as well as design methods. The construction of orthonormal and linear
phase filter banks is described. Multichannel filter banks are developed next, first
through tree structures and then in the general case. Modulated filter banks, cor-
responding conceptually to a discrete-time local Fourier analysis, are addressed as
well. Next, pyramid schemes and overcomplete representations are explored. Such
schemes, while not critically sampled, have some other attractive features, such
as time invariance. Then, the multidimensional case is discussed both for simple
separable systems, as well as for general nonseparable ones. The latter systems
involve lattice sampling which is detailed in an appendix. Finally, filter banks for
telecommunications, namely transmultiplexers and adaptive subband filtering, are
presented briefly. The appendix details factorizations of orthonormal filter banks
(corresponding to paraunitary matrices).
Chapter 4 is devoted to the construction of bases for continuous-time signals,
in particular wavelets and local cosine bases. Again, the Haar and sinc cases play
illustrative roles as extremes of wavelet constructions. After an introduction to
series expansions, we develop multiresolution analysis as a framework for wavelet
constructions. This naturally leads to the classic wavelets of Meyer and Battle-
Lemarié or Stromberg. These are based on Fourier-domain analysis. This is followed
by Daubechies’ construction of wavelets from iterated filter banks. This is a time-
domain construction based on the iteration of a multirate filter. Study of the
iteration leads to the notion of regularity of the discrete-time filter. Then, the
wavelet series expansion is considered both in terms of properties and computation
of the expansion coefficients. Some generalizations of wavelet constructions are
considered next, first in one dimension (including biorthogonal and multichannel
wavelets) and then in multiple dimensions, where nonseparable wavelets are shown.
Finally, local cosine bases are derived and they can be seen as a real-valued local
Fourier transform.
Chapter 5 is concerned with continuous wavelet and Fourier transforms. Unlike
the series expansions in Chapters 3 and 4, these are very redundant representa-
tions useful for signal analysis. Both transforms are analyzed, inverses are derived,
and their main properties are given. These transforms can be sampled, that is,
scale/frequency and time shift can be discretized. This leads to redundant series
representations called frames. In particular, reconstruction or inversion is discussed,
and the case of wavelet and local Fourier frames is considered in some detail.
Chapter 6 treats algorithmic and computational aspects of series expansions.
First, a review of classic fast algorithms for signal processing is given since they
form the ingredients used in subsequent algorithms. The key role of the fast Fourier
transform (FFT) is pointed out. The complexity of computing filter banks, that is,
discrete-time expansions, is studied in detail. Important cases include the discrete-
time wavelet series or transform and modulated filter banks. The latter corresponds
to a local discrete-time Fourier series or transform, and uses FFT’s for efficient com-
putation. These filter bank algorithms have direct applications in the computation
of wavelet series. Overcomplete expansions are considered next, in particular for
the computation of a sampled continuous wavelet transform. The chapter concludes
with a discussion of special topics related to efficient convolution algorithms and
also application of wavelet ideas to numerical algorithms.
The last chapter is devoted to one of the main applications of wavelets and
filter banks in signal processing, namely signal compression. The technique is often
called subband coding because signals are considered in spectral bands for com-
pression purposes. First comes a review of transform based compression, including
quantization and entropy coding. Then follow specific discussions of one-, two- and
three-dimensional signal compression methods based on transforms. Speech and
audio compression, where subband coding was first invented, is discussed. The
success of subband coding in current audio coding algorithms is shown on spe-
cific examples such as the MUSICAM standard. A thorough discussion of image
compression follows. While current standards such as JPEG are block transform
based, some innovative subband or wavelet schemes are very promising and are
described in detail. Video compression is considered next. Besides expansions,
motion estimation/compensation methods play a key role and are discussed. The
multiresolution feature inherent in pyramid and subband coding is pointed out as
an attractive feature for video compression, just as it is for image coding. The final
section discusses the interaction of source coding, particularly the multiresolution
type, and channel coding or transmission. This joint source-channel coding is key
to new applications of image and video compression, as in transmission over packet
networks. An appendix gives a brief review of statistical signal processing which
underlies coding methods.
2

Fundamentals of Signal Decompositions

“A journey of a thousand miles
must begin with a single step.”
— Lao-Tzu, Tao Te Ching

The mathematical framework necessary for our later developments is established
in this chapter. While we review standard material, we also cover the broad spec-
trum from Hilbert spaces and Fourier theory to signal processing and time-frequency
distributions. Furthermore, the review is done from the point of view of the chap-
ters to come, namely, signal expansions. This chapter attempts to make the book
as self-contained as possible.
We tried to keep the level of formalism reasonable, and refer to standard texts for
many proofs. While this chapter may seem dry, basic mathematics is the foundation
on which the rest of the concepts are built, and therefore, some solid groundwork
is justified.
After defining notations, we discuss Hilbert spaces. In their finite-dimensional
form, Hilbert spaces are familiar to everyone. Their infinite-dimensional counter-
parts, in particular L2 (R) and l2 (Z), are derived, since they are fundamental to
signal processing in general and to our developments in particular. Linear opera-
tors on Hilbert spaces and (in finite dimensions) linear algebra are discussed briefly.
The key ideas of orthonormal bases, orthogonal projection and best approximation
are detailed, as well as general bases and overcomplete expansions, or, frames.
We then turn to a review of Fourier theory which starts with the Fourier trans-
form and series. The expansion of bandlimited signals and sampling naturally lead
to the discrete-time Fourier transform and series.


Next comes a brief review of continuous-time and discrete-time signal process-
ing, followed by a discussion of multirate discrete-time signal processing. It should
be emphasized that this last topic is central to the rest of the book, but not often
treated in standard signal processing books.
Finally, we review time-frequency representations, in particular short-time Fourier
or Gabor expansions as well as the newer wavelet expansion. We also discuss the
uncertainty relation, which is a fundamental limit in linear time-frequency repre-
sentations. A bilinear expansion, the Wigner-Ville transform, is also introduced.

2.1 Notations
Let C, R, Z and N denote the sets of complex, real, integer and natural numbers,
respectively. Then, C^n and R^n will be the sets of all n-tuples (x_1, . . . , x_n) of
complex and real numbers, respectively.
The superscript ∗ denotes complex conjugation, or, (a + jb)∗ = (a − jb), where
the symbol j is used for the square root of −1 and a, b ∈ R. The subscript ∗ is used
to denote complex conjugation of the constants but not the complex variable, for
example, (az)∗ = a∗ z where z is a complex variable. The superscript T denotes the
transposition of a vector or a matrix, while the superscript ∗ on a vector or matrix
denotes hermitian transpose, or transposition and complex conjugation. Re(z) and
Im(z) denote the real and imaginary parts of the complex number z.
We define the Nth root of unity as $W_N = e^{-j2\pi/N}$. It satisfies the following:

$$W_N^N = 1, \tag{2.1.1}$$

$$W_N^{kN+i} = W_N^i, \quad \text{with } k, i \text{ in } \mathbb{Z}, \tag{2.1.2}$$

$$\sum_{k=0}^{N-1} W_N^{k\cdot n} = \begin{cases} N & n = lN,\ l \in \mathbb{Z},\\ 0 & \text{otherwise.} \end{cases} \tag{2.1.3}$$

The last relation is often referred to as the orthogonality of the roots of unity.
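As a quick numerical illustration, the orthogonality of the roots of unity (2.1.3) can be checked directly; the following Python sketch (the value N = 8 and the variable names are arbitrary illustrative choices) verifies it for a range of n:

```python
import numpy as np

# Check (2.1.3): sum_{k=0}^{N-1} W_N^{kn} equals N when n is a multiple
# of N, and 0 otherwise.
N = 8
W = np.exp(-2j * np.pi / N)          # W_N, the Nth root of unity

for n in range(-N, 2 * N):
    s = sum(W ** (k * n) for k in range(N))
    expected = N if n % N == 0 else 0
    assert np.isclose(s, expected), (n, s)
```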
Often we deal with functions of a continuous variable, and a related sequence
indexed by an integer (typically, the latter is a sampled version of the former). To
avoid confusion, and in keeping with the tradition of the signal processing litera-
ture [211], we use parentheses around a continuous variable and brackets around a
discrete one, for example, f (t) and x[n], where

x[n] = f (nT ), n ∈ Z, T ∈ R.

In particular, δ(t) and δ[n] denote continuous-time and discrete-time Dirac func-
tions, which are very different indeed. The former is a generalized function (see
Section 2.4.4) while the latter is the sequence which is 1 for n = 0 and 0 otherwise
(the Dirac functions are also called delta or impulse functions).

In discrete-time signal processing, we will often encounter 2π-periodic functions (namely, discrete-time Fourier transforms of sequences, see Section 2.4.6), and we will write, for example, H(e^{jω}) to make the periodicity explicit.
2.2 Hilbert Spaces
Finite-dimensional vector spaces, as studied in linear algebra [106, 280], involve
vectors over R or C that are of finite dimension n. Such spaces are denoted by Rn
and C n , respectively. Given a set of vectors, {vk }, in Rn or C n , important questions
include:

(a) Does the set {vk } span the space Rn or C n , that is, can every vector in Rn or
C n be written as a linear combination of vectors from {vk }?

(b) Are the vectors linearly independent, that is, is it true that no vector from
{vk } can be written as a linear combination of the others?

(c) How can we find bases for the space to be spanned, in particular, orthonormal
bases?

(d) Given a subspace of R^n or C^n and a general vector, how can we find an approximation in the least-squares sense (see below) that lies in the subspace?

Two key notions used in addressing these questions include:

(a) The length, or norm,¹ of a vector (we take R^n as an example),
$$\|x\| = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.$$

(b) The orthogonality of a vector with respect to another vector (or set of vectors), for example,
$$\langle x, y \rangle = 0,$$
with an appropriately defined scalar product,
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.$$

So far, we relied on the fact that the spaces were finite-dimensional. Now, the idea
is to generalize our familiar notion of a vector space to infinite dimensions. It is
¹ Unless otherwise specified, we will assume a squared norm.
necessary to restrict the vectors to have finite length or norm (even though they
are infinite-dimensional). This leads naturally to Hilbert spaces. For example, the
space of square-summable sequences, denoted by l2 (Z), is the vector space “C ∞ ”
with a norm constraint. An example of a set of vectors spanning l2 (Z) is the set
{δ[n − k]}, k ∈ Z. A further extension with respect to linear algebra is that vectors
can be generalized from n-tuples of real or complex values to include functions of
a continuous variable. The notions of norm and orthogonality can be extended to
functions using a suitable inner product between functions, which are thus viewed
as vectors. A classic example of such orthogonal vectors is the set of harmonic sine
and cosine functions, sin(nt) and cos(nt), n = 0, 1, . . . , on the interval [−π, π].
The classic questions from linear algebra apply here as well. In particular, the
question of completeness, that is, whether the span of the set of vectors {vk } covers
the whole space, becomes more involved than in the finite-dimensional case. The
norm plays a central role, since any vector in the space must be expressed by a
linear combination of vk ’s such that the norm of the difference between the vector
and the linear combination of vk ’s is zero. For l2 (Z), {δ[n − k]}, k ∈ Z, constitute
a complete set which is actually an orthonormal basis. For the space of square-
integrable functions over the interval [−π, π], denoted by L2 ([−π, π]), the harmonic
sines and cosines are complete since they form the basis used in the Fourier series
expansion.
If only a subset of the complete set of vectors {vk } is used, one is interested in
the best approximation of a general element of the space by an element from the
subspace spanned by the vectors in the subset. This question has a particularly
easy answer when the set {vk } is orthonormal and the goal is least-squares approx-
imation (that is, the norm of the difference is minimized). Because the geometry
of Hilbert spaces is similar to Euclidean geometry, the solution is the orthogonal
projection onto the approximation subspace, since this minimizes the distance or
approximation error.
In the following, we formally introduce vector spaces and in particular Hilbert
spaces. We discuss orthogonal and general bases and their properties. We often use
the finite-dimensional case for intuition and examples. The treatment is not very
detailed, but sufficient for the remainder of the book. For a thorough treatment,
we refer the reader to [113].
2.2.1 Vector Spaces and Inner Products
Let us start with a formal definition of a vector space.
DEFINITION 2.1
A vector space over the set of complex or real numbers, C or R, is a set of
vectors, E, together with addition and scalar multiplication, which, for general
x, y in E, and α, β in C or R, satisfy the following:

(a) Commutativity: x + y = y + x.

(b) Associativity: (x + y) + z = x + (y + z), (αβ)x = α(βx).

(c) Distributivity: α(x + y) = αx + αy, (α + β)x = αx + βx.

(d) Additive identity: there exists 0 in E, such that x + 0 = x, for all x in E.

(e) Additive inverse: for all x in E, there exists a (−x) in E, such that
x + (−x) = 0.

(f) Multiplicative identity: 1 · x = x for all x in E.

Often, x, y in E will be n-tuples or sequences, and then we define

x + y = (x1 , x2 , . . .) + (y1 , y2 , . . .) = (x1 + y1 , x2 + y2 , . . .)

αx = α(x1 , x2 , . . .) = (αx1 , αx2 , . . .).
While the scalars are from C or R, the vectors can be arbitrary, and apart from
n-tuples and infinite sequences, we could also take functions over the real line.
A subset M of E is a subspace of E if

(a) For all x and y in M , x + y is in M .

(b) For all x in M and α in C or R, αx is in M .

Given S ⊂ E, the span of S is the subspace of E consisting of all linear combinations of vectors in S, for example, in finite dimensions,
$$\mathrm{span}(S) = \left\{ \sum_{i=1}^{n} \alpha_i x_i \;\middle|\; \alpha_i \in \mathbb{C} \text{ or } \mathbb{R},\ x_i \in S \right\}.$$
Vectors $x_1, \ldots, x_n$ are called linearly independent if $\sum_{i=1}^{n} \alpha_i x_i = 0$ is true only if $\alpha_i = 0$ for all i. Otherwise, these vectors are linearly dependent. If there are infinitely many vectors $x_1, x_2, \ldots$, they are linearly independent if for each k, $x_1, x_2, \ldots, x_k$ are linearly independent.
A subset {x1 , . . . , xn } of a vector space E is called a basis for E, when E =
span(x1 , . . . , xn ) and x1 , . . . , xn are linearly independent. Then, we say that E has
dimension n. E is infinite-dimensional if it contains an infinite linearly independent
set of vectors. As an example, the space of infinite sequences is spanned by the
20 CHAPTER 2

infinite set {δ[n − k]}k∈Z . Since they are linearly independent, the space is infinite-
dimensional.
Next, we equip the vector space with an inner product that is a complex function
fundamental for defining norms and orthogonality.
DEFINITION 2.2
An inner product on a vector space E over C (or R) is a complex-valued function ⟨·, ·⟩, defined on E × E, with the following properties:

(a) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩.

(b) ⟨x, αy⟩ = α⟨x, y⟩.

(c) ⟨x, y⟩* = ⟨y, x⟩.

(d) ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 if and only if x ≡ 0.

Note that (b) and (c) imply ⟨αx, y⟩ = α*⟨x, y⟩. From (a) and (b), it is clear that the inner product is linear. Note that we choose the definition of the inner product which takes the complex conjugate of the first vector (this follows from (b)). For illustration, the standard inner products for complex-valued functions over R and sequences over Z are
$$\langle f, g \rangle = \int_{-\infty}^{\infty} f^*(t)\, g(t)\, dt,$$
and
$$\langle x, y \rangle = \sum_{n=-\infty}^{\infty} x^*[n]\, y[n],$$
respectively (if they exist). The norm of a vector is defined from the inner product as
$$\|x\| = \sqrt{\langle x, x \rangle}, \tag{2.2.1}$$
and the distance between two vectors x and y is simply the norm of their difference, ‖x − y‖. Note that other norms can be defined (see (2.2.16)), but since we will only use the usual Euclidean or square norm as defined in (2.2.1), we use the symbol ‖·‖ without a particular subscript.
The following hold for inner products over a vector space:

(a) Cauchy-Schwarz inequality:
$$|\langle x, y \rangle| \le \|x\|\, \|y\|, \tag{2.2.2}$$
with equality if and only if x = αy.

(b) Triangle inequality:
$$\|x + y\| \le \|x\| + \|y\|,$$
with equality if and only if x = αy, where α is a positive real constant.

(c) Parallelogram law:
$$\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2).$$
Finally, the inner product can be used to define orthogonality of two vectors x and y; that is, vectors x and y are orthogonal if and only if
$$\langle x, y \rangle = 0.$$
If two vectors are orthogonal, which is denoted by x ⊥ y, then they satisfy the Pythagorean theorem,
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2,$$
since ‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + ⟨x, y⟩ + ⟨y, x⟩ + ‖y‖².
A vector x is said to be orthogonal to a set of vectors S = {y_i} if ⟨x, y_i⟩ = 0 for all i. We denote this by x ⊥ S. More generally, two subspaces S_1 and S_2 are called orthogonal if all vectors in S_1 are orthogonal to all of the vectors in S_2, and this is written S_1 ⊥ S_2. A set of vectors {x_1, x_2, . . .} is called orthogonal if x_i ⊥ x_j when i ≠ j. If the vectors are normalized to have unit norm, we have an orthonormal system, which therefore satisfies
$$\langle x_i, x_j \rangle = \delta[i - j].$$
Vectors in an orthonormal system are linearly independent, since $\sum_i \alpha_i x_i = 0$ implies $0 = \langle x_j, \sum_i \alpha_i x_i \rangle = \sum_i \alpha_i \langle x_j, x_i \rangle = \alpha_j$. An orthonormal system in a vector space E is an orthonormal basis if it spans E.

2.2.2 Complete Inner Product Spaces
A vector space equipped with an inner product is called an inner product space. One more notion is needed in order to obtain a Hilbert space: completeness. To this end, we consider sequences of vectors {x_n} in E, which are said to converge to x in E if ‖x_n − x‖ → 0 as n → ∞. A sequence of vectors {x_n} is called a Cauchy sequence if ‖x_n − x_m‖ → 0 when n, m → ∞. If every Cauchy sequence in E converges to a vector in E, then E is called complete. This leads to the following definition:
DEFINITION 2.3
A complete inner product space is called a Hilbert space.
We are particularly interested in those Hilbert spaces which are separable because a
Hilbert space contains a countable orthonormal basis if and only if it is separable.
Since all Hilbert spaces with which we are going to deal are separable, we implicitly
assume that this property is satisfied (refer to [113] for details on separability).
Note that a closed subspace of a separable Hilbert space is separable, that is, it also
contains a countable orthonormal basis.
Given a Hilbert space E and a subspace S, we call the orthogonal complement
of S in E, denoted S ⊥ , the set {x ∈ E | x ⊥ S}. Assume further that S is closed,
that is, it contains all limits of sequences of vectors in S. Then, given a vector y in
E, there exists a unique v in S and a unique w in S ⊥ such that y = v + w. We can
thus write
E = S ⊕ S⊥,
or, E is the direct sum of the subspace and its orthogonal complement.
Let us consider a few examples of Hilbert spaces.

Complex/Real Spaces The complex space C^n is the set of all n-tuples x = (x_1, . . . , x_n), with finite x_i in C. The inner product is defined as
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i^*\, y_i,$$
and the norm is
$$\|x\| = \sqrt{\langle x, x \rangle} = \sqrt{\sum_{i=1}^{n} |x_i|^2}.$$
The above holds for the real space R^n as well (note that then y_i^* = y_i). For example, the vectors e_i = (0, . . . , 0, 1, 0, . . . , 0), where the 1 is in the ith position, form an orthonormal basis both for R^n and C^n. Note that these are the usual spaces considered in linear algebra.

Space of Square-Summable Sequences In discrete-time signal processing we will be dealing almost exclusively with sequences x[n] having finite square sum, or finite energy,² where x[n] is, in general, complex-valued and n belongs to Z. Such a sequence x[n] is a vector in the Hilbert space l²(Z). The inner product is
$$\langle x, y \rangle = \sum_{n=-\infty}^{\infty} x^*[n]\, y[n],$$

² In physical systems, the sum or integral of a squared function often corresponds to energy.

and the norm is 


x = x, x = |x[n]|2 .


n∈Z

Thus, l2 (Z) is the space of all sequences such that x < ∞. This is obviously an
infinite-dimensional space, and a possible orthonormal basis is {δ[n − k]}k∈Z .
For the completeness of l²(Z), one has to show that if x_n[k] is a sequence of vectors in l²(Z) such that ‖x_n − x_m‖ → 0 as n, m → ∞ (that is, a Cauchy sequence), then there exists a limit x in l²(Z) such that ‖x_n − x‖ → 0. The proof can be found, for example, in [113].

Space of Square-Integrable Functions A function f(t) defined on R is said to be in the Hilbert space L²(R) if |f(t)|² is integrable,³ that is, if
$$\int_{t \in \mathbb{R}} |f(t)|^2\, dt < \infty.$$
The inner product on L²(R) is given by
$$\langle f, g \rangle = \int_{t \in \mathbb{R}} f^*(t)\, g(t)\, dt,$$
and the norm is
$$\|f\| = \sqrt{\langle f, f \rangle} = \sqrt{\int_{t \in \mathbb{R}} |f(t)|^2\, dt}.$$
This space is infinite-dimensional (for example, $e^{-t^2}$, $t\,e^{-t^2}$, $t^2 e^{-t^2}$, . . . are linearly independent).

2.2.3 Orthonormal Bases
Among all possible bases in a Hilbert space, orthonormal bases play a very impor-
tant role. We start by recalling the standard linear algebra procedure which can be
used to orthogonalize an arbitrary basis.

Gram-Schmidt Orthogonalization Given a set of linearly independent vectors {x_i} in E, we can construct an orthonormal set {y_i} with the same span as {x_i} as follows: Start with
$$y_1 = \frac{x_1}{\|x_1\|}.$$

³ Actually, |f|² has to be Lebesgue integrable.
Then, recursively set
$$y_k = \frac{x_k - v_k}{\|x_k - v_k\|}, \qquad k = 2, 3, \ldots,$$
where
$$v_k = \sum_{i=1}^{k-1} \langle y_i, x_k \rangle\, y_i.$$
As will be seen shortly, the vector vk is the orthogonal projection of xk onto the
subspace spanned by the previous orthogonalized vectors and this is subtracted
from xk , followed by normalization.
A standard example of such an orthogonalization procedure is the Legendre
polynomials over the interval [−1, 1]. Start with xk (t) = tk , k = 0, 1, . . . and apply
the Gram-Schmidt procedure to get yk (t), of degree k, norm 1 and orthogonal to
yi (t), i < k (see Problem 2.1).
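The recursion above translates directly into code. The following Python sketch (an illustration; it assumes real-valued vectors, and the three input vectors are arbitrary choices) orthonormalizes a set in R³ and verifies that ⟨y_i, y_j⟩ = δ[i − j]:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent real vectors (Section 2.2.3):
    subtract from x_k its projection v_k onto span{y_1, ..., y_{k-1}},
    then normalize."""
    ys = []
    for x in vectors:
        v = sum((y @ x) * y for y in ys)   # v_k = sum_i <y_i, x_k> y_i
        r = x - v
        ys.append(r / np.linalg.norm(r))
    return ys

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
ys = gram_schmidt(xs)
G = np.array([[yi @ yj for yj in ys] for yi in ys])
assert np.allclose(G, np.eye(3))      # <y_i, y_j> = delta[i - j]
```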

Bessel's Inequality If we have an orthonormal system of vectors {x_k} in E, then for every y in E the following inequality, known as Bessel's inequality, holds:
$$\|y\|^2 \ge \sum_k |\langle x_k, y \rangle|^2.$$
If we have an orthonormal system that is complete in E, then we have an orthonormal basis for E, and Bessel's relation becomes an equality, often called Parseval's equality (see Theorem 2.4).

Orthonormal Bases For a set of vectors S = {x_i} to be an orthonormal basis, we first have to check that the set of vectors S is orthonormal and then that it is complete, that is, that every vector from the space to be represented can be expressed as a linear combination of the vectors from S. In other words, an orthonormal system {x_i} is called an orthonormal basis for E if, for every y in E,
$$y = \sum_k \alpha_k x_k. \tag{2.2.3}$$
The coefficients α_k of the expansion are called the Fourier coefficients of y (with respect to {x_i}) and are given by
$$\alpha_k = \langle x_k, y \rangle. \tag{2.2.4}$$

This can be shown by using the continuity of the inner product (that is, if x_n → x and y_n → y, then ⟨x_n, y_n⟩ → ⟨x, y⟩) as well as the orthogonality of the x_k's. Given that y is expressed as (2.2.3), we can write
$$\langle x_k, y \rangle = \lim_{n \to \infty} \left\langle x_k, \sum_{i=0}^{n} \alpha_i x_i \right\rangle = \alpha_k,$$
where we used the linearity of the inner product.
In finite dimensions (that is, Rn or C n ), having an orthonormal set of size n
is sufficient to have an orthonormal basis. As expected, this is more delicate in
infinite dimensions (that is, it is not sufficient to have an infinite orthonormal set).
The following theorem gives several equivalent statements which permit us to check
if an orthonormal system is also a basis:
THEOREM 2.4
Given an orthonormal system {x_1, x_2, . . .} in E, the following are equivalent:

(a) The set of vectors {x_1, x_2, . . .} is an orthonormal basis for E.

(b) If ⟨x_i, y⟩ = 0 for i = 1, 2, . . ., then y = 0.

(c) span({x_i}) is dense in E, that is, every vector in E is a limit of a sequence of vectors in span({x_i}).

(d) For every y in E,
$$\|y\|^2 = \sum_i |\langle x_i, y \rangle|^2, \tag{2.2.5}$$
which is called Parseval's equality.

(e) For every y_1 and y_2 in E,
$$\langle y_1, y_2 \rangle = \sum_i \langle x_i, y_1 \rangle^* \langle x_i, y_2 \rangle, \tag{2.2.6}$$
which is often called the generalized Parseval's equality.

For a proof, see [113].

Orthogonal Projection and Least-Squares Approximation Often, a vector from a Hilbert space E has to be approximated by a vector lying in a (closed) subspace S. We assume that E is separable; thus, S contains an orthonormal basis {x_1, x_2, . . .}. Then, the orthogonal projection of y ∈ E onto S is given by
$$\hat{y} = \sum_i \langle x_i, y \rangle\, x_i.$$
Figure 2.1 Orthogonal projection onto a subspace. Here, y ∈ R³ and ŷ is its projection onto the span of {x_1, x_2}. Note that y − ŷ is orthogonal to the span of {x_1, x_2}.

Figure 2.2 Expansion in orthogonal and biorthogonal bases. (a) Orthogonal case: The successive approximation property holds. (b) Biorthogonal case: The first approximation cannot be used in the full expansion.
Note that the difference d = y − ŷ satisfies
$$d \perp S,$$
and, in particular, d ⊥ ŷ, as well as
$$\|y\|^2 = \|\hat{y}\|^2 + \|d\|^2.$$
This is shown pictorially in Figure 2.1. An important property of such an approximation is that it is best in the least-squares sense; that is,
$$\min \|y - x\|$$
for x in S is attained for $x = \sum_i \alpha_i x_i$ with
$$\alpha_i = \langle x_i, y \rangle,$$

that is, the Fourier coefficients. An immediate consequence of this result is the successive approximation property of orthogonal expansions. Call ŷ⁽ᵏ⁾ the best approximation of y on the subspace spanned by {x_1, x_2, . . . , x_k}, given by the coefficients {α_1, α_2, . . . , α_k} where α_i = ⟨x_i, y⟩. Then, the approximation ŷ⁽ᵏ⁺¹⁾ is given by
$$\hat{y}^{(k+1)} = \hat{y}^{(k)} + \langle x_{k+1}, y \rangle\, x_{k+1},$$
that is, the previous approximation plus the projection along the added vector x_{k+1}. While this is obvious, it is worth pointing out that this successive approximation property does not hold for nonorthogonal bases. When calculating the approximation ŷ⁽ᵏ⁺¹⁾, one cannot simply add one term to the previous approximation, but has to recalculate the whole approximation (see Figure 2.2). For a further discussion of projection operators, see Appendix 2.A.
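To make the projection and successive approximation properties concrete, here is a small Python sketch (an illustration; the orthonormal vectors and y are arbitrary choices in R³):

```python
import numpy as np

x1 = np.array([1.0, 0.0, 0.0])        # orthonormal vectors spanning S
x2 = np.array([0.0, 1.0, 0.0])
y = np.array([3.0, 4.0, 5.0])

y1 = (x1 @ y) * x1                    # best approximation in span{x1}
y2 = y1 + (x2 @ y) * x2               # refine: previous + new projection

d = y - y2                            # error of the projection onto S
assert np.isclose(d @ x1, 0) and np.isclose(d @ x2, 0)   # d orthogonal to S
assert np.isclose(y @ y, y2 @ y2 + d @ d)   # ||y||^2 = ||y_hat||^2 + ||d||^2
```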

2.2.4 General Bases
While orthonormal bases are very convenient, the more general case of nonorthog-
onal or biorthogonal bases is important as well. In particular, biorthogonal bases
will be constructed in Chapters 3 and 4. A system {xi , x̃i } constitutes a pair of
biorthogonal bases of a Hilbert space E if and only if [56, 73]

(a) For all i, j in Z,
$$\langle x_i, \tilde{x}_j \rangle = \delta[i - j]. \tag{2.2.7}$$

(b) There exist strictly positive constants A, B, Ã, B̃ such that, for all y in E,
$$A \|y\|^2 \le \sum_k |\langle x_k, y \rangle|^2 \le B \|y\|^2, \tag{2.2.8}$$
$$\tilde{A} \|y\|^2 \le \sum_k |\langle \tilde{x}_k, y \rangle|^2 \le \tilde{B} \|y\|^2. \tag{2.2.9}$$

Compare these inequalities with (2.2.5) in the orthonormal case. Bases which satisfy (2.2.8) or (2.2.9) are called Riesz bases [73]. Then, the signal expansion formula becomes
$$y = \sum_k \langle x_k, y \rangle\, \tilde{x}_k = \sum_k \langle \tilde{x}_k, y \rangle\, x_k. \tag{2.2.10}$$

It is clear why the term biorthogonal is used, since to the (nonorthogonal) basis
{xi } corresponds a dual basis {x̃i } which satisfies the biorthogonality constraint
(2.2.7). If the basis {xi } is orthogonal, then it is its own dual, and the expansion
formula (2.2.10) becomes the usual orthogonal expansion given by (2.2.3–2.2.4).
Equivalences similar to Theorem 2.4 hold in the biorthogonal case as well, and we give Parseval's relations, which become
$$\|y\|^2 = \sum_i \langle x_i, y \rangle^* \langle \tilde{x}_i, y \rangle, \tag{2.2.11}$$
and
$$\langle y_1, y_2 \rangle = \sum_i \langle x_i, y_1 \rangle^* \langle \tilde{x}_i, y_2 \rangle \tag{2.2.12}$$
$$= \sum_i \langle \tilde{x}_i, y_1 \rangle^* \langle x_i, y_2 \rangle. \tag{2.2.13}$$
For a proof, see [213] and Problem 2.8.
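As a small illustration of (2.2.7) and (2.2.10) in R², the following Python sketch (the particular nonorthogonal basis is an arbitrary choice) constructs the dual basis and verifies the biorthogonal expansion:

```python
import numpy as np

# Nonorthogonal basis {x_1, x_2} as columns of X; the dual basis is given
# by the columns of (X^{-1})^T, which enforces <x_i, xt_j> = delta[i - j].
X = np.array([[1.0, 1.0],
              [0.0, 1.0]])
Xt = np.linalg.inv(X).T
assert np.allclose(Xt.T @ X, np.eye(2))       # biorthogonality (2.2.7)

# Expansion (2.2.10): y = sum_k <x_k, y> xt_k = sum_k <xt_k, y> x_k
y = np.array([2.0, -1.0])
y1 = sum((X[:, k] @ y) * Xt[:, k] for k in range(2))
y2 = sum((Xt[:, k] @ y) * X[:, k] for k in range(2))
assert np.allclose(y1, y) and np.allclose(y2, y)
```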

2.2.5 Overcomplete Expansions
So far, we have considered signal expansion onto bases, that is, the vectors used
in the expansion were linearly independent. However, one can also write signals in
terms of a linear combination of an overcomplete set of vectors, where the vectors
are not independent anymore. A more detailed treatment of such overcomplete sets
of vectors, called frames, can be found in Chapter 5 and in [73, 89]. We will only
discuss a few basic notions here.
A family of functions {x_k} in a Hilbert space H is called a frame if there exist two constants A > 0, B < ∞, such that for all y in H,
$$A \|y\|^2 \le \sum_k |\langle x_k, y \rangle|^2 \le B \|y\|^2.$$

A and B are called frame bounds, and when they are equal, we call the frame tight. In a tight frame we have
$$\sum_k |\langle x_k, y \rangle|^2 = A \|y\|^2,$$
and the signal can be expanded as follows:
$$y = A^{-1} \sum_k \langle x_k, y \rangle\, x_k. \tag{2.2.14}$$

While this last equation resembles the expansion formula in the case of an or-
thonormal basis, a frame does not constitute an orthonormal basis in general. In
particular, the vectors may be linearly dependent and thus not form a basis. If all
the vectors in a tight frame have unit norm, then the constant A gives the redundancy ratio (for example, A = 2 means there are twice as many vectors as needed to cover the space). Note that if A = B = 1 and ‖x_k‖ = 1 for all k, then {x_k} constitutes an orthonormal basis.
Because of the linear dependence which exists among the vectors used in the expansion, the expansion is not unique anymore. Consider the set {x_1, x_2, . . .} where $\sum_i \beta_i x_i = 0$ (where not all β_i's are zero) because of linear dependence. If y can be written as
$$y = \sum_i \alpha_i x_i, \tag{2.2.15}$$
then one can add β_i to each α_i without changing the validity of the expansion (2.2.15). The expansion (2.2.14) is unique in the sense that it minimizes the norm of the expansion among all valid expansions. Similarly, for general frames, there exists a unique dual frame, which is discussed in Section 5.3.2 (in the tight frame case, the frame and its dual are equal).
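A classical example of a tight frame is the set of three unit vectors in R² spaced 120° apart, with frame bound A = 3/2. The following Python sketch (an illustration; the test vector y is arbitrary) verifies the tight-frame relation and the expansion (2.2.14):

```python
import numpy as np

angles = np.pi / 2 + np.array([0, 2 * np.pi / 3, 4 * np.pi / 3])
frame = [np.array([np.cos(a), np.sin(a)]) for a in angles]
A = 1.5                                # 3 unit vectors covering 2 dimensions

y = np.array([0.3, -1.2])
coeffs = [x @ y for x in frame]

# sum_k |<x_k, y>|^2 = A ||y||^2 (tight frame)
assert np.isclose(sum(c ** 2 for c in coeffs), A * (y @ y))
# y = A^{-1} sum_k <x_k, y> x_k (2.2.14)
y_rec = sum(c * x for c, x in zip(coeffs, frame)) / A
assert np.allclose(y_rec, y)
```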
This concludes for now our brief introduction of signal expansions. Later, more
specific expansions will be discussed, such as Fourier and wavelet expansions. The
fundamental properties seen above will reappear in more specialized forms (for
example, Parseval’s equality).
While we have only discussed Hilbert spaces, there are of course many other spaces of functions which are of interest. For example, Lᵖ(R) spaces are those containing functions f for which |f|ᵖ is integrable [113]. The norm on these spaces is defined as
$$\|f\|_p = \left( \int_{-\infty}^{\infty} |f(t)|^p\, dt \right)^{1/p}, \tag{2.2.16}$$
which for p = 2 is the usual L² norm.⁴ Two Lᵖ spaces which will be useful later are L¹(R), the space of functions f(t) satisfying $\int_{-\infty}^{\infty} |f(t)|\, dt < \infty$, and L^∞(R), the space of functions f(t) such that sup|f(t)| < ∞. Their discrete-time equivalents are l¹(Z) (the space of sequences x[n] such that $\sum_n |x[n]| < \infty$) and l^∞(Z) (the space of sequences x[n] such that sup|x[n]| < ∞). Associated with these spaces are the corresponding norms. However, many of the intuitive geometric interpretations we have seen so far for L²(R) and l²(Z) do not hold in these spaces (see Problem 2.3). Recall that in the following, since we use mostly L² and l², we use ‖·‖ to mean ‖·‖₂.

2.3 Elements of Linear Algebra
The finite-dimensional cases of Hilbert spaces, namely Rn and C n , are very impor-
tant, and linear operators on such spaces are studied in linear algebra. Many good
⁴ For p ≠ 2, the norm ‖·‖_p cannot be derived from an inner product as in Definition 2.2.
reference texts exist on the subject, see [106, 280]. Good reviews can also be found
in [150] and [308]. We give only a brief account here, focusing on basic concepts
and topics which are needed later, such as polynomial matrices.

2.3.1 Basic Definitions and Properties
We can view matrices as representations of bounded linear operators (see Appendix 2.A). The familiar system of equations
$$\begin{aligned} A_{11} x_1 + \cdots + A_{1n} x_n &= y_1,\\ &\ \ \vdots\\ A_{m1} x_1 + \cdots + A_{mn} x_n &= y_m, \end{aligned}$$
can be compactly represented as
$$A x = y. \tag{2.3.1}$$
Therefore, any finite matrix, or a rectangular (m rows and n columns) array of numbers, can be interpreted as an operator A,
$$A = \begin{pmatrix} A_{11} & \cdots & A_{1n}\\ \vdots & \ddots & \vdots\\ A_{m1} & \cdots & A_{mn} \end{pmatrix}.$$
An m × 1 matrix is called a column vector, while a 1 × n matrix is a row vector.
As seen in (2.3.1), we write matrices as bold capital letters, and column vectors
as lower-case bold letters. A row vector would then be written as v T , where T
denotes transposition (interchange of rows and columns, that is, if A has elements
Aij , AT has elements Aji ). If the entries are complex, one often uses hermitian
transposition, which is complex conjugation followed by usual transposition, and is
denoted by a superscript *.
When m = n, the matrix is called square, otherwise it is called rectangular. A
1 × 1 matrix is called scalar. We denote by 0 the null matrix (all elements are zero)
and by I the identity (Aii = 1, and 0 otherwise). The identity matrix is a special
case of a diagonal matrix. The antidiagonal matrix J has all the elements on the
other diagonal equal to 1, while the rest are 0, that is, Aij = 1, for j = n + 1 − i,
and Aij = 0 otherwise. A lower (or upper) triangular matrix is a square matrix
with all of its elements above (or below) the main diagonal equal to zero.
Beside addition/subtraction of same-size matrices (by adding/subtracting the corresponding elements), one can multiply matrices A and B with sizes m × n and n × p respectively, yielding a matrix C whose elements are given by
$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}.$$
Note that the matrix product is not commutative in general, that is, AB ≠ BA.⁵ It can be shown that (AB)^T = B^T A^T.
The inner product of two (column) vectors from R^n is ⟨v_1, v_2⟩ = v_1^T v_2, and if the vectors are from C^n, then ⟨v_1, v_2⟩ = v_1^* v_2. The outer product of two vectors from R^n and R^m is the n × m matrix given by v_1 v_2^T.
To define the notion of a determinant, we first need to define a minor. A minor
M ij is a submatrix of the matrix A obtained by deleting its ith row and jth column.
More generally, a minor can be any submatrix of the matrix A obtained by deleting
some of its rows and columns. Then the determinant of an n × n matrix can be
defined recursively as


$$\det(A) = \sum_{i=1}^{n} A_{ij}\, (-1)^{i+j} \det(M_{ij}),$$
where j is fixed and belongs to {1, . . . , n}. The cofactor C_{ij} is (−1)^{i+j} det(M_{ij}).
A square matrix is said to be singular if det(A) = 0. The product of two matrices
is nonsingular only if both matrices are nonsingular. Some properties of interest
include the following:

(a) If C = A B, then det(C) = det(A) det(B).

(b) If B is obtained by interchanging two rows/columns of A, then det(B) = −det(A).

(c) det(AT ) = det(A).

(d) For an n × n matrix A, det(cA) = cn det(A).

(e) The determinant of a triangular, and in particular, of a diagonal matrix is the product of the elements on the main diagonal.

An important interpretation of the determinant is that it corresponds to the volume of the parallelepiped obtained when taking the column vectors of the matrix as its
edges (one can take the row vectors as well, leading to a different parallelepiped,
but the volume remains the same). Thus, a zero determinant indicates linear de-
pendence of the row and column vectors of the matrix, since the parallelepiped is
not of full dimension.
The rank of a matrix is the size of its largest nonsingular minor (possibly the
matrix itself). In a rectangular m × n matrix, the column rank equals the row rank,
that is, the number of linearly independent rows equals the number of linearly
⁵ When there is possible confusion, we will denote a matrix product by A · B; otherwise we will simply write AB.
independent columns. In other words, the dimension of span(columns) is equal to the dimension of span(rows). For an n × n matrix to be nonsingular, its rank should equal n. Also, rank(AB) ≤ min(rank(A), rank(B)).
For a square nonsingular matrix A, the inverse matrix A⁻¹ can be computed using Cramer's formula,
$$A^{-1} = \frac{\mathrm{adjugate}(A)}{\det(A)},$$
where the elements of adjugate(A) are (adjugate(A))_{ij} = cofactor of A_{ji} = C_{ji}. For a square matrix, AA⁻¹ = A⁻¹A = I. Also, (AB)⁻¹ = B⁻¹A⁻¹. Note that Cramer's formula is not actually used to compute the inverse in practice; rather, it serves as a tool in proofs.
For an m × n rectangular matrix A, an n × m matrix L is its left inverse if
LA = I. Similarly, an n × m matrix R is a right inverse of A if AR = I. These
inverses are not unique and may not even exist. However, if the matrix A is square
and has full rank, then its right inverse equals its left inverse, and we can apply
Cramer’s formula to find that inverse.
The Kronecker product of two matrices is defined as (we show a 2 × 2 matrix as an example)
$$\begin{pmatrix} a & b\\ c & d \end{pmatrix} \otimes M = \begin{pmatrix} aM & bM\\ cM & dM \end{pmatrix}, \tag{2.3.2}$$
where a, b, c and d are scalars and M is a matrix (neither matrix need be square). See Problem 2.19 for an application of Kronecker products. The Kronecker product has the following useful property with respect to the usual matrix product [32]:
$$(A \otimes B)(C \otimes D) = (AC) \otimes (BD), \tag{2.3.3}$$
where all the matrix products have to be well-defined.
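Identity (2.3.3) is easy to verify numerically; the following Python sketch (with arbitrarily chosen, compatible matrix sizes) does so:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3)); C = rng.standard_normal((3, 2))
B = rng.standard_normal((4, 2)); D = rng.standard_normal((2, 5))

# (A kron B)(C kron D) = (AC) kron (BD)
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
assert np.allclose(lhs, rhs)
```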

2.3.2 Linear Systems of Equations and Least Squares
Going back to the equation A x = y, one can say that the system has a unique
solution provided A is nonsingular, and this solution is given by x = A−1 y. Note
that one would rarely compute the inverse matrix in order to solve a linear system
of equations; rather Gaussian elimination would be used, since it is much more
efficient. In the following, the column space of A denotes the linear span of the
columns of A, and similarly, the row space is the linear span of the rows of A.
Let us give an interpretation of solving the problem Ax = y. The product Ax
constitutes a linear combination of the columns of A weighted by the entries of x.
Thus, if y belongs to the column space of A, also called the range of A, there will
be a solution. If the columns are linearly independent, the solution is unique, if
they are not, there are infinitely many solutions. The null space of A is spanned
by the vectors orthogonal to the row space, or Av = 0. If A is of size m × n (the system of equations has m equations in n unknowns), then the dimension of the range (which equals the rank ρ) plus the dimension of the null space is equal to n. A similar relation holds for row spaces (which are column spaces of A^T), and the sum is then equal to m. If y is not in the range of A there is no exact solution
and only approximations are possible, such as the orthogonal projection of y onto
the span of the columns of A, which results in a least-squares solution. Then, the
error between y and its projection ŷ (see Figure 2.1) is orthogonal to the column
space of A. That is, any linear combination of the columns of A, for example Aα,
is orthogonal to y − ŷ = y − Ax̂, where x̂ is the least-squares solution. Thus
$$(A\alpha)^T (y - A\hat{x}) = 0,$$
or
$$A^T A \hat{x} = A^T y,$$
which are called the normal equations of the least-squares problem. If the columns of A are linearly independent, then A^T A is invertible. The unique least-squares solution is
$$\hat{x} = (A^T A)^{-1} A^T y \tag{2.3.4}$$
(recall that A is either rectangular or rank deficient, and does not have a proper inverse) and the orthogonal projection ŷ is equal to
$$\hat{y} = A (A^T A)^{-1} A^T y. \tag{2.3.5}$$

Note that the matrix P = A(A^T A)^{-1} A^T satisfies P² = P and is symmetric, P = P^T, thus satisfying the conditions for an orthogonal projection operator (see Appendix 2.A). Also, it can be verified that the partial derivatives of the squared error with respect to the components of x̂ are zero for the above choice (see Problem 2.6).
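The normal equations and the projector P are easily checked numerically. The following Python sketch (an illustration with an arbitrary overdetermined system; in practice one would use a QR- or SVD-based solver rather than forming A^T A) verifies (2.3.4–2.3.5):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))               # 5 equations, 3 unknowns
y = rng.standard_normal(5)

x_hat = np.linalg.solve(A.T @ A, A.T @ y)     # normal equations (2.3.4)
y_hat = A @ x_hat                             # orthogonal projection (2.3.5)
assert np.allclose(A.T @ (y - y_hat), 0)      # residual orthogonal to range(A)

P = A @ np.linalg.inv(A.T @ A) @ A.T          # orthogonal projector
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
```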

2.3.3 Eigenvectors and Eigenvalues
The characteristic polynomial of a matrix A is D(x) = det(xI − A), whose roots are called eigenvalues λ_i. In particular, a vector p ≠ 0 for which
$$A p = \lambda p$$
is an eigenvector associated with the eigenvalue λ. If a matrix of size n × n has n linearly independent eigenvectors, then it can be diagonalized; that is, it can be written as
$$A = T \Lambda T^{-1},$$
where Λ is a diagonal matrix containing the eigenvalues of A along the diagonal and T contains its eigenvectors as its columns. An important case is when A is symmetric or, in the complex case, hermitian symmetric, A* = A. Then, the eigenvalues are real, and a full set of orthogonal eigenvectors exists. Taking them as columns of a matrix U after normalizing them to have unit norm, so that U*U = I, we can write a hermitian symmetric matrix as
$$A = U \Lambda U^*.$$
This result constitutes the spectral theorem for hermitian matrices. Hermitian symmetric matrices commute with their hermitian transpose. More generally, a matrix N that commutes with its hermitian transpose is called normal, that is, it satisfies N*N = NN*. Normal matrices are exactly those that have a complete set of orthogonal eigenvectors.
The importance of eigenvectors in the study of linear operators comes from the following fact: assuming a full set of eigenvectors, a vector x can be written as a linear combination of eigenvectors, $x = \sum_i \alpha_i v_i$. Then,
$$A x = A \left( \sum_i \alpha_i v_i \right) = \sum_i \alpha_i (A v_i) = \sum_i \alpha_i \lambda_i v_i.$$

The concept of eigenvectors generalizes to eigenfunctions for continuous operators, which are functions f_ω(t) such that A f_ω(t) = λ(ω) f_ω(t). A classic example is the complex sinusoid, which is an eigenfunction of the convolution operator, as will be shown in Section 2.4.
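The following Python sketch (an illustration on an arbitrary 2 × 2 symmetric matrix) verifies the spectral factorization and the eigencomponent scaling just described:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, U = np.linalg.eigh(A)                     # for symmetric/hermitian A

assert np.allclose(U.T @ U, np.eye(2))         # orthonormal eigenvectors
assert np.allclose(U @ np.diag(lam) @ U.T, A)  # A = U Lambda U^T

# A scales each eigencomponent of x by its eigenvalue:
x = np.array([1.0, -2.0])
alpha = U.T @ x                                # coordinates in the eigenbasis
assert np.allclose(A @ x, U @ (lam * alpha))
```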

2.3.4 Unitary Matrices
We just explained an instance of a square unitary matrix, that is, an m × m matrix U which satisfies
$$U^* U = U U^* = I, \tag{2.3.6}$$
or, its inverse is its (hermitian) transpose. When the matrix has real entries, it is often called orthogonal or orthonormal, and sometimes a scale factor is allowed on the left of (2.3.6). Rectangular unitary matrices are also possible; that is, an m × n matrix U with m < n is unitary if
$$\|U^* x\| = \|x\|, \qquad \forall\, x \in \mathbb{C}^m,$$
as well as
$$\langle U^* x, U^* y \rangle = \langle x, y \rangle, \qquad \forall\, x, y \in \mathbb{C}^m,$$
which are the usual Parseval's relations. Then it follows that
$$U U^* = I,$$
where I is of size m × m (and the product does not commute, since U*U cannot equal the n × n identity when m < n).
have eigenvalues of unit modulus and a complete set of orthogonal eigenvectors.
Note that a unitary matrix performs a rotation, thus, the l2 norm is preserved.
When a square m × m matrix A has full rank its columns (or rows) form a
basis for Rm and we recall that the Gram-Schmidt orthogonalization procedure
can be used to get an orthogonal basis. Gathering the steps of the Gram-Schmidt
procedure into a matrix form, we can write A as
A = QR,
where the columns of Q form the orthonormal basis and R is upper triangular.
Unitary matrices form an important but restricted class of matrices, which can
be parametrized in various forms. For example, an n × n real orthogonal matrix
has n(n − 1)/2 degrees of freedom (up to a permutation of its rows or columns and
a sign change in each vector). If we want to find an orthonormal basis for Rn ,
start with an arbitrary vector and normalize it to have unit norm. This gives n − 1
degrees of freedom. Next, choose a norm-1 vector in the orthogonal complement
with respect to the first vector, which is of dimension n − 1, giving another n − 2
degrees of freedom. Iterate until the nth vector is chosen, which is unique up to a sign. We have $\sum_{i=0}^{n-1} i = n(n-1)/2$ degrees of freedom. These degrees of freedom can be used in various parametrizations, based either on planar or Givens rotations, or on Householder building blocks (see Appendix 2.B).

2.3.5 Special Matrices
A (right) circulant matrix is a matrix where each row is obtained by a (right) circular shift of the previous row, or
$$C = \begin{pmatrix} c_0 & c_1 & \cdots & c_{n-1}\\ c_{n-1} & c_0 & \cdots & c_{n-2}\\ \vdots & \vdots & \ddots & \vdots\\ c_1 & c_2 & \cdots & c_0 \end{pmatrix}.$$
A Toeplitz matrix is a matrix whose (i, j)th entry depends only on the value of i − j, and thus it is constant along the diagonals, or
$$T = \begin{pmatrix} t_0 & t_1 & t_2 & \cdots & t_{n-1}\\ t_{-1} & t_0 & t_1 & \cdots & t_{n-2}\\ t_{-2} & t_{-1} & t_0 & \cdots & t_{n-3}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ t_{-n+1} & t_{-n+2} & t_{-n+3} & \cdots & t_0 \end{pmatrix}.$$
Sometimes, the elements t_i are matrices themselves, in which case the matrix is called block Toeplitz. Another important matrix is the DFT (Discrete Fourier Transform) matrix. The (i, k)th element of the DFT matrix of size n × n is $W_n^{ik} = e^{-j2\pi ik/n}$. The DFT matrix diagonalizes circulant matrices, that is, its columns and rows are the eigenvectors of circulant matrices (see Section 2.4.8 and Problem 2.18).
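The diagonalization of circulant matrices by the DFT is easy to check numerically. In the following Python sketch (the first row c is an arbitrary choice), F⁻¹CF is diagonal, with the DFT of c on the diagonal:

```python
import numpy as np

n = 6
c = np.arange(1.0, n + 1)
# right circulant: each row is a right circular shift of the previous one
C = np.array([[c[(j - i) % n] for j in range(n)] for i in range(n)])
# DFT matrix, F[i, k] = W_n^{ik} = e^{-j 2 pi i k / n}
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)

D = np.linalg.inv(F) @ C @ F
assert np.allclose(D, np.diag(F @ c))   # eigenvalues are the DFT of c
```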
A real symmetric matrix A is called positive definite if all its eigenvalues are
greater than 0. Equivalently, for all nonzero vectors x, the following is satisfied:

$$x^T A x > 0.$$

Finally, for a positive definite matrix A, there exists a nonsingular matrix W such that
$$A = W^T W,$$
where W is intuitively a “square root” of A. One possible way to choose such a square root is to diagonalize A as A = QΛQ^T and then, since all the eigenvalues are positive, choose $W = \sqrt{\Lambda}\, Q^T$ (the square root is applied on each eigenvalue in the diagonal matrix Λ). The above discussion carries over to hermitian symmetric matrices by using hermitian transposes.

2.3.6 Polynomial Matrices
Since a fair amount of the results given in Chapter 3 will make use of polynomial
matrices, we will present a brief overview of this subject. For more details, the
reader is referred to [106], while self-contained presentations on polynomial matrices
can be found in [150, 308].
A polynomial matrix (or a matrix polynomial) is a matrix whose entries are polynomials. The fact that the above two names can be used interchangeably is due to the following forms of a polynomial matrix H(x):
$$H(x) = \begin{pmatrix} \sum_i a_i x^i & \cdots & \sum_i b_i x^i\\ \vdots & \ddots & \vdots\\ \sum_i c_i x^i & \cdots & \sum_i d_i x^i \end{pmatrix} = \sum_i H_i x^i,$$
that is, it can be written either as a matrix containing polynomials as its entries, or as a polynomial having matrices as its coefficients.
The question of the rank of polynomial matrices is more subtle. For example, the matrix
$$\begin{pmatrix} a + bx & 3(a + bx)\\ c + dx & \lambda(c + dx) \end{pmatrix},$$
with λ = 3, always has rank less than 2, since the two columns are proportional to each other. On the other hand, if λ = 2, then the matrix would have rank
less than 2 only if x = −a/b or x = −c/d. This leads to the notion of normal rank.
First, note that H(x) is nonsingular only if det(H(x)) is different from 0 for some
x. Then, the normal rank of H(x) is the largest of the orders of minors that have
a determinant not identically zero. In the above example, for λ = 3, the normal
rank is 1, while for λ = 2, the normal rank is 2.
An important class of polynomial matrices are unimodular matrices, whose determinant is not a function of x. An example is the following matrix:
$$H(x) = \begin{pmatrix} 1 + x & x\\ 2 + x & 1 + x \end{pmatrix},$$
whose determinant is equal to 1. There are several useful properties pertaining
to unimodular matrices. For example, the product of two unimodular matrices
is again unimodular. The inverse of a unimodular matrix is unimodular as well.
Also, one can prove that a polynomial matrix H(x) is unimodular, if and only if
its inverse is a polynomial matrix. All these facts can be proven using properties
of determinants (see, for example, [308]).
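The claims about this example can be verified symbolically; the following Python sketch (assuming the sympy library is available) checks that det H(x) = 1 and that the inverse is again a polynomial matrix:

```python
import sympy as sp

x = sp.symbols('x')
H = sp.Matrix([[1 + x, x],
               [2 + x, 1 + x]])
assert sp.simplify(H.det()) == 1        # unimodular: constant determinant

# By Cramer's formula, the inverse is adjugate(H)/det(H) = adjugate(H),
# hence again a polynomial matrix:
expected = sp.Matrix([[1 + x, -x],
                      [-(2 + x), 1 + x]])
assert all(sp.simplify(e) == 0 for e in (H.inv() - expected))
```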
The extension of the concept of unitary matrices to polynomial matrices leads to paraunitary matrices [308], as studied in circuit theory. In fact, these matrices are unitary on the unit circle or the imaginary axis, depending on whether they correspond to discrete-time or continuous-time linear operators (z-transforms or Laplace transforms). Consider the discrete-time case and x = e^{jω}. Then, a square matrix U(x) is unitary on the unit circle if
$$[U(e^{j\omega})]^*\, U(e^{j\omega}) = U(e^{j\omega})\, [U(e^{j\omega})]^* = I.$$
Extending this beyond the unit circle leads to
$$[U(x^{-1})]^T\, U(x) = U(x)\, [U(x^{-1})]^T = I, \tag{2.3.7}$$
since (e^{jω})* = e^{−jω}. If the coefficients of the polynomials are complex, the coefficients need to be conjugated in (2.3.7), which is usually written [U_*(x^{-1})]^T. This will be studied in Chapter 3.
As a generalization of polynomial matrices, one can consider the case of rational
matrices. In that case, each entry is a ratio of two polynomials. As will be shown
in Chapter 3, polynomial matrices in z correspond to finite impulse response (FIR)
discrete-time filters, while rational matrices can be associated with infinite impulse
response (IIR) filters. Unimodular and unitary matrices can be defined in the
rational case, as in the polynomial case.

2.4 Fourier Theory and Sampling
This section reviews the Fourier transform and its variations when signals have
particular properties (such as periodicity). Sampling, which establishes the link be-
tween continuous- and discrete-time signal processing, is discussed in detail. Then,
discrete versions of the Fourier transform are examined. The recurring theme is
that complex exponentials form an orthonormal basis on which many classes of
signals can be expanded. Also, such complex exponentials are eigenfunctions of
convolution operators, leading to convolution theorems. The material in this sec-
tion can be found in many sources, and we refer to [37, 91, 108, 215, 326] for details
and proofs.

2.4.1 Signal Expansions and Nomenclature
Let us start by discussing some naming conventions. First, the signal to be ex-
panded is either continuous or discrete in time. Then, the expansion involves an
integral (a transform) or a summation (a series). This leads to four possible com-
binations of continuous/discrete time and integral/series expansions. Note that in
the integral case, strictly speaking, we do not have an expansion, but a transform.
We use lower case and capital letters for the signal and its expansion (or transform)
and denote by ψω and ψi a continuous and discrete set of basis functions. In gen-
eral, there is a basis {ψ} and its dual {ψ̃}, which are equal in the orthogonal case.
Thus, we have

(a) Continuous-time integral expansion, or transform:
$$x(t) = \int X_\omega\, \psi_\omega(t)\, d\omega \qquad \text{with} \qquad X_\omega = \langle \tilde{\psi}_\omega(t), x(t) \rangle.$$

(b) Continuous-time series expansion:
$$x(t) = \sum_i X_i\, \psi_i(t) \qquad \text{with} \qquad X_i = \langle \tilde{\psi}_i(t), x(t) \rangle.$$

(c) Discrete-time integral expansion:
$$x[n] = \int X_\omega\, \psi_\omega[n]\, d\omega \qquad \text{with} \qquad X_\omega = \langle \tilde{\psi}_\omega[n], x[n] \rangle.$$

(d) Discrete-time series expansion:
$$x[n] = \sum_i X_i\, \psi_i[n] \qquad \text{with} \qquad X_i = \langle \tilde{\psi}_i[n], x[n] \rangle.$$
In the classic Fourier cases, this leads to
(a) The continuous-time Fourier transform (CTFT), often simply called the Fourier
transform.

(b) The continuous-time Fourier series (CTFS), or simply Fourier series.

(c) The discrete-time Fourier transform (DTFT).

(d) The discrete-time Fourier series (DTFS).

In all the Fourier cases, {ψ} = {ψ̃}. The above transforms and series will be
discussed in this section. Later, more general expansions will be introduced, in par-
ticular, series expansions of discrete-time signals using filter banks in Chapter 3,
series expansions of continuous-time signals using wavelets in Chapter 4, and in-
tegral expansions of continuous-time signals using wavelets and short-time Fourier
bases in Chapter 5.

2.4.2 Fourier Transform
Given an absolutely integrable function f(t), its Fourier transform is defined by
$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt = \langle e^{j\omega t}, f(t) \rangle, \tag{2.4.1}$$
which is called the Fourier analysis formula. The inverse Fourier transform is given by
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega, \tag{2.4.2}$$
or the Fourier synthesis formula. Note that e^{jωt} is not in L²(R), and that the set {e^{jωt}} is not countable. The exact conditions under which (2.4.2) is the inverse of (2.4.1) depend on the behavior of f(t) and are discussed in standard texts on Fourier theory [46, 326]. For example, the inversion is exact if f(t) is continuous (or if f(t) is defined as (f(t⁺) + f(t⁻))/2 at a point of discontinuity).⁶
When f(t) is square-integrable, the formulas above hold in the L² sense (see Appendix 2.C); that is, calling f̂(t) the result of the analysis followed by the synthesis formula,
$$\|f(t) - \hat{f}(t)\| = 0.$$
Assuming that the Fourier transform and its inverse exist, we will denote by
$$f(t) \longleftrightarrow F(\omega)$$

⁶ We assume that f(t) is of bounded variation. That is, for f(t) defined on a closed interval [a, b], there exists a constant A such that $\sum_{n=1}^{N} |f(t_n) - f(t_{n-1})| < A$ for any finite set {t_i} satisfying a ≤ t_0 < t_1 < . . . < t_N ≤ b. Roughly speaking, the graph of f(t) cannot oscillate over an infinite distance as t goes over a finite interval.

a Fourier transform pair. The Fourier transform satisfies a number of properties,


some of which we briefly review below. For proofs, see [215].

Linearity Since the Fourier transform is an inner product (see (2.4.1)), it follows
immediately from the linearity of the inner product that

αf (t) + βg(t) ←→ αF (ω) + βG(ω).

Symmetry If F (ω) is the Fourier transform of f (t), then

F (t) ←→ 2πf (−ω), (2.4.3)

which indicates the essential symmetry of the Fourier analysis and synthesis formu-
las.

Shifting A shift in time by t_0 results in multiplication by a phase factor in the Fourier domain,
f (t − t0 ) ←→ e−jωt0 F (ω). (2.4.4)
Conversely, a shift in frequency results in a phase factor, or modulation by a complex
exponential, in the time domain,

ejω0 t f (t) ←→ F (ω − ω0 ).

Scaling Scaling in time results in inverse scaling in frequency, as given by the following transform pair (a is a real constant):
$$f(at) \longleftrightarrow \frac{1}{|a|}\, F\!\left(\frac{\omega}{a}\right). \tag{2.4.5}$$

Differentiation/Integration Derivatives in time lead to multiplication by (jω) in frequency,
$$\frac{\partial^n f(t)}{\partial t^n} \longleftrightarrow (j\omega)^n F(\omega), \tag{2.4.6}$$
if the transform actually exists. Conversely, if F(0) = 0, we have
$$\int_{-\infty}^{t} f(\tau)\, d\tau \longleftrightarrow \frac{F(\omega)}{j\omega}.$$
Differentiation in frequency leads to
$$(-jt)^n f(t) \longleftrightarrow \frac{\partial^n F(\omega)}{\partial \omega^n}.$$
Moments Calling m_n the nth moment of f(t),
$$m_n = \int_{-\infty}^{\infty} t^n f(t)\, dt, \qquad n = 0, 1, 2, \ldots, \tag{2.4.7}$$
the moment theorem of the Fourier transform states that
$$(-j)^n m_n = \left. \frac{\partial^n F(\omega)}{\partial \omega^n} \right|_{\omega = 0}, \qquad n = 0, 1, 2, \ldots. \tag{2.4.8}$$

Convolution The convolution of two functions f(t) and g(t) is given by
$$h(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau, \tag{2.4.9}$$
and is denoted h(t) = f(t) ∗ g(t) = g(t) ∗ f(t), since (2.4.9) is symmetric in f(t) and g(t). Denoting by F(ω) and G(ω) the Fourier transforms of f(t) and g(t), respectively, the convolution theorem states that
$$f(t) * g(t) \longleftrightarrow F(\omega)\, G(\omega).$$
This result is fundamental, and we will prove it for f(t) and g(t) in L¹(R). Taking the Fourier transform of f(t) ∗ g(t),
$$\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau \right) e^{-j\omega t}\, dt,$$
changing the order of integration (which is allowed when f(t) and g(t) are in L¹(R); see Fubini's theorem in [73, 250]) and using the shift property, we get
$$\int_{-\infty}^{\infty} f(\tau) \left( \int_{-\infty}^{\infty} g(t - \tau)\, e^{-j\omega t}\, dt \right) d\tau = \int_{-\infty}^{\infty} f(\tau)\, e^{-j\omega\tau}\, G(\omega)\, d\tau = F(\omega)\, G(\omega).$$

The result holds as well when f (t) and g(t) are square-integrable, but requires a
different proof [108].
An alternative view of the convolution theorem is to identify the complex ex-
ponentials ejωt as the eigenfunctions of the convolution operator, since
 ∞  ∞
ejω(t−τ )
g(τ )dτ = ejωt
e−jωτ g(τ )dτ = ejωt G(ω).
−∞ −∞

The associated eigenvalue G(ω) is simply the Fourier transform of the impulse
response g(τ ) at frequency ω.
By symmetry, the product of time-domain functions leads to the convolution of their Fourier transforms,
$$f(t)\, g(t) \longleftrightarrow \frac{1}{2\pi}\, F(\omega) * G(\omega). \tag{2.4.10}$$
This is known as the modulation theorem of the Fourier transform.
As an application of both the convolution theorem and the derivative property, consider taking the derivative of a convolution,
$$h'(t) = \frac{d[f(t) * g(t)]}{dt}.$$
The Fourier transform of h'(t), following (2.4.6), is equal to
$$j\omega\, (F(\omega)\, G(\omega)) = (j\omega F(\omega))\, G(\omega) = F(\omega)\, (j\omega G(\omega)),$$
that is,
$$h'(t) = f'(t) * g(t) = f(t) * g'(t).$$
This is useful when convolving a signal with a filter which is known to be the
derivative of a given function such as a Gaussian, since one can think of the result
as being the convolution of the derivative of the signal with a Gaussian.

Parseval's Formula Because the Fourier transform is an orthogonal transform, it satisfies an energy conservation relation known as Parseval's formula. See also Section 2.2.3, where we proved Parseval's formula for orthonormal bases. Here, we need a different proof because the Fourier transform does not correspond to an orthonormal basis expansion (first, exponentials are not in L²(R), and also the complex exponentials are uncountable, whereas we considered countable orthonormal bases [113]). The general form of Parseval's formula for the Fourier transform is given by
$$\int_{-\infty}^{\infty} f^*(t)\, g(t)\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} F^*(\omega)\, G(\omega)\, d\omega, \tag{2.4.11}$$
which reduces, when g(t) = f(t), to
$$\int_{-\infty}^{\infty} |f(t)|^2\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(\omega)|^2\, d\omega. \tag{2.4.12}$$
Note that the factor 1/2π comes from our definition of the Fourier transform (2.4.1–2.4.2). A symmetric definition, with a factor $1/\sqrt{2\pi}$ in both the analysis and synthesis formulas (see, for example, [73]), would remove the scale factor in (2.4.12).
The proof of (2.4.11) uses the fact that
$$f^*(t) \longleftrightarrow F^*(-\omega)$$
and the frequency-domain convolution relation (2.4.10). That is, since f*(t) g(t) has Fourier transform (1/2π)(F*(−ω) ∗ G(ω)), we have
$$\int_{-\infty}^{\infty} f^*(t)\, g(t)\, e^{-j\omega t}\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} F^*(-\Omega)\, G(\omega - \Omega)\, d\Omega,$$
where (2.4.11) follows by setting ω = 0.

2.4.3 Fourier Series
A periodic function f(t) with period T,
$$f(t + T) = f(t),$$
can be expressed as a linear combination of complex exponentials with frequencies nω₀, where ω₀ = 2π/T. In other words,
$$f(t) = \sum_{k=-\infty}^{\infty} F[k]\, e^{jk\omega_0 t}, \tag{2.4.13}$$
with
$$F[k] = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, e^{-jk\omega_0 t}\, dt. \tag{2.4.14}$$

If f(t) is continuous, then the series converges uniformly to f(t). If a period of f(t) is square-integrable but not necessarily continuous, then the series converges to f(t) in the L² sense; that is, calling f̂_N(t) the truncated series with k going from −N to N, the error ‖f(t) − f̂_N(t)‖ goes to zero as N → ∞. At points of discontinuity, the infinite sum (2.4.13) equals the average (f(t⁺) + f(t⁻))/2. However, convergence is not uniform anymore but is plagued by the Gibbs phenomenon. That is, f̂_N(t) will overshoot or undershoot near the point of discontinuity. The amount of over/undershooting is independent of the number of terms N used in the approximation; only the width diminishes as N is increased.⁷ For further discussions on the convergence of Fourier series, see Appendix 2.C and [46, 326].
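The Gibbs phenomenon is easy to observe numerically. The following Python sketch (an illustration using the square wave sgn(sin t), whose jump is 2, and arbitrarily chosen truncation lengths) shows that the maximum overshoot stays near 0.18, roughly 9% of the jump, regardless of N:

```python
import numpy as np

def partial_sum(t, N):
    # Fourier series of the square wave sgn(sin t): (4/pi) sum_{k odd} sin(kt)/k
    k = np.arange(1, N + 1, 2, dtype=float)
    return (4 / np.pi) * np.sum(np.sin(np.outer(t, k)) / k, axis=1)

t = np.linspace(1e-4, 0.3, 6000)     # fine grid near the discontinuity at 0
for N in (51, 201, 801):
    overshoot = partial_sum(t, N).max() - 1.0
    print(N, round(overshoot, 3))    # about 0.179 for every N
```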
Of course, underlying the Fourier series construction is the fact that the set of functions used in the expansion (2.4.13) is a complete orthonormal system for the interval [−T/2, T/2] (up to a scale factor). That is, defining $\varphi_k(t) = (1/\sqrt{T})\, e^{jk\omega_0 t}$ for t in [−T/2, T/2] and k in Z, we can verify that
$$\langle \varphi_k(t), \varphi_l(t) \rangle_{[-\frac{T}{2}, \frac{T}{2}]} = \delta[k - l].$$

⁷ Again, we consider nonpathological functions (that is, of bounded variation).
When k = l, the inner product equals 1. If k ≠ l, we have
$$\frac{1}{T} \int_{-T/2}^{T/2} e^{j\frac{2\pi}{T}(l-k)t}\, dt = \frac{1}{\pi(l-k)} \sin(\pi(l-k)) = 0.$$

That the set {ϕk } is complete is shown in [326] and means that there exists no
periodic function f (t) with L2 norm greater than zero that has all its Fourier series
coefficients equal to zero. Actually, there is equivalence between norms, as shown
below.

Parseval's Relation With the Fourier series coefficients as defined in (2.4.14), and the inner product of periodic functions taken over one period, we have
$$\langle f(t), g(t) \rangle_{[-\frac{T}{2}, \frac{T}{2}]} = T\, \langle F[k], G[k] \rangle,$$
where the factor T is due to the normalization chosen in (2.4.13–2.4.14). In particular, for g(t) = f(t),
$$\|f(t)\|^2_{[-\frac{T}{2}, \frac{T}{2}]} = T\, \|F[k]\|^2.$$
This is an example of Theorem 2.4, up to the scaling factor T.

Best Approximation Property While the following result is true in a more general setting (see Section 2.2.3), it is sufficiently important to be restated for Fourier series, namely,
$$\left\| f(t) - \sum_{k=-N}^{N} \langle \varphi_k, f \rangle\, \varphi_k(t) \right\| \le \left\| f(t) - \sum_{k=-N}^{N} a_k\, \varphi_k(t) \right\|,$$
where {a_k} is an arbitrary set of coefficients. That is, the Fourier series coefficients are the best ones for an approximation in the span of {φ_k(t)}, k = −N, . . . , N. Moreover, if N is increased, new coefficients are added without affecting the previous ones.
Fourier series, beside their obvious use for characterizing periodic signals, are
useful for problems of finite size through periodization. The immediate concern,
however, is the introduction of a discontinuity at the boundary, since periodization
of a continuous signal on an interval results, in general, in a discontinuous periodic
signal.
Fourier series can be related to the Fourier transform seen earlier by using
sequences of Dirac functions which are also used in sampling. We will turn our
attention to these functions next.
2.4.4 Dirac Function, Impulse Trains and Poisson Sum Formula
The Dirac function [215], which is a generalized function or distribution, is defined as a limit of rectangular functions. For example, if
$$\delta_\varepsilon(t) = \begin{cases} 1/\varepsilon & 0 \le t < \varepsilon,\\ 0 & \text{otherwise}, \end{cases} \tag{2.4.15}$$
then δ(t) = lim_{ε→0} δ_ε(t). More generally, one can use any smooth function ψ(t) with integral 1 and define [278]
$$\delta(t) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\, \psi\!\left(\frac{t}{\varepsilon}\right).$$
Any operation involving a Dirac function requires a limiting operation. Since we are reviewing standard results, and for notational convenience, we will skip the limiting process. However, let us emphasize that Dirac functions have to be handled with care in order to get meaningful results. When in doubt, it is best to go back to the definition and the limiting process. For details see, for example, [215]. It follows from (2.4.15) that
$$\int_{-\infty}^{\infty} \delta(t)\, dt = 1, \tag{2.4.16}$$
as well as⁸
$$\int_{-\infty}^{\infty} f(t + t_0)\, \delta(t)\, dt = \int_{-\infty}^{\infty} f(t)\, \delta(t - t_0)\, dt = f(t_0). \tag{2.4.17}$$

Actually, the preceding two relations can be used as an alternative definition of the Dirac function. That is, the Dirac function is a linear operator over a class of functions satisfying (2.4.16–2.4.17). From the above, it follows that
$$f(t) * \delta(t - t_0) = f(t - t_0). \tag{2.4.18}$$
One more standard relation useful for the Dirac function is [215]
$$f(t)\, \delta(t) = f(0)\, \delta(t).$$
The Fourier transform of δ(t − t₀) is, from (2.4.1) and (2.4.17), equal to
$$\delta(t - t_0) \longleftrightarrow e^{-j\omega t_0}.$$

Using the symmetry property (2.4.3) and the previous results, we see that
$$e^{j\omega_0 t} \longleftrightarrow 2\pi\, \delta(\omega - \omega_0). \tag{2.4.19}$$

⁸ Note that this holds only for points of continuity.
According to the above and using the modulation theorem (2.4.10), f (t) ejω0 t has
Fourier transform F (ω − ω0 ).
Next, we introduce the train of Dirac functions spaced T > 0 apart, denoted s_T(t) and given by
$$s_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT). \tag{2.4.20}$$

Before getting its Fourier transform, we derive the Poisson sum formula. Note that, given a function f(t) and using (2.4.18),
$$\int_{-\infty}^{\infty} f(\tau)\, s_T(t - \tau)\, d\tau = \sum_{n=-\infty}^{\infty} f(t - nT). \tag{2.4.21}$$
Call the above T-periodic function f₀(t). Further assume that f(t) is sufficiently smooth and rapidly decaying, such that the above series converges uniformly to f₀(t). We can then expand f₀(t) into a uniformly convergent Fourier series,
$$f_0(t) = \sum_{k=-\infty}^{\infty} \left( \frac{1}{T} \int_{-T/2}^{T/2} f_0(\tau)\, e^{-j2\pi k\tau/T}\, d\tau \right) e^{j2\pi kt/T}.$$
Consider the Fourier series coefficient in the above formula, using the expression for f₀(t) in (2.4.21):
$$\int_{-T/2}^{T/2} f_0(\tau)\, e^{-j2\pi k\tau/T}\, d\tau = \sum_{n=-\infty}^{\infty} \int_{(2n-1)T/2}^{(2n+1)T/2} f(\tau)\, e^{-j2\pi k\tau/T}\, d\tau = F\!\left(\frac{2\pi k}{T}\right).$$

This leads to the Poisson sum formula.

THEOREM 2.5 Poisson Sum Formula
For a function f(t) with sufficient smoothness and decay,
$$\sum_{n=-\infty}^{\infty} f(t - nT) = \frac{1}{T} \sum_{k=-\infty}^{\infty} F\!\left(\frac{2\pi k}{T}\right) e^{j2\pi kt/T}. \tag{2.4.22}$$
In particular, taking T = 1 and t = 0,
$$\sum_{n=-\infty}^{\infty} f(n) = \sum_{k=-\infty}^{\infty} F(2\pi k).$$
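As a numerical sanity check of the theorem, take T = 1 and t = 0 with the Gaussian f(t) = e^{−t²/2}, whose Fourier transform is F(ω) = √(2π) e^{−ω²/2} (this particular test function is an arbitrary illustrative choice):

```python
import numpy as np

n = np.arange(-50, 51)                   # truncation of the sums over Z
lhs = np.sum(np.exp(-n ** 2 / 2))        # sum_n f(n)
rhs = np.sum(np.sqrt(2 * np.pi) * np.exp(-(2 * np.pi * n) ** 2 / 2))  # sum_k F(2 pi k)
assert np.isclose(lhs, rhs)
```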
One can use the Poisson formula to derive the Fourier transform of the impulse train s_T(t) in (2.4.20). It can be shown that
$$S_T(\omega) = \frac{2\pi}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(\omega - \frac{2\pi k}{T}\right). \tag{2.4.23}$$

We have explained that sampling the spectrum and periodizing the time-domain
function are equivalent. We will see the dual situation, when sampling the time-
domain function leads to a periodized spectrum. This is also an immediate appli-
cation of the Poisson formula.

2.4.5 Sampling
The process of sampling is central to discrete-time signal processing, since it pro-
vides the link with the continuous-time domain. Call fT (t) the sampled version of
f (t), obtained as


    fT(t) = f(t) sT(t) = Σ_{n=−∞}^{∞} f(nT) δ(t − nT).            (2.4.24)

Using the modulation theorem of the Fourier transform (2.4.10) and the transform
of sT (t) given in (2.4.23), we get
    FT(ω) = (1/T) F(ω) ∗ Σ_{k=−∞}^{∞} δ(ω − 2πk/T) = (1/T) Σ_{k=−∞}^{∞} F(ω − 2πk/T),      (2.4.25)

where we used (2.4.18). Thus, FT (ω) is periodic with period 2π/T , and is obtained
by overlapping copies of F (ω) at every multiple of 2π/T . Another way to prove
(2.4.25) is to use the Poisson formula. Taking the Fourier transform of (2.4.24)
results in
    FT(ω) = Σ_{n=−∞}^{∞} f(nT) e^{−jnTω},

since fT (t) is a weighted sequence of Dirac functions with weights f (nT ) and shifts
of nT . To use the Poisson formula, consider the function gΩ (t) = f (t) e−jtΩ , which
has Fourier transform GΩ (ω) = F (ω + Ω) according to (2.4.19). Now, applying
(2.4.22) to gΩ (t), we find

    Σ_{n=−∞}^{∞} gΩ(nT) = (1/T) Σ_{k=−∞}^{∞} GΩ(2πk/T)

or changing Ω to ω and switching the sign of k,



    Σ_{n=−∞}^{∞} f(nT) e^{−jnTω} = (1/T) Σ_{k=−∞}^{∞} F(ω − 2πk/T),      (2.4.26)

which is the desired result (2.4.25).


Equation (2.4.25) leads immediately to the famous sampling theorem of Whit-
taker, Kotelnikov and Shannon. If the sampling frequency ωs = 2π/Ts is larger
than 2ωm (where F (ω) is bandlimited⁹ to ωm ), then we can extract one instance
of the spectrum without overlap. If this were not true, then, for example for k = 0
and k = 1, F (ω) and F (ω − 2π/T ) would overlap and reconstruction would not be
possible.
THEOREM 2.6 Sampling Theorem

If f(t) is continuous and bandlimited to ωm, then f(t) is uniquely defined
by its samples taken at twice ωm, that is, by f(nπ/ωm). The minimum sampling
frequency is ωs = 2ωm, and T = π/ωm is the maximum sampling period.
Then f (t) can be recovered by the interpolation formula


    f(t) = Σ_{n=−∞}^{∞} f(nT) sincT(t − nT),                      (2.4.27)

where
    sincT(t) = sin(πt/T) / (πt/T).
Note that sincT (nT ) = δ[n], that is, it has the interpolation property since it is 1
at the origin but 0 at nonzero multiples of T . It follows immediately that (2.4.27)
holds at the sampling instants t = nT .
PROOF
The proof that (2.4.27) is valid for all t goes as follows: Consider the sampled version of
f (t), fT (t), consisting of weighted Dirac functions (2.4.24). We showed that its Fourier
transform is given by (2.4.25). The sampling frequency ωs equals 2ωm , where ωm is the
bandlimiting frequency of F (ω). Thus, F (ω − kωs ) and F (ω − lωs ) do not overlap for k ≠ l.
To recover F (ω), it suffices to keep the term with k = 0 in (2.4.25) and normalize it by
T . This is accomplished with a function that has a Fourier transform which is equal to T
from −ωm to ωm and 0 elsewhere. This is called an ideal lowpass filter. Its time-domain
impulse response, denoted sincT (t) where T = π/ωm , is equal to (taking the inverse Fourier
transform)
    sincT(t) = (1/2π) ∫_{−ωm}^{ωm} T e^{jωt} dω = (T/(2πjt)) (e^{jπt/T} − e^{−jπt/T}) = sin(πt/T)/(πt/T).      (2.4.28)
⁹ We will say that a function f (t) is bandlimited to ωm if its Fourier transform F (ω) = 0 for |ω| ≥ ωm .

Convolving fT (t) with sincT (t) filters out the repeated spectra (the terms with
k ≠ 0 in (2.4.25)) and recovers f (t), as is clear in the frequency domain. Because
fT (t) is a sequence of Dirac functions with weights f (nT ), the convolution results
in a weighted sum of shifted impulse responses,
 

    ( Σ_{n=−∞}^{∞} f(nT) δ(t − nT) ) ∗ sincT(t) = Σ_{n=−∞}^{∞} f(nT) sincT(t − nT),

proving (2.4.27).

An alternative interpretation of the sampling theorem is as a series expansion on


an orthonormal basis for bandlimited signals. Define
    ϕn,T(t) = (1/√T) sincT(t − nT),                               (2.4.29)

whose Fourier transform magnitude is √T from −ωm to ωm , and 0 otherwise. One
can verify that ϕn,T (t) form an orthonormal set using Parseval’s relation. The
Fourier transform of (2.4.29) is (from (2.4.28) and the shift property (2.4.4))


    Φn,T(ω) = { √(π/ωm) e^{−jωnπ/ωm}   for −ωm ≤ ω ≤ ωm,
                0                       otherwise,

where T = π/ωm . From (2.4.11), we find


    ⟨ϕn,T, ϕk,T⟩ = (1/2ωm) ∫_{−ωm}^{ωm} e^{jω(n−k)π/ωm} dω = δ[n − k].

Now, assume a bandlimited signal f (t) and consider the inner product ⟨ϕn,T , f ⟩.
Again using Parseval’s relation,

    ⟨ϕn,T, f⟩ = (√T/2π) ∫_{−ωm}^{ωm} e^{jωnT} F(ω) dω = √T f(nT),

because the integral is recognized as the inverse Fourier transform of F (ω) at t =


nT (the bounds [−ωm , ωm ] do not alter the computation of F (ω) because it is
bandlimited to ωm ). Therefore, another way to write the interpolation formula
(2.4.27) is
    f(t) = Σ_{n=−∞}^{∞} ⟨ϕn,T, f⟩ ϕn,T(t)                         (2.4.30)

(the only change is that we normalized the sinc basis functions to have unit norm).
What happens if f (t) is not bandlimited? Because {ϕn,T } is an orthonormal set,
the interpolation formula (2.4.30) represents the orthogonal projection of the input

signal onto the subspace of bandlimited signals. Another way to write the inner
product in (2.4.30) is
    ⟨ϕn,T, f⟩ = ∫_{−∞}^{∞} ϕ0,T(τ − nT) f(τ) dτ = ϕ0,T(−t) ∗ f(t)|_{t=nT},

which equals ϕ0,T (t)∗f (t) since ϕ0,T (t) is real and symmetric in t. That is, the inner
products, or coefficients, in the interpolation formula are simply the outputs of an
ideal lowpass filter with cutoff π/T sampled at multiples of T . This is the usual
view of the sampling theorem as a bandlimiting convolution followed by sampling
and reinterpolation.
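The following sketch (an added illustration, not from the original text) exercises the interpolation formula (2.4.27) on a toy bandlimited signal; the test signal, sampling rate, and truncation of the infinite sum are arbitrary choices.

    # Sinc interpolation (2.4.27): sample a bandlimited signal and rebuild it.
    import numpy as np

    T = 1 / 8.0                            # sampling period (ws = 16*pi)

    def f(t):                              # bandlimited: components at 1 and 2 Hz
        return np.cos(2 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)

    def sinc_T(t):                         # sin(pi t / T) / (pi t / T)
        return np.sinc(t / T)              # numpy's sinc(x) = sin(pi x)/(pi x)

    n = np.arange(-400, 401)               # truncation of the infinite sum
    t = np.linspace(-2, 2, 1001)           # evaluation grid
    fhat = (f(n * T)[None, :] * sinc_T(t[:, None] - (n * T)[None, :])).sum(axis=1)

    print(np.max(np.abs(fhat - f(t))))     # small; only truncation error remains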
To conclude this section, we will demonstrate a fact that will be used in Chap-
ter 4. It states that the following can be seen as a Fourier transform pair:

    ⟨f(t), f(t + n)⟩ = δ[n]  ←→  Σ_{k∈Z} |F(ω + 2kπ)|² = 1.       (2.4.31)

The left side of the equation is simply the deterministic autocorrelation¹⁰ of f (t)
evaluated at the integers, that is, the sampled autocorrelation. If we denote the
autocorrelation of f (t) by p(τ ) = ⟨f (t), f (t + τ )⟩, then the left side of (2.4.31) is
p1 (τ ) = p(τ ) s1 (τ ), where s1 (τ ) is as defined in (2.4.20) with T = 1. The Fourier
transform of p1 (τ ) is (apply (2.4.25))

    P1(ω) = Σ_{k∈Z} P(ω − 2kπ).

Since the Fourier transform of p(t) is P (ω) = |F (ω)|2 , we get that the Fourier
transform of the left side of (2.4.31) is the right side of (2.4.31).

2.4.6 Discrete-Time Fourier Transform


Given a sequence {f [n]}n∈Z , its discrete-time Fourier transform (DTFT) is defined
by
    F(e^{jω}) = Σ_{n=−∞}^{∞} f[n] e^{−jωn},                       (2.4.32)

which is 2π-periodic. Its inverse is given by


 π
1
f [n] = F (ejω ) ejωn dω. (2.4.33)
2π −π
A sufficient condition for the convergence of (2.4.32) is that the sequence f [n] be
absolutely summable. Then, convergence is uniform to a continuous function of ω
[211]. If the sequence is square-summable, then we have mean square convergence of
the series in (2.4.32) (that is, the energy of the error goes to zero as the summation
limits go to infinity). By using distributions, one can define discrete-time transforms
of more general sequences as well, for example [211]

    e^{jω0 n} ←→ 2π Σ_{k=−∞}^{∞} δ(ω − ω0 + 2πk).

¹⁰ The deterministic autocorrelation of a real function f (t) is f (t) ∗ f (−t) = ∫ f (τ ) f (τ + t) dτ .

Comparing (2.4.32–2.4.33) with the equivalent expressions for Fourier series (2.4.13–
2.4.14), one can see that they are duals of each other (within scale factors). Fur-
thermore, if the sequence f [n] is obtained by sampling a continuous-time function
f (t) at instants nT ,
f [n] = f (nT ), (2.4.34)
then the discrete-time Fourier transform is related to the Fourier transform of f (t).
Denoting the latter by Fc (ω), the Fourier transform of its sampled version is equal
to (see (2.4.26))

    FT(ω) = Σ_{n=−∞}^{∞} f(nT) e^{−jnTω} = (1/T) Σ_{k=−∞}^{∞} Fc(ω − 2πk/T).      (2.4.35)

Now consider (2.4.32) at ωT and use (2.4.34), thus




    F(e^{jωT}) = Σ_{n=−∞}^{∞} f(nT) e^{−jnωT}

and, using (2.4.35),


    F(e^{jωT}) = (1/T) Σ_{k=−∞}^{∞} Fc(ω − 2πk/T).                (2.4.36)

Because of these close relationships with the Fourier transform and Fourier series,
it follows that all properties seen earlier carry over and we will only repeat two of
the most important ones (for others, see [211]).

Convolution Given two sequences f [n] and g[n] and their discrete-time Fourier
transforms F (ejω ) and G(ejω ), then

    f[n] ∗ g[n] = Σ_{l=−∞}^{∞} f[n − l] g[l] = Σ_{l=−∞}^{∞} f[l] g[n − l]  ←→  F(e^{jω}) G(e^{jω}).

Parseval’s Equality With the same notations as above, we have



    Σ_{n=−∞}^{∞} f*[n] g[n] = (1/2π) ∫_{−π}^{π} F*(e^{jω}) G(e^{jω}) dω,      (2.4.37)

and in particular, when g[n] = f [n],



    Σ_{n=−∞}^{∞} |f[n]|² = (1/2π) ∫_{−π}^{π} |F(e^{jω})|² dω.

2.4.7 Discrete-Time Fourier Series


If a discrete-time sequence is periodic with period N , that is, f [n] = f [n + lN ],
l ∈ Z, then its discrete-time Fourier series representation is given by


    F[k] = Σ_{n=0}^{N−1} f[n] W_N^{nk},   k ∈ Z,                  (2.4.38)

    f[n] = (1/N) Σ_{k=0}^{N−1} F[k] W_N^{−nk},   n ∈ Z,           (2.4.39)

where WN = e−j2π/N is the N th root of unity. That this is an analysis-synthesis pair is easily


verified by using the orthogonality of the roots of unity (see (2.1.3)). Again, all the
familiar properties of Fourier transforms hold, taking periodicity into account. For
example, convolution is now periodic convolution, that is,


    f[n] ∗ g[n] = Σ_{l=0}^{N−1} f[n − l] g[l] = Σ_{l=0}^{N−1} f0[(n − l) mod N] g0[l],      (2.4.40)

where f0 [·] and g0 [·] are equal to one period of f [·] and g[·] respectively. That is,
f0 [n] = f [n], n = 0, . . . , N − 1, and 0 otherwise, and similarly for g0 [n]. Then, the
convolution property is given by

f [n] ∗ g[n] = f0 [n] ∗p g0 [n] ←→ F [k] G[k], (2.4.41)

where ∗p denotes periodic convolution. Parseval’s formula then follows as

    Σ_{n=0}^{N−1} f*[n] g[n] = (1/N) Σ_{k=0}^{N−1} F*[k] G[k].

Just as the Fourier series coefficients were related to the Fourier transform of one
period (see (2.4.14)), the coefficients of the discrete-time Fourier series can be ob-
tained from the discrete-time Fourier transform of one period. If we call F0 (ejω )
the discrete-time Fourier transform of f0 [n], (2.4.32) and (2.4.38) imply that

 
    F0(e^{jω}) = Σ_{n=−∞}^{∞} f0[n] e^{−jωn} = Σ_{n=0}^{N−1} f[n] e^{−jωn},

leading to
F [k] = F0 (ejω )|ω=k2π/N .
The sampling of F0 (ejω ) simply repeats copies of f0 [n] at integer multiples of N ,
and thus we have

    f[n] = Σ_{l=−∞}^{∞} f0[n − lN] = (1/N) Σ_{k=0}^{N−1} F[k] e^{jnk2π/N} = (1/N) Σ_{k=0}^{N−1} F0(e^{jk2π/N}) e^{jnk2π/N},      (2.4.42)
which is the discrete-time version of the Poisson sum formula. It actually holds
for f0 [·] with support larger than 0, . . . , N − 1, as long as the first sum in (2.4.42)
converges. For n = 0, (2.4.42) yields

    Σ_{l=−∞}^{∞} f0[lN] = (1/N) Σ_{k=0}^{N−1} F0(e^{jk2π/N}).
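This last identity is easily verified numerically; the sketch below (an added illustration) checks it on a random finite-length sequence, with the length and N chosen arbitrarily.

    # Discrete-time Poisson sum formula, n = 0 case:
    # sum_l f0[l*N] = (1/N) * sum_k F0(e^{j 2 pi k / N}).
    import numpy as np

    rng = np.random.default_rng(0)
    f0 = rng.standard_normal(23)           # finite support 0..22
    N = 5

    lhs = f0[::N].sum()
    ns = np.arange(len(f0))
    rhs = sum((f0 * np.exp(-2j * np.pi * k * ns / N)).sum() for k in range(N)).real / N

    print(lhs, rhs)                        # agree to machine precision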

2.4.8 Discrete Fourier Transform


The importance of the discrete-time Fourier transform of a finite-length sequence
(which can be one period of a periodic sequence) leads to the definition of the
discrete Fourier transform (DFT). This transform is very important for computa-
tional reasons, since it can be implemented using the fast Fourier transform (FFT)
algorithm (see Chapter 6). The DFT is defined as


    F[k] = Σ_{n=0}^{N−1} f[n] W_N^{nk},                           (2.4.43)

and its inverse as


    f[n] = (1/N) Σ_{k=0}^{N−1} F[k] W_N^{−nk},                    (2.4.44)

where WN = e−j2π/N . These are the same formulas as (2.4.38–2.4.39), except that
f [n] and F [k] are only defined for n, k ∈ {0, . . . , N − 1}. Recall that the discrete-time

Fourier transform of a finite-length sequence can be sampled at ω = 2πk/N (which


periodizes the sequence). Therefore, it is useful to think of the DFT as the transform
of one period of a periodic signal, or a sampling of the DTFT of a finite-length signal.
In both cases, there is an underlying periodic signal. Therefore, all properties are
with respect to this inherent periodicity. For example, the convolution property of
the DFT leads to periodic convolution (see (2.4.40)). Because of the finite-length
signals involved, the DFT is a mapping on C N and can thus be best represented as
a matrix-vector product. Calling F the Fourier matrix with entries

Fn,k = WNnk , n, k = 0, . . . , N − 1,

then its inverse is equal to (following (2.4.44))


    F −1 = (1/N ) F ∗ .                                           (2.4.45)
Given a sequence {f [0], f [1], . . . , f [N − 1]}, we can define a circular convolution
matrix C with a first line equal to {f [0], f [N − 1], . . . , f [1]} and each subsequent
line being a right circular shift of the previous one. Then, circular convolution of
{f [n]} with a sequence {g[n]} can be written as

f ∗p g = Cg = F −1 ΛF g,

according to the convolution property (2.4.40–2.4.41), where Λ is a diagonal matrix


with F [k] on its diagonal. Conversely, this means that C is diagonalized by F
or that the complex exponential sequences {ej(2π/N )nk } = WN−nk are eigenvectors
of the convolution matrix C, with eigenvalues F [k]. Note that the time reversal
associated with convolution is taken into account in the definition of the circulant
matrix C.
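The diagonalization of C by the Fourier matrix can be confirmed in a few lines, as in the sketch below (an added illustration). Note that scipy.linalg.circulant builds the matrix from its first column, here ( f[0] f[1] · · · f[N−1] )T, which is consistent with the first row given above.

    # Circular convolution matrix diagonalized by the DFT:
    # W C W^{-1} = diag(F[0], ..., F[N-1]), with F[k] the DFT of f.
    import numpy as np
    from scipy.linalg import circulant

    N = 8
    rng = np.random.default_rng(1)
    f = rng.standard_normal(N)

    C = circulant(f)                       # circular convolution by f
    W = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)  # Fourier matrix

    Lam = W @ C @ np.linalg.inv(W)
    print(np.allclose(Lam, np.diag(np.fft.fft(f)), atol=1e-10))         # True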
Using matrix notation, Parseval’s formula for the DFT follows easily. Call f̂
the Fourier transform of the vector f = ( f [0] f [1] · · · f [N − 1] )T , that is

f̂ = F f ,

and a similar definition for ĝ as the Fourier transform of g. Then

    f̂∗ ĝ = (F f)∗ (F g) = f ∗ F ∗ F g = N f ∗ g,

where we used (2.4.45), that is, the fact that F ∗ is the inverse of F up to a scale
factor of N .
Other properties of the DFT follow from their counterparts for the discrete-time
Fourier transform, bearing in mind the underlying circular structure implied by the
discrete-time Fourier series (for example, a shift is a circular shift).

[Figure 2.3: Fourier transforms with various combinations of continuous/discrete time and frequency variables (see also Table 2.1). (a) Continuous-time Fourier transform. (b) Continuous-time Fourier series (note that the frequency-domain function is discrete in frequency, appearing at multiples of 2π/T, with weights F[k]). (c) Discrete-time Fourier transform (note that the time-domain function is discrete in time, appearing at multiples of 2π/ωs, with weights f[n]). (d) Discrete-time Fourier series.]

2.4.9 Summary of Various Flavors of Fourier Transforms

Between the Fourier transform, where both time and frequency variables are con-
tinuous, and the discrete-time Fourier series (DTFS), where both variables are
discrete, there are a number of intermediate cases.
First, in Table 2.1 and Figure 2.3, we compare the Fourier transform, Fourier

[Figure 2.4: Fourier transforms with length and bandwidth restrictions on the signal (see also Table 2.2). (a) Fourier transform of bandlimited signals, where the time-domain signal can be sampled. Note that the function in the frequency domain has support on (−ωs/2, ωs/2). (b) Fourier transform of finite-length signals, where the frequency-domain signal can be sampled. (c) Fourier series of bandlimited periodic signals (it has a finite number of Fourier components). (d) Discrete-time Fourier transform of finite-length sequences.]

series, discrete-time Fourier transform and discrete-time Fourier series. The table
shows four combinations of continuous versus discrete variables in time and fre-
quency. As defined in Section 2.4.1, we use a short-hand CT or DT for continuous-
versus discrete-time variable, and we call it a Fourier transform or series if the
synthesis formula involves an integral or a summation.
Then, in Table 2.2 and Figure 2.4, we consider the same transforms but when
the signal satisfies some additional restrictions, that is, when it is limited either in
time or in frequency. In that case, the continuous function (of time or frequency)
can be sampled without loss of information.

Table 2.1 Fourier transforms with various combinations of continuous/discrete time
and frequency variables. CT and DT stand for continuous and discrete time, while FT
and FS stand for Fourier transform (integral synthesis) and Fourier series (summation
synthesis). P stands for a periodic signal. The relation between sampling period T and
sampling frequency ωs is ωs = 2π/T. Note that in the DTFT case, ωs is usually equal
to 2π (T = 1).

(a) Fourier transform (CTFT). Time: continuous. Frequency: continuous. Self-dual.
    Analysis:   F(ω) = ∫_t f(t) e^{−jωt} dt
    Synthesis:  f(t) = (1/2π) ∫_ω F(ω) e^{jωt} dω

(b) Fourier series (CTFS). Time: continuous, periodic. Frequency: discrete. Dual with the DTFT.
    Analysis:   F[k] = (1/T) ∫_{−T/2}^{T/2} f(t) e^{−j2πkt/T} dt
    Synthesis:  f(t) = Σ_k F[k] e^{j2πkt/T}

(c) Discrete-time Fourier transform (DTFT). Time: discrete. Frequency: continuous, periodic. Dual with the CTFS.
    Analysis:   F(e^{jω}) = Σ_n f[n] e^{−j2πωn/ωs}
    Synthesis:  f[n] = (1/ωs) ∫_{−ωs/2}^{ωs/2} F(e^{jω}) e^{j2πωn/ωs} dω

(d) Discrete-time Fourier series (DTFS). Time: discrete, periodic. Frequency: discrete, periodic. Self-dual.
    Analysis:   F[k] = Σ_{n=0}^{N−1} f[n] e^{−j2πnk/N}
    Synthesis:  f[n] = (1/N) Σ_{k=0}^{N−1} F[k] e^{j2πnk/N}
Table 2.2 Various Fourier transforms with restrictions on the signals involved. Either
the signal is of finite length (FL) or the Fourier transform is bandlimited (BL).

(a) Fourier transform of bandlimited signal (BL-CTFT). Time: can be sampled.
    Frequency: support (−ωs/2, ωs/2). Equivalence: sample time, periodize frequency.
    Dual with the FL-CTFT.

(b) Fourier transform of finite-length signal (FL-CTFT). Time: support (0, T).
    Frequency: can be sampled. Equivalence: periodize time, sample frequency.
    Dual with the BL-CTFT.

(c) Fourier series of bandlimited periodic signal (BL-CTFS). Time: periodic, can be
    sampled. Frequency: finite number of Fourier coefficients. Equivalence: sample
    time, finite Fourier series in time. Dual with the FL-DTFT.

(d) Discrete-time Fourier transform of finite-length sequence (FL-DTFT). Time: finite
    number of samples. Frequency: periodic, can be sampled. Equivalence: sample
    frequency, finite Fourier series in frequency. Dual with the BL-CTFS.

2.5 Signal Processing


This section briefly covers some fundamental notions of continuous and discrete-
time signal processing. Our focus is on linear time-invariant or periodically time-
varying systems. For these, weighted complex exponentials play a special role,
leading to the Laplace and z-transform as useful generalizations of the continu-
ous and discrete-time Fourier transforms. Within this class of systems, we are
particularly interested in those having finite-complexity realizations or finite-order
differential/difference equations. These will have rational Laplace or z-transforms,
which we assume in what follows. For further details, see [211, 212]. We also discuss
the basics of multirate signal processing which is at the heart of the material on
discrete-time bases in Chapter 3. More material on multirate signal processing can
be found in [67, 308].

2.5.1 Continuous-Time Signal Processing


Signal processing, which is based on Fourier theory, is concerned with actually
implementing algorithms. So, for example, the study of filter structures and their
associated properties is central to the subject.

The Laplace Transform An extension of the Fourier transform to the complex


plane (instead of just the frequency axis) is the following:
    F(s) = ∫_{−∞}^{∞} f(t) e^{−st} dt,

where s = σ + jω. This is equivalent, for a given σ, to the Fourier transform of


f (t)·e−σt , that is, the transform of an exponentially weighted signal. Now, the above
transform does not in general converge for all s, that is, associated with it is a region
of convergence (ROC). The ROC has the following important properties [212]: The
ROC is made up of strips in the complex plane parallel to the jω-axis. If the jω-axis
is contained in the ROC, then the Fourier transform converges. Note that if the
Laplace transform is rational, then the ROC cannot contain any poles. If a signal
is right-sided (that is, zero for t < T0 ) or left-sided (zero for t > T1 ), then the ROC
is right- or left-sided, respectively, in the sense that it extends from some vertical
line (corresponding to the limit value of Re(s) up to where the Laplace transform
converges) all the way to Re(s) becoming plus or minus infinity. It follows that a

finite-length signal has the whole complex plane as its ROC (assuming it converges
anywhere), since it is both left- and right-sided and connected.
If a signal is two-sided, that is, neither left- nor right-sided, then its ROC is the
intersection of the ROC’s of its left- and right-sided parts. This ROC is therefore
either empty or of the form of a vertical strip.
Given a Laplace transform (such as a rational expression), different ROC’s lead
to different time-domain signals. Let us illustrate this with an example.

Example 2.1
Assume F (s) = 1/((s + 1)(s + 2)). The ROC {Re(s) < −2} corresponds to a left-sided
signal
f (t) = −(e−t − e−2t ) u(−t).
The ROC {Re(s) > −1} corresponds to a right-sided signal

f (t) = (e−t − e−2t ) u(t).

Finally, the ROC {−2 < Re(s) < −1} corresponds to a two-sided signal

f (t) = −e−t u(−t) − e−2t u(t).

Note that only the right-sided signal would also have a Fourier transform (since its ROC
includes the jω-axis).

For the inversion of the Laplace transform, recall its relation to the Fourier
transform of an exponentially weighted signal. Then, it can be shown that its
inverse is

    f(t) = (1/2πj) ∫_{σ−j∞}^{σ+j∞} F(s) e^{st} ds,
where σ is chosen inside the ROC. We will denote a Laplace transform pair by

f (t) ←→ F (s), s ∈ ROC.

For a review of Laplace transform properties, see [212]. Next, we will concentrate
on filtering only.

Linear Time-Invariant Systems The convolution theorem of the Laplace trans-


form follows immediately from the fact that exponentials are eigenfunctions of the
convolution operator. For, if f (t) = h(t) ∗ g(t) and h(t) = est , then
  
    f(t) = ∫ h(t − τ) g(τ) dτ = ∫ e^{s(t−τ)} g(τ) dτ = e^{st} ∫ e^{−sτ} g(τ) dτ = e^{st} G(s).

The eigenvalue attached to est is the Laplace transform of g(t) at s. Thus,

f (t) = h(t) ∗ g(t) ←→ F (s) = H(s) G(s),



with an ROC containing the intersection of the ROC’s of H(s) and G(s).
The differentiation property of the Laplace transform says that

    ∂f(t)/∂t ←→ s F(s),
with ROC containing the ROC of F (s). Then, it follows that linear constant-
coefficient differential equations can be characterized by a Laplace transform called
the transfer function H(s). Linear, time-invariant differential equations, given by


    Σ_{k=0}^{N} a_k ∂^k y(t)/∂t^k = Σ_{k=0}^{M} b_k ∂^k x(t)/∂t^k,      (2.5.1)

lead, after taking the Laplace transform, to the following ratio:


    H(s) = Y(s)/X(s) = ( Σ_{k=0}^{M} b_k s^k ) / ( Σ_{k=0}^{N} a_k s^k ),

that is, the input and the output are related by a convolution with a filter having
impulse response h(t), where h(t) is the inverse Laplace transform of H(s).
To take this inverse Laplace transform, we need to specify the ROC. Typically,
we look for a causal solution, where we solve the differential equation forward
in time. Then, the ROC extends to the right of the vertical line which passes
through the rightmost pole. Stability¹¹ of the filter corresponding to the transfer
function requires that the ROC include the jω-axis. This leads to the well-known
requirement that a causal system with rational transfer function is stable if and
only if all the poles are in the left half-plane (the real part of the pole location is
smaller than zero). In the above discussion, we have assumed initial rest conditions,
that is, the homogeneous solution of differential Equation (2.5.1) is zero (otherwise,
the system is neither linear nor time-invariant).

Example 2.2 Butterworth Filters


Among various classes of continuous-time filters we will briefly describe the Butterworth
filters, both because they are simple and because they will reappear later as useful filters in
the context of wavelets. The magnitude squared of the Fourier transform of an N th-order
Butterworth filter is given by

    |HN(jω)|² = 1 / (1 + (jω/jωc)^{2N}),                          (2.5.2)

where ωc is a parameter which will specify the cutoff frequency beyond which sinusoids are
substantially attenuated. Thus, ωc defines the bandwidth of the lowpass Butterworth filter.
¹¹ Stability of a filter means that a bounded input produces a bounded output.

Since |HN (jω)|2 = H(jω)H ∗ (jω) = H(jω)H(−jω) when the filter is real, and noting that
(2.5.2) is the Laplace transform for s = jω, we get

    H(s) H(−s) = 1 / (1 + (s/jωc)^{2N}).                          (2.5.3)

The poles of H(s)H(−s) are thus at (−1)1/2N (jωc ), or

    |sk| = ωc,   arg[sk] = π(2k + 1)/(2N) + π/2,
and k = 0, . . . , 2N − 1. The poles thus lie on a circle, and they appear in pairs at ±sk .
To get a stable and causal filter, one simply chooses the N poles which lie on the left-hand
side half-circle. Since pole locations specify the filter only up to a scale factor, set s = 0
in (2.5.3) which leads to H(0) = 1. For example, a second-order Butterworth filter has the
following Laplace transform:

    H2(s) = ωc² / ((s + ωc e^{jπ/4})(s + ωc e^{−jπ/4})).          (2.5.4)

One can find its “physical” implementation by going back, through the inverse Laplace
transform, to the equivalent linear constant-coefficient differential equation. See also Ex-
ample 3.6 in Chapter 3, for discrete-time Butterworth filters.
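As a numerical companion (added here, not from the text), the sketch below computes the poles of H(s)H(−s) from the formulas above for N = 2 and keeps the left half-plane pair, recovering the denominator of (2.5.4).

    # Butterworth pole selection: keep the left half-plane poles of H(s)H(-s).
    import numpy as np

    N, wc = 2, 1.0
    k = np.arange(2 * N)
    poles = wc * np.exp(1j * (np.pi * (2 * k + 1) / (2 * N) + np.pi / 2))
    lhp = poles[poles.real < 0]            # stable, causal choice

    print(np.round(lhp, 4))                # wc * exp(+/- j 3 pi / 4) for N = 2
    print(np.round(np.poly(lhp).real, 4))  # [1, sqrt(2) wc, wc^2]: cf. (2.5.4)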

2.5.2 Discrete-Time Signal Processing


Just as the Laplace transform was a generalization of the Fourier transform, the
z-transform will be introduced as a generalization of the discrete-time Fourier trans-
form [149]. Again, it will be most useful for the study of difference equations (the
discrete-time equivalent of differential equations) and the associated discrete-time
filters.

The z-Transform The forward z-transform is defined as




    F(z) = Σ_{n=−∞}^{∞} f[n] z^{−n},                              (2.5.5)

where z ∈ C. On the unit circle z = ejω , this is the discrete-time Fourier transform
(2.4.32), and for z = ρejω , it is the discrete-time Fourier transform of the sequence
f [n] · ρ−n . Similarly to the Laplace transform, there is a region of convergence
(ROC) associated with the z-transform F (z), namely a region of the complex plane
where F (z) converges. Consider the case where the z-transform is rational and
the sequence is bounded in amplitude. The ROC does not contain any pole. If the
sequence is right-sided (left-sided), the ROC extends outward (inward) from a circle
with the radius corresponding to the modulus of the outermost (innermost) pole. If
the sequence is two-sided, the ROC is a ring. The discrete-time Fourier transform

converges absolutely if and only if the ROC contains the unit circle. From the
above discussion, it is clear that the unit circle in the z-plane of the z-transform
and the jω-axis in the s-plane of the Laplace transform play equivalent roles.
Also, just as in the Laplace transform, a given z-transform corresponds to dif-
ferent signals, depending on the ROC attached to it.
The inverse z-transform involves contour integration in the ROC and Cauchy’s
integral theorem [211]. If the contour of integration is the unit circle, the inver-
sion formula reduces to the discrete-time Fourier transform inversion (2.4.33). On
circles centered at the origin but of radius ρ different from 1, one can think of for-
ward and inverse z-transforms as the Fourier analysis and synthesis of a sequence
f ′[n] = ρ−n f [n]. Thus, convergence properties are as for the Fourier transform of the
exponentially weighted sequence. In the ROC, we can write formally a z-transform
pair as
f [n] ←→ F (z), z ∈ ROC.

When z-transforms are rational functions, the inversion is best done by partial
fraction expansion followed by term-wise inversion. Then, the z-transform pairs

    a^n u[n] ←→ 1/(1 − a z^{−1}),   |z| > |a|,                    (2.5.6)

and

    −a^n u[−n − 1] ←→ 1/(1 − a z^{−1}),   |z| < |a|,              (2.5.7)

are useful, where u[n] is the unit-step function (u[n] = 1, n ≥ 0, and 0 otherwise).
The above transforms follow from the definition (2.5.5) and the sum of geometric
series, and they are a good example of identical z-transforms with different ROC’s
corresponding to different signals.
As a simple example, consider the sequence

    f[n] = a^{|n|}

which, following (2.5.6–2.5.7), has a z-transform


    F(z) = 1/(1 − a z^{−1}) − 1/(1 − (1/a) z^{−1}),   ROC: |a| < |z| < |1/a|,

that is, a nonempty ROC only if |a| < 1. For more z-transform properties, see
[211].

Convolutions, Difference Equations and Discrete-Time Filters Just as in con-


tinuous time, complex exponentials are eigenfunctions of the convolution operator.
That is, if f [n] = h[n] ∗ g[n] and h[n] = z n , z ∈ C, then
  
    f[n] = Σ_k h[n − k] g[k] = Σ_k z^{n−k} g[k] = z^n Σ_k z^{−k} g[k] = z^n G(z).

The z-transform G(z) is thus the eigenvalue of the convolution operator for that
particular value of z. The convolution theorem follows as

f [n] = h[n] ∗ g[n] ←→ F (z) = H(z) G(z),

with an ROC containing the intersection of the ROC’s of H(z) and G(z). Convo-
lution with a time-reversed filter can be expressed as an inner product,
 
    f[n] = Σ_k x[k] h[n − k] = Σ_k x[k] h̃[k − n] = ⟨x[k], h̃[k − n]⟩,

where “ ˜ ” denotes time reversal, h̃[n] = h[−n].


It is easy to verify that the “delay by one” operator, that is, a discrete-time
filter with impulse response δ[n − 1] has a z-transform z −1 . That is why z −1 is
often called a delay, or z −1 is used in block diagrams to denote a delay. Then, given
x[n] with the z-transform X(z), x[n − k] has a z-transform

x[n − k] ←→ z −k X(z).

Thus, a linear constant-coefficient difference equation can be analyzed with the


z-transform, leading to the notion of a transfer function. We assume initial rest
conditions in the following, that is, all delay operators are set to zero initially. Then,
the homogeneous solution to the difference equation is zero. Assume a linear, time-
invariant difference equation given by


    Σ_{k=0}^{N} a_k y[n − k] = Σ_{k=0}^{M} b_k x[n − k],          (2.5.8)

and taking its z-transform using the delay property, we get the transfer function as
the ratio of the output and input z-transforms,
    H(z) = Y(z)/X(z) = ( Σ_{k=0}^{M} b_k z^{−k} ) / ( Σ_{k=0}^{N} a_k z^{−k} ).

The output is related to the input by a convolution with a discrete-time filter having
as impulse response h[n], the inverse z-transform of H(z). Again, the ROC depends

on whether we wish a causal¹² or an anticausal solution, and the system is stable


if and only if the ROC includes the unit circle. This leads to the conclusion that
a causal system with rational transfer function is stable if and only if all poles are
inside the unit circle (their modulus is smaller than one).
Note, however, that a system with poles inside and outside the unit circle can
still correspond to a stable system (but not a causal one). Simply gather poles inside
the unit circle into a causal impulse response, while poles outside correspond to an
anticausal impulse response, and thus, the stable impulse response is two-sided.
From a transfer function given by a z-transform it is always possible to get a
difference equation and thus a possible hardware implementation. However, many
different realizations have the same transfer function and depending on the ap-
plication, certain realizations will be vastly superior to others (for example, in
finite-precision implementation). Let us just mention that the most obvious im-
plementation which realizes the difference equation (2.5.8), called the direct-form
implementation is poor as far as coefficient quantization is concerned. A better
solution is obtained by factoring H(z) into single and/or complex conjugate roots
and implementing a cascade of such factors. For a detailed discussion of numerical
behavior of filter structures see [211].

Autocorrelation and Spectral Factorization An important concept which we


will use later in the book, is that of deterministic autocorrelation (autocorrelation
in the statistical sense will be discussed in Chapter 7, Appendix 7.A). We will say
that
    p[m] = ⟨h[n], h[n + m]⟩,
is the deterministic autocorrelation (or, simply autocorrelation from now on) of the
sequence h[n]. In Fourier domain, we have that

    P(e^{jω}) = Σ_{n=−∞}^{∞} p[n] e^{−jωn} = Σ_{n=−∞}^{∞} Σ_{k=−∞}^{∞} h∗[k] h[k + n] e^{−jωn}
              = H∗(e^{jω}) H(e^{jω}) = |H(e^{jω})|²,

that is, P (ejω ) is a nonnegative function on the unit circle. In other words, the
following is a Fourier-transform pair:

    p[m] = ⟨h[n], h[n + m]⟩ ←→ P(e^{jω}) = |H(e^{jω})|².

Similarly, in z-domain, the following is a transform pair:

    p[m] = ⟨h[n], h[n + m]⟩ ←→ P(z) = H(z) H∗(1/z)

(recall that the subscript ∗ implies conjugation of the coefficients but not of z).

¹² A discrete-time sequence x[n] is said to be causal if x[n] = 0 for n < 0.
Note that from the above, it is obvious that if zk is a zero of P (z), so is 1/zk∗ (that
also means that zeros on the unit circle are of even multiplicity). When h[n] is
real, and zk is a zero of H(z), then zk∗ , 1/zk , 1/zk∗ are zeros as well (they are not
necessarily different).
Suppose now that we are given an autocorrelation function P (z) and we want
to find H(z). Here, H(z) is called a spectral factor of P (z) and the technique of
extracting it, spectral factorization. These spectral factors are not unique, and are
obtained by assigning one zero out of each zero pair to H(z) (we assume here that
p[m] is FIR, otherwise allpass functions (2.5.10) can be involved). The choice of
which zeros to assign to H(z) leads to different spectral factors. To obtain a spectral
factor, first factor P (z) into its zeros as follows:

"
Nu "
N "
N
P (z) = α ((1 − z1i z ) (1 − z1i z)) (1 − z2i z ) (1 − z2∗i z),
−1 −1

i=1 i=1 i=1

where the first product contains the zeros on the unit circle, and thus |z1i | = 1,
and the last two contain pairs of zeros inside/outside the unit circle, respectively.
In that case, |z2i | < 1. To obtain various H(z), one has to take one zero out of
each zero pair on the unit circle, as well as one of two zeros inside/outside the
unit circle. Note that all these solutions have the same magnitude response but
different phase behavior. An important case is the minimum phase solution which
is the one, among all causal spectral factors, that has the smallest phase term. To
get a minimum phase solution, we will consistently choose the zeros inside the unit
circle. Thus, H(z) would be of the form

    H(z) = √α Π_{i=1}^{Nu} (1 − z1i z^{−1}) · Π_{i=1}^{N} (1 − z2i z^{−1}).
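This procedure is easy to prototype, as in the added sketch below. It starts from a known FIR filter, forms the autocorrelation, and recovers the minimum phase factor by keeping the roots inside the unit circle; for simplicity it assumes no zeros lie exactly on the unit circle.

    # Minimum phase spectral factorization of an FIR autocorrelation.
    import numpy as np

    h = np.array([1.0, 2.0, 3.0])          # test filter (maximum phase here)
    p = np.convolve(h, h[::-1])            # autocorrelation: P(z) = H(z) H(z^{-1})

    roots = np.roots(p)                    # roots of z^2 P(z), same as zeros of P(z)
    inside = roots[np.abs(roots) < 1]      # one zero of each reciprocal pair
    hmin = np.poly(inside).real            # monic factor from the inside zeros
    hmin *= np.sqrt(p[len(p) // 2] / np.convolve(hmin, hmin[::-1])[len(p) // 2])

    print(hmin)                            # [3, 2, 1], the minimum phase factor
    print(np.convolve(hmin, hmin[::-1]))   # recovers p to machine precision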

Examples of Discrete-Time Filters Discrete-time filters come in two major


classes. The first class consists of infinite impulse response (IIR) filters, which
correspond to difference equations where the present output depends on past out-
puts (that is, N ≥ 1 in (2.5.8)). IIR filters often depend on a finite number of past
outputs (N < ∞) in which case the transfer function is a ratio of polynomials in
z −1 . Often, by abuse of language, we will call an IIR filter a filter with a rational
transfer function. The second class corresponds to nonrecursive, or finite impulse
response (FIR) filters, where the output only depends on the inputs (or N = 0 in
(2.5.8)). The z-transform is thus a polynomial in z −1 . An important class of FIR
filters are those which have symmetric or antisymmetric impulse responses because
this leads to a linear phase behavior of their Fourier transform. Consider causal

FIR filters of length L. When the impulse response is symmetric, one can write

H(ejω ) = e−jω(L−1)/2 A(ω),

where L is the length of the filter, and A(ω) is a real function of ω. Thus, the phase
is a linear function of ω. Similarly, when the impulse response is antisymmetric,
one can write
H(ejω ) = je−jω(L−1)/2 B(ω),
where B(ω) is a real function of ω. Here, the phase is an affine function of ω (but
usually called linear phase).
One way to design discrete-time filters is by transformation of an analog filter.
For example, one can sample the impulse response of the analog filter if its magni-
tude frequency response is close enough to being bandlimited. Another approach
consists of mapping the s-plane of the Laplace transform into the z-plane. From
our previous discussion of the relationship between the two planes, it is clear that
the jω-axis should map into the unit circle and the left half-plane should become
the inside of the unit circle in order to preserve stability. Such a mapping is given
by the bilinear transformation [211]

    B(z) = β (1 − z^{−1}) / (1 + z^{−1}).
Then, the discrete-time filter Hd is obtained from a continuous-time filter Hc by
setting
Hd (z) = Hc (B(z)).
Considering what happens on the jω-axis and the unit circle, it can be verified that
the bilinear transform warps the frequency axis as ω = 2 arctan(ωc /β), where ω
and ωc are the discrete and continuous frequency variables, respectively.
As an example, the discrete-time Butterworth filter has a magnitude frequency
response equal to
    |H(e^{jω})|² = 1 / (1 + (tan(ω/2) / tan(ω0/2))^{2N}).         (2.5.9)
This squared magnitude is flat at the origin, in the sense that its first 2N − 1
derivatives are zero at ω = 0. Note that since we have a closed-form factorization of
the continuous-time Butterworth filter (see (2.5.4)), it is best to apply the bilinear
transform to the factored form rather than factoring (2.5.9) in order to obtain
H(ejω ) in its cascade form.
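A short scipy-based sketch of this route (added, not from the text): map the second-order analog filter (2.5.4) to discrete time with scipy.signal.bilinear. Note that scipy fixes the constant of the bilinear transformation to β = 2fs.

    # Discrete-time Butterworth via the bilinear transform.
    import numpy as np
    from scipy import signal

    wc = 1.0                                              # analog cutoff (rad/s)
    b_a, a_a = [wc ** 2], [1.0, np.sqrt(2) * wc, wc ** 2] # H2(s), cf. (2.5.4)
    b_d, a_d = signal.bilinear(b_a, a_a, fs=1.0)          # beta = 2 * fs

    w, H = signal.freqz(b_d, a_d)
    print(abs(H[0]), abs(H[-1]))           # ~1 at dc, small near pi: lowpass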
Instead of the above indirect construction, one can design discrete-time filters
directly. This leads to better designs at a given complexity of the filter or, con-
versely, to lower-complexity filters for a given filtering performance.

In the particular case of FIR linear phase filters (that is, a finite-length sym-
metric or antisymmetric impulse response), a powerful design method called the
Parks-McClellan algorithm [211] leads to optimal filters in the minimax sense (the
maximum deviation from the desired Fourier transform magnitude is minimized).
The resulting approximation of the desired frequency response becomes equiripple
both in the passband and stopband (the approximation error is evenly spread out).
It is thus very different from a monotonically decreasing approximation as achieved
by a Butterworth filter.
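A minimal design with scipy.signal.remez is sketched below (an added illustration; the filter length and band edges are arbitrary choices). The symmetric impulse response confirms the linear phase property.

    # Equiripple linear phase FIR design (Parks-McClellan).
    from scipy import signal

    numtaps = 31
    bands = [0.0, 0.20, 0.25, 0.5]         # band edges, in cycles per sample
    desired = [1.0, 0.0]                   # lowpass: passband 1, stopband 0
    h = signal.remez(numtaps, bands, desired)

    print(max(abs(h - h[::-1])))           # ~0: symmetric, hence linear phase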
Finally, we discuss the allpass filter, which is an example of what could be called
a unitary filter. An allpass filter has the property that

|Hap (ejω )| = 1, (2.5.10)

for all ω. Calling y[n] the output of the allpass when x[n] is input, we have

    ‖y‖² = (1/2π) ‖Y(e^{jω})‖² = (1/2π) ‖Hap(e^{jω}) X(e^{jω})‖² = (1/2π) ‖X(e^{jω})‖² = ‖x‖²,
which means it conserves the energy of the signal it filters. An elementary single-
pole/zero allpass filter is of the following form (see also Appendix 3.A in Chapter
3):
    Hap(z) = (z^{−1} − a∗) / (1 − a z^{−1}).                      (2.5.11)
Writing the pole location as a = ρejθ , the zero is at 1/a∗ = (1/ρ)ejθ . A general
allpass filter is made up of elementary sections as in (2.5.11)

"
N
z −1 − a∗i P̃ (z)
Hap (z) = = , (2.5.12)
1 − ai z −1 P (z)
i=1

where P̃ (z) = z −N P∗ (z −1 ) is the time-reversed and coefficient-conjugated version


of P (z) (recall that the subscript ∗ stands for conjugation of the coefficients of the
polynomial, but not of z). On the unit circle,

    Hap(e^{jω}) = e^{−jωN} P∗(e^{jω}) / P(e^{jω}),
and property (2.5.10) follows easily. That all rational functions satisfying (2.5.10)
can be factored as in (2.5.12) is shown in [308].
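The unit-magnitude property of the elementary section (2.5.11) is readily confirmed numerically, as in this added sketch with an arbitrary pole location:

    # |Hap(e^{jw})| = 1 for the single pole/zero allpass section (2.5.11).
    import numpy as np

    a = 0.7 * np.exp(1j * 0.3)             # pole at a, zero at 1/conj(a)
    w = np.linspace(-np.pi, np.pi, 512)
    zinv = np.exp(-1j * w)                 # z^{-1} on the unit circle

    Hap = (zinv - np.conj(a)) / (1 - a * zinv)
    print(np.max(np.abs(np.abs(Hap) - 1)))  # ~1e-16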

2.5.3 Multirate Discrete-Time Signal Processing


As implied by its name, multirate signal processing deals with discrete-time se-
quences taken at different rates. While one can always go back to an underlying

continuous-time signal and resample it at a different rate, most often, the rate
changes are being done in the discrete-time domain. We review some of the key
results. For further details, see [67] and [308].

Sampling Rate Changes   Downsampling or subsampling¹³ a sequence x[n] by an


integer factor N results in a sequence y[n] given by

y[n] = x[nN ],

that is, all samples with indexes modulo N different from zero are discarded. In
the Fourier domain, we get

    Y(e^{jω}) = (1/N) Σ_{k=0}^{N−1} X(e^{j(ω−2πk)/N}),            (2.5.13)

that is, the spectrum is stretched by N , and (N − 1) aliased versions at multiples


of 2π are added. They are called aliased because they are copies of the original
spectrum (up to a stretch) but shifted in frequency. That is, low-frequency com-
ponents will be replicated at the aliasing frequencies ωi = 2πi/N , as will high
frequencies (with an appropriate shift). Thus, some high-frequency sinusoid might
have a low-frequency alias. Note that the aliased components are nonharmonically
related to the original frequency component; a fact that can be very disturbing in
applications such as audio. Sometimes, it is useful to extend the above relation to
the z-transform domain:

    Y(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^k z^{1/N}),                  (2.5.14)

where WN = e−j2π/N as usual. To prove (2.5.14), consider first a signal x′[n] which
equals x[n] at multiples of N , and 0 elsewhere. If x[n] has z-transform X(z), then
X′(z) equals

    X′(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^k z)                        (2.5.15)

as can be shown by using the orthogonality of the roots of unity (2.1.3). To obtain
y[n] from x′[n], one has to drop the extra zeros between the nonzero terms or
contract the signal by a factor of N . This is obtained by substituting z 1/N for z in
(2.5.15), leading to (2.5.14). Note that (2.5.15) contains the signal X as well as its
N − 1 modulated versions (on the unit circle, X(W_N^k z) = X(e^{j(ω − 2πk/N)})). This
is the reason why, in Chapter 3, we will call the analysis dealing with X(W_N^k z)
modulation-domain analysis.

¹³ Sometimes, the term decimation is used even though it historically stands for “keep 9 out of 10” in reference to a Roman practice of killing every tenth soldier of a defeated army.

[Figure 2.5: Downsampling by 3 in the frequency domain. (a) Original spectrum (we assume a real spectrum for simplicity). (b) The three stretched replicas and the sum Y(e^{jω}).]
An alternative proof of (2.5.13) (which is (2.5.14) on the unit circle) consists
of going back to the underlying continuous-time signal and resampling with an
N -times larger sampling period. This is considered in Problem 2.10.
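Formula (2.5.13) can also be checked directly on a finite sequence, as in the following added sketch, which compares the two sides at an arbitrary test frequency.

    # Downsampling in the frequency domain, cf. (2.5.13), for N = 3.
    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.standard_normal(60)
    N = 3
    y = x[::N]                             # y[n] = x[nN]

    def dtft(seq, w):                      # X(e^{jw}) of a finite sequence
        return (seq * np.exp(-1j * w * np.arange(len(seq)))).sum()

    w = 1.1                                # arbitrary test frequency
    lhs = dtft(y, w)
    rhs = sum(dtft(x, (w - 2 * np.pi * k) / N) for k in range(N)) / N
    print(abs(lhs - rhs))                  # ~1e-14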
By way of an example, we show the case N = 3 in Figure 2.5. It is obvious
that in order to avoid aliasing, downsampling by N should be preceded by an ideal
lowpass filter with cutoff frequency π/N (see Figure 2.6(a)). Its impulse response
h[n] is given by
    h[n] = (1/2π) ∫_{−π/N}^{π/N} e^{jωn} dω = sin(πn/N) / (πn).   (2.5.16)

[Figure 2.6: Sampling rate changes. (a) Downsampling by N preceded by ideal lowpass filtering with cutoff frequency π/N. (b) Upsampling by M followed by interpolation with an ideal lowpass filter with cutoff frequency π/M. (c) Sampling rate change by a rational factor M/N, with an interpolation filter in between. The cutoff frequency is the lesser of π/M and π/N.]

The converse of downsampling is upsampling by an integer M . That is, to obtain a


new sequence, one simply inserts M − 1 zeros between consecutive samples of the
input sequence, or

    y[n] = { x[n/M]   for n = kM, k ∈ Z,
             0        otherwise.

In Fourier domain, this amounts to

Y (ejω ) = X(ejM ω ), (2.5.17)

and similarly, in z-transform domain

Y (z) = X(z M ). (2.5.18)

Due to upsampling, the spectrum contracts by M . Besides the “base spectrum”


at multiples of 2π, there are spectral images in between which are due to the
interleaving of zeros in the upsampling. To get rid of these spectral images, a
perfect interpolator or a lowpass filter with cutoff frequency π/M has to be used,
as shown in Figure 2.6(b). Its impulse response is as given in (2.5.16), but with a
different scale factor,
    h[n] = sin(πn/M) / (πn/M).
It is easy to see that h[nM ] = δ[n]. Therefore, calling u[n] the result of the in-
terpolation, or u[n] = y[n] ∗ h[n], it follows that u[nM ] = x[n]. Thus, u[n] is a

perfect interpolation of x[n] in the sense that the missing samples have been filled
in without disturbing the original ones.
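The sample-preserving property u[nM] = x[n] survives even when the ideal interpolator is truncated, since the truncated filter still satisfies h[nM] = δ[n]. The following added sketch checks this numerically.

    # Upsampling by M, then (truncated) sinc interpolation: u[nM] = x[n].
    import numpy as np

    M = 4
    rng = np.random.default_rng(3)
    x = rng.standard_normal(32)

    y = np.zeros(len(x) * M)
    y[::M] = x                             # insert M - 1 zeros between samples

    kk = np.arange(-20 * M, 20 * M + 1)
    h = np.sinc(kk / M)                    # sin(pi n / M) / (pi n / M)
    u = np.convolve(y, h)                  # full convolution; h centered at 20*M

    print(np.max(np.abs(u[20 * M::M][:len(x)] - x)))  # 0: originals untouched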
A rational sampling rate change by M/N is obtained by cascading upsampling
and downsampling with an interpolation filter in the middle, as shown in Figure
2.6(c). The interpolation filter is the cascade of the ideal lowpass for the upsampling
and for the downsampling, that is, the narrower of the two in the ideal filter case.
Finally, we demonstrate a fact that will be extensively used in Chapter 3. It
can be seen as an application of downsampling followed by upsampling to the de-
terministic autocorrelation of g[n]. This is the discrete-time equivalent of (2.4.31).
We want to show that the following holds:

    ⟨g[n], g[n + Nl]⟩ = δ[l]  ←→  Σ_{k=0}^{N−1} G(W_N^k z) G(W_N^{−k} z^{−1}) = N.      (2.5.19)

The left side of the above equation is simply the autocorrelation of g[n] evaluated
at every N th index, m = N l. If we denote the autocorrelation of g[n] by p[n], then
the left side of (2.5.19) is p′[n] = p[N n]. The z-transform of p′[n] is (apply (2.5.14))

    P′(z) = (1/N) Σ_{k=0}^{N−1} P(W_N^k z^{1/N}).

Replace now z 1/N by z and since the z-transform of p[n] is P (z) = G(z)G(z −1 ), we
get that the z-transform of the left side of (2.5.19) is the right side of (2.5.19).

Multirate Identities

Commutativity of Sampling Rate Changes Upsampling by M and downsampling by


N commute if and only if M and N are coprime.
The relation is shown pictorially in Figure 2.7(a). Using (2.5.14) and (2.5.18)
for down- and upsampling in the z-domain, we find that upsampling by M followed
by downsampling by N leads to

    Yu/d(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^{kM} z^{M/N}),

while the reverse order leads to

    Yd/u(z) = (1/N) Σ_{k=0}^{N−1} X(W_N^k z^{M/N}).

For the two expressions to be equal, kM mod N has to be a permutation, that is,
kM mod N = l must have a unique solution for every l ∈ {0, . . . , N − 1}. If M and N
have a common factor L > 1, then M = M′L and N = N′L. Note that (kM mod N) mod L
is then zero, that is, kM mod N is a multiple of L and thus not a permutation.
If M and N are coprime, then Bezout’s identity [209] guarantees that there exist
two integers m and n such that mM + nN = 1. It follows that mM mod N = 1 and
thus k = ml mod N is the desired solution to the equation kM mod N = l. This
property has an interesting generalization in multiple dimensions (see for example
[152]).

[Figure 2.7: Multirate identities. (a) Commutativity of up- and downsampling. (b) Interchange of downsampling and filtering. (c) Interchange of filtering and upsampling.]
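The coprimality condition is easy to probe empirically, as in the following added sketch comparing the two orderings for a coprime and a non-coprime pair of factors.

    # Up/downsampling commute iff the factors are coprime.
    import numpy as np

    def up(x, M):
        y = np.zeros(len(x) * M); y[::M] = x; return y

    def down(x, N):
        return x[::N]

    rng = np.random.default_rng(4)
    x = rng.standard_normal(360)

    for M, N in [(2, 3), (2, 4)]:          # coprime / not coprime
        a, b = down(up(x, M), N), up(down(x, N), M)
        L = min(len(a), len(b))
        print(M, N, np.array_equal(a[:L], b[:L]))   # True, then False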

Interchange of Filtering and Downsampling   Downsampling by N followed by
filtering with a filter having z-transform H(z) is equivalent to filtering with the
upsampled filter H(z^N) before the downsampling.
Using (2.5.14), it follows that downsampling the filtered signal with the z-
transform X(z)H(z^N) results in

    (1/N) Σ_{k=0}^{N−1} X(W_N^k z^{1/N}) H((W_N^k z^{1/N})^N) = H(z) · (1/N) Σ_{k=0}^{N−1} X(W_N^k z^{1/N}),

which is equal to filtering a downsampled version of X(z).

Interchange of Filtering and Upsampling Filtering with a filter having the z-transform
H(z), followed by upsampling by N , is equivalent to upsampling followed by filtering
with H(z N ).
Using (2.5.18), it is immediate that both systems lead to an output with z-
transform X(z N )H(z N ) when the input is X(z).
In short, the last two properties simply say that filtering in the downsampled
domain can always be realized by filtering in the upsampled domain, but then with
the upsampled filter (down- and upsampled stand for the low versus high sampling
rate domain). The last two relations are shown in Figures 2.7(b) and (c).

[Figure 2.8: Polyphase transform (forward and inverse transforms for the case N = 3 are shown).]

Polyphase Transform Recall that in a time-invariant system, if input x[n] pro-


duces output y[n], then input x[n + m] will produce output y[n + m]. In a time-
varying system this is not true. However, there exist periodically time-varying
systems for which if input x[n] produces output y[n], then x[n + N m] produces
output y[n + mN ]. These systems are periodically time-varying with period N . For
example, a downsampler by N followed by an upsampler by N is such a system. A
downsampler alone is also periodically time-varying, but with a time-scale change.
Then, if x[n] produces y[n], x[n + mN ] produces y[n + m] (note that x[n] and y[n]
do not live on the same time-scale). Such periodically time-varying systems can
be analyzed with a simple but useful transform where a sequence is mapped into
N sequences with each being a shifted and downsampled version of the original
sequence. Obviously, the original sequence can be recovered by simply interleaving
the subsequences. Such a transform is called a polyphase transform of size N since
each subsequence has a different phase and there are N of them. The simplest
example is the case N = 2, where a sequence is subdivided into samples of even
and odd indexes, respectively. In general, we define the size-N polyphase transform
of a sequence x[n] as a vector of sequences ( x0 [n] x1 [n] · · · xN −1 [n] )T , where

xi [n] = x[nN + i].

These are called signal polyphase components. In z-transform domain, we can write
X(z) as the sum of shifted and upsampled polyphase components. That is,


    X(z) = Σ_{i=0}^{N−1} z^{−i} Xi(z^N),                          (2.5.20)

where


    Xi(z) = Σ_{n=−∞}^{∞} x[nN + i] z^{−n}.                        (2.5.21)

Figure 2.8 shows the signal polyphase transform and its inverse (for the case N = 3).
Because the forward shift requires advance operators which are noncausal, a causal
version would produce a total delay of N − 1 samples between forward and inverse
polyphase transform. Such a causal version is obtained by multiplying the noncausal
forward polyphase transform by z −N +1 .
Later we will need to express the output of filtering with H followed by down-
sampling in terms of the polyphase components of the input signal. That is, we
need the 0th polyphase component of H(z)X(z). This is easiest if we define a
polyphase decomposition of the filter to have the reverse phase of the one used for
the signal, or

    H(z) = Σ_{i=0}^{N−1} z^i Hi(z^N),                             (2.5.22)

with


    Hi(z) = Σ_{n=−∞}^{∞} h[Nn − i] z^{−n},   i = 0, . . . , N − 1.      (2.5.23)

Then the product H(z)X(z) after downsampling by N becomes


    Y(z) = Σ_{i=0}^{N−1} Hi(z) Xi(z).
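This identity can be verified numerically; in the added sketch below, the phase conventions follow (2.5.21) and (2.5.23), so that h_i[n] = h[nN − i] is obtained by prepending i zeros to h before keeping every Nth sample.

    # Polyphase implementation of "filter by h, then downsample by N".
    import numpy as np

    N = 3
    rng = np.random.default_rng(5)
    x = rng.standard_normal(90)
    h = rng.standard_normal(12)

    direct = np.convolve(h, x)[::N]        # filter, then keep every Nth sample

    poly = np.zeros(len(direct))
    for i in range(N):
        xi = x[i::N]                                 # x_i[n] = x[nN + i]
        hi = np.concatenate([np.zeros(i), h])[::N]   # h_i[n] = h[nN - i]
        c = np.convolve(hi, xi)[:len(poly)]
        poly[:len(c)] += c

    print(np.max(np.abs(direct - poly)))   # ~1e-15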

The same operation (filtering by h[n] followed by downsampling by N ) can be


expressed in matrix notation as

    ⎛  ⋮   ⎞   ⎛ ⋱        ⋮            ⋮            ⋮          ⎞ ⎛  ⋮   ⎞
    ⎜ y[0] ⎟ = ⎜ · · ·  h[L − 1]  · · ·  h[L − N ]  h[L − N − 1]  · · · ⎟ ⎜ x[0] ⎟ ,
    ⎜ y[1] ⎟   ⎜ · · ·     0      · · ·      0        h[L − 1]    · · · ⎟ ⎜ x[1] ⎟
    ⎝  ⋮   ⎠   ⎝ ⋱        ⋮            ⋮            ⋮          ⎠ ⎝  ⋮   ⎠

where L is the filter length, and the matrix operator will be denoted by H. Similarly,
upsampling by N followed by filtering by g[n] can be expressed as


    ⎛  ⋮   ⎞   ⎛ ⋱         ⋮         ⋮      ⎞ ⎛  ⋮   ⎞
    ⎜ x[0] ⎟   ⎜ · · ·    g[0]        0     · · · ⎟ ⎜ y[0] ⎟
    ⎜ x[1] ⎟ = ⎜ · · ·     ⋮         ⋮     · · · ⎟ ⎜ y[1] ⎟ .
    ⎜  ⋮   ⎟   ⎜ · · ·  g[N − 1]      0     · · · ⎟ ⎝  ⋮   ⎠
    ⎝      ⎠   ⎜ · · ·    g[N ]     g[0]    · · · ⎟
               ⎝ ⋱         ⋮         ⋮      ⎠

Here the matrix operator is denoted by G. Note that if h[n] = g[−n], then H = GT ,
a fact that will be important when analyzing orthonormal filter banks in Chapter 3.

2.6 Time-Frequency Representations


While the Fourier transform and its variations are very useful mathematical tools,
practical applications require basic modifications. These modifications aim at “lo-
calizing” the analysis, so that it is not necessary to have the signal over (−∞, ∞)
to perform the transform (as required with the Fourier integral) and so that local
effects (transients) can be captured with some accuracy. The classic example is the
short-time Fourier [204], or Gabor transform¹⁴ [102], which uses windowed complex
exponentials and their translates as expansion functions. We therefore discuss the
localization properties of basis functions and derive the uncertainty principle which
gives a lower bound on the joint time and frequency resolutions. We then review the
short-time Fourier transform and its associated energy distribution called the spec-
trogram and introduce the wavelet transform. Block transforms are also discussed.
Finally, an example of a bilinear expansion, namely the Wigner-Ville distribution,
is also discussed.

2.6.1 Frequency, Scale and Resolution


When calculating a signal expansion, a primary concern is the localization of a
given basis function in time and frequency. For example, in the Fourier transform,
the functions used in the analysis are infinitely sharp in their frequency localization
(they exist at one precise frequency) but have no time localization because of their
infinite extent.
There are various ways to define the localization of a particular basis function,
but they are all related to the “spread” of the function in time and frequency.

¹⁴ Gabor’s original paper proposed synthesis of signals using complex sinusoids windowed by a Gaussian, and is thus a synthesis rather than an analysis tool. However, it is closely related to the short-time Fourier transform, and we call Gabor transform a short-time Fourier transform using a Gaussian window.

[Figure 2.9: Tile in the time-frequency plane as an approximation of the time-frequency localization of f(t). Intervals It and Iω contain 90% of the energy of the time- and frequency-domain functions, respectively.]

For

[Figure 2.10: Elementary operations on a basis function f and their effect on the time-frequency tile. (a) Shift in time by τ, producing f′, and modulation by ω0, producing f″. (b) Scaling, f′(t) = f(at) (a = 1/3 is shown).]

example, one can define intervals It and Iω which contain 90% of the energy of
the time- and frequency-domain functions, respectively, and are centered around
the center of gravity of |f (t)|2 and |F (ω)|2 (see Figure 2.9). This defines what we
call a tile in the time-frequency domain, as shown in Figure 2.9. For simplicity, we
assumed a complex basis function. A real basis function would be represented by
two mirror tiles at positive and negative frequencies.
Consider now elementary operations on a basis function and their effects on the
tile. Obviously, a shift in time by τ results in shifting of the tile by τ . Similarly,
modulation by ejω0 t shifts the tile by ω0 in frequency (vertically). This is shown

in Figure 2.10(a). Finally, scaling by a, or f′(t) = f(at), results in I′t = (1/a)It
and I′ω = aIω, following the scaling property of the Fourier transform (2.4.5). That
is, both the shape and the localization of the tile have been affected, as shown in
is, both the shape and localization of the tile have been affected, as shown in
Figure 2.10(b). Note that all elementary operations conserve the surface of the
time-frequency tile. In the scaling case, resolution in frequency was traded for
resolution in time.
Since scaling is a fundamental operation used in the wavelet transform, we need
to define it properly. While frequency has a natural ordering, the notion of scale
is defined differently by different authors. The analysis functions for the wavelet
transform will be defined as
 
    ψa,b(t) = (1/√a) ψ((t − b)/a),   a ∈ R+,

where the function ψ(t) is usually a bandpass filter. Thus, large a’s (a ≫ 1)
correspond to long basis functions, and will identify long-term trends in the signal
to be analyzed. Small a’s (0 < a < 1) lead to short basis functions, which will follow
short-term behavior of the signal. This leads to the following: Scale is proportional
to the duration of the basis functions used in the signal expansion.
Because of this, and assuming that a basis function is a bandpass filter as in
wavelet analysis, high-frequency basis functions are obtained by going to small
scales, and therefore, scale is loosely related to inverse frequency. This is only
a qualitative statement, since scaling and modulation are fundamentally different
operations as was seen in Figure 2.10. The discussed scale is similar to those in
geographical maps, where large means a coarse, global view, and small corresponds
to a fine, detailed view.
Scale changes can be inverted if the function is continuous-time. In discrete
time, the situation is more complicated. From the discussion of multirate signal
processing in Section 2.5.3, we can see that upsampling (that is, a stretching of the
sequence) can be undone by downsampling by the same factor, and this with no
loss of information if done properly. Downsampling (or contraction of a sequence)
involves loss of information in general, since either a bandlimitation precedes the
downsampling, or aliasing occurs. This naturally leads to the notion of resolution of
a signal. We will thus say that the resolution of a finite-length signal is the minimum
number of samples required to represent it. It is thus related to the information
content of the signal. For infinite-length signals having finite energy and sufficient
decay, one can define the length as the essential support (for example, where 99%
of the energy is).
In continuous time, scaling does not change the resolution, since a scale change
affects both the sampling rate and the length of the signal, thus keeping the number
of samples constant. In discrete time, upsampling followed by interpolation does

[Figure 2.11 shows three block diagrams:
(a) x[n] -> halfband lowpass -> y[n]  (resolution: halved; scale: unchanged)
(b) x[n] -> up by 2 -> halfband lowpass -> y[n]  (resolution: unchanged; scale: halved)
(c) x[n] -> halfband lowpass -> down by 2 -> y[n]  (resolution: halved; scale: doubled)]

Figure 2.11 Scale and resolution in discrete-time sequences. (a) Lowpass
filtering reduces the resolution. (b) Upsampling and interpolation change the
scale but not the resolution. (c) Lowpass filtering and downsampling increases
the scale and reduces the resolution.

not affect the resolution, since the interpolated samples are redundant. Downsam-
pling by N decreases the resolution by N , and cannot be undone. Figure 2.11 shows
the interplay of scale and resolution on simple discrete-time examples. Note that
the notion of resolution is central to multiresolution analysis developed in Chap-
ters 3 and 4. There, the key idea is to split a signal into several lower-resolution
components, from which the original, full-resolution signal can be recovered.
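These relationships are easy to check experimentally. The following Python/numpy sketch (an illustration of ours; the truncated halfband filter, its length and the white-noise test signal are arbitrary choices) mirrors cases (b) and (c) of Figure 2.11: upsampling by two can be undone exactly, while lowpass filtering and downsampling by two discards roughly half the information of a white sequence.

    import numpy as np

    def halfband_lowpass(length=33):
        # Truncated ideal halfband lowpass: h[n] = sin(pi n/2)/(pi n), h[0] = 1/2.
        n = np.arange(length) - length // 2
        return np.where(n == 0, 0.5,
                        np.sin(np.pi * n / 2) / (np.pi * np.where(n == 0, 1, n)))

    rng = np.random.default_rng(0)
    x = rng.standard_normal(128)

    # Case (b): upsampling by 2 changes the scale, not the resolution;
    # downsampling by 2 undoes it exactly.
    up = np.zeros(2 * len(x))
    up[::2] = x
    print(np.allclose(up[::2], x))                   # True: no information lost

    # Case (c): lowpass filtering and downsampling by 2 halves the resolution.
    h = halfband_lowpass()
    y = np.convolve(x, h, mode="same")[::2]          # coarse, half-rate version
    up2 = np.zeros(len(x))
    up2[::2] = y
    x_interp = 2 * np.convolve(up2, h, mode="same")  # best-effort interpolation
    print(np.sum((x - x_interp) ** 2) / np.sum(x ** 2))  # ~0.5: half the energy is gone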

2.6.2 Uncertainty Principle


As indicated in the discussion of scaling in the previous section, sharpness of the
time analysis can be traded off for sharpness in frequency, and vice versa. But
there is no way to get arbitrarily sharp analysis in both domains simultaneously, as
shown below [37, 102, 215]. Note that the sharpness is also called resolution in time
and frequency (but is different from the resolution discussed just above, which was
related to information content). Consider a unit energy signal f (t) with Fourier
transform F (ω) centered around the origin in time as well as in frequency, that is,
satisfying ∫ t|f(t)|² dt = 0 and ∫ ω|F(ω)|² dω = 0 (this can always be obtained by
appropriate translation and modulation). Define the time width Δt of f(t) by

    Δt² = ∫_{−∞}^{∞} t² |f(t)|² dt,    (2.6.1)

and its frequency width Δω by

    Δω² = ∫_{−∞}^{∞} ω² |F(ω)|² dω.

THEOREM 2.7 Uncertainty Principle

If f(t) vanishes faster than 1/√t as t → ±∞, then

    Δt² Δω² ≥ π/2,    (2.6.2)

where equality holds only for Gaussian signals

    f(t) = (2α/π)^{1/4} e^{−αt²}.    (2.6.3)

PROOF
Consider the integral of t f(t) f'(t). Using the Cauchy-Schwarz inequality (2.2.2),

    | ∫_R t f(t) f'(t) dt |² ≤ ∫_R |t f(t)|² dt · ∫_R |f'(t)|² dt.    (2.6.4)

The first integral on the right side is equal to Δt². Because f'(t) has Fourier trans-
form jωF(ω), and using Parseval's formula, we find that the second integral is equal
to (1/(2π))Δω². Thus, the left side of (2.6.4) is bounded from above by
(1/(2π)) Δt² Δω². Using integration by parts, and noting that f(t)f'(t) = (1/2) ∂f²(t)/∂t,

    ∫_R t f(t) f'(t) dt = (1/2) ∫_R t (∂f²(t)/∂t) dt = (1/2) [t f²(t)]_{−∞}^{∞} − (1/2) ∫_R f²(t) dt.

By assumption, the limit of t f²(t) is zero at infinity, and, because the function is of unit
norm, the above equals −1/2. Replacing this into (2.6.4), we obtain

    1/4 ≤ (1/(2π)) Δt² Δω²,

or (2.6.2). To find a function that meets the lower bound, note that the Cauchy-Schwarz in-
equality is an equality when the two functions involved are equal within a multiplicative
factor, that is, from (2.6.4),

    f'(t) = k t f(t).

Thus, f(t) is of the form

    f(t) = c e^{kt²/2},    (2.6.5)

and (2.6.3) follows for k = −2α and c = (2α/π)^{1/4} (the value fixed by the unit-norm assumption).

The uncertainty principle is fundamental since it sets a bound on the maximum


joint sharpness or resolution in time and frequency of any linear transform. It is
easy to check that scaling does not change the time-bandwidth product; it only
exchanges one resolution for the other, similar to what was shown in Figure 2.10.
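As a numerical check, the Python/numpy sketch below (ours; the grid and the two test functions are arbitrary choices) evaluates Δt²Δω² directly from definition (2.6.1) and a Riemann-sum Fourier transform: the Gaussian attains the bound π/2, while a two-sided exponential exceeds it.

    import numpy as np

    t = np.linspace(-40, 40, 2 ** 14, endpoint=False)
    dt = t[1] - t[0]

    def time_bandwidth(f):
        f = f / np.sqrt(np.sum(f ** 2) * dt)          # enforce unit energy
        d2t = np.sum(t ** 2 * f ** 2) * dt            # Delta_t^2, as in (2.6.1)
        F = np.fft.fft(np.fft.ifftshift(f)) * dt      # Riemann-sum Fourier transform
        w = 2 * np.pi * np.fft.fftfreq(len(t), dt)
        dw = 2 * np.pi / (len(t) * dt)
        d2w = np.sum(w ** 2 * np.abs(F) ** 2) * dw    # Delta_omega^2
        return d2t * d2w

    print(time_bandwidth(np.exp(-0.7 * t ** 2)), np.pi / 2)  # Gaussian meets pi/2
    print(time_bandwidth(np.exp(-np.abs(t))))                # ~pi: above the bound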

Example 2.3 Prolate Spheroidal Wave Functions


A related problem is that of finding bandlimited functions which are maximally concentrated
around the origin in time (recall that there exist no functions that are both bandlimited
and of finite duration). That is, find a function f (t) of unit norm and bandlimited to ω0
(F (ω) = 0, |ω| > ω0 ) such that, for a given T ∈ (0, ∞)
    α = ∫_{−T}^{T} |f(t)|² dt

is maximized. It can be shown [216, 268] that the solution f(t) is the eigenfunction with
the largest eigenvalue satisfying

    ∫_{−T}^{T} (sin ω0(t − τ) / (π(t − τ))) f(τ) dτ = λ f(t).    (2.6.6)

An interpretation of the above formula is the following. If T → ∞, then we have the


usual convolution with an ideal lowpass filter, and thus, any bandlimited function is an
eigenfunction with eigenvalue 1. For finite T , because of the truncation, the eigenvalues will
be strictly smaller than one. Actually, it turns out that the eigenvalues belong to (0, 1) and
are all different, or
1 > λ0 > λ1 > · · · > λn → 0, n → ∞.

Call fn(t) the eigenfunction of (2.6.6) with eigenvalue λn. Then (i) each fn(t) is unique (up
to a scale factor), (ii) fn(t) and fm(t) are orthogonal for n ≠ m, and (iii) with proper
normalization the set {fn(t)} forms an orthonormal basis for functions bandlimited to (−ω0, ω0)
[216]. These functions are called prolate spheroidal wave functions. Note that while (2.6.6)
seems to depend on both T and ω0 , the solution depends only on the product T · ω0 .
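Equation (2.6.6) is easy to explore numerically; the Python/numpy sketch below (ours, with an arbitrary discretization) approximates the integral operator by a symmetric matrix and confirms that its eigenvalues lie in (0, 1), decrease toward zero, and depend only on the product T · ω0.

    import numpy as np

    def prolate_eigenvalues(T, w0, n=400):
        # Discretize the kernel sin(w0 (t - tau)) / (pi (t - tau)) on [-T, T].
        t = np.linspace(-T, T, n)
        d = t[:, None] - t[None, :]
        K = np.where(d == 0, w0 / np.pi,
                     np.sin(w0 * d) / (np.pi * np.where(d == 0, 1.0, d)))
        K *= t[1] - t[0]                       # quadrature weight of the Riemann sum
        return np.sort(np.linalg.eigvalsh((K + K.T) / 2))[::-1]

    print(prolate_eigenvalues(4.0, 1.0)[:6])   # in (0, 1), strictly decreasing
    print(prolate_eigenvalues(2.0, 2.0)[:6])   # same T*w0 gives the same eigenvalues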

2.6.3 Short-Time Fourier Transform


To achieve a "local" Fourier transform, one can define a windowed Fourier trans-
form. The signal is first multiplied by a window function w(t − τ) and then the usual
Fourier transform is taken. This results in a two-indexed transform, STFT_f(ω, τ),
given by

    STFT_f(ω, τ) = ∫_{−∞}^{∞} w*(t − τ) f(t) e^{−jωt} dt.

That is, one measures the similarity between the signal and shifts and modulates
of an elementary window, or

    STFT_f(ω, τ) = ⟨gω,τ(t), f(t)⟩,

where

    gω,τ(t) = w(t − τ) e^{jωt}.
Thus, each elementary function used in the expansion has the same time and fre-
quency resolution, simply a different location in the time-frequency plane. It is

Figure 2.12 The short-time Fourier and wavelet transforms. (a) Modulates
and shifts of a Gaussian window used in the expansion. (b) Tiling of the time-
frequency plane. (c) Shifts and scales of the prototype bandpass wavelet. (d)
Tiling of the time-frequency plane.

thus natural to discretize the STFT on a rectangular grid (mω0 , nτ0 ). If the win-
dow function is a lowpass filter with a cutoff frequency of ωb , or a bandwidth of
2ωb , then ω0 is chosen smaller than 2ωb and τ0 smaller than π/ωb in order to get an
adequate sampling. Typically, the STFT is actually oversampled. A more detailed
discussion of the sampling of the STFT is given in Section 5.2, where the inversion
formula is also given. A real-valued version of the STFT, using cosine modulation
and an appropriate window, leads to orthonormal bases, which are discussed in
Section 4.8.
Examples of STFT basis functions and the tiling of the time-frequency plane
are given in Figures 2.12(a) and (b). To achieve good time-frequency resolution, a
Gaussian window (see (2.6.5)) can be used, as originally proposed by Gabor [102].
Thus, the STFT is often called Gabor transform as well.
The spectrogram is the energy distribution associated with the STFT, that is,

    S(ω, τ) = |STFT(ω, τ)|².    (2.6.7)



Because the STFT can be thought of as a bank of filters with impulse responses
gω,τ(−t) = w(−t − τ) e^{−jωt}, the spectrogram is the magnitude squared of the filter
outputs.
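A direct implementation of the STFT and the spectrogram takes only a few lines; in the Python/numpy sketch below (ours; the Gaussian window, its length, the hop size and the linear-chirp test signal are arbitrary choices), one windowed DFT is computed per grid point (mω0, nτ0).

    import numpy as np

    def stft(x, window, hop):
        # One windowed DFT per time position: rows index omega, columns index tau.
        frames = [x[m:m + len(window)] * window
                  for m in range(0, len(x) - len(window) + 1, hop)]
        return np.fft.rfft(np.array(frames), axis=1).T

    n = np.arange(4096)
    x = np.cos(2 * np.pi * (0.05 + 0.1 * n / len(n)) * n)            # linear chirp
    L = 256
    window = np.exp(-0.5 * ((np.arange(L) - L / 2) / (L / 8)) ** 2)  # Gaussian window
    S = np.abs(stft(x, window, hop=64)) ** 2                         # spectrogram (2.6.7)
    print(S.shape)    # (frequency bins, time positions): one tile per grid point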

2.6.4 Wavelet Transform


Instead of shifts and modulates of a prototype function, one can choose shifts and
scales, and obtain a constant relative bandwidth analysis known as the wavelet
transform. To achieve this, take a real bandpass filter with impulse response ψ(t)
and zero mean

    ∫_{−∞}^{∞} ψ(t) dt = Ψ(0) = 0.

Then, define the continuous wavelet transform as

    CWT_f(a, b) = (1/√a) ∫_R ψ*((t − b)/a) f(t) dt,    (2.6.8)

where a ∈ R+ and b ∈ R. That is, we measure the similarity between the signal
f (t) and shifts and scales of an elementary function, since

    CWT_f(a, b) = ⟨ψa,b(t), f(t)⟩,

where

    ψa,b(t) = (1/√a) ψ((t − b)/a)

and the factor 1/√a is used to conserve the norm. Now, the functions used in
the expansion have changing time-frequency tiles because of the scaling. For small
a (a < 1), ψa,b (t) will be short and of high frequency, while for large a (a > 1),
ψa,b (t) will be long and of low frequency. Thus, a natural discretization will use
large time steps for large a, and conversely, choose fine time steps for small a. The
discretization of (a, b) is then of the form (a0^n, a0^n · τ0), and leads to functions for the
expansion as shown in Figure 2.12(c). The resulting tiling of the time-frequency
plane is shown in Figure 2.12(d) (the case a = 2 is shown). Special choices for
ψ(t) and the discretization lead to orthonormal bases or wavelet series as studied
in Chapter 4, while the overcomplete, continuous wavelet transform in (2.6.8) is
discussed in Section 5.1.
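For illustration, here is a direct (and deliberately naive) Python/numpy implementation of (2.6.8) on a sampled signal; the Mexican-hat wavelet, a standard example of a real bandpass ψ with zero mean, and the dyadic scale grid a = 2^n are our choices, not prescribed by the text.

    import numpy as np

    def mexican_hat(t):
        # Second derivative of a Gaussian: real, bandpass, zero mean.
        return (1.0 - t ** 2) * np.exp(-t ** 2 / 2)

    def cwt(x, scales, dt=1.0):
        rows = []
        for a in scales:
            t = np.arange(-8 * a, 8 * a + dt, dt)       # effective support of psi_a
            psi = mexican_hat(t / a) / np.sqrt(a)       # (1/sqrt(a)) psi(t/a)
            # Correlating x with shifted wavelets gives CWT_f(a, b) on the sample grid.
            rows.append(np.convolve(x, psi[::-1], mode="same") * dt)
        return np.array(rows)

    n = np.arange(1024)
    x = np.sin(2 * np.pi * n / 64) + (n > 512) * np.sin(2 * np.pi * n / 16)
    W = cwt(x, scales=2.0 ** np.arange(6))              # a = 2^0, ..., 2^5
    print(W.shape)    # one row per scale: small a tracks the high-frequency burst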

2.6.5 Block Transforms


An easy way to obtain a time-frequency representation is to slice the signal into
nonoverlapping adjacent blocks and expand each block independently. For example,
this can be done using a window function on the signal which is the indicator

function of the interval [nT, (n+1)T ), periodizing each windowed signal with period
T and applying an expansion such as the Fourier series on each periodized signal (see
Section 4.1.2). Of course, the arbitrary segmentation at points nT creates artificial
boundary problems. Yet, such transforms are used due to their simplicity. For
example, in discrete time, block transforms such as the Karhunen-Loève transform
(see Section 7.1.1) and its approximations are quite popular.

2.6.6 Wigner-Ville Distribution


An alternative to linear expansions of signals is given by bilinear expansions, of which the
Wigner-Ville distribution is the best known [53, 59, 135].
Bilinear or quadratic time-frequency representations are motivated by the idea
of an “instantaneous power spectrum”, of which the spectrogram (see (2.6.7)) is
a possible example. In addition, the time-frequency distribution T F Df (ω, τ ) of
a signal f (t) with Fourier transform F (ω) should satisfy the following marginal
properties: Its integral along τ given ω should equal |F (ω)|2 , and its integral along
ω given τ should equal |f (τ )|2 . Also, time-frequency shift invariance is desirable,
that is, if g(t) = f (t − τ0 )ejω0 t , then

    TFD_g(ω, τ) = TFD_f(ω − ω0, τ − τ0).

The Wigner-Ville distribution satisfies the above requirements, as well as several


other desirable ones [135]. It is defined, for a signal f(t), as

    WD_f(ω, τ) = ∫_{−∞}^{∞} f(τ + t/2) f*(τ − t/2) e^{−jωt} dt.    (2.6.9)

A related distribution is the ambiguity function [216], which is dual to (2.6.9)


through a two-dimensional Fourier transform.
The attractive feature of time-frequency distributions such as the Wigner-Ville
distribution above is the possible improved time-frequency resolution. For signals
with a single time-frequency component (such as a linear chirp signal), the Wigner-
Ville distribution gives a very clear and concentrated energy ridge in the time-
frequency plane.
However, the increased resolution for single component signals comes at a price
for multicomponent signals, with the appearance of cross terms or interferences. If
there are N components in the signal, there will be N signal terms and one cross
term for each pair of components, that is, N(N − 1)/2 cross terms. While
these interferences can be smoothed, this smoothing will come at the price of some
resolution loss. In any case, the interference patterns make it difficult to visually
interpret quadratic time-frequency distributions of complex signals.
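The interference mechanism can be made visible with a short computation; the Python/numpy sketch below (ours, using a crude discretization of (2.6.9) with lags clipped at the signal boundary) evaluates the distribution of two complex tones and finds, besides the two auto terms, a cross term midway between them in frequency.

    import numpy as np

    def wigner_ville(x):
        # For each time n, DFT over the lag m of x[n + m] conj(x[n - m]);
        # a crude discrete approximation of (2.6.9).
        N = len(x)
        W = np.zeros((N, N))
        for n in range(N):
            mmax = min(n, N - 1 - n)
            m = np.arange(-mmax, mmax + 1)
            r = np.zeros(N, dtype=complex)
            r[m % N] = x[n + m] * np.conj(x[n - m])
            W[n] = np.fft.fft(r).real       # real, since r is Hermitian in m
        return W

    n = np.arange(256)
    x = np.exp(2j * np.pi * n / 8) + np.exp(2j * np.pi * 3 * n / 8)   # two tones
    W = wigner_ville(x)
    # Frequencies appear at doubled bins: the tones at bins 64 and 192, plus an
    # oscillating cross term at bin 128, midway between the two components.
    print(np.sort(np.argsort(np.abs(W[100]))[-3:]))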

APPENDIX 2.A BOUNDED LINEAR OPERATORS ON HILBERT SPACES

DEFINITION 2.8
An operator A which maps one Hilbert space H1 into another Hilbert space
H2 (which may be the same) is called a linear operator if for all x, y in H1
and α in C

(a) A(x + y) = Ax + Ay.

(b) A(αx) = αAx.

The norm of A, denoted by ‖A‖, is given by

    ‖A‖ = sup_{‖x‖=1} ‖Ax‖.

A linear operator A : H1 → H2 is called bounded if

    sup_{‖x‖≤1} ‖Ax‖ < ∞.

An important property of bounded linear operators is that they are continuous,


that is, if xn → x then Axn → Ax. An example of a bounded operator is the
multiplication operator in l2 (Z), defined as

Ax[n] = m[n] x[n],

where m[n] ∈ l∞(Z). Because

    ‖Ax‖² = Σ_n (m[n])² (x[n])² ≤ max_n (m[n])² ‖x‖²,

the operator is bounded. A bounded linear operator A : H1 → H2 is called invertible


if there exists a bounded linear operator A−1 : H2 → H1 such that

A−1 Ax = x, for every x in H1 ,


AA−1 y = y, for every y in H2 .

The operator A^{−1} is called the inverse of A. An important result is the following:
Suppose A is a bounded linear operator mapping H onto itself, and ‖A‖ < 1. Then
I − A is invertible, and for every y in H,

    (I − A)^{−1} y = Σ_{k=0}^{∞} A^k y.    (2.A.1)
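In finite dimensions, (2.A.1) is the familiar Neumann series, which the following Python/numpy sketch (ours; the 5 × 5 matrix, scaled to operator norm 0.9, and the number of terms are arbitrary choices) verifies numerically.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 5))
    A *= 0.9 / np.linalg.norm(A, 2)             # make the operator norm ||A|| = 0.9 < 1
    y = rng.standard_normal(5)

    direct = np.linalg.solve(np.eye(5) - A, y)  # (I - A)^{-1} y
    series, term = np.zeros(5), y.copy()
    for _ in range(200):                        # partial sums of sum_k A^k y
        series += term
        term = A @ term
    print(np.max(np.abs(series - direct)))      # tiny: the series converges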

Note that although the above expansion has the same form for a scalar as well
as an operator, one should not forget the distinction between the two. Another
important notion is that of an adjoint operator.15 It can be shown that for every x
in H1 and y in H2, there exists a unique y* from H1, such that

    ⟨Ax, y⟩_{H2} = ⟨x, y*⟩_{H1} = ⟨x, A*y⟩_{H1}.    (2.A.2)

The operator A* : H2 → H1, defined by A*y = y*, is the adjoint of A. Note that A*
is also linear and bounded, and that ‖A‖ = ‖A*‖. If H2 = H1 and A = A*, then A
is called a self-adjoint or hermitian operator.
Finally, an important type of operators are projection operators. Given a closed
subspace S of a Hilbert space E, an operator P is called an orthogonal projection
onto S if
P (v + w) = v for all v ∈ S and w ∈ S ⊥ .
It can be shown that an operator is an orthogonal projection if and only if P 2 = P
and P is self-adjoint.
Let us now show how we can associate a possibly infinite matrix16 with a given
bounded linear operator on a Hilbert space. Given a bounded linear operator A
on a Hilbert space H with orthonormal basis {xi}, any x from H can be
written as x = Σ_i ⟨xi, x⟩ xi, and

    Ax = Σ_i ⟨xi, x⟩ Axi,    Axi = Σ_k ⟨xk, Axi⟩ xk.

Similarly, writing y = Σ_i ⟨xi, y⟩ xi, we can write Ax = y as

    ( ⟨x1, Ax1⟩  ⟨x1, Ax2⟩  ... ) ( ⟨x1, x⟩ )   ( ⟨x1, y⟩ )
    ( ⟨x2, Ax1⟩  ⟨x2, Ax2⟩  ... ) ( ⟨x2, x⟩ ) = ( ⟨x2, y⟩ ) ,
    (     :          :          ) (    :    )   (    :    )

or, in other words, the matrix {aij} corresponding to the operator A expressed
with respect to the basis {xi} is defined by aij = ⟨xi, Axj⟩.

APPENDIX 2.B PARAMETRIZATION OF UNITARY MATRICES


Our aim in this appendix is to show two ways of factoring real, n × n, unitary
matrices, namely using Givens rotations and Householder building blocks. We
concentrate here on real, square matrices, since these are the ones we will be using
in Chapter 3. The treatment here is fairly brisk; for a more detailed, yet succinct
account of these two factorizations, see [308].
15 In the case of matrices, the adjoint is the hermitian transpose.
16 To be consistent with our notation throughout the book, in this context, matrices will be
denoted by capital bold letters, while vectors will be denoted by lower-case bold letters.

[Figure 2.13 shows (a) a cascade of blocks U1, U2, . . . , Un interleaved with ±1 sign
factors, and (b) the internal structure of a block Ui.]

Figure 2.13 Unitary matrices. (a) Factorization of a real, unitary, n × n
matrix. (b) The structure of the block Ui.

2.B.1 Givens Rotations

Recall that a real, n × n, unitary matrix U satisfies (2.3.6). We want to show


that such a matrix can be factored as in Figure 2.13, where each cross in part (b)
represents a Givens (planar) rotation

 
    Gα = ( cos α   −sin α )
         ( sin α    cos α ) .    (2.B.1)

The way to demonstrate this is to show that any real, unitary n × n matrix Un can
be expressed as

    Un = R_{n−2} ··· R_0 ( U_{n−1}   0 )
                          (    0    ±1 ) ,    (2.B.2)

where U_{n−1} is an (n − 1) × (n − 1), real, unitary matrix, and Ri is of the following
form:

    Ri = ( 1 ... 0    0      0 ... 0    0       )
         ( :     :    :      :     :    :       )
         ( 0 ... 1    0      0 ... 0    0       )
         ( 0 ... 0  cos αi   0 ... 0  −sin αi   )
         ( 0 ... 0    0      1 ... 0    0       )
         ( :     :    :      :     :    :       )
         ( 0 ... 0    0      0 ... 1    0       )
         ( 0 ... 0  sin αi   0 ... 0  cos αi    ) ,
that is, we have a planar rotation in rows (i − 1) and n. By repeating the process
on the matrix U n−1 , we obtain the factorization as in Figure 2.13. The proof that
any real, unitary matrix can be written as in (2.B.2) can be found in [308]. Note
that the number of free variables (angles in Givens rotations) is n(n − 1)/2.
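The construction can be mimicked numerically. The Python/numpy sketch below (ours; the random 4 × 4 orthogonal test matrix is an arbitrary choice) peels Givens rotations off a real orthogonal matrix, reducing it to a diagonal of ±1, and confirms that n(n − 1)/2 rotation angles suffice.

    import numpy as np

    def givens(n, i, j, c, s):
        G = np.eye(n)
        G[i, i], G[j, j], G[i, j], G[j, i] = c, c, -s, s
        return G

    def factor_orthogonal(U):
        # Zero the subdiagonal column by column with planar rotations; for an
        # orthogonal input the result is diagonal with entries +-1.
        n = U.shape[0]
        A, rotations = U.copy(), []
        for j in range(n - 1):
            for i in range(n - 1, j, -1):
                r = np.hypot(A[i - 1, j], A[i, j])
                if r > 0:
                    G = givens(n, i - 1, i, A[i - 1, j] / r, A[i, j] / r)
                    A = G.T @ A           # G.T rotates rows i-1 and i, zeroing A[i, j]
                    rotations.append(G)
        return rotations, A

    rng = np.random.default_rng(2)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # a random orthogonal matrix
    rots, D = factor_orthogonal(Q)
    R = np.eye(4)
    for G in rots:                                     # Q = G_1 G_2 ... G_K D
        R = R @ G
    print(len(rots), np.allclose(R @ D, Q))            # 6 = n(n-1)/2 rotations, True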

2.B.2 Householder Building Blocks


A unitary matrix can be factored in terms of Householder building blocks, where
each block has the form I − 2uu^T, and u is a unit-norm vector. Thus, an n × n
unitary matrix U can be written as

    U = c H_1 ··· H_{n−1} · D,    (2.B.3)

where D is diagonal with d_{ii} = e^{jθi}, and H_i are Householder blocks I − 2u_i u_i^T.
We mention the Householder factorization here because we will
use its polynomial version to factor lossless matrices in Chapter 3.
Note that the Householder building block is unitary, and that the factorization
in (2.B.3) can be proved similarly to the factorization using Givens rotations. That
is, we can first show that
    (1/√c) H_1 U = ( e^{jα}   0  )
                   (   0     U_1 ) ,

where U 1 is an (n−1)×(n−1) unitary matrix. Repeating the process on U 1 , U 2 , . . . ,


we finally obtain

    (1/√c) H_{n−1} ··· H_1 U = D,

but since H_i = H_i^{−1}, we obtain (2.B.3).
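The same game can be played with Householder blocks; the Python/numpy sketch below (ours, restricted to the real case so that c = 1 and D has entries ±1, with an arbitrary random test matrix) reduces an orthogonal matrix column by column.

    import numpy as np

    def householder_factor(U):
        # Produce H_1, ..., H_{n-1} with H_{n-1} ... H_1 U = D diagonal (+-1).
        n = U.shape[0]
        A, blocks = U.copy(), []
        for k in range(n - 1):
            v = A[k:, k].copy()
            v[0] -= np.linalg.norm(v)           # reflect the column onto +e_1
            H = np.eye(n)
            if np.linalg.norm(v) > 1e-12:
                u = np.zeros(n)
                u[k:] = v / np.linalg.norm(v)   # unit-norm u
                H -= 2 * np.outer(u, u)         # Householder block I - 2 u u^T
            blocks.append(H)
            A = H @ A
        return blocks, A

    rng = np.random.default_rng(3)
    Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
    Hs, D = householder_factor(Q)
    R = np.eye(5)
    for H in Hs:                       # Q = H_1 ... H_{n-1} D, since H_i^{-1} = H_i
        R = R @ H
    print(np.allclose(R @ D, Q), np.allclose(np.abs(np.diag(D)), 1.0))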

APPENDIX 2.C CONVERGENCE AND REGULARITY OF FUNCTIONS

In Section 2.4.3, when discussing Fourier series, we pointed out possible convergence
problems such as the Gibbs phenomenon. In this appendix, we first review different
types of convergence and then discuss briefly some convergence properties of Fourier
series and transforms. Then, we discuss regularity of functions and the associated
decay of the Fourier series and transforms. More details on these topics can be
found for example in [46, 326].

2.C.1 Convergence

Pointwise Convergence Given an infinite sequence of functions {fn}_{n=1}^∞, we say
that it converges pointwise to a limit function f = lim_{n→∞} fn if for each value of t
we have

    lim_{n→∞} fn(t) = f(t).

This is a relatively weak form of convergence, since certain properties of fn (t), such
as continuity, are not passed on to the limit. Consider the truncated Fourier series,
that is (from (2.4.13))
    fn(t) = Σ_{k=−n}^{n} F[k] e^{jkω0t}.    (2.C.1)

This Fourier series converges pointwise for all t when F [k] are the Fourier coefficients
(see (2.4.14)) of a piecewise smooth17 function f (t). Note that while each fn (t) is
continuous, the limit need not be.

Uniform Convergence An infinite sequence of functions {fn}_{n=1}^∞ converges uni-
formly to a limit f(t) on a closed interval [a, b] if (i) the sequence converges pointwise
on [a, b] and (ii) given any ε > 0, there exists an integer N such that for n > N,
fn(t) satisfies |f(t) − fn(t)| < ε for all t in [a, b].
Uniform convergence is obviously stronger than pointwise convergence. For
example, uniform convergence of the truncated Fourier series (2.C.1) implies con-
tinuity of the limit, and conversely, continuous piecewise smooth functions have
uniformly convergent Fourier series [326]. An example of pointwise convergence
without uniform convergence is the Fourier series of piecewise smooth but discon-
tinuous functions and the associated Gibbs phenomenon around discontinuities.

17 A piecewise smooth function on an interval is piecewise continuous (finite number of disconti-
nuities) and its derivative is also piecewise continuous.
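The Gibbs phenomenon just mentioned is easy to reproduce; the Python/numpy sketch below (ours, with an arbitrary square-wave test function and grid) sums the truncated series (2.C.1) and shows that the overshoot near the jump persists while the error at any fixed continuity point vanishes.

    import numpy as np

    t = np.linspace(-np.pi, np.pi, 8192, endpoint=False)
    square = np.sign(np.sin(t))                 # piecewise smooth, discontinuous

    def partial_sum(n):
        # Fourier series of the square wave: (4/pi) sum over odd k of sin(kt)/k.
        fn = np.zeros_like(t)
        for k in range(1, n + 1, 2):
            fn += (4 / (np.pi * k)) * np.sin(k * t)
        return fn

    for n in (15, 63, 255, 1023):
        fn = partial_sum(n)
        print(n,
              np.max(fn),                        # stays near 1.179: Gibbs overshoot
              np.abs(square - fn)[len(t) // 4])  # error at t = -pi/2 tends to zero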

Mean Square Convergence An infinite sequence of functions {fn}_{n=1}^∞ converges
in the mean square sense to a limit f(t) if

    lim_{n→∞} ‖f − fn‖² = 0.

Note that this does not mean that limn→∞ fn = f for all t, but only almost ev-
erywhere. For example, the truncated Fourier series (2.C.1) of a piecewise smooth
function converges in the mean square sense to f (t) when F [k] are the Fourier se-
ries coefficients of f (t), even though at a point of discontinuity t0 , f (t0 ) might be
different from limn→∞ fn (t0 ) which equals the mean of the right and left limits.
In the case of the Fourier transform, the concept analogous to the truncated
Fourier series (2.C.1) is the truncated integral defined from the Fourier inversion
formula (2.4.2) as

    fc(t) = (1/2π) ∫_{−c}^{c} F(ω) e^{jωt} dω,

where F(ω) is the Fourier transform of f(t) (see (2.4.1)). The convergence of the
above integral as c → ∞ is an important question, since the limit limc→∞ fc (t)
might not equal f (t). Under suitable restrictions on f (t), equality will hold. As an
example, if f (t) is piecewise smooth and absolutely integrable, then limc→∞ fc (t0 ) =
f (t0 ) at each point of continuity and is equal to the mean of the left and right limits
at discontinuity points [326].

2.C.2 Regularity
So far, we have mostly discussed functions satisfying some integral conditions (abso-
lutely or square-integrable functions for example). Instead, regularity is concerned
with differentiability. The space of continuous functions is called C 0 , and similarly,
C n is the space of functions having n continuous derivatives.
A finer analysis is obtained using Lipschitz (or Hölder) exponents. A function
f is called Lipschitz of order α, 0 < α ≤ 1, if for any t and some small ε, we have

    |f(t) − f(t + ε)| ≤ c|ε|^α.    (2.C.2)

Higher orders r = n + α can be obtained by replacing f with its nth derivative.


This defines Hölder spaces of order r. Note that condition (2.C.2) for α = 1 is
weaker than differentiability. For example, the triangle function or linear spline
f(t) = 1 − |t| for t ∈ [−1, 1], and 0 otherwise, is Lipschitz of order 1 but only C^0.
How does regularity manifest itself in the Fourier domain? Since differentiation
amounts to a multiplication by (jω) in Fourier domain (see (2.4.6)), existence of
derivatives is related to sufficient decay of the Fourier spectrum.

It can be shown (see [216]) that if a function f (t) and all its derivatives up
to order n exist and are of bounded variation, then the Fourier transform can be
bounded by
bounded by

    |F(ω)| ≤ c / (1 + |ω|^{n+1}),    (2.C.3)
that is, it decays as O(1/|ω|n+1 ) for large ω. Conversely, if F (ω) has a decay as in
(2.C.3), then f (t) has n−1 continuous derivatives, and the nth derivative exists but
might be discontinuous. A finer analysis of regularity and associated localization in
Fourier domain can be found in [241], in particular for functions in Hölder spaces
and using different norms in Fourier domain.

PROBLEMS
2.1 Legendre polynomials: Consider the interval [−1, 1] and the vectors 1, t, t2 , t3 , . . .. Using
Gram-Schmidt orthogonalization, find an equivalent orthonormal set.

2.2 Prove Theorem 2.4, parts (a), (b), (d), (e), for finite-dimensional Hilbert spaces, R^n or C^n.

2.3 Orthogonal transforms and l∞ norm: Orthogonal transforms conserve the l2 norm, but not
others, in general. The l∞ norm of a vector is defined as (assume v ∈ R^n):

    l∞[v] = max_{i=0,...,n−1} |vi|.

(a) Consider n = 2 and the set of real orthogonal transforms T2 , that is, plane rotations.
Given the set of vectors v with unit l2 norm (that is, vectors on the unit circle), give
lower and upper bounds such that

a2 ≤ l∞ [T2 · v] ≤ b2 .

(b) Give the lower and upper bounds for the general case n > 2, that is, an and bn .

2.4 Norm of operators: Consider operators that map l2 (Z) to itself, and indicate their norm,
or bounds on their norm.

(a) (Ax)[n] = m[n] · x[n], m[n] = e^{jΘn}, n ∈ Z.


(b) (Ax)[2n] = x[2n] + x[2n + 1], (Ax)[2n + 1] = x[2n] − x[2n + 1], n ∈ Z.

2.5 Assume a finite-dimensional space R^N and an orthonormal basis {x1, x2, . . . , xN}. Any
vector y can thus be written as y = Σ_i αi xi where αi = ⟨xi, y⟩. Consider the best
approximation to y in the least-squares sense and living on the subspace spanned by the
first K vectors, {x1, x2, . . . , xK}, or ŷ = Σ_{i=1}^{K} βi xi. Prove that βi = αi for i = 1, . . . , K,
by showing that it minimizes ‖y − ŷ‖. Hint: Use Parseval's equality.

2.6 Least-squares solution: Show that for the least-squares solution obtained in Section 2.3.2,
the partial derivatives ∂(‖y − ŷ‖²)/∂x̂i are all zero.

2.7 Least-squares solution to a linear system of equations: The general solution was given in
Equation (2.3.4–2.3.5).

(a) Show that if y belongs to the column space of A, then ŷ = y.


(b) Show that if y is orthogonal to the column space of A, then ŷ = 0.

2.8 Parseval’s formulas can be proven by using orthogonality and biorthogonality relations of
the basis vectors.

(a) Show relations (2.2.5–2.2.6) using the orthogonality of the basis vectors.
(b) Show relations (2.2.11–2.2.13) using the biorthogonality of the basis vectors.

2.9 Consider the space of square-integrable real functions on the interval [−π, π], L2([−π, π]),
and the associated orthonormal basis given by

    { 1/√(2π), cos(nx)/√π, sin(nx)/√π, n = 1, 2, . . . }

Consider the following two subspaces: S – space of symmetric functions, that is, f (x) =
f (−x), on [−π, π], and A – space of antisymmetric functions, f (x) = −f (−x), on [−π, π].

(a) Show how any function f (x) from L2 ([−π, π]) can be written as f (x) = fs (x) + fa (x),
where fs (x) ∈ S and fa (x) ∈ A.
(b) Give orthonormal bases for S and A.
(c) Verify that L2 ([−π, π]) = S ⊕ A.

2.10 Downsampling by N : Prove (2.5.13) by going back to the underlying time-domain signal
and resampling it with an N -times longer sampling period. That is, consider x[n] and
y[n] = x[nN ] as two sampled versions of the same continuous-time signal, with sampling
periods T and N T , respectively. Hint: Recall that the discrete-time Fourier transform
X(e^{jω}) of x[n] is (see (2.4.36))

    X(e^{jω}) = X_T(ω/T) = (1/T) Σ_{k=−∞}^{∞} X_C(ω/T − k·2π/T),

where T is the sampling period. Then Y (ejω ) = XNT (ω/N T ) (since the sampling period
is now N T ), where XNT (ω/N T ) can be written similarly to the above equation. Finally,
split the sum involved in XNT (ω/N T ) into k = nN + l, and gathering terms, (2.5.13) will
follow.

2.11 Downsampling and aliasing: If an arbitrary discrete-time sequence x[n] is input to a filter
followed by downsampling by 2, we know that an ideal half-band lowpass filter (that is,
|H(ejω )| = 1, |ω| < π/2, and H(ejω ) = 0, π/2 ≤ |ω| ≤ π) will avoid aliasing.

(a) Show that H'(e^{jω}) = H(e^{j2ω}) will also avoid aliasing.

(b) Same for H''(e^{jω}) = H(e^{j(2ω−π)}).

(c) A two-channel system using H(e^{jω}) and H(e^{j(ω−π)}) followed by downsampling by
2 will keep all parts of the input spectrum untouched in either channel (except at
ω = π/2). Show that this is also true if H'(e^{jω}) and H''(e^{jω}) are used instead.

2.12 In pattern recognition, it is sometimes useful to expand a signal using the desired pattern,
or template, and its shifts, as basis functions. For simplicity, consider a signal of length N ,
x[n], n = 0, . . . , N − 1, and a pattern p[n], n = 0, . . . , N − 1. Then, choose as basis functions

ϕk [n] = p[(n − k) mod N ], k = 0, . . . , N − 1,

that is, circular shifts of p[n].

(a) Derive a simple condition on p[n], so that any x[n] can be written as a linear combi-
nation of {ϕk }.

(b) Assuming the previous condition is met, give the coefficients αk of the expansion

    x[n] = Σ_{k=0}^{N−1} αk ϕk[n].

2.13 Show that a linear, periodically time-varying system of period N can be implemented with
a polyphase transform followed by upsampling by N , N filter operations and a summation.

2.14 Interpolation of oversampled signals: Assume a function f (t) bandlimited to ωm = π. If


the sampling frequency is chosen at the Nyquist rate, ωs = 2π, the interpolation filter is
the usual sinc filter with slow decay (∼ 1/t). If f (t) is oversampled, for example, with
ωs = 3π, then filters with faster decay can be used for interpolating f (t) from its samples.
Such filters are obtained by convolving (in frequency) elementary rectangular filters (two
for H2 (ω), three for H3 (ω), while H1 (ω) would be the usual sinc filter).

(a) Give the expression for h2 (t), and verify that it decays as 1/t2 .
(b) Same for h3 (t), which decays as 1/t3 . Show that H3 (ω) has a continuous derivative.
(c) By generalizing the construction above of H2 (ω) and H3 (ω), show that one can obtain
hi (t) with decay 1/ti . Also, show that Hi (ω) has a continuous (i − 2)th derivative.
However, the filters involved become spread out in time, and the result is only inter-
esting asymptotically.

2.15 Uncertainty relation: Consider the uncertainty relation Δω² Δt² ≥ π/2.

(a) Show that scaling does not change Δω² · Δt². Either use scaling that conserves the L2
norm (f'(t) = √a f(at)) or be sure to renormalize Δω², Δt².
(b) Can you give the time-bandwidth product of a rectangular pulse, p(t) = 1, −1/2 ≤
t ≤ 1/2, and 0 otherwise?
(c) Same as above, but for a triangular pulse.
(d) What can you say about the time-bandwidth product as the time-domain function is
obtained from convolving more and more rectangular pulses with themselves?

2.16 Consider allpass filters where

    H(z) = Π_i (a_i* + z^{−1}) / (1 + a_i z^{−1}).

(a) Assume the filter has real coefficients. Show pole-zero locations, and that numerator
and denominator polynomials are mirrors of each other.
(b) Given h[n], the causal, real-coefficient impulse response of a stable allpass filter, give
its autocorrelation a[k] = Σ_n h[n] h[n − k]. Show that the set {h[n − k]}, k ∈ Z, is an
orthonormal basis for l2(Z). Hint: Use Theorem 2.4.
(c) Show that the set {h[n − 2k]} is an orthonormal set but not a basis for l2 (Z).

2.17 Parseval's relation for nonorthogonal bases: Consider the space V = R^n and a biorthogonal
basis, that is, two sets {αi} and {βi} such that

    ⟨αi, βj⟩ = δ[i − j],    i, j = 0, . . . , n − 1.

(a) Show that any vector v ∈ V can be written in the following two ways:

    v = Σ_{i=0}^{n−1} ⟨αi, v⟩ βi = Σ_{i=0}^{n−1} ⟨βi, v⟩ αi.

(b) Call vα the vector with entries ⟨αi, v⟩ and similarly vβ with entries ⟨βi, v⟩. Given v,
what can you say about ‖vα‖ and ‖vβ‖?

(c) Show that the generalization of Parseval's identity to biorthogonal systems is

    ‖v‖² = ⟨v, v⟩ = ⟨vα, vβ⟩

and

    ⟨v, g⟩ = ⟨vα, gβ⟩.

2.18 Circulant matrices: An N × N circulant matrix C is defined by its first line, since subse-
quent lines are obtained by a right circular shift. Denote the first line by {c0 , cN−1 , . . . , c1 }
so that C corresponds to a circular convolution with a filter having impulse response
{c0 , c1 , c2 , . . . , cN−1 }.

(a) Give a simple test for the singularity of C.


(b) Give a formula for det(C).
(c) Prove that C −1 is circulant.
(d) Show that C 1 C 2 = C 2 C 1 and that the result is circulant.

2.19 Walsh basis: To define the Walsh basis, we need the Kronecker product of matrices defined
in (2.3.2). Then, the matrix W_k, of size 2^k × 2^k, is

    W_k = ( 1   1 ) ⊗ W_{k−1},    W_0 = [1],    W_1 = ( 1   1 )
          ( 1  −1 )                                    ( 1  −1 ) .

(a) Give W_2, W_3 and W_4 (last one only partially).

(b) Show that W_k is orthonormal (within a scale factor you should indicate).

(c) Create a block matrix T

    T = ( W_0                                   )
        (      (1/√2) W_1                       )
        (                 (1/2) W_2             ) ,
        (                       (1/2^{3/2}) W_3 )
        (                                   ... )

and show that T is unitary. Sketch the upper left corner of T.


(d) Consider the rows of T as basis functions in an orthonormal expansion of l2 (Z + )
(right-sided sequences). Sketch the tiling of the time-frequency plane achieved by this
expansion.
3

Discrete-Time Bases and Filter Banks

“What is more beautiful than the Quincunx,


which, from whatever direction you look,
is correct?”
— Quintilian

Our focus in this chapter will be directed to series expansions of discrete-time
sequences. The reasons for expanding signals, discussed in Chapter 1, are linked
to signal analysis, approximation and compression, as well as algorithms and im-
plementations. Thus, given an arbitrary sequence x[n], we would like to write it
as

    x[n] = Σ_{k∈Z} ⟨ϕk, x⟩ ϕk[n],    n ∈ Z.

Therefore, we would like to construct orthonormal sets of basis functions, {ϕk [n]},
which are complete in the space of square-summable sequences, l2(Z). More general,
biorthogonal and overcomplete sets will be considered as well.
The discrete-time Fourier series, seen in Chapter 2, is an example of such an
orthogonal series expansion, but it has a number of shortcomings. Discrete-time
bases better suited for signal processing tasks will try to satisfy two conflicting
requirements, namely to achieve good frequency resolution while keeping good time
locality as well. Additionally, for both practical and computational reasons, the set
of basis functions has to be structured. Typically, the infinite set of basis functions
{ϕk } is obtained from a finite number of prototype sequences and their shifted
versions in time. This leads to discrete-time filter banks for the implementation of


such structured expansions. This filter bank point of view has been central to the
developments in the digital signal processing community, and to the design of good
basis functions or filters in particular. While the expansion is not time-invariant,
it will at least be periodically time-invariant. Also, the expansions will often have
a successive approximation property. This means that a reconstruction based on
an appropriate subset of the basis functions leads to a good approximation of the
signal, which is an important feature for applications such as signal compression.
Linear signal expansions have been used in digital signal processing since at
least the 1960’s, mainly as block transforms, such as piecewise Fourier series and
Karhunen-Loève transforms [143]. They have also been used as overcomplete ex-
pansions, such as the short-time Fourier transform (STFT) for signal analysis and
synthesis [8, 226] and in transmultiplexers [25]. Increased interest in the subject,
especially in orthogonal and biorthogonal bases, arose with work on compression,
where redundancy of the expansion such as in the STFT is avoided. In particular,
subband coding of speech [68, 69] spurred a detailed study of critically sampled
filter banks. The discovery of quadrature mirror filters (QMF) by Croisier, Esteban
and Galand in 1976 [69], which allows a signal to be split into two downsampled
subband signals and then reconstructed without aliasing (spectral foldbacks) even
though nonideal filters are used, was a key step forward.
Perfect reconstruction filter banks, that is, subband decompositions, where the
signal is a perfect replica of the input, followed soon. The first orthogonal solution
was discovered by Smith and Barnwell [270, 271] and Mintzer [196] for the two-
channel case. Fettweiss and coworkers [98] gave an orthogonal solution related
to wave digital filters [97]. Vaidyanathan, who established the relation between
these results and certain unitary operators (paraunitary matrices of polynomials)
studied in circuit theory [23], gave more general orthogonal solutions [305, 306]
as well as lattice factorizations for orthogonal filter banks [308, 310]. Biorthogonal
solutions were given by Vetterli [315], as well as multidimensional quadrature mirror
filters [314]. Biorthogonal filter banks, in particular with linear phase filters, were
investigated in [208, 321] and multidimensional filter banks were further studied in
[155, 163, 257, 264, 325]. Recent work includes filter banks with rational sampling
factors [166, 206] and filter banks with block sampling [158]. Additional work on
the design of filter banks has been done in [144, 205] among others.
In parallel to this work on filter banks, a generalization of block transforms
called lapped orthogonal transforms (LOT’s) was derived by Cassereau [43] and
Malvar [186, 188, 189]. An attractive feature of a subclass of LOT’s is the existence
of fast algorithms for their implementation since they are modulated filter banks
(similar to a "real" STFT). The connection of LOT's with filter banks was shown
in [321].

Another development, which happened independently of filter banks but turns


out to be closely related, is the pyramid decomposition of Burt and Adelson [41].
While it is oversampled (overcomplete), it clearly uses multiresolution concepts, by
decomposing a signal into a coarse approximation plus added details. This frame-
work is central to wavelet decompositions and establishes conceptually the link be-
tween filter banks and wavelets, as shown by Mallat [179, 180, 181] and Daubechies
[71, 73]. This connection has led to a renewed interest in filter banks, especially
with the work of Daubechies who first constructed wavelets from filter banks [71]
and Mallat who showed that a wavelet series expansion could be implemented with
filter banks [181]. Recent work on this topic includes [117, 240, 319].
As can be seen from the above short historical discussion, there are two different
points of view on the subject, namely, expansion of signals in terms of structured
bases, and perfect reconstruction filter banks. While the two are equivalent, the
former is more in tune with Fourier and wavelet theory, while the latter is central
to the construction of implementable systems. In what follows, we use both points
of view, using whichever is more appropriate to explain the material.
The outline of the chapter is as follows: First, we review discrete-time series
expansions, and consider two cases in some detail, namely the Haar and the sinc
bases. They are two extreme cases of two-channel filter banks. The general two-
channel filter bank is studied in detail in Section 3.2, where both the expansion and
the more traditional filter bank point of view are given. The orthogonal case with
finite-length basis functions or finite impulse response (FIR) filters is thoroughly
studied. The biorthogonal FIR case, in particular with linear phase filters (sym-
metric or antisymmetric basis functions), is considered, and the infinite impulse
response (IIR) filter case (which corresponds to basis functions with exponential
decay) is given as well.
In Section 3.3, the study of filter banks with more than two channels starts
with tree-structured filter banks. In particular, a constant relative bandwidth
(or constant-Q) tree is shown to compute a discrete-time wavelet series. Such a
transform has a multiresolution property that provides an important framework for
wavelet transforms. More general filter bank trees, also known as wavelet packets,
are presented as well.
Filter banks with N channels are treated next. The two particular cases of block
transforms and lapped orthogonal transforms are discussed first, leading to the
analysis of general N -channel filter banks. An important case, namely modulated
filter banks, is studied in detail, both because of its relation to short-time Fourier-
like expansions, and because of its computational efficiency.
Overcomplete discrete-time expansions are discussed in Section 3.5. The pyra-
mid decomposition is studied, as well as the classic overlap-add/save algorithm for
convolution computation which is a filter bank algorithm.

Multidimensional expansions and filter banks are derived in Section 3.6. Both
separable and nonseparable systems are considered. In the nonseparable case, the
focus is mostly on two-channel decompositions, while more general cases are indi-
cated as well.
Section 3.7 discusses a scheme that has received less attention in the filter bank
literature, but is nonetheless very important in applications, and is called a trans-
multiplexer. It is dual to the analysis/synthesis scheme used in compression appli-
cations, and is used in telecommunications.
The two appendices contain more details on orthogonal solutions and their fac-
torizations as well as on multidimensional sampling.
The material in this chapter covers filter banks at a level of detail which is
adequate for the remainder of the book. For a more exhaustive treatment of filter
banks, we refer the reader to the text by Vaidyanathan [308]. Discussions of fil-
ter banks and multiresolution signal processing are also contained in the book by
Akansu and Haddad [3].

3.1 SERIES EXPANSIONS OF DISCRETE-TIME SIGNALS


We start by recalling some general properties of discrete-time expansions. Then, we
discuss a very simple structured expansion called the Haar expansion, and give its
filter bank implementation. The dual of the Haar expansion — the sinc expansion —
is examined as well. These two examples are extreme cases of filter bank expansions
and set the stage for solutions that lie in between.
Discrete-time series expansions come in various flavors, which we briefly review
(see also Sections 2.2.3–2.2.5). As usual, x[n] is an arbitrary square-summable
sequence, or x[n] ∈ l2 (Z). First, orthonormal expansions of signals x[n] from l2 (Z)
are of the form
 
    x[n] = Σ_{k∈Z} ⟨ϕk[l], x[l]⟩ ϕk[n] = Σ_{k∈Z} X[k] ϕk[n],    (3.1.1)

where

    X[k] = ⟨ϕk[l], x[l]⟩ = Σ_l ϕk*[l] x[l],    (3.1.2)

is the transform of x[n]. The basis functions ϕk satisfy the orthonormality1 con-
straint

    ⟨ϕk[n], ϕl[n]⟩ = δ[k − l]

1 The first constraint is orthogonality between basis vectors. Then, normalization leads to
orthonormality. The terms "orthogonal" and "orthonormal" will often be used interchangeably,
unless we want to insist on the normalization and then use the latter.

and the set of basis functions is complete, so that every signal from l2 (Z) can
be expressed using (3.1.1). An important property of orthonormal expansions is
conservation of energy,
    ‖x‖² = ‖X‖².
Biorthogonal expansions, on the other hand, are given as

    x[n] = Σ_{k∈Z} ⟨ϕk[l], x[l]⟩ ϕ̃k[n] = Σ_{k∈Z} X̃[k] ϕ̃k[n]    (3.1.3)
         = Σ_{k∈Z} ⟨ϕ̃k[l], x[l]⟩ ϕk[n] = Σ_{k∈Z} X[k] ϕk[n],

where

    X̃[k] = ⟨ϕk[l], x[l]⟩    and    X[k] = ⟨ϕ̃k[l], x[l]⟩

are the transform coefficients of x[n] with respect to {ϕ̃k} and {ϕk}. The dual bases
{ϕk} and {ϕ̃k} satisfy the biorthogonality constraint

    ⟨ϕk[n], ϕ̃l[n]⟩ = δ[k − l].

Note that in this case, conservation of energy does not hold. For stability of the
expansion, the transform coefficients have to satisfy

    A Σ_k |X[k]|² ≤ ‖x‖² ≤ B Σ_k |X[k]|²

with a similar relation for the coefficients X̃[k]. In the biorthogonal case, conserva-
tion of energy can be expressed as

    ‖x‖² = ⟨X[k], X̃[k]⟩.

Finally, overcomplete expansions can be of the form (3.1.1) or (3.1.3), but with
redundant sets of functions, that is, the functions ϕk [n] used in the expansions are
not linearly independent.

3.1.1 Discrete-Time Fourier Series


The discrete-time Fourier transform (see also Section 2.4.6) is given by

    x[n] = (1/2π) ∫_{−π}^{π} X(ω) e^{jωn} dω,    (3.1.4)

    X(ω) = Σ_{n=−∞}^{∞} x[n] e^{−jωn}.    (3.1.5)

It is a series expansion of the 2π-periodic function X(ω) as given by (3.1.5), while


x[n] is written in terms of an integral of the continuous-time function X(ω). While
this is an important tool in the analysis of discrete-time signals and systems [211],
the fact that the synthesis of x[n] given by (3.1.4) involves integration rather than
series expansion, makes it of limited practical use. An example of a series expansion
is the discrete-time Fourier series
    x[n] = (1/N) Σ_{k=0}^{N−1} X[k] e^{j2πkn/N},    (3.1.6)

    X[k] = Σ_{n=0}^{N−1} x[n] e^{−j2πkn/N},

where x[n] is either periodic (n ∈ Z) or of finite length (n = 0, 1, . . . , N − 1). In


the latter case, the above is often called the discrete Fourier transform (DFT).
Because it only applies to such restricted types of signals, the Fourier series
is somewhat limited in its applications. Since the basis functions are complex
exponentials

    ϕk[n] = { (1/N) e^{j2πkn/N}   n = 0, 1, . . . , N − 1,
            { 0                   otherwise,

for the finite-length case (or the periodic extension in the periodic case), there is no
decay of the basis function over the length-N window, that is, no time localization
(note that ‖ϕk‖ = 1/√N in the above definition).
In order to expand arbitrary sequences we can segment the signal, and obtain a
piecewise Fourier series (one for each segment). Simply segment the sequence x[n]
into subsequences x(i) [n] such that

(i) x[n] n = i N + l, l = 0, 1, . . . , N − 1, i ∈ Z,
x [n] = (3.1.7)
0 otherwise,

and take the discrete Fourier transform of each subsequence independently,


    X^(i)[k] = Σ_{l=0}^{N−1} x^(i)[iN + l] e^{−j2πkl/N},    k = 0, 1, . . . , N − 1.    (3.1.8)

Reconstruction of x[n] from X (i) [k] is obvious. Recover x(i) [n] by inverting (3.1.8)
(see also (3.1.6)) and then get x[n] following (3.1.7) by juxtaposing the various
x(i) [n]. This leads to
    x[n] = Σ_{i=−∞}^{∞} Σ_{k=0}^{N−1} X^(i)[k] ϕk^(i)[n],

where

    ϕk^(i)[n] = { (1/N) e^{j2πkn/N}   n = iN + l, l = 0, 1, . . . , N − 1,
               { 0                   otherwise.

The ϕk^(i)[n] are simply the basis functions of the DFT shifted to the appropriate
interval [iN, . . . , (i + 1)N − 1].
The above expansion is called a block discrete-time Fourier series, since the
signal is divided into blocks of size N , which are then Fourier transformed. In
matrix notation, the overall expansion of the transform is given by a block diagonal
matrix, where each block is an N × N Fourier matrix F N ,
    (  :     )   ( ...            ) (  :     )
    ( X^(−1) )   (    F_N         ) ( x^(−1) )
    ( X^(0)  ) = (       F_N      ) ( x^(0)  ) ,
    ( X^(1)  )   (          F_N   ) ( x^(1)  )
    (  :     )   (            ... ) (  :     )

and X^(i), x^(i) are size-N vectors. Up to a scale factor of 1/√N (see (3.1.6)), this is
a unitary transform. This transform is not shift-invariant in general, that is, if x[n]
has transform X[k], then x[n − l] does not necessarily have the transform X[k − l].
However, it can be seen that

x[n − l N ] ←→ X[k − l N ]. (3.1.9)

That is, the transform is periodically time-varying with period N .2 Note that we
have achieved a certain time locality. Components of the signal that exist only in
an interval [iN . . . (i + 1)N − 1] will only influence transform coefficients in the same
interval. Finally, the basis functions in this block transform are naturally divided
into size-N subsets, with no overlaps between subsets, that is
    ⟨ϕk^(i)[n], ϕl^(m)[n]⟩ = 0,    i ≠ m,

simply because the supports of the basis functions are disjoint. This abrupt change
between intervals, and the fact that the interval length and position are arbitrary,
are the drawbacks of this block DTFS.
In this chapter, we will extend the idea of block transforms in order to address
these drawbacks, and this will be done using filter banks. But first, we turn our
attention to the simplest block transform case, when N = 2. This is followed by
the simplest filter bank case, when the filters are ideal sinc filters. The general case,
to which these are a prelude, lies between these extremes.
2 Another way to say this is that the "shift by N" and the size-N block transform operators
commute.

3.1.2 Haar Expansion of Discrete-Time Signals


The Haar basis, while very simple, should nonetheless highlight key features such as
periodic time variance and the relation with filter bank implementations. The basic
unit is a two-point average and difference operation. While this is a 2 × 2 unitary
transform that could be called a DFT just as well, we refer to it as the elementary
Haar basis because we will see that its suitable iteration will lead to both the
discrete-time Haar decomposition (in Section 3.3) as well as the continuous-time
Haar wavelet (in Chapter 4).
The basis functions in the Haar case are given by

    ϕ2k[n]   = { 1/√2    n = 2k, 2k + 1,
               { 0       otherwise,
                                                     (3.1.10)
    ϕ2k+1[n] = { 1/√2    n = 2k,
               { −1/√2   n = 2k + 1,
               { 0       otherwise.

It follows that the even-indexed basis functions are translates of each other, and so
are the odd-indexed ones, or

ϕ2k [n] = ϕ0 [n − 2k], ϕ2k+1 [n] = ϕ1 [n − 2k]. (3.1.11)

The transform is
    X[2k] = ⟨ϕ2k, x⟩ = (1/√2)(x[2k] + x[2k + 1]),    (3.1.12)

    X[2k + 1] = ⟨ϕ2k+1, x⟩ = (1/√2)(x[2k] − x[2k + 1]).    (3.1.13)
The reconstruction is obtained from

    x[n] = Σ_{k∈Z} X[k] ϕk[n],    (3.1.14)

as usual for an orthonormal basis. Let us prove that the set ϕk [n] given in (3.1.10)
is an orthonormal basis for l2 (Z). While the proof is straightforward in this simple
case, we indicate it for two reasons. First, it is easy to extend it to any block
transform, and second, the method of the proof can be used in more general cases
as well.

P ROPOSITION 3.1
The set of functions as given in (3.1.10) is an orthonormal basis for signals
from l2 (Z).

PROOF
To check that the set of basis functions {ϕk}_{k∈Z} indeed constitutes an orthonormal basis
for signals from l2(Z), we have to verify that:

(a) {ϕk}_{k∈Z} is an orthonormal family.

(b) {ϕk}_{k∈Z} is complete.

Consider (a). We want to show that ⟨ϕk, ϕl⟩ = δ[k − l]. Take k even, k = 2i. Then, for l
smaller than 2i or larger than 2i + 1, the inner product is automatically zero since the basis
functions do not overlap. For l = 2i, we have

    ⟨ϕ2i, ϕ2i⟩ = ϕ2i²[2i] + ϕ2i²[2i + 1] = 1/2 + 1/2 = 1.

For l = 2i + 1, we get

    ⟨ϕ2i, ϕ2i+1⟩ = ϕ2i[2i] · ϕ2i+1[2i] + ϕ2i[2i + 1] · ϕ2i+1[2i + 1] = 0.

A similar argument can be followed for odd l's, and thus, orthonormality is proven. Now
consider (b). We have to demonstrate that any signal belonging to l2(Z) can be expanded
using (3.1.14). This is equivalent to showing that there exists no x[n] with ‖x‖ > 0, such
that it has a zero expansion, that is, such that ⟨ϕk, x⟩ = 0, for all k. To prove this,
suppose it is not true, that is, suppose that there exists an x[n] with ‖x‖ > 0, such that
⟨ϕk, x⟩ = 0, for all k. Thus

    ⟨ϕk, x⟩ = 0 ⟺ ⟨ϕk, x⟩² = 0 ⟺ Σ_{k∈Z} |⟨ϕk[n], x[n]⟩|² = 0.    (3.1.15)

Since the last sum consists of strictly nonnegative terms, (3.1.15) is possible if and only if

    X[k] = ⟨ϕk[n], x[n]⟩ = 0,    for all k.

First, take k even, and consider X[2k] = 0. Because of (3.1.12), it means that x[2k] =
−x[2k + 1] for all k. Now take the odd k's, and look at X[2k + 1] = 0. From (3.1.13), it
follows that x[2k] = x[2k+1] for all k. Thus, the only solution to the above two requirements
is x[2k] = x[2k + 1] = 0, or a contradiction with our assumption. This shows that there is
no sequence x[n], ‖x‖ > 0 such that ‖X‖ = 0, and proves completeness.

Now, we would like to show how the expansion (3.1.12–3.1.14) can be implemented
using convolutions, thus leading to filter banks. Consider the filter h0 [n] with the
following impulse response:

    h0[n] = { 1/√2   n = −1, 0,    (3.1.16)
            { 0      otherwise.

Note that this is a noncausal filter. Then, X[2k] in (3.1.12) is the result of the
convolution of h0 [n] with x[n] at instant 2k since
    h0[n] ∗ x[n] |_{n=2k} = Σ_{l∈Z} h0[2k − l] x[l] = (1/√2) x[2k] + (1/√2) x[2k + 1] = X[2k].

[Figure 3.1(a): the input x is filtered by H1 and H0, each output downsampled by 2
to give the channel signals y1 and y0; for synthesis, y1 and y0 are upsampled by 2,
filtered by G1 and G0, and summed to form x̂. Figure 3.1(b): |H0(ω)| and |H1(ω)|
split [0, π] into a low band and a high band around π/2.]

Figure 3.1 Two-channel filter bank with analysis filters h0[n], h1[n] and synthe-
sis filters g0[n], g1[n]. If the filter bank implements an orthonormal transform,
then g0[n] = h0[−n] and g1[n] = h1[−n]. (a) Block diagram. (b) Spectrum
splitting performed by the filter bank.

Similarly, by defining the filter h1[n] with the impulse response

    h1[n] = { 1/√2    n = 0,
            { −1/√2   n = −1,    (3.1.17)
            { 0       otherwise,

we obtain that X[2k + 1] in (3.1.13) follows from

    h1[n] ∗ x[n] |_{n=2k} = Σ_{l∈Z} h1[2k − l] x[l] = (1/√2) x[2k] − (1/√2) x[2k + 1] = X[2k + 1].
We recall (from Section 2.5.3) that evaluating a convolution at even indexes corre-
sponds to a filter followed by downsampling by 2. Therefore, X[2k] and X[2k + 1]
can be obtained from a two-channel filter bank, with filters h0 [n] and h1 [n], followed
by downsampling by 2, as shown in the left half of Figure 3.1(a). This is called an
analysis filter bank. Often, we will specifically label the channel signals as y0 and
y1 , where
y0 [k] = X[2k], y1 [k] = X[2k + 1].

It is important to note that the impulse responses of the analysis filters are time-
reversed versions of the basis functions,

h0 [n] = ϕ0 [−n], h1 [n] = ϕ1 [−n],

since convolution is an inner product involving time reversal. Also, the filters we
defined in (3.1.16) and (3.1.17) are noncausal, which is to be expected since, for
example, the computation of X[2k] in (3.1.12) involves x[2k + 1], that is, a future
sample. To summarize this discussion, it is easiest to visualize the analysis in matrix
notation as
    (  :    )   (  :   )   ( ...                         ) (  :   )
    ( y0[0] )   ( X[0] )   (  h0[0] h0[−1]               ) ( x[0] )
    ( y1[0] ) = ( X[1] ) = (  h1[0] h1[−1]               ) ( x[1] )    (3.1.18)
    ( y0[1] )   ( X[2] )   (                h0[0] h0[−1] ) ( x[2] )
    ( y1[1] )   ( X[3] )   (                h1[0] h1[−1] ) ( x[3] )
    (  :    )   (  :   )   (                         ... ) (  :   )

(the successive rows of the matrix are ϕ0[n], ϕ1[n], ϕ2[n], ϕ3[n], . . .),
where we again see the shift property of the basis functions (see (3.1.11)). We can
verify the shift invariance of the analysis with respect to even shifts. If x [n] =
x[n − 2l], then

1 1
X  [2k] = √ (x [2k] + x [2k + 1]) = √ (x[2k − 2l] + x[2k + 1 − 2l])
2 2
= X[2k − 2l]

and similarly for X  [2k + 1] which equals X[2k + 1 − 2l], thus verifying (3.1.9).
This does not hold√ for odd shifts, however. For example, δ[n] √ has the transform
(δ[n] + δ[n − 1])/ 2 while δ[n − 1] leads to (δ[n] − δ[n − 1])/ 2.
What about the synthesis or reconstruction given by (3.1.14)? Define two filters
g0 and g1 with impulse responses equal to the basis functions ϕ0 and ϕ1

g0 [n] = ϕ0 [n], g1 [n] = ϕ1 [n]. (3.1.19)

Therefore
ϕ2k [n] = g0 [n − 2k], ϕ2k+1 [n] = g1 [n − 2k], (3.1.20)

following (3.1.11). Then (3.1.14) becomes, using (3.1.19) and (3.1.20),

    x[n] = Σ_{k∈Z} y0[k] ϕ2k[n] + Σ_{k∈Z} y1[k] ϕ2k+1[n]    (3.1.21)

         = Σ_{k∈Z} y0[k] g0[n − 2k] + Σ_{k∈Z} y1[k] g1[n − 2k].    (3.1.22)

That is, each sample from yi [k] adds a copy of the impulse response of gi [n] shifted by
2k. This can be implemented by an upsampling by 2 (inserting a zero between every
two samples of yi [k]) followed by a convolution with gi [n] (see also Section 2.5.3).
This is shown in the right side of Figure 3.1(a), and is called a synthesis filter bank.
What we have just explained is a way of implementing a structured orthogonal
expansion by means of filter banks. We summarize two characteristics of the filters
which will hold in general orthogonal cases as well.

(a) The impulse responses of the synthesis filters equal the first set of basis func-
tions
gi [n] = ϕi [n], i = 0, 1.

(b) The impulse responses of the analysis filters are the time-reversed versions of
the synthesis ones
hi [n] = gi [−n], i = 0, 1.

What about the signal processing properties of our decomposition? From (3.1.12)
and (3.1.13), we recall that one channel computes the average and the other the
difference of two successive samples. While these are not the ”best possible” low-
pass and highpass filters (they have, however, good time localization), they lead to
an important interpretation. The reconstruction from y0 [k] (that is, the first sum
in (3.1.21)) is the orthogonal projection of the input onto the subspace spanned by
ϕ2k [n], that is, an average or coarse version of x[n]. Calling it x0 , it equals
    x0[2k] = x0[2k + 1] = (1/2)(x[2k] + x[2k + 1]).
The other sum in (3.1.21), which is the reconstruction from y1 [k], is the orthogonal
projection onto the subspace spanned by ϕ2k+1 [n]. Denoting it by x1 , it is given by
    x1[2k] = (1/2)(x[2k] − x[2k + 1]),    x1[2k + 1] = −x1[2k].
This is the difference or added detail necessary to reconstruct x[n] from its coarse
version x0 [n]. The two subspaces spanned by {ϕ2k } and {ϕ2k+1 } are orthogonal
and the sum of the two projections recovers x[n] perfectly, since summing (x0 [2k] +
x1 [2k]) yields x[2k] and similarly (x0 [2k + 1] + x1 [2k + 1]) gives x[2k + 1].
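All of the above fits into a few lines of code; the Python/numpy sketch below (ours; the random even-length input is an arbitrary choice) implements the analysis/synthesis bank of Figure 3.1 for the Haar case and verifies perfect reconstruction, energy conservation, and the coarse/detail split x = x0 + x1.

    import numpy as np

    s2 = np.sqrt(2.0)
    rng = np.random.default_rng(4)
    x = rng.standard_normal(64)                  # even length, for simplicity

    # Analysis, (3.1.12)-(3.1.13): filtering by h0, h1 and downsampling by 2.
    y0 = (x[0::2] + x[1::2]) / s2                # y0[k] = X[2k]
    y1 = (x[0::2] - x[1::2]) / s2                # y1[k] = X[2k+1]

    # Synthesis, (3.1.21)-(3.1.22): upsampling by 2 and filtering by g0, g1.
    x0 = np.repeat(y0, 2) / s2                   # projection onto span{phi_2k}: average
    x1 = np.empty_like(x)
    x1[0::2], x1[1::2] = y1 / s2, -y1 / s2       # projection onto span{phi_2k+1}: detail

    print(np.allclose(x0 + x1, x))                                 # perfect reconstruction
    print(np.allclose(np.sum(x ** 2), np.sum(y0 ** 2) + np.sum(y1 ** 2)))  # energy conserved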

3.1.3 Sinc Expansion of Discrete-Time Signals


Although remarkably simple, the Haar basis suffers from an important drawback:
the frequency resolution of its basis functions (filters) is not very good. We
now look at a basis which uses ideal half-band lowpass and highpass filters. The
frequency selectivity is ideal (out-of-band signals are perfectly rejected), but the
time localization suffers (the filter impulse response is infinite, and decays only
proportionally to 1/n).
Let us start with an ideal half-band lowpass filter g0[n], defined by its 2π-
periodic discrete-time Fourier transform G0(e^{jω}) = √2 for ω ∈ [−π/2, π/2], and 0 for
ω ∈ [π/2, 3π/2]. The scale factor is chosen so that ‖G0‖² = 2π, or ‖g0‖ = 1, following
Parseval's relation for the DTFT. The inverse DTFT yields

    g0[n] = (1/2π) ∫_{−π/2}^{π/2} √2 e^{jωn} dω = (1/√2) · sin(πn/2)/(πn/2).    (3.1.23)

Note that g0[2n] = (1/√2) δ[n]. As the highpass filter, choose a modulated version
of g0[n], with a twist, namely a time reversal and a shift by one

    g1[n] = (−1)^n g0[−n + 1].    (3.1.24)

While the time reversal is only formal here (since g0 [n] is symmetric in n), the
shift by one is important for the completeness of the highpass and lowpass impulse
responses in the space of square-summable sequences.
Just as in the Haar case, the basis functions are obtained from the filter impulse
responses and their even shifts,

ϕ2k [n] = g0 [n − 2k], ϕ2k+1 [n] = g1 [n − 2k], (3.1.25)

and the coefficients of the expansion ⟨ϕ2k, x⟩ and ⟨ϕ2k+1, x⟩ are obtained by filtering
with h0[n] and h1[n] followed by downsampling by 2, with hi[n] = gi[−n].

PROPOSITION 3.2
The set of functions as given in (3.1.25) is an orthonormal basis for signals
from l2 (Z).

PROOF
To prove that the set of functions ϕk[n] is indeed an orthonormal basis, again we would
have to demonstrate orthonormality of the set as well as completeness. Let us demonstrate
orthonormality of the basis functions. We will do that only for

⟨ϕ2k[n], ϕ2l[n]⟩ = δ[k − l],   (3.1.26)

and leave the other two cases,

⟨ϕ2k[n], ϕ2l+1[n]⟩ = 0,   (3.1.27)

⟨ϕ2k+1[n], ϕ2l+1[n]⟩ = δ[k − l],   (3.1.28)

as an exercise (Problem 3.1). First, because ϕ2k[n] = ϕ0[n − 2k], it suffices to show (3.1.26)
for k = 0, or equivalently, to prove that

⟨g0[n], g0[n − 2l]⟩ = δ[l].

From (2.5.19) this is equivalent to showing

|G0(e^{jω})|^2 + |G0(e^{j(ω+π)})|^2 = 2,

which holds true since G0(e^{jω}) = √2 between −π/2 and π/2. The proof of the other
orthogonality relations is similar.
The proof of completeness, which can be made along the lines of the proof in Propo-
sition 3.1, is left to the reader (see Problem 3.1).

As we said, the filters in this case have perfect frequency resolution. However,
the decay of the filters in time is rather poor, being of the order of 1/n. The
multiresolution interpretation we gave for the Haar case holds here as well. The
perfect lowpass filter h0 , followed by downsampling, upsampling and interpolation
by g0 , leads to a projection of the signal onto the subspace of sequences bandlimited
to [−π/2, π/2], given by x0 . Similarly, the other path in Figure 3.1 leads to a
projection onto the subspace of half-band highpass signals given by x1 . The two
subspaces are orthogonal and their sum is l2 (Z). It is also clear that x0 is a coarse,
lowpass approximation to x, while x1 contains the additional frequencies necessary
to reconstruct x from x0 .
An example describing the decomposition of a signal into downsampled lowpass
and highpass components, with subsequent reconstruction using upsampling and
interpolation, is shown in Figure 3.2. Ideal half-band filters are assumed. The
reader is encouraged to verify this spectral decomposition using the downsampling
and upsampling formulas (see (2.5.13) and (2.5.17)) from Section 2.5.3.
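As a quick numerical complement, here is a minimal sketch (Python with NumPy; the truncation length and the range of shifts tested are our choices) checking that the sinc basis functions built from (3.1.23) are orthonormal under even shifts. Because the impulse response decays only as 1/n, a long truncation is needed for good accuracy.

    import numpy as np

    def g0_sinc(n):
        # ideal half-band lowpass of (3.1.23); np.sinc(t) = sin(pi t)/(pi t)
        return np.sinc(n / 2.0) / np.sqrt(2.0)

    n = np.arange(-4000, 4001)           # long truncation: the tail decays only as 1/n
    g0 = g0_sinc(n)
    for l in range(4):                   # inner products <g0[n], g0[n - 2l]>
        ip = g0[2 * l:] @ g0[:g0.size - 2 * l]
        print(l, round(float(ip), 4))    # approx. 1 for l = 0 and 0 otherwise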

3.1.4 Discussion
In both the Haar and sinc cases above, we noticed that the expansion was not
time-invariant, but periodically time-varying. We show below that time invariance
in orthonormal expansions leads only to trivial solutions, and thus, any meaningful
orthonormal expansion of l2 (Z) will be time-varying.
PROPOSITION 3.3
An orthonormal time-invariant signal decomposition will have no frequency
resolution.

Figure 3.2 Two-channel decomposition of a signal using ideal filters. Left side
depicts the process in the lowpass channel, while the right side depicts the
process in the highpass channel. (a) Original spectrum. (b) Spectrums after
filtering. (c) Spectrums after downsampling. (d) Spectrums after upsampling.
(e) Spectrums after interpolation filtering. (f) Reconstructed spectrum.

PROOF
An expansion is time-invariant if x[n] ←→ X[k] implies x[n − m] ←→ X[k − m] for all x[n] in
l^2(Z). Thus, we have that

⟨ϕk[n], x[n − m]⟩ = ⟨ϕk−m[n], x[n]⟩.

By a change of variable, the left side is equal to ⟨ϕk[n + m], x[n]⟩, and then using k′ = k − m,
we find that

ϕk′+m[n + m] = ϕk′[n],   (3.1.29)

that is, the expansion operator is Toeplitz. Now, we want the expansion to be orthonormal,
that is, using (3.1.29),

⟨ϕk[n], ϕk+m[n]⟩ = ⟨ϕk[n], ϕk[n − m]⟩ = δ[m],

or the autocorrelation of ϕk[n] is a Dirac function. In the Fourier domain, this leads to

|Φ(e^{jω})|^2 = 1,

showing that the basis functions have no frequency selectivity, since they are allpass func-
tions.

Table 3.1 Basis functions (synthesis filters) in Haar and sinc cases.

                  Haar                         Sinc
    g0[n]         (δ[n] + δ[n−1])/√2           (1/√2) · sin(πn/2)/(πn/2)
    g1[n]         (δ[n] − δ[n−1])/√2           (−1)^n g0[−n+1]
    G0(e^{jω})    √2 e^{−jω/2} cos(ω/2)        √2 for ω ∈ [−π/2, π/2], 0 otherwise
    G1(e^{jω})    √2 j e^{−jω/2} sin(ω/2)      −e^{−jω} G0(−e^{−jω})

Therefore, time variance is an inherent feature of orthonormal expansions. Note


that Proposition 3.3 does not hold if the orthogonality constraint is removed (see
Problem 3.3). Another consequence of Proposition 3.3 is that there are no banded3
orthonormal Toeplitz matrices, since an allpass filter has necessarily infinite impulse
response. However, in (3.1.18), we saw a banded block Toeplitz matrix (actually,
block diagonal) that was orthonormal. The construction of orthonormal FIR filter
banks is the study of such banded block Toeplitz matrices.
We have seen two extreme cases of structured series expansions of sequences,
based on Haar and sinc filters respectively (Table 3.1 gives basis functions for both
of these cases). More interesting cases exist between these extremes and they will be
implemented with filter banks as shown in Figure 3.1(a). Thus, we did not consider
arbitrary expansions of l2 (Z), but rather a structured subclass. These expansions
will have the multiresolution characteristic already built in, which will be shown
to be a framework for a large body of work on filter banks that appeared in the
literature of the last decade.

3.2 TWO-CHANNEL FILTER BANKS


We saw in the last section how Haar and sinc expansions of discrete-time signals
could be implemented using a two-channel filter bank (see Figure 3.1(a)). The aim
in this section is to examine two-channel filter banks in more detail. The main idea
is that perfect reconstruction filter banks implement series expansions of discrete-
time signals as in the Haar and sinc cases. Recall that in both of these cases, the
expansion is orthonormal and the basis functions are actually the impulse responses
of the synthesis filters and their even shifts. In addition to the orthonormal case,
we will consider biorthogonal (or general) expansions (filter banks) as well.
The present section serves as a core for the remainder of the chapter; all impor-
tant notions and concepts will be introduced here. For the sake of simplicity, we
concentrate on the two-channel case. More general solutions are given later in the
chapter. We start with tools for analyzing general filter banks. Then, we examine
orthonormal and linear phase two-channel filter banks in more detail. We then
present results valid for general two-channel filter banks and examine some special
cases, such as IIR solutions.

3 A banded Toeplitz matrix has a finite number of nonzero diagonals.

3.2.1 Analysis of Filter Banks


Consider Figure 3.1(a). We saw in the Haar and sinc cases, that such a two-channel
filter bank implements an orthonormal series expansion of discrete-time signals
with synthesis filters being the time-reversed version of the analysis filters, that is
gi [n] = hi [−n]. Here, we relax the assumption of orthonormality and consider a
general filter bank, with analysis filters h0 [n], h1 [n] and synthesis filters g0 [n], g1 [n].
Our only requirement will be that such a filter bank implements an expansion of
discrete-time signals (not necessarily orthonormal). Such an expansion will be
termed biorthogonal. In the filter bank literature, such a system is called a perfect
reconstruction filter bank.
Looking at Figure 3.1, besides filtering, the key elements in the filter bank
computation of an expansion are downsamplers and upsamplers. These perform
the sampling rate changes and the downsampler creates a periodically time-varying
linear system. As discussed in Section 2.5.3, special analysis techniques are needed
for such systems. We will present three ways to look at periodically time-varying
systems, namely in time, modulation, and polyphase domains. The first approach
was already used in our discussion of the Haar case. The two other approaches
are based on the Fourier or z-transform and aim at decomposing the periodically
time-varying system into several time-invariant subsystems.

Time-Domain Analysis Recall that in the Haar case (see (3.1.18)), in order to vi-
sualize block time invariance, we expressed the transform coefficients via an infinite
matrix, that is
( ...   )   ( ...  )         ( ...  )
( y0[0] )   ( X[0] )         ( x[0] )
( y1[0] ) = ( X[1] ) = T_a · ( x[1] ).   (3.2.1)
( y0[1] )   ( X[2] )         ( x[2] )
( y1[1] )   ( X[3] )         ( x[3] )
( ...   )   ( ...  )         ( ...  )
    y          X                x
Here, the transform coefficients X[k] are expressed in another form as well. In
the filter bank literature, it is more common to write X[k] as outputs of the two
branches in Figure 3.1(a), that is, as two subband outputs denoted by y0 [k] = X[2k],
and y1 [k] = X[2k + 1]. Also, in (3.2.1), T a · x represents the inner products, where
T a is the analysis matrix and can be expressed as
      ( ...                                                             )
      ( ... h0[L−1]  h0[L−2]  h0[L−3]  ...  h0[0]     0       0     ... )
T_a = ( ... h1[L−1]  h1[L−2]  h1[L−3]  ...  h1[0]     0       0     ... )
      ( ...   0        0      h0[L−1]  ...  h0[2]   h0[1]   h0[0]   ... )
      ( ...   0        0      h1[L−1]  ...  h1[2]   h1[1]   h1[0]   ... )
      ( ...                                                             )

where we assume that the analysis filters hi [n] are finite impulse response (FIR)
filters of length L = 2K. To make the block Toeplitz structure of T a more explicit,
we can write
      ( ...                                          )
T_a = ( ...  A_0  A_1  ...  A_{K−1}    0       ...   )   (3.2.2)
      ( ...   0   A_0  ...  A_{K−2}  A_{K−1}   ...   )
      ( ...                                          )

The block A_i is given by

A_i = ( h0[2K − 1 − 2i]  h0[2K − 2 − 2i] )
      ( h1[2K − 1 − 2i]  h1[2K − 2 − 2i] ).   (3.2.3)

The transform coefficient

X[k] = ⟨ϕk[n], x[n]⟩,

equals (in the case k = 2k′)

y0[k′] = ⟨h0[2k′ − n], x[n]⟩,

and (in the case k = 2k′ + 1)

y1[k′] = ⟨h1[2k′ − n], x[n]⟩.

The analysis basis functions are thus

ϕ2k[n] = h0[2k − n],   (3.2.4)

ϕ2k+1[n] = h1[2k − n].   (3.2.5)

To resynthesize the signal, we use the dual-basis (synthesis) matrix T_s:

x = T_s y = T_s X = T_s T_a x.   (3.2.6)

Similarly to T_a, T_s can be expressed as

        ( ...                                                                 )
        ( ... g0[0]  g0[1]  g0[2]  ...  g0[L′−1]      0         0         ... )
T_s^T = ( ... g1[0]  g1[1]  g1[2]  ...  g1[L′−1]      0         0         ... )
        ( ...   0      0    g0[0]  ...  g0[L′−3]  g0[L′−2]  g0[L′−1]      ... )
        ( ...   0      0    g1[0]  ...  g1[L′−3]  g1[L′−2]  g1[L′−1]      ... )
        ( ...                                                                 )

        ( ...                                                )
      = ( ...  S_0^T  S_1^T  ...  S_{K′−1}^T      0      ... )   (3.2.7)
        ( ...    0    S_0^T  ...  S_{K′−2}^T  S_{K′−1}^T ... )
        ( ...                                                )

where the block S_i is of size 2 × 2 and the FIR filters are of length L′ = 2K′. The block
S_i is

S_i = ( g0[2i]      g1[2i]   )
      ( g0[2i+1]    g1[2i+1] ),
where g0 [n] and g1 [n] are the synthesis filters. The dual synthesis basis functions
are

ϕ̃2k [n] = g0 [n − 2k],


ϕ̃2k+1 [n] = g1 [n − 2k].

Let us go back for a moment to (3.2.6). The requirement that {h0[2k − n], h1[2k − n]}
and {g0[n − 2k], g1[n − 2k]} form a dual basis pair is equivalent to

T_s T_a = T_a T_s = I.   (3.2.8)

This is the biorthogonality condition or, in the filter bank literature, the perfect
reconstruction condition. In other words,

⟨ϕk[n], ϕ̃l[n]⟩ = δ[k − l],

or in terms of filter impulse responses

⟨hi[2k − n], gj[n − 2l]⟩ = δ[k − l] δ[i − j],   i, j = 0, 1.

Consider the two branches in Figure 3.1(a) which produce y0 and y1 . Call H i the
operator corresponding to filtering by hi [n] followed by downsampling by 2. Then
the output y_i can be written as (L denotes the filter length)

( ...   )   ( ...                                 ) ( ...  )
( yi[0] ) = ( ...  hi[L−1]  hi[L−2]  hi[L−3]  ... ) ( x[0] ),   (3.2.9)
( yi[1] )   ( ...    0        0      hi[L−1]  ... ) ( x[1] )
( ...   )   ( ...                                 ) ( ...  )
   y_i                      H_i                        x

or, in operator notation,

y_i = H_i x.
Defining G_i^T similarly to H_i but with gi[n] in reverse order (see also the definition
of T_s), the output of the system can now be written as

(G_0 H_0 + G_1 H_1) x.

Thus, to resynthesize the signal (the condition for perfect reconstruction), we have
that
G0 H 0 + G1 H 1 = I.
Of course, by interleaving the rows of H 0 and H 1 , we get T a , and similarly, T s
corresponds to interleaving the columns of G0 and G1 .
To summarize this part on time-domain analysis, let us stress once more that
biorthogonal expansions of discrete-time signals, where the basis functions are ob-
tained from two prototype functions and their even shifts (for both dual bases), are
implemented using a perfect reconstruction, two-channel multirate filter bank. In
other words, perfect reconstruction is equivalent to the biorthogonality condition
(3.2.8).
Completeness is also automatically satisfied. To prove it, we show that there
exists no x[n] with ∥x∥ > 0 such that it has a zero expansion, that is, such that
∥X∥ = 0. Suppose this is not true, that is, suppose that there exists an x[n] with
∥x∥ > 0 such that ∥X∥ = 0. But, since X = T_a x, we have that

∥T_a x∥ = 0,

and this is possible if and only if

T_a x = 0   (3.2.10)

(since in a Hilbert space — l^2(Z) in this case — ∥v∥^2 = ⟨v, v⟩ = 0 if and only
if v ≡ 0). We know that (3.2.10) has a nontrivial solution if and only if T_a is
singular. However, due to (3.2.8), T_a is nonsingular and thus (3.2.10) has only the
trivial solution x ≡ 0, violating our assumption and proving completeness.

Modulation-Domain Analysis This approach is based on Fourier or, more gener-
ally, z-transforms. Recall from Section 2.5.3 that downsampling a signal with the
z-transform X(z) by 2 leads to X′(z) given by

X′(z) = (1/2) [X(z^{1/2}) + X(−z^{1/2})].   (3.2.11)

Then, upsampling X′(z) by 2 yields X″(z) = X′(z^2), or

X″(z) = (1/2) [X(z) + X(−z)].   (3.2.12)

To verify (3.2.12) directly, notice that downsampling followed by upsampling by 2
simply nulls out the odd-indexed coefficients, that is, x″[2n] = x[2n] and x″[2n+1] =
0. Then, note that X(−z) is the z-transform of (−1)^n x[n] by the modulation
property, and therefore, (3.2.12) follows.
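A two-line numerical check of this (a sketch in Python with NumPy; the signal and its length are arbitrary): downsampling followed by upsampling by 2 keeps the even-indexed samples and nulls the odd ones, exactly as the average (X(z) + X(−z))/2 does in the time domain.

    import numpy as np

    x = np.random.randn(16)
    du = x.copy()
    du[1::2] = 0.0                                   # downsample by 2, then upsample by 2
    alt = 0.5 * (x + (-1.0) ** np.arange(16) * x)    # time-domain form of (3.2.12)
    print(np.allclose(du, alt))                      # True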
With this preamble, the z-transform analysis of the filter bank in Figure 3.1(a)
becomes easy. Consider the lower branch. The filtered signal, which has the z-
transform H0(z) · X(z), goes through downsampling and upsampling, yielding (ac-
cording to (3.2.12))

(1/2) [H0(z) X(z) + H0(−z) X(−z)].

This signal is filtered with G0(z), leading to X0(z) given by

X0(z) = (1/2) G0(z) [H0(z) X(z) + H0(−z) X(−z)].   (3.2.13)

The upper branch contributes X1(z), which equals (3.2.13) up to the change of
index 0 → 1, and the output of the analysis/synthesis filter bank is the sum of the
two components X0(z) and X1(z). This is best written in matrix notation as

X̂(z) = X0(z) + X1(z)   (3.2.14)

      = (1/2) ( G0(z)  G1(z) ) ( H0(z)   H0(−z) ) ( X(z)  )
                               ( H1(z)   H1(−z) ) ( X(−z) ).
                                     H_m(z)         x_m(z)

In the above, H m (z) is the analysis modulation matrix containing the modulated
versions of the analysis filters and xm (z) contains the modulated versions of X(z).
Relation (3.2.14) is illustrated in Figure 3.3, where the time-varying part is in
the lower channel. If the channel signals Y0 (z) and Y1 (z) are desired, that is, the
downsampled domain signals, it follows from (3.2.11) and (3.2.14) that
    
( Y0(z) )         ( H0(z^{1/2})  H0(−z^{1/2}) ) ( X(z^{1/2})  )
( Y1(z) ) = (1/2) ( H1(z^{1/2})  H1(−z^{1/2}) ) ( X(−z^{1/2}) ),

Figure 3.3 Modulation-domain analysis of the two-channel filter bank. The
2 × 2 matrix H_m(z) contains the z-transform of the filters and their modulated
versions.

or, calling y(z) the vector [Y0(z) Y1(z)]^T,

y(z) = (1/2) H_m(z^{1/2}) x_m(z^{1/2}).
For the system to represent a valid expansion, (3.2.14) has to yield X̂(z) = X(z),
which can be obtained when

G0 (z) H0 (z) + G1 (z) H1 (z) = 2, (3.2.15)


G0 (z) H0 (−z) + G1 (z) H1 (−z) = 0. (3.2.16)

The above two conditions then ensure perfect reconstruction. Expressing (3.2.15)
and (3.2.16) in matrix notation, we get

( G0 (z) G1 (z) ) · H m (z) = ( 2 0 ) . (3.2.17)

We can solve now for G0(z) and G1(z) (transpose (3.2.17) and multiply by (H_m^T(z))^{−1}
from the left)

( G0(z) )                        (  H1(−z) )
( G1(z) ) = (2/det(H_m(z))) ·    ( −H0(−z) ).   (3.2.18)

In the above, we assumed that H_m(z) is nonsingular; that is, its normal rank is
equal to 2. Define P(z) as

P(z) = G0(z) H0(z) = (2/det(H_m(z))) H0(z) H1(−z),   (3.2.19)

where we used (3.2.18). Observe that det(H_m(z)) = −det(H_m(−z)). Then, we
can express the product G1(z) H1(z) as

G1(z) H1(z) = (−2/det(H_m(z))) H0(−z) H1(z) = P(−z).

It follows that (3.2.15) can be expressed in terms of P (z) as

P (z) + P (−z) = 2. (3.2.20)

We will show later that the function P(z) plays a crucial role in analyzing and
designing filter banks. It suffices to note at this moment that, due to (3.2.20), all
even-indexed coefficients of P(z) equal 0, except for p[0] = 1. Thus, P(z) is of the
following form:

P(z) = 1 + Σ_{k∈Z} p[2k + 1] z^{−(2k+1)}.

A polynomial or a rational function in z satisfying (3.2.20) will be called valid.


Following the definition of P (z) in (3.2.19), we can rewrite (3.2.15) or equivalently
(3.2.20) as
G0 (z) H0 (z) + G0 (−z) H0 (−z) = 2. (3.2.21)
Using the modulation property, its time-domain equivalent is

Σ_{k∈Z} g0[k] h0[n − k] + (−1)^n Σ_{k∈Z} g0[k] h0[n − k] = 2δ[n],

or equivalently,

Σ_{k∈Z} g0[k] h0[2n − k] = δ[n],

since odd-indexed terms are cancelled. Written as an inner product

⟨g0[k], h0[2n − k]⟩ = δ[n],

this is one of the biorthogonality relations

⟨ϕ̃0[k], ϕ2n[k]⟩ = δ[n].

Similarly, starting from (3.2.15) or (3.2.16) and expressing G0(z) and H0(z) as
a function of G1(z) and H1(z) would lead to the other biorthogonality relations,
namely

⟨ϕ̃1[k], ϕ2n+1[k]⟩ = δ[n],

⟨ϕ̃0[k], ϕ2n+1[k]⟩ = 0,

⟨ϕ̃1[k], ϕ2n[k]⟩ = 0.

Note that we obtained these relations for ϕ̃0 and ϕ̃1 but they hold also for ϕ̃2l and
ϕ̃2l+1 , respectively. This shows once again that perfect reconstruction implies the
biorthogonality conditions. The converse can be shown as well, demonstrating the
equivalence of the two conditions.

Figure 3.4 Polyphase-domain analysis. (a) Forward and inverse polyphase
transform. (b) Analysis part in the polyphase domain. (c) Synthesis part in
the polyphase domain.

Polyphase-Domain Analysis Although a very natural representation, modulation-


domain analysis suffers from a drawback — it is redundant. Note how in H m (z)
every filter coefficient appears twice, since both the filter Hi (z) and its modulated
version Hi (−z) are present. A more compact way of analyzing a filter bank uses
polyphase-domain analysis, which was introduced in Section 2.5.3.
Thus, what we will do is decompose both signals and filters into their polyphase
components and use (2.5.23) with N = 2 to express the output of filtering followed
by downsampling. For convenience, we introduce matrix notation to express the
two channel signals Y0 and Y1, or

( Y0(z) )   ( H00(z)  H01(z) ) ( X0(z) )
( Y1(z) ) = ( H10(z)  H11(z) ) ( X1(z) ),   (3.2.22)
  y(z)            H_p(z)         x_p(z)

where Hij is the jth polyphase component of the ith filter, or, following (2.5.22–
2.5.23),
Hi (z) = Hi0 (z 2 ) + zHi1 (z 2 ).
In (3.2.22) y(z) contains the signals in the middle of the system in Figure 3.1(a).
H p (z) contains the polyphase components of the analysis filters, and is conse-
quently denoted the analysis polyphase matrix, while xp (z) contains the polyphase
components of the input signal or, following (2.5.20),

X(z) = X0 (z 2 ) + z −1 X1 (z 2 ).

It is instructive to give a block diagram of (3.2.22) as shown in Figure 3.4(b). First,


the input signal X is split into its polyphase components X0 and X1 using a forward
polyphase transform. Then, a two-input, two-output system containing H p (z) as
transfer function matrix leads to the outputs y0 and y1 .
The synthesis part of the system in Figure 3.1(a) can be analyzed in a similar
fashion. It can be implemented with an inverse polyphase transform (as given
on the right side of Figure 3.4(a)) preceded by a two-input two-output synthesis
polyphase matrix Gp (z) defined by
 
G_p(z) = ( G00(z)  G10(z) )
         ( G01(z)  G11(z) ),   (3.2.23)

where

G_i(z) = G_{i0}(z^2) + z^{−1} G_{i1}(z^2).   (3.2.24)
The synthesis filter polyphase components are defined such as those of the signal
(2.5.20–2.5.21), or in reverse order of those of the analysis filters. In Figure 3.4(c),
we show how the output signal is synthesized from the channel signals Y0 and Y1 as
  
X̂(z) = ( 1  z^{−1} ) ( G00(z^2)  G10(z^2) ) ( Y0(z^2) )
                      ( G01(z^2)  G11(z^2) ) ( Y1(z^2) ).   (3.2.25)
                             G_p(z^2)          y(z^2)

This equation reflects that the channel signals are first upsampled by 2 (leading to
Yi (z 2 )) and then filtered by filters Gi (z) which can be written as in (3.2.24). Note
that the matrix-vector product in (3.2.25) is in z 2 and can thus be implemented
before the upsampler by 2 (replacing z 2 by z) as shown in the figure.
Note the duality between the analysis and synthesis filter banks. The former
uses a forward, the latter an inverse polyphase transform, and Gp (z) is a transpose
of H p (z). The phase reversal in the definition of the polyphase components in
analysis and synthesis comes from the fact that z and z −1 are dual operators, or,
on the unit circle, ejω = (e−jω )∗ .

Obviously the transfer function between the forward and inverse polyphase
transforms defines the analysis/synthesis filter bank. This transfer polyphase matrix
is given by
T p (z) = Gp (z) H p (z).
In order to find the input-output relationship, we use (3.2.22) as input to (3.2.25),
which yields

X̂(z) = ( 1  z^{−1} ) G_p(z^2) H_p(z^2) x_p(z^2)
      = ( 1  z^{−1} ) T_p(z^2) x_p(z^2).   (3.2.26)

Obviously, if T_p(z) = I, we have

X̂(z) = ( 1  z^{−1} ) ( X0(z^2) )
                      ( X1(z^2) ) = X(z),

following (2.5.20), that is, the analysis/synthesis filter bank achieves perfect recon-
struction with no delay and is equivalent to Figure 3.4(a).
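As an illustration, for the Haar filter bank every polyphase component is a constant (taking the analysis filters anticausal, h_i[n] = g_i[−n], as in Section 3.1.2), so the perfect reconstruction condition T_p(z) = I reduces to a plain matrix product. A small sketch in Python with NumPy, using the relation H_p(z) = G_p^T(z^{−1}) that will be established for orthonormal banks in Section 3.2.3:

    import numpy as np

    s = 1.0 / np.sqrt(2.0)
    # Haar: G0(z) = s + s z^{-1} and G1(z) = s - s z^{-1}; with anticausal analysis
    # filters h_i[n] = g_i[-n], all polyphase components are constants
    Gp = np.array([[s,  s],      # [G00  G10]
                   [s, -s]])     # [G01  G11]
    Hp = Gp.T                    # Hp(z) = Gp^T(z^{-1}), here simply the transpose
    print(Gp @ Hp)               # identity matrix: Tp(z) = I, PR with no delay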

Relationships Between Time, Modulation and Polyphase Representations


Being different views of the same system, the representations discussed are related.
A few useful formulas are given below. From (2.5.20), we can write
     
( X0(z^2) )         ( 1  0 ) ( 1   1 ) ( X(z)  )
( X1(z^2) ) = (1/2) ( 0  z ) ( 1  −1 ) ( X(−z) ),   (3.2.27)

thus relating polyphase and modulation representations of the signal, that is, xp (z)
and xm (z). For the analysis filter bank, we have that
     
( H00(z^2)  H01(z^2) )         ( H0(z)  H0(−z) ) ( 1   1 ) ( 1  0      )
( H10(z^2)  H11(z^2) ) = (1/2) ( H1(z)  H1(−z) ) ( 1  −1 ) ( 0  z^{−1} ),   (3.2.28)

establishing the relationship between H p (z) and H m (z). Finally, following the
definition of Gp (z) in (3.2.23) and similarly to (3.2.28) we have
     
( G00(z^2)  G10(z^2) )         ( 1  0 ) ( 1   1 ) ( G0(z)   G1(z)  )
( G01(z^2)  G11(z^2) ) = (1/2) ( 0  z ) ( 1  −1 ) ( G0(−z)  G1(−z) ),   (3.2.29)

which relates G_p(z) with G_m(z), defined as

G_m(z) = ( G0(z)   G1(z)  )
         ( G0(−z)  G1(−z) ).

Again, note that (3.2.28) is the transpose of (3.2.29), with a phase change in the
diagonal matrix. The change from the polyphase to the modulation representation
(and vice versa) involves not only a diagonal matrix with a delay (or phase factor),
but also a sum and/or a difference operation (see the middle matrix in (3.2.27–
3.2.29)). This is actually a size-2 Fourier transform, as will become clear in cases
of higher dimension.
The relation between time domain and polyphase domain is most obvious for
the synthesis filters gi , since their impulse responses correspond to the first basis
functions ϕi . Consider the time-domain synthesis matrix, and create a matrix T s (z)
T_s(z) = Σ_{i=0}^{K′−1} S_i z^{−i},

where the S_i are the successive 2 × 2 blocks along a column of the block Toeplitz matrix
(there are K′ of them for length-2K′ filters), or

S_i = ( g0[2i]      g1[2i]   )
      ( g0[2i+1]    g1[2i+1] ).

Then, by inspection, it can be seen that T_s(z) is identical to G_p(z). A similar
relation holds between H_p(z) and the time-domain analysis matrix. It is a bit
more involved, since time reversal has to be taken into account, and is given by

T_a(z) = z^{−K+1} H_p(z^{−1}) ( 0       1 )
                              ( z^{−1}  0 ),

where

T_a(z) = Σ_{i=0}^{K−1} A_i z^{−i},

and

A_i = ( h0[2(K−i)−1]  h0[2(K−i)−2] )
      ( h1[2(K−i)−1]  h1[2(K−i)−2] ),
K being the number of 2 × 2 blocks in a row of the block Toeplitz matrix. The
above relations can be used to establish equivalences between results in the various
representations (see also Theorem 3.7 below).

3.2.2 Results on Filter Banks


We now use the tools just established to review several classic results from the filter
bank literature. These have a slightly different flavor than the expansion results
which are concerned with the existence of orthogonal or biorthogonal bases. Here,
approximate reconstruction is considered, and issues of realizability of the filters
involved are very important.

In the filter bank language, perfect reconstruction means that the output is a
delayed and possibly scaled version of the input,

X̂(z) = cz −k X(z).

This is equivalent to saying that, up to a shift and scale, the impulse responses of the
analysis filters (with time reversal) and of the synthesis filters form a biorthogonal
basis.
Among approximate reconstructions, the most important one is alias-free re-
construction. Remember that because of the periodic time-variance of analy-
sis/synthesis filter banks, the output is both a function of x[n] and its modulated
version (−1)n x[n], or X(z) and X(−z) in the z-transform domain. The aliased
component X(−z) can be very disturbing in applications and thus cancellation of
aliasing is of prime importance. In particular, aliasing represents a nonharmonic
distortion (new sinusoidal components appear which are not harmonically related
to the input) and this is particularly disturbing in audio applications.
What follows now, are results on alias cancellation and perfect reconstruction
for the two-channel case. Note that all the results are valid for a general, N -channel
case as well (substitute N for 2 in statements and proofs).
For the first result, we need to introduce pseudocirculant matrices [311]. These
are N × N circulant matrices with elements Fij (z), except that the lower triangular
elements are multiplied by z, that is

F_{ij}(z) = F_{0,j−i}(z)           for j ≥ i,
            z · F_{0,N+j−i}(z)     for j < i.

Then, the following holds:

PROPOSITION 3.4
Aliasing in a one-dimensional subband coding system will be cancelled if and
only if the transfer polyphase matrix T p is pseudocirculant [311].

PROOF
Consider a 2 × 2 pseudocirculant matrix

T_p(z) = ( F0(z)    F1(z) )
         ( z F1(z)  F0(z) ),

and substitute it into (3.2.26)

X̂(z) = ( 1  z^{−1} ) T_p(z^2) ( X0(z^2) )
                               ( X1(z^2) ),

yielding (use F(z) = F0(z^2) + z F1(z^2))

X̂(z) = ( F(z)  z^{−1} F(z) ) · ( X0(z^2) )
                                ( X1(z^2) )
      = F(z) · (X0(z^2) + z^{−1} X1(z^2))
      = F(z) · X(z),

that is, it results in a time-invariant system, or aliasing is cancelled. Given a time-invariant
system, defined by a transfer function F(z), it can be shown (see [311]) that its polyphase
implementation is pseudocirculant.

A corollary to Proposition 3.4 is that for perfect reconstruction, the transfer func-
tion matrix has to be a pseudocirculant delay, that is, for an even delay 2k

T_p(z) = z^{−k} ( 1  0 )
                ( 0  1 ),

while for an odd delay 2k + 1

T_p(z) = z^{−k−1} ( 0  1 )
                  ( z  0 ).
The next result indicates when aliasing can be cancelled for a given analysis filter
bank. Since the analysis and synthesis filter banks play dual roles, the result that
we will discuss holds for synthesis filter banks as well.
PROPOSITION 3.5
Given a two-channel filter bank downsampled by 2 with the polyphase matrix
H p (z), then alias-free reconstruction is possible if and only if the determinant
of H p (z) is not identically zero, that is, H p (z) has normal rank 2.
PROOF
Choose the synthesis matrix as

Gp (z) = cofactor (H p (z)) ,

resulting in
T p (z) = Gp (z) H p (z) = det (H p (z)) · I
which is pseudocirculant, and thus cancels aliasing. If, on the other hand, the system is
alias-free, then we know (see Proposition 3.4) that T p (z) is pseudocirculant and therefore
has full rank 2. Since the rank of a matrix product is bounded above by the ranks of its
terms, H p (z) has rank 2.4

Often, one is interested in perfect reconstruction filter banks where all filters
involved have a finite impulse response (FIR). Again, analysis and synthesis filter
banks play the same role.
4 Note that we excluded the case of zero reconstruction, even if technically it is also aliasing-free
(but of zero interest!).

PROPOSITION 3.6
Given a critically sampled FIR analysis filter bank, perfect reconstruction
with FIR filters is possible if and only if det(H p (z)) is a pure delay.
PROOF
Suppose that the determinant of H p (z) is a pure delay, and choose

Gp (z) = cofactor (H p (z)) .

It is obvious that the above choice leads to perfect reconstruction with FIR filters. Suppose,
on the other hand, that we have perfect reconstruction with FIR filters. Then, T p (z) has
to be a pseudocirculant shift (corollary below Proposition 3.4), or

det(T p (z)) = det(Gp (z)) · det(H p (z)) = z −l ,

meaning that it has l poles at z = 0. Since the synthesis has to be FIR as well, det(Gp (z))
has only zeros (or poles at the origin). Therefore, det(H p (z)) cannot have any zeros (except
possibly at the origin or ∞).

If det(H_p(z)) has no zeros, neither does det(H_m(z)) (because of (3.2.28) and
assuming FIR filters). Since det(H_m(z)) is an odd function of z, it is of the form

det(H_m(z)) = α z^{−2k−1},

(typically, α = 2) and following (3.2.18)

G0(z) = (2/α) z^{2k+1} H1(−z),   (3.2.30)

G1(z) = −(2/α) z^{2k+1} H0(−z).   (3.2.31)
These filters give perfect reconstruction with zero delay but they are noncausal if
the analysis filters are causal. Multiplying them by z −2k−1 gives a causal version
with perfect reconstruction and a delay of 2k + 1 samples (note that the shift can
be arbitrary, since it only changes the overall delay).
In the above results, we used the polyphase decomposition of filter banks. All
these results can be translated to the other representation as well. In particular,
aliasing cancellation can be studied in the modulation domain. Then, a necessary
and sufficient condition for alias cancellation is that (see (3.2.14))

( G0 (z) G1 (z) ) · H m (z)

be a row-vector with only the first component different from zero. One could expand
( G0 (z) G1 (z) ) into a matrix Gm (z) by modulation, that is
 
G_m(z) = ( G0(z)   G1(z)  )
         ( G0(−z)  G1(−z) ).   (3.2.32)

It is easy to see then that for the system to be alias-free

T_m(z) = G_m(z) H_m(z) = ( F(z)    0    )
                         ( 0     F(−z) ).

The matrix T m (z) is sometimes called the aliasing cancellation matrix [272].
Let us for a moment return to (3.2.14). As we said, X(−z) is the aliased version
of the signal. A necessary and sufficient condition for aliasing cancellation is that

G0 (z) H0 (−z) + G1 (z) H1 (−z) = 0. (3.2.33)

The solution proposed by Croisier, Esteban, Galand [69] is known under the name
QMF (quadrature mirror filters), which cancels aliasing in a two-channel filter bank:

H1 (z) = H0 (−z), (3.2.34)


G0 (z) = H0 (z),
G1 (z) = −H1 (z) = −H0 (−z). (3.2.35)

Substituting the above into (3.2.33) leads to H0 (z)H0 (−z)− H0 (−z)H0 (z) = 0, and
aliasing is indeed cancelled. In order to achieve perfect reconstruction, the following
has to be satisfied:

G0 (z) H0 (z) + G1 (z) H1 (z) = 2z −l . (3.2.36)

For the QMF solution, (3.2.36) becomes

H0^2(z) − H0^2(−z) = 2z^{−l}.   (3.2.37)

Note that the left side is an odd function of z, and thus, l has to be odd. The above
relation explains the name QMF: on the unit circle, H0(−z) = H0(e^{j(ω+π)}) is the
mirror image of H0(e^{jω}) about the quadrature frequency π/2, and both the filter
and its mirror image are squared. For FIR filters, the condition (3.2.37) cannot be
satisfied exactly except for the Haar filters introduced in Section 3.1. Taking a
causal Haar filter, or H0(z) = (1 + z^{−1})/√2, (3.2.37) becomes

(1/2)(1 + 2z^{−1} + z^{−2}) − (1/2)(1 − 2z^{−1} + z^{−2}) = 2z^{−1}.

For longer, linear phase filters, (3.2.37) can only be approximated (see Section 3.2.4).

Summary of Biorthogonality Relations Let us summarize our findings on biorthogonal filter banks.

THEOREM 3.7
In a two-channel, biorthogonal, real-coefficient filter bank, the following are
equivalent:

(a) ⟨hi[−n], gj[n − 2m]⟩ = δ[i − j] δ[m], i, j = 0, 1.

(b) G0 (z)H0 (z) + G1 (z)H1 (z) = 2, and G0 (z)H0 (−z) + G1 (z)H1 (−z) = 0.

(c) T s · T a = T a · T s = I.

(d) Gm (z)H m (z) = H m (z)Gm (z) = 2I.

(e) Gp (z)H p (z) = H p (z)Gp (z) = I.

The proof follows from the equivalences between the various representations intro-
duced in this section and is left as an exercise (see Problem 3.4). Note that we are
assuming a critically sampled filter bank. Thus, the matrices in points (c)–(e) are
square, and left inverses are also right inverses.

3.2.3 Analysis and Design of Orthogonal FIR Filter Banks


Assume now that we impose two constraints on our filter bank: First, it should
implement an orthonormal expansion5 of discrete-time signals and second, the filters
used should be FIR.
Let us first concentrate on the orthonormality requirement. We saw in the Haar
and sinc cases (both orthonormal expansions) that the expansion was of the form

x[n] = Σ_{k∈Z} ⟨ϕk[l], x[l]⟩ ϕk[n] = Σ_{k∈Z} X[k] ϕk[n],   (3.2.38)

with the basis functions being

ϕ2k [n] = h0 [2k − n] = g0 [n − 2k], (3.2.39)


ϕ2k+1 [n] = h1 [2k − n] = g1 [n − 2k], (3.2.40)

or, the even shifts of synthesis filters (even shifts of time-reversed analysis filters).
We will show here that (3.2.38–3.2.40) describe orthonormal expansions, in the
general case.
5 The term orthogonal is often used, especially for the associated filters or filter banks. For filter
banks, the term unitary or paraunitary is also often used, as well as the notion of losslessness (see
Appendix 3.A).

Orthonormality in Time Domain Start with a general filter bank as given in Fig-
ure 3.1(a). Impose orthonormality on the expansion, that is, the dual basis {ϕ̃k [n]}
becomes identical to {ϕk [n]}. In filter bank terms, the dual basis — synthesis filters
— now becomes

{g0 [n−2k], g1 [n−2k]} = {ϕ̃k [n]} = {ϕk [n]} = {h0 [2k −n], h1 [2k −n]}, (3.2.41)

or,
gi [n] = hi [−n], i = 0, 1. (3.2.42)
Thus, we have encountered the first important consequence of orthonormality: The
synthesis filters are the time-reversed versions of the analysis filters. Also, since
(3.2.41) holds and ϕk is an orthonormal set, the following are the orthogonality
relations for the synthesis filters:

⟨gi[n − 2k], gj[n − 2l]⟩ = δ[i − j] δ[k − l],   (3.2.43)

with a similar relation for the analysis filters. We call this an orthonormal filter
bank.
Let us now see how orthonormality can be expressed using matrix notation.
First, substituting the expression for gi[n] given by (3.2.42) into the synthesis matrix
T_s given in (3.2.7), we see that

T_s = T_a^T,

or, the perfect reconstruction condition is

T_s T_a = T_a^T T_a = I.   (3.2.44)

That is, the above condition means that the matrix T_a is unitary. Because it is
full rank, the product commutes and we also have T_a T_a^T = I. Thus, having an
orthonormal basis, or perfect reconstruction with an orthonormal filter bank, is
equivalent to the analysis matrix T_a being unitary.
If we separate the outputs now as was done in (3.2.9), and note that

G_i = H_i^T,

then the following is obtained from (3.2.43):

H_i H_j^T = δ[i − j] I,   i, j = 0, 1.

Now, the output of one channel in Figure 3.1(a) (filtering, downsampling, upsam-
pling and filtering) is equal to

M_i = H_i^T H_i.

It is easy to verify that M_i satisfies the requirements for an orthogonal projection
(see Appendix 2.A), since M_i^T = M_i and M_i^2 = M_i. Thus, the two channels of
the filter bank correspond to orthogonal projections onto spaces spanned by their
respective impulse responses, and perfect reconstruction can be written as the direct
sum of the projections

H_0^T H_0 + H_1^T H_1 = I.
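The projection interpretation is easy to verify numerically. Below is a small sketch (Python with NumPy; the finite signal length N = 8 is our choice, and the Haar filters are used so that no boundary handling is needed) building H_0 and H_1 as in (3.2.9) and checking that the M_i = H_i^T H_i are orthogonal projections summing to the identity.

    import numpy as np

    N = 8                                     # finite signal length (Haar: no boundary issues)
    s = 1.0 / np.sqrt(2.0)
    H0 = np.zeros((N // 2, N))
    H1 = np.zeros((N // 2, N))
    for k in range(N // 2):                   # row k carries the basis function shifted by 2k
        H0[k, 2 * k : 2 * k + 2] = [s,  s]
        H1[k, 2 * k : 2 * k + 2] = [s, -s]

    M0, M1 = H0.T @ H0, H1.T @ H1             # one channel = analysis followed by synthesis
    print(np.allclose(M0 @ M0, M0), np.allclose(M0, M0.T))   # M0 is an orthogonal projection
    print(np.allclose(M0 + M1, np.eye(N)))    # the two projections sum to the identity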
Note also that sometimes, in order to visualize the action of the matrix T_a, it is
expressed in terms of the 2 × 2 blocks A_i (see (3.2.2–3.2.3)), which can also be used to
express orthonormality as follows (see (3.2.44)):

Σ_{i=0}^{K−1} A_i^T A_i = I,

Σ_{i=0}^{K−1} A_{i+j}^T A_i = 0,   j = 1, . . . , K − 1,

where blocks with indices outside 0, . . . , K − 1 are taken to be 0.

Orthonormality in Modulation Domain To see how orthonormality translates into
the modulation domain, consider (3.2.43) with i = j = 0. Substitute n′ = n − 2k.
Thus, we have

⟨g0[n′], g0[n′ + 2(k − l)]⟩ = δ[k − l],

or

⟨g0[n], g0[n + 2m]⟩ = δ[m].   (3.2.45)

Recall that p[l] = ⟨g0[n], g0[n + l]⟩ is the autocorrelation of the sequence g0[n] (see
Section 2.5.2). Then, (3.2.45) is simply the autocorrelation of g0[n] evaluated at
even indexes l = 2m, or p[l] downsampled by 2, that is, p′[m] = p[2m]. The
z-transform of p′[m] is (see Section 2.5.3)

P′(z) = (1/2) [P(z^{1/2}) + P(−z^{1/2})].

Replacing z by z^2 (for notational convenience) and recalling that the z-transform
of the autocorrelation of g0[n] is given by P(z) = G0(z) · G0(z^{−1}), the z-transform
of (3.2.45) becomes

G0(z) G0(z^{−1}) + G0(−z) G0(−z^{−1}) = 2.   (3.2.46)

Using the same arguments for the other cases in (3.2.43), we also have that

G1 (z) G1 (z −1 ) + G1 (−z) G1 (−z −1 ) = 2, (3.2.47)


G0(z) G1(z^{−1}) + G0(−z) G1(−z^{−1}) = 0.   (3.2.48)

On the unit circle, (3.2.46–3.2.47) become (use G(e−jω ) = G∗ (ejω ) since the filter
has real coefficients)

|Gi (ejω )|2 + |Gi (ej(ω+π) )|2 = 2, (3.2.49)

that is, the filter and its modulated version are power complementary (their mag-
nitudes squared sum up to a constant). Since this condition was used in [270]
for designing the first orthogonal filter banks, it is also called the Smith-Barnwell
condition. Writing (3.2.46–3.2.48) in matrix form,
    
( G0(z^{−1})  G0(−z^{−1}) ) ( G0(z)   G1(z)  )   ( 2  0 )
( G1(z^{−1})  G1(−z^{−1}) ) ( G0(−z)  G1(−z) ) = ( 0  2 ),   (3.2.50)

that is, using the synthesis modulation matrix G_m(z) (see (3.2.32))

G_m^T(z^{−1}) G_m(z) = 2I.   (3.2.51)

Since gi and hi are identical up to time reversal, a similar relation holds for the
analysis modulation matrix H_m(z) (up to a transpose), or H_m(z^{−1}) H_m^T(z) = 2I.
A matrix satisfying (3.2.51) is called paraunitary (note that we have assumed
that the filter coefficients are real). If all its entries are stable (which they are in this
case, since we assumed the filters to be FIR), then such a matrix is called lossless.
The concept of losslessness comes from classical circuit theory [23, 308] and is
discussed in more detail in Appendix 3.A. It suffices to say at this point that having
a lossless transfer matrix is equivalent to the filter bank implementing an orthogonal
transform. Concentrating on lossless modulation matrices, we can continue our
analysis of orthogonal systems in the modulation domain. First, from (3.2.50) we
can see that ( G1 (z −1 ) G1 (−z −1 ) )T has to be orthogonal to ( G0 (z) G0 (−z) )T .
It will be proven in Appendix 3.A (although in polyphase domain), that this implies
that the two filters G0 (z) and G1 (z) are related as follows:

G1 (z) = −z −2K+1 G0 (−z −1 ), (3.2.52)

or, in time domain


g1 [n] = (−1)n g0 [2K − 1 − n].
Equation (3.2.52) therefore establishes an important property of an orthogonal
system: In an orthogonal two-channel filter bank, all filters are obtained from a
single prototype filter.
This single prototype filter has to satisfy the power complementary property
given by (3.2.49). For filter design purposes, one can use (3.2.46) and design an
autocorrelation function P (z) that satisfies P (z) + P (−z) = 2 as will be shown
below. This special form of the autocorrelation function can be used to prove that
the filters in an orthogonal FIR filter bank have to be of even length (Problem 3.5).

Orthonormality in Polyphase Domain We have seen that the polyphase and
modulation matrices are related as in (3.2.29). Since G_m and G_p are related by
unitary operations, G_p will be lossless if and only if G_m is lossless. Thus, one
can search for or examine an orthonormal system in either the modulation or the
polyphase domain, since

G_p^T(z^{−2}) G_p(z^2) = (1/4) G_m^T(z^{−1}) ( 1   1 ) ( 1  0      ) ( 1  0 ) ( 1   1 ) G_m(z)
                                             ( 1  −1 ) ( 0  z^{−1} ) ( 0  z ) ( 1  −1 )

                       = (1/2) G_m^T(z^{−1}) G_m(z) = I,   (3.2.53)

where we used (3.2.51). Since (3.2.53) also implies G_p(z) G_p^T(z^{−1}) = I (a left inverse
is also a right inverse), it is clear that given a paraunitary G_p(z) corresponding to
an orthogonal synthesis filter bank, we can choose the analysis filter bank with a
polyphase matrix H_p(z) = G_p^T(z^{−1}) and get perfect reconstruction with no delay.

Summary of Orthonormality Relations Let us summarize our findings so far.

THEOREM 3.8
In a two-channel, orthonormal, FIR, real-coefficient filter bank, the following
are equivalent:

(a) ⟨gi[n], gj[n + 2m]⟩ = δ[i − j] δ[m], i, j = 0, 1.

(b) G0(z) G0(z^{−1}) + G0(−z) G0(−z^{−1}) = 2,

and G1(z) = −z^{−2K+1} G0(−z^{−1}), K ∈ Z.

(c) T_s^T T_s = T_s T_s^T = I, T_a = T_s^T.

(d) G_m^T(z^{−1}) G_m(z) = G_m(z) G_m^T(z^{−1}) = 2I, H_m(z) = G_m^T(z^{−1}).

(e) G_p^T(z^{−1}) G_p(z) = G_p(z) G_p^T(z^{−1}) = I, H_p(z) = G_p^T(z^{−1}).

Again, we used the fact that the left inverse is also the right inverse in a square
matrix in relations (c), (d) and (e). The proof follows from the relations between
the various representations, and is left as an exercise (see Problem 3.7). Note that
the theorem holds in more general cases as well. In particular, the filters do not have
to be restricted to be FIR, and if their coefficients are complex valued, transposes
have to be hermitian transposes (in the case of Gm and Gp , only the coefficients of
the filters have to be conjugated, not z since z −1 plays that role).

Because all filters are related to a single prototype satisfying (a) or (b), the
other filter in the synthesis filter bank follows by modulation, time reversal and an
odd shift (see (3.2.52)). The filters in the analysis are simply time-reversed versions
of the synthesis filters. In the FIR case, the length of the filters is even. Let us
formalize these statements:

COROLLARY 3.9
In a two-channel, orthonormal, FIR, real-coefficient filter bank, the following
hold:

(a) The filter length L is even, or L = 2K.

(b) The filters satisfy the power complementary or Smith-Barnwell condition

|G0(e^{jω})|^2 + |G0(e^{j(ω+π)})|^2 = 2,   |G0(e^{jω})|^2 + |G1(e^{jω})|^2 = 2.   (3.2.54)

(c) The highpass filter is specified (up to an even shift and a sign change)
by the lowpass filter as

G1(z) = −z^{−2K+1} G0(−z^{−1}).

(d) If the lowpass filter has a zero at π, that is, G0(−1) = 0, then

G0(1) = √2.   (3.2.55)

Also, an orthogonal filter bank has, as any orthogonal transform, an energy conser-
vation property:

PROPOSITION 3.10
In an orthonormal filter bank, that is, a filter bank with a unitary polyphase
or modulation matrix, the energy is conserved between the input and the
channel signals,

∥x∥^2 = ∥y0∥^2 + ∥y1∥^2.   (3.2.56)

PROOF
The energy of the subband signals equals

∥y0∥^2 + ∥y1∥^2 = (1/2π) ∫_0^{2π} [ |Y0(e^{jω})|^2 + |Y1(e^{jω})|^2 ] dω,

by Parseval's relation (2.4.37). Using the fact that y(z) = H_p(z) x_p(z), the right side can
be written as

(1/2π) ∫_0^{2π} (y(e^{jω}))* y(e^{jω}) dω = (1/2π) ∫_0^{2π} (x_p(e^{jω}))* (H_p(e^{jω}))* H_p(e^{jω}) x_p(e^{jω}) dω

                                          = (1/2π) ∫_0^{2π} (x_p(e^{jω}))* x_p(e^{jω}) dω

                                          = ∥x0∥^2 + ∥x1∥^2.

We used the fact that H_p(e^{jω}) is unitary and Parseval's relation. Finally, (3.2.56) follows
from the fact that the energy of the signal is equal to the sum of the polyphase components'
energies, ∥x∥^2 = ∥x0∥^2 + ∥x1∥^2.
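For the Haar filter bank, (3.2.56) can be verified directly on a random signal (a sketch in Python with NumPy; the even signal length keeps the channel pair aligned with the samples).

    import numpy as np

    s = 1.0 / np.sqrt(2.0)
    x = np.random.randn(64)                     # even-length test signal
    y0 = np.convolve(x, [s,  s])[1::2]          # Haar lowpass channel (filter, downsample by 2)
    y1 = np.convolve(x, [-s, s])[1::2]          # Haar highpass channel
    print(np.allclose(x @ x, y0 @ y0 + y1 @ y1))   # True: energy conservation (3.2.56)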

Designing Orthogonal Filter Banks Now, we give two design procedures: the
first, based on spectral factorization, and the second, based on lattice structures.
Let us just note that most of the methods in the literature design analysis filters.
We will give designs for synthesis filters so as to be consistent with our approach;
however, analysis filters are easily obtained by time reversing the synthesis ones.

Designs Based on Spectral Factorizations The first solution we will show is due to
Smith and Barnwell [271]. The approach here is to find an autocorrelation se-
quence P (z) = G0 (z)G0 (z −1 ) that satisfies (3.2.46) and then to perform spectral
factorization as explained in Section 2.5.2. However, factorization becomes numeri-
cally ill-conditioned as the filter size grows, and thus, the resulting filters are usually
only approximately orthogonal.

Example 3.1
Choose p[n] as a windowed version of a perfect half-band lowpass filter,

p[n] = w[n] · sin(πn/2)/(πn/2)   for n = −2K + 1, . . . , 2K − 1,
       0                         otherwise,

where w[n] is a symmetric window function with w[0] = 1. Because p[2n] = δ[n], the
z-transform of p[n] satisfies

P(z) + P(−z) = 2.   (3.2.57)

Also, since P(z) is an approximation to a half-band lowpass filter, its spectral factor will be
such an approximation as well. Now, P(e^{jω}) might not be positive everywhere, in which
case it is not an autocorrelation and has to be modified. The following trick can be used
to find an autocorrelation sequence p′[n] close to p[n] [271]. Find the minimum of P(e^{jω}),
δmin = min_ω [P(e^{jω})]. If δmin > 0, we need not do anything; otherwise, subtract it from
p[0] to get the sequence p′[n]. Now,

P′(e^{jω}) = P(e^{jω}) − δmin ≥ 0,

and P′(z) still satisfies (3.2.57) up to a scale factor (1 − δmin) which can be divided out.
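The following is a rough numerical sketch of this recipe (Python with NumPy), not Smith and Barnwell's actual optimized design: the Hamming window, the grid density, and the small lifting constant that keeps P(e^{jω}) strictly positive (so that the spectral factorization by root finding stays well-behaved) are all our assumptions.

    import numpy as np

    K = 4                                          # target filter length 2K = 8
    n = np.arange(-2 * K + 1, 2 * K)               # support of p[n]
    p = np.hamming(4 * K - 1) * np.sinc(n / 2.0)   # windowed half-band; p[2n] = delta[n]

    # check positivity of P(e^{jw}) on a dense grid and lift it if necessary
    w = np.linspace(0.0, 2.0 * np.pi, 2048)
    Pw = (p[:, None] * np.exp(-1j * np.outer(n, w))).sum(axis=0).real
    d = Pw.min()
    if d <= 0.0:
        eps = 1e-6                                 # small lift keeps P strictly positive
        p[2 * K - 1] += eps - d                    # p[0] sits at array index 2K-1
        p /= 1.0 + eps - d                         # renormalize so P(z) + P(-z) = 2 again

    # spectral factorization: roots of z^{2K-1} P(z) come in (r, 1/r) pairs
    roots = np.roots(p)                            # p ordered from z^{4K-2} down to z^0
    g0 = np.real(np.poly(roots[np.abs(roots) < 1.0]))  # keep roots inside the unit circle
    g0 /= np.linalg.norm(g0)                       # normalize so that ||g0|| = 1

    for m in range(K):                             # autocorrelation at even lags: delta[m]
        print(m, round(float(g0[2 * m:] @ g0[:g0.size - 2 * m]), 4))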

Figure 3.5 Orthogonal filter designs. Magnitude responses of: (a) Smith and
Barnwell filter of length 8 [271], (b) Daubechies' filter of length 8 (D4) [71],
(c) Vaidyanathan and Hoang filter of length 8 [310], (d) Butterworth filter for
N = 4 [133].

An example of a design for N = 8 by Smith and Barnwell is given in Figure


3.5(a) (magnitude responses) and Table 3.2 (impulse response coefficients) [271].
Another example based on spectral factorization is Daubechies’ family of max-
imally flat filters [71]. Daubechies’ purpose was that the filters should lead to
continuous-time wavelet bases (see Section 4.4). The design procedure then amounts
to finding orthogonal lowpass filters with a large number of zeros at ω = π. Equiv-
alently, one has to design an autocorrelation satisfying (3.2.46) and having many
zeros at ω = π. That is, we want

P (z) = (1 + z −1 )k (1 + z)k R(z),

which satisfies (3.2.57), where R(z) is symmetric (R(z −1 ) = R(z)) and positive
on the unit circle, R(ejω ) ≥ 0. Of particular interest is the case when R(z) is

Table 3.2 Impulse response coefficients for the Smith and Barnwell filter [271],
Daubechies' filter D4 [71], and the Vaidyanathan and Hoang filter [310] (all of length 8).

    n    Smith and Barnwell    Daubechies     Vaidyanathan and Hoang
    0     0.04935260            0.23037781     0.27844300
    1    -0.01553230            0.71484657     0.73454200
    2    -0.08890390            0.63088076     0.58191000
    3     0.31665300           -0.02798376    -0.05046140
    4     0.78751500           -0.18703481    -0.19487100
    5     0.50625500            0.03084138     0.03547370
    6    -0.03380010            0.03288301     0.04692520
    7    -0.10739700           -0.01059740    -0.01778800

of minimal degree, which turns out to be when R(z) has powers of z going from
(−k+1) to (k−1). Once the solution to this constrained problem is found, a spectral
factorization of R(z) yields the desired filter G0 (z), which has automatically k zeros
at π. As always with spectral factorization, there is a choice of taking zeros either
inside or outside the unit circle. Taking them systematically from inside the unit
circle, leads to Daubechies’ family of minimum-phase filters.
The function R(z) which is required so that P (z) satisfies (3.2.57) can be found
by solving a system of linear equations or a closed form is possible in the minimum-
degree case [71]. Let us indicate a straightforward approach leading to a system of
linear equations. Assume the minimum-degree solution. Then P (z) has powers of
z going from (−2k + 1) to (2k − 1) and (3.2.57) puts 2k − 1 constraints on P (z).
But because P (z) is symmetric, k − 1 of them are redundant, leaving k active
constraints. Because R(z) is symmetric, it has k degrees of freedom (out of its
2k − 1 nonzero coefficients). Since P (z) is the convolution of (1 + z −1 )k (1 + z)k with
R(z), it can be written as a matrix-vector product, where the matrix contains the
impulse response of (1 + z −1 )k (1 + z)k and its shifts. Gathering the even terms of
this matrix-vector product (which correspond to the k constraints) and expressing
them in terms of the k free parameters of R(z), leads to the desired k × k system
of equations. It is interesting to note that the matrix involved is never singular, and
the R(z) obtained by solving the system of equations is positive on the unit circle.
Therefore, this method automatically leads to an autocorrelation, and by spectral
factorization, to an orthogonal filter bank with filters of length 2k having k zeros
at π and 0 for the lowpass and highpass, respectively.
As an example, we will construct Daubechies' D2 filter, that is, a length-4
orthogonal filter with two zeros at ω = π (the maximum number of zeros at π is
equal to half the length, and indicated by the subscript).

Example 3.2
Let us choose k = 2 and construct length-4 filters. This means that

P (z) = G0 (z)G0 (z −1 ) = (1 + z −1 )2 (1 + z)2 R(z).

Now, recall that since P (z) + P (−z) = 2, all even-indexed coefficients in P (z) equal 0,
except for p[0] = 1. To obtain a length-4 filter, the highest-degree term has to be z −3 , and
thus R(z) is of the form
R(z) = (az + b + az −1 ). (3.2.58)

Substituting (3.2.58) into P (z) we obtain

P (z) = az 3 + (4a + b)z 2 + (7a + 4b)z + (8a + 6b) + (4b + 7a)z −1 + (b + 4a)z −2 + az −3 .

Equating the coefficients of z^2 and z^{−2} with 0, and the coefficient of z^0 with 1, yields

4a + b = 0,   8a + 6b = 1.

The solution to this system of equations is

a = −1/16,   b = 1/4,

yielding the following R(z):

R(z) = −(1/16) z + 1/4 − (1/16) z^{−1}.

We now factor R(z) as

R(z) = (1/(4√2))^2 (1 + √3 + (1 − √3)z^{−1}) (1 + √3 + (1 − √3)z).

Taking the term with the zero inside the unit circle, that is, (1 + √3 + (1 − √3)z^{−1}), we
obtain the filter G0(z) as

G0(z) = (1/(4√2)) (1 + z^{−1})^2 (1 + √3 + (1 − √3)z^{−1})

      = (1/(4√2)) ((1 + √3) + (3 + √3)z^{−1} + (3 − √3)z^{−2} + (1 − √3)z^{−3}).   (3.2.59)

Note that this lowpass filter has a double zero at z = −1 (important for constructing wavelet
bases, as will be seen in Section 4.4). A longer filter with four zeros at ω = π is shown in
Figure 3.5(b) (magnitude responses of the lowpass/highpass pair) while the impulse response
coefficients are given in Table 3.2 [71].
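A quick numerical check of (3.2.59) (a sketch in Python with NumPy): the impulse response is orthonormal to its even shifts as in (3.2.45), G0(−1) = 0 with a vanishing first derivative at ω = π (the double zero), and G0(1) = √2 as in (3.2.55).

    import numpy as np

    r3 = np.sqrt(3.0)
    g0 = np.array([1 + r3, 3 + r3, 3 - r3, 1 - r3]) / (4 * np.sqrt(2.0))   # (3.2.59)

    print(g0 @ g0, g0[2:] @ g0[:2])           # 1 and 0: orthonormal to its even shifts
    print(np.polyval(g0[::-1], -1.0))         # G0(-1) = 0: zero at omega = pi
    print((np.arange(4) * (-1.0) ** np.arange(4)) @ g0)  # derivative at pi also vanishes
    print(g0.sum())                           # G0(1) = sqrt(2), cf. (3.2.55)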

Figure 3.6 Two-channel lattice factorization of paraunitary filter banks. The
2 × 2 blocks U_i are rotation matrices.

Designs Based on Vaidyanathan and Hoang Lattice Factorizations An alternative and
numerically well-conditioned procedure relies on the fact that paraunitary matrices, just
like unitary matrices, possess canonical factorizations6 into elementary paraunitary
matrices [305, 310] (see also Appendix 3.A). Thus, all paraunitary filter banks with
FIR filters of length L = 2K can be reached by the following lattice structure (here
G1(z) = −z^{−2K+1} G0(−z^{−1})):

G_p(z) = ( G00(z)  G10(z) ) = U_0 ∏_{i=1}^{K−1} ( 1  0      ) U_i,   (3.2.60)
         ( G01(z)  G11(z) )                     ( 0  z^{−1} )

where U_i is a 2 × 2 rotation matrix given in (2.B.1)

U_i = ( cos αi  −sin αi )
      ( sin αi   cos αi ).

That the resulting structure is paraunitary is easy to check (it is the product of
paraunitary elementary blocks). What is much more interesting is that all pa-
raunitary matrices of a given degree can be written in this form [310] (see also
Appendix 3.A.1). The lattice factorization is given in Figure 3.6.
As an example of this approach, we construct the D2 filter from the previous
example, using the lattice factorization.

Example 3.3
We construct the D2 filter, which is of length 4, thus L = 2K = 4. This means that

G_p(z) = ( cos α0  −sin α0 ) ( 1  0      ) ( cos α1  −sin α1 )
         ( sin α0   cos α0 ) ( 0  z^{−1} ) ( sin α1   cos α1 )

       = ( cos α0 cos α1 − sin α0 sin α1 z^{−1}    −cos α0 sin α1 − sin α0 cos α1 z^{−1} )
         ( sin α0 cos α1 + cos α0 sin α1 z^{−1}    −sin α0 sin α1 + cos α0 cos α1 z^{−1} ).
                                                                               (3.2.61)
6 By canonical we mean complete factorizations with a minimum number of free parameters.
However, such factorizations are not unique in general.

We get the lowpass filter G0(z) as

G0(z) = G00(z^2) + z^{−1} G01(z^2)

      = cos α0 cos α1 + sin α0 cos α1 z^{−1} − sin α0 sin α1 z^{−2} + cos α0 sin α1 z^{−3}.

We now obtain the D2 filter by imposing a second-order zero at z = −1. So, we obtain the
first equation as

G0(−1) = cos α1 cos α0 − cos α1 sin α0 − sin α1 sin α0 − sin α1 cos α0 = 0,

or,

cos(α0 + α1) − sin(α0 + α1) = 0.

This equation implies that

α0 + α1 = kπ + π/4.

Since we also know that G0(1) = √2 (see (3.2.55)),

cos(α0 + α1) + sin(α0 + α1) = √2,

we get that

α0 + α1 = π/4.   (3.2.62)
Imposing now a zero at e^{jω} = −1 on the derivative of G0(e^{jω}), we obtain

dG0(e^{jω})/dω |_{ω=π} = cos α1 sin α0 + 2 sin α1 sin α0 + 3 sin α1 cos α0 = 0.   (3.2.63)

Solving (3.2.62) and (3.2.63), we obtain

α0 = π/3,   α1 = −π/12.

Substituting the angles α0 , α1 into the expression for G0 (z) (3.2.61) and comparing it to
(3.2.59), we can see that we have indeed obtained the D2 filter.

An example of a longer filter obtained by lattice factorization is given in Fig-


ure 3.5(c) (magnitude responses) and Table 3.2 (impulse response coefficients). This
design example was obtained by Vaidyanathan and Hoang in [310].

3.2.4 Linear Phase FIR Filter Banks


Orthogonal filter banks have many nice features (conservation of energy, identical
analysis and synthesis) but also some restrictions. In particular, there are no or-
thogonal linear phase solutions with real FIR filters (see Proposition 3.12) except
in some trivial cases (such as the Haar filters). Since linear phase filter banks yield
biorthogonal expansions, four filters are involved, namely H0 , H1 at analysis, and
G0 and G1 at synthesis. In our discussions, we will often concentrate on H0 and
H1 first (that is, in this case we design the analysis part of the system, or, one of
the two biorthogonal bases).
First, note that if a filter is linear phase, then it can be written as

H(z) = ±z −L+1 H(z −1 ), (3.2.64)

where ± will mean it is a symmetric/antisymmetric filter, respectively, and L de-


notes the filter’s length. Note that here we have assumed that H(z) has the impulse
response ranging from h[0], . . . , h[L − 1] (otherwise, modify (3.2.64) with a phase
factor). Recall from Proposition 3.6 that perfect reconstruction FIR solutions are
possible if and only if the matrix H p (z) (or equivalently H m (z)) has a determinant
equal to a delay, that is [319]

H00 (z) H11 (z) − H01 (z) H10 (z) = z −l , (3.2.65)


−2l−1
H0 (z) H1 (−z) − H0 (−z) H1 (z) = 2z . (3.2.66)

The left-hand side of (3.2.65) is the determinant of the polyphase matrix H_p(z),
while the left-hand side of (3.2.66) is the determinant of the modulation matrix
H_m(z). The synthesis filters are then equal to (see (3.2.30–3.2.31))

G0 (z) = z −k H1 (−z), G1 (z) = −z −k H0 (−z),

where k is an arbitrary shift.


Of particular interest is the case when both H0 (z) and H1 (z) are linear phase
(symmetric or antisymmetric) filters. Then, as in the paraunitary case, there are
certain restrictions on possible filters [315, 319].

PROPOSITION 3.11
In a two-channel, perfect reconstruction filter bank, where all filters are linear
phase, the analysis filters have one of the following forms:

(a) Both filters are symmetric and of odd lengths, differing by an odd mul-
tiple of 2.

(b) One filter is symmetric and the other is antisymmetric; both lengths are
even, and are equal or differ by an even multiple of 2.

(c) One filter is of odd length, the other one of even length; both have all zeros on the unit circle. Either both filters are symmetric, or one is symmetric and the other one is antisymmetric (this is a degenerate case).

The proof can be found in [319] and is left as an exercise (see Problem 3.8).
We will discuss it briefly. The idea is to consider the product polynomial P (z) =
H0 (z)H1 (−z) that has to satisfy (3.2.66). Because H0 (z) and H1 (z) (as well as
H1 (−z)) are linear phase, so is P (z). Because of (3.2.66), when P (z) has more
than two nonzero coefficients, it has to be symmetric with one central coefficient
at 2l − 1. Also, the end terms of P (z) have to be of an even index, so they cancel
in P (z) − P (−z). The above two requirements lead to the symmetry and length
constraints for cases (a) and (b). In addition, there is a degenerate case (c), of little
practical interest, when P(z) has only two nonzero coefficients,

   P(z) = z^{−j} (1 ± z^{−(2N−1−2j)}),

which leads to zeros at odd roots of ±1. Because these are distributed among H0(z) and H1(−z) (rather than H1(z)), the resulting filters will be a poor set of lowpass and highpass filters.
Another result that we mentioned at the beginning of this section is:
PROPOSITION 3.12
There are no two-channel perfect reconstruction, orthogonal filter banks, with
filters being FIR, linear phase, and with real coefficients (except for the Haar
filters).
PROOF
We know from Theorem 3.8 that orthonormality implies that

   H_p(z) H_p^T(z^{−1}) = I,

which further means that

   H00(z) H00(z^{−1}) + H01(z) H01(z^{−1}) = 1.   (3.2.67)

We also know that in orthogonal filter banks, the filters are of even length. Therefore, following Proposition 3.11, one filter is symmetric and the other one is antisymmetric. Take the symmetric one, H0(z) for example, and use (3.2.64):

   H0(z) = H00(z^2) + z^{−1} H01(z^2)
         = z^{−L+1} H0(z^{−1}) = z^{−L+1} (H00(z^{−2}) + z H01(z^{−2}))
         = z^{−L+2} H01(z^{−2}) + z^{−1} (z^{−L+2} H00(z^{−2})).

This further means that the polyphase components are related as

   H00(z) = z^{−L/2+1} H01(z^{−1}),   H01(z) = z^{−L/2+1} H00(z^{−1}).   (3.2.68)

Substituting the second equation from (3.2.68) into (3.2.67) we obtain

   H00(z) H00(z^{−1}) = 1/2.

However, the only FIR, real-coefficient polynomial satisfying the above is

   H00(z) = (1/√2) z^{−l}.

Performing a similar analysis for H01(z), we obtain that H01(z) = (1/√2) z^{−k}, which, in turn, means that

   H0(z) = (1/√2)(z^{−2l} + z^{−2k−1}),   H1(z) = H0(−z),

or, the only solution yields Haar filters (l = k = 0) or trivial variations thereof.

We now shift our attention to design issues.

Lattice Structure for Linear Phase Filters  Unlike in the paraunitary case, there are no canonical factorizations for general matrices of polynomials.7 But there are lattice structures that will produce, for example, linear phase perfect reconstruction filters [208, 321]. To obtain it, note that H_p(z) has to satisfy (if the filters are of the same length)

   H_p(z) = [ 1   0 ] · z^{−k} · H_p(z^{−1}) · [ 0  1 ]
            [ 0  −1 ]                          [ 1  0 ].   (3.2.69)

Here, we assume that Hi(z) = Hi0(z^2) + z^{−1} Hi1(z^2) in order to have causal filters. This is referred to as the linear phase testing condition (see Problem 3.9). Then,

assume that H_p(z) satisfies (3.2.69) and construct H_p′(z) as

   H_p′(z) = H_p(z) [ 1    0    ] [ 1  α ]
                    [ 0  z^{−1} ] [ α  1 ].

It is then easy to show that H_p′(z) satisfies (3.2.69) as well. The lattice

   H_p(z) = C [  1  1 ] ∏_{i=1}^{K−1} [ 1    0    ] [ 1    α_i ]
              [ −1  1 ]               [ 0  z^{−1} ] [ α_i   1  ],   (3.2.70)

with C = −(1/2) ∏_{i=1}^{K−1} 1/(1 − α_i^2), produces length L = 2K symmetric (lowpass) and antisymmetric (highpass) filters leading to perfect reconstruction filter banks. Note that the structure is incomplete [321] and that |α_i| ≠ 1. Again, just as in the paraunitary lattice, perfect reconstruction is structurally guaranteed within a scale factor (in the synthesis, simply replace α_i by −α_i and pick C = 1).
7 There exist factorizations of polynomial matrices based on ladder steps [151], but they are not canonical like the lattice structure in (3.2.60).

Table 3.3 Impulse response coefficients for analysis and synthesis filters in two different linear phase cases. There is a factor of 1/16 to be distributed between hi[n] and gi[n], like {1/4, 1/4} or {1/16, 1} (the latter was used in the text).

   n  | h0[n]  h1[n]  g0[n]  g1[n] | h0[n]  h1[n]  g0[n]  g1[n]
   0  |   1     -1     -1     -1   |   1     -1     -1     -1
   1  |   3     -3      3      3   |   2     -2      2      2
   2  |   3      3      3     -3   |   1      6      6     -1
   3  |   1      1     -1      1   |         -2      2
   4  |                            |         -1     -1

Example 3.4
Let us construct filters of length 4 where the lowpass has a maximum number of zeros at z = −1 (that is, the linear phase counterpart of the D2 filter). From the cascade structure,

   H_p(z) = −1/(2(1 − α^2)) [  1  1 ] [ 1    0    ] [ 1  α ]
                            [ −1  1 ] [ 0  z^{−1} ] [ α  1 ]

          = −1/(2(1 − α^2)) [  1 + αz^{−1}   α + z^{−1} ]
                            [ −1 + αz^{−1}  −α + z^{−1} ].

We can now find the filter H0(z) as

   H0(z) = H00(z^2) + z^{−1} H01(z^2) = (1 + αz^{−1} + αz^{−2} + z^{−3}) / (−2(1 − α^2)).

Because H0(z) is an even-length symmetric filter, it automatically has a zero at z = −1, or H0(−1) = 0. Take now the first derivative of H0(e^{jω}) at ω = π and set it to 0 (which corresponds to imposing a double zero at z = −1):

   dH0(e^{jω})/dω |_{ω=π} = (−1/(2(1 − α^2))) (α − 2α + 3) = 0,

leading to α = 3. Substituting this into the expression for H0 (z), we get

   H0(z) = (1/16)(1 + 3z^{−1} + 3z^{−2} + z^{−3}) = (1/16)(1 + z^{−1})^3,   (3.2.71)

which means that H0 (z) has a triple zero at z = −1. The highpass filter is equal to

   H1(z) = (1/16)(−1 − 3z^{−1} + 3z^{−2} + z^{−3}).   (3.2.72)

Note that det(H_m(z)) = (1/8) z^{−3}. Following (3.2.30–3.2.31), G0(z) = 16z^3 H1(−z) and G1(z) = −16z^3 H0(−z). A causal version simply skips the z^3 factor. Recall that the key

to perfect reconstruction is the product P (z) = H0 (z) · H1 (−z) in (3.2.66), which equals in
this case (using (3.2.71–3.2.72))

   P(z) = (1/256)(−1 + 9z^{−2} + 16z^{−3} + 9z^{−4} − z^{−6})
        = (1/256)(1 + z^{−1})^4 (−1 + 4z^{−1} − z^{−2}),

that is, the same P (z) as in Example 3.2. One can refactor this P (z) into a different set of
{H0 (z), H1 (−z)}, such as, for example,

   P(z) = H0(z) H1(−z)
        = (1/16)(1 + 2z^{−1} + z^{−2}) · (1/16)(−1 + 2z^{−1} + 6z^{−2} + 2z^{−3} − z^{−4}),

that is, odd-length linear phase lowpass and highpass filters with impulse responses 1/16 [1,
2, 1] and 1/16 [-1, -2, 6, -2, -1], respectively. Table 3.3 gives impulse response coefficients
for both analysis and synthesis filters for the two cases given above.
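As a sanity check, the small Python sketch below (our addition, with variable names of our choosing) forms P(z) = H0(z)H1(−z) for the length-4 linear phase pair (3.2.71–3.2.72) and verifies the perfect reconstruction condition (3.2.66), namely that P(z) − P(−z) reduces to a single odd-indexed delay:

    import numpy as np

    # Length-4 linear phase analysis filters from (3.2.71)-(3.2.72).
    h0 = np.array([1.0, 3.0, 3.0, 1.0]) / 16
    h1 = np.array([-1.0, -3.0, 3.0, 1.0]) / 16

    # H1(-z): modulate the impulse response by (-1)^n, then P = convolution.
    h1_mod = h1 * (-1.0) ** np.arange(len(h1))
    p = np.convolve(h0, h1_mod)
    print(p * 256)          # [-1, 0, 9, 16, 9, 0, -1], as in the text

    # (3.2.66): only one odd-indexed coefficient of P(z) may survive.
    odd = p.copy()
    odd[::2] = 0
    print(odd)              # a single nonzero entry 1/16, at the z^{-3} position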

The above example showed again the central role played by P (z) = H0 (z) · H1 (−z).
In some sense, designing two-channel filter banks boils down to designing P (z)’s
with particular properties, and factoring them in a particular way.
If one relaxes the perfect reconstruction constraint, one can obtain some desir-
able properties at the cost of some small reconstruction error. For example, popular
QMF filters have been designed by Johnston [144], which have linear phase and “al-
most” perfect reconstruction. The idea is to approximate perfect reconstruction in
a QMF solution (see (3.2.37)) as well as possible, while obtaining a good lowpass
filter (the highpass filter H1 (z) being equal to H0 (−z), is automatically as good as
the lowpass). Therefore, define an objective function depending on two quantities: (a) the stopband attenuation error of H0(z),

   S = ∫_{ωs}^{π} |H0(e^{jω})|^2 dω,

and (b) the reconstruction error,

   E = ∫_{0}^{π} | 2 − |H0(e^{jω})|^2 − |H0(e^{j(ω+π)})|^2 |^2 dω.

The objective function is

   O = cS + (1 − c)E,
where c assigns the relative cost to these two quantities. Then, O is minimized
using the coefficients of H0 (z) as free variables. Such filter designs are tabulated in
[67, 144].
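To make the procedure concrete, here is a minimal Python sketch of such an objective function (our illustration, not Johnston's actual program; the function name, the frequency grid and the Riemann-sum integration are ours). It could be handed to any general-purpose numerical optimizer, with the coefficients of a symmetric H0 as the free variables:

    import numpy as np

    def qmf_objective(h0, ws, c, grid=512):
        # Frequency response of H0 on a uniform grid over [0, pi].
        w = np.linspace(0.0, np.pi, grid)
        n = np.arange(len(h0))
        H0 = h0 @ np.exp(-1j * np.outer(n, w))
        H0pi = h0 @ np.exp(-1j * np.outer(n, w + np.pi))
        dw = w[1] - w[0]
        # (a) Stopband attenuation error of H0 above the stopband edge ws.
        S = np.sum(np.abs(H0[w >= ws]) ** 2) * dw
        # (b) Deviation from the QMF reconstruction condition
        #     |H0(e^{jw})|^2 + |H0(e^{j(w+pi)})|^2 = 2.
        E = np.sum(np.abs(2 - np.abs(H0) ** 2 - np.abs(H0pi) ** 2) ** 2) * dw
        return c * S + (1 - c) * E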

Complementary Filters The following question sometimes arises in the design of


filter banks: given an FIR filter H0 (z), is there a complementary filter H1 (z) such
that the filter bank allows perfect reconstruction with FIR filters? The answer is
given by the following proposition which was first proven in [139]. We will follow
the proof in [319]:
PROPOSITION 3.13
Given a causal FIR filter H0 (z), there exists a complementary filter H1 (z)
if and only if the polyphase components of H0 (z) are coprime (except for
possible zeros at z = ∞).
PROOF
From Proposition 3.6, we know that a necessary and sufficient condition for perfect FIR
reconstruction is that det(H p (z)) be a monomial. Thus, coprimeness is obviously neces-
sary, since if there is a common factor between H00 (z) and H01 (z), it will show up in the
determinant. Sufficiency follows from the Euclidean algorithm or Bezout’s identity: given
two coprime polynomials a(z) and b(z), the equation a(z)p(z)+b(z)q(z) = c(z) has a unique
solution (see, for example, [32]). Thus, choose c(z) = z −k and then, the solution {p(z), q(z)}
corresponds to the two polyphase components of H1 (z).

Note that the solution H1 (z) is not unique [32, 319]. Also, coprimeness of
H00 (z), H01 (z) is equivalent with H0 (z) not having any pair of zeros at locations α
and −α. This can be used to prove that the filter H0 (z) = (1 + z −1 )N always has
a complementary filter (see Problem 3.12).

Example 3.5
Consider the filter H0(z) = (1 + z^{−1})^4 = 1 + 4z^{−1} + 6z^{−2} + 4z^{−3} + z^{−4}. It can be verified that its two polyphase components are coprime, and thus, there is a complementary filter. We will find a solution to the equation

   det(H_p(z)) = H00(z) · H11(z) − H01(z) · H10(z) = z^{−1},   (3.2.73)

with H00(z) = 1 + 6z^{−1} + z^{−2} and H01(z) = 4 + 4z^{−1}. The right side of (3.2.73) was chosen so that there is a linear phase solution. For example,

   H10(z) = (1/16)(1 + z^{−1}),   H11(z) = 1/4,

is a solution to (3.2.73), that is, H1(z) = (1 + 4z^{−1} + z^{−2})/16. This of course leads to the same P(z) as in Examples 3.3 and 3.4.
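The determinant condition is easy to verify with a few lines of polynomial arithmetic. The Python sketch below (ours) multiplies out (3.2.73) for the polyphase components above and confirms that the result is the pure delay z^{−1}:

    import numpy as np

    # Polyphase components of H0(z) = (1 + z^{-1})^4 and of the complementary
    # filter found above (coefficient lists in increasing powers of z^{-1}).
    h00 = np.array([1.0, 6.0, 1.0])        # 1 + 6 z^-1 + z^-2
    h01 = np.array([4.0, 4.0])             # 4 + 4 z^-1
    h10 = np.array([1.0, 1.0]) / 16        # (1 + z^-1)/16
    h11 = np.array([0.25])                 # 1/4

    a = np.convolve(h00, h11)              # H00(z) H11(z)
    b = np.convolve(h01, h10)              # H01(z) H10(z)
    m = max(len(a), len(b))
    det = np.pad(a, (0, m - len(a))) - np.pad(b, (0, m - len(b)))
    print(det)                             # [0. 1. 0.]: det(Hp) = z^{-1}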

3.2.5 Filter Banks with IIR Filters


We will now concentrate on orthogonal filter banks with infinite impulse response
(IIR) filters. An early study of IIR filter banks was done in [313], and further
developed in [234] as well as in [269] for perfect reconstruction in the context of

image coding. The main advantage of such filter banks is good frequency selectivity
and low computational complexity, just like in regular IIR filtering. However, this
advantage comes with a cost. Recall that in orthogonal filter banks, the synthesis
filter impulse response is the time-reversed version of the analysis filter. Now if
the analysis uses causal filters (with impulse response going from 0 to +∞), then
the synthesis has anticausal filters. This is a drawback from the point of view of
implementation, since in general anticausal IIR filters cannot be implemented unless
their impulse responses are truncated. However, a case where anticausal IIR filters
can be implemented appears when the signal to be filtered is of finite length, a case
encountered in image processing [234, 269]. IIR filter banks have been less popular
because of this drawback, but their attractive features justify a brief treatment as
given below. For more details, the reader is referred to [133].
First, return to the lattice factorization for FIR orthogonal filter banks (see (3.2.60)). If one substitutes an allpass section8 for the delay z^{−1} in (3.2.60), the factorization is still paraunitary. For example, instead of the diagonal matrix used in (3.2.60), take a diagonal matrix D(z) such that

   D(z) D(z^{−1}) = [ F0(z)    0    ] [ F0(z^{−1})      0       ] = I,
                    [   0    F1(z)  ] [     0      F1(z^{−1})   ]

where we have assumed that the coefficients are real, and have used two allpass sections (instead of 1 and z^{−1}). What is even more interesting is that such a factorization is complete [84].
Alternatively, recall that one of the ways to design orthogonal filter banks is to
find an autocorrelation function P(z) which is valid, that is, which satisfies

   P(z) + P(−z) = 2,   (3.2.74)

and then factor it into P(z) = H0(z)H0(z^{−1}). This approach is used in [133] to construct all possible orthogonal filter banks with rational filters. The method goes as follows. First, one chooses an arbitrary polynomial R(z) and forms P(z) as

   P(z) = 2R(z)R(z^{−1}) / (R(z)R(z^{−1}) + R(−z)R(−z^{−1})).   (3.2.75)

It is easy to see that this P(z) satisfies (3.2.74). Since both the numerator and the denominator are autocorrelations (the latter being the sum of two autocorrelations), P(z) is as well. It can be shown that any valid autocorrelation can be written as in (3.2.75) [133]. Then factor P(z) as H(z)H(z^{−1}) and form the filter

   H0(z) = A_{H0}(z) H(z),
8 Remember that a filter H(e^{jω}) is allpass if |H(e^{jω})| = c, c > 0, for all ω. Here we choose c = 1.

where A_{H0}(z) is an arbitrary allpass. Finally choose

   H1(z) = z^{2K−1} H0(−z^{−1}) A_{H1}(z),   (3.2.76)

where A_{H1}(z) is again an arbitrary allpass. The synthesis filters are then

   G0(z) = H0(z^{−1}),   G1(z) = −H1(z^{−1}).   (3.2.77)

The above construction covers the whole spectrum of possible solutions. For example, if R(z)R(z^{−1}) is in itself a valid function, then

   R(z)R(z^{−1}) + R(−z)R(−z^{−1}) = 2,

and by choosing A_{H0}, A_{H1} to be pure delays, the solutions obtained by the above construction are FIR.

Example 3.6 Butterworth Filters

As an example, consider a family of IIR solutions constructed in [133]. It is obtained using the above construction and imposing a maximum number of zeros at z = −1. Choosing R(z) = (1 + z^{−1})^N in (3.2.75) gives

   P(z) = 2(1 + z^{−1})^N (1 + z)^N / ((z^{−1} + 2 + z)^N + (−z^{−1} + 2 − z)^N) = H(z)H(z^{−1}).   (3.2.78)

These filters are the IIR counterparts of the Daubechies’ filters given in Example 3.2. These
are, in fact, the N th order half-band digital Butterworth filters [211] (see also Example 2.2).
That these particular filters satisfy the conditions for orthogonality was also pointed out
in [269]. The Butterworth filters are known to be the maximally flat IIR filters of a given
order.
Choose N = 5; then P(z) equals

   P(z) = (1 + z)^5 (1 + z^{−1})^5 / (10z^4 + 120z^2 + 252 + 120z^{−2} + 10z^{−4}).

In this case, we can obtain a closed-form spectral factorization of P(z), which leads to

1 + 5z −1 + 10z −2 + 10z −3 + 5z −4 + z −5
H0 (z) = √ , (3.2.79)
2(1 + 10z −2 + 5z −4 )
1 − 5z + 10z 2 − 10z 3 + 5z 4 − z 5
H1 (z) = z −1 √ . (3.2.80)
2(1 + 10z 2 + 5z 4 )

For the purposes of implementation, it is necessary to factor Hi(z) into stable causal (poles inside the unit circle) and anticausal (poles outside the unit circle) parts. For comparison with earlier designs, where length-8 FIR filters were designed, we show in Figure 3.5(d) the magnitude responses of H0(e^{jω}) and H1(e^{jω}) for N = 4. The form of P(z) is then

   P(z) = z^{−4} (1 + z)^4 (1 + z^{−1})^4 / (1 + 28z^{−2} + 70z^{−4} + 28z^{−6} + z^{−8}).
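The validity of this family is easy to check numerically. The Python fragment below (our addition) evaluates the rational function of (3.2.78) on the unit circle and verifies the halfband condition P(z) + P(−z) = 2, that is, (3.2.74), for N = 5:

    import numpy as np

    def P(z, N=5):
        # The construction (3.2.75) with R(z) = (1 + z^{-1})^N, as in (3.2.78).
        num = 2 * (1 + 1 / z) ** N * (1 + z) ** N
        den = (1 / z + 2 + z) ** N + (-1 / z + 2 - z) ** N
        return num / den

    z = np.exp(1j * np.linspace(0.1, np.pi - 0.1, 9))  # points on the unit circle
    print(np.max(np.abs(P(z) + P(-z) - 2)))            # ~1e-15: valid autocorrelation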

As we pointed out in Proposition 3.12, there are no real FIR orthogonal sym-
metric/antisymmetric filter banks. However, if we allow IIR filters instead, then
solutions do exist. There are two cases, depending if the center of symmetry/anti-
symmetry is at a half integer (such as in an even-length FIR linear phase filter)
or at an integer (such as in the odd-length FIR case). We will only consider the
former case. For discussion of the latter case as well as further details, see [133].
It can be shown that the polyphase matrix for an orthogonal, half-integer symmetric/antisymmetric filter bank is necessarily of the form

   H_p(z) = [     A(z)        z^{−l} A(z^{−1}) ]
            [ −z^{l−n} A(z)   z^{−n} A(z^{−1}) ],

where A(z)A(z^{−1}) = 1, that is, A(z) is an allpass filter. Choosing l = n = 0 gives

   H0(z) = A(z^2) + z^{−1} A(z^{−2}),   H1(z) = −A(z^2) + z^{−1} A(z^{−2}),   (3.2.81)

which is an orthogonal, linear phase pair. For a simple example, choose

   A(z) = (1 + 6z^{−1} + (15/7)z^{−2}) / ((15/7) + 6z^{−1} + z^{−2}).   (3.2.82)
This particular solution will prove useful in the construction of wavelets (see Sec-
tion 4.6.2). Again, for the purposes of implementation, one has to implement stable
causal and anticausal parts separately.

Remarks The main advantage of IIR filters is their good frequency selectivity and
low computational complexity. The price one pays, however, is the fact that the
filters become noncausal. For the sake of discussion, assume a finite-length signal,
and a causal analysis filter, which will be followed by an anticausal synthesis filter.
The output will be infinite even though the input is of finite length. One can take
care of this problem in two ways. Either one stores the state of the filters after
the end of the input signal and uses this as an initial state for the synthesis filters
[269], or one takes advantage of the fact that the outputs of the analysis filter bank
decay rapidly after the input is zero, and stores only a finite extension of these
signals. While the former technique is exact, the latter is usually a good enough
approximation. This short discussion indicates that the implementation of IIR filter
banks is less straightforward than that of their FIR counterparts, and explains their
lesser popularity.

3.3 TREE-STRUCTURED FILTER BANKS


An easy way to construct multichannel filter banks is to cascade two-channel banks
appropriately. One case can be seen in Figure 3.7(a), where frequency analysis is

Figure 3.7 An octave-band filter bank with J stages. Decomposition spaces Vi, Wi are indicated. If hi[n] is an orthogonal filter, and gi[n] = hi[−n], the structure implements an orthogonal discrete-time wavelet series expansion. (a) Analysis part. (b) Synthesis part.

obtained by simply iterating a two-channel division on the previous lowpass channel.


This is often called a constant-Q or constant relative bandwidth filter bank since the
bandwidth at each channel, divided by its center frequency, is constant. It is also
sometimes called a logarithmic filter bank since the channels are equal bandwidth
on a logarithmic scale. We will call it an octave-band filter bank since each successive
highpass output contains an octave of the input bandwidth. Another case appears
when 2J equal bandwidth channels are desired. This can be obtained by a J-step
subdivision into 2 channels, that is, the two-channel bank is now iterated on both
the lowpass and highpass channels. This results in a tree with 2J leaves, each
corresponding to (1/2J )th of the original bandwidth, with a downsampling by 2J .
Another possibility is building an arbitrary tree-structured filter bank, giving rise

to wavelet packets, discussed later in this section.

3.3.1 Octave-Band Filter Bank and Discrete-Time Wavelet Series


Consider the filter bank given in Figure 3.7. We see that the signal is split first via a
two-channel filter bank, then the lowpass version is split again using the same filter
bank, and so on. It will be shown later that this structure implements a discrete-
time biorthogonal wavelet series (we assume here that the two-channel filter banks
are perfect reconstruction). If the two-channel filter bank is orthonormal, then it
implements an orthonormal discrete-time wavelet series.9
Recall that the basis functions of the discrete-time expansion are given by the
impulse responses of the synthesis filters. Therefore, we will concentrate on the
synthesis filter bank (even though, in the orthogonal case, simple time reversal
relates analysis and synthesis filters). Let us start with a simple example which
should highlight the main features of octave-band filter bank expansions.

Example 3.7
Consider what happens if the filters gi[n] from Figure 3.7(a)-(b) are Haar filters, defined in z-transform domain as

   G0(z) = (1/√2)(1 + z^{−1}),   G1(z) = (1/√2)(1 − z^{−1}).

Take, for example, J = 3, that is, we will use three two-channel filter banks. Then, using the multirate identity which says that G(z) followed by upsampling by 2 is equivalent to upsampling by 2 followed by G(z^2) (see Section 2.5.3), we can transform this filter bank into a four-channel one as given in Figure 3.8. The equivalent filters are

   G1^{(1)}(z) = G1(z) = (1/√2)(1 − z^{−1}),
   G1^{(2)}(z) = G0(z) G1(z^2) = (1/2)(1 + z^{−1} − z^{−2} − z^{−3}),
   G1^{(3)}(z) = G0(z) G0(z^2) G1(z^4)
              = (1/(2√2))(1 + z^{−1} + z^{−2} + z^{−3} − z^{−4} − z^{−5} − z^{−6} − z^{−7}),
   G0^{(3)}(z) = G0(z) G0(z^2) G0(z^4)
              = (1/(2√2))(1 + z^{−1} + z^{−2} + z^{−3} + z^{−4} + z^{−5} + z^{−6} + z^{−7}),
preceded by upsampling by 2, 4, 8 and 8, respectively. The impulse responses follow by inverse z-transform. Denote by g0^{(3)}[n] the equivalent filter obtained by going through three

9 This is also sometimes called a discrete-time wavelet transform in the literature.

1
y0(n) 2 ------- ( 1, – 1 )
2

1
y1(n) 48 --- ( 1, 1, – 1, – 1 )
2

+ x(n)
1
y2(n) 8 ---------- ( 1, 1, 1, 1, – 1, – 1, – 1, – 1 )
2 2

1
y3(n) 8 ---------- ( 1, 1, 1, 1, 1, 1, 1, 1 )
2 2

FIGURE 3.6
fignew3.3.2
Figure 3.8 Octave-band synthesis filter bank with Haar filters and three stages.
It is obtained by transforming the filter bank from Figure 3.7(b) using the mul-
tirate identity for filtering followed by upsampling.

stages of lowpass filters g0[n], each preceded by upsampling by 2. It can be defined recursively as (we give it in z-domain for simplicity)

   G0^{(3)}(z) = G0(z^{2^2}) G0^{(2)}(z) = ∏_{k=0}^{2} G0(z^{2^k}).

Note that this implies that G0^{(1)}(z) = G0(z). On the other hand, we denote by g1^{(i)}[n] the equivalent filter corresponding to highpass filtering followed by (i − 1) stages of lowpass filtering, each again preceded by upsampling by 2. It can be defined recursively as

   G1^{(j)}(z) = G1(z^{2^{j−1}}) G0^{(j−1)}(z) = G1(z^{2^{j−1}}) ∏_{k=0}^{j−2} G0(z^{2^k}),   j = 1, 2, 3.

Since this is an orthonormal system, the time-domain matrices representing analysis and synthesis are just transposes of each other. Thus the analysis matrix T_a, representing the actions of the filters h1^{(1)}[n], h1^{(2)}[n], h1^{(3)}[n], h0^{(3)}[n], contains as lines the impulse responses of g1^{(1)}[n], g1^{(2)}[n], g1^{(3)}[n], and g0^{(3)}[n], or of hi^{(j)}[−n], since analysis and synthesis filters are linked by time reversal. The matrix T_a is block-diagonal,

   T_a = [ ⋱            ]
         [    A_0       ]
         [       A_0    ],   (3.3.1)
         [           ⋱  ]

where the block A_0 is of the following form:

   A_0 = 1/(2√2) [  2  −2   0   0   0   0   0   0 ]
                 [  0   0   2  −2   0   0   0   0 ]
                 [  0   0   0   0   2  −2   0   0 ]
                 [  0   0   0   0   0   0   2  −2 ]
                 [ √2  √2  −√2 −√2   0   0   0   0 ]
                 [  0   0   0   0  √2  √2  −√2 −√2 ]
                 [  1   1   1   1  −1  −1  −1  −1 ]
                 [  1   1   1   1   1   1   1   1 ].   (3.3.2)
Note how this matrix reflects the fact that the filter g1^{(1)}[n] is preceded by upsampling by 2 (the row (2 −2) is shifted by 2 each time and appears 4 times in the matrix). g1^{(2)}[n] is preceded by upsampling by 4 (the corresponding row is shifted by 4 and appears twice), while the filters g1^{(3)}[n], g0^{(3)}[n] are preceded by upsampling by 8 (the corresponding rows appear only once in the matrix). Note that the ordering of the rows in (3.3.2) is somewhat arbitrary; we simply gathered successive impulse responses for clarity.
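The iterated filters and the block A_0 can be reproduced with a few lines of Python (this sketch is ours, not from the book). The multirate identity is implemented by upsampling an impulse response and convolving:

    import numpy as np

    def upsample(x, m):
        # Insert m-1 zeros between samples: G(z) -> G(z^m) in time domain.
        y = np.zeros(m * (len(x) - 1) + 1)
        y[::m] = x
        return y

    g0 = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar lowpass
    g1 = np.array([1.0, -1.0]) / np.sqrt(2)   # Haar highpass

    g1_1 = g1
    g1_2 = np.convolve(g0, upsample(g1, 2))   # G0(z) G1(z^2)
    g1_3 = np.convolve(np.convolve(g0, upsample(g0, 2)), upsample(g1, 4))
    g0_3 = np.convolve(np.convolve(g0, upsample(g0, 2)), upsample(g0, 4))

    # Assemble the block A0 of (3.3.2) from shifted impulse responses.
    A0 = np.zeros((8, 8))
    for k in range(4):
        A0[k, 2 * k:2 * k + 2] = g1_1         # shifted by 2, appears 4 times
    for k in range(2):
        A0[4 + k, 4 * k:4 * k + 4] = g1_2     # shifted by 4, appears twice
    A0[6, :], A0[7, :] = g1_3, g0_3           # appear once each
    print(np.max(np.abs(A0 @ A0.T - np.eye(8))))   # ~1e-16: A0 is unitary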

Now that we have seen how it works in a simple case, we take more general filters gi[n], and a number of stages J. We concentrate on the orthonormal case (the biorthogonal one would follow similarly). In an orthonormal octave-band filter bank with J stages, the equivalent filters (basis functions) are given by (again we give them in z-domain for simplicity)

   G0^{(J)}(z) = G0^{(J−1)}(z) G0(z^{2^{J−1}}) = ∏_{K=0}^{J−1} G0(z^{2^K}),   (3.3.3)

   G1^{(j)}(z) = G0^{(j−1)}(z) G1(z^{2^{j−1}}) = G1(z^{2^{j−1}}) ∏_{K=0}^{j−2} G0(z^{2^K}),   j = 1, . . . , J.   (3.3.4)
In time domain, each of the outputs in Figure 3.7(a) can be described as

   H_1 H_0^{j−1} x,   j = 1, . . . , J,

except for the last, which is obtained by

   H_0^J x.

Here, the time-domain matrices H_0, H_1 are as defined in Section 3.2.1, that is, each line is an even shift of the impulse response of gi[n], or equivalently, of hi[−n]. Since each stage in the analysis bank is orthonormal and invertible, the overall scheme is as well. Thus, we get a unitary analysis matrix T_a by interleaving the rows of H_1, H_1 H_0, . . ., H_1 H_0^{J−1}, H_0^J, as was done in (3.3.1–3.3.2). A formal proof of this statement will be given in Section 3.3.2 under orthogonality of basis functions.

Example 3.8
Let us go back to the Haar case and three stages. We can form the matrices H_1, H_1 H_0, H_1 H_0^2, H_0^3 as

   H_1 = 1/√2 [ ⋯  1  −1  0   0  ⋯ ]
              [ ⋯  0   0  1  −1  ⋯ ],   (3.3.5)

   H_0 = 1/√2 [ ⋯  1  1  0  0  ⋯ ]
              [ ⋯  0  0  1  1  ⋯ ],   (3.3.6)

   H_1 H_0 = 1/2 [ ⋯  1  1  −1  −1  0  0   0   0  ⋯ ]
                 [ ⋯  0  0   0   0  1  1  −1  −1  ⋯ ],   (3.3.7)

   H_1 H_0^2 = 1/(2√2) [ ⋯  1  1  1  1  −1  −1  −1  −1  0  0  ⋯ ]
                       [ ⋯  0  0  0  0   0   0   0   0  1  1  ⋯ ],   (3.3.8)

   H_0^3 = 1/(2√2) [ ⋯  1  1  1  1  1  1  1  1  0  0  ⋯ ]
                   [ ⋯  0  0  0  0  0  0  0  0  1  1  ⋯ ].   (3.3.9)

Now, it is easy to see that by interleaving (3.3.5–3.3.9) we obtain the matrix T_a as in (3.3.1–3.3.2). To check that it is unitary, it is enough to check that A_0 is unitary (which it is; just compute the product A_0 A_0^T).

Until now, we have concentrated on the orthonormal case. If one relaxes the orthonormality constraint, one obtains a biorthogonal tree-structured filter bank. Now, hi[n] and gi[n] are not related by simple time reversal, but are impulse responses of a biorthogonal perfect reconstruction filter bank. We therefore have both equivalent synthesis filters g1^{(j)}[n − 2^j k], g0^{(J)}[n − 2^J k] as given in (3.3.3–3.3.4) and analysis filters h1^{(j)}[n − 2^j k], h0^{(J)}[n − 2^J k], which are defined similarly. Therefore, if the individual two-channel filter banks are biorthogonal (perfect reconstruction), then the overall scheme is as well. The proof of this statement follows the proof for the orthonormal case (see Section 3.3.2 for the discrete-time wavelet series case), and is left as an exercise to the reader.

3.3.2 Discrete-Time Wavelet Series and Its Properties


What was obtained in the last section is called a discrete-time wavelet series. It
should be noted that this is not an exact equivalent of the continuous-time wavelet
transform or series discussed in Chapter 4. In continuous time, there is a single
wavelet involved, whereas in the discrete-time case, there are different iterated
filters.
At the risk of a slight redundancy, we go once more through the whole process
leading to the discrete-time wavelet series. Consider a two-channel orthogonal filter
bank with filters h0 [n], h1 [n], g0 [n] and g1 [n], where hi [n] = gi [−n]. Then, the input
signal can be written as
   x[n] = Σ_{k∈Z} X^{(1)}[2k + 1] g1^{(1)}[n − 2^1 k] + Σ_{k∈Z} X^{(1)}[2k] g0^{(1)}[n − 2^1 k],   (3.3.10)

where

   X^{(1)}[2k] = ⟨h0^{(1)}[2^1 k − l], x[l]⟩,
   X^{(1)}[2k + 1] = ⟨h1^{(1)}[2^1 k − l], x[l]⟩,

are the convolutions of the input with h0[n] and h1[n] evaluated at even indexes 2k. In these equations hi^{(1)}[n] = hi[n], and gi^{(1)}[n] = gi[n]. In an octave-band filter bank or discrete-time wavelet series, the lowpass channel is further split by lowpass/highpass filtering and downsampling. Then, the first term on the right side of (3.3.10) remains unchanged, while the second can be expressed as

   Σ_{k∈Z} X^{(1)}[2k] g0^{(1)}[n − 2^1 k] = Σ_{k∈Z} X^{(2)}[2k + 1] g1^{(2)}[n − 2^2 k]
                                           + Σ_{k∈Z} X^{(2)}[2k] g0^{(2)}[n − 2^2 k],   (3.3.11)

where

   X^{(2)}[2k] = ⟨h0^{(2)}[2^2 k − l], x[l]⟩,
   X^{(2)}[2k + 1] = ⟨h1^{(2)}[2^2 k − l], x[l]⟩,

that is, we applied (3.3.10) once more. In the above, the basis functions g^{(i)}[n] are as defined in (3.3.3) and (3.3.4). In other words, g0^{(2)}[n] is the time-domain version of

   G0^{(2)}(z) = G0(z) G0(z^2),

while g1^{(2)}[n] is the time-domain version of

   G1^{(2)}(z) = G0(z) G1(z^2).

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

g(1)
1
g(2)
1
g(3)
1
g(4)
1
g(4)
0

FIGURE 3.7 fignew3.3.3


Figure 3.9 Dyadic sampling grid used in the discrete-time wavelet series. The
(j) (J)
shifts of the basis functions g1 are shown, as well as g0 (case J = 4 is shown).
This corresponds to the “sampling” of the discrete-time wavelet series. Note
the conservation of the number of samples between the signal and transform
domains.

With (3.3.11), the input signal x[n] in (3.3.10) can be written as

   x[n] = Σ_{k∈Z} X^{(1)}[2k + 1] g1^{(1)}[n − 2^1 k] + Σ_{k∈Z} X^{(2)}[2k + 1] g1^{(2)}[n − 2^2 k]
        + Σ_{k∈Z} X^{(2)}[2k] g0^{(2)}[n − 2^2 k].   (3.3.12)

Repeating the process in (3.3.12) J times, one obtains the discrete-time wavelet series over J octaves, plus the final octave containing the lowpass version. Thus, (3.3.12) becomes

   x[n] = Σ_{j=1}^{J} Σ_{k∈Z} X^{(j)}[2k + 1] g1^{(j)}[n − 2^j k] + Σ_{k∈Z} X^{(J)}[2k] g0^{(J)}[n − 2^J k],   (3.3.13)

where

   X^{(j)}[2k + 1] = ⟨h1^{(j)}[2^j k − l], x[l]⟩,   j = 1, . . . , J,   (3.3.14)
   X^{(J)}[2k] = ⟨h0^{(J)}[2^J k − l], x[l]⟩.

In (3.3.13) the sequence g1^{(j)}[n] is the time-domain version of (3.3.4), while g0^{(J)}[n] is the time-domain version of (3.3.3), and hi^{(j)}[n] = gi^{(j)}[−n]. Because any input sequence can be decomposed as in (3.3.13), the family of functions {g1^{(j)}[n − 2^j k], g0^{(J)}[n − 2^J k]}, j = 1, . . . , J, and k, n ∈ Z, is an orthonormal basis for l2(Z).
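A compact way to see (3.3.13) at work is to build the analysis matrix T_a from the equivalent filters and apply it to a finite-length signal. The Python sketch below (our illustration; it uses circular shifts so that T_a is square, which is exact for the Haar filters since the equivalent filter lengths tile the period) verifies perfect reconstruction and, in passing, Parseval's equality discussed next:

    import numpy as np

    def upsample(x, m):
        y = np.zeros(m * (len(x) - 1) + 1)
        y[::m] = x
        return y

    def equivalent_filters(g0, g1, J):
        # g1^(j), j = 1..J, and g0^(J), via the recursions (3.3.3)-(3.3.4).
        details, low = [], np.array([1.0])
        for j in range(1, J + 1):
            details.append(np.convolve(low, upsample(g1, 2 ** (j - 1))))
            low = np.convolve(low, upsample(g0, 2 ** (j - 1)))
        return details, low

    g0 = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar filters
    g1 = np.array([1.0, -1.0]) / np.sqrt(2)
    J, N = 3, 16
    details, low = equivalent_filters(g0, g1, J)

    rows = []
    for j, f in enumerate(details, start=1):  # g1^(j), shifted by 2^j
        f = np.pad(f, (0, N - len(f)))
        rows += [np.roll(f, 2 ** j * k) for k in range(N // 2 ** j)]
    f = np.pad(low, (0, N - len(low)))        # g0^(J), shifted by 2^J
    rows += [np.roll(f, 2 ** J * k) for k in range(N // 2 ** J)]
    Ta = np.array(rows)

    x = np.random.randn(N)
    X = Ta @ x                                # inner products, as in (3.3.14)
    print(np.max(np.abs(Ta.T @ X - x)))       # ~1e-15: perfect reconstruction
    print(abs(X @ X - x @ x))                 # ~1e-14: Parseval's equality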
Note the special sampling used in the discrete-time wavelet series. Each sub-
sequent channel is downsampled by 2 with respect to the previous one and has a

bandwidth that is reduced by 2 as well. This is called a dyadic sampling grid, as shown in Figure 3.9.
Let us now list a few properties of the discrete-time wavelet series (orthonormal
and dyadic).

Linearity Since the discrete-time wavelet series involves inner products or convo-
lutions (which are linear operators) it is obviously linear.

Shift Recall that multirate systems are not shift-invariant in general, and two-
channel filter banks downsampled by 2 are shift-invariant with respect to even
shifts only. Therefore, it is intuitive that a J-octave discrete-time wavelet series
will be invariant under shifts by multiples of 2J . A visual interpretation follows
from the fact that the dyadic grid in Figure 3.9, when moved by k2J , will overlap
with itself, whereas it will not if the shift is a noninteger multiple of 2J .

PROPOSITION 3.14
In a discrete-time wavelet series expansion over J octaves, if

   x[l] ←→ X^{(j)}[2k + 1],   j = 1, 2, . . . , J,

then

   x[l − m2^J] ←→ X^{(j)}[2(k − m2^{J−j}) + 1].

PROOF
If y[l] = x[l − m2^J], then its transform is, following (3.3.14),

   Y^{(j)}[2k + 1] = ⟨h1^{(j)}[2^j k − l], x[l − m2^J]⟩
                   = ⟨h1^{(j)}[2^j k − l′ − m2^J], x[l′]⟩
                   = X^{(j)}[2(k − m2^{J−j}) + 1].

Very similarly, one proves for the lowpass channel that, when x[l] produces X^{(J)}[2k], then x[l − m2^J] leads to X^{(J)}[2(k − m)].

Orthogonality  We have mentioned before that g0^{(J)}[n] and g1^{(j)}[n], j = 1, . . . , J, with appropriate shifts, form an orthonormal family of functions (see [274]). This stems from the fact that we have used two-channel orthogonal filter banks, for which we know that

   ⟨gi[n − 2k], gj[n − 2l]⟩ = δ[i − j] δ[k − l].

PROPOSITION 3.15
In a discrete-time wavelet series expansion, the following orthogonality relations hold:

   ⟨g0^{(J)}[n − 2^J k], g0^{(J)}[n − 2^J l]⟩ = δ[k − l],   (3.3.15)
   ⟨g1^{(j)}[n − 2^j k], g1^{(i)}[n − 2^i l]⟩ = δ[i − j] δ[k − l],   (3.3.16)
   ⟨g0^{(J)}[n − 2^J k], g1^{(j)}[n − 2^j l]⟩ = 0.   (3.3.17)

PROOF
We will here prove only (3.3.15), while (3.3.16) and (3.3.17) are left as an exercise to the reader (see Problem 3.15). We prove (3.3.15) by induction.
It will be convenient to work with the z-transform of the autocorrelation of the filter G0^{(j)}(z), which we call P^{(j)}(z) and which equals

   P^{(j)}(z) = G0^{(j)}(z) G0^{(j)}(z^{−1}).

Recall that because of the orthogonality of g0[n] with respect to even shifts, we have that

   P^{(1)}(z) + P^{(1)}(−z) = 2,

or, equivalently, that the polyphase decomposition of P^{(1)}(z) is of the form

   P^{(1)}(z) = 1 + z P_1^{(1)}(z^2).

This is the initial step for our induction. Now, assume that g0^{(j)}[n] is orthogonal to its translates by 2^j. Therefore, the polyphase decomposition of its autocorrelation can be written as

   P^{(j)}(z) = 1 + Σ_{i=1}^{2^j − 1} z^i P_i^{(j)}(z^{2^j}).

Now, because of the recursion (3.3.3), the autocorrelation of G^{(j+1)}(z) equals

   P^{(j+1)}(z) = P^{(j)}(z) P^{(1)}(z^{2^j}).

Expanding both terms on the right-hand side, we get

   P^{(j+1)}(z) = (1 + Σ_{i=1}^{2^j − 1} z^i P_i^{(j)}(z^{2^j})) (1 + z^{2^j} P_1^{(1)}(z^{2^{j+1}})).

We need to verify that the 0th polyphase component of P^{(j+1)}(z) is equal to 1, or that the coefficients of z raised to powers that are multiples of 2^{j+1} are 0. Out of the four products that appear when multiplying out the above right-hand side, only the product involving the polyphase components needs to be considered,

   Σ_{i=1}^{2^j − 1} z^i P_i^{(j)}(z^{2^j}) · z^{2^j} P_1^{(1)}(z^{2^{j+1}}).

The powers of z appearing in the above product are of the form l = i + k2^j + 2^j + m2^{j+1}, where i = 1, . . . , 2^j − 1 and k, m ∈ Z. Thus, l cannot be a multiple of 2^{j+1}, and we have shown that

   P^{(j+1)}(z) = 1 + Σ_{i=1}^{2^{j+1} − 1} z^i P_i^{(j+1)}(z^{2^{j+1}}),

thus completing the proof.

Parseval’s Equality  Orthogonality together with completeness (which follows from perfect reconstruction) leads to conservation of energy, also called Bessel’s or Parseval’s equality, that is,

   Σ_n |x[n]|^2 = Σ_{k∈Z} ( |X^{(J)}[2k]|^2 + Σ_{j=1}^{J} |X^{(j)}[2k + 1]|^2 ).

3.3.3 Multiresolution Interpretation of Octave-Band Filter Banks


The two-channel filter banks studied in Sections 3.1 and 3.2 have the property
of splitting the signal into two lower-resolution versions. One was a lowpass or
coarse resolution version, and the other was a highpass version of the input. Then,
in this section, we have applied this decomposition recursively on the lowpass or
coarse version. This leads to a hierarchy of resolutions, also called a multiresolution
decomposition.
Actually, in computer vision as well as in image processing, looking at signals at
various resolutions has been around for quite some time. In 1983, Burt and Adelson
introduced the pyramid coding technique, that builds up a signal from its lower-
resolution version plus a sequence of details (see also Section 3.5.2) [41]. In fact, one
of the first links between wavelet theory and signal processing was Daubechies’ [71]
and Mallat’s [180] recognition that the scheme of Burt and Adelson is closely related
to wavelet theory and multiresolution analysis, and that filter banks or subband
coding schemes can be used for the computation of wavelet decompositions. While
these relations will be further explored in Chapter 4 for the continuous-time wavelet
series, here we study the discrete-time wavelet series or its octave-band filter bank
realization. This discrete-time multiresolution analysis was studied by Rioul [240].
Since this is a formalization of earlier concepts, we need some definitions. First we introduce the concept of embedded closed spaces. We will say that the space V0 is the space of all square-summable sequences, that is,

   V0 = l2(Z).   (3.3.18)

Then, a multiresolution analysis consists of a sequence of embedded closed spaces

   VJ ⊂ · · · ⊂ V2 ⊂ V1 ⊂ V0.   (3.3.19)

It is obvious that, due to (3.3.18–3.3.19),

   ∪_{j=0}^{J} Vj = V0 = l2(Z).

The orthogonal complement of Vj+1 in Vj will be denoted by Wj+1, and thus

   Vj = Vj+1 ⊕ Wj+1,   (3.3.20)

with Vj+1 ⊥ Wj+1, where ⊕ denotes the direct sum (see Section 2.2.2). Assume that there exists a sequence g0[n] ∈ V0 such that

   {g0[n − 2k]}_{k∈Z}

is a basis for V1. Then, it can be shown that there exists a sequence g1[n] ∈ V0 such that

   {g1[n − 2k]}_{k∈Z}

is a basis for W1. Such a sequence is given by

   g1[n] = (−1)^n g0[−n + 1].   (3.3.21)

In other words, and having in mind (3.3.20), {g0[n − 2k], g1[n − 2k]}_{k∈Z} is an orthonormal basis for V0. This splitting can be iterated on V1. Therefore, one can see that V0 can be decomposed in the following manner:

   V0 = W1 ⊕ W2 ⊕ · · · ⊕ WJ ⊕ VJ,   (3.3.22)

by simply iterating the decomposition J times.


Now, consider the octave-band filter bank in Figure 3.7(a). The analysis filters
are the time-reversed versions of g0 [n] and g1 [n]. Therefore, the octave-band analy-
sis filter bank computes the inner products with the basis functions for W1 , W2 , . . . ,
WJ and VJ .
In Figure 3.7(b), after convolution with the synthesis filters, we get the orthog-
onal projection of the input signal onto W1 , W2 , . . . , WJ and VJ . That is, the input
is decomposed into a very coarse resolution (which exists in VJ ) and added details
(which exist in the spaces Wi , i = 1, . . . , J). By (3.3.22), the sum of the coarse
version and all the added details yields back the original signal; a result that follows
from the perfect reconstruction property of the analysis/synthesis system as well.
We will call Vj ’s approximation spaces and Wj ’s detail spaces. Then, the pro-
cess of building up the signal is intuitively very clear — one starts with its lower-
resolution version belonging to VJ , and adds up the details until the final resolution
is reached.

Figure 3.10 Ideal division of the spectrum by the discrete-time wavelet series using sinc filters. Note that the spectra are symmetric around zero. Division into Vi spaces (note how Vi ⊂ Vi−1), and resulting Wi spaces. (Actually, Vj and Wj are of height 2^{j/2}, so they have unit norm.)

It will be seen in Chapter 4 that the decomposition into approximation and


detail spaces is very similar to the multiresolution framework for continuous-time
signals. However, there are a few important distinctions. First, in the discrete-time
case, there is a “finest” resolution, associated with the space V0 , that is, one cannot
refine the signal further. Then, we are considering a finite number of decomposition
steps J, thus leading to a “coarsest” resolution, associated with VJ . Finally, in
the continuous-time case, a simple function and its scales and translates are used,
whereas here, various iterated filters are involved (which, under certain conditions,
resemble scales of each other as we will see).

Example 3.9 Sinc Case


In the sinc case, introduced in Section 3.1.3, it is very easy to spot the multiresolution
flavor. Since the filters used are ideal lowpass/highpass filters, respectively, at each stage
the lowpass filter would halve the coarse space, while the highpass filter would take care
of the difference between them. The above argument is best seen in Figure 3.10. The
original signal (discrete in time and thus its spectrum occupies (−π, π)) is lowpass filtered
using the ideal half-band filter. As a result, starting from the space V0 , we have derived
a lower-resolution signal by halving V0 , resulting in V1 . Then, an even coarser version is
obtained by using the same process, resulting in the space V2 . Using the above process
repeatedly, one obtains the final coarse (approximation) space VJ . Along the way we have
created difference spaces, Wi , as well.
For example, the space V1 occupies the part (−π/2, π/2) in the spectrum, while W1
will occupy (−π, −π/2) ∪ (π/2, π). It can be seen that g0 [n] as defined in (3.1.23) with its
even shifts, will constitute a basis for V1 , while g1 [n] following (3.3.21) constitutes a basis
for W1 . In other words, g0 [n], g1 [n] and their even shifts would constitute a basis for the
original (starting) space V0 (l2 (Z)).

Figure 3.11 All possible combinations of tree-structured filter banks of depth 2. Symbolically, a fork stands for a two-channel filter bank with the lowpass on the bottom. From left to right: the full tree (STFT-like), the octave-band tree (wavelet), the tree where only the highpass is split further, the two-band tree, and finally the nil-tree (no split at all). Note that all smaller trees are pruned versions of the full tree.

Because we deal with ideal filters, there is an obvious frequency interpretation. How-
ever, one has to be careful with the boundaries between intervals. With our definition of
g0 [n] and g1 [n], cos((π/2)n)10 belongs to V1 while sin((π/2)n) belongs to W1 .

3.3.4 General Tree-Structured Filter Banks and Wavelet Packets


A major part of this section was devoted to octave-band, tree-structured filter
banks. It is easy to generalize that discussion to arbitrary tree structures, starting
from a single two-channel filter bank, all the way through the full grown tree of
depth J. Consider, for example, Figure 3.11. It shows all possible tree structures
of depth less or equal to two.
Note in particular the full tree, which yields a linear division of the spectrum sim-
ilar to the short-time Fourier transform, and the octave-band tree, which performs
a two-step discrete-time wavelet series expansion. Such arbitrary tree structures
were recently introduced as a family of orthonormal bases for discrete-time signals,
and are known under the name of wavelet packets [63]. The potential of wavelet
packets lies in the capacity to offer a rich menu of orthonormal bases, from which
the “best” one can be chosen (“best” according to a particular criterion). This
will be discussed in more detail in Chapter 7 when applications in compression are
considered. What we will do here, is define the basis functions and write down
the appropriate orthogonality relations; however, since the octave-band case was
discussed in detail, the proofs will be omitted (for a proof, see [274]).
10 To be precise, since cos((π/2)n) is not of finite energy and does not belong to l2(Z), one needs to define windowed versions of unit norm and take appropriate limits.

Denote the equivalent filters by gi^{(j)}[n], i = 0, . . . , 2^j − 1. In other words, gi^{(j)} is the ith equivalent filter going through one of the possible paths of length j. The ordering is somewhat arbitrary, and we will choose the one corresponding to a full tree with a lowpass in the lower branch of each fork, and start numbering from the bottom.

Example 3.10
Let us find all equivalent filters in Figure 3.11, or the filters corresponding to depth-1 and depth-2 trees. Since we will be interested in the basis functions, we consider the synthesis filter banks. For simplicity, we do it in z-domain:

   G0^{(1)}(z) = G0(z),   G1^{(1)}(z) = G1(z),
   G0^{(2)}(z) = G0(z) G0(z^2),   G1^{(2)}(z) = G0(z) G1(z^2),   (3.3.23)
   G2^{(2)}(z) = G1(z) G0(z^2),   G3^{(2)}(z) = G1(z) G1(z^2).   (3.3.24)

Note that with the ordering chosen in (3.3.23–3.3.24), increasing index does not always correspond to increasing frequency. It can be verified that for ideal filters, G2^{(2)}(e^{jω}) chooses the range [3π/4, π], while G3^{(2)}(e^{jω}) covers the range [π/2, 3π/4] (see Problem 3.16). Besides the identity basis, which corresponds to the no-split situation, we have four possible orthonormal bases, corresponding to the four trees in Figure 3.11. Thus, we have a family W = {W0, W1, W2, W3, W4}, where W4 is simply {δ[n − k]}_{k∈Z}.

   W0 = {g0^{(2)}[n − 2^2 k], g1^{(2)}[n − 2^2 k], g2^{(2)}[n − 2^2 k], g3^{(2)}[n − 2^2 k]}_{k∈Z}

corresponds to the full tree,

   W1 = {g1^{(1)}[n − 2k], g0^{(2)}[n − 2^2 k], g1^{(2)}[n − 2^2 k]}_{k∈Z}

corresponds to the octave-band tree,

   W2 = {g0^{(1)}[n − 2k], g2^{(2)}[n − 2^2 k], g3^{(2)}[n − 2^2 k]}_{k∈Z}

corresponds to the tree with the highband split twice, and

   W3 = {g0^{(1)}[n − 2k], g1^{(1)}[n − 2k]}_{k∈Z}

is simply the usual two-channel filter bank basis.

This small example should have given the intuition behind orthonormal bases generated from tree-structured filter banks. In the general case, with filter banks of depth J, it can be shown that, counting the no-split tree, the number of orthonormal bases satisfies

   M_J = M_{J−1}^2 + 1.   (3.3.25)
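For instance, the recursion (3.3.25), with M_0 = 1 for the stand-alone identity basis, grows extremely fast, as this short Python check (ours) shows:

    # M_J = M_{J-1}^2 + 1, M_0 = 1: number of bases for trees of depth <= J.
    M = 1
    for J in range(1, 7):
        M = M * M + 1
        print(J, M)   # 2, 5, 26, 677, 458330, 210066388901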
Among this myriad of bases, there are the STFT-like basis, given by

   W0 = {g0^{(J)}[n − 2^J k], . . . , g_{2^J−1}^{(J)}[n − 2^J k]}_{k∈Z},   (3.3.26)

and the wavelet-like basis,

   W1 = {g1^{(1)}[n − 2k], g1^{(2)}[n − 2^2 k], . . . , g1^{(J)}[n − 2^J k], g0^{(J)}[n − 2^J k]}_{k∈Z}.   (3.3.27)

It can be shown that the sets of basis functions in (3.3.26) and (3.3.27), as well as
in all other bases generated by the filter bank tree, are orthonormal (for example,
along the lines of the proof in the discrete-time wavelet series case). However, this
would be quite cumbersome. A more immediate proof is sketched here. Note that
we have a perfect reconstruction system by construction, and that the synthesis
and the analysis filters are related by time reversal. That is, the inverse operator
of the analysis filter bank (whatever its particular structure) is its transpose, or
equivalently, the overall filter bank is orthonormal. Therefore, the impulse responses
of all equivalent filters and their appropriate shifts form an orthonormal basis for
l2 (Z).
It is interesting to consider the time-frequency analysis performed by various
filter banks. This is shown schematically in Figure 3.12 for three particular cases
of binary trees. Note the different trade-offs in time and frequency resolutions.
Figure 3.13 shows a dynamic time-frequency analysis, where the time and fre-
quency resolutions are modified as time evolves. This is achieved by modifying the
frequency split on the fly [132], and can be used for signal compression as discussed
in Section 7.3.4.

3.4 MULTICHANNEL FILTER BANKS


In the previous section, we have seen how one can obtain multichannel filter banks
by cascading two-channel ones. Although this is a very easy way of achieving
the goal, one might be interested in designing multichannel filter banks directly.
Therefore, in this section we will present a brief analysis of N-channel filter banks,
as given in Figure 3.14. We start the section by discussing two special cases which
are of interest in applications: the first, block transforms, and the second, lapped
orthogonal transforms. Then, we will formalize our treatment of N-channel filter
banks (time-, modulation- and polyphase-domain analyses). Finally, a particular
class of multichannel filter banks, where all filters are obtained by modulating a
single, prototype filter — called modulated filter banks — is presented.

3.4.1 Block and Lapped Orthogonal Transforms


Block Transforms Block transforms, which are used quite frequently in signal
compression (for example, the discrete cosine transform), are a special case of filter
banks with N channels, filters of length N , and downsampling by N . Moreover,
when such transforms are unitary or orthogonal, they are the simplest examples
of orthogonal (also called paraunitary or lossless) N-channel filter banks. Let us

Figure 3.12 Time-frequency analysis achieved by different binary subband trees. The trees are on the bottom, the time-frequency tilings on top. (a) Full tree or STFT. (b) Octave-band tree or wavelet series. (c) Arbitrary tree or one possible wavelet packet.

analyze such filter banks in a manner similar to Section 3.2. Therefore, the channel signals, after filtering and sampling, can be expressed as

   [    ⋮       ]   [ ⋱             ] [  ⋮   ]
   [ y_0[0]     ]   [               ] [ x[0] ]
   [    ⋮       ] = [ ⋯  A_0   0  ⋯ ] [ x[1] ],   (3.4.1)
   [ y_{N−1}[0] ]   [ ⋯   0   A_0 ⋯ ] [  ⋮   ]
   [ y_0[1]     ]   [             ⋱ ]
   [    ⋮       ]
   [ y_{N−1}[1] ]
   [    ⋮       ]

Figure 3.13 Dynamic time-frequency analysis achieved by concatenating the analyses from Figure 3.12. The tiling and the evolving tree are shown.

Figure 3.14 N-channel analysis/synthesis filter bank with critical downsampling by N.

where the block A_0 is equal to (similarly to (3.2.3))

   A_0 = [ h_0[N−1]      ⋯  h_0[0]     ]   [ g_0[0]      ⋯  g_0[N−1]     ]
         [    ⋮              ⋮         ] = [    ⋮             ⋮          ].   (3.4.2)
         [ h_{N−1}[N−1]  ⋯  h_{N−1}[0] ]   [ g_{N−1}[0]  ⋯  g_{N−1}[N−1] ]
The second equality follows since the transform is unitary, that is,

   A_0 A_0^T = A_0^T A_0 = I.   (3.4.3)

We can see that (3.4.2–3.4.3) imply that

   ⟨hi[kN − n], hj[lN − n]⟩ = ⟨gi[n − kN], gj[n − lN]⟩ = δ[i − j] δ[k − l],


that is, we obtained the orthonormality relations for this case. Denoting ϕ_{kN+i}[n] = gi[n − kN], we have that the set of basis functions {ϕ_{kN+i}[n]} = {g0[n − kN], g1[n − kN], . . . , g_{N−1}[n − kN]}, with i = 0, . . . , N − 1, and k ∈ Z, is an orthonormal basis for l2(Z).
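A familiar concrete instance is the discrete cosine transform. The Python sketch below (ours; the book does not give this code) builds the orthonormal DCT-II matrix as the block A_0 of an 8-channel filter bank and checks the unitarity condition (3.4.3):

    import numpy as np

    N = 8
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N)
    A0 = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    A0[0, :] /= np.sqrt(2.0)          # rescale the DC row for orthonormality

    print(np.max(np.abs(A0 @ A0.T - np.eye(N))))   # ~1e-15: (3.4.3) holds

    # Block-transforming a signal is running the N-channel analysis bank:
    x = np.random.randn(4 * N)
    y = A0 @ x.reshape(-1, N).T       # column k holds y_0[k], ..., y_{N-1}[k]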

Lapped Orthogonal Transforms Lapped orthogonal transforms (LOT’s), intro-


duced by Cassereau [43] and Malvar [189, 188] are a class of N-channel unitary filter
banks where some additional constraints are imposed. In particular, the length of
the filters is restricted to L = 2N , or twice the number of channels (or down-
sampling rate), and thus, it is easy to interpret LOT’s as an extension of block
transforms where neighboring filters overlap. Usually, the number of channels is
even and sometimes they are all obtained from a single prototype window by mod-
ulation. In this case, fast algorithms taking advantage of the modulation relation
between the filters reduce the order N 2 operations per N outputs of the filter bank
to cN log2 N (see also Chapter 6). This computational efficiency, as well as the
simplicity and close relationship to block transforms, has made LOT’s quite pop-
ular. A related class of filter banks, called time-domain aliasing cancellation filter
banks, studied by Princen and Bradley [229] can be seen as another interpretation
of LOT’s. For an excellent treatment of LOT’s, see the book by Malvar [188], to
which we refer for more details.
Let us examine the lapped orthogonal transform. First, the fact that the filter length is 2N means that the time-domain matrix, analogous to the one in (3.4.1), has the following form:

   T_a = [ ⋱   ⋱                 ]
         [ ⋯  A_0  A_1   0   0 ⋯ ]
         [ ⋯   0   A_0  A_1  0 ⋯ ],   (3.4.4)
         [             ⋱    ⋱    ]

that is, it has a double block diagonal. The fact that T_a is orthogonal, or T_a T_a^T = T_a^T T_a = I, yields

   A_0^T A_0 + A_1^T A_1 = A_0 A_0^T + A_1 A_1^T = I,   (3.4.5)

as well as

   A_0^T A_1 = A_1^T A_0 = 0,   A_0 A_1^T = A_1 A_0^T = 0.   (3.4.6)
The property (3.4.6) is called orthogonality of tails since overlapping tails of the basis
functions are orthogonal to each other. Note that these conditions characterize
nothing but an N-channel orthogonal filter bank, with filters of length 2N and
downsampling by N . To obtain certain classes of LOT’s, one imposes additional
constraints. For example, in Section 3.4.3, we will consider a cosine modulated
filter bank.
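As a preview, the following Python sketch (our addition) builds one such cosine modulated LOT, namely Malvar's modulated lapped transform (MLT) with the sine window, and checks the conditions (3.4.5)–(3.4.6) numerically:

    import numpy as np

    N = 8                                    # channels; filter length L = 2N
    n = np.arange(2 * N)
    w = np.sin((n + 0.5) * np.pi / (2 * N))  # sine window
    k = np.arange(N).reshape(-1, 1)
    # MLT basis functions (rows): modulated versions of the single window w.
    P = np.sqrt(2.0 / N) * w * np.cos((2 * n + 1 + N) * (2 * k + 1) * np.pi / (4 * N))

    A0, A1 = P[:, :N], P[:, N:]              # the two N x N blocks of T_a
    print(np.max(np.abs(A0 @ A0.T + A1 @ A1.T - np.eye(N))))  # ~1e-16: (3.4.5)
    print(np.max(np.abs(A0 @ A1.T)))                          # ~1e-16: (3.4.6)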

Generalizations What we have seen in these two simple cases, is how to obtain
N-channel filter banks with filters of length N (block transforms) and filters of
length 2N (lapped orthogonal transforms). It is obvious that by allowing longer
filters, or more blocks Ai in (3.4.4), we can obtain general N-channel filter banks.

3.4.2 Analysis of Multichannel Filter Banks


The analysis of N-channel filter banks is in many ways analogous to that of two-
channel filter banks; therefore, the treatment here will be fairly brisk, with refer-
ences to Section 3.2.

Time-Domain Analysis  We can proceed here exactly as in Section 3.2.1. Thus, we can say that the channel outputs (or transform coefficients) in Figure 3.14 can be expressed as in (3.2.1),

   y = X = T_a x,

where the vector of transform coefficients is X, with X[Nk + i] = yi[k]. The analysis matrix T_a is given as in (3.2.2) with blocks A_i of the form

   A_i = [ h_0[NK − 1 − Ni]      ⋯  h_0[NK − N − Ni]     ]
         [       ⋮                       ⋮              ].
         [ h_{N−1}[NK − 1 − Ni]  ⋯  h_{N−1}[NK − N − Ni] ]

When the filters are of length L = KN, there are K blocks A_i of size N × N each. Similarly to (3.2.4–3.2.5), we see that the basis functions of the first basis, corresponding to the analysis, are

   ϕ_{Nk+i}[n] = hi[Nk − n].

Defining the synthesis matrix as in (3.2.7), we obtain the basis functions of the dual basis,

   ϕ̃_{Nk+i}[n] = gi[n − Nk],

and they satisfy the following biorthogonality relations:

   ⟨ϕ_k[n], ϕ̃_l[n]⟩ = δ[k − l],

which can be expressed in terms of analysis/synthesis matrices as

   T_s T_a = I.

As was done in Section 3.2, we can define single operators for each branch. If the operator H_i represents filtering by hi followed by downsampling by N, its matrix representation is

   H_i = [ ⋱                                       ]
         [ ⋯  h_i[L−1]  ⋯  h_i[L−N]  h_i[L−N−1]  ⋯ ]
         [ ⋯     0      ⋯     0       h_i[L−1]   ⋯ ].
         [                                       ⋱ ]

Defining G_i similarly to H_i (except that there is no time reversal), the output of the system can then be written as

   x̂ = Σ_{i=0}^{N−1} G_i^T H_i x.

Then, the condition for perfect reconstruction is

   Σ_{i=0}^{N−1} G_i^T H_i = I.

We leave the details and proofs of the above relationships as an exercise (Problem
3.21), since they are simple extensions of the two-channel case seen in Section 3.2.

Modulation-Domain Analysis  Let us turn our attention to filter banks represented in the modulation domain. We write directly the expressions we need in the z-domain. One can verify that downsampling a signal x[n] by N followed by upsampling by N (that is, replacing x[n], n mod N ≠ 0, by 0) produces a signal y[n] with z-transform Y(z) equal to

   Y(z) = (1/N) Σ_{i=0}^{N−1} X(W_N^i z),   W_N = e^{−j2π/N},   j = √−1,

because of the orthogonality of the roots of unity. Then, the output of the system in Figure 3.14 becomes, in a similar fashion to (3.2.14),

   X̂(z) = (1/N) g^T(z) H_m(z) x_m(z),

where g^T(z) = (G0(z) . . . G_{N−1}(z)) is the vector containing the synthesis filters, x_m(z) = (X(z) . . . X(W_N^{N−1} z))^T, and the ith line of H_m(z) is equal to (Hi(z) . . . Hi(W_N^{N−1} z)), i = 0, . . . , N − 1. Then, similarly to the two-channel case, to cancel aliasing, g^T H_m has to have all elements equal to zero, except for

the first one. To obtain perfect reconstruction, this only nonzero element has to be
equal to a scaled pure delay.
As in the two-channel case, it can be shown that the perfect reconstruction
condition is equivalent to the system being biorthogonal, as given earlier. The
proof is left as an exercise for the reader (Problem 3.21). For completeness, let us
define Gm (z) as the matrix with the ith row equal to

( G0 (WNi z) G1 (WNi z) ... GN −1 (WNi z) ) .

Polyphase-Domain Analysis  The gist of the polyphase analysis of two-channel filter banks downsampled by 2 was to expand signals and filter impulse responses into even- and odd-indexed components (together with some adequate phase terms). Quite naturally, in the N-channel case with downsampling by N, there will be N polyphase components. We follow the same definitions as in Section 3.2.1 (the choice of the phase in the polyphase component is arbitrary, but consistent).
Thus, the input signal can be decomposed into its polyphase components as

   X(z) = Σ_{j=0}^{N−1} z^{−j} Xj(z^N),

where

   Xj(z) = Σ_{n=−∞}^{∞} x[nN + j] z^{−n}.

Define the polyphase vector as

   x_p(z) = (X0(z)  X1(z)  . . .  X_{N−1}(z))^T.

The polyphase components of the synthesis filter gi are defined similarly, that is,

   Gi(z) = Σ_{j=0}^{N−1} z^{−j} Gij(z^N),

where

   Gij(z) = Σ_{n=−∞}^{∞} gi[nN + j] z^{−n}.

The polyphase matrix of the synthesis filter bank is given by

   [G_p(z)]_{ji} = Gij(z),


where the implicit transposition should be noticed. Up to a phase factor and a transpose, the analysis filter bank is decomposed similarly. The filter is written as

   Hi(z) = Σ_{j=0}^{N−1} z^j Hij(z^N),   (3.4.7)

where

   Hij(z) = Σ_{n=−∞}^{∞} hi[nN − j] z^{−n}.   (3.4.8)

The analysis polyphase matrix is then defined as follows:

   [H_p(z)]_{ij} = Hij(z).

For example, the vector of channel signals,

   y(z) = (y0(z)  y1(z)  . . .  y_{N−1}(z))^T,

can be compactly written as

   y(z) = H_p(z) x_p(z).

Putting it all together, the output of the analysis/synthesis filter bank in Figure 3.14 can be written as

   X̂(z) = (1  z^{−1}  z^{−2}  . . .  z^{−N+1}) · G_p(z^N) · H_p(z^N) · x_p(z^N).

Similarly to the two-channel case, we can define the transfer function matrix T p (z) =
Gp (z)H p (z). Then, the same results hold as in the two-channel case. Here, we just
state them (the proofs are N-channel counterparts of the two-channel ones).

THEOREM 3.16 Multichannel Filter Banks


(a) Aliasing in a one-dimensional system is cancelled if and only if the trans-
fer function matrix is pseudo-circulant [311].

(b) Given an analysis filter bank downsampled by N with polyphase matrix


H p (z), alias-free reconstruction is possible if and only if the normal rank
of H p (z) is equal to N .

(c) Given a critically sampled FIR analysis filter bank, perfect reconstruction
with FIR filters is possible if and only if det(H p (z)) is a pure delay.
3.4. MULTICHANNEL FILTER BANKS 171

Note that the modulation and polyphase representations are related via the Fourier
matrix. For example, one can verify that
$$\boldsymbol{x}_p(z^N) = \frac{1}{N} \begin{pmatrix} 1 & & & \\ & z & & \\ & & \ddots & \\ & & & z^{N-1} \end{pmatrix} \boldsymbol{F}\, \boldsymbol{x}_m(z), \qquad (3.4.9)$$

where $[\boldsymbol{F}]_{kl} = W_N^{kl} = e^{-j(2\pi/N)kl}$. Similar relationships hold between $\boldsymbol{H}_m(z)$, $\boldsymbol{G}_m(z)$
and H p (z), Gp (z), respectively (see Problem 3.22). The important point to note
is that modulation and polyphase matrices are related by unitary operations (such
as F and delays as in (3.4.9)).

Orthogonal Multichannel FIR Filter Banks Let us now consider the particular
but important case when the filter bank is unitary or orthogonal. This is an ex-
tension of the discussion in Section 3.2.3 to the N-channel case. The idea is to
implement an orthogonal transform using an N-channel filter bank, or in other
words, we want the following set:
$$\{g_0[n - Nk],\ \ldots,\ g_{N-1}[n - Nk]\}, \qquad k \in \mathbb{Z},$$
to be an orthonormal basis for $l_2(\mathbb{Z})$. Then
$$\langle g_i[n - Nk],\, g_j[n - Nl]\rangle = \delta[i - j]\, \delta[l - k]. \qquad (3.4.10)$$
Since in the orthogonal case analysis and synthesis filters are identical up to a time
reversal, (3.4.10) holds for hi [N k − l] as well. By using (2.5.19), (3.4.10) can be
expressed in z-domain as

$$\sum_{k=0}^{N-1} G_i(W_N^k z)\, G_j(W_N^{-k} z^{-1}) = N\, \delta[i - j], \qquad (3.4.11)$$
or
$$\boldsymbol{G}_{m*}^T(z^{-1})\, \boldsymbol{G}_m(z) = N \boldsymbol{I},$$
where the subscript ∗ stands for conjugation of the coefficients but not of z (this is
necessary since Gm (z) has complex coefficients). Thus, as in the two-channel case,
having an orthogonal transform is equivalent to having a paraunitary modulation
matrix. Unlike the two-channel case, however, not all of the filters are obtained
from a single prototype filter.
Since modulation and polyphase matrices are related, it is easy to check that
having a paraunitary modulation matrix is equivalent to having a paraunitary
polyphase matrix, that is
$$\boldsymbol{G}_{m*}^T(z^{-1})\, \boldsymbol{G}_m(z) = N \boldsymbol{I} \iff \boldsymbol{G}_p^T(z^{-1})\, \boldsymbol{G}_p(z) = \boldsymbol{I}. \qquad (3.4.12)$$

Finally, in the time domain,
$$\boldsymbol{G}_i \boldsymbol{G}_j^T = \delta[i - j]\, \boldsymbol{I}, \qquad i, j = 0, \ldots, N-1,$$
or
$$\boldsymbol{T}_a^T\, \boldsymbol{T}_a = \boldsymbol{I}.$$
The above relations lead to a direct extension of Theorem 3.8, where the particular
case N = 2 was considered.
Thus, according to (3.4.12), designing an orthogonal filter bank with N channels
reduces to finding N × N paraunitary matrices. Just as in the two-channel case,
where we saw a lattice realization of orthogonal filter banks (see (3.2.60)), N ×
N paraunitary matrices can be parametrized in terms of cascades of elementary
matrices (2×2 rotations and delays). Such parametrizations have been investigated
by Vaidyanathan, and we refer to his book [308] for a thorough treatment. An
overview can be found in Appendix 3.A.2. As an example, we will see how to
construct three-channel paraunitary filter banks.

Example 3.11
We use the factorization given in Appendix 3.A.2, (3.A.8). Thus, we can express the 3 × 3
polyphase matrix as
$$\boldsymbol{G}_p(z) = \boldsymbol{U}_0 \prod_{i=1}^{K-1} \left[ \begin{pmatrix} z^{-1} & & \\ & 1 & \\ & & 1 \end{pmatrix} \boldsymbol{U}_i \right],$$

where
$$\boldsymbol{U}_0 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_{00} & -\sin\alpha_{00} \\ 0 & \sin\alpha_{00} & \cos\alpha_{00} \end{pmatrix} \begin{pmatrix} \cos\alpha_{01} & 0 & -\sin\alpha_{01} \\ 0 & 1 & 0 \\ \sin\alpha_{01} & 0 & \cos\alpha_{01} \end{pmatrix} \begin{pmatrix} \cos\alpha_{02} & -\sin\alpha_{02} & 0 \\ \sin\alpha_{02} & \cos\alpha_{02} & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

and $\boldsymbol{U}_i$ are given by
$$\boldsymbol{U}_i = \begin{pmatrix} \cos\alpha_{i0} & -\sin\alpha_{i0} & 0 \\ \sin\alpha_{i0} & \cos\alpha_{i0} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_{i1} & -\sin\alpha_{i1} \\ 0 & \sin\alpha_{i1} & \cos\alpha_{i1} \end{pmatrix}.$$

The degrees of freedom are given by the angles $\alpha_{ij}$. To obtain the three analysis filters, we
upsample the polyphase matrix, and thus
$$[\,G_0(z)\ \ G_1(z)\ \ G_2(z)\,] = [\,1\ \ z^{-1}\ \ z^{-2}\,]\ \boldsymbol{G}_p(z^3).$$

To design actual filters, one could minimize an objective function such as the one given in [306],
where the sum of the stopband energies of all filters was minimized.
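A quick numerical sanity check of the factorization in Example 3.11 (a sketch of ours, with randomly drawn angles $\alpha_{ij}$): evaluating the cascade at points on the unit circle, the resulting matrix must be unitary there, which for these real-coefficient factors is precisely paraunitarity.

import numpy as np

def rot3(theta, i, j):
    """3x3 Givens rotation by theta in the (i, j) plane."""
    R = np.eye(3)
    R[i, i] = R[j, j] = np.cos(theta)
    R[i, j] = -np.sin(theta)
    R[j, i] = np.sin(theta)
    return R

rng = np.random.default_rng(2)
K = 4
# U0 and U_i mirror the rotation products above, with random angles.
U0 = rot3(rng.uniform(0, 2*np.pi), 1, 2) @ rot3(rng.uniform(0, 2*np.pi), 0, 2) \
     @ rot3(rng.uniform(0, 2*np.pi), 0, 1)
Us = [rot3(rng.uniform(0, 2*np.pi), 0, 1) @ rot3(rng.uniform(0, 2*np.pi), 1, 2)
      for _ in range(K - 1)]

def Gp(z):
    """Evaluate the cascade U0 * prod_i [diag(z^-1, 1, 1) U_i] at a point z."""
    M = U0.astype(complex)
    for Ui in Us:
        M = M @ np.diag([1.0 / z, 1.0, 1.0]) @ Ui
    return M

for w in np.linspace(0, 2*np.pi, 16):
    M = Gp(np.exp(1j * w))
    assert np.allclose(M.conj().T @ M, np.eye(3))   # unitary on |z| = 1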

It is worthwhile mentioning that orthogonal filter banks with more than two
channels have greater design freedom. For example, it is possible to obtain orthogonal
linear phase FIR solutions [275, 321], which is impossible for two channels (see
Appendix 3.A.2).

3.4.3 Modulated Filter Banks


We will now examine a particular class of N-channel filter banks, namely modulated
filter banks. The name stems from the fact that all the filters in the analysis bank
are obtained by modulating a single prototype filter. If we impose orthogonality,
the synthesis filters will obviously be modulated as well. The first class
we consider imitates the short-time Fourier transform (STFT), but in the discrete-
time domain. The second one, cosine modulated filter banks, is an interesting
counterpart to the STFT; when the length of the filters is restricted to 2N, it
is an example of a modulated LOT.

Short-Time Fourier Transform in the Discrete-Time Domain The short-time
Fourier or Gabor transform [204, 226] is a very popular tool for nonstationary
signal analysis (see Section 2.6.3). It has an immediate filter bank interpretation.
Assume a window function hpr [n] with a corresponding z-transform Hpr (z). This
window function is a prototype lowpass filter with a bandwidth of 2π/N , which is
then modulated evenly over the frequency spectrum using consecutive powers of
the N th root of unity
$$H_i(z) = H_{pr}(W_N^i z), \qquad i = 0, \ldots, N-1, \quad W_N = e^{-j2\pi/N}, \qquad (3.4.13)$$
or
$$h_i[n] = W_N^{-in}\, h_{pr}[n]. \qquad (3.4.14)$$
That is, if Hpr (ejω ) is a lowpass filter centered around ω = 0, then Hi (ejω ) is a
bandpass filter centered around ω = (i2π)/N . Note that the prototype window is
usually real, but the bandpass filters are complex.
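The modulation (3.4.13)-(3.4.14) is easily illustrated numerically. In the sketch below (our own; the Hamming window is an arbitrary stand-in for the prototype), the magnitude response of channel i indeed peaks at $\omega = 2\pi i/N$.

import numpy as np

N, Lf = 8, 64
w = np.hamming(Lf)
h_pr = w / np.linalg.norm(w)              # an arbitrary lowpass prototype
n = np.arange(Lf)
W = np.exp(-2j * np.pi / N)

for i in range(N):
    h_i = W ** (-i * n) * h_pr            # (3.4.14): complex bandpass filter
    H_i = np.fft.fft(h_i, 1024)
    peak = np.argmax(np.abs(H_i)) / 1024  # peak frequency in cycles/sample
    # circular distance between the peak and i/N is small
    assert abs((peak - i / N + 0.5) % 1.0 - 0.5) < 1.0 / Lf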
In the short-time Fourier transform, the window is advanced by M samples
at a time, which corresponds to a downsampling by M of the corresponding filter
bank. This filter bank interpretation of the short-time Fourier transform analysis
is depicted in Figure 3.15. The short-time Fourier transform synthesis is achieved
similarly with a modulated synthesis filter bank. Usually, M is chosen smaller than
N (for example, N/2), in which case we obviously have an oversampled scheme, or a
noncritically sampled filter bank. Let us now consider what happens if we critically
sample such a filter bank, that is, downsample by N, and thus compute a critically
sampled discrete short-time Fourier (or Gabor) transform where the window function
is given by the prototype filter. It is easy to verify the following negative result [315]
(which is a discrete-time equivalent of the Balian-Low theorem, given in Section 5.3.3):

Figure 3.15 A noncritically sampled filter bank; it has N branches followed
by sampling by M (N > M). When the filters are modulated versions (by
the Nth root of unity), then this implements a discrete-time version of the
short-time Fourier transform.

THEOREM 3.17
There are no finite-support bases with filters as in (3.4.13) (except trivial ones
with only N nonzero coefficients).
PROOF
The proof consists in analyzing the polyphase matrix H p (z). Write the prototype filter
Hpr (z) in terms of its polyphase components (see (3.4.7–3.4.8))


$$H_{pr}(z) = \sum_{j=0}^{N-1} z^{j} H_{prj}(z^N),$$

where Hprj (z) is the jth polyphase component of Hpr (z).


Obviously, following (3.4.7) and (3.4.13),

$$H_i(z) = \sum_{j=0}^{N-1} W_N^{ij}\, z^{j} H_{prj}(z^N).$$

Therefore, the polyphase matrix H p (z) has entries

$$[\boldsymbol{H}_p(z)]_{ij} = W_N^{ij} H_{prj}(z).$$

Then, H p (z) can be factored as


$$\boldsymbol{H}_p(z) = \boldsymbol{F} \begin{pmatrix} H_{pr0}(z) & & & \\ & H_{pr1}(z) & & \\ & & \ddots & \\ & & & H_{prN-1}(z) \end{pmatrix}, \qquad (3.4.15)$$

where $F_{kl} = W_N^{kl} = e^{-j(2\pi/N)kl}$. For FIR perfect reconstruction, the determinant of $\boldsymbol{H}_p(z)$
has to be a delay (by Theorem 3.16). Now,
$$\det(\boldsymbol{H}_p(z)) = c \prod_{j=0}^{N-1} H_{prj}(z),$$

where c is a complex number equal to $\det(\boldsymbol{F})$. Therefore, for perfect FIR reconstruction,
each $H_{prj}(z)$ has to be of the form $\alpha_j \cdot z^{-m_j}$, that is, the prototype filter has exactly N nonzero
coefficients. For an orthogonal solution, the $\alpha_j$'s have to be unit-norm constants.

What happens if we relax the FIR requirement? For example, one can choose
the following prototype:


$$H_{pr}(z) = \sum_{i=0}^{N-1} P_i(z^N)\, z^i, \qquad (3.4.16)$$

where Pi (z) are allpass filters. The factorization (3.4.15) still holds, with Hpri (z) =
Pi (z), and since Pi (z −1 ) · Pi (z) = 1, H p (z) is paraunitary. While this gives an
orthogonal modulated filter bank, it is IIR (either analysis or synthesis will be
noncausal), and the quality of the filter in (3.4.16) can be poor.

Cosine Modulated Filter Banks The problems linked to complex modulated fil-
ter banks can be solved by using appropriate cosine modulation. Such cosine-
modulated filter banks are very important in practice, for example in audio com-
pression (see Section 7.2.2). Since they are often of length L = 2N (where N is the
downsampling rate), they are sometimes referred to as modulated LOT’s, or MLT’s.
A popular version was proposed in [229] and thus called the Princen-Bradley filter
bank. We will study one class of cosine modulated filter banks in some depth, and
refer to [188, 308] for a more general and detailed treatment. The cosine modulated
filter banks we consider here are a particular case of pseudoquadrature mirror filter
banks (PQMF) when the filter length is restricted to twice the number of channels
L = 2N . Pseudo QMF filters have been proposed as an extension to N channels
of the classical two-channel QMF filters. Pseudo QMF analysis/synthesis systems
achieve in general only cancellation of the main aliasing term (aliasing from neigh-
boring channels). However, when the filter length is restricted to L = 2N , they
can achieve perfect reconstruction. Due to the modulated structure and just as in
the STFT case, there are fast computational algorithms, making such filter banks
attractive for implementations.
A family of PQMF filter banks that achieves cancellation of the main aliasing
term is of the form [188, 321]$^{11}$
$$h_k[n] = \frac{1}{\sqrt{N}}\, h_{pr}[n] \cos\left(\frac{\pi(2k+1)}{2N}\left(n - \frac{L-1}{2}\right) + \varphi_k\right), \qquad (3.4.17)$$
for the analysis filters (hpr [n] is the impulse response of the window). The modu-
lating frequencies of the cosines are at π/2N, 3π/2N, . . . , (2N − 1)π/2N , and the
prototype window is a lowpass filter with support [−π/2N, π/2N ]. Then, the kth
filter is a bandpass filter with support from kπ/N to (k + 1)π/N (and a mirror
image from −kπ/N to −(k + 1)π/N ), thus covering the range from 0 to π evenly.
Note that for k = 0 and N − 1, the two lobes merge into a single lowpass and
highpass filter respectively. In the general case, the main aliasing term is canceled
for the following possible value of the phase:
$$\varphi_k = \frac{\pi}{4} + k\, \frac{\pi}{2}.$$
4 2
For this value of phase, and in the special case L = 2N , exact reconstruction is
achieved. This yields filters of the form
 
$$h_k[n] = \frac{1}{\sqrt{N}}\, h_{pr}[n] \cos\left(\frac{2k+1}{4N}(2n - N + 1)\,\pi\right), \qquad (3.4.18)$$
for k = 0, . . . , N − 1, n = 0, . . . , 2N − 1. Since the filter length is 2N , we have
an LOT, and we can use the formalism in (3.4.4). It can be shown that, due to
the particular structure of the filters, if $h_{pr}[n] = 1$, $n = 0, \ldots, 2N-1$, (3.4.5–
3.4.6) hold. The idea of the proof is the following (we assume N to be even):
Being of length 2N , each filter has a left and a right tail of length N . It can be
verified that with the above choice of phase, all the filters have symmetric left tails
($h_k[N/2-1-l] = h_k[N/2+l]$, for $l = 0, \ldots, N/2-1$) and antisymmetric right tails
($h_k[3N/2-1-l] = -h_k[3N/2+l]$, for $l = 0, \ldots, N/2-1$). Then, orthogonality of
the tails (see (3.4.6)) follows because the product of the left and right tail is an odd
function, and therefore, sums to zero. Additionally, each filter is orthogonal to its
modulated versions and has norm 1, and thus, we have an orthonormal LOT. The
details are left as an exercise (see Problem 3.24).
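Short of working Problem 3.24, one can at least confirm the result numerically. The following sketch (ours) builds the filters (3.4.18) with $h_{pr}[n] = 1$ and checks both the zero-shift orthonormality and the orthogonality of the overlapping tails under shifts by N.

import numpy as np

N = 8
n = np.arange(2 * N)
h = np.array([np.cos((2*k + 1) * (2*n - N + 1) * np.pi / (4*N)) / np.sqrt(N)
              for k in range(N)])                    # (3.4.18) with h_pr = 1

for k in range(N):
    for l in range(N):
        # zero-shift inner products: an orthonormal set
        assert np.isclose(h[k] @ h[l], 1.0 if k == l else 0.0)
        # shift by N: left tail of h_k against right tail of h_l vanishes
        assert np.isclose(h[k, :N] @ h[l, N:], 0.0)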
Suppose now that we use a symmetric window hpr [n]. We want to find conditions
under which (3.4.5–3.4.6) still hold. Call B i the blocks in (3.4.5–3.4.6) when no
windowing is used, or hpr [n] = 1, n = 0, . . . , 2N − 1, and Ai the blocks, with a
general symmetric window hpr [n]. Then, we can express A0 in terms of B 0 as
$$\boldsymbol{A}_0 = \begin{pmatrix} h_0[2N-1] & \cdots & h_0[N] \\ \vdots & & \vdots \\ h_{N-1}[2N-1] & \cdots & h_{N-1}[N] \end{pmatrix} \qquad (3.4.19)$$
$^{11}$The derivation of this type of filter bank is somewhat technical and thus less explicit at times
than other filter banks seen so far.
$$= \boldsymbol{B}_0 \begin{pmatrix} h_{pr}[2N-1] & & \\ & \ddots & \\ & & h_{pr}[N] \end{pmatrix} \qquad (3.4.20)$$
$$= \boldsymbol{B}_0 \underbrace{\begin{pmatrix} h_{pr}[0] & & \\ & \ddots & \\ & & h_{pr}[N-1] \end{pmatrix}}_{\textstyle \boldsymbol{W}} \qquad (3.4.21)$$
since hpr is symmetric, that is hpr [n] = hpr [2N − 1 − n], and W denotes the window
matrix. Using the antidiagonal matrix J,
$$\boldsymbol{J} = \begin{pmatrix} & & 1 \\ & \iddots & \\ 1 & & \end{pmatrix},$$
it is easy to verify that A1 is related to B 1 , in a similar fashion, up to a reversal of
the entries of the window function, or
$$\boldsymbol{A}_1 = \boldsymbol{B}_1 \boldsymbol{J} \boldsymbol{W} \boldsymbol{J}. \qquad (3.4.22)$$
Note also that due to the particular structure of the cosines involved, the following
are true as well:
$$\boldsymbol{B}_0^T \boldsymbol{B}_0 = \frac{1}{2}(\boldsymbol{I} - \boldsymbol{J}), \qquad \boldsymbol{B}_1^T \boldsymbol{B}_1 = \frac{1}{2}(\boldsymbol{I} + \boldsymbol{J}). \qquad (3.4.23)$$
The proof of the above fact is left as an exercise to the reader (see Problem 3.24).
Therefore, take (3.4.5) and substitute the expressions for A0 and A1 given in
(3.4.19) and (3.4.22)
$$\boldsymbol{A}_0^T \boldsymbol{A}_0 + \boldsymbol{A}_1^T \boldsymbol{A}_1 = \boldsymbol{W} \boldsymbol{B}_0^T \boldsymbol{B}_0 \boldsymbol{W} + \boldsymbol{J}\boldsymbol{W}\boldsymbol{J}\, \boldsymbol{B}_1^T \boldsymbol{B}_1\, \boldsymbol{J}\boldsymbol{W}\boldsymbol{J} = \boldsymbol{I}.$$
Using now (3.4.23), this becomes
$$\frac{1}{2}\boldsymbol{W}^2 + \frac{1}{2}\boldsymbol{J}\boldsymbol{W}^2\boldsymbol{J} = \boldsymbol{I},$$
where we used the fact that J 2 = I. In other words, for perfect reconstruction, the
following has to hold:
$$h_{pr}^2[i] + h_{pr}^2[N-1-i] = 2, \qquad (3.4.24)$$
that is, a power complementary property. Using the expressions for A0 and A1 ,
one can easily prove that (3.4.6) holds as well.
Condition (3.4.24) also regulates the shape of the window. For example, if
instead of length 2N, one uses a shorter window of length 2N − 2M, then the outer
M coefficients of each "tail" (the symmetric nonconstant half of the window) are
set to zero, and the inner M ones are set to $\sqrt{2}$ according to (3.4.24).

Table 3.4 Values of a power complementary window used for generating cosine
modulated filter banks (the window satisfies (3.4.24)). It is symmetric
($h_{pr}[16-k-1] = h_{pr}[k]$).

    hpr[0]   0.125533      hpr[4]   1.111680
    hpr[1]   0.334662      hpr[5]   1.280927
    hpr[2]   0.599355      hpr[6]   1.374046
    hpr[3]   0.874167      hpr[7]   1.408631

Figure 3.16 An example of a cosine modulated filter bank with N = 8. (a)
Impulse responses for the first four filters. (b) The magnitude responses of all
the filters are given. The symmetric prototype window is of length 16 with the
first 8 coefficients given in Table 3.4.

Example 3.12

Consider the case N = 8. The center frequency of the modulated filter hk [n] is (2k+1)2π/32,
and since this is a cosine modulation and the filters are real, there is a mirror lobe at
(32 − 2k − 1)2π/32. For the filters h0 [n] and h7 [n], these two lobes overlap to form a single
lowpass and highpass, respectively, while h1 [n], . . . , h6 [n] are bandpass filters. A possible
symmetric window of length 16 and satisfying (3.4.24) is given in Table 3.4, while the impulse
responses of the first four filters as well as the magnitude responses of all the modulated
filters are given in Figure 3.16.
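As a numerical companion to Example 3.12 (a sketch of ours), one can verify that the window of Table 3.4 satisfies the power complementarity condition (3.4.24) up to its six-digit rounding, and that the resulting windowed bank (3.4.18) remains orthonormal under shifts by N.

import numpy as np

N = 8
half = np.array([0.125533, 0.334662, 0.599355, 0.874167,
                 1.111680, 1.280927, 1.374046, 1.408631])   # Table 3.4
h_pr = np.concatenate([half, half[::-1]])                   # symmetric, length 2N

# Power complementarity (3.4.24): h_pr^2[i] + h_pr^2[N-1-i] = 2.
assert np.allclose(half**2 + half[::-1]**2, 2.0, atol=1e-5)

n = np.arange(2 * N)
h = np.array([h_pr * np.cos((2*k + 1) * (2*n - N + 1) * np.pi / (4*N)) / np.sqrt(N)
              for k in range(N)])                            # (3.4.18)

# Orthonormality under shifts by N (up to the table's rounding).
assert np.allclose(h @ h.T, np.eye(N), atol=1e-4)            # zero-shift Gram
assert np.allclose(h[:, :N] @ h[:, N:].T, 0.0, atol=1e-4)    # tail orthogonality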

Note that orthogonal cosine modulated filter banks have recently been generalized
to lengths L = KN, where K can be larger than 2. For more details,
refer to [159, 188, 235, 308].




3.5 PYRAMIDS AND OVERCOMPLETE EXPANSIONS


In this section, we will consider expansions that are overcomplete, that is, the set
of functions used in the expansion is larger than actually needed. In other words,
even if the functions play the role of a set of “basis functions”, they are actually
linearly dependent. Of course, we are again interested in structured overcomplete
expansions and will consider the ones implementable with filter banks. In filter
bank terminology, overcomplete means we have a noncritically sampled filter bank,
such as the one given in Figure 3.15.
In compression applications, such redundant representations tend to be avoided,
even though an early example of a multiresolution overcomplete decomposition (the
pyramid scheme to be discussed below) has been used for compression. Such schemes
are also often called hierarchical transforms in the compression literature.
In some other applications, overcomplete expansions might be more appropriate
than bases. One of the advantages of such expansions is that, due to oversampling,
the constraints on the filters used are relaxed. This can result in filters of superior
quality compared with those in critically sampled systems. Another advantage is that time
variance can be reduced, or in the extreme case of no downsampling, avoided. One
such example is the oversampled discrete-time wavelet series which is also explained
in what follows.

3.5.1 Oversampled Filter Banks


The simplest way to obtain a noncritically sampled filter bank is not to sample at
all, producing an overcomplete expansion. Thus, let us consider a two-channel filter
bank with no downsampling. In the scheme given in Figure 3.15 this means that
N = 2 and M = 1. Then, the output is (see also Example 5.2)

X̂(z) = [G0 (z) H0 (z) + G1 (z) H1 (z)] X(z), (3.5.1)

and perfect reconstruction is easily achievable. For example, in the FIR case if
H0 (z) and H1 (z) have no zeros in common (that is, the polynomials in z −1 are
coprime), then one can use Euclid’s algorithm [32] to find G0 (z) and G1 (z) such
that
G0 (z) H0 (z) + G1 (z) H1 (z) = 1
is satisfied leading to X̂(z) = X(z) in (3.5.1). Note how coprimeness of H0 (z) and
H1 (z), used in Euclid’s algorithm, is also a very natural requirement in terms of
signal processing. A common zero would prohibit FIR reconstruction, or even IIR
reconstruction (if the common zero is on the unit circle). Another case appears
when we have two filters G0 (z) and G1 (z) which have unit norm and satisfy

G0 (z) G0 (z −1 ) + G1 (z) G1 (z −1 ) = 2, (3.5.2)



since then with H0 (z) = G0 (z −1 ) and H1 (z) = G1 (z −1 ) one obtains

X̂(z) = [G0 (z) G0 (z −1 ) + G1 (z) G1 (z −1 )] X(z) = 2X(z).

Writing this in time domain (see Example 5.2), we realize that the set {gi [n − k]},
i = 0, 1, and k ∈ Z, forms a tight frame for l2 (Z) with a redundancy factor R = 2.
The fact that {gi [n − k]} form a tight frame simply means that they can uniquely
represent any sequence from l2 (Z) (see also Section 5.3). However, the basis vectors
are not linearly independent and thus they do not form an orthonormal basis. The
redundancy factor indicates the oversampling rate; we can indeed check that it is
two in this case, that is, there are twice as many basis functions as are actually needed
to represent sequences from l2 (Z). This is easily seen if we remember that until
now we needed only the even shifts of gi [n] as basis functions, while now we use the
odd shifts as well. Also, the expansion formula in a tight frame is similar to that in
the orthogonal case, except for the redundancy (which means the functions in the
expansion are not linearly independent). There is an energy conservation relation,
or Parseval’s formula, which says that the energy of the expansion coefficients equals
R times the energy of the original. In our case, calling yi [n] the output of the filter
hi [n], we can verify (Problem 3.26) that

$$\|x\|^2 = \frac{1}{2}\left(\|y_0\|^2 + \|y_1\|^2\right). \qquad (3.5.3)$$

To design such a tight frame for $l_2(\mathbb{Z})$ based on filter banks, that is, to find solutions
to (3.5.2), one can find a unit-norm$^{12}$ filter $G_0(z)$ which satisfies
$$0 \le |G_0(e^{j\omega})|^2 \le 2,$$

and then take the spectral factorization of the difference $2 - G_0(z)G_0(z^{-1}) = G_1(z)G_1(z^{-1})$
to find $G_1(z)$. Alternatively, note that (3.5.2) means the $2 \times 1$ vector
$(\,G_0(z)\ \ G_1(z)\,)^T$ is lossless, and one can use a lattice structure for its factorization,
just as in the $2 \times 2$ lossless case [308]. On the unit circle, (3.5.2) becomes

$$|G_0(e^{j\omega})|^2 + |G_1(e^{j\omega})|^2 = 2,$$

that is, G0 (z) and G1 (z) are power complementary. Note that (3.5.2) is less restric-
tive than the usual orthogonal solutions we have seen in Section 3.2.3. For example,
odd-length filters are possible.
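The following toy verification (ours) uses the simplest power complementary pair, $G_0(z) = (1+z^{-1})/\sqrt{2}$ and $G_1(z) = (1-z^{-1})/\sqrt{2}$: with $H_i(z) = G_i(z^{-1})$ and no downsampling, the analysis/synthesis cascade returns $2x$, and the channel energies satisfy (3.5.3).

import numpy as np

g0 = np.array([1.0, 1.0]) / np.sqrt(2)
g1 = np.array([1.0, -1.0]) / np.sqrt(2)

rng = np.random.default_rng(3)
x = rng.standard_normal(128)

# Analysis with h_i[n] = g_i[-n] is correlation, that is, convolution with
# the time-reversed filter.
y0 = np.convolve(x, g0[::-1])
y1 = np.convolve(x, g1[::-1])

# Parseval for a tight frame with redundancy R = 2, as in (3.5.3).
assert np.isclose(np.sum(x**2), 0.5 * (np.sum(y0**2) + np.sum(y1**2)))

# Synthesis: filter each channel with g_i and add; the result is 2x,
# up to the one-sample alignment shift introduced by the time reversal.
xhat = np.convolve(y0, g0) + np.convolve(y1, g1)
assert np.allclose(xhat[1:1 + len(x)], 2 * x)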
Of course, one can iterate such nondownsampled two-channel filter banks, and
get more general solutions. In particular, by adding two-channel nondownsampled
filter banks with filters $\{H_0(z^2), H_1(z^2)\}$ to the lowpass analysis channel and iter-
ating (raising z to the appropriate power) one can devise a discrete-time wavelet
$^{12}$Note that the unit-norm requirement is not necessary for constructing a tight frame.

Figure 3.17 Pyramid scheme involving a coarse lowpass approximation and
a difference between the coarse approximation and the original. We show the
case where an orthogonal filter is used and therefore, the coarse version (after
interpolation) is a projection onto V1, while the difference is a projection onto
W1. This indicates the multiresolution behavior of the pyramid.

series. This is a very redundant expansion, since there is no downsampling. How-
ever, unlike the critically sampled wavelet series, this expansion is shift-invariant
and is useful in applications where shift invariance is a requirement (for example,
object recognition).
More general cases of noncritically sampled filter banks, that is, N -channel filter
banks with downsampling by M where M < N , have not been much studied (except
for the Fourier case discussed below). While some design methods are possible (for
example, embedding into larger lossless systems), there are still open questions.

3.5.2 Pyramid Scheme


In computer vision and image coding, a successive approximation or multiresolution
technique called an image pyramid is frequently used. This scheme was introduced
by Burt and Adelson [41] and was recognized by the wavelet community to have a
strong connection to multiresolution analysis as well as orthonormal bases of wave-
lets. It consists of deriving a low-resolution version of the original, then predicting
the original based on the coarse version, and finally taking the difference between the
original and the prediction (see Figure 3.17). At the reconstruction, the prediction
is added back to the difference, guaranteeing perfect reconstruction. A shortcoming
of this scheme is the oversampling, since we end up with a low-resolution version
and a full-resolution difference signal (at the initial rate). Obviously, the scheme
can be iterated, decomposing the coarse version repeatedly, to obtain a coarse ver-
sion at level J plus J detailed versions. From the above description, it is obvious
that the scheme is inherently multiresolution. Consider, for example, the coarse
and detailed versions at the first level (one stage). The coarse version is now at
twice the scale (downsampling has contracted it by 2) and half the resolution (in-
formation loss has occurred), while the detailed version is also of half resolution but

of the same scale as the original. Also, a successive approximation flavor is easily
seen: One could start with the coarse version at level J, and by adding difference
signals, obtain versions at levels J − 1, . . . , 1, 0, (that is, the original).
An advantage of the pyramid scheme in image coding is that nonlinear inter-
polation and decimation operators can be used. A disadvantage, however, as we
have already mentioned, is that the scheme is oversampled, although the overhead
in number of samples decreases as the dimensionality increases. In n dimensions,
oversampling s as a function of the number of levels L in the pyramid is given by

L−1
1
i
2n
s = < , (3.5.4)
2n 2n − 1
i=0

which is an overhead of 50–100% in one dimension. It goes down to 25–33% in two
dimensions, and further down to 12.5–14% in three dimensions. However, we will
show below [240, 319] that if the system is linear and the lowpass filter is orthogonal
to its even translates, then one can actually downsample the difference signal after
filtering it. In that case, the pyramid reduces exactly to a critically downsampled
orthogonal subband coding scheme.
First, the prediction of the original, based on the coarse version, is simply the
projection onto the space spanned by {h0 [2k − n], k ∈ Z}. That is, calling the
prediction x̄
$$\bar{x} = \boldsymbol{H}_0^T \boldsymbol{H}_0\, x.$$
The difference signal is thus

$$d = (\boldsymbol{I} - \boldsymbol{H}_0^T \boldsymbol{H}_0)\, x.$$

But, because it is a perfect reconstruction system

$$\boldsymbol{I} - \boldsymbol{H}_0^T \boldsymbol{H}_0 = \boldsymbol{H}_1^T \boldsymbol{H}_1,$$

that is, d is the projection onto the space spanned by {h1 [2k−n], k ∈ Z}. Therefore,
we can filter and downsample d by 2, since

$$\boldsymbol{H}_1 \boldsymbol{H}_1^T \boldsymbol{H}_1 = \boldsymbol{H}_1.$$

In that case, the redundancy of d is removed (d is now critically sampled) and the
pyramid is equivalent to an orthogonal subband coding system.
The signal d can be reconstructed by upsampling by 2 and filtering with h1 [n].
Then we have
$$\boldsymbol{H}_1^T (\boldsymbol{H}_1 \boldsymbol{H}_1^T \boldsymbol{H}_1)\, x = \boldsymbol{H}_1^T \boldsymbol{H}_1\, x = d$$
and this, added to x̄ = H T0 H 0 x, is indeed equal to x. In the notation of the
multiresolution scheme the prediction x̄ is the projection onto the space V1 and d

is the projection onto W1 . This is indicated in Figure 3.17. We have thus shown
that pyramidal schemes can be critically sampled as well, that is, in Figure 3.17 the
difference signal can be followed by a filter h1 [n] and a downsampler by 2 without
any loss of information.
Note that we assumed an orthogonal filter and no quantization of the coarse
version. The benefit of the oversampled pyramid comes from the fact that arbitrary
filters (including nonlinear ones) can be used, and that quantization of the coarse
version does not influence perfect reconstruction (see Section 7.3.2).
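A minimal sketch (ours, with the orthogonal Haar filter for simplicity) makes the argument concrete: the difference signal d lies in the span of $\{h_1[2k - n]\}$, so filtering it by $h_1$ and downsampling by 2 loses nothing.

import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2)    # orthogonal lowpass (Haar)
h1 = np.array([1.0, -1.0]) / np.sqrt(2)   # corresponding highpass

rng = np.random.default_rng(4)
x = rng.standard_normal(64)

def analysis(h, x):
    """Filter by h and downsample by 2 (phase chosen so Haar blocks align)."""
    return np.convolve(x, h)[1::2]

def synthesis(h, y, n):
    """Adjoint of `analysis`: upsample by 2 and filter by the reversed h."""
    u = np.zeros(2 * len(y))
    u[::2] = y
    return np.convolve(u, h[::-1])[:n]

coarse = analysis(h0, x)                       # coarse version
d = x - synthesis(h0, coarse, len(x))          # difference, d = (I - H0^T H0) x

# d already lies in the highpass subspace: filtering and downsampling it by
# h1 loses nothing, since resynthesizing recovers d exactly.
d1 = analysis(h1, d)
assert np.allclose(synthesis(h1, d1, len(x)), d)

# Coarse version plus critically sampled difference reconstruct the original.
assert np.allclose(synthesis(h0, coarse, len(x)) + synthesis(h1, d1, len(x)), x)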
This scheme is very popular in computer vision, not so much because perfect
reconstruction is desired but because it is a computationally efficient way to obtain
multiple resolution of an image. As a lowpass filter, an approximation to a Gaus-
sian, bell-shaped filter is often used and because the difference signal resembles the
original filtered by the Laplace operator, such a scheme is usually called a Laplacian
pyramid.

3.5.3 Overlap-Save/Add Convolution and Filter Bank Implementations


Filter banks can be used to implement algorithms for the computation of convolu-
tions (see also Section 6.5.1). Two classic examples are block processing schemes —
the overlap-save and overlap-add algorithms for computing a running convolution
[211]. Essentially, a block of input is processed at a time (typically with frequency-
domain circular convolution) and the output is merged so as to achieve true linear
running convolution. Since the processing advances by steps (which corresponds
to downsampling the input by the step size), these two schemes are multirate in
nature and have an immediate filter bank interpretation [317].

Overlap-Add Scheme This scheme performs the following task: Assuming a
filter of length L, the overlap-add algorithm takes a block of input samples of
length M = N − L + 1, and feeds it into a size-N FFT (N > L). This results in
a linear convolution of the signal with the filter. Since the size of the FFT is N ,
there will be L − 1 samples overlapping with adjacent blocks of size M , which are
then added together (thus the name overlap-add). One can see that such a scheme
can be implemented with an N -channel analysis filter bank downsampled by M ,
followed by multiplication (convolution in Fourier domain), upsampling by M and
an N -channel synthesis filter bank, as shown in Figure 3.18.
For the details on computational complexity of the filter bank, refer to Sec-
tions 6.2.3 and 6.5.1. Also, note that the filters used are based on the short-time
Fourier transform.
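For reference, here is a compact overlap-add running convolution (our own sketch of the classic algorithm; the FFT size N = 64 is an arbitrary choice): each block of M = N − L + 1 input samples is convolved with the filter via a size-N FFT, and the L − 1 overlapping tail samples are added into the next block.

import numpy as np

def overlap_add(x, h, N=64):
    L = len(h)
    M = N - L + 1                       # input block size
    H = np.fft.rfft(h, N)
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(x), M):
        block = x[start:start + M]
        # Size-N circular convolution equals linear convolution here, since
        # len(block) + L - 1 <= N.
        yb = np.fft.irfft(np.fft.rfft(block, N) * H, N)
        seg = min(len(block) + L - 1, N)
        y[start:start + seg] += yb[:seg]    # add the overlapping tails
    return y

rng = np.random.default_rng(5)
x, h = rng.standard_normal(1000), rng.standard_normal(16)
assert np.allclose(overlap_add(x, h), np.convolve(x, h))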

Overlap-Save Scheme Given a length-L filter, the overlap-save algorithm per-
forms the following: It takes N input samples, computes a circular convolution of

Figure 3.18 N-channel analysis/synthesis filter bank with downsampling by
M and filtering of the channel signals. The downsampling by M is equiva-
lent to moving the input by M samples between successive computations of
the output. With filters based on the Fourier transform, and filtering of the
channels chosen to perform frequency-domain convolution, such a filter bank
implements overlap-save/add running convolution.

which N − L + 1 samples are valid linear convolution outputs and L − 1 samples
are wrap-around effects. These last L − 1 samples are discarded. The N − L + 1
valid ones are kept and the algorithm moves up by N − L + 1 samples. The filter
bank implementation is similar to the overlap-add scheme, except that analysis and
synthesis filters are interchanged [317].

Generalizations The above two schemes are examples from a general class of
oversampled filter banks which compute running convolution. For example, the
pointwise multiplication in the above schemes can be replaced by a true convolu-
tion and will result in a longer overall convolution if adequately chosen. Another
possibility is to use analysis and synthesis filters based on fast convolution algo-
rithms other than Fourier ones. For more details, see [276, 317] and Section 6.5.1.

3.6 MULTIDIMENSIONAL FILTER BANKS


It seems natural to ask if the results we have seen so far on expansion of one-
dimensional discrete-time signals can be generalized to multiple dimensions. This is
both of theoretical interest and relevant in practice, since popular applications
such as image compression often rely on signal decompositions. One easy solution
to the multidimensional problem is to apply all known one-dimensional techniques
separately along one dimension at a time. Although a very simple solution, it suffers
from some drawbacks: First, only separable (for example, two-dimensional) filters

are obtained in this way, leading to fairly constrained designs (nonseparable filters of
size N1 ×N2 would offer N1 ·N2 free design variables versus N1 +N2 in the separable
case). Then, only rectangular divisions of the spectrum are possible, though one
might need divisions that would better capture the signal’s energy concentration
(for example, close to circular).
Choosing nonseparable solutions, while solving some of these problems, comes
at a price: the design is more difficult, and the complexity is substantially higher.
The first step toward using multidimensional techniques on multidimensional
signals is to use the same kind of sampling as before (that is, in the case of an im-
age, sample first along the horizontal and then along the vertical dimension), but use
nonseparable filters. A second step consists in using nonseparable sampling as well
as nonseparable filters. This calls for the development of a new theory that starts
by pointing out the major difference between one- and multidimensional cases —
sampling. Sampling in multiple dimensions is represented by lattices. An excellent
presentation of lattice sampling can be found in the tutorial by Dubois [86] (Ap-
pendix 3.B gives a brief overview). Filter banks using nonseparable downsampling
were studied in [11, 314]. The generalization of one-dimensional analysis methods
to multidimensional filter banks using lattice downsampling was done in [155, 325].
The topic has been quite active recently (see [19, 47, 48, 160, 257, 264, 288]).
In this section, we will give an overview of the field of multidimensional filter
banks. We will concentrate mostly on two cases: the separable case with down-
sampling by 2 in two dimensions, and the quincunx case, that is, the simplest
multidimensional nonseparable case with overall sampling density of 2. Both of
these cases are of considerable practical interest, since these are the ones mostly
used in image processing applications.

3.6.1 Analysis of Multidimensional Filter Banks


In Appendix 3.B, a brief account of multidimensional sampling is given. Using the
expressions given for sampling rate changes, analysis of multidimensional systems
can be performed in a similar fashion to their one-dimensional counterparts. Let
us start with the simplest case, where both the filters and the sampling rate change
are separable.

Example 3.13 Separable Case with Sampling by 2 in Two Dimensions


If one uses the scheme as in Figure 3.19 then all one-dimensional results are trivially extended
to two dimensions. However, all limitations appearing in one dimension will appear in
two dimensions as well. For example, we know that there are no real two-channel perfect
reconstruction filter banks that are both orthogonal and linear phase at the same time. This implies
that the same will hold in two dimensions if separable filters are used.
Alternatively, one could still sample separately (see Figure 3.20(a)) and yet use

Figure 3.19 Separable filter bank in two dimensions, with separable downsam-
pling by 2. (a) Cascade of horizontal and vertical decompositions. (b) Division
of the frequency spectrum.

Figure 3.20 Two often used lattices. (a) Separable sampling by 2 in two
dimensions. (b) Quincunx sampling.

nonseparable filters. In other words, one could have a direct four-channel implemen-
tation of Figure 3.19 where the four filters could be $H_0$, $H_1$, $H_2$, $H_3$. While before,
$H_i(z_1, z_2) = H_{i1}(z_1) H_{i2}(z_2)$, where $H_{i1}(z)$ and $H_{i2}(z)$ are one-dimensional filters,
$H_i(z_1, z_2)$ is now a true two-dimensional filter. This solution, while more general, is
more complex to design and implement. It is possible to obtain an orthogonal linear
phase FIR solution [155, 156], which cannot be achieved using separable filters (see
Example 3.15 below).
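A small sketch (ours, using the Haar filters purely for simplicity) implements the separable scheme of Figure 3.19: one-dimensional filtering and downsampling by 2 along rows and then columns yields the four subbands LL, LH, HL and HH, from which the image is perfectly reconstructed.

import numpy as np

h0 = np.array([1.0, 1.0]) / np.sqrt(2)
h1 = np.array([1.0, -1.0]) / np.sqrt(2)

def analyze_1d(x, h):                      # filter and downsample by 2
    return np.convolve(x, h)[1::2]

def synthesize_1d(y, h, n):                # upsample by 2 and filter (adjoint)
    u = np.zeros(2 * len(y))
    u[::2] = y
    return np.convolve(u, h[::-1])[:n]

rng = np.random.default_rng(6)
img = rng.standard_normal((16, 16))

# Analysis: horizontal (rows), then vertical (columns), as in Figure 3.19(a).
rl = np.array([analyze_1d(r, h0) for r in img])       # lowpass rows,  16 x 8
rh = np.array([analyze_1d(r, h1) for r in img])       # highpass rows, 16 x 8
LL = np.array([analyze_1d(c, h0) for c in rl.T]).T    # 8 x 8 subbands
LH = np.array([analyze_1d(c, h1) for c in rl.T]).T
HL = np.array([analyze_1d(c, h0) for c in rh.T]).T
HH = np.array([analyze_1d(c, h1) for c in rh.T]).T

# Synthesis: undo the vertical stage, then the horizontal one.
rl_rec = np.column_stack([synthesize_1d(LL[:, j], h0, 16)
                          + synthesize_1d(LH[:, j], h1, 16) for j in range(8)])
rh_rec = np.column_stack([synthesize_1d(HL[:, j], h0, 16)
                          + synthesize_1d(HH[:, j], h1, 16) for j in range(8)])
rec = np.array([synthesize_1d(rl_rec[i], h0, 16)
                + synthesize_1d(rh_rec[i], h1, 16) for i in range(16)])
assert np.allclose(rec, img)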

Similarly to the one-dimensional case, one can define polyphase decompositions


of signals and filters. Recall that in one dimension, the polyphase decomposition of
the signal with respect to N was simply the subsignals which have the same indexes
modulo N . The generalization in multiple dimensions are cosets with respect to

a downsampling lattice. There is no natural ordering such as in one dimension,
but as long as all N cosets are included, the decomposition is valid. In separable
downsampling by 2 in two dimensions, we can take as coset representatives the
points {(0, 0), (1, 0), (0, 1), (1, 1)}. Then the signal X(z1 , z2 ) can be written as
$$X(z_1, z_2) = X_{00}(z_1^2, z_2^2) + z_1^{-1} X_{10}(z_1^2, z_2^2) + z_2^{-1} X_{01}(z_1^2, z_2^2) + z_1^{-1} z_2^{-1} X_{11}(z_1^2, z_2^2), \qquad (3.6.1)$$
where
$$X_{ij}(z_1, z_2) = \sum_m \sum_n z_1^{-m} z_2^{-n}\, x[2m+i,\, 2n+j].$$
Thus, the polyphase component with indexes i, j corresponds to a square lattice
downsampled by 2, and with the origin shifted to (i, j). The recombination of
X(z1 , z2 ) from its polyphase components as given in (3.6.1) corresponds to an in-
verse polyphase transform and its dual is therefore the forward polyphase transform.
The polyphase decomposition of analysis and synthesis filter banks follow similarly.
The synthesis filters are decomposed just as the signal (see (3.6.1)), while the
analysis filters have reverse phase. We shall not dwell longer on these decompo-
sitions since they follow easily from their one-dimensional counterparts but tend
to involve a bit of algebra. The result, as to be expected, is that the output of
an analysis/synthesis filter bank can be written in terms of the input polyphase
components times the product of the polyphase matrices.
The output of the system could also be written in terms of modulated versions
of the signal and filters. For example, downsampling by 2 in two dimensions, and
then upsampling by 2 again (zeroing out all samples except the ones where both
indexes are even) can be written in z-domain as
$$\frac{1}{4}\left(X(z_1, z_2) + X(-z_1, z_2) + X(z_1, -z_2) + X(-z_1, -z_2)\right).$$
Therefore, it is easy to verify that the output of a four-channel filter bank with
separable downsampling by 2 can be written as
$$Y(z_1, z_2) = \frac{1}{4}\, \boldsymbol{g}^T(z_1, z_2)\, \boldsymbol{H}_m(z_1, z_2)\, \boldsymbol{x}_m(z_1, z_2),$$
where
$$\boldsymbol{g}^T(z_1, z_2) = \left(\,G_0(z_1, z_2)\ \ G_1(z_1, z_2)\ \ G_2(z_1, z_2)\ \ G_3(z_1, z_2)\,\right), \qquad (3.6.2)$$
$$\boldsymbol{H}_m(z_1, z_2) = \begin{pmatrix} H_0(z_1, z_2) & H_0(-z_1, z_2) & H_0(z_1, -z_2) & H_0(-z_1, -z_2) \\ H_1(z_1, z_2) & H_1(-z_1, z_2) & H_1(z_1, -z_2) & H_1(-z_1, -z_2) \\ H_2(z_1, z_2) & H_2(-z_1, z_2) & H_2(z_1, -z_2) & H_2(-z_1, -z_2) \\ H_3(z_1, z_2) & H_3(-z_1, z_2) & H_3(z_1, -z_2) & H_3(-z_1, -z_2) \end{pmatrix}, \qquad (3.6.3)$$
$$\boldsymbol{x}_m(z_1, z_2) = \left(\,X(z_1, z_2)\ \ X(-z_1, z_2)\ \ X(z_1, -z_2)\ \ X(-z_1, -z_2)\,\right)^T.$$

Let us now consider an example involving nonseparable downsampling. We
examine quincunx sampling (see Figure 3.20(b)) because it is the simplest mul-
tidimensional nonseparable lattice. Moreover, it samples by 2, that is, it is the
counterpart of the one-dimensional two-channel case we discussed in Section 3.2.

Example 3.14 Quincunx Case


It is easy to verify that, given X(z1 , z2 ), quincunx downsampling followed by quincunx
upsampling (that is, replacing the locations with empty circles in Figure 3.20(b) by 0)
results in a z-transform equal to 1/2(X(z1 , z2 ) + X(−z1 , −z2 )). From this, it follows that
a two-channel analysis/synthesis filter bank using quincunx sampling has an input/output
relationship given by

$$Y(z_1, z_2) = \frac{1}{2} \left(\,G_0(z_1, z_2)\ \ G_1(z_1, z_2)\,\right) \begin{pmatrix} H_0(z_1, z_2) & H_0(-z_1, -z_2) \\ H_1(z_1, z_2) & H_1(-z_1, -z_2) \end{pmatrix} \begin{pmatrix} X(z_1, z_2) \\ X(-z_1, -z_2) \end{pmatrix}.$$

Similarly to the one-dimensional case, it can be verified that the orthogonality of the system
is achieved when the lowpass filter satisfies

$$H_0(z_1, z_2) H_0(z_1^{-1}, z_2^{-1}) + H_0(-z_1, -z_2) H_0(-z_1^{-1}, -z_2^{-1}) = 2, \qquad (3.6.4)$$

that is, the lowpass filter is orthogonal to its shifts on the quincunx lattice. Then, a possible
highpass filter is given by

$$H_1(z_1, z_2) = -z_1^{-1} H_0(-z_1^{-1}, -z_2^{-1}). \qquad (3.6.5)$$

The synthesis filters are the same (within shift reversal, or Gi (z1 , z2 ) = Hi (z1−1 , z2−1 )). In
polyphase domain, define the two polyphase components of the filters as

$$H_{i0}(z_1, z_2) = \sum_{(n_1, n_2) \in \mathbb{Z}^2} h_i[n_1 + n_2,\, n_1 - n_2]\, z_1^{-n_1} z_2^{-n_2},$$
$$H_{i1}(z_1, z_2) = \sum_{(n_1, n_2) \in \mathbb{Z}^2} h_i[n_1 + n_2 + 1,\, n_1 - n_2]\, z_1^{-n_1} z_2^{-n_2},$$
with
$$H_i(z_1, z_2) = H_{i0}(z_1 z_2,\, z_1 z_2^{-1}) + z_1^{-1} H_{i1}(z_1 z_2,\, z_1 z_2^{-1}).$$
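The quincunx down/upsampling relation at the start of Example 3.14 is easily verified numerically. In the sketch below (ours), zeroing out the samples with $n_1 + n_2$ odd turns the 2-D DFT into the average of the spectrum and its copy shifted by $(\pi, \pi)$, the discrete counterpart of $\frac{1}{2}(X(z_1, z_2) + X(-z_1, -z_2))$.

import numpy as np

L = 16
rng = np.random.default_rng(7)
x = rng.standard_normal((L, L))

n1, n2 = np.meshgrid(np.arange(L), np.arange(L), indexing="ij")
y = np.where((n1 + n2) % 2 == 0, x, 0.0)    # quincunx down/upsampling

X = np.fft.fft2(x)
# A shift by (pi, pi) corresponds to rolling the DFT by L/2 bins per axis.
Y_pred = 0.5 * (X + np.roll(np.roll(X, L // 2, axis=0), L // 2, axis=1))
assert np.allclose(np.fft.fft2(y), Y_pred)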

The results on alias cancellation and perfect reconstruction are very similar to
their one-dimensional counterparts. For example, perfect reconstruction with FIR
filters is achieved if and only if the determinant of the analysis polyphase matrix is
a monomial, that is,

$$\det \boldsymbol{H}_p(z_1, \ldots, z_n) = c \cdot z_1^{-K_1} \cdots z_n^{-K_n}.$$



Since the results are straightforward extensions of one-dimensional results, we rather
discuss two cases