
Image Processing

For Engineers

by Andrew E. Yagle and Fawwaz T. Ulaby


Book companion website: ip.eecs.umich.edu

IMAGE
PROCESSING
FOR ENGINEERS

Andrew E. Yagle
The University of Michigan

Fawwaz T. Ulaby
The University of Michigan
Copyright © 2018 Andrew E. Yagle and Fawwaz T. Ulaby

This book is published by Michigan Publishing under an agreement with the authors.
It is made available free of charge in electronic form to any student or instructor
interested in the subject matter.

Published in the United States of America by


Michigan Publishing
Manufactured in the United States of America

ISBN 978-1-60785-488-3 (hardcover)


ISBN 978-1-60785-489-0 (electronic)

This book is dedicated to the memories of


Professor Raymond A. Yagle and Mrs. Anne Yagle
Contents

Preface
Chapter 1 Imaging Sensors 1
  1-1 Optical Imagers 3
  1-2 Radar Imagers 13
  1-3 X-Ray Computed Tomography (CT) 18
  1-4 Magnetic Resonance Imaging 19
  1-5 Ultrasound Imager 23
  1-6 Coming Attractions 27
Chapter 2 Review of 1-D Signals and Systems 38
  2-1 Review of 1-D Continuous-Time Signals 41
  2-2 Review of 1-D Continuous-Time Systems 43
  2-3 1-D Fourier Transforms 47
  2-4 The Sampling Theorem 53
  2-5 Review of 1-D Discrete-Time Signals and Systems 59
  2-6 Discrete-Time Fourier Transform (DTFT) 66
  2-7 Discrete Fourier Transform (DFT) 70
  2-8 Fast Fourier Transform (FFT) 76
  2-9 Deconvolution Using the DFT 80
  2-10 Computation of Continuous-Time Fourier Transform (CTFT) Using the DFT 82
Chapter 3 2-D Images and Systems 89
  3-1 Displaying Images 90
  3-2 2-D Continuous-Space Images 91
  3-3 Continuous-Space Systems 93
  3-4 2-D Continuous-Space Fourier Transform (CSFT) 94
  3-5 2-D Sampling Theorem 107
  3-6 2-D Discrete Space 113
  3-7 2-D Discrete-Space Fourier Transform (DSFT) 118
  3-8 2-D Discrete Fourier Transform (2-D DFT) 119
  3-9 Computation of the 2-D DFT Using MATLAB 121
Chapter 4 Image Interpolation 128
  4-1 Interpolation Using Sinc Functions 129
  4-2 Upsampling and Downsampling Modalities 130
  4-3 Upsampling and Interpolation 133
  4-4 Implementation of Upsampling Using 2-D DFT in MATLAB 137
  4-5 Downsampling 140
  4-6 Antialias Lowpass Filtering 141
  4-7 B-Splines Interpolation 143
  4-8 2-D Spline Interpolation 149
  4-9 Comparison of 2-D Interpolation Methods 150
  4-10 Examples of Image Interpolation Applications 152
Chapter 5 Image Enhancement 159
  5-1 Pixel-Value Transformation 160
  5-2 Unsharp Masking 163
  5-3 Histogram Equalization 167
  5-4 Edge Detection 171
  5-5 Summary of Image Enhancement Techniques 176
Chapter 6 Deterministic Approach to Image Restoration 180
  6-1 Direct and Inverse Problems 181
  6-2 Denoising by Lowpass Filtering 183
  6-3 Notch Filtering 188
  6-4 Image Deconvolution 191
  6-5 Median Filtering 194
  6-6 Motion-Blur Deconvolution 195
Chapter 7 Wavelets and Compressed Sensing 202
  7-1 Tree-Structured Filter Banks 203
  7-2 Expansion of Signals in Orthogonal Basis Functions 206
  7-3 Cyclic Convolution 209
  7-4 Haar Wavelet Transform 213
  7-5 Discrete-Time Wavelet Transforms 218
  7-6 Sparsification Using Wavelets of Piecewise-Polynomial Signals 223
  7-7 2-D Wavelet Transform 228
  7-8 Denoising by Thresholding and Shrinking 232
  7-9 Compressed Sensing 236
  7-10 Computing Solutions to Underdetermined Equations 238
  7-11 Landweber Algorithm 241
  7-12 Compressed Sensing Examples 242
Chapter 8 Random Variables, Processes, and Fields 254
  8-1 Introduction to Probability 255
  8-2 Conditional Probability 259
  8-3 Random Variables 261
  8-4 Effects of Shifts on Pdfs and Pmfs 263
  8-5 Joint Pdfs and Pmfs 265
  8-6 Functions of Random Variables 269
  8-7 Random Vectors 272
  8-8 Gaussian Random Vectors 275
  8-9 Random Processes 278
  8-10 LTI Filtering of Random Processes 282
  8-11 Random Fields 285
Chapter 9 Stochastic Denoising and Deconvolution 291
  9-1 Estimation Methods 292
  9-2 Coin-Flip Experiment 298
  9-3 1-D Estimation Examples 300
  9-4 Least-Squares Estimation 303
  9-5 Deterministic versus Stochastic Wiener Filtering 307
  9-6 2-D Estimation 309
  9-7 Spectral Estimation 313
  9-8 1-D Fractals 314
  9-9 2-D Fractals 320
  9-10 Markov Random Fields 322
  9-11 Application of MRF to Image Segmentation 327
Chapter 10 Color Image Processing 334
  10-1 Color Systems 335
  10-2 Histogram Equalization and Edge Detection 340
  10-3 Color-Image Deblurring 343
  10-4 Denoising Color Images 346
Chapter 11 Image Recognition 353
  11-1 Image Classification by Correlation 354
  11-2 Classification by MLE 357
  11-3 Classification by MAP 358
  11-4 Classification of Spatially Shifted Images 360
  11-5 Classification of Spatially Scaled Images 361
  11-6 Classification of Rotated Images 366
  11-7 Color Image Classification 367
  11-8 Unsupervised Learning and Classification 373
  11-9 Unsupervised Learning Examples 377
  11-10 K-Means Clustering Algorithm 380
Chapter 12 Supervised Learning and Classification 389
  12-1 Overview of Neural Networks 390
  12-2 Training Neural Networks 396
  12-3 Derivation of Backpropagation 403
  12-4 Neural Network Training Examples 404
Appendix A Review of Complex Numbers 411
Appendix B MATLAB® and MathScript 415
Index 421
Preface

“A picture is worth a thousand words.”

This is an image processing textbook with a difference. Instead of just a picture gallery of before-and-after images, we provide (on the accompanying website) MATLAB programs (.m files) and images (.mat files) for each of the examples. These allow the reader to experiment with various parameters, such as noise strength, and see their effect on the image processing procedure. We also provide general MATLAB programs, and Javascript versions of them, for many of the image processing procedures presented in this book. We believe studying image processing without actually performing it is like studying cooking without turning on an oven.

Designed for a course on image processing (IP) aimed at both graduate students and undergraduates in their senior year, in any field of engineering, this book starts with an overview in Chapter 1 of how imaging sensors—from cameras to radars to MRIs and CAT scanners—form images, and then proceeds to cover a wide array of image processing topics. The IP topics include image interpolation, magnification, thumbnails, sharpening, edge detection, noise filtering, de-blurring of blurred images, supervised and unsupervised learning, and image segmentation, among many others. As a prelude to the chapters focused on image processing (Chapters 3–12), the book offers in Chapter 2 a review of 1-D signals and systems, borrowed from our 2018 book Signals and Systems: Theory and Applications, by Ulaby and Yagle.

Book highlights:

• A section in Chapter 1 called “Coming Attractions,” offering a sampling of the image processing applications covered in the book.

• MATLAB programs and images (.m and .mat files) on the book’s website for all examples and problems. All of these also run on NI LabVIEW MathScript.

• Coverage of standard image processing techniques, including upsampling and downsampling, rotation and scaling, histogram equalization, lowpass filtering, classification, edge detection, and an introduction to color image processing.

• An introduction to discrete wavelets, and application of wavelet-based denoising algorithms using thresholding and shrinkage, including examples and problems.

• An introduction to compressed sensing, including examples and problems.

• An introduction to Markov random fields and the ICM algorithm.

• An introduction to supervised and unsupervised learning and neural networks.

• Coverage of both deterministic (least-squares) and stochastic (a priori power spectral density) image deconvolution, and how the latter gives better results.

• Interpolation using B-splines.

• A review of probability, random processes, and MLE, MAP, and LS estimation.

Book Companion Website: ip.eecs.umich.edu

The book website is a rich resource developed to extend the educational experience of the student beyond the material covered in the textbook. It contains MATLAB programs, standard images to which the reader can apply the image processing tools outlined in the book, and Javascript image processing modules with selectable parameters. It also contains solutions to “concept questions” and “exercises,” and, for instructors, solutions to homework problems.

Acknowledgments: Mr. Richard Carnes—our friend, our piano performance teacher, and our LaTeX super compositor—deserves singular thanks and praise for the execution of this book. We are truly indebted to him for his meticulous care and attention. We also thank Ms. Rose Anderson for the elegant design of the cover and for creating the printable Adobe InDesign version of the book.

ANDREW YAGLE AND FAWWAZ ULABY, 2018
Chapter 1
Imaging Sensors

Contents
Overview, 2
1-1 Optical Imagers, 3
1-2 Radar Imagers, 13
1-3 X-Ray Computed Tomography (CT), 18
1-4 Magnetic Resonance Imaging, 19
1-5 Ultrasound Imager, 23
1-6 Coming Attractions, 27
Problems, 36

(Chapter-opener figure: a lens of diameter D images a source s from the object plane onto the image plane, with the angle θ measured from the central axis.)

Image processing has applications in medicine, robotics, human-computer interfaces, and manufacturing, among many others. This book is about the mathematical methods and computational algorithms used in processing an image from its raw form—whether generated by a digital camera, an ultrasound monitor, a high-resolution radar, or any other 2-D imaging system—into an improved form suitable for the intended application. As a prelude, this chapter provides overviews of the image formation processes associated with several sensors.

Objectives

Learn about:

■ How a digital camera forms an image, and what determines the angular resolution of the camera.

■ How a thermal infrared imager records the distribution of energy emitted by the scene.

■ How a radar can create images with very high resolution from satellite altitudes.

■ How an X-ray system uses computed tomography (CT) to generate 3-D images.

■ How magnetic resonance is used to generate 3-D MRI images.

■ How an ultrasound instrument generates an image of acoustic reflectivity, much like an imaging radar.

■ The history of image processing.

■ The types of image-processing operations examined in detail in follow-up chapters.

Overview

In today’s world we use two-dimensional (2-D) images generated by a variety of different sensors, from optical cameras and ultrasound monitors to high-resolution radars and others. A camera uses light rays and lenses to form an image of the brightness distribution across the scene observed by the lens, ultrasound imagers use sound waves and transducers to measure the reflectivity of the scene or medium exposed to the sound waves, and radar uses antennas to illuminate a scene with microwaves and then detect the fraction of energy scattered back toward the radar. The three image formation processes are markedly different, yet their output product is similar: a 2-D analog or digital image. An X-ray computed tomography (CT) scanner measures the attenuation of X-rays along many directions through a 3-D object, such as a human head, and then processes the data to generate one or more 2-D cross-sectional images (called slices) of the attenuation for specific areas of interest. A rather different process occurs in magnetic resonance imaging (MRI).

For these and many other sensing processes, the formation of the 2-D image is only the first step. As depicted in Fig. 1-1, we call such an image the raw image, because often we subject the raw image to a sequence of image processing steps designed to transform the image into a product more suitable for the intended application (Table 1-1). These steps may serve to filter out (most of) the noise that may have accompanied the (desired) signal in the image detection process, rotate or interpolate the image if called for by the intended application, enhance certain image features to accentuate recognition of objects of interest, or compress the number of pixels representing the image so as to reduce data storage (number of bits), as well as other related actions.

(Figure 1-1: block diagram in which a sensor performs image formation, producing a raw image; image processing then yields an improved image that feeds image display, image storage/transmission, and image analysis.)

Figure 1-1 After an image is formed by a sensor, image processing tools are applied for many purposes, including changing its scale and orientation, improving its information content, or reducing its digital size.
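The processing steps listed above can be made concrete with a small sketch. The following is our own minimal illustration (in Python; the book's actual programs are MATLAB .m files on its website), applying a crude 3 × 3 mean-filter denoising step followed by a factor-of-2 downsampling step of the kind used to produce thumbnails; both operations are treated properly in later chapters.

```python
# Minimal sketch of a raw -> improved processing chain: denoise, then shrink.
# Illustrative only; the toy image and functions below are not from the book.

def mean_filter3(img):
    """Denoise by 3x3 local averaging (a crude lowpass filter).
    Border pixels average over whatever neighbors exist."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(vals) / len(vals)
    return out

def downsample2(img):
    """Keep every other row and column (a quick 'thumbnail')."""
    return [row[::2] for row in img[::2]]

raw = [[10, 10, 10, 10],
       [10, 90, 10, 10],   # the 90 acts like an isolated noise spike
       [10, 10, 10, 10],
       [10, 10, 10, 10]]

improved = downsample2(mean_filter3(raw))
print(len(improved), len(improved[0]))  # 2 2
```

Running the chain shows the noise spike pulled toward the background level before the image is shrunk, which is the point of filtering before downsampling.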

Table 1-1 Examples of image-processing applications.
• Medicine (radiological diagnoses, microscopy)
• Defense (radar, sonar, infrared, satellites, etc.)
• Robotics / machine vision (e.g., “intelligent” vehicles)
• Human-computer interfaces (face/fingerprint “recognition” for security, character recognition)
• Compression for storage, transmission from space probes, etc.
• Entertainment industry
• Manufacturing (e.g., part inspection)

◮ This book covers the mathematical bases and computational techniques used to realize these image-processing transformations. ◭

To set the stage for the material covered in future chapters, this introductory chapter introduces the reader to overviews of the image formation processes associated with several different types of sensors. Each of Sections 1-1 through 1-5 sketches the fundamental physical principles and the terminology commonly used in connection with that particular imaging sensor. The chapter concludes with Section 1-6, which provides visual demonstrations of the various image operations that the image processing techniques covered in future chapters can accomplish.

1-1 Optical Imagers

Even though the prime objective of this book is to examine the various image processing techniques commonly applied to a raw image (Fig. 1-1) to transform it into an improved image of specific utility, it will prove helpful to the reader to have a fundamental understanding of the image formation process that led to the raw image in the first place. We consider five types of imaging sensors in this introductory chapter, of which four are electromagnetic (EM), and the fifth is acoustic. Figure 1-2 depicts the EM spectrum, extending from the gamma-ray region to the radio region. Optical imagers encompass imaging systems that operate in the visible, ultraviolet, and infrared segments of the EM spectrum. In the present section we feature digital cameras, which record reflected energy in the visible part of the spectrum, and infrared imagers, which sense thermal radiation self-emitted by the observed scene.

1-1.1 Digital Cameras

In June of 2000, Samsung introduced the first mobile phone with a built-in digital camera. Since then, cameras have become integral to most mobile phones, computer tablets, and laptop computers. And even though cameras may vary widely in terms of their capabilities, they all share the same imaging process. As

(Figure 1-2: the electromagnetic spectrum from gamma rays (λ ≈ 10⁻¹² m) through X-rays, ultraviolet, visible (≈ 0.5 × 10⁻⁶ m), infrared, microwave, and radio (λ ≈ 10³ m), with the corresponding frequencies (3 × 10²⁰ Hz down to 3 × 10⁴ Hz) and the temperature of an object at which its radiation is most intense (10,000,000 K down to 1 K); the regions used by X-ray imagers, optical imagers, radar, and MRI are indicated.)

Figure 1-2 Electromagnetic spectrum.



an optical imager, the camera records the spatial distribution of visible light reflected by a scene due to illumination by the sun or an artificial light source. In the simplified diagram shown in Fig. 1-3, the converging lens of the camera serves to focus the light reflected by the apple to form a sharp image in the image plane of the camera. To “focus” the image, it is necessary to adjust the location of the lens so as to satisfy the lens law
$$\frac{1}{d_o} + \frac{1}{d_i} = \frac{1}{f} \quad \text{(lens law)}, \tag{1.1}$$
where d_o and d_i are the distances between the lens and the object and image planes, respectively, and f is the focal length of the lens.

(Figure 1-3: a converging lens of focal length f images the object plane, a distance d_o away, onto a 2-D detector array in the image plane, a distance d_i away.)

Figure 1-3 Camera imaging system.

In a traditional analog camera, the image is captured by a film containing light-sensitive silver halide crystals. The crystals undergo a chemical change—and an associated darkening—in proportion to the amount of light absorbed by the crystals. Modern cameras use arrays of charge-coupled devices (CCDs) or active pixel sensors (APSs), placed in the image plane, to capture the image and then transfer the intensity readings to a data storage device (Fig. 1-4). The CCD relies on charge transfer in response to incident photons, whereas an APS uses a photodetector and an amplifier. CCD arrays were the sensors of choice in the 1970–2000 era, but they have been replaced with APS arrays over the past 20 years (Moynihan, 2015) because APS arrays consume less power to operate and are less expensive to fabricate (but they are more susceptible to noise than CCDs).

(Figure 1-4: a 2-D photodetector array whose outputs pass through amplifiers to a digital storage device.)

Figure 1-4 An active pixel sensor uses a 2-D array of photodetectors, usually made of CMOS, to detect incident light in the red, green, and blue bands.

A photodetector uses CMOS (complementary metal-oxide semiconductor) technology to convert incident photons into an output voltage. Because both CCD and CMOS are sensitive to the entire visible spectrum, from about 0.4 µm to 0.7 µm, as well as part of the near-infrared (NIR) spectrum from 0.7 µm to



1 µm, it is necessary to use a filter to block the IR spectrum and to place red (R), green (G), or blue (B) filters over each pixel so as to separate the visible spectrum of the incident light into the three primary colors. Thus, the array elements depicted in Fig. 1-4 in red respond to red light, and a similar correspondence applies to those depicted in green and blue. Typical examples of color sensitivity spectra are shown in Fig. 1-5 for a Nikon camera.

(Figure 1-5: relative spectral sensitivities of the blue, green, and red photodetectors versus wavelength λ from 0.40 µm to 0.70 µm.)

Figure 1-5 Spectral sensitivity plots for photodetectors. (Courtesy Nikon Corporation.)

Regardless of the specific detection mechanism (CCD or APS), the array output is transferred to a digital storage device with specific markers denoting the location of each element of the array and its color code (R, G, or B). Each array consists of three subarrays, one for red, another for green, and a third for blue. This information is then used to synchronize the output of the 2-D detector array with the 2-D pixel arrangement on an LCD (liquid crystal display) or other electronic displays.

A. Continuous and Discrete Images

By the time an image appears on an LCD screen, it will have undergone a minimum of three transformations, involving a minimum of three additional images. With λ denoting the light wavelength and using Fig. 1-6 as a guide, we define the following images:

Io(x′, y′; λ): continuous intensity brightness in the object plane, with (x′, y′) denoting the coordinates of the object plane.

Ii(x, y; λ): continuous intensity image in the image plane of the camera, with (x, y) denoting the coordinates of the image plane.

V[n1, m1] = {Vred[n1, m1], Vgreen[n1, m1], Vblue[n1, m1]} distribution: discrete 2-D array of the voltage outputs of the CCD or photodetector array.

B[n2, m2] = {Bred[n2, m2], Bgreen[n2, m2], Bblue[n2, m2]} distribution: discrete 2-D array of the brightness across the LCD array.

◮ Our notation uses parentheses ( ) with continuous-space signals and images, as in Io(x′, y′), and square brackets [ ] with discrete-space images, as in V[n, m]. ◭

The three associated transformations are:

(1) Optical Transformation: from Io(x′, y′; λ) to Ii(x, y; λ).

(2) Detection Transformation: from Ii(x, y; λ) to V[n1, m1].

(3) Display Transformation: from V[n1, m1] to B[n2, m2].

Indices [n1, m1] and [n2, m2] vary over certain ranges of discrete values, depending on the chosen notation. For a discrete image, the two common formats are:

(1) Centered Coordinate System: The central pixel of V[n, m] is at (n = 0, m = 0), as shown in Fig. 1-7(a), and the image extends to ±N for n and to ±M for m. The total image size is (2M + 1) × (2N + 1) pixels. Note that index n varies horizontally.

(2) Corner Coordinate System: In Fig. 1-7(b), indices n and m of V[n, m] start at 1 (rather than zero). Image size is M × N.
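The two coordinate formats differ only by an index shift. As a small illustration of ours (not code from the book), the following converts between centered indices (n, m) and 1-based corner indices, taking the vertical axis of Fig. 1-7(a) to point upward so that the pixel (n, m) = (−N, +M) maps to row 1, column 1 of the corner-indexed matrix:

```python
# Convert between the centered and corner coordinate systems of Fig. 1-7.
# Centered: n in [-N, N] varies horizontally, m in [-M, M] vertically (m up).
# Corner: row and column indices start at 1, as in MATLAB matrices.

def centered_to_corner(n, m, N, M):
    """Map centered (n, m) to 1-based (row, col) = (M + 1 - m, n + N + 1)."""
    if not (-N <= n <= N and -M <= m <= M):
        raise ValueError("index out of range")
    return (M + 1 - m, n + N + 1)

def corner_to_centered(row, col, N, M):
    """Inverse map: 1-based (row, col) back to centered (n, m)."""
    return (col - N - 1, M + 1 - row)

# The central pixel (n, m) = (0, 0) of a (2M+1) x (2N+1) image lands at
# row M + 1, column N + 1 of the corner-indexed matrix.
print(centered_to_corner(0, 0, N=3, M=2))  # (3, 4)
```

This mirrors what Chapter 3's MATLAB examples must do implicitly, since MATLAB matrices are always 1-based.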

(Figure 1-6: the optical transformation maps the object-plane image Io(x′, y′) to the image-plane image Ii(x, y); the detection transformation passes Ii(x, y) through red, green, and blue filters onto detector arrays, producing Vred[n1, m1], Vgreen[n1, m1], and Vblue[n1, m1] (a discretized version of the apple); the display transformation maps these to the displayed image B[n2, m2] on the display array.)

Figure 1-6 Io(x′, y′; λ) and Ii(x, y; λ) are continuous scene brightness and image intensities, whereas V[n1, m1] and B[n2, m2] are discrete images of the detected voltage and displayed brightness, respectively.
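As a toy sketch of the detection transformation (our own illustration, with an invented Gaussian intensity function; a real detector also integrates over each pixel's area and each color filter's band), the following samples a continuous image-plane intensity Ii(x, y) at detector-pixel centers to produce a discrete array V[n, m] in centered coordinates:

```python
import math

# Toy sketch of the detection transformation: sample a continuous
# image-plane intensity Ii(x, y) at detector-pixel centers to get V[n, m].
# The Gaussian "bright blob" below is invented purely for illustration.

def Ii(x, y):
    """A smooth, made-up continuous intensity: a bright blob at the origin."""
    return math.exp(-(x**2 + y**2) / 0.5)

def detect(intensity, pixel_pitch, N, M):
    """Return V indexed as V[n + N][m + M] = intensity(n * pitch, m * pitch)
    for n = -N..N, m = -M..M (centered coordinates)."""
    return [[intensity(n * pixel_pitch, m * pixel_pitch)
             for m in range(-M, M + 1)]
            for n in range(-N, N + 1)]

V = detect(Ii, pixel_pitch=0.25, N=4, M=4)
# The brightest sample sits at the center pixel V[N][M], i.e. n = m = 0.
```

The same idea, sampling a continuous image on a discrete grid, is what the 2-D sampling theorem of Chapter 3 makes precise.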

Image Notation

• Index n varies horizontally and index m varies vertically.
• Image notation is the same as matrix notation.
• Image size = # of rows × # of columns = M × N.

(Figure 1-7: (a) centered coordinates with (2M + 1) × (2N + 1) pixels, with n running from −N to +N horizontally and m from −M to +M vertically; (b) corner coordinates with M × N pixels, with n running from 1 to N and m from 1 to M.)

Figure 1-7 (a) In the centered coordinate system, index m extends between −M and +M and index n varies between −N and +N, whereas (b) in the corner coordinate system, indices n and m start at 1 and conclude at N and M, respectively.

The detection and display images may or may not be of the same size. For example, if image compression is used to generate a “thumbnail,” then far fewer pixels are used to represent the imaged object in the display image than in the detected image. Conversely, if the detected image is to be enlarged through interpolation, then more pixels are used to display the object than in the detected image.

B. Point Spread Function

Consider the scenario depicted in Fig. 1-8(a). An infinitesimally small source of monochromatic (single wavelength) light, denoted s, is located in the center of the object plane, and the lens location is adjusted to satisfy Eq. (1.1), thereby producing in the image plane the best-possible image of the source. We assume that the lens has no aberrations due to shape or material imperfections. We observe that even though the source is infinitesimal in spatial extent—essentially like a spatial impulse—its image is definitely not impulse-like. The image exhibits a circularly symmetric diffraction pattern caused by the phase interference of the various rays of light that had emanated from the source and traveled to the image plane through the lens. The pattern is called an Airy disc.

Figure 1-8(b) shows a 1-D plot of the image pattern in terms of the intensity Ii(θ) as a function of θ, where θ is the angular deviation from the central horizontal axis (Fig. 1-8(a)). The expression for Ii(θ) is
$$I_i(\theta) = I_o \left[ \frac{2J_1(\gamma)}{\gamma} \right]^2, \tag{1.2}$$

where J1(γ) is the first-order Bessel function of the first kind, and
$$\gamma = \frac{\pi D}{\lambda} \sin\theta. \tag{1.3}$$
Here, λ is the wavelength of the light (assumed to be monochromatic for simplicity) and D is the diameter of the converging lens. The normalized form of Eq. (1.2) represents the impulse response h(θ) of the imaging system,
$$h(\theta) = \frac{I_i(\theta)}{I_o} = \left[ \frac{2J_1(\gamma)}{\gamma} \right]^2. \tag{1.4}$$
For a 2-D image, the impulse response is called the point spread function (PSF).

(Figure 1-8: (a) a point source s in the object plane, imaged through a lens of diameter D, produces a circularly symmetric pattern in the image plane; (b) 1-D profile of the imaged response, Ii(θ) versus γ = (πD/λ) sin θ, with a central peak of Io and nulls on either side.)

Figure 1-8 For an aberration-free lens, the image of a point source is a diffraction pattern called an Airy disc.

Detector arrays are arranged in rectangular grids. For a pixel at (x, y) in the image plane (Fig. 1-9),
$$\sin\theta = \frac{\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2 + d_i^2}}, \tag{1.5}$$
and Eq. (1.4) can be rewritten as
$$h(x, y) = \frac{I_i(x, y)}{I_o} = \left[ \frac{2J_1(\gamma)}{\gamma} \right]^2, \tag{1.6}$$
with
$$\gamma = \frac{\pi D}{\lambda} \, \frac{\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2 + d_i^2}}. \tag{1.7}$$

(Figure 1-9: geometry relating the angle θ to a pixel at (x, y) in the image plane, a distance d_i behind the lens; along the y axis, sin θ = y/√(y² + d_i²), and for an image pixel at (x, y), sin θ = √(x² + y²)/√(x² + y² + d_i²).)

Figure 1-9 Relating angle θ to pixel at (x, y) in image plane.

The expressions given by Eqs. (1.2) through (1.7) pertain to coherent monochromatic light. Unless the light source is a laser, the light source usually is panchromatic, in which case the diffraction pattern that would be detected by each of the three-color detector arrays becomes averaged over the wavelength range of that array. The resultant diffraction pattern maintains the general shape of the pattern in Fig. 1-8(b), but it exhibits a gentler variation with θ (with no distinct minima). Here, h(x, y) denotes the PSF in rectangular coordinates relative to the center of the image plane.
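The PSF profile is easy to evaluate numerically. The sketch below (our own illustration in Python; the book's companion programs are MATLAB) computes h(γ) = [2J1(γ)/γ]² from Eq. (1.4), using the integral representation of J1 so that only the standard library is needed, and locates the first null of the pattern near γ ≈ 3.832; dividing that value by π recovers the 1.22 factor that appears in the Rayleigh angular resolution 1.22λ/D of Eq. (1.9a).

```python
import math

# Evaluate the Airy-disc impulse response h(gamma) = [2 J1(gamma)/gamma]^2
# of Eq. (1.4). J1 is computed from its integral representation
#   J1(x) = (1/pi) * integral_0^pi cos(tau - x sin(tau)) d(tau)
# via Simpson's rule, so no external libraries are required.

def bessel_j1(x, steps=400):
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        tau = k * h
        w = 1.0 if k in (0, steps) else (4.0 if k % 2 else 2.0)
        total += w * math.cos(tau - x * math.sin(tau))
    return total * h / (3.0 * math.pi)

def airy_h(gamma):
    """Normalized PSF profile; h -> 1 as gamma -> 0."""
    if abs(gamma) < 1e-9:
        return 1.0
    return (2.0 * bessel_j1(gamma) / gamma) ** 2

# First null of h, i.e. first positive zero of J1, found by bisection.
lo, hi = 3.0, 4.5
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
first_null = 0.5 * (lo + hi)

# With gamma = (pi D / lambda) sin(theta), the null angle satisfies
# sin(theta) = first_null * lambda / (pi * D) ~ 1.22 lambda / D,
# the Rayleigh angular resolution.
rayleigh_factor = first_null / math.pi
print(round(first_null, 3), round(rayleigh_factor, 2))  # 3.832 1.22
```

The same computation is why the resolution expressions in the next subsection carry the factor 1.22 rather than 1.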

◮ The implication of this PSF is that when the optical system is used to image a scene, the image it forms in the image plane is the result of a 2-D convolution (as defined later in Section 3-3) of the brightness distribution of the scene in the object plane, Io(x, y), with the PSF given by Eq. (1.7):
$$I_i(x, y) = I_o(x, y) \ast\ast \, h(x, y). \tag{1.8}$$
The convolution effect is then embedded in the discrete 2-D detected image as well as in all subsequent manifestations. ◭

C. Spatial Resolution

Each of the two coherent, monochromatic sources shown in Fig. 1-10 produces a diffraction pattern. If the two sources are sufficiently far apart so that their patterns are essentially distinct, then we should be able to distinguish them from one another. But as we bring them closer together, their diffraction patterns in the image plane start to overlap, making it more difficult to discern their images as those of two distinct sources.

One definition of the spatial resolution capability of the imaging system along the y′ direction is the separation ∆y′min between the two point sources (Fig. 1-10) such that the peak of the diffraction pattern of one of them occurs at the location of the first null of the diffraction pattern of the other one, and vice versa. Along the y direction in the image plane, the first null occurs when [2J1(γ)/γ]² = 0 or, equivalently, γ = 3.832. Use of the geometry in Fig. 1-10 with sin θ ≈ θ leads to
$$\Delta\theta_{\min} \approx \frac{\Delta y_{\min}}{d_i} \approx 1.22\,\frac{\lambda}{D} \quad \text{(angular resolution)}. \tag{1.9a}$$
The angular width ∆θmin is the angular resolution of the imaging system and ∆ymin is the image spatial resolution. On the object side of the lens, the scene spatial resolution is
$$\Delta y'_{\min} = 1.22\,d_o\,\Delta\theta_{\min} = 1.22\,d_o\,\frac{\lambda}{D} \quad \text{(scene spatial resolution)}. \tag{1.9b}$$
This is known as the Rayleigh resolution criterion. Because the lens diameter D is in the denominator, using a larger lens improves spatial resolution. Thus telescopes are made with very large lenses and/or mirrors.

These expressions apply to the y and y′ directions at wavelength λ. Since the three-color detector arrays operate over different wavelength ranges, the associated angular and spatial resolutions are the smallest at λblue ≈ 0.48 µm and the largest at λred ≈ 0.6 µm (Fig. 1-5). Expressions with identical form apply along the x and x′ directions (i.e., upon replacing y and y′ with x and x′, respectively).

(Figure 1-10: two sources s1 and s2 in the object plane, separated by ∆y′min at distance d_o, produce overlapping image patterns at distance d_i, separated by ∆ymin and subtending the angle ∆θmin.)

Figure 1-10 The separation between s1 and s2 is such that the peak of the diffraction pattern due to s1 is coincident with the first null of the diffraction pattern of s2, and vice versa.

D. Detector Resolution

The inherent spatial resolution in the image plane is ∆ymin ≈ 1.22 d_i λ/D, but the detector array used to record the image has its own detector resolution ∆p, which is the pixel size of the active pixel sensor. For a black-and-white imaging camera, to fully capture the image details made possible by the imaging system, the pixel size ∆p should be, at most, equal to ∆ymin. In a color camera, however, the detector pixels of an individual color are not adjacent to one another (see Fig. 1-4), so ∆p should be several times smaller than ∆ymin.

1-1.2 Thermal IR Imagers

Density slicing is a technique used to convert a parameter of interest from amplitude to pseudocolor so as to enhance the visual display of that parameter. An example is shown in Fig. 1-11, wherein color represents the infrared (IR) temperature of a hot-air balloon measured by a thermal infrared imager. The vertical scale on the right-hand side provides the color-

Figure 1-11 IR image of a hot-air balloon (courtesy of Ing.-Büro für Thermografie). The spatial pattern is consistent with the fact that warm air rises.

(Figure 1-12: the infrared portion of the EM spectrum, with the near IR extending from λ = 0.76 µm to 2 µm, the medium-wave IR from 2 µm to 4 µm, and the long-wave IR from 4 µm to 10³ µm, shown between the visible and microwave bands.)

Figure 1-12 Infrared subbands.

temperature conversion. Unlike the traditional camera—which measures the reflectance of the observed scene—a thermal IR imager measures the emission by the scene, without an external source of illumination. IR imagers are used in many applications including night vision, surveillance, fire detection, and thermal insulation in building construction.

A. IR Spectrum

The wavelength range of the infrared spectrum extends from the end of the red part of the visible spectrum at about 0.76 µm to the edge of the millimeter-wave band at 1000 µm (or, equivalently, λ = 1 mm). For historical reasons, the IR band has been subdivided into multiple subbands, but these subbands do not have a standard nomenclature, nor standard definitions for their wavelength extents. The prevailing practice assigns the following names and wavelength ranges (Fig. 1-12):

(a) The near IR (NIR) extends from λ = 0.76 µm to λ = 2 µm.

(b) The middle-wave IR (MWIR) extends from λ = 2 µm to λ = 4 µm.

(c) The long-wave IR (LWIR) extends from λ = 4 µm to λ = 1000 µm.

Most sensors operating in the NIR subband are similar to visible-light cameras in that they record light reflectance, but only in the 0.76–2 µm range, whereas IR sensors operating at the longer wavelengths rely on measuring energy self-emitted by the observed object, which depends, in part, on the temperature of the object. Hence, such IR sensors are called thermal imagers.

The basis for the self-emission is the blackbody radiation law, which states that all material objects radiate EM energy, and the spectrum of the radiated energy depends on the physical temperature of the object, its material composition, and its surface properties. A blackbody is a perfect emitter and perfect absorber, and its radiation spectrum is governed by Planck’s law. Figure 1-13 displays plots of spectral emittance for the sun (at an effective radiating temperature of 5800 K) and a terrestrial blackbody at 300 K (27 °C).

(Figure 1-13: spectral emittance (W m⁻² µm⁻¹) versus wavelength from 0.1 µm to 100 µm; the sun's spectrum at 5800 K peaks in the visible, the Earth's spectrum at 300 K peaks near 10 µm, and a 1000 K curve lies between them.)

Figure 1-13 The peak of the blackbody radiation spectrum of the sun is in the visible part of the EM spectrum, whereas the peak for a terrestrial object is in the IR (at ≈ 10 µm).

We observe from Fig. 1-13 that the peak of the terrestrial blackbody is at approximately

Figure 1-14 Emissivity spectra for four types of terrain: ocean, vegetation, desert, and snow/ice. (Courtesy the National Academy Press.)

Figure 1-15 Thermal IR imaging systems often use cryogenic cooling to improve detection sensitivity.

10 µm, which is towards the short wavelength end of the LWIR subband. This means that the wavelength range around 10 µm is particularly well suited for measuring radiation self-emitted by objects at temperatures in the range commonly encountered on Earth. The amount of energy emitted at any specific wavelength depends not only on the temperature of the object, but also on its material properties. The emissivity of an object is defined as the ratio of the amount of energy radiated by that object to the amount of energy that would have been radiated by the object had it been an ideal blackbody at the same physical temperature. By way of an example, Fig. 1-14 displays spectral plots of the emissivity for four types of terrain: an ocean surface, a desert surface, a surface covered with snow or ice, and a vegetation-covered surface.

B. Imaging System

The basic configuration of a thermal IR imaging system (Fig. 1-15) is similar to that of a visible-light camera, but the lenses and detectors are designed to operate over the intended IR wavelength range of the system. Two types of detectors are used, namely uncooled detectors and cooled detectors. By cooling a semiconductor detector to very low temperatures, typically in the 50–100 K range, its self-generated thermal noise is reduced considerably, thereby improving the signal-to-noise ratio of the detected IR signal emitted by the observed scene. Cooled detectors exhibit superior sensitivity in comparison with uncooled detectors, but the cooling arrangement requires the availability and use of a cryogenic agent, such as liquid nitrogen, as well as placing the detectors in a vacuum-sealed container. Consequently, cooled IR imagers are significantly more expensive to construct and operate than uncooled imagers.

Figure 1-16 Comparison of black-and-white visible-light photography with an IR thermal image of the same scene.

We close this section with two image examples. Figure 1-16 compares the image of a scene recorded by a visible-light black-and-white camera with a thermal IR image of the same scene. The IR image is in pseudocolor, with red representing high IR emission and blue representing (comparatively) low IR emission. The two images convey different types of information, but they also have significantly different spatial resolutions. Today, digital cameras with 16 megapixel detector arrays are readily available and fairly inexpensive. In contrast, most standard

detector arrays of thermal IR imagers are under 1 megapixel in size. Consequently, IR images appear "blurry" when compared with their photographic counterparts.

Our second image, shown in Fig. 1-17, is an IR thermal image of a person's head and neck. Such images are finding increased use in medical diagnostics, particularly for organs close to the surface [Ring and Ammer, 2012].

Exercise 1-1: An imaging lens used in a digital camera has a diameter of 25 mm and a focal length of 50 mm. Considering only the photodetectors responsive to the red band centered at λ = 0.6 µm, what is the camera's spatial resolution in the image plane, given that the image distance from the lens is di = 50.25 mm? What is the corresponding resolution in the object plane?

Answer: ∆ymin = 1.47 µm; ∆y′min = 0.3 mm.
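Both answers are easy to reproduce numerically. The Python sketch below is an illustration only (variable names are ours): the image-plane number follows the diffraction-limit form 1.22λd/D of Eq. (1.10a), and the object distance is obtained from the standard thin-lens relation 1/do + 1/di = 1/f, which this sketch assumes.

```python
# Numerical check of Exercise 1-1 (illustrative sketch, not from the book).
# Diffraction-limited resolution at distance d: 1.22 * lambda * d / D.

lam = 0.6e-6    # red-band wavelength, m
D = 25e-3       # lens diameter, m
f = 50e-3       # focal length, m
di = 50.25e-3   # image distance, m

dy_image = 1.22 * lam * di / D      # image-plane resolution: ~1.47 µm
do = 1.0 / (1.0 / f - 1.0 / di)     # object distance from 1/do + 1/di = 1/f
dy_object = 1.22 * lam * do / D     # object-plane resolution: ~0.3 mm
```

Note how a modest 0.25 mm difference between di and f places the in-focus object plane about 10 m from the lens, which is what scales the 1.47 µm image-plane figure up to 0.3 mm in the scene.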

Exercise 1-2: At λ = 10 µm, what is the ratio of the emissivity of a snow-covered surface relative to that of a sand-covered surface? (See Fig. 1-14.)

Answer: esnow/esand ≈ 0.985/0.9 ≈ 1.09.
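The two blackbody peaks quoted in this section (≈0.5 µm for the 5800 K sun, ≈10 µm for a 300 K terrestrial object) follow from Wien's displacement law, a standard corollary of Planck's law that is not derived in this chapter. A minimal, illustrative Python sketch:

```python
# Wien's displacement law: lambda_peak = b / T, with b ≈ 2898 µm·K.
# Illustrative sketch; the law itself is standard physics, not derived in the text.

WIEN_B = 2898.0  # Wien's displacement constant, µm·K

def peak_wavelength_um(T_kelvin):
    """Wavelength (µm) at which a blackbody at temperature T radiates most."""
    return WIEN_B / T_kelvin

sun_peak = peak_wavelength_um(5800.0)    # ~0.5 µm: visible light
earth_peak = peak_wavelength_um(300.0)   # ~9.7 µm: in the LWIR, near 10 µm
```

The ≈9.7 µm result is why the wavelength range around 10 µm is the natural choice for thermal imaging of terrestrial scenes.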

Figure 1-17 Thermal IR image of a person's head and neck.

Concept Question 1-1: What is a camera's point spread function? What role does it play in the image formation process?

Concept Question 1-2: How are the image and scene spatial resolutions related to one another?

Concept Question 1-3: What is the emissivity of an object?

Concept Question 1-4: Why is an IR imager called a thermal imager?

1-2 Radar Imagers

Conceptually, a radar can generate an image of the reflectivity of a scene by scanning its antenna beam across the scene in a raster-like format, as depicted in Fig. 1-18. Even though the imaging process is very different from the process used by a lens in a camera, the radar and the camera share the same fundamental relationship for angular resolution. In Section 1-1.1, we stated in Eq. (1.9a) that the angular resolution of a converging lens is approximately ∆θmin = 1.22λ/D, and the corresponding spatial resolution is

∆y′min = do ∆θmin = 1.22 do λ/D    (camera).    (1.10a)

Here, λ is the wavelength of the light and D is the diameter of the lens.

Equation (1.10a) is approximately applicable to a microwave radar with a dish antenna of diameter D (in the camera case, the scene illumination is external to the camera, so the lens gets involved in only the receiving process, whereas in the radar case the antenna is involved in both the transmitting and receiving processes). In the radar literature, the symbol usually used to denote the range between the radar antenna and the target is the symbol R. Hence, upon replacing do with R, we have

∆y′min ≈ Rλ/D    (radar).    (1.10b)

It is important to note that λ of visible light is much shorter than λ in the microwave region. In the middle of the visible spectrum, λvis ≈ 0.5 µm, whereas at a typical microwave radar

frequency of 6 GHz, λmic ≈ 5 cm. The ratio is

λmic/λvis = (5 × 10⁻²)/(0.5 × 10⁻⁶) = 10⁵ !

This means that the angular resolution capability of an optical system is on the order of 100,000 times better than the angular resolution of a radar, if the lens diameter is the same size as the antenna diameter.

To fully compensate for the large wavelength ratio, a radar antenna would need a diameter on the order of 1 km to produce an image with the same resolution as a camera with a lens 1 cm in diameter. Clearly, that is totally impractical. In practice, most radar antennas are on the order of centimeters to meters in size, but certainly not kilometers. Yet, radar can image the Earth surface from satellite altitudes with spatial resolutions on the order of 1 m—equivalent to antenna sizes several kilometers in extent! How is that possible?

Figure 1-18 Radar imaging of a scene by raster scanning the antenna beam.

A. Synthetic-Aperture Radar

As we will see shortly, a synthetic-aperture radar (SAR) uses a synthesized aperture to achieve good resolution in one dimension and transmits very short pulses to achieve fine resolution in the orthogonal dimension. The predecessor to SAR is the real-aperture side-looking airborne radar (SLAR). A SLAR uses a rectangular- or cylindrical-shaped antenna that gets mounted along the longitudinal direction of an airplane, and pointed partially to the side (Fig. 1-19).

Even though the antenna beam in the elevation direction is very wide, fine discrimination can be realized along the x direction in Fig. 1-19 by transmitting a sequence of very short pulses. At any instant in time, the extent of the pulse along x is

∆x′min = cτ/(2 sin θ)    (scene range resolution),    (1.11)

where c is the velocity of light, τ is the pulse width, and θ is the incidence angle relative to nadir-looking. This represents the scene spatial resolution capability along the x′ direction. At a typical angle of θ = 45°, the spatial resolution attainable when transmitting pulses each 5 ns in width is

∆x′min = (3 × 10⁸ × 5 × 10⁻⁹)/(2 sin 45°) ≈ 1.05 m.
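The numbers above are easy to reproduce. The Python sketch below is illustrative only (names are ours): it evaluates the optical/microwave wavelength ratio, the antenna diameter needed to match a 1 cm lens, and the pulse-limited range resolution of Eq. (1.11).

```python
import math

# Wavelength ratio between a 6 GHz radar and mid-visible light.
lam_vis = 0.5e-6              # m
lam_mic = 5e-2                # m
ratio = lam_mic / lam_vis     # 1e5: optics resolves ~100,000x finer angles

# Antenna diameter giving the same angular resolution as a 1 cm lens.
D_antenna = 1e-2 * ratio      # 1000 m = 1 km, hence "totally impractical"

def range_resolution(tau, theta_deg, c=3e8):
    """Eq. (1.11): scene range resolution for pulse width tau (s) at angle theta."""
    return c * tau / (2.0 * math.sin(math.radians(theta_deg)))

dx_min = range_resolution(5e-9, 45.0)   # ~1.05 m for a 5 ns pulse at 45 degrees
```

Because Eq. (1.11) contains no range term, the same 5 ns pulse yields the same ≈1 m range resolution from aircraft or satellite altitude.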

Figure 1-19 Real-aperture SLAR imaging technique. The antenna is mounted along the belly of the aircraft.

Not only is this an excellent spatial resolution along the x′ direction, but it is also independent of range R (distance between the radar and the surface), which means it is equally applicable to a satellite-borne radar.

As the aircraft flies along the y direction, the radar beam sweeps across the terrain, while constantly transmitting pulses, receiving their echoes, and recording them on an appropriate medium. The sequential echoes are then stitched together to form an image.

By designing the antenna to be as long as practicable along the airplane velocity direction, the antenna pattern exhibits a relatively narrow beam along that direction (y′ direction in Fig. 1-19). The shape of the beam of the cylindrical antenna is illustrated in Fig. 1-20. From range R, the extent of the beam along the y direction is

∆y′min ≈ (λ/ly) R = λh/(ly cos θ)    (real-aperture azimuth resolution),    (1.12)

where h is the aircraft altitude. This is the spatial resolution capability of the radar along the flight direction. For a 3 m long antenna operating at λ = 3 cm from an altitude of 1 km, the

Figure 1-20 Radiation pattern of a cylindrical reflector.

Figure 1-21 An illustration of how synthetic aperture works. Example for a λ = 4 cm spacecraft radar at 400 km altitude: a real aperture of length ly = 2 m has an along-track resolution of 8 km (real-aperture ∆y′min = λR/ly), whereas an 8 km long synthetic aperture achieves a resolution of 1 m (synthetic-aperture ∆y′min = ly/2).

resolution ∆y′min at θ = 45° is

∆y′min ≈ (3 × 10⁻²)/(3 cos 45°) × 10³ ≈ 14 m.

Ideally, an imaging system should have similar resolution capabilities along both directions of the imaged scene. In the present case, ∆x′min ≈ 1.05 m, which is highly desirable, but ∆y′min ≈ 14 m, which for most imaging applications is not so desirable, particularly if the altitude h is much higher than 1 km. Furthermore, since ∆y′min is directly proportional to the altitude h of the flying vehicle, whereas ∆x′min is independent of h, the disparity between ∆x′min and ∆y′min will get even greater when we consider radars flown at satellite altitudes.

To improve the resolution ∆y′min and simultaneously remove its dependence on the range R, we artificially create an array of antennas as depicted in Fig. 1-21. In the example shown in Fig. 1-21, the real satellite-borne radar antenna is 2 m long and the synthetic aperture is 8 km long! The latter consists of pulse returns recorded as the real antenna travels over a distance of 8 km, and then processed later as if they had been received by an 8 km long array of antennas, each 2 m long, simultaneously. The net result of the processing is an image with a resolution along the y direction given by

∆y′min = ly/2    (SAR azimuth resolution),    (1.13)

where ly is the length of the real antenna. For the present example, ly = 2 m and ∆y′min = 1 m, which is approximately the same as ∆x′min. Shortening the antenna length would improve the azimuth resolution, but considerations of signal-to-noise ratio would require the transmission of higher power levels.

B. Point Spread Function

The first of our two SAR-image examples displays a large part of Washington, D.C. (Fig. 1-22). The location information of a particular pixel in the observed scene is computed, in part, from the round-trip travel time of the transmitted pulse. Consequently, a target that obscures the ground beneath it, such as the Washington Monument in Fig. 1-22, ends up generating a radar shadow because no signal is received from the obscured area. The radar shadow of the obelisk of the Washington Monument appears as a dark line projected from the top onto the ground surface.

Radar shadow is also apparent in the SAR image of the plane and helicopter of Fig. 1-23.

In Section 1-1.1, we stated that the image formed by the lens of an optical camera represents the convolution of the reflectivity of the scene (or the emission distribution in the case of the IR imager) with the point spread function (PSF) of the imaging system. The concept applies equally well to the imaging radar case. For an x–y SAR image with x denoting the side-looking direction and y denoting the flight direction, the SAR PSF is
direction and y denoting the flight direction, the SAR PSF is

Figure 1-22 SAR image collected over Washington, D.C. Right of center is the Washington Monument, though only the shadow of the obelisk is readily apparent in the image. [Courtesy of Sandia National Laboratories.]

given by

h(x, y) = hx(x) hy(y),    (1.14)

with hx(x) describing the shape of the transmitted pulse and hy(y) describing the shape of the synthetic antenna-array pattern. Typically, the pulse shape is like a Gaussian:

hx(x) = e^(−2.77(x/τ)²),    (1.15a)

where τ is the effective width of the pulse (width between half-peak points). The synthetic array pattern is sinc-like in shape, but the sidelobes may be suppressed further by assigning different weights to the processed pulses. For the equally weighted case,

hy(y) = sinc²(1.8y/l),    (1.15b)

where l is the length of the real antenna, and the sinc function is defined such that sinc(z) = sin(πz)/(πz) for any variable z.

Concept Question 1-5: Why is a SAR called a "synthetic"-aperture radar?

Concept Question 1-6: What system parameters determine the PSF of a SAR?

Exercise 1-3: With reference to the diagram in Fig. 1-21, suppose the length of the real aperture were to be increased from 2 m to 8 m. What would happen to (a) the antenna beamwidth, (b) length of the synthetic aperture, and (c) the SAR azimuth resolution?

Answer: (a) Beamwidth is reduced by a factor of 4, (b) synthetic aperture length is reduced from 8 km to 2 km, and (c) SAR resolution changes from 1 m to 4 m.

Figure 1-23 High-resolution image of an airport runway with a plane and helicopter. [Courtesy of Sandia National Laboratories.]

1-3 X-Ray Computed Tomography (CT)

Computed tomography, also known as CT scan, is a technique capable of generating 3-D images of the X-ray attenuation (absorption) properties of an object, such as the human body. The X-ray absorption coefficient of a material is strongly dependent on the density of that material. CT has the sensitivity necessary to image body parts across a wide range of densities, from soft tissue to blood vessels and bones.

As depicted in Fig. 1-24(a), a CT scanner uses an X-ray source, with a narrow slit to generate a fan-beam, wide enough to encompass the extent of the body, but only about 1 mm thick. The attenuated X-ray beam is captured by an array of ∼900 detectors. The X-ray source and the detector array are mounted on a circular frame that rotates in steps of a fraction of a degree over a full 360° circle around the object or patient, each time recording an X-ray attenuation profile from a different angular direction. Typically, on the order of 1000 such profiles are recorded, each composed of measurements by 900 detectors. For each horizontal slice of the body, the process is completed in less than 1 second. CT uses image reconstruction algorithms to generate a 2-D image of the absorption coefficient of that horizontal slice. To image an entire part of the body, such as the chest or head, the process is repeated over multiple slices (layers).

For each anatomical slice, the CT scanner generates on the order of 9 × 10⁵ measurements (1000 angular orientations × 900 detectors). In terms of the coordinate system shown in Fig. 1-24(b), we define α(ξ, η) as the absorption coefficient of the object under test at location (ξ, η). The X-ray beam is directed along the ξ direction at η = η0. The X-ray intensity received by the detector located at ξ = ξ0 and η = η0 is given by

I(ξ0, η0) = I0 exp(−∫₀^ξ0 α(ξ, η0) dξ),    (1.16)

where I0 is the X-ray intensity radiated by the source. Outside the body, α(ξ, η) = 0. The corresponding logarithmic path attenuation p(ξ0, η0) is defined as

p(ξ0, η0) = −log[I(ξ0, η0)/I0] = ∫₀^ξ0 α(ξ, η0) dξ.    (1.17)

The path attenuation p(ξ0, η0) is the integrated absorption coefficient across the X-ray path.

In the general case, the path traversed by the X-ray source is at a range r and angle θ in a polar coordinate system, as depicted in Fig. 1-24(c). The direction of the path is orthogonal to the direction of r. For a path corresponding to a specific set (r, θ), Eq. (1.17) becomes

p(r, θ) = ∫₋∞^∞ ∫₋∞^∞ α(ξ, η) δ(r − ξ cos θ − η sin θ) dξ dη,    (1.18)

where the Dirac impulse δ(r − ξ cos θ − η sin θ) dictates that only those points in the (ξ, η) plane that fall along the path specified by fixed values of (r, θ) are included in the integration. The relation between p(r, θ) and α(ξ, η) is known as the 2-D Radon transform of α(ξ, η). The goal of CT is to reconstruct α(ξ, η) from the measured path attenuations p(r, θ), by inverting the Radon transform given by Eq. (1.18), which is accomplished with the help of the Fourier transform.

Concept Question 1-7: What physical attribute of the imaged body is computed and displayed by a CT scanner?
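The path integrals of Eqs. (1.17) and (1.18) are easy to discretize. The Python sketch below is our own illustration (the disk phantom and the sampling step are assumptions, not the book's example): it samples α along the line at signed distance r from the origin and orientation θ.

```python
import math

# Illustrative discretization of the path attenuation of Eqs. (1.17)-(1.18).
d = 0.02   # sampling step along the ray (assumed value)

def alpha(xi, eta):
    """Absorption coefficient: a disk of radius 0.5 with alpha = 1 inside."""
    return 1.0 if xi * xi + eta * eta <= 0.25 else 0.0

def p(r, theta):
    """p(r, theta): alpha integrated along the line xi*cos(theta) + eta*sin(theta) = r
    (one sample of the 2-D Radon transform)."""
    c, s = math.cos(theta), math.sin(theta)
    total, t = 0.0, -1.5
    while t <= 1.5:
        xi, eta = r * c - t * s, r * s + t * c   # point on the line, offset t
        total += alpha(xi, eta) * d
        t += d
    return total

p_center = p(0.0, 0.0)   # chord through the center: ~1.0 (full diameter)
p_off = p(0.3, 1.0)      # off-center chord: ~2*sqrt(0.25 - 0.09) = 0.8
```

For a disk phantom p(r, θ) is independent of θ, which makes it a convenient sanity check before tackling the Fourier-based inversion mentioned above.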
Figure 1-24 (a) CT scanner, (b) X-ray path along x, and (c) X-ray path along arbitrary direction.

1-4 Magnetic Resonance Imaging

Since its early demonstration in the 1970s, magnetic resonance imaging (MRI) has become a highly valuable tool in diagnostic radiology, primarily because it can generate high-resolution anatomical images of the human body, without exposing the patient to ionizing radiation. Like X-ray CT scanners, magnetic resonance (MR) imagers can generate 3-D images of the body part of interest, from which 2-D slices can be extracted along any orientation of interest. The name MRI derives from the fact that the MRI scanner measures nuclear magnetic resonance (NMR) signals emitted by the body's tissues and blood vessels in response to excitation by a magnetic field introduced by a radio frequency (RF) system.
1-4.1 Basic System Configuration
The MRI system shown in Fig. 1-25 depicts a human body lying

Figure 1-25 Basic diagram of an MRI system.

Figure 1-26 B0 is static and approximately uniform within the cavity. Inside the cavity, B0 ≈ 1.5 T (teslas), compared with only 0.1 to 0.5 milliteslas outside.

inside a magnetic core. The magnetic field at a given location (x, y, z) within the core and at a given instant in time t may consist of up to three magnetic field contributions:

B = B0 + BG + BRF,

where B0 is a static field, BG is the field gradient, and BRF is the radio frequency (RF) excitation used to solicit a response from the biological material placed inside the core volume. Each of these three components plays a critical role in making MRI possible, so we will discuss them individually.

Figure 1-27 Nuclei with spin magnetic number of ±1/2 precessing about B0 at the Larmor angular frequency ω0.

A. Static Field B0

Field B0 is a strong, static (non–time varying) magnetic field
created by a magnet designed to generate a uniform (constant) distribution throughout the magnetic core (Fig. 1-26). Usually, a superconducting magnet is used for this purpose because it can generate magnetic fields with much higher magnitudes than can be realized with resistive and permanent magnets. The direction of B0 is longitudinal (ẑ direction in Fig. 1-26) and its magnitude is typically on the order of 1.5 teslas (T). The conversion factor between teslas and gauss is 1 T = 10⁴ gauss. Earth's magnetic field is on the order of 0.5 gauss, so B0 inside the MRI core is on the order of 30,000 times that of Earth's magnetic field.

Biological tissue is composed of chemical compounds, and each compound is organized around the nuclei (protons) of the atoms comprising that compound. Some, but not all, nuclei become magnetized when exposed to a magnetic field. Among the substances found in a biological material, the hydrogen nucleus has a strong susceptibility to magnetization, and hydrogen is highly abundant in biological tissue. For these reasons, a typical MR image is related to the concentration of hydrogen nuclei.

The strong magnetic field B0 causes the nuclei of the material inside the core space to temporarily magnetize and to spin (precess) like a top about the direction of B0. The precession orientation angle θ, shown in Fig. 1-27, is determined by the spin quantum number I of the spinning nucleus and the magnetic

quantum number mI. A material, such as hydrogen, with a spin system of I = 1/2 has magnetic quantum numbers mI = ±1/2. Hence, the nucleus may spin along two possible directions defined by cos θ = mI/√(I(I + 1)), which yields θ = ±54°44′ [Liang and Lauterbur, 2000].

Table 1-2 Gyromagnetic ratio γ– for biological nuclei.

Isotope    Spin I    % Abundance    γ– (MHz/T)
1H         1/2       99.985         42.575
13C        1/2       1.108          10.71
14N        1         99.63          3.078
17O        5/2       0.037          5.77
19F        1/2       100            40.08
23Na       3/2       100            11.27
31P        1/2       100            17.25

Figure 1-28 Magnetic coils used to generate magnetic fields along three orthogonal directions. All three gradient fields BG point along ẑ, but their intensities vary linearly along x, y, and z.
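Table 1-2 turns directly into operating frequencies through the Larmor relation f0 = γ–B0, which is introduced as Eq. (1.19b) in the discussion that follows. The Python sketch below is illustrative only; the dictionary holds a subset of the table's values.

```python
# Larmor frequencies from Table 1-2 via f0 = gamma_bar * B0 (Eq. (1.19b)).
# Illustrative sketch; gamma_bar values copied from the table, in MHz/T.

gamma_bar_mhz_per_t = {"1H": 42.575, "13C": 10.71, "19F": 40.08, "23Na": 11.27}

def larmor_mhz(isotope, B0_tesla):
    """Larmor frequency in MHz for the given isotope in a field B0 (teslas)."""
    return gamma_bar_mhz_per_t[isotope] * B0_tesla

f_h_15 = larmor_mhz("1H", 1.5)     # 63.8625 MHz, as quoted for hydrogen
f_low = larmor_mhz("1H", 1.495)    # ~63.650 MHz (low end of the gradient example)
f_high = larmor_mhz("1H", 1.505)   # ~64.075 MHz (high end of the gradient example)
```

The last two lines anticipate the gradient-field example discussed below, in which a ±5 mT field variation across the core maps each horizontal plane to its own resonance frequency.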
The associated angular frequency of the nuclear precession is called the Larmor frequency and is given by

ω0 = γ B0    (Larmor angular frequency),    (1.19a)

with ω0 in rad/s, B0 in teslas (T), and γ, the gyromagnetic ratio of the material, in (rad/s)/T. Alternatively, we can express the precession in terms of the frequency f0 = ω0/2π, in which case Eq. (1.19a) assumes the equivalent form

f0 = γ–B0    (Larmor frequency),    (1.19b)

where γ– = γ/2π. This fundamental relationship between f0 and B0 is at the heart of what makes magnetic resonance imaging possible. Table 1-2 provides a list of nuclei of biological interest that have nonzero spin quantum numbers, along with their corresponding gyromagnetic ratios. For the hydrogen isotope 1H, γ– = 42.575 MHz/T, so the Larmor frequency for hydrogen at B0 = 1.5 T is f0 = 63.8625 MHz, which places it in the RF part of the EM spectrum. Since the human body is made up primarily of water, the most commonly imaged nucleus is hydrogen.

B. Gradient Field BG

The MRI system includes three current-activated gradient coils (Fig. 1-28) designed to generate magnetic fields pointed along the ẑ direction—the same as B0, but whose magnitudes exhibit linear spatial variations along the x̂, ŷ, and ẑ directions. That is why they are called gradient fields. The three coils can be activated singly or in combination. The primary purpose of the gradient magnetic field is localization (in addition to other information that can be extracted about the tissue material contained in the core volume during the activation and deactivation cycles of the gradient fields). Ideally, the gradient fields assume the following spatial variation inside the core volume:

BG = (Gx x + Gy y + Gz z)ẑ,

with the center of the (x, y, z) coordinate system placed at the center of the core volume. The gradient coefficients Gx, Gy, and Gz are on the order of 10 mT/m, and they are controlled individually by the three gradient coils.

Let us consider an example in which Gx = Gz = 0 and Gy = 10 mT/m, and let us assume that the vertical dimension of the core volume is 1 m. If B0 = 1.5 T, the combined field will vary from

B = B0 + Gy y = 1.495 T at y = −1/2 m, to 1.505 T at y = +1/2 m,

as depicted in Fig. 1-29. By Eq. (1.19b), the corresponding Larmor frequency for hydrogen will vary from 63.650 MHz for hydrogen nuclei residing in the plane at y = −0.5 m to 64.075 MHz for nuclei residing in the plane at y = +0.5 m. As we will explain shortly, when an RF signal at a particular frequency fRF is introduced inside the core volume, those nuclei whose Larmor frequency f0 is the same as fRF will resonate by absorbing part of the RF energy and then reemitting it at the same frequency (or slightly shifted in the case of certain chemical reactions). The strength of the emitted response is proportional to the density of nuclei. By varying the total magnetic field B linearly along

Figure 1-29 Imposing a gradient field that varies linearly with y allows stratification into thin slices, each characterized by its own Larmor frequency.

the vertical direction, the total core volume can be discretized into horizontal layers called slices, each corresponding to a different value of f0 (Fig. 1-29). This way, the RF signal can communicate with each slice separately by selecting the RF frequency to match f0 of that slice. In practice, instead of sending a sequence of RF signals at different frequencies, the RF transmitter sends out a short pulse whose frequency spectrum covers the frequency range of interest for all the slices in the volume, and then a Fourier transformation is applied to the response from the biological tissue to separate the responses from the individual slices.

The gradient magnetic field along the ŷ direction allows discretization of the volume into x–y slices. A similar process can be applied to generate x–z and y–z slices, and the combination is used to divide the total volume into a three-dimensional matrix of voxels (volume pixels). The voxel size defines the spatial resolution capability of the MRI system.

C. RF System

The combination of the strong static field B0 and the gradient field BG (whose amplitude is on the order of less than 1% of B0) defines a specific Larmor frequency for the nuclei of every isotope within each voxel. As we noted earlier through Table 1-2, at B0 intensities in the 1 T range, the Larmor frequencies of common isotopes are in the MHz range. The RF system consists of a transmitter and a receiver connected to separate coils, or the same coil can be used for both functions. The transmitter generates a burst of narrow RF pulses. In practice, many different pulse configurations are used, depending on the intended application [Liang and Lauterbur, 2000]. The magnetic field of the transmitted energy causes the exposed biological tissue to resonate at its Larmor frequency. With the transmitter off, the receiver picks up the resonant signals emitted by the biological tissue. The received signals are Fourier transformed so as to establish a one-to-one correspondence to the locations of the voxels responsible for the emission. For each voxel, the strength of the associated emission is related to the density of 1H nuclei in that voxel as well as to other parameters that depend on the tissue properties and pulse timing.

1-4.2 Point Spread Function

Generating the MR image involves applying the discrete form of the Fourier transform. Accordingly, the point spread function of the MR image is given by a discrete form of the sinc function, namely [Liang and Lauterbur, 2000]:

hx(x) = ∆k sin(πN ∆k x)/sin(π ∆k x),    (1.20)

where x is one of the two MR image coordinates, k is a spatial frequency, ∆k is the sampling interval in k space, and N is the total number of Fourier samples. A similar expression applies to hy(y). The spatial resolution of the MR image is equal to the equivalent width of hx(x), which can be computed as follows:

∆xmin = (1/hx(0)) ∫ from −1/(2∆k) to 1/(2∆k) of hx(x) dx = 1/(N ∆k).    (1.21)

The integration was performed over one period (1/∆k) of hx(x). According to Eq. (1.21), the image resolution is inversely proportional to the product N ∆k. The choices of values for N and ∆k are associated with signal-to-noise ratio and scan time considerations.

1-4.3 MRI-Derived Information

Generally speaking, MRI can provide three types of information about the imaged tissue:

(a) The magnetic characteristics of tissues, which are related to biological attributes and blood vessel conditions.

(b) Blood flow, made possible through special time-dependent gradient excitations.

(c) Chemical properties discerned from measurements of small shifts in the Larmor frequency.
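The equivalent-width result of Eq. (1.21) can be verified numerically. In the illustrative Python sketch below, N and ∆k are assumed values (N is chosen odd so that the one-period integral of Eq. (1.20) is analytically exact); a midpoint-rule integration of hx(x) over one period recovers 1/(N ∆k).

```python
import math

N = 63      # number of Fourier samples (assumed value)
dk = 2.0    # k-space sampling interval (assumed value)

def h_x(x):
    """Periodic-sinc PSF of Eq. (1.20); its x -> 0 limit equals N*dk."""
    denom = math.sin(math.pi * dk * x)
    if abs(denom) < 1e-12:
        return N * dk
    return dk * math.sin(math.pi * N * dk * x) / denom

# Midpoint-rule integral of h_x over one period of length 1/dk.
M = 100000
period = 1.0 / dk
step = period / M
area = sum(h_x(-period / 2 + (m + 0.5) * step) for m in range(M)) * step

width = area / h_x(0.0)   # equivalent width; Eq. (1.21) predicts 1/(N*dk)
```

Doubling N (i.e., acquiring twice as many k-space samples) halves the equivalent width, which is the resolution–scan-time trade-off noted in the text.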

Figure 1-30 MR image.

An example of an MR image is shown in Fig. 1-30.

Concept Question 1-8: An MRI system uses three different types of magnetic fields. For what purpose?

Concept Question 1-9: What determines the Larmor frequency of a particular biological material?

1-5 Ultrasound Imager

Human hearing extends up to 20 kHz. Ultrasound is defined as sound at frequencies above that range. Ultrasound imaging systems, which operate in the 2 to 20 MHz range, have numerous industrial and medical applications, and the latter include both diagnosis and therapy. Fundamentally, ultrasound imagers are similar to radar imagers in that both sensors employ phase shifting (or, equivalently, time delaying) to focus and steer their beams at the desired distances and along the desired directions. Imaging radars use 1-D or 2-D arrays of antennas, and likewise, ultrasound imagers use 1-D or 2-D arrays of transducers. However, electromagnetic waves and sound waves have different propagation properties, so the focusing and steering techniques are not quite identical.

1-5.1 Ultrasound System Architecture

Ultrasound imagers use both 1-D and 2-D transducer arrays (with some as large as 2000 × 8000 elements and each on the order of 5 µm × 5 µm in size), but for the sake of simplicity, we show in Fig. 1-31 only a 1-D array with four elements. The system has a transmitting unit and a receiving unit, with the transducer array connected to the two units through a transmit/receive switch. Thus, the array serves to both launch acoustic waves in response to electrical excitation as well as to receive the consequent acoustic echoes and convert them back into electrical signals. The echoes are reflections from organs and tissue underneath the skin of the body part getting imaged by the ultrasound imager (Fig. 1-32).

The transmitter unit in Fig. 1-31, often called the pulser, generates a high-voltage short-duration pulse (on the order of a few microseconds in duration) and sends it to the transmit beamforming unit, which applies individual time delays to the pulse before passing it on to the transducers through the transmit/receive switch. The choice of time delays determines the range at which the acoustic waves emitted by the four transducers interfere constructively, as well as the direction of that location relative to the axis of the array. The range operation is called focusing and the directional operation is called steering. Reciprocal operations are performed by the receive beamforming unit; it applies the necessary time delays to the individual signals made available by the transducers and then combines them together coherently to generate the receive echo. The focusing and steering operations are the subject of the next subsection.

1-5.2 Beam Focusing and Steering

The focusing operation is illustrated by the diagram in Fig. 1-33 using eight transducers. In response to the electrical stimulations introduced by the beamforming unit, all of the transducers generate outward-going acoustic waves that are identical in every respect except for their phases (time delays). The specific distribution of the time delays shown in Fig. 1-33(a) causes the eight acoustic waves to interfere constructively at the point labeled Focus 1 at range Rf1. The time-delay distribution is symmetrical relative to the center of the array, so the direction of Focus 1 is broadside to the array axis. Changing the delay shifts between adjacent elements, while keeping the distribution symmetrical, as in Fig. 1-33(b), causes the focal point to move to Focus 2 at range Rf2. If no time delay is applied to any of the eight transducer signals, the focal point moves to infinity.

◮ The combined beam of the transducer array can be focused as a function of depth by varying the incremental time delay between adjacent elements in a symmetrical time-delay distribution. ◭
24 CHAPTER 1 IMAGING SENSORS

Figure 1-31 Block diagram of an ultrasound system with a 4-transducer array, comprising a transmitter, a transmit beamforming unit (time-delay generator), a T/R switch feeding the transducers, a receive beamforming unit (time-delay generator), a data acquisition unit, a system processor, and a display.

The image displayed in Fig. 1-34 is a simulation of acoustic


energy across two dimensions, the lateral dimension parallel to
the array axis and the axial dimension along the range direction.
The array consists of 96 elements extending over a length of
20 mm, and the beam is focused at a range R = 40 mm.
The delay-time distribution shown in Fig. 1-35 is symmetrical
relative to the broadside direction. By shifting the axis of
symmetry to another direction, the focal point moves to a new
direction. This is called steering the beam, and is illustrated in
Fig. 1-35 and simulated in part (b) of Fig. 1-34.
With a 2-D array of transducers, the steering can be realized
along two orthogonal directions, so the combination of focusing
and steering ends up concentrating the acoustic energy radiated
by the transducer array into a small voxel within the body being imaged by the ultrasound probe. Similar operations are
performed by the receive beamforming unit so as to focus and
steer the beam of the transducer array to receive the echo from
the same voxel.
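The delay computation behind this focusing operation is straightforward to sketch numerically. The following Python fragment is an illustration only (the eight-element array, 0.25 mm pitch, and 40 mm focal range are assumed values, not specifications from the text): each element is delayed so that all wavefronts reach a broadside focal point at the same instant.

```python
import numpy as np

def focusing_delays(n_elem, pitch, focus_range, v=1540.0):
    """Per-element transmit delays (s) that make all wavefronts arrive
    at a broadside focal point simultaneously."""
    # Element x coordinates, centered on the array axis
    x = (np.arange(n_elem) - (n_elem - 1) / 2) * pitch
    # Propagation distance from each element to the focal point
    r = np.sqrt(x**2 + focus_range**2)
    # Outermost elements (longest path) fire first (zero delay); the
    # center elements (shortest path) are delayed the most, giving the
    # symmetrical distribution of Fig. 1-33(a)
    return (r.max() - r) / v

tau = focusing_delays(n_elem=8, pitch=2.5e-4, focus_range=40e-3)
```

Shifting the axis of symmetry of this distribution (adding a linear ramp across the elements) steers the focal point away from broadside, as described for Fig. 1-35.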

Figure 1-32 Ultrasound imaging of the thyroid gland.



Figure 1-33 Changing the inter-element time delay across a symmetrical time-delay distribution shifts the location of the focal point in the range direction: (a) focal point at Rf1 (Focus 1, along the broadside direction); (b) focal point at Rf2 (Focus 2).

1-5.3 Spatial Resolution

For a 2-D transducer array of size (Lx × Ly) and focused at range Rf, as shown in Fig. 1-36 (side Ly is not shown in the figure), the size of the resolution voxel is given by an axial resolution ∆Rmin along the range direction and by lateral resolutions ∆xmin and ∆ymin along the two lateral directions. The axial resolution is given by

∆Rmin = λN/2   (axial resolution),   (1.22)

where λ is the wavelength of the pulse in the material in which the acoustic waves are propagating and N is the number of cycles in the pulse. The wavelength is related to the signal frequency by

λ = v/f,   (1.23)

where v is the wave velocity and f is the frequency. In biological tissue, v ≈ 1540 m/s. For an ultrasound system operating at f = 5 MHz and generating pulses with N = 2 cycles per pulse,

∆Rmin = vN/(2f) = (1540 × 2)/(2 × 5 × 10⁶) ≈ 0.3 mm.

Figure 1-34 Simulations of acoustic energy distribution (lateral distance versus axial distance, both in mm) for (a) a beam focused at Rf = 40 mm by a 96-element array, with no steering, and (b) a beam focused and steered by 45°.

Figure 1-36 Axial resolution ∆Rmin and lateral resolution ∆xmin for a transducer array of length Lx focused at range Rf.

The lateral resolution ∆xmin is given by

∆xmin = λ Rf / Lx = Rf v / (Lx f)   (lateral resolution),   (1.24)

where Rf is the focal length (range at which the beam is focused). If the beam is focused at Rf = 5 cm, the array length Lx = 4 cm, and f = 5 MHz, then

∆xmin = (5 × 10⁻² × 1540) / (4 × 10⁻² × 5 × 10⁶) ≈ 0.4 mm,

Figure 1-35 Beam focusing and steering are realized by shaping the time-delay distribution: (a) uniform distribution; (b) linear shift; (c) non-uniform symmetrical distribution; (d) nonlinear shift and non-uniform distribution.
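The resolution formulas of Eqs. (1.22)-(1.24) are easy to evaluate numerically. The sketch below is plain Python written for this discussion (not code from the book); it reproduces the two worked examples above and the voxel dimensions of Exercise 1-4.

```python
V_TISSUE = 1540.0  # acoustic wave velocity in biological tissue (m/s)

def axial_resolution(f, n_cycles, v=V_TISSUE):
    """Eq. (1.22) with Eq. (1.23): dR_min = lambda*N/2, lambda = v/f."""
    return (v / f) * n_cycles / 2

def lateral_resolution(f, focus_range, aperture, v=V_TISSUE):
    """Eq. (1.24): dx_min = lambda*Rf/Lx = Rf*v/(Lx*f)."""
    return focus_range * v / (aperture * f)

# Worked examples in the text: f = 5 MHz, N = 2 -> ~0.3 mm axial;
# Rf = 5 cm, Lx = 4 cm -> ~0.4 mm lateral
dR = axial_resolution(f=5e6, n_cycles=2)
dx = lateral_resolution(f=5e6, focus_range=5e-2, aperture=4e-2)

# Exercise 1-4: f = 6 MHz, N = 2, Lx = Ly = 5 cm, Rf = 8 cm
dR4 = axial_resolution(f=6e6, n_cycles=2)
dx4 = lateral_resolution(f=6e6, focus_range=8e-2, aperture=5e-2)
voxel = dR4 * dx4 * dx4  # Eq. (1.25): dV = dR_min * dx_min * dy_min
```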

which is comparable with the magnitude of the axial resolution ∆Rmin. The resolution along the orthogonal lateral direction, ∆ymin, is given by Eq. (1.24) with Lx replaced with Ly. The size of the resolvable voxel is

∆V = ∆Rmin × ∆xmin × ∆ymin.   (1.25)

Figure 1-37 displays an ultrasound image of a fetus.

Figure 1-37 Ultrasound image of a fetus.

Concept Question 1-10: How does an ultrasound imager focus its beam in the range direction and in the lateral direction? There are two orthogonal lateral directions, so how is that managed?

Exercise 1-4: A 6 MHz ultrasound system generates pulses with 2 cycles per pulse using a 5 cm × 5 cm 2-D transducer array. What are the dimensions of its resolvable voxel when focused at a range of 8 cm in a biological material?

Answer: ∆V = ∆Rmin × ∆xmin × ∆ymin = 0.26 mm × 0.41 mm × 0.41 mm. (See IP)

1-6 Coming Attractions

Through examples of image processing products, this section presents images extracted from various sections in the book. In each case, we present a transformed image, along with a reference to its location within the text.

1-6.1 Image Warping by Interpolation

Section 4-10 demonstrates how an image can be warped by nonlinear shifts of its pixel locations and then interpolating the results to generate a smooth image. An example is shown in Fig. 1-38.

Figure 1-38 Original clown image (a) and nonlinearly warped product (b). [Extracted from Figs. 4-14 and 4-17.]

1-6.2 Image Sharpening by Highpass Filtering

Section 5-2 illustrates how an image can be sharpened by spatial highpass filtering, and how this amplifies noise in the image. The original image of an electronic circuit and its highpass-filtered version are displayed in Fig. 1-39.

Figure 1-39 Image of electronic circuit before and after application of a highpass sharpening filter: (a) original image; (b) sharpened image. [Extracted from Fig. 5-6.]

1-6.3 Brightening by Histogram Equalization

Histogram equalization (nonlinear transformation of pixel values) can be used to brighten an image, as illustrated by the pair of images in Fig. 1-40.

Figure 1-40 Application of histogram equalization to the dark clown image in (a) leads to the brighter image in (b). [Extracted from Fig. 5-8.]
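The underlying operation is compact enough to sketch here. The following Python fragment is one common formulation of histogram equalization (an illustration written for this preview, not the book's implementation; the 100 × 100 random dark image is a stand-in for real data): each pixel is remapped through the image's normalized cumulative histogram.

```python
import numpy as np

def equalize(img):
    """Histogram equalization of an 8-bit grayscale image: remap each
    pixel value through the normalized cumulative histogram (CDF)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size           # monotone map into [0, 1]
    return (255 * cdf[img]).astype(np.uint8)

dark = np.random.randint(0, 64, (100, 100)).astype(np.uint8)  # dark image
bright = equalize(dark)                        # values spread over 0..255
```

Because the CDF is monotone, the remapping preserves the ordering of gray levels while stretching the occupied range, which is what brightens a dark image.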

1-6.4 Edge Detection

Edges can be enhanced in an image by applying edge detection algorithms, such as the Canny edge detector that was applied to the image in Fig. 1-41(a).

Figure 1-41 Original clown image (a) and its Canny edge-detected version (b). [Extracted from Fig. 5-16.]

1-6.5 Notch Filtering

Notch filtering can be used to remove sinusoidal interference from an image. The original Mariner space probe image and its notch-filtered version are shown in Fig. 1-42.

Figure 1-42 The horizontal lines in the original Mariner image (a) are due to sinusoidal interference in the recorded image. The lines were removed by applying notch filtering, with the result in (b). [Extracted from Fig. 6-7.]

1-6.6 Motion-Blur Deconvolution

Section 6-6 shows how to deblur a motion-blurred image. A blurred image and its motion-deblurred version are shown in Fig. 1-43.

Figure 1-43 Motion blurring is removed: (a) original motion-blurred image caused by taking a photograph in a moving car; (b) motion-deblurred image. [Extracted from Fig. 6-11.]

1-6.7 Denoising of Images Using Wavelets

One method used to denoise an image is by thresholding and shrinking its wavelet transform. An example is shown in Fig. 1-44.

Figure 1-44 Image denoising: (a) a noisy clown image; (b) wavelet-denoised clown image. [Extracted from Fig. 7-21.]
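The idea behind motion-blur deconvolution can be shown in one dimension. The sketch below is illustrative Python under the assumptions of a known 9-sample boxcar blur and a circular convolution model (Section 6-6 treats the real 2-D problem): the blurred signal is deblurred by regularized division in the frequency domain.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(256)                  # "scene" (1-D stand-in for an image row)
psf = np.zeros(256)
psf[:9] = 1 / 9                      # motion blur: 9-sample moving average

X, H = np.fft.fft(x), np.fft.fft(psf)
y = np.fft.ifft(X * H).real          # blurred observation (circular model)

eps = 1e-3                           # regularization where |H| is nearly 0
x_hat = np.fft.ifft(np.fft.fft(y) * np.conj(H) / (np.abs(H)**2 + eps)).real
```

Dividing by H alone would blow up near the zeros of H; the small constant eps trades a little residual blur for stability, the same difficulty that motivates the stochastic image models of Chapter 9.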

1-6.8 Image Inpainting

The wavelet transform of an image can be used to "inpaint" (restore missing pixels in) an image. An image with deleted pixels and its inpainted version are shown in Fig. 1-45.

Figure 1-45 The image in (b) was created by "filling in" values for the missing pixels in the image in (a). [Extracted from Fig. 7-23.]

1-6.9 Deconvolution Using Deterministic and Stochastic Image Models

Chapters 8 and 9 provide reviews of probability and estimation. The reason for these reviews is that using a stochastic image model, in which the 2-D power spectral density is modeled as that of a fractal image, can yield much better results in refocusing an image than is possible with deterministic models. An example is shown in Fig. 1-46.

Figure 1-46 Two methods used for refocusing an unfocused image: (a) unfocused MRI image; (b) deterministically refocused MRI image; (c) stochastically refocused MRI image. [Extracted from Fig. 9-4.]

1-6.10 Markov Random Fields for Image Segmentation

In a Markov random field (MRF) image model, the value of each pixel is stochastically related to its surrounding values. This is useful in segmenting images, as presented in Section 9-11. Figure 1-47 illustrates how an MRF image model can improve the segmentation of an X-ray image of a foot into tissue and bone.

Figure 1-47 Segmenting a noisy image into two distinct classes, bone and tissue: (a) noisy image; (b) segmented image. [Extracted from Fig. 9-14.]

1-6.11 Motion-Deblurring of a Color Image

Chapters 1–9 consider grayscale (black-and-white) images, since color images consist of three (red, green, blue) images. Motion deblurring of a color image is presented in Section 10-3, an example of which is shown in Fig. 1-48.

Figure 1-48 Deblurring a motion-blurred color image: (a) motion-blurred Christmas tree; (b) deblurred Christmas tree. [Extracted from Fig. 10-11.]

1-6.12 Wavelet-Based Denoising of a Color Image

Wavelet-based denoising can be used on each color component of a color image, as presented in Section 10-4. An illustration is given in Fig. 1-49.

Figure 1-49 Denoising an image of the American flag using wavelet-based denoising: (a) noisy flag image; (b) denoised image. [Extracted from Fig. 10-12.]

1-6.13 Histogram Equalization (Brightening) of a Color Image

Color images can be brightened using histogram equalization, as presented in Section 10-5. Figure 1-50(a) displays a dark image of a toucan, and part (b) of the same figure displays the result of applying histogram equalization to each color.

Figure 1-50 Application of histogram equalization to a color image: (a) original dark toucan image; (b) brightened image. [Extracted from Figs. 10-13 and 10-14.]

1-6.14 Unsupervised Learning

In unsupervised learning, a set of training images is used to determine a set of reference images, which are then used to classify an observed image. The training images are mapped to a subspace spanned by the most significant singular vectors of the singular value decomposition of a training matrix. In Fig. 1-51, the training images, depicted by blue symbols, cluster into different image classes.

Figure 1-51 Depiction of training images in the 2-D subspace (u1, u2), clustering into Classes 1, 2, and 3. [Extracted from Fig. 11-13.]

1-6.15 Supervised Learning

Supervised learning by neural networks is presented in Chapter 12. An example of a neural network is shown in Fig. 1-52.

Figure 1-52 A multilayer neural network, with 784 input terminals, a hidden layer, and an output layer (digits 0–9).
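The subspace mapping of Fig. 1-51 can be sketched with a singular value decomposition. In the fragment below (illustrative Python; the 784 × 20 random training matrix is a placeholder for real vectorized training images), each image is represented by its coefficients along the two most significant left singular vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((784, 20))            # 20 vectorized "training images"

# Thin SVD: columns of U are the singular vectors of the training matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# 2-D subspace coordinates (u1, u2) of each training image
coords = U[:, :2].T @ A
```

With real training data, plotting the columns of coords reproduces a scatter like Fig. 1-51, in which images of the same class cluster together.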

Summary

Concepts
• Images may be formed using any of these imaging modalities: optical, infrared, radar, x-rays, ultrasound, and magnetic resonance imaging.
• Image processing is needed to process a raw image, formed directly from data, into a final image, which has been deblurred, denoised, interpolated, or enhanced, all of which are subjects of this book.
• Color images are actually triplets of red, green, and blue images, displayed together.
• The effect of an image acquisition system on an image can usually be modelled as 2-D convolution with the point spread function of the system (see below).
• The resolution of an image acquisition system can be computed using various formulae (see below).

Mathematical Formulae
• Lens law: 1/d0 + 1/di = 1/f
• Optical point spread function: h(θ) = [2 J1(γ)/γ]², γ = (πD/λ) sin θ
• Optical resolution: ∆θmin ≈ 1.22 λ/D
• SAR point spread function: h(x, y) = e^(−2.77(x/τ)²) sinc²(1.8y/l)
• Radar resolution: ∆y′min ≈ R λ/D
• X-ray tomography path attenuation: p(r, θ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} a(ξ, η) δ(r − ξ cos θ − η sin θ) dξ dη
• Ultrasound resolution: ∆Rmin = λN/2
• MRI point spread function: hx(x) = ∆k sin(πN ∆k x)/sin(π ∆k x)
• 2-D convolution: Ii(x, y) = Io(x, y) ∗∗ h(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} Io(x − x′, y − y′) h(x′, y′) dx′ dy′

Important Terms: Provide definitions or explain the meaning of the following terms: active pixel sensor, beamforming, charge-coupled device, infrared imaging, liquid crystal display, magnetic resonance imaging (MRI), optical imaging, point spread function, radar, resolution, synthetic-aperture radar, ultrasound imaging, X-ray computed tomography.

PROBLEMS

Section 1-1: Optical Imagers

1.1 An imaging lens in a digital camera has a focal length of 6 cm. How far should the lens be from the camera's CCD array to focus on an object
(a) 12 cm in front of the lens?
(b) 15 cm in front of the lens?

1.2 An imaging lens in a digital camera has a focal length of 4 cm. How far should the lens be from the camera's CCD array to focus on an object
(a) 12 cm in front of the lens?
(b) 8 cm in front of the lens?

1.3 The following program loads an image stored in clown.mat as Io(x, y), passes it through an imaging system with the PSF given by Eq. (1.6), and displays Io(x, y) and Ii(x, y). Parameters ∆, D, di, and λ (all in mm) are specified in the program's first line.

clear;Delta=0.0002;D=0.03;
lambda=0.0000005;di=0.003;
T=round(0.01/Delta);
for I=1:T;for J=1:T;
x2y2(I,J)=(I-T/2).*(I-T/2)+(J-T/2).*(J-T/2);end;end;
gamma=pi*D/lambda*sqrt(x2y2./(x2y2+di*di/Delta/Delta));
h=2*besselj(1,gamma)./gamma;
h(T/2,T/2)=(h(T/2+1,T/2)+h(T/2-1,T/2)+h(T/2,T/2+1)+h(T/2,T/2-1))/4;
h=h.*h;H=h(T/2-5:T/2+5,T/2-5:T/2+5);
load clown.mat;Y=conv2(X,H);
figure,imagesc(X),axis off,colormap(gray),
figure,imagesc(Y),axis off,colormap(gray)

Run the program and display Io(x, y) (input) and Ii(x, y) (output).

Section 1-2: Radar Imagers

1.4 Compare the azimuth resolution of a real-aperture radar with that of a synthetic-aperture radar, with both pointed at the ground from an aircraft at a range R = 5 km. Both systems operate at λ = 3 cm and utilize a 2-m-long antenna.

1.5 A 2-m-long antenna is used to form a synthetic-aperture radar from a range of 100 km. What is the length of the synthetic aperture?

1.6 The following program loads an image stored in sar.mat as Io(x, y), passes it through an imaging system with the PSF given by Eq. (1.15), and displays Io(x, y) and Ii(x, y). Parameters ∆, τ, and l are specified in the program's first line.

clear;Delta=0.1;l=5;tau=1;I=[-15:15];
z=pi*1.8*Delta*I/l;load sar.mat;
hy=sin(pi*z)./(pi*z);hy(16)=1;hy=hy.*hy;
hx=exp(-2.77*Delta*Delta*I.*I/tau/tau);
H=hy'*hx;Y=conv2(X,H);
figure,imagesc(X),axis off,colormap(gray),
figure,imagesc(Y),axis off,colormap(gray)

Run the program and display Io(x, y) (input) and Ii(x, y) (output).

Section 1-3: X-Ray Computed Tomography (CT)

1.7 (This problem assumes prior knowledge of the 1-D Fourier transform (FT).) The basic CT problem is to reconstruct α(ξ, η) in Eq. (1.18) from p(r, θ). One way to do this is as follows:
(a) Take the FT of Eq. (1.18), transforming r to f. Define p(−r, θ) = p(r, θ + π).
(b) Define and substitute µ = f cos θ and ν = f sin θ in this FT.
(c) Show that the result defines 2 FTs, transforming ξ to µ and η to ν, and that A(µ, ν) = P(f, θ). Hence, α(ξ, η) is the inverse FT of P(f, θ).

Section 1-4: Magnetic Resonance Imaging

1.8 The following program loads an image stored in mri.mat as Io(x, y), passes it through an imaging system with the PSF given by Eq. (1.20), and displays Io(x, y) and Ii(x, y). Parameters ∆, N, and dk are specified in the program's first line.

clear;N=16;Delta=0.01;dk=1;
I=[-60:60];load mri.mat;
h=dk*sin(pi*N*dk*I*Delta)./sin(pi*dk*I*Delta);
h(61)=N;H=h'*h;Y=conv2(X,H);
figure,imagesc(X),axis off,colormap(gray),
figure,imagesc(Y),axis off,colormap(gray)

Run the program and display Io(x, y) (input) and Ii(x, y) (output).

Section 1-5: Ultrasound Imager

1.9 This problem shows how beamforming works on a linear


array of transducers, as illustrated in Fig. 1-35, in a medium
with a wave speed of 1540 m/s. We are given a linear array
of transducers located 1.54 cm apart along the x axis, with
the nth transducer located at x = 1.54n cm. Outputs {yn (t)}
from the transducers are delayed and summed to produce the
signal y(t) = ∑n yn (t − 0.05n). In what direction (angle from
perpendicular to the array) is the array focused?
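Problem 1.9 turns on the relation between the inter-element delay increment and the steering direction: a delay-and-sum beamformer with element spacing d and delay increment ∆t reinforces plane waves arriving from the angle θ satisfying sin θ = v∆t/d. The helper below is an illustrative Python sketch with assumed numbers, not the problem's solution:

```python
import math

def steering_angle(d, dt, v=1540.0):
    """Angle from broadside (degrees) at which a delay-and-sum
    beamformer with element spacing d (m) and inter-element delay
    increment dt (s) is steered: sin(theta) = v * dt / d."""
    s = v * dt / d
    if abs(s) > 1:
        raise ValueError("v*dt/d must lie in [-1, 1]")
    return math.degrees(math.asin(s))

# Illustrative numbers (not those of Problem 1.9): d = 1.54 cm, dt = 5 us
theta = steering_angle(d=1.54e-2, dt=5e-6)   # sin(theta) = 0.5 -> 30 degrees
```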
Chapter 2
Review of 1-D Signals and Systems

Contents
Overview, 39
2-1 Review of 1-D Continuous-Time Signals, 41
2-2 Review of 1-D Continuous-Time Systems, 43
2-3 1-D Fourier Transforms, 47
2-4 The Sampling Theorem, 53
2-5 Review of 1-D Discrete-Time Signals and Systems, 59
2-6 Discrete-Time Fourier Transform (DTFT), 66
2-7 Discrete Fourier Transform (DFT), 70
2-8 Fast Fourier Transform (FFT), 76
2-9 Deconvolution Using the DFT, 80
2-10 Computation of Continuous-Time Fourier Transform (CTFT) Using the DFT, 82
Problems, 86

Objectives
Learn to:
■ Compute the response of an LTI system to a given input using convolution.
■ Compute the frequency response (response to a sinusoidal input) of an LTI system.
■ Compute the continuous-time Fourier transform of a signal or impulse response.
■ Use the sampling theorem to convert a continuous-time signal to a discrete-time signal.
■ Perform the three tasks listed above for continuous-time signals on discrete-time signals.
■ Use the discrete Fourier transform (DFT) to denoise, filter, and deconvolve signals.

Figure: (a) x(t), a trumpet signal plotted versus t (ms); (b) |X(f)|, the magnitude spectrum of the trumpet signal plotted versus f (Hz).

Many techniques and transforms in image processing are direct generalizations of techniques and transforms in 1-D signal processing. Reviewing these 1-D concepts enhances the understanding of their 2-D counterparts. These include: linear time-invariant (LTI) systems, convolution, frequency response, filtering, Fourier transforms for continuous and discrete-time signals, and the sampling theorem. This chapter reviews these 1-D concepts for generalization to their 2-D counterparts in Chapter 3.
Overview
Some topics and concepts in 1-D signals and systems generalize
directly to 2-D. This chapter provides quick reviews of those
topics and concepts in 1-D so as to simplify their repeat presen-
tation in 2-D in future chapters. We assume the reader is already
familiar with 1-D signals and systems∗ —in both continuous and
discrete time, so the presentation in this chapter is more in the
form of a refresher than an extensive treatment. Moreover, we
limit the coverage to topics that generalize directly from 1-D
to 2-D. These topics include those listed in the box below.
Some topics that do not generalize readily from 1-D to 2-D
include: causality; differential and difference equations; transfer
functions; poles and zeros; Laplace and z-transforms. Hence,
these topics will not be covered in this book.

1-D Signals and Systems → 2-D Signals and Systems
(1) Linear time-invariant (LTI) 1-D systems → Linear shift-invariant (LSI) 2-D systems
(2) Frequency response of LTI systems → Spatial frequency response of LSI systems
(3) Impulse response of 1-D systems → Point-spread function of LSI systems
(4) 1-D filtering and convolution → 2-D filtering and convolution
(5) Sampling theorem in 1-D → Sampling theorem in 2-D
(6) Discrete-time Fourier transform (DTFT) → Discrete-space Fourier transform (DSFT)

∗ For a review, see Engineering Signals and Systems in Continuous and

Discrete Time, Ulaby and Yagle, NTS Press, 2016.


For the sake of clarity, we start this chapter with a synopsis of


the terminology and associated symbols used to represent 1-D
continuous-time and discrete-time signals and 2-D continuous-
space and discrete-space signals, and their associated spectra.

1-D Signals
• Continuous time (FT): x(t), the signal in the time domain ↔ X(f), its spectrum in the frequency domain.
• Discrete time (DTFT): x[n], the signal at discrete times t = n∆ ↔ X(Ω), its spectrum at continuous frequency Ω.
• Discrete time (DFT): x[n] ↔ X[k], its spectrum at discrete frequencies Ω = 2πk/N, 0 ≤ k ≤ N − 1.

2-D Images
• Continuous space (CSFT): f(x, y), the image in the spatial domain ↔ F(µ, ν), its spectrum in the frequency domain.
• Discrete space (DSFT): f[n, m], the image in discrete space ↔ F(Ω1, Ω2), its spectrum in the continuous frequency domain.
• Discrete space (2-D DFT of order N): f[n, m] ↔ F[k1, k2], its spectrum in the discrete frequency domain, with Ω1 = 2πk1/N and Ω2 = 2πk2/N.
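As a concrete illustration of the discrete-time pair x[n] ↔ X[k] (a NumPy sketch added for this review, using NumPy's FFT conventions rather than notation from the text): the DFT of a length-N complex exponential at discrete frequency Ω = 2πk0/N is concentrated entirely in bin k0.

```python
import numpy as np

N, k0 = 64, 5
n = np.arange(N)
x = np.exp(2j * np.pi * k0 * n / N)   # x[n] at discrete frequency 2*pi*k0/N

X = np.fft.fft(x)                     # X[k] for k = 0, ..., N-1
# All of the signal's energy falls in bin k0: X[k0] = N, other bins ~ 0
```

The 2-D counterpart F[k1, k2] is obtained the same way with np.fft.fft2 applied to an image array.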

2-1 Review of 1-D Continuous-Time Signals

A continuous-time signal is a physical quantity, such as voltage or acoustic pressure, that varies with time t, where t is a real number having units of time (usually seconds). Mathematically, a continuous-time signal is a function x(t) of t. Although t can also be a spatial variable, we will refer to t as time, to avoid confusion with the spatial variables used for images.

2-1.1 Fundamental 1-D Signals

In Table 2-1, we consider three types of fundamental, continuous-time signals.

A. Eternal Sinusoids

An (eternal) sinusoid with amplitude A, phase angle θ (radians), and frequency f0 (Hz), is described by the function

x(t) = A cos(2π f0 t + θ), −∞ < t < ∞.   (2.1)

The period of x(t) is T = 1/f0.

Even though an eternal sinusoid cannot exist physically (since it would extend from before the beginning of the universe until after its end), it is used nevertheless to mathematically describe periodic signals in terms of their Fourier series. Also, another useful aspect of eternal sinusoids is that the response of a linear time-invariant (LTI) system (defined in Section 2-2.1) to an eternal sinusoid is another eternal sinusoid at the same frequency as that of the input sinusoid, but with possibly different amplitude and phase. Despite the fact that the sinusoids are eternal (and therefore, unrealistic), they can be used to compute the response of real systems to real signals. Consider, for example, a sinusoid that starts at t = 0 as the input to a stable and causal LTI system. The output consists of a transient response that decays to zero, plus a sinusoid that starts at t = 0. The output sinusoid has the same amplitude, phase, and frequency as the response that the system would have had to an eternal input sinusoid.

B. Pulses

A (rectangular) pulse of duration T centered at time t0 (Table 2-1) is defined as

x(t) = 1 for (t0 − T/2) < t < (t0 + T/2), and 0 otherwise.   (2.2)

The rectangle function rect(t) is defined as

rect(t) = 1 for −1/2 < t < 1/2, and 0 otherwise.   (2.3a)

The pulse x(t) defined in Eq. (2.2) can be written in terms of rect(t) as

x(t) = rect((t − t0)/T)   (rectangle pulse).   (2.3b)

C. Impulses

An impulse δ(t) is defined as a function that has the sifting property

∫_{−∞}^{∞} x(t) δ(t − t0) dt = x(t0)   (sifting property).   (2.4)

◮ Multiplying a function x(t) that is continuous at t = t0 by a delayed impulse δ(t − t0) and integrating over t "sifts out" the value x(t0). ◭

Setting x(t) = 1 in Eq. (2.4) shows that an impulse has an area of unity.

An impulse can be thought of (non-rigorously) as the limiting case of a pulse of width T = 2ε multiplied by an amplitude A = 1/(2ε) as ε → 0:

δ(t) = lim_{ε→0} (1/(2ε)) rect(t/(2ε)).   (2.5)

Because the width of δ(t) is the reciprocal of its amplitude (Fig. 2-1), the area of δ(t) remains 1 as ε → 0. The limit is undefined, but it is useful to think of an impulse as the limiting case of a short-duration, high-amplitude pulse with unit area. Also, the pulse shape need not be rectangular; a Gaussian or sinc function can also be used.

Changing variables from t to t′ = at yields the time-scaling property (Table 2-1) of impulses:

δ(at) = δ(t)/|a|.   (2.6)
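The sifting property and the scaling property of Eq. (2.6) can both be checked numerically with the pulse model of Eq. (2.5). The fragment below is an illustrative Python sketch (the test function cos t and the half-width ε = 10⁻⁴ are arbitrary choices made for this example):

```python
import numpy as np

def rect(t):
    """rect(t) of Eq. (2.3a): 1 for -1/2 < t < 1/2, 0 otherwise."""
    return np.where(np.abs(t) < 0.5, 1.0, 0.0)

def sift(x, t0, eps=1e-4):
    """Approximate the integral of x(t)*delta(t - t0) dt, modeling
    delta(t) by the unit-area pulse (1/(2*eps)) rect(t/(2*eps))
    of Eq. (2.5), via a Riemann sum."""
    t, dt = np.linspace(t0 - 2 * eps, t0 + 2 * eps, 40001, retstep=True)
    pulse = rect((t - t0) / (2 * eps)) / (2 * eps)
    return np.sum(x(t) * pulse) * dt

val = sift(np.cos, t0=1.0)                 # ~ cos(1), per Eq. (2.4)

# Eq. (2.6): delta(3t - 6) = delta(t - 2)/3, so integrating t^2
# against it gives (1/3) * 2^2 = 4/3
val_scaled = sift(lambda t: t**2, t0=2.0) / 3
```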

Table 2-1 Types of signals and signal properties.

Types of Signals
• Eternal sinusoid: x(t) = A cos(2π f0 t + θ), −∞ < t < ∞
• Pulse (rectangle): x(t) = rect((t − t0)/T) = 1 for (t0 − T/2) < t < (t0 + T/2), and 0 otherwise
• Impulse δ(t): ∫_{−∞}^{∞} x(t) δ(t − t0) dt = x(t0)

Properties
• Causal: x(t) = 0 for t < 0 (starts at or after t = 0)
• Time delay by t0: x(t) → x(t − t0)
• Time scaling by a: x(t) → x(at)
• Signal energy: E = ∫_{−∞}^{∞} |x(t)|² dt
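The energy entry of Table 2-1 is easy to verify numerically: for the pulse A rect((t − t0)/T), Eq. (2.7) below gives E = A²T. The fragment is an illustrative Python check with A = 5, T = 6, t0 = 2 (so E = 150); the sampling grid is an arbitrary choice:

```python
import numpy as np

def rect(t):
    # rect(t) = 1 for -1/2 < t < 1/2, 0 otherwise
    return np.where(np.abs(t) < 0.5, 1.0, 0.0)

# Pulse with amplitude A = 5, duration T = 6, centered at t0 = 2
t, dt = np.linspace(-10, 10, 2_000_001, retstep=True)
x = 5 * rect((t - 2) / 6)

# Signal energy (Table 2-1): E = integral of |x(t)|^2 dt = 5^2 * 6 = 150
E = np.sum(np.abs(x) ** 2) * dt
```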

The scaling property can be interpreted using Eq. (2.5) as follows. For a > 1 the width of the pulse in Eq. (2.5) is compressed by |a|, reducing its area by a factor of |a|, but its height is unaltered. Hence the area under the pulse is reduced to 1/|a|.

Impulses are important tools used in defining the impulse responses of 1-D systems and the point-spread functions of 2-D spatial systems (such as a camera or an ultrasound), as well as for deriving the sampling theorem.

2-1.2 Properties of 1-D Signals

A. Time Delay

Delaying signal x(t) by t0 generates signal x(t − t0). If t0 > 0, the waveform of x(t) is shifted to the right by t0, and if t0 < 0, the waveform of x(t) is shifted to the left by |t0|. This is illustrated by the time-delay figure in Table 2-1.

B. Time Scaling

A signal x(t) time-scaled by a becomes x(at). If a > 1, the waveform of x(t) is compressed in time by a factor of a. If 0 < a < 1, the waveform of x(t) is expanded in time by a factor of 1/a, as illustrated by the scaling figure in Table 2-1. If a < 0, the waveform of x(t) is compressed by |a| or expanded by 1/|a|, and then time-reversed.

C. Signal Energy

The energy E of a signal x(t) is

E = ∫_{−∞}^{∞} |x(t)|² dt.   (2.7)

A nonzero signal x(t) that is zero-valued outside the interval [a, b] = {t : a ≤ t ≤ b} (i.e., x(t) = 0 for t ∉ [a, b]) has support [a, b] and duration b − a.

Figure 2-1 Rectangular pulse model for δ(t): a pulse of width 2ε, height 1/(2ε), and unit area.

Concept Question 2-1: Why does scaling time in an impulse also scale its area?

Exercise 2-1: Compute the value of ∫_{−∞}^{∞} δ(3t − 6) t² dt.

Answer: 4/3. (See IP)

Exercise 2-2: Compute the energy of the pulse defined by x(t) = 5 rect((t − 2)/6).

Answer: 150. (See IP)

2-2 Review of 1-D Continuous-Time Systems

A continuous-time system is a device or mathematical model that accepts a signal x(t) as its input and produces another signal y(t) at its output. Symbolically, the input-output relationship is expressed as

x(t) → SYSTEM → y(t)

Table 2-2 provides a list of important system types and properties.

2-2.1 Linear and Time-Invariant Systems

Systems are classified on the basis of two independent properties: (a) linearity and (b) time invariance, which leads to four possible classes:
(1) Linear (L), but not time-invariant.
(2) Linear and time-invariant (LTI).
(3) Nonlinear, but time-invariant (TI).
(4) Nonlinear and not time-invariant.

Most practical systems (including 2-D imaging systems such as a camera, ultrasound, radar, etc.) belong to one of the first two

Table 2-2 Types of systems and associated properties.

• Linear System (L): If xi(t) →L yi(t), then Σ_{i=1}^{N} ci xi(t) →L Σ_{i=1}^{N} ci yi(t).
• Time-Invariant (TI): If x(t) →TI y(t), then x(t − τ) →TI y(t − τ).
• Linear Time-Invariant (LTI): If xi(t) →LTI yi(t), then Σ_{i=1}^{N} ci xi(t − τi) →LTI Σ_{i=1}^{N} ci yi(t − τi).
• Impulse Response of LTI System: δ(t − τ) →LTI y(t) = h(t − τ).

of the above four classes. If a system is moderately nonlinear, it can be approximated by a linear model, and if it is highly nonlinear, it may be possible to divide its input-output response into a series of quasi-linear regions. In this book, we limit our treatment to linear (L) and linear time-invariant (LTI) systems.

A. Linear Systems

A system is linear (L) if its response to a linear combination of input signals acting simultaneously is the same as the linear combination of the responses to each of the input signals acting alone. That is: If

xi(t) →L yi(t),   (2.8a)

then for any N inputs {xi(t), i = 1 ... N} and any N constants {ci, i = 1 ... N},

Σ_{i=1}^{N} ci xi(t) →L Σ_{i=1}^{N} ci yi(t).   (2.8b)

Under mild assumptions (regularity) about the system, the finite sum can be extended to infinite sums and integrals. Linearity also is called the superposition property.

B. Time-Invariant System

A system is time-invariant (TI) if time shifting (delaying) the input time-shifts the output by exactly the same amount and in exactly the same direction. That is, if

x(t) →TI y(t),

then it follows that

x(t − τ) →TI y(t − τ),   (2.9)

for any input signal x(t) and constant time shift τ.

◮ Systems that are both linear and time-invariant are termed linear time-invariant (LTI). ◭

2-2.2 Impulse Response

In general, we use the symbols x(t) and y(t) to denote, respectively, the input signal into a system and the resultant output response. The term impulse response is used to denote the output response for the specific case when the input is an impulse. For non–time-invariant systems, the impulse response depends on the time at which the impulse is nonzero. The response to the impulse δ(t) delayed by τ, which is δ(t − τ), is denoted as h(t; τ). If the system is time-invariant, then delaying the impulse merely delays the impulse response, so h(t; τ) = h(t − τ), where h(t) is the response to the impulse δ(t). This can be summarized

in the following two equations:

δ(t − τ) → SYSTEM → h(t; τ),  (2.10a)

and for a time-invariant system,

δ(t − τ) → TI → h(t − τ).  (2.10b)

2-2.3 Convolution

A. Linear System

Upon interchanging t and t0 in the sifting property given by Eq. (2.4) and replacing t0 with τ, we obtain the relationship

∫_{−∞}^{∞} x(τ) δ(t − τ) dτ = x(t).  (2.11)

Next, if we multiply both sides of Eq. (2.10a) by x(τ) and then integrate τ over the limits (−∞, ∞), we obtain

∫_{−∞}^{∞} x(τ) δ(t − τ) dτ → L → y(t) = ∫_{−∞}^{∞} x(τ) h(t; τ) dτ.  (2.12)

Upon using Eq. (2.11) to replace the left-hand side of Eq. (2.12) with x(t), Eq. (2.12) becomes

x(t) → L → y(t) = ∫_{−∞}^{∞} x(τ) h(t; τ) dτ.  (2.13)

This integral is called the superposition integral.

B. LTI System

For an LTI system, h(t; τ) = h(t − τ), in which case the expression for y(t) in Eq. (2.13) becomes

y(t) = ∫_{−∞}^{∞} x(τ) h(t − τ) dτ,  (2.14)

which is known as the convolution integral. Often, the convolution integral is represented symbolically by

x(t) ∗ h(t) = ∫_{−∞}^{∞} x(τ) h(t − τ) dτ.  (2.15)

Combining the previous results leads to the symbolic form

x(t) → LTI → y(t) = x(t) ∗ h(t).  (2.16)

The steps leading to Eq. (2.16) are summarized in Fig. 2-2.

The convolution in Eq. (2.15) is realized by time-shifting the impulse response h(t). Changing variables from τ to t − τ shows that convolution has the commutative property:

x(t) ∗ h(t) = h(t) ∗ x(t).  (2.17)

The expression for y(t) can also be derived by time-shifting the input signal x(t) instead, in which case the result would be

y(t) = h(t) ∗ x(t) = ∫_{−∞}^{∞} x(t − τ) h(τ) dτ.  (2.18)

Table 2-3 provides a summary of key properties of convolution that are extendable to 2-D, and Fig. 2-3 offers a graphical representation of how two of those properties—the associative and distributive properties—are used to characterize the overall impulse responses of systems composed of multiple systems, when connected in series or in parallel, in terms of the impulse responses of the individual systems.

Concept Question 2-2: What is the significance of a system being linear time-invariant?

Concept Question 2-3: Why does delaying either of two signals delay their convolution?

Exercise 2-3: Is the following system linear, time-invariant, both, or neither?

dy/dt = 2x(t − 1) + 3t x(t + 1)

Answer: System is linear but not time-invariant. (See IP)
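The convolution integral of Eq. (2.15) can be checked numerically by replacing the integral with a Riemann sum over a fine grid of τ. The sketch below (an illustration added here, not from the text) convolves the one-sided exponentials x(t) = e^{−2t} u(t) and h(t) = e^{−3t} u(t), compares the result against the closed form e^{−2t} − e^{−3t}, and verifies the commutative property of Eq. (2.17).

```python
import math

def conv_riemann(x, h, t, dt=1e-4):
    # y(t) ~ sum over the grid of x(tau) h(t - tau) dtau  (Eq. 2.15)
    n = int(round(t / dt))
    return sum(x(i * dt) * h(t - i * dt) for i in range(n)) * dt

x = lambda t: math.exp(-2 * t) if t >= 0 else 0.0   # input signal
h = lambda t: math.exp(-3 * t) if t >= 0 else 0.0   # impulse response

t = 1.0
y_exact = math.exp(-2 * t) - math.exp(-3 * t)       # closed-form x * h
print(abs(conv_riemann(x, h, t) - y_exact) < 1e-3)  # True
# Commutative property (Eq. 2.17): h * x gives the same output
print(abs(conv_riemann(h, x, t) - y_exact) < 1e-3)  # True
```

Shrinking dt tightens the agreement, consistent with the Riemann sum converging to the integral.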

LTI System with Zero Initial Conditions

1. δ(t) → LTI → y(t) = h(t)
2. δ(t − τ) → LTI → y(t) = h(t − τ)
3. x(τ) δ(t − τ) → LTI → y(t) = x(τ) h(t − τ)
4. ∫_{−∞}^{∞} x(τ) δ(t − τ) dτ → LTI → y(t) = ∫_{−∞}^{∞} x(τ) h(t − τ) dτ
5. x(t) → LTI → y(t) = ∫_{−∞}^{∞} x(τ) h(t − τ) dτ = x(t) ∗ h(t)

Figure 2-2 Derivation of the convolution integral for a linear time-invariant system.

Table 2-3 Convolution properties.

Convolution Integral: y(t) = h(t) ∗ x(t) = ∫_{−∞}^{∞} h(τ) x(t − τ) dτ
• Causal Systems and Signals: y(t) = h(t) ∗ x(t) = u(t) ∫_{0}^{t} h(τ) x(t − τ) dτ

Property and Description:
1. Commutative: x(t) ∗ h(t) = h(t) ∗ x(t)  (2.19a)
2. Associative: [g(t) ∗ h(t)] ∗ x(t) = g(t) ∗ [h(t) ∗ x(t)]  (2.19b)
3. Distributive: x(t) ∗ [h1(t) + ··· + hN(t)] = x(t) ∗ h1(t) + ··· + x(t) ∗ hN(t)  (2.19c)
4. Causal ∗ Causal = Causal: y(t) = u(t) ∫_{0}^{t} h(τ) x(t − τ) dτ  (2.19d)
5. Time-shift: h(t − T1) ∗ x(t − T2) = y(t − T1 − T2)  (2.19e)
6. Convolution with Impulse: x(t) ∗ δ(t − T) = x(t − T)  (2.19f)
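The associative and distributive properties in Table 2-3 are exactly what justify collapsing series and parallel connections of LTI systems (Fig. 2-3). A minimal numerical check, using short finite sequences as a discrete stand-in for the convolution integral (the sequences themselves are arbitrary, not from the text):

```python
def conv(a, b):
    # finite-sequence convolution, a discrete stand-in for Eq. (2.15)
    y = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            y[i + j] += ai * bj
    return y

g, h, x = [1.0, 2.0], [3.0, -1.0], [2.0, 0.0, 1.0]
# Associative (2.19b): a cascade g then h acts like the single filter g * h
print(conv(conv(g, h), x) == conv(g, conv(h, x)))   # True
# Distributive (2.19c): parallel branches sum their impulse responses
h1, h2 = [1.0, 1.0], [0.5, -0.5]
lhs = conv(x, [p + q for p, q in zip(h1, h2)])
rhs = [p + q for p, q in zip(conv(x, h1), conv(x, h2))]
print(lhs == rhs)   # True
```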



(a) In series: x(t) → h1(t) → h2(t) → ··· → hN(t) → y(t) is equivalent to x(t) → h1(t) ∗ h2(t) ∗ ··· ∗ hN(t) → y(t).
(b) In parallel: x(t) applied simultaneously to h1(t), h2(t), ..., hN(t), with the branch outputs summed, is equivalent to x(t) → h1(t) + h2(t) + ··· + hN(t) → y(t).

Figure 2-3 (a) The overall impulse response of a system composed of multiple LTI systems connected in series is equivalent to the cumulative convolution of the impulse responses of the individual systems. (b) For LTI systems connected in parallel, the overall impulse response is equal to the sum of the impulse responses of the individual systems.

Exercise 2-4: Compute the output y(t) of an LTI system with impulse response h(t) to input x(t), where

h(t) = { e^{−3t} for t > 0, 0 for t < 0 }

and

x(t) = { e^{−2t} for t > 0, 0 for t < 0 }.

Answer:

y(t) = { e^{−2t} − e^{−3t} for t > 0, 0 for t < 0 }.

(See IP)

2-3 1-D Fourier Transforms

The continuous-time Fourier transform (CTFT) is a powerful tool for

• computing the spectra of signals, and
• analyzing the frequency responses of LTI systems.

2-3.1 Definition of Fourier Transform

The 1-D Fourier transform X(f) of x(t) and the inverse 1-D Fourier transform x(t) of X(f) are defined by the transformations

X(f) = F{x(t)} = ∫_{−∞}^{∞} x(t) e^{−j2πft} dt  (2.20a)

and

x(t) = F^{−1}{X(f)} = ∫_{−∞}^{∞} X(f) e^{j2πft} df.  (2.20b)

Throughout this book, variables written in boldface (e.g., X(f)) denote vectors or complex-valued quantities.

A. Alternative Definitions of the Fourier Transform

Note that Eq. (2.20) differs slightly from the usual electrical engineering definition of the Fourier-transform pair:

Xω(ω) = ∫_{−∞}^{∞} x(t) e^{−jωt} dt  (2.21a)

and

x(t) = (1/2π) ∫_{−∞}^{∞} Xω(ω) e^{jωt} dω.  (2.21b)

Whereas Eq. (2.20) uses the oscillation frequency f (in Hz), the definition given by Eq. (2.21) uses ω (in rad/s) instead, where ω = 2πf. Using Hz makes interpretation of the Fourier
where ω = 2π f . Using Hz makes interpretation of the Fourier

transform as a spectrum—as well as the presentation of the sampling theorem in Section 2-4—easier.

The definition of the Fourier transform used by mathematicians has a different sign for ω than the definition used by electrical engineers:

X_{−ω}(ω) = ∫_{−∞}^{∞} x(t) e^{jωt} dt  (2.22a)

and

x(t) = (1/2π) ∫_{−∞}^{∞} X_{−ω}(ω) e^{−jωt} dω.  (2.22b)

Geophysicists use different sign conventions for time and space! In addition, some computer programs, such as Mathematica, split the 1/(2π) factor into factors of 1/√(2π) in both the forward and inverse transforms.

◮ In this book, we use the definition of the Fourier transform given by Eq. (2.20) exclusively. ◭

B. Fourier Transform Notation

Throughout this book, we use Eq. (2.20) as the definition of the Fourier transform, we denote the individual transformations by

F{x(t)} = X(f)

and

F^{−1}{X(f)} = x(t),

and we denote the combined bilateral pair by

x(t) ↔ X(f).

2-3.2 Fourier Transform Properties

The major properties of the Fourier transform are summarized in Table 2-4, and derived next.

Scaling:
For a ≠ 0,

x(at) = ∫_{−∞}^{∞} X(f) e^{j2πf(at)} df = ∫_{−∞}^{∞} X(f) e^{j2π(fa)t} d(af)/a = F^{−1}{(1/|a|) X(f/a)}.  (2.23)

Shifting:

x(t − τ) = ∫_{−∞}^{∞} X(f) e^{j2πf(t−τ)} df = ∫_{−∞}^{∞} X(f) e^{j2πft} e^{−j2πfτ} df = F^{−1}{X(f) e^{−j2πfτ}}.  (2.24)

Modulation:

e^{j2πf0t} x(t) = ∫_{−∞}^{∞} X(f) e^{j2πft} e^{j2πf0t} df = ∫_{−∞}^{∞} X(f) e^{j2π(f+f0)t} df = F^{−1}{X(f − f0)}.  (2.25)

Derivative:

dx(t)/dt = d/dt ∫_{−∞}^{∞} X(f) e^{j2πft} df = ∫_{−∞}^{∞} X(f) (j2πf) e^{j2πft} df = F^{−1}{(j2πf) X(f)}.  (2.26)

Zero frequency:
Setting f = 0 in Eq. (2.20a) leads to

X(0) = ∫_{−∞}^{∞} x(t) dt.

Zero time:
Similarly, setting t = 0 in Eq. (2.20b) leads to

x(0) = ∫_{−∞}^{∞} X(f) df.

The other properties in Table 2-4 follow readily from the definition of the Fourier transform given by Eq. (2.20), except for the convolution property, which requires a few extra steps of algebra.

Parseval's theorem states that

E = ∫_{−∞}^{∞} x(t) y*(t) dt = ∫_{−∞}^{∞} X(f) Y*(f) df.  (2.27)

Setting y(t) = x(t) gives Rayleigh's theorem (also commonly known as Parseval's theorem), which states that the energies of

Table 2-4 Major properties of the Fourier transform.

Property: x(t) ↔ X(f) = F[x(t)] = ∫_{−∞}^{∞} x(t) e^{−j2πft} dt

1. Linearity: ∑ ci xi(t) ↔ ∑ ci Xi(f)
2. Time scaling: x(at) ↔ (1/|a|) X(f/a)
3. Time shift: x(t − τ) ↔ e^{−j2πfτ} X(f)
4. Frequency shift (modulation): e^{j2πf0t} x(t) ↔ X(f − f0)
5. Time derivative: x′ = dx/dt ↔ j2πf X(f)
6. Reversal: x(−t) ↔ X(−f)
7. Conjugation: x*(t) ↔ X*(−f)
8. Convolution in t: x(t) ∗ y(t) ↔ X(f) Y(f)
9. Convolution in f (multiplication in t): x(t) y(t) ↔ X(f) ∗ Y(f)
10. Duality: X(t) ↔ x(−f)

Special FT Relationships:
11. Zero frequency: X(0) = ∫_{−∞}^{∞} x(t) dt
12. Zero time: x(0) = ∫_{−∞}^{∞} X(f) df
13. Parseval's theorem: ∫_{−∞}^{∞} x(t) y*(t) dt = ∫_{−∞}^{∞} X(f) Y*(f) df
−∞ −∞

x(t) and X(f) are equal:

E = ∫_{−∞}^{∞} |x(t)|² dt = ∫_{−∞}^{∞} |X(f)|² df.  (2.28)

A. Even and Odd Parts of Signals

A signal x(t) can be decomposed into even xe(t) and odd xo(t) components:

x(t) = xe(t) + xo(t),  (2.29)

where the even component xe(t) and the odd component xo(t) are formed from their parent signal x(t) as follows:

xe(t) = [x(t) + x*(−t)]/2  (2.30a)

and

xo(t) = [x(t) − x*(−t)]/2.  (2.30b)

A signal is said to have even symmetry if x(t) = x*(−t), in which case x(t) = xe(t) and xo(t) = 0. Similarly, a signal has odd symmetry if x(t) = −x*(−t), in which case x(t) = xo(t) and xe(t) = 0.
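The decomposition of Eqs. (2.29)-(2.30) is easy to verify numerically. The sketch below (an added illustration; for a real-valued signal the conjugates in Eqs. (2.30a)-(2.30b) reduce to x(−t)) decomposes a one-sided exponential and checks that the parts recombine and have the claimed symmetries.

```python
import math

def even_part(x, t):
    # Eq. (2.30a) for real-valued x, where x*(-t) = x(-t)
    return 0.5 * (x(t) + x(-t))

def odd_part(x, t):
    # Eq. (2.30b) for real-valued x
    return 0.5 * (x(t) - x(-t))

x = lambda t: math.exp(-t) if t >= 0 else 0.0   # one-sided exponential
for t in (-2.0, -0.3, 0.0, 0.7, 2.0):
    assert abs(even_part(x, t) + odd_part(x, t) - x(t)) < 1e-12   # Eq. (2.29)
    assert abs(even_part(x, t) - even_part(x, -t)) < 1e-12        # even symmetry
    assert abs(odd_part(x, t) + odd_part(x, -t)) < 1e-12          # odd symmetry
print("even/odd decomposition verified")
```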

B. Conjugate Symmetry

If x(t) is real-valued, then the following conjugate symmetry relations hold:

X(−f) = X*(f),  (2.31a)

where X*(f) is the complex conjugate of X(f),

∠X(f) = −∠X(−f)  (phase is an odd function),  (2.31b)
|X(f)| = |X(−f)|  (magnitude is even),  (2.31c)
real(X(f)) = real(X(−f))  (real part is even),  (2.31d)
imag(X(f)) = −imag(X(−f))  (imaginary part is odd).  (2.31e)

The real and imaginary parts of the Fourier transform of a real-valued signal x(t) are the Fourier transforms of the even and odd parts of x(t), respectively. So the Fourier transform of a real-valued and even function is real-valued, and the Fourier transform of a real-valued and odd function is purely imaginary:

x(t) is even ↔ X(f) is real,
x(t) is odd ↔ X(f) is imaginary,
x(t) is real and even ↔ X(f) is real and even,
x(t) is real and odd ↔ X(f) is imaginary and odd.

C. Filtering and Frequency Response

The following is a very important property of the Fourier transform:

◮ The Fourier transform of a convolution of two functions is equal to the product of their Fourier transforms:

x(t) → LTI → y(t) = h(t) ∗ x(t)

implies that

Y(f) = H(f) X(f).  (2.32) ◭

The function H(f) = F{h(t)} is called the frequency response of the system. The relationship described by Eq. (2.32) defines the frequency filtering process performed by the system. At a given frequency f0, frequency component X(f0) of the input is multiplied by H(f0) to obtain the frequency component Y(f0) of the output.

For example, an ideal lowpass filter with cutoff frequency fc and a frequency response

HLP(f) = { 1 for |f| < fc, 0 for |f| > fc }  (2.33)

eliminates all frequency components of X(f) above fc.

D. Sinc Functions

The impulse response of the ideal lowpass filter characterized by Eq. (2.33) is

hLP(t) = F^{−1}{HLP(f)} = ∫_{−fc}^{fc} 1 · e^{j2πft} df = { sin(2πfct)/(πt) for t ≠ 0, 2fc for t = 0 }.  (2.34)

The scientific literature contains two different, but both commonly used, definitions for the sinc function:

(1) sinc(x) = sin(x)/x, and
(2) sinc(x) = sin(πx)/(πx).

With either definition, sinc(0) = 1 since sin(x) ≈ x for x ≪ 1.

◮ Throughout this book, we use the sinc function definition

sinc(x) = { sin(πx)/(πx) for x ≠ 0, 1 for x = 0 }.  (2.35) ◭

Hence, per the definition given by Eq. (2.35), the impulse response of the ideal lowpass filter is given by

hLP(t) = 2fc sinc(2fct).  (2.36)

2-3.3 Fourier Transform Pairs

Commonly encountered Fourier transform pairs are listed in Table 2-5. Note the duality between entries #1 and #2, #4 and #5, and #6 and itself.

2-3.4 Interpretation of the Fourier Transform

A Fourier transform can be interpreted in three ways:

(1) as the frequency response of an LTI system,

Table 2-5 Examples of Fourier transform pairs. Note that constant a ≥ 0.

BASIC FUNCTIONS
1a. x(t) = δ(t) ↔ X(f) = 1
1b. x(t) = δ(t − τ) ↔ X(f) = e^{−j2πfτ}
2. x(t) = 1 ↔ X(f) = δ(f)
3. x(t) = e^{−a|t|}, a > 0 ↔ X(f) = 2a/((2πf)² + a²)
4. x(t) = rect(t/T) ↔ X(f) = T sinc(fT)
5. x(t) = f0 sinc(f0t) ↔ X(f) = rect(f/f0)
6. x(t) = e^{−πt²} ↔ X(f) = e^{−πf²}

(2) as the spectrum of a signal, and
(3) as the energy spectral density of a signal.

A. Frequency Response

The frequency response H(f) of an LTI system is the frequency-domain equivalent of the system's impulse response h(t):

H(f) ↔ h(t).

B. Spectrum

The spectrum of x(t) is X(f). Consider, for example, the eternal sinusoid defined by Eq. (2.1):

x(t) = A cos(2πf0t + θ) = (A/2) e^{jθ} e^{j2πf0t} + (A/2) e^{−jθ} e^{−j2πf0t},  (2.37)

where we used the relation cos(x) = (e^{jx} + e^{−jx})/2. From the properties and pairs listed in Tables 2-4 and 2-5, we have

X(f) = (A/2) e^{jθ} δ(f − f0) + (A/2) e^{−jθ} δ(f + f0).  (2.38)
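The line spectrum of Eq. (2.38) can be observed numerically by taking a DFT of samples of the sinusoid; with an integer number of cycles in the window, all of the energy falls into the two bins corresponding to ±f0. The sketch below (an added illustration with arbitrary parameters) uses a plain O(N²) DFT.

```python
import cmath, math

def dft(x):
    # direct O(N^2) discrete Fourier transform
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

S = 64.0                       # samples/s
N = 64                         # one second of samples
A, f0, theta = 2.0, 8.0, 0.5   # amplitude, frequency (Hz), phase
x = [A * math.cos(2 * math.pi * f0 * n / S + theta) for n in range(N)]
X = dft(x)

# All energy lands in bins k = 8 and k = 56, i.e., at +f0 and -f0:
peaks = sorted(sorted(range(N), key=lambda k: -abs(X[k]))[:2])
print(peaks)  # [8, 56]
# The bin value matches the impulse weight (A/2)e^{j theta} of Eq. (2.38), scaled by N:
print(abs(X[8] - (A / 2) * N * cmath.exp(1j * theta)) < 1e-9)  # True
```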

Strictly speaking, the Fourier transform of an eternal sinusoid is undefined, since an eternal sinusoid is not absolutely integrable. Nevertheless, this example provides a convenient illustration that the spectrum of a sinusoid at f0 is concentrated entirely at ±f0. By extension, the spectrum of a constant signal is concentrated entirely at f = 0.

Real signals are more complicated than simple sinusoids, as are their corresponding spectra. Figure 2-4(a) displays 7 ms of a trumpet playing note B, and part (b) of the same figure displays the corresponding spectrum. The spectrum is concentrated around narrow spectral lines located at 491 Hz and the next 6 harmonics. In contrast, speech exhibits a much broader spectrum, as illustrated by the examples in Fig. 2-5.

Figure 2-4 Trumpet signal (note B) and its magnitude spectrum: (a) x(t) over 0-7 ms, (b) |X(f)| over 0-4500 Hz.

C. Energy Spectral Density

A more rigorous interpretation of X(f) that avoids impulses in frequency uses the concept of energy spectral density. Let the spectrum of a signal x(t) be

X(f) = { constant for f0 < f < f0 + ε, 0 otherwise } ≈ { X(f0) for f0 < f < f0 + ε, 0 otherwise }.  (2.39)

The approximation becomes exact in the limit as ε → 0, provided X(f) is continuous at f = f0.

Using Rayleigh's theorem, the energy of x(t) is

E = ∫_{−∞}^{∞} |X(f)|² df ≈ |X(f0)|² ε.  (2.40)

The energy of x(t) in the interval f0 < f < f0 + ε is |X(f0)|² ε. The energy spectral density at frequency f = f0 (analogous to probability density or to mass density of a physical object) is |X(f0)|².

Note that a real-valued x(t) will, by conjugate symmetry, also have nonzero X(f) in the interval −f0 > f > −f0 − ε. So the bilateral energy spectral density of x(t) at f0 is 2|X(f0)|².

Concept Question 2-4: Provide three applications of the Fourier transform.

Concept Question 2-5: Provide an application of the sinc function.
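Rayleigh's theorem (Eq. (2.28)) has a direct finite-length analog: for the DFT, the sample energy equals the bin energy divided by N. A quick check on an arbitrary decaying test signal (an added illustration, not from the text):

```python
import cmath, math

def dft(x):
    # direct O(N^2) discrete Fourier transform
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [math.sin(0.2 * n) * math.exp(-0.05 * n) for n in range(32)]
X = dft(x)
energy_time = sum(v * v for v in x)
energy_freq = sum(abs(v) ** 2 for v in X) / len(x)   # note the 1/N factor
print(abs(energy_time - energy_freq) < 1e-9)  # True
```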

Figure 2-5 Spectra of two vowel sounds: (a) spectrum of "oo" as in "cool"; (b) spectrum of "ah" as in "Bach" (sound magnitude versus frequency, 0-3 kHz).

Exercise 2-5: A square wave x(t) has the Fourier series expansion

x(t) = sin(t) + (1/3) sin(3t) + (1/5) sin(5t) + (1/7) sin(7t) + ···

Compute output y(t) if

x(t) → h(t) = 0.4 sinc(0.4t) → y(t).

Answer: y(t) = sin(t). (See IP)

Exercise 2-6: Compute the Fourier transform of d/dt [sinc(t)].

Answer:

F{d/dt sinc(t)} = { j2πf for |f| < 0.5, 0 for |f| > 0.5 }.

(See IP)

2-4 The Sampling Theorem

The sampling theorem is an operational cornerstone of both discrete-time 1-D signal processing and discrete-space 2-D image processing.

2-4.1 Sampling Theorem Statement

The samples {x(n∆)} of a signal x(t) sampled every ∆ seconds are

{x(n∆), n = ..., −2, −1, 0, 1, 2, ...}.  (2.41)

The inverse of the sampling interval ∆ is the sampling rate S = 1/∆ samples per second. The sampling rate has the same dimension as Hz, and is often expressed in "Hz." For example, the standard sampling rate for CDs is 44100 samples/second, often stated as 44100 Hz. The corresponding sampling interval is ∆ = 1/44100 s = 22.676 µs.

A signal x(t) is bandlimited to a maximum frequency of B (extending from −B to B), measured in Hz, if its Fourier transform X(f) = 0 for |f| > B. Although real-world signals are seldom truly bandlimited, their spectra are often negligible above some frequency B.

◮ The sampling theorem states that if

X(f) = 0 for |f| > B,

and if

x(t) is sampled at a sampling rate of S samples/s,

then

x(t) can be reconstructed exactly from {x(n∆), n = ..., −2, −1, 0, 1, 2, ...}, provided S > 2B. ◭

The sampling rate must exceed double the maximum frequency in the spectrum X(f) of x(t). The minimum (actually an infimum) sampling rate 2B samples/second is called the Nyquist rate, and the frequency 2B is called the Nyquist frequency.

2-4.2 Sampling Theorem Derivation

A. The Sampled Signal xs(t)

Given a signal x(t), we construct the sampled signal xs(t) by multiplying x(t) by the impulse train

δs(t) = ∑_{n=−∞}^{∞} δ(t − n∆).  (2.42)

That is,

xs(t) = x(t) δs(t) = ∑_{n=−∞}^{∞} x(t) δ(t − n∆) = ∑_{n=−∞}^{∞} x(n∆) δ(t − n∆).  (2.43)

B. Spectrum of the Sampled Signal xs(t)

Using Fourier series, it can be shown that the Fourier transform of the impulse train δs(t) is itself an impulse train in frequency:

∆ ∑_{n=−∞}^{∞} δ(t − n∆) ↔ ∑_{k=−∞}^{∞} δ(f − k/∆).  (2.44)

This result can be interpreted as follows. A periodic signal has a discrete spectrum (zero except at specific frequencies) given by the signal's Fourier series expansion. By Fourier duality, a discrete signal (zero except at specific times) such as xs(t) has a periodic spectrum. So a discrete and periodic signal such as δs(t) has a spectrum that is both discrete and periodic.

Multiplying Eq. (2.44) by x(t), using the definition for xs(t) given by Eq. (2.43), and applying property #9 in Table 2-4 leads to

∆ xs(t) ↔ X(f) ∗ ∑_{k=−∞}^{∞} δ(f − k/∆),  (2.45)

which, using property #6 of Table 2-3, simplifies to

∆ xs(t) ↔ ∑_{k=−∞}^{∞} X(f − k/∆).  (2.46)

Dividing by ∆ and recalling that S = 1/∆ gives

xs(t) ↔ S ∑_{k=−∞}^{∞} X(f − kS).  (2.47)

The spectrum Xs(f) of xs(t) consists of a superposition of copies of the spectrum X(f) of x(t), repeated every S = 1/∆ and multiplied by S. If these copies do not overlap in frequency, we may then recover X(f) from Xs(f) using a lowpass filter, provided S > 2B [see Fig. 2-6(a)].

Figure 2-6 Sampling a signal x(t) with maximum frequency B at a rate of S makes X(f) change amplitude to S X(f) and to repeat in f with period S. These copies (a) do not overlap if S > 2B, but (b) they do if S < 2B.

2-4.3 Aliasing

If the sampling rate S does not exceed 2B, the copies of X(f) will overlap one another, as shown in Fig. 2-6(b). This is called an aliased condition, the consequence of which is that the reconstructed signal will no longer match the original signal x(t).

Example 2-1: Two Sinusoids and Aliasing

Two signals, a 2 Hz sinusoid and a 12 Hz sinusoid:

x1(t) = cos(4πt)

and

x2(t) = cos(24πt),

were sampled at 20 samples/s. Generate plots for

(a) x1(t) and its sampled version x1s(t),
(b) x2(t) and its sampled version x2s(t),
(c) spectra X1(f) and X1s(f), and
(d) spectra X2(f) and X2s(f).

Solution: (a) Figure 2-7(a) displays x1(t) and x1s(t), with the latter generated by sampling x1(t) at S = 20 samples/s. The applicable bandwidth of x1(t) is B1 = 2 Hz. Since S > 2B1, it should be possible to reconstruct x1(t) from x1s(t), which we demonstrate in a later subsection.

Similar plots are displayed in Fig. 2-7(b) for the 12 Hz sinusoid. In this latter case, B = 12 Hz and S = 20 samples/s. Hence, S < 2B.

(b) Spectrum X1(f) of x1(t) consists of two impulses at ±2 Hz, as shown in Fig. 2-8(a). The spectrum of the sampled version consists of the same spectrum X1(f) of x1(t), scaled by the factor S, plus additional copies repeated every ±S = 20 Hz (Fig. 2-8(b)). Note that the central spectrum in Fig. 2-8(b), corresponding to S X1(f), does not overlap with the neighboring copies.

(c) Spectra X2(f) and X2s(f) are shown in Fig. 2-9. Because S < 2B, the central spectrum overlaps with its two neighbors.

2-4.4 Sampling Theorem Implementation

A. Physical Lowpass Filter

If S > 2B, the original signal x(t) can be recovered from the sampled signal xs(t) by subjecting the latter to a lowpass filter that passes frequencies below B with a gain of 1/S (to compensate for the factor of S induced by sampling, as noted in Eq. (2.47)) and rejects frequencies greater than (S − B) Hz:

H(f) = { 1/S for |f| < B, 0 for |f| > S − B }.  (2.48)

This type of filter must be implemented using a physical circuit. For example, a Butterworth filter can be constructed by connecting op-amps, capacitors, and resistors in a series of Sallen-Key configurations.† This is clearly impractical for image processing.

B. Sinc Interpolation Formula

Mathematically, we can use an ideal lowpass filter with a cutoff frequency anywhere between B and S − B. It is customary to use S/2 as the cutoff frequency, since it is halfway between B and S − B, so as to provide a safety margin for avoiding aliasing if the actual maximum frequency of X(f) exceeds B but is less than S/2. The frequency response of this ideal lowpass filter is, from Eq. (2.48),

H(f) = { 1/S for |f| < S/2, 0 for |f| > S/2 }.  (2.49)

Setting f0 = S in entry #5 of Table 2-5, the impulse response is found to be

h(t) = (1/S)[S sinc(St)] = sinc(St).  (2.50)

Using the convolution property x(t) ∗ δ(t − τ) = x(t − τ) [see property #6 in Table 2-3], we can derive the following sinc interpolation formula:

x(t) = xs(t) ∗ h(t) = ∑_{n=−∞}^{∞} x(n∆) δ(t − n∆) ∗ h(t) = ∑_{n=−∞}^{∞} x(n∆) sinc(S(t − n∆)).  (2.51)

In principle, this formula can be used to reconstruct x(t) for any time t from its samples {x(n∆)}. But since it requires an infinite number of samples {x(n∆)} to reconstruct x(t), it is of theoretical interest only.

C. Reconstruction of X(f) from Samples {x(n∆)}

According to Eq. (2.43), the sampled signal xs(t) is given by

xs(t) = ∑_{n=−∞}^{∞} x(n∆) δ(t − n∆).  (2.52)

Application of entry #1b in Table 2-5 yields

Xs(f) = ∑_{n=−∞}^{∞} x(n∆) e^{−j2πfn∆}.  (2.53)

† Ulaby and Yagle, Signals and Systems: Theory and Applications, pp. 296-297.
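Although Eq. (2.51) needs infinitely many samples in principle, a truncated version already reconstructs a bandlimited signal to high accuracy. The sketch below (an added illustration) rebuilds the 2 Hz sinusoid of Example 2-1 from its samples at S = 20 samples/s, using the sinc definition of Eq. (2.35).

```python
import math

def sinc(x):
    # Eq. (2.35)
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

S = 20.0                                  # samples/s
dT = 1.0 / S                              # sampling interval
x = lambda t: math.cos(4 * math.pi * t)   # 2 Hz sinusoid, B = 2 Hz < S/2

def reconstruct(t, n_max=5000):
    # truncated sinc interpolation formula, Eq. (2.51)
    return sum(x(n * dT) * sinc(S * (t - n * dT))
               for n in range(-n_max, n_max + 1))

for t in (0.013, 0.27, 0.5):
    assert abs(reconstruct(t) - x(t)) < 1e-3
print("sinc interpolation recovers x(t) to within 1e-3")
```

Increasing n_max shrinks the truncation error, consistent with the exactness of the infinite sum.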

Figure 2-7 Plots of (a) the 2 Hz sinusoid x1(t) and its sampled version x1s(t), and (b) the 12 Hz sinusoid x2(t) and its sampled version x2s(t), over 0 ≤ t ≤ 1 s. Sampling rate S = 20 samples/s.

Figure 2-8 Spectra (a) X1(f) and (b) X1s(f) of the 2 Hz sinusoid and its sampled version, respectively. The spectrum X1s(f) consists of X1(f) scaled by S = 20, plus copies thereof at integer multiples of ±20 Hz; note the absence of aliasing. The vertical axes denote areas under the impulses.
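The copy structure in Fig. 2-8 also follows directly from Eq. (2.53): since e^{−j2π(f+S)n∆} = e^{−j2πfn∆} when S∆ = 1, the spectrum Xs(f) repeats with period S. A quick numerical check with a truncated version of Eq. (2.53), using an arbitrary decaying signal so the sum converges (an added illustration):

```python
import cmath, math

S = 20.0
dT = 1.0 / S
x = lambda t: math.cos(4 * math.pi * t) * math.exp(-abs(t))  # decaying signal

def Xs(f, n_max=400):
    # truncated version of Eq. (2.53)
    return sum(x(n * dT) * cmath.exp(-2j * math.pi * f * n * dT)
               for n in range(-n_max, n_max + 1))

# The spectrum of the sampled signal repeats with period S = 1/dT:
print(abs(Xs(3.7) - Xs(3.7 + S)) < 1e-9)  # True
```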

Figure 2-9 Spectra (a) X2(f) and (b) X2s(f) of the 12 Hz sinusoid and its sampled version. Note the overlap in (b) between the spectrum S X2(f) and its neighboring copies at ±20 Hz, indicating aliasing. The vertical axes denote areas under the impulses.
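The overlap in Fig. 2-9 has a concrete time-domain consequence: at S = 20 samples/s, the samples of the 12 Hz sinusoid are indistinguishable from those of an 8 Hz (= S − 12 Hz) sinusoid, so reconstruction yields the lower frequency. The same mechanism underlies Exercise 2-8 (500 Hz sampled at 900 samples/s reconstructs as 400 Hz). A short check (an added illustration):

```python
import math

S = 20.0                                      # samples/s, as in Example 2-1
x2 = lambda t: math.cos(24 * math.pi * t)     # 12 Hz sinusoid
alias = lambda t: math.cos(16 * math.pi * t)  # 8 Hz = (S - 12) Hz sinusoid

# Their samples at t = n/S coincide, so the 12 Hz tone aliases to 8 Hz:
print(all(abs(x2(n / S) - alias(n / S)) < 1e-9 for n in range(200)))  # True
```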

Note that Xs(f) is periodic in f with period 1/∆, as it should be. In the absence of aliasing,

X(f) = (1/S) Xs(f) for |f| < S/2.  (2.54)

The relationship given by Eq. (2.53) still requires an infinite number of samples {x(n∆)} to reconstruct X(f) at each frequency f.

D. Nearest-Neighbor (NN) Interpolation

A common procedure for computing an approximation to x(t) from its samples {x(n∆)} is nearest-neighbor interpolation. The signal x(t) is approximated by x̂(t):

x̂(t) = { x(n∆) for (n − 0.5)∆ < t < (n + 0.5)∆, x((n + 1)∆) for (n + 0.5)∆ < t < (n + 1.5)∆, ... }  (2.55)

So x̂(t) is a piecewise-constant approximation to x(t), and it is related to the sampled signal xs(t) by

x̂(t) = xs(t) ∗ rect(t/∆).  (2.56)

Using the Fourier transform of a rectangle function (entry #4 in Table 2-5), the spectrum X̂(f) of x̂(t) is

X̂(f) = Xs(f) ∆ sinc(∆f),  (2.57)

where Xs(f) is the spectrum of the sampled signal. The zero-crossings of the sinc function occur at frequencies f = k/∆ = kS for integer k. These are also the centers of the copies of the original spectrum X(f) induced by sampling. So these copies are attenuated if the maximum frequency B of X(f) is such that B ≪ S. The factor ∆ in Eq. (2.57) cancels the factor S = 1/∆ in Eq. (2.47).

Example 2-2: Reconstruction of 2 Hz Sinusoid

For the 2 Hz sinusoid of Example 2-1: (a) plot the spectrum X̂1(f) of the approximated reconstruction x̂1(t), and (b) apply nearest-neighbor interpolation to generate x̂1(t).

Solution: (a) Spectrum X1s(f) of the sampled version of the 2 Hz sinusoid was generated earlier in Example 2-1 and displayed in Fig. 2-8(b). To obtain the spectrum of the approximate reconstruction x̂1(t), we apply Eq. (2.57) with ∆ = 1/S = 0.05 s:

X̂1(f) = X1s(f) ∆ sinc(∆f).

The sinc function is displayed in Fig. 2-10(a) in red and uses the vertical scale on the right-hand side, and the spectrum X̂1(f) is displayed in blue using the vertical scale on the left-hand side. The sinc function preserves the spectral components at ±2 Hz, but attenuates the components centered at ±20 Hz by a factor of 10 (approximately).

(b) Application of Eq. (2.56) to x1(n∆) = cos(4πn∆) with ∆ = 1/20 s yields the plot of x̂1(t) shown in Fig. 2-10(b).

Concept Question 2-6: Why must the sampling rate of a signal exceed double its maximum frequency, if it is to be reconstructed from its samples?

Concept Question 2-7: Why does nearest-neighbor interpolation work as well as it does?

Exercise 2-7: What is the Nyquist sampling rate for a signal bandlimited to 4 kHz?

Answer: 8000 samples/s. (See IP)

Exercise 2-8: A 500 Hz sinusoid is sampled at 900 samples/s. No anti-alias filter is being used. What is the frequency of the reconstructed continuous-time sinusoid?

Answer: 400 Hz. (See IP)

2-5 Review of 1-D Discrete-Time Signals and Systems

Through direct generalizations of the 1-D continuous-time definitions and properties of signals and systems presented earlier, we now extend our review to their discrete counterparts.

2-5.1 Discrete-Time Notation

A discrete-time signal is a physical quantity—such as voltage or acoustic pressure—that varies with discrete time n, where n is a dimensionless integer. Mathematically, a discrete-time signal is a function x[n] of discrete time n.

Figure 2-10 Plots of Example 2-2: (a) the spectrum X̂1(f) of the reconstructed 2 Hz signal (blue, left scale) together with the attenuating function sinc(0.05f) (red, right scale); (b) the 2 Hz sinusoid reconstructed from its samples at 20 Hz using nearest-neighbor interpolation, with the original 2 Hz sinusoid shown in red.



Figure 2-11 Stem plot representation of x[n] (stems of height 3, 2, 0, 4 at n = −1, 0, 1, 2).

• Discrete-time signals x[n] use square brackets, whereas continuous-time signals x(t) use parentheses.
• t has units of seconds, while n is dimensionless.

Discrete-time signals x[n] usually result from sampling a continuous-time signal x(t) at integer multiples of a sampling interval of ∆ seconds. That is,

• x[n] = x(n∆) for n = {..., −2, −1, 0, 1, 2, ...}.

Discrete-time signals are often represented using bracket notation, and plotted using stem plots. For example, the discrete-time signal x[n] defined by

x[n] = { 3 for n = −1, 2 for n = 0, 4 for n = 2, 0 for all other n },  (2.58)

can be depicted using either the bracket notation

x[n] = {3, 2, 0, 4},  (2.59)

where the underlined value (here, the 2) is the value at time n = 0, or in the form of the stem plot shown in Fig. 2-11.

The support of this x[n] is the interval [−1, 2], and its duration is 2 − (−1) + 1 = 4. In general, a discrete-time signal with support [a, b] has duration b − a + 1.

A discrete-time (Kronecker) impulse δ[n] is defined as

δ[n] = {1} = { 1 for n = 0, 0 for n ≠ 0 }.  (2.60)

Unlike the continuous-time impulse, the discrete-time impulse has no issues about infinite height and zero width. The sifting property of impulses still holds, with a summation replacing the integral:

∑_{i=−∞}^{∞} x[i] δ[n − i] = x[n].  (2.61)

2-5.2 Discrete-Time Eternal Sinusoids

A discrete-time eternal sinusoid is defined as

x[n] = A cos(Ω0n + θ), −∞ < n < ∞,  (2.62)

where Ω0 is the discrete-time frequency with units of radians per sample, so it is dimensionless.

Comparing the discrete-time eternal sinusoid to the continuous-time eternal sinusoid given by Eq. (2.1), which we repeat here as

x(t) = A cos(2πf0t + θ), −∞ < t < ∞,  (2.63)

it is apparent that a discrete-time sinusoid can be viewed as a continuous-time sinusoid sampled every ∆ seconds, at a sampling rate of S = 1/∆ samples/s. Thus,

Ω0 = 2πf0∆ = 2πf0/S,  (2.64)

which confirms that Ω0, like n, is dimensionless. However, almost all discrete-time eternal sinusoids are nonperiodic! In fact, x[n] is periodic only if

2π/Ω0 = N/D,  (2.65)

with N/D being a rational number. In such a case, the fundamental period of the sinusoid is N, provided N/D has been reduced to lowest terms.

Example 2-3: Discrete Sinusoid

Compute the fundamental period of

x[n] = 3 cos(0.3πn + 2).

Solution: From the expression for x[n], we deduce that Ω0 = 0.3π. Hence,

2π/Ω0 = 2π/(0.3π) = 20/3

reduced to lowest terms. Therefore, the period is N = 20.

Another important property of discrete-time eternal sinusoids is that the discrete-time frequency Ω0 is periodic, which is not true for continuous-time sinusoids. For any integer k, Eq. (2.62) can be rewritten as

x[n] = A cos(Ω0n + θ) = A cos((Ω0 + 2πk)n + θ) = A cos(Ω0′n + θ), −∞ < n < ∞,  (2.66)

with

Ω0′ = Ω0 + 2πk.  (2.67)

Also, the nature of the variation of x[n] with n has a peculiar dependence on Ω0. Consider, for example, the sinusoid

x[n] = cos(Ω0n), −∞ < n < ∞,  (2.68)

where, for simplicity, we assigned it an amplitude of 1 and a phase angle θ = 0. Next, let us examine what happens as we increase Ω0 from a value slightly greater than zero to 2π. Initially, as Ω0 is increased, x[n] oscillates faster and faster, until x[n] reaches a maximum rate of oscillation at Ω0 = π, namely

x[n] = cos(πn) = (−1)^n at Ω0 = π.  (2.69)

At Ω0 = π, x[n] oscillates as a function of n between (−1) and (+1). As Ω0 is increased beyond π, oscillation slows down and then stops altogether when Ω0 reaches 2π:

x[n] = cos(2πn) = 1 at Ω0 = 2π.  (2.70)

Beyond Ω0 = 2π, the oscillatory behavior starts to increase again, and so on. This behavior has no equivalence in the world of continuous-time sinusoids.

2-5.3 1-D Discrete-Time Systems

A 1-D discrete-time system accepts an input x[n] and produces an output y[n]:

x[n] → SYSTEM → y[n].

The definition of LTI for discrete-time systems is identical to the definition of LTI for continuous-time systems. If a discrete-time system has impulse response h[n], then the output y[n] can be computed from the input x[n] using the discrete-time convolution

y[n] = h[n] ∗ x[n] = ∑_{i=−∞}^{∞} h[i] x[n − i].  (2.71a)

Most of the continuous-time properties of convolution also apply in discrete time.

Real-world signals and filters are defined over specified ranges of n (and set to zero outside those ranges). If h[n] has support in the interval [n1, n2], Eq. (2.71a) becomes

y[n] = h[n] ∗ x[n] = ∑_{i=n1}^{n2} h[i] x[n − i].  (2.71b)

Reversing the sequence of h[n] and x[n] leads to the same outcome. That is, if x[n] has support in the interval [n3, n4], then

y[n] = h[n] ∗ x[n] = ∑_{i=n3}^{n4} x[i] h[n − i].  (2.71c)

◮ The duration of the convolution of two signals of durations N1 and N2 is Nc = N1 + N2 − 1, not N1 + N2. Since h[n] is of length N1 = n2 − n1 + 1 and x[n] is of length N2 = n4 − n3 + 1, the length of the convolution y[n] is Nc = N1 + N2 − 1 = (n2 − n1) + (n4 − n3) + 1. ◭

For causal signals (x[n] and h[n] equal to zero for n < 0), y[n] assumes the form

y[n] = ∑_{i=0}^{n} x[i] h[n − i], n ≥ 0.  (2.71d)

For example,

{1, 2} ∗ {3, 4} = {3, 10, 8}.  (2.72)

The duration of the output is 2 + 2 − 1 = 3.

2-5.4 Discrete-Time Convolution Properties

With one notable difference, the properties of the discrete-time convolution are the same as those for continuous time. If (t)
2-5 REVIEW OF 1-D DISCRETE-TIME SIGNALS AND SYSTEMS 63

is replaced with [n] and integrals are replaced with sums, the
convolution properties listed in Table 2-3 lead to those listed in
Table 2-6.

Table 2-6 Comparison of convolution properties for continuous-time and discrete-time signals.

Definition:
  Continuous time:  y(t) = h(t) ∗ x(t) = ∫_{−∞}^{∞} h(τ) x(t − τ) dτ
  Discrete time:    y[n] = h[n] ∗ x[n] = ∑_{i=−∞}^{∞} h[i] x[n − i]

1. Commutative:
  x(t) ∗ h(t) = h(t) ∗ x(t)
  x[n] ∗ h[n] = h[n] ∗ x[n]

2. Associative:
  [g(t) ∗ h(t)] ∗ x(t) = g(t) ∗ [h(t) ∗ x(t)]
  [g[n] ∗ h[n]] ∗ x[n] = g[n] ∗ [h[n] ∗ x[n]]

3. Distributive:
  x(t) ∗ [h1(t) + ··· + hN(t)] = x(t) ∗ h1(t) + ··· + x(t) ∗ hN(t)
  x[n] ∗ [h1[n] + ··· + hN[n]] = x[n] ∗ h1[n] + ··· + x[n] ∗ hN[n]

4. Causal ∗ Causal = Causal:
  y(t) = u(t) ∫_{0}^{t} h(τ) x(t − τ) dτ
  y[n] = u[n] ∑_{i=0}^{n} h[i] x[n − i]

5. Time-Shift:
  h(t − T1) ∗ x(t − T2) = y(t − T1 − T2)
  h[n − a] ∗ x[n − b] = y[n − a − b]

6. Sampling:
  x(t) ∗ δ(t − T) = x(t − T)
  x[n] ∗ δ[n − a] = x[n − a]

7. Width:
  width y(t) = width x(t) + width h(t)
  width y[n] = width x[n] + width h[n] − 1

8. Area:
  area of y(t) = area of x(t) × area of h(t)
  ∑_{n=−∞}^{∞} y[n] = (∑_{n=−∞}^{∞} h[n]) (∑_{n=−∞}^{∞} x[n])

9. Convolution with Step:
  y(t) = x(t) ∗ u(t) = ∫_{−∞}^{t} x(τ) dτ
  x[n] ∗ u[n] = ∑_{i=−∞}^{n} x[i]

The notable difference is associated with property #7. In
discrete time, the width (duration) of a signal that is zero-valued
outside interval [a, b] is b − a + 1, not b − a. Consider two
signals, h[n] and x[n], defined as follows:

Signal   From    To      Duration
h[n]     a       b       b − a + 1
x[n]     c       d       d − c + 1
y[n]     a + c   b + d   (b + d) − (a + c) + 1

where y[n] = h[n] ∗ x[n]. Note that the duration of y[n] is

(b + d) − (a + c) + 1 = (b − a + 1) + (d − c + 1) − 1
                      = duration h[n] + duration x[n] − 1.

2-5.5 Delayed-Impulses Computation Method

For finite-duration signals, computation of the convolution sum
can be facilitated by expressing one of the signals as a linear
combination of delayed impulses. The process is enabled by the
sampling property (#6 in Table 2-6).
Consider, for example, the convolution sum of the two signals
x[n] = {2, 3, 4} and h[n] = {5, 6, 7}, namely

y[n] = x[n] ∗ h[n] = {2, 3, 4} ∗ {5, 6, 7}.

The sampling property allows us to express x[n] in terms of
impulses,

x[n] = 2δ[n] + 3δ[n − 1] + 4δ[n − 2],

which leads to

y[n] = (2δ[n] + 3δ[n − 1] + 4δ[n − 2]) ∗ h[n]
     = 2h[n] + 3h[n − 1] + 4h[n − 2].
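The delayed-impulses superposition can be checked numerically. The sketch below is illustrative Python/NumPy (the book's own companion code uses MATLAB); it builds y[n] as scaled, delayed copies of h[n] and compares the result against a library convolution.

```python
import numpy as np

# Delayed-impulses method for y[n] = x[n] * h[n] with x[n] = {2, 3, 4}
# and h[n] = {5, 6, 7}: superpose scaled, delayed copies of h[n].
x = np.array([2, 3, 4])
h = np.array([5, 6, 7])
y = np.zeros(len(x) + len(h) - 1, dtype=int)   # duration Nc = 3 + 3 - 1 = 5
for delay, coeff in enumerate(x):              # x[n] = 2δ[n] + 3δ[n-1] + 4δ[n-2]
    y[delay:delay + len(h)] += coeff * h       # add coeff · h[n - delay]

print(y.tolist())                              # [10, 27, 52, 45, 28]
assert np.array_equal(y, np.convolve(x, h))    # matches direct convolution
```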
64 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS

Given that both x[n] and h[n] are of duration 3, the duration
of their convolution sum is 3 + 3 − 1 = 5, and it extends from
n = 0 to n = 4. Computing y[0] using the delayed-impulses
method (while keeping in mind that h[i] has a non-zero value
for only i = 0, 1, and 2) leads to

y[0] = 2h[0] + 3h[−1] + 4h[−2]
     = 2 × 5 + 3 × 0 + 4 × 0 = 10.

The process can then be repeated to obtain the values of y[n] for
n = 1, 2, 3, and 4.

Example 2-4: Discrete-Time Convolution

Given x[n] = {2, 3, 4} and h[n] = {5, 6, 7}, compute

y[n] = x[n] ∗ x[n]

by (a) applying the sum definition and (b) graphically.

Solution: (a) Both signals have a length of 3 and start at time
zero. That is, x[0] = 2, x[1] = 3, x[2] = 4, and x[i] = 0 for all
other values of i. Similarly, h[0] = 5, h[1] = 6, h[2] = 7, and
h[i] = 0 for all other values of i.
By Eq. (2.71d), the convolution sum of x[n] and h[n] is

y[n] = x[n] ∗ h[n] = ∑_{i=0}^{n} x[i] h[n − i].

Since h[i] = 0 for all values of i except i = 0, 1, and 2, it follows
that h[n − i] = 0 for all values of i except for i = n, n − 1, and
n − 2. With this constraint in mind, we can apply Eq. (2.71d) at
discrete values of n, starting at n = 0:

y[0] = ∑_{i=0}^{0} x[i] h[0 − i] = x[0] h[0] = 2 × 5 = 10,
y[1] = ∑_{i=0}^{1} x[i] h[1 − i] = x[0] h[1] + x[1] h[0] = 2 × 6 + 3 × 5 = 27,
y[2] = ∑_{i=0}^{2} x[i] h[2 − i] = x[0] h[2] + x[1] h[1] + x[2] h[0] = 2 × 7 + 3 × 6 + 4 × 5 = 52,
y[3] = ∑_{i=1}^{2} x[i] h[3 − i] = x[1] h[2] + x[2] h[1] = 3 × 7 + 4 × 6 = 45,
y[4] = ∑_{i=2}^{2} x[i] h[4 − i] = x[2] h[2] = 4 × 7 = 28,
y[n] = 0, otherwise.

Hence,

y[n] = {10, 27, 52, 45, 28}.

(b) The convolution sum can be computed graphically
through a four-step process.

Step 1: Replace index n with index i and plot x[i] and h[−i],
as shown in Fig. 2-12(a). Signal h[−i] is obtained from h[i] by
reflecting it about the vertical axis.

Step 2: Superimpose x[i] and h[−i], as in Fig. 2-12(b), and
multiply and sum them. Their product is 10.

Step 3: Shift h[−i] to the right by 1 to obtain h[1 − i], as
shown in Fig. 2-12(c). Multiplication and summation of x[i] by
h[1 − i] generates y[1] = 27. Shift h[1 − i] by one more unit to
the right to obtain h[2 − i], and then repeat the multiplication
and summation process to obtain y[2]. Continue the shifting and
multiplication and summation processes until the two signals no
longer overlap.

Step 4: Use the values of y[n] obtained in step 3 to generate a
plot of y[n], as shown in Fig. 2-12(g):

y[n] = {10, 27, 52, 45, 28}.

Concept Question 2-8: Why are most discrete-time sinusoids not periodic?

Concept Question 2-9: Why is the length of the convolution of two discrete-time signals not equal to the sum of the lengths of the two signals?

Exercise 2-9: A 28 Hz sinusoid is sampled at 100 samples/s. What is Ω0 for the resulting discrete-time sinusoid? What is the period of the resulting discrete-time sinusoid?
Answer: Ω0 = 0.56π; N = 25. (See IP)
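Exercise 2-9 can be confirmed in a few lines. This is an illustrative Python/NumPy sketch (the companion-site solutions use MATLAB): it computes Ω0 = 2πf/fs and searches for the smallest integer period N with Ω0 N an integer multiple of 2π.

```python
import numpy as np

# Exercise 2-9 check: a 28 Hz sinusoid sampled at 100 samples/s has
# Ω0 = 2π f / fs = 0.56π; the period N is the smallest integer with
# Ω0 N = 2π m for some integer m.
f, fs = 28, 100
Omega0 = 2 * np.pi * f / fs
assert abs(Omega0 - 0.56 * np.pi) < 1e-12

cycles = lambda N: Omega0 * N / (2 * np.pi)          # = 0.28 N cycles in N samples
N = next(N for N in range(1, 1000)
         if abs(cycles(N) - round(cycles(N))) < 1e-9)
print(N)   # 25
```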
[Figure 2-12: panels (a) and (b) superimpose x[i] (stems 2, 3, 4 at i = 0, 1, 2) with the reflected signal h[−i]; panels (c) through (f) shift h[−i] to the right to form h[1 − i] through h[4 − i], giving y[0] = 2 × 5 = 10, y[1] = 2 × 6 + 3 × 5 = 27, y[2] = 2 × 7 + 3 × 6 + 4 × 5 = 52, y[3] = 3 × 7 + 4 × 6 = 45, and y[4] = 4 × 7 = 28; panel (g) plots y[n] = {10, 27, 52, 45, 28}.]

Figure 2-12 Graphical computation of convolution sum.
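The four-step graphical procedure of Fig. 2-12 translates directly into code. The sketch below is illustrative plain Python (not the book's MATLAB): for each output index n it reflects h[i] into h[n − i], overlaps it with x[i], and multiplies and sums.

```python
# Flip-and-shift evaluation of Example 2-4.
x = [2, 3, 4]          # x[i], nonzero for i = 0, 1, 2
h = [5, 6, 7]          # h[i], nonzero for i = 0, 1, 2

y = []
for n in range(len(x) + len(h) - 1):       # n = 0, ..., 4
    total = 0
    for i, xi in enumerate(x):             # sum x[i] h[n - i] over the overlap
        if 0 <= n - i < len(h):
            total += xi * h[n - i]
    y.append(total)

print(y)   # [10, 27, 52, 45, 28], matching Fig. 2-12(g)
```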


Exercise 2-10: Compute the output y[n] of a discrete-time
LTI system with impulse response h[n] and input x[n], where
h[n] = {3, 1} and x[n] = {1, 2, 3, 4}.
Answer: {3, 7, 11, 15, 4}. (See IP)

2-6 Discrete-Time Fourier Transform (DTFT)

The discrete-time Fourier transform (DTFT) is the discrete-
time counterpart to the Fourier transform. It has the same two
functions: (1) to compute spectra of signals and (2) to analyze
the frequency responses of LTI systems.

2-6.1 Definition of the DTFT

The DTFT of x[n], denoted X(Ω), and its inverse are defined as

X(Ω) = ∑_{n=−∞}^{∞} x[n] e^{−jΩn}   (2.73a)

and

x[n] = (1/2π) ∫_{−π}^{π} X(Ω) e^{jΩn} dΩ.   (2.73b)

Readers familiar with the Fourier series will recognize that
the DTFT X(Ω) is a Fourier series expansion with x[n] as the
coefficients of the Fourier series. The inverse DTFT is simply
the formula used for computing the coefficients x[n] of the
Fourier series expansion of the periodic function X(Ω).
We note that the DTFT definition given by Eq. (2.73a) is
the same as the formula given by Eq. (2.53) for computing the
spectrum Xs(f) of a continuous-time signal x(t) directly from
its samples {x(nΔ)}, with Ω = 2πfΔ.
The inverse DTFT given by Eq. (2.73b) can be derived as
follows. First, we introduce the orthogonality property

(1/2π) ∫_{−π}^{π} e^{jΩ(m−n)} dΩ = δ[m − n].   (2.74)

To establish the validity of this property, we consider two cases,
namely: (1) when m ≠ n and (2) when m = n.

(a) m ≠ n

Evaluation of the integral in Eq. (2.74) leads to

(1/2π) ∫_{−π}^{π} e^{jΩ(m−n)} dΩ = [e^{jΩ(m−n)} / (2πj(m − n))]_{−π}^{π}
  = (e^{jπ(m−n)} − e^{−jπ(m−n)}) / (j2π(m − n))
  = ((−1)^{m−n} − (−1)^{m−n}) / (j2π(m − n))
  = 0,   (m ≠ n).   (2.75)

(b) m = n

If m = n, the integral reduces to

(1/2π) ∫_{−π}^{π} e^{jΩ(n−n)} dΩ = (1/2π) ∫_{−π}^{π} 1 dΩ = 1.   (2.76)

The results given by Eqs. (2.75) and (2.76) can be combined into
the definition of the orthogonality property given by Eq. (2.74).
Having verified the validity of the orthogonality property, we
now use it to derive Eq. (2.73b). Upon multiplying the definition
for the DTFT given by Eq. (2.73a) by (1/2π) e^{jΩm} and integrating
over Ω, we have

(1/2π) ∫_{−π}^{π} X(Ω) e^{jΩm} dΩ = (1/2π) ∫_{−π}^{π} ∑_{n=−∞}^{∞} x[n] e^{jΩ(m−n)} dΩ
  = ∑_{n=−∞}^{∞} x[n] [(1/2π) ∫_{−π}^{π} e^{jΩ(m−n)} dΩ]
  = ∑_{n=−∞}^{∞} x[n] δ[m − n] = x[m].   (2.77)

Equation (2.74) was used in the final step leading to Eq. (2.77).
Exchanging the order of integration and summation in Eq. (2.77)
is acceptable if the summand is absolutely summable; i.e., if the
DTFT is defined. Finally, replacing the index m with n in the top
left-hand side and bottom right-hand side of Eq. (2.77) yields
the inverse DTFT expression given by Eq. (2.73b).
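The definitions and the orthogonality argument can be sanity-checked numerically by approximating the inverse-DTFT integral with a Riemann sum on a dense frequency grid. This is an illustrative Python/NumPy sketch (the signal {3, 1, 4} is an arbitrary choice, not from the text at this point):

```python
import numpy as np

# Evaluate X(Ω) of x[n] = {3, 1, 4} per Eq. (2.73a) on a dense grid, then
# recover x[n] per Eq. (2.73b) by approximating (1/2π)∫ X(Ω) e^{jΩn} dΩ.
x = np.array([3.0, 1.0, 4.0])
omega = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
X = sum(x[n] * np.exp(-1j * omega * n) for n in range(len(x)))   # Eq. (2.73a)

# mean over the uniform grid = (1/2π) Σ X(Ω_m) e^{jΩ_m n} ΔΩ
x_rec = np.array([np.mean(X * np.exp(1j * omega * n)) for n in range(len(x))])
print(np.round(x_rec.real, 6).tolist())        # [3.0, 1.0, 4.0]
assert np.allclose(x_rec, x)                   # inverse DTFT recovers x[n]
```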
2-6.2 Properties of the DTFT

The DTFT can be regarded as the Fourier transform of the
sampled signal xs(t) with a sampling interval Δ = 1:

X(Ω) = F{∑_{n=−∞}^{∞} x[n] δ(t − n)}.   (2.78)

This statement can be verified by subjecting Eq. (2.73a) to the
time-shift property of the Fourier transform (#3 in Table 2-4).
Consequently, most (but not all) of the Fourier transform prop-
erties listed in Table 2-4 extend directly to the DTFT with 2πf
replaced with Ω, which we list here in Table 2-7. The exceptions
mostly involve the following property of the DTFT:

◮ The DTFT X(Ω) is periodic with period 2π. ◭

We also note the following special relationships between x[n]
and X(Ω):

X(0) = ∑_{n=−∞}^{∞} x[n],   (2.79a)
x[0] = (1/2π) ∫_{−π}^{π} X(Ω) dΩ,   (2.79b)

and

X(±π) = ∑_{n=−∞}^{∞} (−1)^n x[n].   (2.79c)

If x[n] is real-valued, then conjugate symmetry holds:

X(Ω)* = X(−Ω).

Parseval's theorem for the DTFT states that the energy of x[n]
is identical, whether computed in the discrete-time domain n or
in the frequency domain Ω:

∑_{n=−∞}^{∞} |x[n]|² = (1/2π) ∫_{−π}^{π} |X(Ω)|² dΩ.   (2.80)

The energy spectral density is now (1/2π)|X(Ω)|².
Finally, by analogy to continuous time, a discrete-time ideal
lowpass filter with cutoff frequency Ω0 has the frequency
response, for |Ω| < π (recall that H(Ω) is periodic with period 2π),

H(Ω) = { 1 for |Ω| < Ω0,
         0 for Ω0 < |Ω| ≤ π,   (2.81)

which eliminates frequency components of x[n] that lie in the
range Ω0 < |Ω| ≤ π.

Table 2-7 Properties of the DTFT.

Property              x[n]               X(Ω)
1. Linearity          ∑ ci xi[n]         ∑ ci Xi(Ω)
2. Time shift         x[n − n0]          X(Ω) e^{−jn0Ω}
3. Modulation         x[n] e^{jΩ0n}      X(Ω − Ω0)
4. Time reversal      x[−n]              X(−Ω)
5. Conjugation        x*[n]              X*(−Ω)
6. Time convolution   h[n] ∗ x[n]        H(Ω) X(Ω)

Special DTFT Relationships
7. Conjugate symmetry   X*(Ω) = X(−Ω)
8. Zero frequency       X(0) = ∑_{n=−∞}^{∞} x[n]
9. Zero time            x[0] = (1/2π) ∫_{−π}^{π} X(Ω) dΩ
10. Ω = ±π              X(±π) = ∑_{n=−∞}^{∞} (−1)^n x[n]
11. Rayleigh's (often called Parseval's) theorem
                        ∑_{n=−∞}^{∞} |x[n]|² = (1/2π) ∫_{−π}^{π} |X(Ω)|² dΩ

2-6.3 Important DTFT Pairs

For easy access, several DTFT pairs are provided in Table 2-8.
In all cases, the expressions for X(Ω) are periodic with period
2π, as they should be.
Entries #7 and #8 of Table 2-8 deserve more discussion,
which we now present.
Table 2-8 Discrete-time Fourier transform (DTFT) pairs.

x[n]                         X(Ω)                                                        Condition
1.  δ[n]                     1
1a. δ[n − m]                 e^{−jmΩ}                                                    m = integer
2.  1                        2π ∑_{k=−∞}^{∞} δ(Ω − 2πk)
3.  e^{jΩ0n}                 2π ∑_{k=−∞}^{∞} δ(Ω − Ω0 − 2πk)
4.  cos(Ω0n)                 π ∑_{k=−∞}^{∞} [δ(Ω − Ω0 − 2πk) + δ(Ω + Ω0 − 2πk)]
5.  sin(Ω0n)                 (π/j) ∑_{k=−∞}^{∞} [δ(Ω − Ω0 − 2πk) − δ(Ω + Ω0 − 2πk)]
6.  aⁿ cos(Ω0n + θ) u[n]     [e^{j2Ω} cos θ − a e^{jΩ} cos(Ω0 − θ)] / [e^{j2Ω} − 2a e^{jΩ} cos Ω0 + a²]   |a| < 1
7.  rect[n/N]                sin(Ω(N + 1/2)) / sin(Ω/2)                                  Ω ≠ 2πk
8.  (Ω0/π) sinc((Ω0/π)n) = sin(Ω0n)/(πn)    ∑_{k=−∞}^{∞} rect((Ω − 2πk)/(2Ω0))

A. Discrete-Time Sinc Functions

The impulse response of an ideal lowpass filter is

h[n] = (1/2π) ∫_{−Ω0}^{Ω0} 1 · e^{jΩn} dΩ = (Ω0/π) sinc((Ω0/π) n).   (2.82)

This is called a discrete-time sinc function. A discrete-time sinc
function h[n] with Ω0 = π/4 is displayed in Fig. 2-13(a), along
with its frequency response H(Ω) in Fig. 2-13(b). Such a filter
is impractical for real-world applications, because it is unstable
and it has infinite duration. To override these limitations, we
can multiply h[n] by a window function, such as a Hamming
window. The modified impulse response hFIR[n] is then given by

hFIR[n] = h[n] hHam[n]
        = { (Ω0/π) sinc((Ω0/π) n) [0.54 + 0.46 cos(πn/N)]   for |n| ≤ N,
            0                                               for |n| > N.   (2.83)

As can be seen in Fig. 2-13(c) and (d), hFIR[n] with N = 10
provides a good approximation to an ideal lowpass filter, with
a finite duration. The Hamming-windowed filter belongs to a
group of filters called finite-impulse response (FIR) filters. FIR
filters can also be designed using a minimax criterion, resulting
in an equiripple filter. This and other FIR filter design proce-
dures are discussed in discrete-time signal processing textbooks.
[Figure 2-13: (a) impulse response h[n] of the ideal lowpass filter; (b) the ideal lowpass filter spectrum H(Ω) with Ω0 = π/4; (c) impulse response hFIR[n] of the Hamming-windowed filter; (d) spectrum HFIR(Ω) of the Hamming-windowed filter.]

Figure 2-13 Parts (a) and (b) are for an ideal lowpass filter with Ω0 = π/4, and parts (c) and (d) are for the same filter after multiplying its
impulse response with a Hamming window of length N = 10.
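Eq. (2.83) is easy to reproduce numerically. The sketch below is illustrative Python/NumPy (the book's figures were generated with its MATLAB companion code); it builds the windowed filter of Fig. 2-13(c) and checks its center tap and symmetry.

```python
import numpy as np

# Hamming-windowed FIR approximation of the ideal lowpass filter, Eq. (2.83).
Omega0, N = np.pi / 4, 10
n = np.arange(-N, N + 1)                              # |n| <= N, so 2N + 1 = 21 taps
h = (Omega0 / np.pi) * np.sinc(Omega0 * n / np.pi)    # Eq. (2.82); np.sinc(t) = sin(πt)/(πt)
h_fir = h * (0.54 + 0.46 * np.cos(np.pi * n / N))     # Hamming taper

assert abs(h_fir[N] - Omega0 / np.pi) < 1e-12         # center tap h_FIR[0] = Ω0/π = 0.25
assert np.allclose(h_fir, h_fir[::-1])                # even symmetry (linear phase)
print(len(h_fir), round(float(h_fir[N]), 4))          # 21 0.25
```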

B. Discrete Sinc Functions

A discrete-time rectangle function rect[n/N] is defined as

rect[n/N] = { 1 for |n| ≤ N,
              0 for |n| > N.   (2.84)

We note that rect[n/N] has duration 2N + 1. This differs from the
continuous-time rect function rect(t/T), which has duration T.
The DTFT of rect[n/N] is obtained from Eq. (2.73a) by setting
x[n] = 1 and limiting the summation to the range (−N, N):

DTFT{rect[n/N]} = ∑_{n=−N}^{N} e^{−jΩn}.   (2.85)

Using the formula

∑_{k=0}^{N} r^k = (1 − r^{N+1}) / (1 − r)   (2.86)

with r = e^{jΩ}, the summation in Eq. (2.85) becomes

∑_{n=−N}^{N} e^{−jΩn} = e^{−jΩN} ∑_{n=0}^{2N} e^{jΩn}
  = e^{−jΩN} (1 − e^{jΩ(2N+1)}) / (1 − e^{jΩ})
  = sin((2N + 1)Ω/2) / sin(Ω/2).   (2.87)

This is called a discrete (or periodic) sinc function. A rectangular
pulse with N = 10 is shown in Fig. 2-14 along with its DTFT.

Concept Question 2-10: Why does the DTFT share so many properties with the CTFT?

Concept Question 2-11: Why is the DTFT periodic in frequency?
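The closed form in Eq. (2.87) can be verified against the direct sum. This is an illustrative Python/NumPy check; the grid avoids Ω = 2πk, where the ratio takes its limiting value 2N + 1.

```python
import numpy as np

# Check Eq. (2.87): DTFT of rect[n/N] equals sin((2N+1)Ω/2) / sin(Ω/2).
N = 10
omega = np.linspace(0.05, np.pi, 200)          # avoid Ω = 0 (limit there is 2N+1 = 21)
direct = sum(np.exp(-1j * omega * n) for n in range(-N, N + 1))
closed = np.sin((2 * N + 1) * omega / 2) / np.sin(omega / 2)

assert np.allclose(direct, closed)             # imaginary parts cancel by symmetry
print("closed form matches the direct sum")
```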
[Figure 2-14: (a) x[n] = rect(n/10); (b) its DTFT X(Ω), a discrete sinc with peak value 21 at Ω = 0, plotted over −2π ≤ Ω ≤ 2π.]

Figure 2-14 Discrete-time rectangle function with N = 10 and its DTFT.

Exercise 2-11: Compute the DTFT of 4 cos(0.15πn + 1).
Answer:

DTFT[4 cos(0.15πn + 1)] = ∑_{k=−∞}^{∞} 4π e^{j1} δ(Ω − 0.15π − 2kπ)
                        + ∑_{k=−∞}^{∞} 4π e^{−j1} δ(Ω + 0.15π − 2kπ).

(See IP)

Exercise 2-12: Compute the inverse DTFT of

4 cos(2Ω) + 6 cos(Ω) + j8 sin(2Ω) + j2 sin(Ω).

Answer:

DTFT⁻¹[4 cos(2Ω) + 6 cos(Ω) + j8 sin(2Ω) + j2 sin(Ω)] = {6, 4, 0, 2, −2}.

(See IP)

[Figure 2-15: the DTFT X(Ω) (continuous curve) overlaid with the DFT samples X[k] at Ω = 2πk/N, k = 0, 1, ..., 6.]

Figure 2-15 The DFT X[k] is a sampled version of the DTFT X(Ω).

2-7 Discrete Fourier Transform (DFT)

The discrete Fourier transform (DFT) is the numerical bridge
between the DTFT and the fast Fourier transform (FFT). For a
signal {x[n], n = 0, ..., M − 1} of duration M, its DFT of order
N is X[k], where X[k] is X(Ω) sampled at the N frequencies
Ω = {2πk/N, k = 0, ..., N − 1}:

X[k] = X(Ω = 2πk/N),   k = 0, ..., N − 1.   (2.88)

An example is shown in Fig. 2-15. Usually N = M; i.e., the order
N of the DFT is equal to the duration M of x[n]. However, in
some situations, it is desirable to select N to be larger than M.
For example, to compute and plot the DTFT of a short-duration
signal such as x[n] = {3, 1, 4}, for which M = 3, we may choose
N to be 256 or 512 so as to produce a smooth plot, such as the
blue plot in Fig. 2-15. Choosing N > M will also allow the use of
the FFT to compute convolutions quickly (see Section 2-7.2C).
As a result, the properties of the DFT follow those of the
DTFT, with some exceptions, such as time reversal and cyclic
convolution (discussed later in Section 2-7.2C).
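Eq. (2.88) can be illustrated directly: the N-point DFT of the short signal {3, 1, 4} equals its DTFT sampled at Ω = 2πk/N. The sketch below is illustrative Python/NumPy (the choice N = 8 is an assumption for the demonstration):

```python
import numpy as np

# X[k] = X(Ω = 2πk/N): the N-point DFT samples the DTFT, Eq. (2.88).
x = np.array([3.0, 1.0, 4.0])                  # M = 3
N = 8                                          # DFT order N > M
Xk = np.fft.fft(x, N)                          # np.fft.fft zero-pads x to length N

k = np.arange(N)
omega_k = 2 * np.pi * k / N
X_dtft = sum(x[n] * np.exp(-1j * omega_k * n) for n in range(len(x)))  # Eq. (2.73a)

assert np.allclose(Xk, X_dtft)
print(round(float(Xk[0].real), 6))             # X[0] = Σ x[n] = 8.0
```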
◮ To avoid confusion between the DTFT and the DFT, the
DFT, being a discrete function of integer k, uses square
brackets, as in X[k], while the DTFT, being a continuous
and periodic function of real numbers Ω, uses round paren-
theses, as in X(Ω). ◭

2-7.1 Definition of the DFT

The N-point (or Nth-order) DFT of {x[n], n = 0, ..., M − 1},
denoted X[k], and the inverse DFT of X[k], namely x[n], are
defined as

X[k] = ∑_{n=0}^{M−1} x[n] e^{−j2πnk/N},   k = 0, ..., N − 1,   (2.89a)

and

x[n] = (1/N) ∑_{k=0}^{N−1} X[k] e^{j2πnk/N},   n = 0, ..., M − 1.   (2.89b)

The definition for X[k] given by Eq. (2.89a) is obtained by
applying Eq. (2.88) to Eq. (2.73a), namely by replacing Ω with
2πk/N and limiting the range of summation over n to [0, M − 1].
For the inverse DFT, the definition given by Eq. (2.89b) can
be derived by following a process similar to that we presented
earlier in Section 2-6.1 in connection with the inverse DTFT.
Specifically, we start with the discrete equivalent of the orthog-
onality property given by Eq. (2.74):

(1/N) ∑_{k=0}^{N−1} e^{j2π(m−n)k/N} = δ[m − n].   (2.90)

Next, multiplying the definition of the DFT given by Eq. (2.89a)
by (1/N) e^{j2πmk/N} and summing over k gives

(1/N) ∑_{k=0}^{N−1} X[k] e^{j2πmk/N} = (1/N) ∑_{k=0}^{N−1} ∑_{n=0}^{M−1} x[n] e^{j2π(m−n)k/N}
  = (1/N) ∑_{n=0}^{M−1} x[n] ∑_{k=0}^{N−1} e^{j2π(m−n)k/N}
  = ∑_{n=0}^{M−1} x[n] δ[m − n]
  = { x[m]  for 0 ≤ m ≤ M − 1,
      0     for M ≤ m ≤ N − 1.   (2.91)

Upon changing index m to n on the left-hand side of Eq. (2.91)
and in the right-hand side in x[m], we obtain the formal definition
of the inverse DFT given by Eq. (2.89b).
The main use of the DFT is to compute spectra of signals and
frequency responses of LTI systems. All plots of spectra in this
book (and all other books on signal and image processing) were
made by computing them using the DFT and plotting the results.

Example 2-5: DFT of Periodic Sinusoids

Compute the N-point DFT of the segment of a periodic discrete-
time sinusoid

x[n] = A cos(2π(k0/N)n + θ),   0 ≤ n ≤ N − 1,   (2.92)

with k0 a fixed integer.

Solution: We start by rewriting x[n] as the sum of two expo-
nentials:

x[n] = (A/2) e^{jθ} e^{j2πk0n/N} + (A/2) e^{−jθ} e^{−j2πk0n/N}
     = (A/2) e^{jθ} e^{j2πk0n/N} + (A/2) e^{−jθ} e^{j2π(N−k0)n/N},   (2.93)

where we have multiplied the second term in the first step by
e^{j2πNn/N} = e^{j2πn} = 1.
Inserting Eq. (2.93) into Eq. (2.89a) with M = N, we have

X[k] = ∑_{n=0}^{N−1} (A/2) e^{jθ} e^{j2πk0n/N} e^{−j2πnk/N}
       + ∑_{n=0}^{N−1} (A/2) e^{−jθ} e^{j2π(N−k0)n/N} e^{−j2πnk/N}
     = (A/2) e^{jθ} ∑_{n=0}^{N−1} e^{j2πn(k0−k)/N}
       + (A/2) e^{−jθ} ∑_{n=0}^{N−1} e^{j2πn(N−k0−k)/N}.   (2.94a)

In view of the orthogonality property given by Eq. (2.90), the
summations simplify to impulses, resulting in

X[k] = (A/2) e^{jθ} N δ[k − k0] + (A/2) e^{−jθ} N δ[N − k − k0],   (2.94b)

which can be restated as

X[k] = { (N/2) A e^{jθ}   for k = k0,
         (N/2) A e^{−jθ}  for k = N − k0,
         0                otherwise.   (2.94c)

Thus, the DFT of a segment of a periodic sinusoid with
Ω0 = 2πk0/N consists of two discrete-time impulses, at indices
k = k0 and k = N − k0.

2-7.2 Properties of the DFT

Table 2-9 provides a summary of the salient properties of the
DFT, as well as some of the special relationships between x[n]
and X[k]. Of particular note are the three relationships

X[0] = ∑_{n=0}^{N−1} x[n],   (2.95)
x[0] = (1/N) ∑_{k=0}^{N−1} X[k],   (2.96)

and

X[N/2] = ∑_{n=0}^{N−1} (−1)^n x[n]   for N even.   (2.97)

Table 2-9 Properties of the DFT. In the time-shift and modulation properties, (n − n0) and (k − k0) must be reduced mod(N).

Property           x[n]                 X[k]
1. Linearity       ∑ ci xi[n]           ∑ ci Xi[k]
2. Time shift      x[n − n0]            e^{−j2πn0k/N} X[k]
3. Modulation      e^{j2πk0n/N} x[n]    X[k − k0]
4. Time reversal   x[N − n]             X[N − k]
5. Conjugation     x*[n]                X*[N − k]
6. Convolution     h[n] ⊛ x[n]          H[k] X[k]

Special DFT Relationships
7. Conjugate symmetry   X*[k] = X[N − k]
8. Zero frequency       X[0] = ∑_{n=0}^{N−1} x[n]
9. Zero time            x[0] = (1/N) ∑_{k=0}^{N−1} X[k]
10. k = N/2             X[N/2] = ∑_{n=0}^{N−1} (−1)^n x[n]
11. Parseval's theorem  ∑_{n=0}^{N−1} |x[n]|² = (1/N) ∑_{k=0}^{N−1} |X[k]|²

A. Conjugate Symmetry Property of the DFT

If x[n] is real-valued, then conjugate symmetry holds for the
DFT, which takes the form

X*[k] = X[N − k],   k = 1, ..., N − 1.   (2.98)

For example, the 4-point DFT of x[n] = {1, 2, 3, 4} is

X[k] = {10, −2 + j2, −2, −2 − j2}

and

X*[1] = −2 − j2 = X[4 − 1] = X[3] = −2 − j2.

Similarly,

X*[2] = X[4 − 2] = X[2] = −2,
which is real-valued. This conjugate-symmetry property follows
from the definition of the DFT given by Eq. (2.89a):

X*[k] = ∑_{n=0}^{N−1} x[n] e^{j2πnk/N}   (2.99)

and

X[N − k] = ∑_{n=0}^{N−1} x[n] e^{−j2πn(N−k)/N} = ∑_{n=0}^{N−1} x[n] e^{−j2πn} e^{j2πnk/N}.   (2.100)

Since n is an integer, e^{−j2πn} = 1 and Eq. (2.100) reduces to

X[N − k] = ∑_{n=0}^{N−1} x[n] e^{j2πnk/N} = X*[k].

B. Use of DFT for Convolution

The convolution property of the DTFT extends to the DFT after
some modifications. Consider two signals, x1[n] and x2[n], with
N-point DFTs X1[k] and X2[k]. From Eq. (2.89b), the inverse
DFT of their product is

DFT⁻¹(X1[k] X2[k]) = (1/N) ∑_{k=0}^{N−1} (X1[k] X2[k]) e^{j2πnk/N}
  = (1/N) ∑_{k=0}^{N−1} e^{j2πnk/N} [∑_{n1=0}^{N−1} x1[n1] e^{−j2πn1k/N}] · [∑_{n2=0}^{N−1} x2[n2] e^{−j2πn2k/N}].   (2.101)

Rearranging the order of the summations gives

DFT⁻¹(X1[k] X2[k]) = (1/N) ∑_{n1=0}^{N−1} ∑_{n2=0}^{N−1} x1[n1] x2[n2] ∑_{k=0}^{N−1} e^{j(2π/N)k(n−n1−n2)}.   (2.102)

In view of the orthogonality property given by Eq. (2.90),
Eq. (2.102) reduces to

DFT⁻¹(X1[k] X2[k]) = (1/N) ∑_{n1=0}^{N−1} ∑_{n2=0}^{N−1} x1[n1] x2[n2] N δ[(n − n1 − n2)N]
  = ∑_{n1=0}^{N−1} x1[n1] x2[(n − n1)N],   (2.103)

where (n − n1)N means (n − n1) reduced mod N (i.e., reduced
by the largest integer multiple of N without (n − n1) becoming
negative).

C. DFT and Cyclic Convolution

Because of the mod N reduction cycle, the expression on the
right-hand side of Eq. (2.103) is called the cyclic or circular
convolution of signals x1[n] and x2[n]. The terminology helps
distinguish it from the traditional linear convolution of two
nonperiodic signals.
The symbol commonly used to denote cyclic convolution is ⊛.
Combining Eqs. (2.101) and (2.103) leads to

yc[n] = x1[n] ⊛ x2[n] = ∑_{n1=0}^{N−1} x1[n1] x2[(n − n1)N]
      = DFT⁻¹(X1[k] X2[k])
      = (1/N) ∑_{k=0}^{N−1} X1[k] X2[k] e^{j2πnk/N}.   (2.104)

The cyclic convolution yc[n] can certainly be computed by
applying Eq. (2.104), but it can also be computed from the
linear convolution x1[n] ∗ x2[n] by aliasing the latter. To illustrate,
suppose x1[n] and x2[n] are both of duration N. The linear
convolution of the two signals

y[n] = x1[n] ∗ x2[n]   (2.105)

is of duration 2N − 1, extending from n = 0 to n = 2N − 2.
Aliasing y[n] means defining z[n], the aliased version of y[n],
as

z[0] = y[0] + y[0 + N]
z[1] = y[1] + y[1 + N]
  ...
z[N − 2] = y[N − 2] + y[2N − 2]
z[N − 1] = y[N − 1].   (2.106)

The aliasing process leads to the result that z[n] is the cyclic
convolution of x1[n] and x2[n]:

yc[n] = z[n] = x1[n] ⊛ x2[n].   (2.107)
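The equivalence of the two routes, inverse DFT of the product versus aliasing the linear convolution, is easy to verify numerically. This is an illustrative Python/NumPy sketch with arbitrarily chosen signals (not from the text):

```python
import numpy as np

# Cyclic convolution two ways: inverse DFT of the DFT product (Eq. (2.104))
# versus aliasing the linear convolution (Eq. (2.106)).
x1 = np.array([3, 1, 4, 1])
x2 = np.array([2, 0, 1, 5])
N = len(x1)

yc = np.fft.ifft(np.fft.fft(x1) * np.fft.fft(x2)).real.round().astype(int)

y = np.convolve(x1, x2)                        # linear convolution, length 2N - 1
z = y[:N].copy()
z[:N - 1] += y[N:]                             # fold y[n + N] back onto y[n]

print(yc.tolist(), z.tolist())                 # both are [15, 23, 16, 18]
assert np.array_equal(yc, z)
```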
Example 2-6: Cyclic Convolution

Given the two signals

x1[n] = {2, 1, 4, 3},
x2[n] = {5, 3, 2, 1},

compute the cyclic convolution of the two signals by
(a) applying the DFT method;
(b) applying the aliasing of the linear convolution method.

Solution:
(a) With N = 4, application of Eq. (2.89a) to x1[n] and x2[n]
leads to

X1[k] = {10, −2 + j2, 2, −2 − j2},
X2[k] = {11, 3 − j2, 3, 3 + j2}.

The point-by-point product of X1[k] and X2[k] is

X1[k] X2[k] = {10 × 11, (−2 + j2)(3 − j2), 2 × 3, (−2 − j2)(3 + j2)}
            = {110, −2 + j10, 6, −2 − j10}.

Application of Eq. (2.104) leads to

x1[n] ⊛ x2[n] = {28, 21, 30, 31}.

(b) Per Eq. (2.71d), the linear convolution of

x1[n] = {2, 1, 4, 3}  and  x2[n] = {5, 3, 2, 1}

is

y[n] = x1[n] ∗ x2[n] = ∑_{i=0}^{3} x1[i] x2[n − i] = {10, 11, 27, 31, 18, 10, 3}.

Per Eq. (2.106),

z[n] = {y[0] + y[4], y[1] + y[5], y[2] + y[6], y[3]}
     = {10 + 18, 11 + 10, 27 + 3, 31} = {28, 21, 30, 31}.

Hence, by Eq. (2.107),

x1[n] ⊛ x2[n] = z[n] = {28, 21, 30, 31},

which is the same answer obtained in part (a).

2-7.3 DFT and Linear Convolution

In the preceding subsection, we examined how the DFT can
be used to compute the cyclic convolution of two discrete-
time signals (Eq. (2.104)). The same method can be applied to
compute the linear convolution of the two signals, provided a
preparatory step of zero-padding the two signals is applied first.
Let us suppose that signal x1[n] is of duration N1 and signal
x2[n] is of duration N2, and we are interested in computing their
linear convolution

y[n] = x1[n] ∗ x2[n].

The duration of y[n] is

Nc = N1 + N2 − 1.   (2.108)

Next, we zero-pad x1[n] and x2[n] so that their durations are
equal to or greater than Nc. As we will see in Section 2-8 on how
the fast Fourier transform (FFT) is used to compute the DFT, it
is advantageous to choose the total length of the zero-padded
signals to be M such that M ≥ Nc, and simultaneously M is a
power of 2.
The zero-padded signals are defined as

x′1[n] = {x1[n], 0, ..., 0}   (N1 samples followed by M − N1 zeros),   (2.109a)
x′2[n] = {x2[n], 0, ..., 0}   (N2 samples followed by M − N2 zeros),   (2.109b)

and their M-point DFTs are X′1[k] and X′2[k], respectively. The
linear convolution y[n] can now be computed by a modified
version of Eq. (2.104), namely

y[n] = x′1[n] ∗ x′2[n] = DFT⁻¹{X′1[k] X′2[k]}
     = (1/M) ∑_{k=0}^{M−1} X′1[k] X′2[k] e^{j2πnk/M}.   (2.110)

Note that X′1[k] and X′2[k] can be computed as M-point DFTs
of x1[n] and x2[n] directly, since using an M-point DFT performs
the zero-padding automatically.
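The zero-pad-then-multiply recipe of Eq. (2.110) is a one-liner with an FFT library. The sketch below is illustrative Python/NumPy (the signals are those of Example 2-7, which follows):

```python
import numpy as np

# Linear convolution via the DFT (Eq. (2.110)): zero-pad both signals to
# M >= N1 + N2 - 1 (a power of 2 suits the FFT), multiply DFTs, invert.
x1 = np.array([4, 5])
x2 = np.array([1, 2, 3])
Nc = len(x1) + len(x2) - 1                 # Nc = 4
M = 1 << (Nc - 1).bit_length()             # smallest power of 2 with M >= Nc

Y = np.fft.fft(x1, M) * np.fft.fft(x2, M)  # fft(x, M) zero-pads to length M
y = np.fft.ifft(Y).real.round().astype(int)[:Nc]

print(y.tolist())                          # [4, 13, 22, 15], as in Example 2-7
```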
Example 2-7: DFT Convolution

Given signals x1[n] = {4, 5} and x2[n] = {1, 2, 3}, (a) compute
their convolution in discrete time, and (b) compare the result
with the DFT relation given by Eq. (2.110).

Solution:
(a) Application of Eq. (2.71a) gives

x1[n] ∗ x2[n] = ∑_{i=0}^{3} x1[i] x2[n − i] = {4, 13, 22, 15}.

(b) Since x1[n] is of length N1 = 2 and x2[n] is of length
N2 = 3, their convolution is of length

Nc = N1 + N2 − 1 = 2 + 3 − 1 = 4.

Hence, we need to zero-pad x1[n] and x2[n] as

x′1[n] = {4, 5, 0, 0}  and  x′2[n] = {1, 2, 3, 0}.

From Eq. (2.89a) with N = Nc = 4, the 4-point DFT of
x′1[n] = {4, 5, 0, 0} is

X′1[k] = ∑_{n=0}^{3} x′1[n] e^{−jkπn/2},   k = 0, 1, 2, 3,

which gives

X′1[0] = 4(1) + 5(1) + 0(1) + 0(1) = 9,
X′1[1] = 4(1) + 5(−j) + 0(−1) + 0(j) = 4 − j5,
X′1[2] = 4(1) + 5(−1) + 0(1) + 0(−1) = −1,
X′1[3] = 4(1) + 5(j) + 0(−1) + 0(−j) = 4 + j5.

Similarly, the 4-point DFT of x′2[n] = {1, 2, 3, 0} gives

X′2[0] = 6,
X′2[1] = −2 − j2,
X′2[2] = 2,
X′2[3] = −2 + j2.

Multiplication of corresponding pairs gives

X′1[0] X′2[0] = 9 × 6 = 54,
X′1[1] X′2[1] = (4 − j5)(−2 − j2) = −18 + j2,
X′1[2] X′2[2] = −1 × 2 = −2,
X′1[3] X′2[3] = (4 + j5)(−2 + j2) = −18 − j2.

Application of Eq. (2.110) gives

y[n] = x′1[n] ∗ x′2[n] = (1/Nc) ∑_{k=0}^{Nc−1} X′1[k] X′2[k] e^{j2πnk/Nc}
     = (1/4) ∑_{k=0}^{3} X′1[k] X′2[k] e^{jkπn/2}.

Evaluating the summation for n = 0, 1, 2, and 3 leads to

y[n] = x′1[n] ∗ x′2[n] = {4, 13, 22, 15},

which is identical to the answer obtained earlier in part (a).
For simple signals like those in this example, the DFT method
involves many more steps than does the straightforward con-
volution method of part (a), but for the type of signals used in
practice, the DFT method is computationally superior.
◮ To summarize, the linear convolution y[n] = h[n] ∗ x[n]
defined in Eq. (2.71) computes the response (output) y[n]
for the input x[n] for an LTI system with impulse response
h[n]. The cyclic convolution yc[n] = h[n] ⊛ x[n] defined in
Eq. (2.104) is what the DFT maps to products, so a cyclic
convolution can be computed very quickly using the FFT
algorithm to compute the DFTs H[k] of h[n] and X[k] of
x[n], and the inverse DFT of H[k] X[k]. Fortunately, a linear
convolution can be zero-padded to a cyclic convolution, as
presented in Subsection 2-7.3, so linear convolutions can
also be computed quickly using the FFT algorithm. Cyclic
convolutions will also be used in Chapter 7 for computing
wavelet transforms, because the cyclic convolution of a
signal x[n] of duration N with an impulse response h[n] (that
has been zero-padded to length N) gives an output yc[n]
of the same length as that of the input x[n]. For wavelets,
the cyclic convolution approach is superior to the linear
convolution approach because the latter results in an output
longer than the input. ◭

Exercise 2-13: Compute the 4-point DFT of {4, 3, 2, 1}.
Answer: {10, (2 − j2), 2, (2 + j2)}. (See IP)

2-8 Fast Fourier Transform (FFT)

◮ The fast Fourier transform (FFT) is a computational
algorithm used to compute the discrete Fourier transforms
(DFT) of discrete signals. Strictly speaking, the FFT is
not a transform, but rather an algorithm for computing the
transform. ◭

As was mentioned earlier, the fast Fourier transform (FFT) is
a highly efficient algorithm for computing the DFT of discrete-
time signals. An N-point DFT performs a linear transformation
from an N-long discrete-time vector, namely x[n], into an N-
long frequency-domain vector X[k] for k = 0, 1, ..., N − 1.
Computation of each X[k] involves N complex multiplications,
so the total number of multiplications required to perform the
DFT for all X[k] is N². This is in addition to N(N − 1) complex
additions. For N = 512, for example, direct implementation of
the DFT operation requires 262,144 multiplications and 261,632
complex additions. For small N, these numbers are smaller,
since multiplication by any of {1, −1, j, −j} does not count as
a true multiplication.
Contrast these large numbers of multiplications and additions
(MADs) with the numbers required using the FFT algorithm:
for N large, the number of complex multiplications is reduced
from N² to approximately (N/2) log₂ N, which is only 2304
complex multiplications for N = 512. For complex additions,
the number is reduced from N(N − 1) to N log₂ N, or 4608
for N = 512. These reductions, thanks to the efficiency of the
FFT algorithm, are on the order of 100 for multiplications and
on the order of 50 for additions. The reduction ratios become
increasingly more impressive at larger values of N (Table 2-10).
The computational efficiency of the FFT algorithm relies on a
"divide and conquer" concept. An N-point DFT is decomposed
(divided) into two (N/2)-point DFTs. Each of the (N/2)-point
DFTs is decomposed further into two (N/4)-point DFTs. The
decomposition process, which is continued until it reaches the
2-point DFT level, is illustrated in the next subsections.

2-8.1 2-Point DFT

For notational efficiency, we introduce the symbols

WN = e^{−j2π/N},   (2.111a)
WN^{nk} = e^{−j2πnk/N},   (2.111b)

and

WN^{−nk} = e^{j2πnk/N}.   (2.111c)

Using this shorthand notation, the summations for the DFT, and
its inverse given by Eq. (2.89), assume the form

X[k] = ∑_{n=0}^{N−1} x[n] WN^{nk},   k = 0, 1, ..., N − 1,   (2.112a)

and

x[n] = (1/N) ∑_{k=0}^{N−1} X[k] WN^{−nk},   n = 0, 1, ..., N − 1.   (2.112b)

In this form, the N-long vector X[k] is given in terms of the
N-long vector x[n], and vice versa, with WN^{nk} and WN^{−nk}
acting as weighting coefficients.
For a 2-point DFT,

N = 2,
W2^{0k} = e^{−j0} = 1,
W2^{1k} = e^{−jkπ} = (−1)^k.
2-8 FAST FOURIER TRANSFORM (FFT) 77

Table 2-10 Comparison of number of complex computations required by a standard DFT and an FFT, using the formulas in the bottom row.

   N        Multiplications               Additions
         Standard DFT      FFT        Standard DFT      FFT
   2             4           1              2             2
   4            16           4             12             8
   8            64          12             56            24
  16           256          32            240            64
 ...           ...         ...            ...           ...
 512       262,144       2,304        261,632         4,608
1,024    1,048,576       5,120      1,047,552        10,240
2,048    4,194,304      11,264      4,192,256        22,528
   N            N²    (N/2) log2 N    N(N − 1)      N log2 N
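The N² multiplication count in Table 2-10 corresponds to evaluating Eq. (2.112a) directly. As an illustrative sketch (plain Python here, rather than the MATLAB used elsewhere in this book), the following direct implementation reproduces the result of Exercise 2-13:

```python
import cmath

def dft(x):
    """Direct DFT of Eq. (2.112a): X[k] = sum over n of x[n] * W_N^(nk).

    Each of the N outputs needs N complex multiplications, N^2 in total,
    which is the 'Standard DFT' column of Table 2-10.
    """
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)  # W_N = e^(-j*2*pi/N)
    return [sum(x[n] * W ** (n * k) for n in range(N)) for k in range(N)]

# Exercise 2-13: the 4-point DFT of {4, 3, 2, 1} is {10, 2 - j2, 2, 2 + j2}
X = dft([4, 3, 2, 1])
```

The quadratic cost is visible in the nested loop structure: the outer comprehension runs over k and the inner sum over n.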

Hence, Eq. (2.112a) yields the following expressions for X[0] and X[1]:

X[0] = x[0] + x[1]    (2.113a)
and
X[1] = x[0] − x[1],   (2.113b)

which can be combined into the compact form

X[k] = x[0] + (−1)^k x[1],   k = 0, 1.   (2.114)

The equations for X[0] and X[1] can be represented by the signal flow graph shown in Fig. 2-16, which is often called a butterfly diagram.

Figure 2-16 Signal flow graph for a 2-point DFT.

2-8.2 4-Point DFT

For a 4-point DFT, N = 4 and

W_N^(nk) = W_4^(nk) = e^(−jnkπ/2) = (−j)^(nk).   (2.115)

From Eq. (2.112a), we have

X[k] = Σ_{n=0}^{3} x[n] W_4^(nk)
     = x[0] + x[1] W_4^(1k) + x[2] W_4^(2k) + x[3] W_4^(3k),   (2.116)
       k = 0, 1, 2, 3.

Upon evaluating W_4^(1k), W_4^(2k), and W_4^(3k) and the relationships between them, Eq. (2.116) can be cast in the form

X[k] = [x[0] + (−1)^k x[2]] + W_4^(1k) [x[1] + (−1)^k x[3]],   (2.117)

which consists of two 2-point DFTs: one that includes values of x[n] for even values of n, and another for odd values of n. At this point, it is convenient to define xe[n] and xo[n] as x[n] at even and odd times:

xe[n] = x[2n],       n = 0, 1,   (2.118a)
xo[n] = x[2n + 1],   n = 0, 1.   (2.118b)
78 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS

Thus, xe[0] = x[0] and xe[1] = x[2] and, similarly, xo[0] = x[1] and xo[1] = x[3]. When expressed in terms of xe[n] and xo[n], Eq. (2.117) becomes

X[k] = [xe[0] + (−1)^k xe[1]] + W_4^(1k) [xo[0] + (−1)^k xo[1]],   (2.119)
       k = 0, 1, 2, 3,

in which the first bracketed term is the 2-point DFT of xe[n] and the second bracketed term is the 2-point DFT of xo[n].

The FFT computes the 4-point DFT by computing the two 2-point DFTs, followed by a recomposition step that involves multiplying the odd 2-point DFT by W_4^(1k) and then adding it to the even 2-point DFT. The entire process is depicted by the signal flow graph shown in Fig. 2-17. In the graph, Fourier coefficients Xe[0] and Xe[1] represent the outputs of the even 2-point DFT, and similarly, Xo[0] and Xo[1] represent the outputs of the odd 2-point DFT.

Figure 2-17 Signal flow graph for a 4-point DFT. Weighting coefficient W_4^1 = −j. Note that summations occur only at red intersection points.

2-8.3 16-Point DFT

We now show how to compute a 16-point DFT using two 8-point DFTs and 8 multiplications and additions (MADs). This divides the 16-point DFT into two 8-point DFTs, which in turn can be divided into four 4-point DFTs, which are just additions and subtractions. This conquers the 16-point DFT by dividing it into 4-point DFTs and additional MADs.

A. Dividing a 16-Point DFT

We now show that the 16-point DFT can be computed for even values of k using an 8-point DFT of (x[n] + x[n + 8]) and for odd values of k using an 8-point DFT of the modulated signal (x[n] − x[n + 8]) e^(−j2πn/16). Thus, the 16-point DFT can be computed as an 8-point DFT (for even values of k) and as a modulated 8-point DFT (for odd values of k).

B. Computation at Even Indices

We consider even and odd indices k separately. For even values of k, we can write k = 2k′ and split the 16-point DFT summation into two summations:

X[2k′] = Σ_{n=0}^{15} x[n] e^(−j2π(2k′/16)n)
       = Σ_{n=0}^{7} x[n] e^(−j2π(2k′/16)n) + Σ_{n=8}^{15} x[n] e^(−j2π(2k′/16)n).   (2.120)

Changing variables from n to n′ = n − 8 in the second summation, and recognizing 2k′/16 = k′/8, gives

X[2k′] = Σ_{n=0}^{7} x[n] e^(−j2π(k′/8)n) + Σ_{n′=0}^{7} x[n′ + 8] e^(−j2π(k′/8)(n′+8))
       = Σ_{n=0}^{7} (x[n] + x[n + 8]) e^(−j2π(k′/8)n)
       = DFT({x[n] + x[n + 8], n = 0, . . . , 7}).   (2.121)

So for even values of k, the 16-point DFT of x[n] is the 8-point DFT of {x[n] + x[n + 8], n = 0, . . . , 7}.
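The even-index identity of Eq. (2.121) is easy to verify numerically. The sketch below (plain Python rather than the book's MATLAB; the 16-point test signal is arbitrary) compares the even-indexed entries of a direct 16-point DFT against the 8-point DFT of {x[n] + x[n + 8]}:

```python
import cmath

def dft(x):
    # Direct DFT: X[k] = sum over n of x[n] * e^(-j*2*pi*n*k/N)
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

x = [7, 1, 4, 2, 8, 5, 3, 6, 0, 9, 2, 4, 6, 1, 8, 3]  # arbitrary 16-point signal
X16 = dft(x)                                          # 16-point DFT
X8 = dft([x[n] + x[n + 8] for n in range(8)])         # 8-point DFT of x[n] + x[n+8]

# Eq. (2.121): X[2k'] equals the 8-point DFT, term by term
match = all(abs(X16[2 * k] - X8[k]) < 1e-9 for k in range(8))
```

The same kind of check applies to the odd-index case once the twiddle factors are introduced.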

C. Computation at Odd Indices

For odd values of k, we can write k = 2k′ + 1 and split the 16-point DFT summation into two summations:

X[2k′ + 1] = Σ_{n=0}^{15} x[n] e^(−j2π((2k′+1)/16)n)
           = Σ_{n=0}^{7} x[n] e^(−j2π((2k′+1)/16)n) + Σ_{n=8}^{15} x[n] e^(−j2π((2k′+1)/16)n).   (2.122)

Changing variables from n to n′ = n − 8 in the second summation, and recognizing that e^(−j2π(8/16)) = −1 and

(2k′ + 1)/16 = k′/8 + 1/16,

gives

X[2k′ + 1] = Σ_{n=0}^{7} (x[n] e^(−j2π(1/16)n)) e^(−j2π(k′/8)n)
           + Σ_{n′=0}^{7} (x[n′ + 8] e^(−j2π(1/16)(n′+8))) e^(−j2π(k′/8)(n′+8))
           = Σ_{n=0}^{7} e^(−j2π(1/16)n) (x[n] − x[n + 8]) e^(−j2π(k′/8)n)
           = DFT({e^(−j2π(1/16)n) (x[n] − x[n + 8]), n = 0, . . . , 7}).   (2.123)

So for odd values of k, the 16-point DFT of x[n] is the 8-point DFT of {e^(−j(2π/16)n) (x[n] − x[n + 8]), n = 0, . . . , 7}. The signal {x[n] − x[n + 8], n = 0, . . . , 7} has been modulated through multiplication by e^(−j(2π/16)n). The multiplications by e^(−j(2π/16)n) are known as twiddle multiplications (mults) by the twiddle factors e^(−j(2π/16)n).

Example 2-8: Dividing an 8-Point DFT into Two 4-Point DFTs

Divide the 8-point DFT of {7, 1, 4, 2, 8, 5, 3, 6} into two 4-point DFTs and twiddle mults.

Solution:

• For even values of index k, we have
X[0, 2, 4, 6] = DFT({7 + 8, 1 + 5, 4 + 3, 2 + 6}) = {36, 8 + j2, 8, 8 − j2}.

• For odd index values, we need twiddle mults. The twiddle factors are given by {e^(−j2πn/8)} for n = 0, 1, 2, and 3, which reduce to {1, (√2/2)(1 − j), −j, (√2/2)(−1 − j)}.

• Implementing the twiddle mults gives
{7 − 8, 1 − 5, 4 − 3, 2 − 6} × {1, (√2/2)(1 − j), −j, (√2/2)(−1 − j)}
= {−1, 2√2(−1 + j), −j, 2√2(1 + j)}.

• For odd values of index k, we have
X[1, 3, 5, 7] = DFT({−1, 2√2(−1 + j), −j, 2√2(1 + j)})
= {−1 + j4.66, −1 + j6.66, −1 − j6.66, −1 − j4.66}.

• Combining these results for even and odd k gives
DFT({7, 1, 4, 2, 8, 5, 3, 6})
= {36, −1 + j4.7, 8 + j2, −1 + j6.7, 8, −1 − j6.7, 8 − j2, −1 − j4.7}.

Note the conjugate symmetry of the result: X[7] = X∗[1], X[6] = X∗[2], and X[5] = X∗[3].

• This result agrees with direct MATLAB computation using fft([7 1 4 2 8 5 3 6]).

2-8.4 Dividing Up a 2N-Point DFT

We now generalize the procedure to a 2N-point DFT by dividing it into two N-point DFTs and N twiddle mults.

(1) For even indices k = 2k′ we have:

X[2k′] = Σ_{n=0}^{N−1} (x[n] + x[n + N]) e^(−j2π(k′/N)n)
       = DFT{x[n] + x[n + N], n = 0, 1, . . . , N − 1}.   (2.124)

(2) For odd indices k = 2k′ + 1 we have:

X[2k′ + 1] = Σ_{n=0}^{N−1} e^(−j2π(1/(2N))n) (x[n] − x[n + N]) e^(−j2π(k′/N)n)
           = DFT{e^(−j2π(1/(2N))n) (x[n] − x[n + N])}.   (2.125)

Thus, a 2N-point DFT can be divided into

• two N-point DFTs,
• N multiplications by twiddle factors e^(−j2π(1/(2N))n), and
• 2N additions and subtractions.

2-8.5 Dividing and Conquering

Now suppose N is a power of two; e.g., N = 1024 = 2^10. In that case, we can apply the algorithm of the previous subsection recursively to divide an N-point DFT into two N/2-point DFTs, then into four N/4-point DFTs, then into eight N/8-point DFTs, and so on until we reach the following 4-point DFTs:

X[0] = x[0] + x[1] + x[2] + x[3],
X[1] = x[0] − jx[1] − x[2] + jx[3],
X[2] = x[0] − x[1] + x[2] − x[3],                  (2.126)
X[3] = x[0] + jx[1] − x[2] − jx[3].

At each stage, half of the DFTs are modulated, requiring N/2 multiplications. So if N is a power of 2, then an N-point DFT computed using the FFT will require approximately (N/2) log2(N) multiplications and N log2(N) additions. These can be reduced slightly by recognizing that some multiplications are simply multiplications by ±1 and ±j.

To illustrate the computational significance of the FFT, suppose we wish to compute a 32768-point DFT. Direct computation using Eq. (2.89a) would require (32768)² ≈ 1.1 × 10^9 MADs. In contrast, computation using the FFT would require less than (32768/2) log2(32768) ≈ 250,000 MADs, representing a computational saving of a factor of 4000!

2-9 Deconvolution Using the DFT

Recall that the objective of deconvolution is to reconstruct the input x[n] of a system from measurements of its output y[n] and knowledge of its impulse response h[n]. That is, we seek to solve y[n] = h[n] ∗ x[n] for x[n], given y[n] and h[n].

2-9.1 Deconvolution Procedure

If x[n] has duration M and h[n] has duration L, then y[n] has duration N = L + M − 1. Let us define the zero-padded functions

h̃[n] = {h[n], 0, . . . , 0}   (N − L zeros),   (2.127a)
x̃[n] = {x[n], 0, . . . , 0}   (N − M zeros).   (2.127b)

(1) With x̃[n], h̃[n], and y[n] all now of duration N, we can obtain their respective N-point DFTs, X̃[k], H̃[k], and Y[k], which are interrelated by

Y[k] = H̃[k] X̃[k].   (2.128)

Upon dividing by H̃[k] and taking an N-point inverse DFT, we have

x̃[n] = DFT⁻¹{X̃[k]} = DFT⁻¹{Y[k]/H̃[k]} = DFT⁻¹{DFT{y[n]}/DFT{h̃[n]}}.   (2.129)

(2) Discarding the (N − M) final zeros in x̃[n] gives x[n]. The zero-padding and unpadding processes allow us to perform the deconvolution problem for any system, provided H̃[k] is nonzero for all 0 ≤ k ≤ N − 1.

2-9.2 FFT Implementation Issues

(a) To use the FFT algorithm (Section 2-8) to compute the three DFTs, N should be rounded up to the next power of 2, because the FFT can then be computed more rapidly.

(b) In some cases, some of the values of H̃[k] may be zero, which is problematic because the computation of Y[k]/H̃[k] would involve dividing by zero. A possible solution to the division-by-zero problem is to change the value of N. Suppose H̃[k] = 0 for some value of index k, such as k = 3. This corresponds to H(Ω) having a zero at 2πk/N for k = 3, because, by definition, the DFT is the DTFT sampled at Ω = 2πk/N for integers k. Changing N to, say, N + 1 (or some other suitable integer) means that the DFT is now the DTFT H(Ω) sampled at Ω = 2πk/(N + 1), so the zero at k = 3 when the order was N may now get missed with the sampling at the new order N + 1. Changing N to N + 1 may
2-9 DECONVOLUTION USING THE DFT 81

avoid one or more zeros in H̃[k], but it may also introduce new ones. It may be necessary to try multiple values of N to satisfy the condition that H̃[k] ≠ 0 for all k.

Example 2-9: DFT Deconvolution

In response to an input x[n], an LTI system with an impulse response h[n] = {1, 2, 3} generated an output

y[n] = {6, 19, 32, 21}.

Determine x[n], given that it is of finite duration.

Solution: The output is of duration N = 4, so we should zero-pad h[n] to the same duration by defining

h̃[n] = {1, 2, 3, 0}.   (2.130)

From Eq. (2.89a), the 4-point DFT of h̃[n] is

H̃[k] = Σ_{n=0}^{3} h̃[n] e^(−j2πkn/4),   k = 0, 1, 2, 3,   (2.131)

which yields

H̃[0] = 1(1) + 2(1) + 3(1) + 0(1) = 6,
H̃[1] = 1(1) + 2(−j) + 3(−1) + 0(j) = −2 − j2,
H̃[2] = 1(1) + 2(−1) + 3(1) + 0(−1) = 2,
and
H̃[3] = 1(1) + 2(j) + 3(−1) + 0(−j) = −2 + j2.

Similarly, the 4-point DFT of y[n] = {6, 19, 32, 21} is

Y[k] = Σ_{n=0}^{3} y[n] e^(−j2πkn/4),   k = 0, 1, 2, 3,

which yields

Y[0] = 6(1) + 19(1) + 32(1) + 21(1) = 78,
Y[1] = 6(1) + 19(−j) + 32(−1) + 21(j) = −26 + j2,
Y[2] = 6(1) + 19(−1) + 32(1) + 21(−1) = −2,
and
Y[3] = 6(1) + 19(j) + 32(−1) + 21(−j) = −26 − j2.

The 4-point DFT of x̃[n] is, therefore,

X̃[0] = Y[0]/H̃[0] = 78/6 = 13,
X̃[1] = Y[1]/H̃[1] = (−26 + j2)/(−2 − j2) = 6 − j7,
X̃[2] = Y[2]/H̃[2] = −2/2 = −1,
and
X̃[3] = Y[3]/H̃[3] = (−26 − j2)/(−2 + j2) = 6 + j7.

By Eq. (2.89b), the inverse DFT of X̃[k] is

x̃[n] = (1/4) Σ_{k=0}^{3} X̃[k] e^(j2πkn/4),   n = 0, 1, 2, 3,   (2.132)

which yields

x̃[n] = {6, 7, 0, 0}.

Given that y[n] is of duration N = 4 and h[n] is of duration L = 3, it follows that x[n] must be of duration M = N − L + 1 = 4 − 3 + 1 = 2, if its duration is finite. Deletion of the zero-pads from x̃[n] leads to

x[n] = {6, 7},   (2.133)

whose duration is indeed 2.

Example 2-10: Removal of Periodic Interference

We are given the signal of two actual trumpets playing simultaneously notes A and B. The goal is to use the DFT to eliminate the trumpet playing note A, while preserving the trumpet playing note B. We only need to know that note B is at a higher frequency than note A.

Solution: The two-trumpets signal time-waveform is shown in Fig. 2-18(a), and the corresponding spectrum is shown in Fig. 2-18(b). We note that the spectral lines occur in pairs of harmonics, with the lower harmonic of each pair associated with

note A and the higher harmonic of each pair associated with note B. Since we wish to eliminate note A, we set the lower component of each pair of spectral lines to zero. The modified spectrum is shown in Fig. 2-18(c). The inverse DFT of this spectrum, followed by reconstruction to continuous time, is shown in Fig. 2-18(d).

The filtering process eliminated the signal due to the trumpet playing note A, while preserving the signal due to note B, almost completely. This can be confirmed by listening to the signals before and after filtering.

Whereas it is easy to distinguish between the harmonics of note A and those of note B at lower frequencies, this is not the case at higher frequencies, particularly when they overlap. Hence, neither note can be eliminated without affecting the other slightly. Fortunately, the overlapping high-frequency harmonics contain very little power compared with the non-overlapping, low-frequency harmonics, and therefore, their role is quite insignificant.

Figure 2-18 Removing the spectrum of note A: (a) waveform x(t) of the two-trumpet signal; (b) spectrum X[k] of the two-trumpet signal, with harmonic pairs of note A (440 Hz) and note B (491 Hz); (c) spectrum X[k] of the filtered two-trumpet signal; (d) waveform xf(t) of the filtered two-trumpet signal.

2-10 Computation of Continuous-Time Fourier Transform (CTFT) Using the DFT

Let us consider a continuous-time signal x(t) with support [−T/2, T/2], which means that

x(t) = 0 for |t| > T/2.

Also, let us assume (for the time being) that the signal spectrum X(f) is bandwidth-limited to a maximum frequency F/2 in Hz, which means that

X(f) = 0 for |f| > F/2.

Our goal is to compute samples {X(k∆f)} of X(f) at a frequency spacing ∆f from signal samples {x(n∆t)} of x(t) recorded at time interval ∆t, and to do so using the DFT.

2-10.1 Derivation Using Sampling Theorem Twice

Whereas it is impossible for a signal to be simultaneously bandlimited and time-limited, as presumed earlier, many real-world signals are approximately band- and time-limited.

According to the sampling theorem (Section 2-4), x(t) can be reconstructed from its samples {x(n∆t)} if the sampling rate St = 1/∆t > 2(F/2) = F. Applying the sampling theorem with t and f exchanged, X(f) can be reconstructed from its samples
2-10 COMPUTATION OF CONTINUOUS-TIME FOURIER TRANSFORM (CTFT) USING THE DFT 83

{X(k∆f)} if its sampling rate Sf = 1/∆f > 2(T/2) = T.

In the sequel, we use the minimum sampling intervals ∆t = 1/F and ∆f = 1/T. Finer discretization can be achieved by simply increasing F and/or T. In practice, F and/or T is (are) increased slightly so that N = FT is an odd integer, which makes the factor M = (N − 1)/2 also an integer (but not necessarily an odd integer). The factor M is related to the order of the DFT, which has to be an integer.

The Fourier transform of the synthetic sampled signal xs(t), defined in Eq. (2.43) and repeated here as

xs(t) = Σ_{n=−∞}^{∞} x(n∆t) δ(t − n∆t),   (2.134)

was computed in Eq. (2.53), and also repeated here as

Xs(f) = Σ_{n=−∞}^{∞} x(n∆t) e^(−j2πfn∆t).   (2.135)

Setting f = k∆f gives

Xs(k∆f) = Σ_{n=−∞}^{∞} x(n∆t) e^(−j2π(k∆f)(n∆t)).   (2.136)

Noting that x(t) = 0 for |t| > T/2 and X(f) = 0 for |f| > F/2, we restrict the ranges of n and k to

|n| ≤ (T/2)/∆t = FT/2 = N/2   (2.137a)
and
|k| ≤ (F/2)/∆f = FT/2 = N/2.   (2.137b)

Next, we introduce the factor M defined as

M = (N − 1)/2,   (2.138)

and we note that if N is an odd integer, M is guaranteed to be an integer. In view of Eq. (2.137), the ranges of n and k become n, k = −M, . . . , M. Upon substituting

∆t ∆f = (1/F)(1/T) = 1/(FT) = 1/N = 1/(2M + 1)   (2.139)

in the exponent of Eq. (2.136), the expression becomes

Xs(k∆f) = Σ_{n=−M}^{M} x(n∆t) e^(−j2π(k∆f)(n∆t)),   |k| ≤ M
        = Σ_{n=−M}^{M} x(n∆t) e^(−j2πnk/(2M+1)),    |k| ≤ M.   (2.140)

This expression looks like a DFT of order 2M + 1. Recall from the statement in connection with Eq. (2.47) that the spectrum Xs(f) of the sampled signal includes the spectrum X(f) of the continuous-time signal (multiplied by the sampling rate St), plus additional copies repeated every ±St along the frequency axis. With St = 1/∆t = F in the present case,

Xs(f) = F X(f),   for |f| < F/2,   (2.141)

from which we deduce that

X(k∆f) = Xs(k∆f) ∆t = Σ_{n=−M}^{M} x(n∆t) e^(−j2πnk/(2M+1)) ∆t,   |k| ≤ M.   (2.142)

Ironically, this is the same result that would be obtained by simply discretizing the definition of the continuous-time Fourier transform! But this derivation shows that discretization gives the exact result if x(t) is time- and bandlimited.

Example 2-11: Computing CTFT by DFT

Use the DFT to compute the Fourier transform of the continuous Gaussian signal

x(t) = (1/√(2π)) e^(−t²/2).

Solution: Our first task is to assign realistic values for the signal duration T and the width of its spectrum F. It is an "educated" trial-and-error process. At t = 4, x(t) = 0.00013, so we will assume that x(t) ≈ 0 for |t| > 4. Since x(t) is symmetrical with respect to the vertical axis, we assign

T = 2 × 4 = 8 s.

The Fourier transform of x(t) is X(f) = e^(−2π²f²). By trial and error, we determine that F = 1.2 Hz is sufficient to characterize X(f). The combination gives

N = TF = 8 × 1.2 = 9.6.

Figure 2-19 Comparison of exact (blue circles) and DFT-computed (red crosses) values of the continuous-time Fourier transform of a Gaussian signal.

To increase the value of N to an odd integer, we increase F to 1.375 Hz, which results in N = 11 and M = 5. In Fig. 2-19, computed values of the discretized spectrum of x(t) are compared with exact values based on evaluating the analytical expression X(f) = e^(−2π²f²). The comparison provides an excellent demonstration of the power of the sampling theorem; representing x(t) by only 11 equally spaced samples is sufficient to capture its information content and generate its Fourier transform with high fidelity.

2-10.2 Practical Computation of X(f) Using the DFT

Example 2-11 shows that simple discretization of the continuous-time Fourier transform works very well, provided that the discretization lengths in time and frequency are chosen properly. The integer N was odd for clarity of derivation. In practice, we would increase T and F so that N = TF is a power of two, permitting the fast Fourier transform (FFT) algorithm to be used to compute the DFT quickly.

Concept Question 2-12: Why do we need a discrete Fourier transform (DFT)?

Concept Question 2-13: The DFT is often used to compute the CTFT numerically. Why does this often work as well as it does?
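Example 2-11 can be reproduced by coding Eq. (2.142) directly. The sketch below uses plain Python (the book itself uses MATLAB), with the Example 2-11 values T = 8 s and F = 1.375 Hz, so that N = 11 and M = 5; the computed samples track the exact spectrum X(f) = e^(−2π²f²) closely:

```python
import cmath
import math

T, F = 8.0, 1.375            # duration and bandwidth chosen in Example 2-11
N = round(T * F)             # N = 11 (odd), so M = 5
M = (N - 1) // 2
dt, df = 1.0 / F, 1.0 / T    # minimum sampling intervals

def x(t):                    # Gaussian signal of Example 2-11
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

# Eq. (2.142): X(k*df) = dt * sum_{n=-M..M} x(n*dt) e^(-j*2*pi*n*k/(2M+1))
X = [dt * sum(x(n * dt) * cmath.exp(-2j * cmath.pi * n * k / (2 * M + 1))
              for n in range(-M, M + 1))
     for k in range(-M, M + 1)]

# Worst-case deviation from the exact CTFT, X(f) = e^(-2*pi^2*f^2)
err = max(abs(X[k + M] - math.exp(-2.0 * math.pi**2 * (k * df)**2))
          for k in range(-M, M + 1))
```

The small residual error comes entirely from the Gaussian not being exactly time- and bandlimited, matching the close agreement seen in Fig. 2-19.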

Summary

Concepts

• Many 2-D concepts can be understood more easily by reviewing their 1-D counterparts. These include: LTI systems, convolution, sampling, and continuous-time, discrete-time, and discrete Fourier transforms.

• The DTFT is periodic with period 2π.

• Continuous-time signals can be sampled to discrete-time signals, on which discrete-time signal processing can be performed.

• The response of an LTI system with impulse response h(t) to input x(t) is output y(t) = h(t) ∗ x(t), and similarly in discrete time.

• The response of an LTI system with impulse response h(t) to input A cos(2πf0t + θ) is A|H(f0)| cos(2πf0t + θ + ∠H(f0)), where H(f) is the Fourier transform of h(t), and similarly in discrete time.

Mathematical Formulae

Impulse
δ(t) = lim_{ε→0} (1/(2ε)) rect(t/(2ε))

Energy of x(t)
E = ∫_{−∞}^{∞} |x(t)|² dt

Convolution (continuous time)
y(t) = h(t) ∗ x(t) = ∫_{−∞}^{∞} h(τ) x(t − τ) dτ

Convolution (discrete time)
y[n] = h[n] ∗ x[n] = Σ_{i=−∞}^{∞} h[i] x[n − i]

Fourier transform
X(f) = ∫_{−∞}^{∞} x(t) e^(−j2πft) dt

Inverse Fourier transform
x(t) = ∫_{−∞}^{∞} X(f) e^(j2πft) df

Sinc function
sinc(x) = sin(πx)/(πx)

Ideal lowpass filter impulse response
h(t) = 2fc sinc(2fc t)

Sampling theorem
Sampling rate S = 1/∆ > 2B if X(f) = 0 for |f| > B

Sinc interpolation formula
x(t) = Σ_{n=−∞}^{∞} x(n∆) sinc(S(t − n∆))

Discrete-time Fourier transform (DTFT)
X(Ω) = Σ_{n=−∞}^{∞} x[n] e^(−jΩn)

Inverse DTFT
x[n] = (1/(2π)) ∫_{−π}^{π} X(Ω) e^(jΩn) dΩ

Discrete-time sinc
h[n] = (Ω0/π) sinc(Ω0n/π)

Discrete sinc
X(Ω) = sin((2N + 1)Ω/2) / sin(Ω/2)

Discrete Fourier transform (DFT)
X[k] = Σ_{n=0}^{N−1} x[n] e^(−j2πnk/N)

Inverse DFT
x[n] = (1/N) Σ_{k=0}^{N−1} X[k] e^(j2πnk/N)

Cyclic convolution
yc[n] = x1[n] ⊛ x2[n] = Σ_{n1=0}^{N−1} x1[n1] x2[((n − n1))_N]
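The cyclic convolution formula can be turned into code directly (a minimal Python sketch, shown only for illustration). When both sequences are zero-padded to the full length N = L + M − 1, the cyclic convolution coincides with the linear convolution:

```python
def cconv(x1, x2):
    """Cyclic convolution: yc[n] = sum over n1 of x1[n1] * x2[(n - n1) mod N]."""
    N = len(x1)
    assert len(x2) == N, "both sequences must have the same length N"
    return [sum(x1[n1] * x2[(n - n1) % N] for n1 in range(N)) for n in range(N)]

# {1, 2, 3} and {7, 9} zero-padded to N = 3 + 2 - 1 = 4:
# the cyclic convolution then equals the linear convolution {7, 23, 39, 27}
yc = cconv([1, 2, 3, 0], [7, 9, 0, 0])
```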


Important Terms Provide definitions or explain the meaning of the following terms:
aliasing, convolution, cyclic convolution, deconvolution, DFT, DTFT, FFT, Fourier transform, frequency response function, impulse response, linear time-invariant (LTI), Parseval's theorem, Rayleigh's theorem, sampled signal, sampling theorem, sinc function, spectrum, zero padding

PROBLEMS

Section 2-2: Review of Continuous-Time Systems

2.1 Compute the following convolutions:
(a) e^(−t) u(t) ∗ e^(−2t) u(t)
(b) e^(−2t) u(t) ∗ e^(−3t) u(t)
(c) e^(−3t) u(t) ∗ e^(−3t) u(t)

Section 2-3: 1-D Fourier Transforms

2.2 Show that the spectrum of
[sin(20πt)/(πt)] [sin(10πt)/(πt)]
is zero for |f| > 15 Hz.

2.3 Using only Fourier transform properties, show that
[sin(10πt)/(πt)] [1 + 2 cos(20πt)] = sin(30πt)/(πt).

2.4 If x(t) = sin(2t)/(πt), compute the energy of d²x/dt².

2.5 Compute the energy of e^(−t) u(t) ∗ sin(t)/(πt).

2.6 Show that
∫_{−∞}^{∞} sin²(at)/(πt)² dt = a/π
if a > 0.

Section 2-4: The Sampling Theorem

2.7 The spectrum of the trumpet signal for note G (784 Hz) is negligible above its ninth harmonic. What is the Nyquist sampling rate required for reconstructing the trumpet signal from its samples?

2.8 Compute a Nyquist sampling rate for reconstructing signal
x(t) = sin(40πt) sin(60πt)/(π²t²)
from its samples.

2.9 Signal
x(t) = [1 + 2 cos(4πt)] sin(2πt)/(πt)
is sampled every 1/6 second. What is the spectrum of the sampled signal?

2.10 Signal x(t) = cos(14πt) − cos(18πt) is sampled at 16 sample/s. The result is passed through an ideal brick-wall lowpass filter with a cutoff frequency of 8 Hz. What is the spectrum of the output signal?

2.11 Signal x(t) = sin(30πt) + sin(70πt) is sampled at 50 sample/s. The result is passed through an ideal brick-wall lowpass filter with a cutoff frequency of 25 Hz. What is the spectrum of the output signal?

Section 2-5: Review of Discrete-Time Signals and Systems

2.12 Compute the following convolutions:
(a) {1, 2} ∗ {3, 4, 5}
(b) {1, 2, 3} ∗ {4, 5, 6}
(c) {2, 1, 4} ∗ {3, 6, 5}

2.13 If {1, 2, 3} ∗ x[n] = {5, 16, 34, 32, 21}, compute x[n].

2.14 Given the two systems connected in series as
x[n] → h1[n] → w[n],  where w[n] = 3x[n] − 2x[n − 1],

and
w[n] → h2[n] → y[n],  where y[n] = 5w[n] − 4w[n − 1],
compute the overall impulse response.

2.15 The two systems
y[n] = 3x[n] − 2x[n − 1]
and
y[n] = 5x[n] − 4x[n − 1]
are connected in parallel. Compute the overall impulse response.

Section 2-6: Discrete-Time Frequency Response

2.16 Given the system y[n] = x[n] + 0.5x[n − 1] + x[n − 2] with input x[n] = cos((π/2)n):
(a) Compute the frequency response H(Ω).
(b) Compute the output y[n].

2.17 Given the system y[n] = 8x[n] + 3x[n − 1] + 4x[n − 2] with input x[n] = cos((π/2)n):
(a) Compute the frequency response H(Ω).
(b) Compute the output y[n].

2.18 If input x[n] = cos((π/2)n) + cos(πn) is applied to the system y[n] = x[n] + x[n − 1] + x[n − 2] + x[n − 3]:
(a) Compute the frequency response H(Ω).
(b) Compute the output y[n].

2.19 If input x[n] = 1 + 2 cos((π/2)n) + 3 cos(πn) is applied to the system y[n] = x[n] + 4x[n − 1] + 3x[n − 3]:
(a) Compute the frequency response H(Ω).
(b) Compute the output y[n].

Section 2-6: Discrete-Time Fourier Transform (DTFT)

2.20 Compute the DTFTs of the following signals (simplify answers to sums of sines and cosines):
(a) {1, 1, 1, 1, 1}
(b) {3, 2, 1}

2.21 Compute the inverse DTFT of
X(Ω) = [3 + 2 cos(Ω) + 4 cos(2Ω)] + j[6 sin(Ω) + 8 sin(2Ω)].

2.22 Compute the inverse DTFT of
X(Ω) = [7 + 5 cos(Ω) + 3 cos(2Ω)] + j[sin(Ω) + sin(2Ω)].

Section 2-7: Discrete Fourier Transform (DFT)

2.23 Compute the DFTs of each of the following signals:
(a) {12, 8, 4, 8}
(b) {16, 8, 12, 4}

2.24 Determine the DFT of a single period of each of the following signals:
(a) cos((π/4)n)
(b) (1/4) sin((3π/4)n)

2.25 Compute the inverse DFTs of the following:
(a) {0, 0, 3, 0, 4, 0, 3, 0}
(b) {0, 3 + j4, 0, 0, 0, 0, 0, 3 − j4}

Section 2-9: Deconvolution Using the DFT

2.26 Use DFTs to compute the convolution
{1, 3, 5} ∗ {7, 9}
by hand.

2.27 Solve each of the following deconvolution problems for input x[n]. Use MATLAB.
(a) x[n] ∗ {1, 2, 3} = {7, 15, 27, 13, 24, 27, 34, 15}.
(b) x[n] ∗ {1, 3, 5} = {3, 10, 22, 18, 28, 29, 52, 45}.
(c) x[n] ∗ {1, 4, 2, 6, 5, 3} = {2, 9, 11, 31, 48, 67, 76, 78, 69, 38, 12}.

2.28 Solve each of the following deconvolution problems for input x[n]. Use MATLAB.

(a) x[n] ∗ {3, 1, 4, 2} = {6, 23, 18, 57, 35, 37, 28, 6}.
(b) x[n] ∗ {1, 7, 3, 2} = {2, 20, 53, 60, 53, 54, 21, 10}.
(c) x[n] ∗ {2, 2, 3, 6} = {12, 30, 42, 71, 73, 43, 32, 45, 42}.

Section 2-10: Computation of CTFT Using the DFT

2.29 Use a 40-point DFT to compute the inverse Fourier transform of
X(f) = [sin(πf)/(πf)]².
Assume that X(f) ≈ 0 for |f| > 10 Hz and x(t) ≈ 0 for |t| > 1 s. Plot the actual and computed inverse Fourier transforms on the same plot to show the close agreement between them.

2.30 Use an 80-point DFT to compute the inverse Fourier transform of
X(f) H(f) = [sin(πf)/(πf)]² (1 + e^(−j2πf)).
Assume that X(f) ≈ 0 for |f| > 10 Hz and x(t) ≈ 0 for |t| > 2 s. Plot the actual and computed inverse Fourier transforms on the same plot to show the close agreement between them.
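Problems 2.27 and 2.28 call for MATLAB, but the DFT-based deconvolution procedure of Section 2-9.1 can equally be sketched in plain Python (illustration only; a direct O(N²) DFT stands in for fft/ifft). Applied to the data of Example 2-9, it recovers x[n] = {6, 7}:

```python
import cmath

def dft(x, inverse=False):
    # Direct DFT and inverse DFT of Eq. (2.89)
    N = len(x)
    s = 1 if inverse else -1
    X = [sum(x[n] * cmath.exp(s * 2j * cmath.pi * n * k / N) for n in range(N))
         for k in range(N)]
    return [v / N for v in X] if inverse else X

def deconvolve(y, h):
    # Section 2-9.1: zero-pad h[n] to length N, divide DFTs, inverse DFT, unpad
    N, L = len(y), len(h)
    H = dft(list(h) + [0] * (N - L))
    Y = dft(y)
    x_padded = dft([Yk / Hk for Yk, Hk in zip(Y, H)], inverse=True)
    M = N - L + 1                       # duration of the recovered input
    return [round(v.real, 6) for v in x_padded[:M]]

# Example 2-9: h[n] = {1, 2, 3}, y[n] = {6, 19, 32, 21}  ->  x[n] = {6, 7}
x = deconvolve([6, 19, 32, 21], [1, 2, 3])
```

As noted in Section 2-9.2, this fails if any H̃[k] is zero, in which case a different N should be tried.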
Chapter 3
2-D Images and Systems

[Chapter-opening figure: a 2-D spectrum displayed over the (Ω1, Ω2) plane, with both axes running from −4π to 4π.]

Contents
Overview, 90
3-1 Displaying Images, 90
3-2 2-D Continuous-Space Images, 91
3-3 Continuous-Space Systems, 93
3-4 2-D Continuous-Space Fourier Transform (CSFT), 94
3-5 2-D Sampling Theorem, 107
3-6 2-D Discrete Space, 113
3-7 2-D Discrete-Space Fourier Transform (DSFT), 118
3-8 2-D Discrete Fourier Transform (2-D DFT), 119
3-9 Computation of the 2-D DFT Using MATLAB, 126
Problems, 86

Objectives

Learn to:

■ Compute the output image from an LSI system to a given input image using convolution.

■ Compute the continuous-space Fourier transform of an image or point-spread function.

■ Use the 2-D sampling theorem to convert a continuous-space image to a discrete-space image.

■ Perform the two tasks listed above for continuous-space images on discrete-space images.

This chapter presents the 2-D versions, suitable for image processing, of the 1-D concepts presented in Chapter 2. These include: linear shift-invariant (LSI) systems, 2-D convolution, spatial frequency response, filtering, Fourier transforms for continuous- and discrete-space images, and the 2-D sampling theorem. It also covers concepts that do not arise in 1-D, such as separability, image scaling and rotation, and representation of discrete-space images using various coordinate systems.
Overview

A 2-D image is a signal that varies as a function of the two spatial dimensions, x and y, instead of time t. A 3-D image, such as a CT scan (Fig. 1-24), is a signal that varies as a function of (x, y, z).

In 1-D, it is common practice to assign the symbol x(t) to represent the signal at the input to a system, and to assign y(t) to the output signal [or x[n] and y[n] if the signals are discrete]. Because x, y, and z are the standard symbols of the Cartesian coordinate system, a different symbolic representation is used with 2-D images:

◮ Image intensity is represented by f(x, y), where (x, y) are two orthogonal spatial dimensions. ◭

This chapter extends the 1-D definitions, properties, and transformations covered in the previous chapter into their 2-D equivalents. It also presents certain 2-D properties that have no counterparts in 1-D.

3-1 Displaying Images

In 1-D, a continuous-time signal x(t) is displayed by plotting x(t) versus t. A discrete-time signal x[n] is displayed using a stem plot of x[n] versus n. Clearly, such plots are not applicable for 2-D images.

Image intensity f(x, y) of a 2-D image can be displayed either as a 3-D mesh plot (Fig. 3-1(a)), which hides some features of the image and is difficult to create and interpret, or as a grayscale image (Fig. 3-1(b)). In a grayscale image, the image intensity is scaled so that the minimum value of f(x, y) is depicted in black and the maximum value of f(x, y) is depicted in white. If the image is non-negative (f(x, y) ≥ 0), as is often the case, black in the grayscale image denotes zero values of f(x, y). If the image is not non-negative, zero values of f(x, y) appear as a shade of gray.

MATLAB's imagesc(X),colormap(gray) displays the 2-D array X as a grayscale image in which black depicts the minimum value of X and white depicts the maximum value of X.

Figure 3-1 An image displayed in (a) mesh plot format and (b) grayscale format.

It is also possible to display an image f(x, y) as a false-color image, in which case different colors denote different values of f(x, y). The relation between color and values of f(x, y) is denoted using a colorbar, to the side or bottom of the image. An example of a false-color display was shown earlier in Fig. 1-11, depicting the infrared intensity emitted by a hot air balloon.

We should not confuse a false-color image with a true-color image. Whereas a false-color image is a single grayscale image (1 channel) displayed in color, a true-color image actually is a set of three images (3 channels):

{fred(x, y), fgreen(x, y), fblue(x, y)},

representing (here) the three primary colors: red, green, and blue. Other triplets of colors, such as yellow, cyan, and magenta, can also be used. Hence, image processing of color images

3-2 2-D CONTINUOUS-SPACE IMAGES 91

encompasses 3-channel image processing (see Chapter 10). B. Box Image


A grayscale image can be regarded as a still of a black-and-
white TV image, while a color image can be regarded as a still The box image fBox (x, y) is the 2-D pulse-equivalent of the
of a color TV image. Before flat-panel displays were invented, rectangular function rect(t) and is defined as
color images on TVs and monitors were created from three
separate signals using three different electron guns in picture fBox (x, y) = rect(x) rect(y)
(
tubes, while black-and-white TV images were created using a 1 for |x| < 1/2 and |y| < 1/2,
single electron gun in a picture tube. Modern solid-state color =
0 otherwise.
display, such as liquid crystal displays (LCDs), are composed
of three interleaved displays (similar to the APS display in
Fig. 1-4), each driven by one of the three signal channels. By extension, a box image of widths (ℓx , ℓy ) and centered at
(x0 , y0 ) is defined as
     
Concept Question 3-1: What is the difference between a x − x0 y − y0 x − x0 y − y0
true-color image and a false-color image? fBox , = rect rect
ℓx ℓy ℓx ℓy
(
1 for |x − x0| < ℓx /2 & |y − y0| < ℓy /2,
Exercise 3-1: How can you tell whether an image displayed = (3.2)
0 otherwise,
in color is a color image or a false-color image?
Answer: False-color images should have colorbars to and shown in Fig. 3-2(a).
identify the numerical value associated with each color.

3-2 2-D Continuous-Space Images


A continuous-space image is a physical quantity, such as tem- y
perature or pressure, that varies with spatial position in 2-D. ℓx
Mathematically a continuous-space image is a function f (x, y)
of spatial position (x, y), where x and y have units of length y0 ℓy
(meters).

x0 x
3-2.1 Fundamental 2-D Images
(a) f Box x − x0 , y − y0 = rect x − x0 rect y − y0
A. Impulse ( ℓx ℓy ) ℓx ( ) ( )ℓy
A 2-D impulse δ (x, y) is simply y

δ (x, y) = δ (x) δ (y),


a/2
y0
and a 2-D impulse shifted by ξ along x and by η along y is

δ (x − ξ , y − η ) = δ (x − ξ ) δ (y − η ). x0 x

(b) f Disk x − x0 , y − y0
The sifting property generalizes directly from 1-D to 2-D. In
2-D ( a a )
Z ∞Z ∞
f (ξ , η ) δ (x − ξ , y − η ) d ξ d η = f (x, y). (3.1) Figure 3-2 (a) Box image of widths (ℓx , ℓy ) and centered at
−∞ −∞
(x0 , y0 ) and (b) disk image of radius a/2 and centered at (x0 , y0 ).
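Since fBox is a product of two 1-D rect functions, a sampled version of it can be built as an outer product of two 1-D arrays. The numpy sketch below constructs Eq. (3.2) on a discrete grid; the grid size, center, and widths are arbitrary illustrative choices, and `rect` is a small helper defined here (not part of any library):

```python
import numpy as np

def rect(t):
    """1-D rectangle function: 1 where |t| < 1/2, else 0."""
    return (np.abs(t) < 0.5).astype(float)

x = np.linspace(-1.0, 1.0, 201)          # 1-D spatial axes, spacing 0.01
y = np.linspace(-1.0, 1.0, 201)
x0, y0, lx, ly = 0.2, -0.1, 0.6, 0.4     # center (x0, y0) and widths (lx, ly)

fx = rect((x - x0) / lx)                 # 1-D factor along x
fy = rect((y - y0) / ly)                 # 1-D factor along y
f_box = np.outer(fy, fx)                 # separable 2-D image; rows index y

# The image is 1 inside the lx-by-ly rectangle and 0 outside.
print(f_box.shape, f_box.max(), f_box.min())
```

The resulting array has rank 1, which is exactly the separability of the box image: it is an outer product of a function of x and a function of y.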
92 CHAPTER 3 2-D IMAGES AND SYSTEMS

C. Disk Image

Being rectangular in shape, the box-image function is suitable for applications involving Cartesian coordinates, such as shifting the box sideways or up and down across the image. Some applications, however, require the use of polar coordinates, in which case the disk-image function is more suitable. The disk image fDisk(x, y) of radius 1/2 is defined as

fDisk(x, y) = 1 for √(x² + y²) < 1/2,
              0 for √(x² + y²) > 1/2.   (3.3)

The expression given by Eq. (3.3) pertains to a circular disk centered at the origin and of radius 1/2. For the more general case of a circular disk of radius a/2 and centered at (x0, y0),

fDisk((x − x0)/a, (y − y0)/a) = 1 for √((x − x0)² + (y − y0)²) < a/2,
                                0 for √((x − x0)² + (y − y0)²) > a/2.   (3.4)

An example is displayed in Fig. 3-2(b).

3-2.2 Properties of Images

A. Generalizations of 1-D Properties

1. Spatial shift

When shifted spatially by (x0, y0), image f(x, y) becomes f(x − x0, y − y0). Assuming the x axis is pointing to the right, image f(x − x0, y − y0) is shifted to the right by x0, relative to f(x, y), if x0 is positive, and to the left by the same amount if x0 is negative.
Whereas it is customary to define the direction of the x axis as pointing to the right, there is less uniformity with regard to the definition of the y axis; sometimes the y axis is defined along the upward direction, and in other cases it is defined to be along the downward direction. Hence, if y0 is positive, f(x − x0, y − y0) is shifted by y0 either upwards or downwards, depending on the direction of the y axis.
As to the location of the origin (x, y) = (0, 0), it is usually defined to be at any one of the following three locations: the upper left corner, the lower left corner, or the center of the image (Fig. 3-3).
In this book:

• Images f(x, y) are usually displayed with the origin (0, 0) at the upper-left corner, as in Fig. 3-3(a).

• Point spread functions (PSFs), introduced in Section 3-3.2, are usually displayed with the origin at the center, as in Fig. 3-3(c).

• Image spectra, defined in Section 3-4, are usually displayed as in Fig. 3-3(c).

Figure 3-3 Three commonly used image coordinate systems: (a) origin at top left and y axis downward, (b) origin at bottom left and y axis upward, (c) origin at center of image.

2. Spatial scaling

When spatially scaled by (ax, ay), image f(x, y) becomes f(ax x, ay y). If ax > 1, the image is shrunk in the x direction by a factor ax, and if 0 < ax < 1, the image is magnified in size by 1/ax. So ax represents a shrinkage factor. If ax < 0, the image is reversed in the x direction, in addition to being shrunk by a factor of |ax|. The same comments apply to ay.
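Spatial scaling is easy to visualize numerically: evaluate f(ax x, ay y) on a grid and watch features move toward the origin by the factor ax. The sketch below uses a Gaussian image and ax = 2; the grid and parameters are illustrative choices, not anything prescribed by the text:

```python
import numpy as np

# f is a rotationally symmetric Gaussian image.
f = lambda x, y: np.exp(-np.pi * (x**2 + y**2))

x = np.linspace(-2.0, 2.0, 401)          # spacing 0.01
y = np.linspace(-2.0, 2.0, 401)
X, Y = np.meshgrid(x, y)

ax_, ay_ = 2.0, 1.0                      # shrink by 2 along x, leave y alone
g = f(ax_ * X, ay_ * Y)                  # scaled image g(x, y) = f(ax*x, ay*y)

# A value that f attains at x = 0.5 is attained by g already at x = 0.25:
print(g[200, 200], g[200, 225], f(0.5, 0.0))
```

The feature that sat at x = 0.5 in f appears at x = 0.5/ax = 0.25 in g, which is the sense in which ax > 1 shrinks the image.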
3. Image energy

Extending the expression for signal energy given by Eq. (3.7) from 1-D to 2-D leads to

E = ∫_{−∞}^{∞} ∫_{−∞}^{∞} |f(x, y)|² dx dy.   (3.5)

4. Even-odd decomposition

A real-valued image f(x, y) can be decomposed into its even fe(x, y) and odd fo(x, y) components:

f(x, y) = fe(x, y) + fo(x, y),   (3.6)

where

fe(x, y) = [f(x, y) + f(−x, −y)]/2,   (3.7a)
fo(x, y) = [f(x, y) − f(−x, −y)]/2.   (3.7b)

B. Non-Generalizations of 1-D Properties

We now introduce two new properties of images, separability and rotation, neither one of which has a 1-D counterpart.

1. Separability

An image f(x, y) is separable if it can be written as a product of separate functions f1(x) and f2(y):

f(x, y) = f1(x) f2(y).   (3.8)

As we will see later, the 2-D Fourier transform of a separable image can be computed as a product of the 1-D Fourier transforms of f1(x) and f2(y). 2-D impulses and box images are separable, whereas disk images are not.

2. Rotation

To rotate an image by an angle θ, we define rectangular coordinates (x′, y′) as the rectangular coordinates (x, y) rotated by angle θ. Using the sine and cosine addition formulae, the rotated coordinates (x′, y′) are related to coordinates (x, y) by (Fig. 3-4)

x′ = x cos θ + y sin θ   (3.9)

and

y′ = −x sin θ + y cos θ.   (3.10)

Figure 3-4 Rotation of coordinate system (x, y) by angle θ to coordinate system (x′, y′): x′ = x cos θ + y sin θ, y′ = −x sin θ + y cos θ.

These can be combined into

[x′]   [ cos θ   sin θ] [x]
[y′] = [−sin θ   cos θ] [y].   (3.11)

Hence, after rotation by angle θ, image f(x, y) becomes transformed into a new image g(x, y) given by

g(x, y) = f(x′, y′) = f(x cos θ + y sin θ, y cos θ − x sin θ).   (3.12)

Note that rotating an image usually assumes that the image has been shifted so that the center of the image is at the origin (0, 0), as in Fig. 3-3(c).

Exercise 3-2: Which of the following images is separable: (a) 2-D impulse; (b) box; (c) disk?
Answer: 2-D impulse and box are separable; disk is not separable.

Exercise 3-3: Which of the following images is invariant to rotation: (a) 2-D impulse; (b) box; (c) disk with center at the origin (0, 0)?
Answer: 2-D impulse and disk are invariant to rotation; box is not invariant to rotation.

3-3 Continuous-Space Systems

A continuous-space system is a device or mathematical model that accepts as an input an image f(x, y) and produces as an
output an image g(x, y):

f(x, y) → SYSTEM → g(x, y).

The image rotation transformation described by Eq. (3.12) is a good example of such a 2-D system.

3-3.1 Linear and Shift-Invariant (LSI) Systems

The definition of the linearity property of 1-D systems (Section 2-2.1) extends directly to 2-D spatial systems, as does the definition of invariance, except that time invariance in 1-D systems becomes shift invariance in 2-D systems. Systems that are both linear and shift-invariant are termed linear shift-invariant (LSI).

3-3.2 Point Spread Function (PSF)

The point spread function (PSF) of a 2-D system is essentially its 2-D impulse response. The PSF h(x, y; x0, y0) of an image system is its response to a 2-D impulse δ(x, y) shifted by (x0, y0):

δ(x − x0, y − y0) → SYSTEM → h(x, y; x0, y0).   (3.13a)

If the system is also shift-invariant (SI), Eq. (3.13a) becomes

δ(x − x0, y − y0) → SI → h(x − x0, y − y0).   (3.13b)

3-3.3 2-D Convolution

For a linear system, extending the 1-D superposition integral expression given by Eq. (2.13) to 2-D gives

f(x, y) → L → g(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ, η) h(x, y; ξ, η) dξ dη,   (3.14)

and if the system also is shift-invariant, the 2-D superposition integral simplifies to the 2-D convolution given by

g(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ, η) h(x − ξ, y − η) dξ dη = f(x, y) ∗∗ h(x, y),   (3.15a)

where the “double star” in f(x, y) ∗∗ h(x, y) denotes the 2-D convolution of the PSF h(x, y) with the input image f(x, y). In symbolic form, the 2-D convolution is written as

f(x, y) → LSI → g(x, y) = f(x, y) ∗∗ h(x, y).   (3.15b)

◮ A 2-D convolution consists of a convolution in the x direction, followed by a convolution in the y direction, or vice versa. Consequently, the 1-D convolution properties listed in Table 2-3 generalize to 2-D. ◭

Concept Question 3-2: Why do so many 1-D convolution properties generalize directly to 2-D?

Exercise 3-4: The operation of a 2-D mirror is described by g(x, y) = f(−x, −y). Find the PSF of the mirror.
Answer: h(x, y; ξ, η) = δ(x + ξ, y + η).

Exercise 3-5: If g(x, y) = h(x, y) ∗∗ f(x, y), what is 4h(x, y) ∗∗ f(x − 3, y − 2) in terms of g(x, y)?
Answer: 4g(x − 3, y − 2), using the shift and scaling properties of 1-D convolutions.

3-4 2-D Continuous-Space Fourier Transform (CSFT)

The 2-D continuous-space Fourier transform (CSFT) operates between the spatial domain (x, y) and the spatial frequency domain (µ, v), where µ and v are called spatial frequencies or wavenumbers, with units of cycles/meter, analogous to the units of cycles/s (i.e., Hz) for the time frequency f.
The 2-D CSFT is denoted F(µ, v) and it is related to f(x, y)
by

F(µ, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) e^{−j2π(µx+vy)} dx dy.   (3.16a)

The inverse operation is given by

f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} F(µ, v) e^{j2π(µx+vy)} dµ dv,   (3.16b)

and the combination of the two operations is represented symbolically by

f(x, y) ↔ F(µ, v).

In the 1-D continuous-time domain, we call the output of an LTI system its impulse response h(t) when the input is an impulse:

δ(t) → LTI → h(t),

and we call the Fourier transform of h(t) the frequency response of the system, H(f):

h(t) ↔ H(f).

The analogous relationships for a 2-D LSI system are

δ(x, y) → LSI → h(x, y),   (3.17a)
h(x, y) ↔ H(µ, v).   (3.17b)

The CSFT H(µ, v) is called the spatial frequency response of the LSI system.

◮ As in 1-D, the 2-D Fourier transform of a convolution of two functions is equal to the product of their Fourier transforms:

f(x, y) → LSI → g(x, y) = h(x, y) ∗∗ f(x, y)

implies that

G(µ, v) = H(µ, v) F(µ, v). ◭

All of the 1-D Fourier transform properties listed in Table 2-4 and all of the 1-D transform pairs listed in Table 2-5 generalize readily to 2-D. The 2-D version of the two tables is available in Table 3-1.

◮ The spectrum F(µ, ν) of an image f(x, y) is its 2-D CSFT. The spatial frequency response H(µ, ν) of an LSI 2-D system is the 2-D CSFT of its PSF h(x, y). ◭

3-4.1 Notable 2-D CSFT Pairs and Properties

A. Conjugate Symmetry

The 2-D CSFT of a real-valued image f(x, y) obeys the conjugate symmetry property

F∗(µ, v) = F(−µ, −v),   (3.18)

which states that the 2-D Fourier transform F(µ, v) must be reflected across both the µ and v axes to produce its complex conjugate.

B. Separable Images

◮ The CSFT of a separable image f(x, y) = f1(x) f2(y) is itself separable in the spatial frequency domain:

f1(x) f2(y) ↔ F1(µ) F2(v).   (3.19) ◭

This assertion follows directly from the definition of the CSFT given by Eq. (3.16a):

F(µ, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) e^{−j2π(µx+vy)} dx dy
        = [∫_{−∞}^{∞} f1(x) e^{−j2πµx} dx] [∫_{−∞}^{∞} f2(y) e^{−j2πvy} dy]
        = F1(µ) F2(v).   (3.20)

◮ The CSFT pairs listed in Table 3-1 are all separable functions, and can be obtained by applying Eq. (3.20) to the 1-D Fourier transform pairs listed in Table 2-5. CSFT pairs for non-separable functions are listed later in Table 3-2. ◭

C. Sinusoidal Image

Consider the sinusoidal image described by

f(x, y) = cos(2πµ0 x) cos(2πv0 y),

where µ0 = 1.9 cycles/cm and v0 = 0.9 cycles/cm are the frequencies of the spatial variations along x and y in the spatial
Table 3-1 2-D Continuous-space Fourier transform (CSFT).

Selected Properties
1. Linearity:                  Σ ci fi(x, y)  ↔  Σ ci Fi(µ, v)
2. Spatial scaling:            f(ax x, ay y)  ↔  (1/|ax ay|) F(µ/ax, v/ay)
3. Spatial shift:              f(x − x0, y − y0)  ↔  e^{−j2πµx0} e^{−j2πvy0} F(µ, v)
4. Reversal:                   f(−x, −y)  ↔  F(−µ, −v)
5. Conjugation:                f∗(x, y)  ↔  F∗(−µ, −v)
6. Convolution in space:       f(x, y) ∗∗ h(x, y)  ↔  F(µ, v) H(µ, v)
7. Convolution in frequency:   f(x, y) h(x, y)  ↔  F(µ, v) ∗∗ H(µ, v)

CSFT Pairs
8.  δ(x, y)  ↔  1
9.  δ(x − x0, y − y0)  ↔  e^{−j2πµx0} e^{−j2πvy0}
10. e^{j2πµ0x} e^{j2πv0y}  ↔  δ(µ − µ0, v − v0)
11. rect(x/ℓx) rect(y/ℓy)  ↔  ℓx ℓy sinc(ℓx µ) sinc(ℓy v)
12. µ0 v0 sinc(µ0 x) sinc(v0 y)  ↔  rect(µ/µ0) rect(v/v0)
13. e^{−πx²} e^{−πy²}  ↔  e^{−πµ²} e^{−πv²}
14. cos(2πµ0 x) cos(2πv0 y)  ↔  (1/4)[δ(µ ± µ0) δ(v ± v0)]
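Property #6 in the table can be spot-checked numerically. The 2-D DFT (introduced in Section 3-8) obeys the same convolution theorem with circular convolution, so the numpy sketch below compares a directly computed 2-D circular convolution against multiplication of spectra; the array size and random contents are arbitrary test choices:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8))
h = rng.standard_normal((8, 8))

# Direct 2-D circular convolution: g[m, n] = sum_k sum_l f[k, l] h[(m-k) mod N, (n-l) mod N]
N = 8
g_direct = np.zeros((N, N))
for m in range(N):
    for n in range(N):
        for k in range(N):
            for l in range(N):
                g_direct[m, n] += f[k, l] * h[(m - k) % N, (n - l) % N]

# Frequency-domain route: multiply the 2-D DFTs and invert.
g_fft = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))

print(np.max(np.abs(g_direct - g_fft)))   # agreement to machine precision
```

The same identity, applied with fine sampling, is what makes FFT-based filtering (Section 3-4.1, property #6) so much cheaper than direct convolution.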

domain, respectively. A grayscale image-display of f(x, y) is shown in Fig. 3-5(a), with pure black representing f(x, y) = −1 and pure white representing f(x, y) = +1. As expected, the image exhibits a repetitive pattern along both x and y, with 19 cycles in 10 cm in the x direction and 9 cycles in 10 cm in the y direction, corresponding to spatial frequencies of µ = 1.9 cycles/cm and v = 0.9 cycles/cm, respectively.
By Eq. (3.20), the CSFT of f(x, y) is

F(µ, v) = F1(µ) F2(v)
        = F{ cos(2πµ0 x) } F{ cos(2πv0 y) }
        = (1/4) [δ(µ − µ0) + δ(µ + µ0)] × [δ(v − v0) + δ(v + v0)],   (3.21)

which is entry #14 in Table 3-1. The CSFT consists of four impulses at spatial frequency locations { µ, v } = { ±µ0, ±v0 }, as is also shown in Fig. 3-5(b).
We should note that the images displayed in Fig. 3-5 are
not truly continuous functions; function f(x, y) was discretized into 256 × 256 pixels and then a discrete form of the Fourier transform called the DFT (Section 3-8) was used to generate F(µ, v), also in the form of a 256 × 256 image.

Figure 3-5 (a) Sinusoidal image f(x, y) = cos(2πµ0 x) cos(2πv0 y) with µ0 = 1.9 cycles/cm and v0 = 0.9 cycles/cm, and (b) the corresponding Fourier transform F(µ, v), which consists of four impulses (4 white dots) at { ±µ0, ±v0 }.

Figure 3-6 (a) Rectangular pulse, and corresponding (b) magnitude spectrum and (c) phase spectrum.

D. Box Image

As a prelude to presenting the CSFT of a 2-D box image, let us examine the 1-D case of the rectangular pulse f(t) = rect(t/T) shown in Fig. 3-6(a). The pulse is centered at the origin and extends between −T/2 and +T/2, and the corresponding Fourier transform is, from entry #3 in Table 2-5,

F(µ) = T sinc(µT),   (3.22)
where sinc(θ) is the sinc function defined by Eq. (2.35) as sinc θ = [sin(πθ)]/(πθ). By defining F(µ) as

F(µ) = |F(µ)| e^{jφ(µ)},   (3.23)

we determine that the phase spectrum φ(µ) can be ascertained from

e^{jφ(µ)} = F(µ)/|F(µ)| = sinc(µT)/|sinc(µT)|.   (3.24)

The quantity on the right-hand side of Eq. (3.24) is always equal to +1 or −1. Hence, φ(µ) = 0° when sinc(µT) is positive and 180° when sinc(µT) is negative. The magnitude and phase spectra of the rectangular pulse are displayed in Figs. 3-6(b) and (c), respectively.
Next, let us consider the white square shown in Fig. 3-7(a). If we assign an amplitude of 1 to the white part of the image and 0 to the black part, the variation across the image along the x direction is analogous to that representing the time-domain pulse of Fig. 3-6(a), and the same is true along y. Hence, the white square represents the product of two pulses, one along x and another along y, and is given by

f(x, y) = rect(x/ℓ) rect(y/ℓ),   (3.25)

where ℓ is the length of the square sides. In analogy with Eq. (3.22),

F(µ, v) = ℓ² sinc(µℓ) sinc(vℓ).   (3.26)

Figure 3-7 (a) Grayscale image of a white square in a black background, (b) magnitude spectrum, and (c) phase spectrum.

The magnitude and phase spectra associated with the expression given by Eq. (3.26) are displayed in grayscale format in Fig. 3-7(b) and (c), respectively. For the magnitude spectrum, white represents the peak value of |FBox(µ, v)| and black represents |FBox(µ, v)| = 0. The phase spectrum φ(µ, v) varies between 0° and 180°, so the grayscale was defined such that white corresponds to +180° and black to 0°. The tonal variations along µ and v are equivalent to the patterns depicted in Figs. 3-6(b) and (c) for the rectangular pulse.
In the general case of a box image of widths ℓx along x and ℓy along y, and centered at (x0, y0),

f(x, y) = rect((x − x0)/ℓx) rect((y − y0)/ℓy).   (3.27)

In view of properties #3 and 11 in Table 3-1, the corresponding CSFT is

F(µ, v) = ℓx e^{−j2πµx0} sinc(µℓx) ℓy e^{−j2πvy0} sinc(vℓy),   (3.28)
and the associated magnitude and phase spectra are given by

|F(µ, v)|   (3.29a)

and

φ(µ, v) = tan⁻¹( Im[F(µ, v)] / Re[F(µ, v)] ).   (3.29b)

A visual example is shown in Fig. 3-8(a) for a square box of sides ℓx = ℓy = ℓ, shifted to the right by L and also downward by L. Inserting x0 = L, y0 = −L, and ℓx = ℓy = ℓ in Eq. (3.28) leads to

F(µ, v) = ℓ² e^{−j2πµL} sinc(µℓ) e^{j2πvL} sinc(vℓ).   (3.30)

Figure 3-8 (a) Box image of dimension ℓ and centered at (L, −L), (b) magnitude spectrum, and (c) phase spectrum.

The magnitude and phase spectra associated with the CSFT of the box image defined by Eq. (3.30) are displayed in Fig. 3-8(b) and (c), respectively. The magnitude spectrum of the shifted box is similar to that of the unshifted box (Fig. 3-7), but the phase spectra of the two boxes are considerably different.

E. 2-D Ideal Brickwall Lowpass Filter

An ideal lowpass filter is characterized by a spatial frequency response HLP(µ, v) with a specified cutoff frequency µ0 along both the µ and v axes:

HLP(µ, v) = 1 for 0 ≤ |µ|, |v| ≤ µ0,
            0 otherwise.   (3.31)

Mathematically, HLP(µ, v) can be expressed in terms of rectangle functions centered at the origin and of width 2µ0:

HLP(µ, v) = rect(µ/2µ0) rect(v/2µ0).   (3.32)

The inverse 2-D Fourier transform of HLP(µ, v) is the PSF hLP(x, y). Application of property #12 in Table 3-1 yields

hLP(x, y) = [sin(2πxµ0)/(πx)] [sin(2πyv0)/(πy)] = 4µ0² sinc(2xµ0) sinc(2yv0).   (3.33)

F. Example: Lowpass-Filtering of Clown Image

Figure 3-9(a) displays an image of a clown’s face. Our goal is to lowpass-filter the clown image using an ideal lowpass filter with a cutoff frequency µ0 = 0.44 cycles/mm. To do so, we perform
Figure 3-9 Lowpass filtering the clown image: (a) clown face image f(x, y); (b) magnitude spectrum of clown image F(µ, v); (c) magnified PSF h(x, y) of the 2-D LPF; (d) spatial frequency response of the 2-D LPF, HLP(µ, v), with µ0 = 0.44 cycles/mm; (e) lowpass-filtered clown image g(x, y); (f) magnitude spectrum of filtered image G(µ, v). Image f(x, y) is 40 mm × 40 mm and the magnitude spectra extend between −2.5 cycles/mm and +2.5 cycles/mm in both directions.
the following steps:

(1) We denote the intensity distribution of the clown image as f(x, y), and we apply the 2-D Fourier transform to obtain the spectrum F(µ, v), whose magnitude is displayed in Fig. 3-9(b).

(2) The spatial frequency response of the lowpass filter, shown in Fig. 3-9(d), consists of a white square representing the passband of the filter. Its functional form is given by Eq. (3.32) with µ0 = 0.44 cycles/mm. The corresponding PSF given by Eq. (3.33) is displayed in Fig. 3-9(c).

(3) Multiplication of F(µ, v) by HLP(µ, v) yields the spectrum of the filtered image, G(µ, v):

G(µ, v) = F(µ, v) HLP(µ, v).   (3.34)

The magnitude of the result is displayed in Fig. 3-9(f). Upon performing an inverse Fourier transform on G(µ, v), we obtain g(x, y), the lowpass-filtered image of the clown face shown in Fig. 3-9(e). Image g(x, y) looks like a blurred version of the original image f(x, y) because the lowpass filtering smooths out rapid variations in the image.
Alternatively, we could have obtained g(x, y) directly by performing a convolution in the spatial domain:

g(x, y) = f(x, y) ∗∗ hLP(x, y).   (3.35)

Even though the convolution approach is direct and conceptually straightforward, it is computationally much easier to perform the filtering by transforming to the spatial frequency domain, multiplying the two spectra, and then inverse transforming back to the spatial domain. The actual computation was performed using discretized (pixelated) images, and the Fourier transformations were realized using the 2-D DFT introduced later in Section 3-8.

3-4.2 Image Rotation

◮ Rotating an image by angle θ (Fig. 3-10) in the 2-D spatial domain (x, y) causes its Fourier transform to also rotate by the same angle in the frequency domain (µ, v). ◭

Figure 3-10 Rotation of axes by θ in the (a) spatial domain causes rotation by the same angle in the (b) spatial frequency domain.

To demonstrate the validity of the assertion, we start with the relationships given by Eqs. (3.11) and (3.12):

g(x, y) = f(x′, y′)   (3.36)

and

[x′]        [x]
[y′] = Rθ  [y],   (3.37)

where f(x, y) is the original image, (x′, y′) are the coordinates of the rotated image g(x, y) = f(x′, y′), and Rθ is the rotation matrix relating (x, y) to (x′, y′):

       [ cos θ   sin θ]
Rθ =   [−sin θ   cos θ].   (3.38)

The inverse relationship between (x′, y′) and (x, y) is given in terms of the inverse of matrix Rθ:

[x]          [x′]   [cos θ   −sin θ] [x′]
[y] = Rθ⁻¹  [y′] = [sin θ    cos θ] [y′].   (3.39)

The 2-D Fourier transform of g(x, y) is given by

G(µ, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) e^{−j2π(µx+vy)} dx dy
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x′, y′) e^{−j2π(µx+vy)} dx dy.   (3.40)

Using the relationships between (x, y) and (x′, y′) defined by Eq. (3.39), while also recognizing that dx dy = dx′ dy′ because a differential element of area is the same in either coordinate system, Eq. (3.40) becomes

G(µ, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x′, y′) e^{−j2π[µ(x′ cos θ − y′ sin θ) + v(x′ sin θ + y′ cos θ)]} dx′ dy′
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x′, y′) e^{−j2π[µ′x′ + v′y′]} dx′ dy′,   (3.41)
where we define

[µ′]   [ cos θ   sin θ] [µ]        [µ]
[v′] = [−sin θ   cos θ] [v] = Rθ  [v].   (3.42)

The newly defined spatial-frequency coordinates (µ′, v′) are related to the original frequency coordinates (µ, v) by exactly the same rotation matrix Rθ that was used to rotate image f(x, y) to g(x, y). The consequence of using Eq. (3.42) is that Eq. (3.41) now assumes the standard form for the definition of the Fourier transform for f(x′, y′):

G(µ, v) = F(µ′, v′).   (3.43)

In conclusion, we have demonstrated that rotation of image f(x, y) by angle θ in the (x, y) plane leads to rotation of F(µ, v) by exactly the same angle in the spatial frequency domain.

3-4.3 2-D Fourier Transform in Polar Coordinates

In the spatial domain, the location of a point can be specified by its (x, y) coordinates in a Cartesian coordinate system or by its (r, θ) in the corresponding polar coordinate system (Fig. 3-11(a)). The two pairs of variables are related by

x = r cos θ,   y = r sin θ;    r = √(x² + y²),   θ = tan⁻¹(y/x).   (3.44)

Figure 3-11 Relationships between Cartesian and polar coordinates in (a) the spatial domain and (b) the spatial frequency domain.

Similarly, in the spatial frequency domain (Fig. 3-11(b)), we can use Cartesian coordinates (µ, v) or their corresponding polar coordinates (ρ, φ), with

µ = ρ cos φ,   v = ρ sin φ;    ρ = √(µ² + v²),   φ = tan⁻¹(v/µ).   (3.45)

The Fourier transform of f(x, y) is given by Eq. (3.16a) as

F(µ, v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) e^{−j2π(µx+vy)} dx dy.   (3.46)

We wish to transform F(µ, v) into polar coordinates so we may apply it to circularly symmetric images or use it in filtering applications where the filter’s frequency response is defined in terms of polar coordinates. To that end, we convert the differential area dx dy in Eq. (3.46) to r dr dθ, and we use the relations given by Eqs. (3.44) and (3.45) to transform the exponent in Eq. (3.46):

µx + vy = (ρ cos φ)(r cos θ) + (ρ sin φ)(r sin θ)
        = ρr [cos φ cos θ + sin φ sin θ]
        = ρr cos(φ − θ).   (3.47)

The cosine addition formula was used in the last step. Conversion to polar coordinates leads to

F(ρ, φ) = ∫_{r=0}^{∞} ∫_{θ=0}^{2π} f(r, θ) e^{−j2πρr cos(φ−θ)} r dr dθ.   (3.48a)

The inverse transform is given by

f(r, θ) = ∫_{ρ=0}^{∞} ∫_{φ=0}^{2π} F(ρ, φ) e^{j2πρr cos(φ−θ)} ρ dρ dφ.   (3.48b)

3-4.4 Rotationally Invariant Images

A rotationally invariant image is a circularly symmetric image, which means that f(r, θ) is a function of r only. According to Section 3-4.2, rotation of f(r, θ) by a fixed angle θ0 causes the transform F(ρ, φ) to rotate by exactly the same angle θ0. Hence, if f(r, θ) is independent of θ, it follows that F(ρ, φ) is independent of φ, in which case Eq. (3.48a) can be rewritten as

F(ρ) = ∫_{r=0}^{∞} ∫_{θ=0}^{2π} f(r) e^{−j2πρr cos(φ−θ)} r dr dθ
     = ∫_{r=0}^{∞} r f(r) [∫_{θ=0}^{2π} e^{−j2πρr cos(φ−θ)} dθ] dr.   (3.49)
Figure 3-12 Plot of J0(z), the Bessel function of order zero, as a function of z.

Because the integration over θ extends over the range (0, 2π), the integrated value is the same for any fixed value of φ. Hence, for simplicity we set φ = 0, in which case Eq. (3.49) simplifies to

F(ρ) = ∫_{r=0}^{∞} r f(r) [∫_{θ=0}^{2π} e^{−j2πρr cos θ} dθ] dr
     = ∫_{r=0}^{∞} 2π r f(r) J0(2πρr) dr,   (3.50)

where J0(z) is the Bessel function of order zero:

J0(z) = (1/2π) ∫_0^{2π} e^{−jz cos θ} dθ.   (3.51)

A plot of J0(z) versus z is shown in Fig. 3-12.
The integral expression on the right-hand side of Eq. (3.50) is known as the Hankel transform of order zero. Hence, the Fourier transform of a circularly symmetric image f(r) is given by its Hankel transform of order zero. An example is the ring impulse

f(r) = δ(r − a),   (3.52a)

which defines a unit-intensity circle of radius a (Fig. 3-13(a)) in the spatial coordinate system (r, θ).

Figure 3-13 (a) Image of ring impulse of radius a = 1 cm and (b) the logarithm of its 2-D CSFT. [In display mode, the images are 256 × 256 pixels.]

The corresponding Fourier
Table 3-2 2-D Fourier transforms of rotationally invariant images.

    f(r)              F(ρ)
    δ(r)/(πr)         1
    rect(r)           J1(πρ)/(2ρ)
    J1(πr)/(2r)       rect(ρ)
    1/r               1/ρ
    e^{−πr²}          e^{−πρ²}
    δ(r − r0)         2πr0 J0(2πr0 ρ)

transform is

F(ρ) = ∫_{r=0}^{∞} 2π r δ(r − a) J0(2πρr) dr = 2πa J0(2πρa).   (3.52b)

The image in Fig. 3-13(b) displays the variation of F(ρ) as a function of ρ in the spatial frequency domain for a ring with a = 1 cm (image size is 6 cm × 6 cm).
Table 3-2 provides a list of Fourier transform pairs of rotationally symmetric images.

3-4.5 Image Examples

A. Scaling

Figure 3-14 compares image f(x, y), representing an image of letters, to a scaled-down version f(x′, y′) with x′ = ax, y′ = ay, and a = 4. The area of the scaled-down image is 1/16 of the area of the original. To enlarge the image, the value of a should be smaller than 1.

Figure 3-14 (a) Letters image f(x, y) and (b) letters image f(x′, y′) with x′ = ax and y′ = ay, spatially scaled by a = 4.

B. Image Rotation

The image displayed in Fig. 3-15(a) is a sinusoidal image that oscillates along only the y direction. Its 2-D spectrum consists of two impulse functions along the v direction, as shown in Fig. 3-15(b). Rotating the sinusoidal image by 45° to the image in Fig. 3-15(c) leads to a corresponding rotation of the spectrum, as shown in Fig. 3-15(d).

C. Gaussian Image

A 2-D Gaussian image is characterized by

f(x, y) = e^{−π(x²+y²)}.   (3.53a)   (Gaussian image)

Since x² + y² = r², f(x, y) is rotationally invariant, so we can rewrite it as

f(r) = e^{−πr²}.   (3.53b)

To obtain the Fourier transform F(ρ), we can apply Eq. (3.50), the Fourier transform for a rotationally invariant image:

F(ρ) = ∫_0^{∞} 2π r f(r) J0(2πρr) dr
     = ∫_0^{∞} 2π r e^{−πr²} J0(2πρr) dr.   (3.54)
Figure 3-15 (a) Sinusoidal image and (b) its 2-D spectrum; (c) the sinusoidal image rotated by 45° and (d) the spectrum of the rotated image.

From standard tables of integrals, we borrow the following identity for any real variable t:

∫_0^{∞} t e^{−a²t²} J0(bt) dt = (1/2a²) e^{−b²/4a²},   for a² > 0.   (3.55)

The integrals in Eq. (3.54) and Eq. (3.55) become identical if we set t = r, a² = π, and b = 2πρ, which leads to

F(ρ) = e^{−πρ²}.   (3.56)   (Gaussian spectrum)
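Eq. (3.56) lends itself to a quick numerical check: sample e^{−πr²} on a fine grid, approximate the CSFT by a 2-D DFT, and compare against e^{−πρ²}. In the sketch below, the grid size and window are illustrative choices; the d² factor converts the DFT sum into an approximation of the integral in Eq. (3.16a):

```python
import numpy as np

N, L = 256, 16.0
d = L / N                                        # sample spacing
x = (np.arange(N) - N // 2) * d                  # centered spatial axis
X, Y = np.meshgrid(x, x)
f = np.exp(-np.pi * (X**2 + Y**2))               # Gaussian image, Eq. (3.53a)

# ifftshift puts the origin at index 0; fftshift recenters the spectrum.
F = d * d * np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(f)))
mu = np.fft.fftshift(np.fft.fftfreq(N, d))       # centered frequency axis
MU, NU = np.meshgrid(mu, mu)

err = np.max(np.abs(F - np.exp(-np.pi * (MU**2 + NU**2))))
print(err)    # tiny: the computed spectrum is again a Gaussian
```

The error is at machine-precision level because the Gaussian decays to essentially zero well inside both the spatial window and the frequency window, so neither truncation nor aliasing contributes.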
◮ Hence, the Fourier transform of a 2-D Gaussian image is itself 2-D Gaussian. ◭

D. Disk Image

A disk image has a value of 1 inside the disk area and zero outside it. A disk image centered at the origin and of radius 1/2 is characterized by Eq. (3.3) in (x, y) coordinates. Conversion to polar coordinates gives

fDisk(r) = rect(r) = 1 for 0 ≤ r < 1/2,
                     0 otherwise.   (3.57)

After much algebra, it can be shown that the corresponding Fourier transform is given by

FDisk(ρ) = J1(πρ)/(2ρ) = jinc(ρ),   (3.58)

where J1(x) is the Bessel function of order 1, and jinc(x) = J1(πx)/(2x) is called the jinc function, which comes from its resemblance in both waveform and purpose to the sinc function defined by Eq. (2.35), except that the numerator changes from a sine to a Bessel function. Figure 3-16 displays a plot of jinc(ρ), as well as a plot of the sinc function sinc(ρ), included here for comparison. In one dimension, the Fourier transform of rect(x) is sinc(µ); in two dimensions, the Fourier transform of a disk image fDisk(r) = rect(r) is given by jinc(ρ), which resembles the variation exhibited by the sinc function.

Figure 3-16 Sinc function sinc(ρ) = sin(πρ)/(πρ) (blue) and jinc function jinc(ρ) = J1(πρ)/(2ρ) (red).

E. PSF of Radial Brickwall Lowpass Filter

In the spatial frequency domain, the frequency response of a radial brickwall lowpass filter with cutoff spatial frequency ρ0 extends between 0 and ρ0, and is given by

HLP(ρ) = rect(ρ/2ρ0) = 1 for 0 < |ρ| < ρ0,
                       0 otherwise.

By Fourier duality, or equivalently by application of the inverse transformation given by Eq. (3.48b), we obtain the PSF of the lowpass filter as

hLP(r) = (2ρ0) J1(2πρ0 r)/(2r) = (2ρ0)² jinc(2ρ0 r).   (3.59)

◮ We should always remember the scaling property of the Fourier transform, namely if

f(r) ↔ F(ρ),

then for any real-valued scaling factor a,

f(ar) ↔ (1/|a|²) F(ρ/a).

The scaling property allows us to use available expressions, such as those in Eqs. (3.58) and (3.59), and to easily convert them into the expressions appropriate (for example) to disks of different sizes or filters with different cutoff frequencies. ◭

Concept Question 3-3: Why do so many 1-D Fourier transform properties generalize directly to 2-D?

Concept Question 3-4: Where does the jinc function get its name?

Exercise 3-6: Why do f(x, y) and f(x − x0, y − y0) have the same magnitude spectrum |F(µ, ν)|?
Answer: Let g(x, y) = f(x − x0, y − y0). Then, from entry #3 in Table 3-1,

|G(µ, ν)| = |e^{−j2πµx0} e^{−j2πνy0} F(µ, ν)| = |e^{−j2πµx0} e^{−j2πνy0}| |F(µ, ν)| = |F(µ, ν)|.
3-5 2-D SAMPLING THEOREM 107

2
Exercise 3-7: Compute the 2-D CSFT of f (x, y) = e^(−πr²), where r² = x² + y², without using Bessel functions. Hint: f (x, y) is separable.

Answer: f (x, y) = e^(−πr²) = e^(−πx²) e^(−πy²) is separable, so Eq. (3.19) and entry #5 of Table 2-5 (see also entry #13 of Table 3-1) give

F(µ, ν) = e^(−πµ²) e^(−πν²) = e^(−π(µ² + ν²)) = e^(−πρ²).

Exercise 3-8: The 1-D phase spectrum φ(µ) in Fig. 3-6(c) is either 0 or 180° for all µ. Yet the phase of the 1-D CTFT of a real-valued function must be an odd function of frequency. How can these two statements be reconciled?

Answer: A phase of 180° is equivalent to a phase of −180°. Replacing 180° with −180° for µ < 0 in Fig. 3-6(c) makes the phase φ(µ) an odd function of µ.

Figure 3-17 The "bed of nails" function

Σ (n = −∞ to ∞) Σ (m = −∞ to ∞) δ(x − n∆) δ(y − m∆).

3-5 2-D Sampling Theorem

The sampling theorem generalizes directly from 1-D to 2-D using rectangular sampling:

f [n, m] = f (n∆, m∆) = f (n/S, m/S), (3.60)

where ∆ is the sampling length (instead of interval) and S = 1/∆ is the sampling rate in samples/meter.

If the spectrum of image f (x, y) is bandlimited to B—that is, F(µ, ν) = 0 outside the square region defined by

{ (µ, ν) : 0 ≤ |µ|, |ν| ≤ B },

then the image f (x, y) can be reconstructed from its samples f [n, m], provided the sampling rate is such that S > 2B. As in 1-D, 2B is called the Nyquist sampling rate, although the units are now samples/meter instead of samples/second.

The sampled signal xs (t) defined by Eq. (2.43) generalizes directly to the sampled image:

fs (x, y) = Σ (n = −∞ to ∞) Σ (m = −∞ to ∞) f (n∆, m∆) [δ(x − n∆) δ(y − m∆)]. (3.61)

The term inside the square brackets (product of two impulse trains) is called the bed of nails function, because it consists of a 2-D array of impulses, as shown in Fig. 3-17.

Conceptually, image f (x, y) can be reconstructed from its discretized version f (n∆, m∆) by applying the 2-D version of the sinc interpolation formula. Generalizing Eq. (2.51) to 2-D gives

f (x, y) = Σ (n = −∞ to ∞) Σ (m = −∞ to ∞) f (n∆, m∆) × [sin(πS(x − n∆)) / (πS(x − n∆))] [sin(πS(y − m∆)) / (πS(y − m∆))]. (3.62)

As noted earlier in Section 2-4.4 in connection with Eq. (2.51), accurate reconstruction using the sinc interpolation formula is not practical because it requires summations over an infinite number of samples.

3-5.1 Sampling/Reconstruction Examples

The following image examples are designed to illustrate the important role of the Nyquist rate when sampling an image f (x, y) (for storage or digital transmission) and then reconstructing it from its sampled version fs (x, y). We will use the term image reconstruction fidelity as a qualitative measure of how well the reconstructed image frec (x, y) resembles the original image f (x, y).

Reconstruction of frec (x, y) from the sampled image fs (x, y) can be accomplished through either of two approaches:

(a) Application of nearest-neighbor (NN) interpolation (which is a 2-D version of the 1-D nearest-neighbor interpolation), implemented directly on image fs (x, y).
108 CHAPTER 3 2-D IMAGES AND SYSTEMS

(b) Transforming image fs (x, y) to the frequency domain, applying 2-D lowpass filtering (LPF) to simultaneously preserve the central spectrum of f (x, y) and remove all copies thereof (generated by the sampling process), and then inverse transforming to the spatial domain.

Both approaches will be demonstrated in the examples that follow, and in each case we will compare an image reconstructed from an image sampled at the Nyquist rate with an aliased image reconstructed from an image sampled at a rate well below the Nyquist rate. In all cases, the following parameters apply:

• Size of original (clown) image f (x, y) and reconstructed image frec (x, y): 40 mm × 40 mm

• Sampling interval ∆ (and corresponding sampling rate S = 1/∆) and number of samples N:
  • Nyquist-sampled version: ∆ = 0.2 mm, S = 5 samples/mm, N = 200 × 200
  • Sub-Nyquist-sampled version: ∆ = 0.4 mm, S = 2.5 samples/mm, N = 100 × 100

• Spectrum of original image f (x, y) is bandlimited to B = 2.5 cycles/mm

• Display:
  • Images f (x, y), fs (x, y), frec (x, y): linear scale
  • Image magnitude spectra: logarithmic scale (for easier viewing; magnitude spectra extend over a wide range)

Reconstruction Example 1: Image Sampled at Nyquist Rate

Our first step is to create a bandlimited image f (x, y). This was done by transforming an available clown image to the spatial frequency domain and then applying a lowpass filter with a cutoff frequency of 2.5 cycles/mm. The resultant image and its corresponding spectrum are displayed in Figs. 3-18(a) and (b), respectively.

A. LPF Reconstruction

Given that image f (x, y) is bandlimited to B = 2.5 cycles/mm, the Nyquist rate is 2B = 5 samples/mm. Figure 3-18(c) displays fs (x, y), a sampled version of f (x, y), sampled at the Nyquist rate, so it should be possible to reconstruct the original image with good fidelity. The spectrum of fs (x, y) is displayed in part (d). The spectrum of the sampled image contains the spectrum of the original image (namely, the spectrum in Fig. 3-18(b)), plus periodic copies spaced at an interval S along both directions in the spatial frequency domain. To preserve the central spectrum and simultaneously remove all of the copies, a lowpass filter is applied in step (f) of Fig. 3-18. Finally, application of the 2-D inverse Fourier transform to the spectrum in part (f) leads to the reconstructed image frec (x, y) in part (e). We note that the process yields a reconstructed image with high-fidelity resemblance to the original image f (x, y).

B. NN Reconstruction

Figure 3-19 displays image f (x, y), sampled image fs (x, y), and the NN reconstructed image fˆ(x, y). The last step was realized using a 2-D version of the nearest-neighbor interpolation technique described in Section 2-4.4D. NN reconstruction provides image

fˆ(x, y) = fs (x, y) ∗∗ rect(x/∆) rect(y/∆), (3.63)

which is a 2-D convolution of the 2-D sampled image fs (x, y) with a box function. The spectrum of the NN interpolated signal is

F̂(µ, ν) = Fs (µ, ν) [sin(π∆µ)/(πµ)] [sin(π∆ν)/(πν)]. (3.64)

As in 1-D, the zero crossings of the 2-D sinc functions coincide with the centers of the copies of F(µ, ν) induced by sampling. Consequently, the 2-D sinc functions act like lowpass filters along µ and ν, serving to eliminate the copies almost completely. Comparison of the NN-interpolated image in Fig. 3-19(c) with the original image in part (a) of the figure leads to the conclusion that the NN technique works quite well for images sampled at or above the Nyquist rate.

Reconstruction Example 2: Image Sampled below the Nyquist Rate

A. LPF Reconstruction

The sequence in this example (Fig. 3-20) is identical with that described earlier in Example 1A, except for one very important difference: in the present case the sampling rate is S = 2.5 samples/mm, which is one-half of the Nyquist rate. Consequently, the final reconstructed image in Fig. 3-20(e) bears a poor resemblance to the original image in part (a) of the figure.
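The behavior of the sinc interpolation formula of Eq. (3.62) can be checked numerically. The sketch below (pure Python; the cosine test image, the rate S = 4 samples/unit, and the truncation half-width L = 200 are our own illustrative choices, not values from the text) samples a function bandlimited to B = 1 above its Nyquist rate and reconstructs it at an off-grid point with a truncated sinc sum:

```python
import math

def sinc(t):
    # sin(pi t)/(pi t), with the removable singularity handled at t = 0
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def f(x, y):
    # test image, bandlimited to B = 1 cycle/unit along each direction
    return math.cos(2 * math.pi * x) * math.cos(2 * math.pi * y)

S = 4.0          # sampling rate, S > 2B, so the Nyquist condition holds
delta = 1.0 / S  # sampling interval
L = 200          # half-width of the truncated sum (the exact formula needs infinitely many terms)

def reconstruct(x, y):
    # truncated version of Eq. (3.62)
    total = 0.0
    for n in range(-L, L + 1):
        wx = sinc(S * (x - n * delta))
        for m in range(-L, L + 1):
            total += f(n * delta, m * delta) * wx * sinc(S * (y - m * delta))
    return total

x0, y0 = 0.3, 0.2
approx = reconstruct(x0, y0)
exact = f(x0, y0)
```

Because the sum must be truncated, the reconstruction is only approximate; the error shrinks as L grows, which is exactly why the text calls exact sinc interpolation impractical.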

Figure 3-18 Reconstruction Example 1A: After sampling image f (x, y) in (a) at S = 5 samples/mm to generate fs (x, y) in (c), the sampled image is Fourier transformed [(c) to (d)], then lowpass-filtered [(d) to (f)] to remove copies of the central spectrum, and inverse Fourier transformed [(f) to (e)] to generate the reconstructed image frec (x, y). Panels: (a) bandlimited image f (x, y); (b) spectrum F(µ, ν) of image f (x, y); (c) Nyquist-sampled image fs (x, y); (d) spectrum Fs (µ, ν) of sampled image; (e) LPF reconstructed image frec (x, y); (f) lowpass-filtered spectrum. All spectra are displayed in log scale.
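On a discrete grid, the convolution with rect(x/∆) rect(y/∆) in Eq. (3.63) amounts to replicating each sample over its ∆ × ∆ cell, which is why NN interpolation is often called pixel replication. A minimal sketch (pure Python; the 2 × 2 array and the upsampling factor are hypothetical):

```python
def nn_upsample(img, k):
    # replicate each pixel into a k-by-k block: the discrete equivalent of
    # convolving the sample grid with rect(x/Delta) rect(y/Delta), Eq. (3.63)
    out = []
    for row in img:
        wide = []
        for v in row:
            wide.extend([v] * k)   # replicate horizontally
        for _ in range(k):         # replicate vertically
            out.append(list(wide))
    return out

up = nn_upsample([[1, 2], [3, 4]], 2)
```

Each input pixel becomes a constant 2 × 2 block, giving the blocky look characteristic of NN-reconstructed images.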

B. NN Reconstruction

The sequence in Fig. 3-21 parallels the sequence in Fig. 3-19, except that in the present case we are working with the sub-Nyquist sampled image. As expected, the NN interpolation technique generates a poor-fidelity reconstruction, just like the LPF reconstructed version.

3-5.2 Hexagonal Sampling

The transformation defined by Eq. (3.48a) converts image f (r, θ)—expressed in terms of spatial polar coordinates (r, θ)—to its spectrum F(ρ, φ)—expressed in terms of radial frequency ρ and associated azimuth angle φ. Spectrum F(ρ, φ) is said to be radially bandlimited to radial frequency ρ0 if

F(ρ, φ) = 0 for ρ > ρ0 .

If f (r, θ) is sampled along a rectangular grid—the same as when sampling f (x, y) in rectangular coordinates—at a sampling spacing ∆rect (Fig. 3-22(a)) and corresponding sampling rate S = 1/∆rect such that S ≥ 2ρ0 (to satisfy the Nyquist rate), then the spectrum Fs (ρ, φ) of the sampled image fs (r, θ) would consist of a central disk of radius ρ0 , as shown in Fig. 3-22(b), plus additional copies at a spacing S along both the µ and ν directions. The term commonly used to describe the sampling in (x, y) space is tiling; the image space in Fig. 3-22(a) is tiled with square pixels.

Square tiling is not the only type of tiling used to sample 2-D images. A more efficient arrangement in terms of data rate (or total number of samples per image) is to tile the image space using hexagons instead of squares. Such an arrangement is shown in Fig. 3-23(a) and is called hexagonal sampling. The image space is tiled with hexagons. The spacing along y is unchanged (Fig. 3-23(a)), but the spacing along x has changed to

∆hex = (2/√3) ∆rect = 1.15 ∆rect .

The modestly wider spacing along x translates into fewer samples needed to tile the image, and more efficient utilization of the spatial frequency space (Fig. 3-23(b)).

Hexagonal sampling is integral to how the human vision system functions, in part because our photoreceptors are arranged along a hexagonal lattice. The same is true for other mammals as well.

Reconstruction of f (r, θ) from its hexagonal samples entails the application of a radial lowpass filter with cutoff frequency ρ0 to Fs (ρ, φ), followed by an inverse Fourier transformation (using Eq. (3.48b)) to the (r, θ) domain. A clown image example with hexagonal sampling at the Nyquist rate is shown in Fig. 3-24. Note that the clown image has been antialiased (lowpass-filtered prior to hexagonal sampling) so that the copies of the spectrum created by hexagonal sampling do not overlap in Fig. 3-24(d). This is why the clown image looks blurred, but the reconstructed clown image matches the original blurred image.

Figure 3-19 Reconstruction Example 1B: Nearest-neighbor (NN) interpolation for Nyquist-sampled image: (a) bandlimited image f (x, y); (b) Nyquist-sampled image fs (x, y); (c) NN reconstructed image frec (x, y). Sampling rate is S = 5 samples/mm.

Figure 3-20 Reconstruction Example 2A: Image f (x, y) is sampled at half the Nyquist rate (S = 2.5 samples/mm compared with 2B = 5 samples/mm): (a) bandlimited image f (x, y); (b) spectrum F(µ, ν); (c) sub-Nyquist sampled image fs (x, y); (d) spectrum Fs (µ, ν) of sampled image; (e) LPF reconstructed image frec (x, y); (f) lowpass-filtered spectrum. Consequently, the reconstructed image in (e) bears a poor resemblance to the original image in (a). All spectra are displayed in log scale.

Figure 3-21 Reconstruction Example 2B: Nearest-neighbor interpolation for sub-Nyquist-sampled image (S = 2.5 samples/mm): (a) bandlimited image f (x, y); (b) sub-Nyquist sampled image fs (x, y); (c) NN reconstructed image frec (x, y).

Figure 3-22 (a) Square tiling at a spacing ∆rect and a sampling rate S = 1/∆rect ≥ 2ρ0 , and (b) corresponding spectrum Fs (ρ, φ) for an image radially bandlimited to spatial frequency ρ0 .
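A quick check of the sampling-density claim above: keeping the y spacing and widening the x spacing to ∆hex = (2/√3)∆rect reduces the number of samples per unit area by the factor √3/2 ≈ 0.866, i.e., roughly 13% fewer samples. This back-of-the-envelope sketch (pure Python, under the density argument just stated) makes the numbers explicit:

```python
import math

d_rect = 1.0                  # square-grid spacing (arbitrary units)
d_hex = 2.0 / math.sqrt(3.0)  # widened x-spacing: 2/sqrt(3) ~ 1.15 (Section 3-5.2)

# samples per unit area: square grid has 1/d_rect^2; the hexagonal grid keeps
# the y-spacing d_rect and widens the x-spacing to d_hex, so its density is
# 1/(d_hex * d_rect)
ratio = (1.0 / (d_hex * d_rect)) / (1.0 / d_rect ** 2)
savings = 1.0 - ratio         # fraction of samples saved by hexagonal tiling
```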
Figure 3-23 (a) Hexagonal tiling of f (x, y), with spacing ∆rect along y and ∆hex along x, and (b) corresponding spectrum Fs (ρ, φ).

Concept Question 3-5: Why is the sampling theorem important in image processing?

Exercise 3-9: An image is spatially bandlimited to 10 cycles/mm in both the x and y directions. What is the maximum sampling length ∆s allowed in order to avoid aliasing?

Answer: ∆s < 1/(2B) = 1/20 mm.

Exercise 3-10: The 2-D CSFT of a 2-D impulse is 1, so it is not spatially bandlimited. Why is it possible to sample an impulse?

Answer: In general, it isn't. Using a sampling interval of ∆s , the impulse δ(x − x0 , y − y0 ) will be missed unless x0 and y0 are both integer multiples of ∆s .

3-6 2-D Discrete Space

3-6.1 Discrete-Space Images

A discrete-space image represents a physical quantity that varies with discrete space [n, m], where n and m are dimensionless integers. Such an image usually is generated by sampling a continuous-space image f (x, y) at a spatial interval ∆s along the x and y directions. The sampled image is defined by

f [n, m] = f (n∆s , m∆s ). (3.65)

The spatial sampling rate is Ss = 1/∆s , [n, m] denotes the location of a pixel (picture element), and f [n, m] denotes the value (such as image intensity) of that pixel.

A. Image Axes

As noted earlier in connection with continuous-space images, multiple different formats are used in both continuous- and discrete-space to define image coordinates. We illustrate the most common of these formats in Fig. 3-25. In the top of the figure, we show pixel values for a 10 × 10 array. In parts (a)

Figure 3-24 Hexagonal sampling and reconstruction example: (a) radially bandlimited image f (r, θ); (b) spectrum F(ρ, φ) of image f (r, θ); (c) hexagonally sampled image fs (r, θ), sampled at 2ρ0 ; (d) spectrum Fs (ρ, φ) of sampled image; (e) radial LPF reconstructed image frec (r, θ); (f) radial lowpass-filtered spectrum.



Pixel values of the 10 × 10 array shown at the top of Fig. 3-25:

0  0  0  0  0  0  0  0  0  0
0  3  5  7  9  10 11 12 13 14
0  5  10 14 17 20 23 25 27 29
0  8  15 21 26 30 34 37 40 43
0  10 20 27 34 40 45 50 54 57
0  10 20 27 34 40 45 50 54 57
0  8  15 21 26 30 34 37 40 43
0  5  10 14 17 20 23 25 27 29
0  3  5  7  9  10 11 12 13 14
0  0  0  0  0  0  0  0  0  0

Figure 3-25 The four color images are identical in pixel values, but they use different formats for the location of the origin and for coordinate directions for image f [n, m]: (a) top-left corner format; (b) bottom-left corner format; (c) center-of-image format; (d) MATLAB format. The MATLAB format in (d) represents X(m′, n′).

through (d), we display the same color maps corresponding to the pixel array, except that the [n, m] coordinates are defined differently, namely:

Top-left corner format

Figure 3-25(a): [n, m] starts at [0, 0] and both integers extend to 9, the origin is located at the upper left-hand corner, m increases downward, and n increases to the right. For an (M × N) image, f [n, m] is defined as

{ f [n, m], 0 ≤ n ≤ N − 1, 0 ≤ m ≤ M − 1 } (3.66a)

or, equivalently,

f [n, m] =
[ f [0, 0]      f [1, 0]      . . .  f [N − 1, 0]     ]
[ f [0, 1]      f [1, 1]      . . .  f [N − 1, 1]     ]
[    ⋮             ⋮                    ⋮             ]
[ f [0, M − 1]  f [1, M − 1]  . . .  f [N − 1, M − 1] ]. (3.66b)

Bottom-left corner format

Figure 3-25(b): [n, m] starts at [0, 0] and both integers extend to 9, the origin is located at the bottom left-hand corner, m increases upward, and n increases to the right. This is a vertically flipped version of the image in Fig. 3-25(a), and f [n, m] has the same definition given by Eq. (3.66a).

Center-of-image format

Figure 3-25(c): The axes directions are the same as in the image of Fig. 3-25(b), except that the origin of the coordinate system is now located at the center of the image. The ranges of n and m depend on whether M and N are odd or even integers. If pixel [0, 0] is to be located in the center in the [n, m] coordinate system, then the range of m is

−(M − 1)/2 ≤ m ≤ (M − 1)/2, if M is odd, (3.67a)
−M/2 ≤ m ≤ M/2 − 1, if M is even. (3.67b)

A similar definition applies to index n.

MATLAB format

In MATLAB, an (M × N) image is defined as

{ X(m′, n′), 1 ≤ m′ ≤ M, 1 ≤ n′ ≤ N }. (3.68)

MATLAB uses the top-left corner format, except that its indices n′ and m′ start at 1 instead of 0. Thus, the top-left corner is (1, 1) instead of [0, 0]. Also, m′, the first index in X(m′, n′), represents the vertical axis and the second index, n′, represents the horizontal axis, which is the reverse of the index notation represented in f [n, m]. The two notations are related as follows:

f [n, m] = X(m′, n′), (3.69a)

with

m′ = m + 1, (3.69b)
n′ = n + 1. (3.69c)

Common-Image vs. MATLAB Format

The format used by MATLAB to store images is different from the conventional matrix format given in Eq. (3.66b) in two important ways:

(1) Whereas the pixel at the upper left-hand corner of image f [n, m] is f [0, 0], at location (0, 0), that pixel is denoted X(1, 1) in MATLAB.

(2) In f [n, m], the value of n denotes the column that f [n, m] resides within (relative to the left-most column, which is denoted by n = 0), and m denotes the row that f [n, m] resides within (relative to the top row, which is denoted by m = 0). In MATLAB, the notation is reversed: the first index of X(m′, n′) denotes the row and the second index denotes the column. Hence,

f [n, m] = X(m′, n′)

with m′ and n′ related to n and m by Eq. (3.69). To distinguish between the two formats, f [n, m] uses square brackets whereas X(m′, n′) uses curved brackets.
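The index conversion of Eq. (3.69) can be exercised directly. In the sketch below (pure Python; MATLAB's 1-based array is emulated with a dictionary keyed by (m′, n′), and the (3 × 3) array that appears later in Section 3-9 is used as data):

```python
# the (3 x 3) image in common-image format f[n, m]:
# n is the column index, m is the row index, both starting at 0,
# so rows[m][n] holds f[n, m]
rows = [[3, 1, 4],
        [1, 5, 9],
        [2, 6, 5]]
f = {(n, m): rows[m][n] for m in range(3) for n in range(3)}

def matlab_X(f):
    # Eq. (3.69): X(m', n') = f[n, m] with m' = m + 1, n' = n + 1
    # (1-based indices, row index first)
    return {(m + 1, n + 1): v for (n, m), v in f.items()}

X = matlab_X(f)
```

The same value is reached by either path: f [2, 1] (column 2, row 1) and X(2, 3) (row 2, column 3) both name the entry 9.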

◮ From here on forward, the top-left-corner format will be used for images, the center-of-image format will be used for image spectra and PSFs, and the MATLAB format will be used in MATLAB arrays. ◭

B. Impulses and Shifts

In 2-D discrete-space, impulse δ[n, m] is defined as

δ[n, m] = δ[n] δ[m] = { 1 if n = m = 0; 0 otherwise }. (3.70)

In upper-left-corner image format, shifting an image f [n, m] by m0 downward and n0 rightward generates image f [n − n0 , m − m0 ]. An example is shown in Fig. 3-26.

Image Size

As noted earlier in Chapter 1, the size of an image with M rows and N columns is denoted by (# of rows × # of columns) = M × N.

3-6.2 Discrete-Space Systems

For discrete-space systems, linearity and shift invariance follow analogously from their continuous-space counterparts. When a linear shift-invariant (LSI) system characterized by a point spread function h[n, m] is subjected to an input f [n, m], it generates an output g[n, m] given by the 2-D convolution of f [n, m] and h[n, m]:

g[n, m] = h[n, m] ∗∗ f [n, m] = Σ (i = −∞ to ∞) Σ (j = −∞ to ∞) h[i, j] f [n − i, m − j]. (3.71)

Symbolically, the 2-D convolution is represented by

f [n, m] → h[n, m] → g[n, m].

The properties of 1-D convolution are equally applicable in 2-D discrete-space. Convolution of images of sizes (L1 × L2 ) and (M1 × M2 ) yields an image of size (N1 × N2 ), where

N1 = L1 + M1 − 1, (3.72a)
N2 = L2 + M2 − 1. (3.72b)

The 2-D convolution process is illustrated next through a simple example.

Example 3-1: 2-D Convolution

Compute the 2-D convolution

[ 1 2 ]    [ 5 6 ]
[ 3 4 ] ∗∗ [ 7 8 ].

Solution: By using entries in the first image as weights, we have

  [ 5 6 0 ]     [ 0 5 6 ]     [ 0 0 0 ]     [ 0 0 0 ]   [ 5  16 12 ]
1 [ 7 8 0 ] + 2 [ 0 7 8 ] + 3 [ 5 6 0 ] + 4 [ 0 5 6 ] = [ 22 60 40 ].
  [ 0 0 0 ]     [ 0 0 0 ]     [ 7 8 0 ]     [ 0 7 8 ]   [ 21 52 32 ]

Concept Question 3-6: Why do we bother studying discrete-space images and systems, when almost all real-world systems and images are defined in continuous space?

Exercise 3-11: If g[n, m] = h[n, m] ∗∗ f [n, m], what is 4h[n, m] ∗∗ f [n − 3, m − 2] in terms of g[n, m]?

Answer: 4g[n − 3, m − 2], using the shift and scaling properties of 1-D convolutions.
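Eq. (3.71) translates directly into code. The sketch below (pure Python, nested lists; the function name conv2d is our own) implements the direct double sum and reproduces both the result of Example 3-1 and the output-size rule of Eq. (3.72):

```python
def conv2d(h, f):
    # direct 2-D linear convolution of two nested-list images, Eq. (3.71);
    # output size is (rows_h + rows_f - 1) x (cols_h + cols_f - 1), Eq. (3.72)
    hr, hc = len(h), len(h[0])
    fr, fc = len(f), len(f[0])
    out = [[0] * (hc + fc - 1) for _ in range(hr + fr - 1)]
    for i in range(hr):
        for j in range(hc):
            for p in range(fr):
                for q in range(fc):
                    out[i + p][j + q] += h[i][j] * f[p][q]
    return out

g = conv2d([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Each entry h[i][j] scales a shifted copy of f, exactly the "entries of the first image as weights" view used in Example 3-1.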

Figure 3-26 Image f [n − 2, m − 1] is image f [n, m] shifted down by 1 and to the right by 2. (Panels show image f [n, m], with row 5 and column 3 of f [n, m] highlighted, and the shifted image f [n − 2, m − 1].)

Exercise 3-12: Compute

[ 1 1 ]    [ 1 1 ]
[ 1 1 ] ∗∗ [ 1 1 ].

Answer:

[ 1 2 1 ]
[ 2 4 2 ].
[ 1 2 1 ]

3-7 2-D Discrete-Space Fourier Transform (DSFT)

The 2-D discrete-space Fourier transform (DSFT) is obtained via direct generalization of the 1-D DTFT (Section 2-6) to 2-D. The DSFT consists of a DTFT applied first along m and then along n, or vice versa. By extending the 1-D DTFT definition given by Eq. (2.73a) (as well as the properties listed in Table 2-7) to 2-D, we obtain the following definition for the DSFT F(Ω1 , Ω2 ) and its inverse f [n, m]:

F(Ω1 , Ω2 ) = Σ (n = −∞ to ∞) Σ (m = −∞ to ∞) f [n, m] e^(−j(Ω1 n + Ω2 m)), (3.73a)

f [n, m] = (1/4π²) ∫ (−π to π) ∫ (−π to π) F(Ω1 , Ω2 ) e^(j(Ω1 n + Ω2 m)) dΩ1 dΩ2 . (3.73b)

The properties of the DSFT are direct 2-D generalizations of the properties of the DTFT, and discrete-space generalizations of the properties of the 2-D continuous-space Fourier transform. Conjugate symmetry for real-valued images f [n, m] implies that

F∗(Ω1 , Ω2 ) = F(−Ω1 , −Ω2 ). (3.74)

As in the 2-D continuous-space Fourier transform, F(Ω1 , Ω2 ) must be reflected across both spatial frequency axes to produce its complex conjugate. The DSFT is doubly periodic in (Ω1 , Ω2 ) with periods 2π along each axis, as demonstrated by Example 3-2.

◮ The spectrum of an image f [n, m] is its DSFT F(Ω1 , Ω2 ). The discrete-space frequency response H(Ω1 , Ω2 ) of an LSI system is the DSFT of its point spread function (PSF) h[n, m]. ◭

Example 3-2: DSFT of Clown Image

Use MATLAB to obtain the magnitude image of the DSFT of the clown image.

Solution: The magnitude part of the DSFT is displayed in Fig. 3-27. As expected, the spectrum is periodic with period 2π along both Ω1 and Ω2 .

Concept Question 3-7: What is the DSFT used for? Give three applications.
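The 2π-periodicity and the conjugate-symmetry relation of Eq. (3.74) can be verified numerically for a small, finitely supported image. A pure-Python sketch (the image values and the test frequencies below are arbitrary choices of ours):

```python
import cmath
import math

def dsft(f, w1, w2):
    # Eq. (3.73a) for a finitely supported image; f is a dict {(n, m): value},
    # so the doubly infinite sum reduces to the support of f
    return sum(v * cmath.exp(-1j * (w1 * n + w2 * m)) for (n, m), v in f.items())

f = {(0, 0): 3, (1, 0): 1, (2, 0): 4,
     (0, 1): 1, (1, 1): 5, (2, 1): 9}   # a small real-valued image

w1, w2 = 0.7, -1.3
a = dsft(f, w1, w2)
b = dsft(f, w1 + 2 * math.pi, w2)  # periodicity: period 2*pi in each variable
c = dsft(f, -w1, -w2)              # conjugate symmetry, Eq. (3.74)
```

Since n and m are integers, shifting either frequency by 2π leaves every term e^(−j(Ω1 n + Ω2 m)) unchanged, which is the source of the double periodicity.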

Figure 3-27 DSFT magnitude of clown image (log scale).

Exercise 3-13: An LSI system has a PSF given by

          [ 1 2 1 ]
h[n, m] = [ 2 4 2 ].
          [ 1 2 1 ]

Compute its spatial frequency response H(Ω1 , Ω2 ). Hint: h[n, m] is separable.

Answer: The DTFT was defined in Eq. (2.73). Recognizing that h[n, m] = h1 [n] h1 [m] with h1 [n] = {1, 2, 1}, the DSFT is the product of two 1-D DTFTs, each of the form

H1 (Ω) = e^(jΩ) + 2 + e^(−jΩ) = 2 + 2 cos(Ω),

and therefore

H(Ω1 , Ω2 ) = [2 + 2 cos(Ω1 )][2 + 2 cos(Ω2 )].

3-8 2-D Discrete Fourier Transform (2-D DFT)

According to Eq. (3.73), while f [n, m] is a discrete function, its DSFT F(Ω1 , Ω2 ) is a continuous function of Ω1 and Ω2 . Numerical computation using the fast Fourier transform (FFT) requires an initial step of converting F(Ω1 , Ω2 ) into discrete format. That conversion is called the discrete Fourier transform (DFT). With the DFT, both f [n, m] and its 2-D Fourier transform operate in discrete domains. For an (M × N) image f [n, m], generalizing Eq. (2.89) to 2-D leads to the 2-D DFT of order (K2 × K1 ):

F[k1 , k2 ] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] exp(−j2π( nk1 /K1 + mk2 /K2 )),
k1 = { 0, . . . , K1 − 1 }, k2 = { 0, . . . , K2 − 1 }, (3.75)

where we have converted (Ω1 , Ω2 ) into discrete indices (k1 , k2 ) by setting

Ω1 = (2π/K1 ) k1 and Ω2 = (2π/K2 ) k2 .

Array F[k1 , k2 ] is given by the (K2 × K1 ) array

F[k1 , k2 ] =
[ F[0, 0]       F[1, 0]       . . .  F[K1 − 1, 0]      ]
[ F[0, 1]       F[1, 1]       . . .  F[K1 − 1, 1]      ]
[    ⋮              ⋮                    ⋮              ]
[ F[0, K2 − 1]  F[1, K2 − 1]  . . .  F[K1 − 1, K2 − 1] ]. (3.76)

Note that the indexing is the same as that used for f [n, m]. The inverse DFT is

f [n, m] = (1/(K1 K2 )) Σ (k1 = 0 to K1 − 1) Σ (k2 = 0 to K2 − 1) F[k1 , k2 ] exp( j2π( nk1 /K1 + mk2 /K2 )),
n = { 0, . . . , N − 1 }, m = { 0, . . . , M − 1 }. (3.77)

◮ Note that if N < K1 and M < K2 , then the reconstructed image f [n, m] = 0 for N ≤ n ≤ K1 − 1 and M ≤ m ≤ K2 − 1. ◭

3-8.1 Properties of the 2-D DFT

The 2-D DFT, like the 2-D CSFT and 2-D DSFT, consists of a 1-D transform along either the horizontal or vertical direction, followed by another 1-D transform along the other direction. Accordingly, the 2-D DFT has the properties listed in Table 3-3, which are direct generalizations of the 1-D DFT properties listed earlier in Table 2-9.

The cyclic convolution h[n] ⊛ x[n] was defined in Eq. (2.104). The 2-D DFT maps 2-D cyclic convolutions to products, and

Table 3-3 Properties of the (K2 × K1 ) 2-D DFT. In the shift and modulation properties, (k1 − k1′ ) and (n − n0 ) must be reduced mod(K1 ), and (k2 − k2′ ) and (m − m0 ) must be reduced mod(K2 ).

Selected Properties
1. Linearity: Σ ci fi [n, m] ↔ Σ ci Fi [k1 , k2 ]
2. Shift: f [(n − n0 ), (m − m0 )] ↔ e^(−j2πk1 n0 /K1 ) e^(−j2πk2 m0 /K2 ) F[k1 , k2 ]
3. Modulation: e^(j2πk1′ n/K1 ) e^(j2πk2′ m/K2 ) f [n, m] ↔ F[(k1 − k1′ ), (k2 − k2′ )]
4. Reversal: f [(N − n), (M − m)] ↔ F[(K1 − k1 ), (K2 − k2 )]
5. Convolution (cyclic): h[n, m] ⊛⊛ f [n, m] ↔ H[k1 , k2 ] F[k1 , k2 ]

Special DFT Relationships
6. Conjugate symmetry for f [n, m] real: F∗[k1 , k2 ] = F[(K1 − k1 ), (K2 − k2 )]
7. Zero spatial frequency: F[0, 0] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m]
8. Spatial origin: f [0, 0] = (1/(K1 K2 )) Σ (k1 = 0 to K1 − 1) Σ (k2 = 0 to K2 − 1) F[k1 , k2 ]
9. Rayleigh's theorem: Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) | f [n, m]|² = (1/(K1 K2 )) Σ (k1 = 0 to K1 − 1) Σ (k2 = 0 to K2 − 1) |F[k1 , k2 ]|²

linear 2-D convolutions h[n, m] ∗∗ f [n, m] can be zero-padded to cyclic convolutions, just as in 1-D.

Concept Question 3-8: Why is the 2-D DFT defined only over a finite region, while the DSFT is defined over all spatial frequency space?

Exercise 3-14: Compute an expression for the (256 × 256) 2-D DFT of

[ 1 2 1 ]
[ 2 4 2 ].
[ 1 2 1 ]

Use the result of Exercise 3-13.

Answer: The 2-D DFT is the DSFT sampled at Ωi = 2πki /256 for i = 1, 2. Substituting in the answer to Exercise 3-13 gives

X[k1 , k2 ] = [2 + 2 cos(2πk1 /256)][2 + 2 cos(2πk2 /256)], 0 ≤ k1 , k2 ≤ 255.

3-8.2 Conjugate Symmetry for the 2-D DFT

Given an (M × N) image f [n, m], the expression given by Eq. (3.75) allows us to compute the 2-D DFT of f [n, m] for any order (K2 × K1 ). If f [n, m] is real, then conjugate symmetry holds:

F∗[k1 , k2 ] = F[K1 − k1 , K2 − k2 ], 1 ≤ k1 ≤ K1 − 1; 1 ≤ k2 ≤ K2 − 1. (3.78)

[Compare this statement with the conjugate symmetry of the 1-D DFT X[k] of a real-valued signal x[n], as given by Eq. (2.98).]

The conjugate-symmetry relation given by Eq. (3.78) states that an array element F[k1 , k2 ] is equal to the complex conjugate of array element F[K1 − k1 , K2 − k2 ], and vice versa. To demonstrate the validity of conjugate symmetry, we start by rewriting Eq. (3.75) with k1 and k2 replaced with (K1 − k1 ) and


(K2 − k2 ), respectively:

F[K1 − k1 , K2 − k2 ]
= Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] exp(−j2π( n(K1 − k1 )/K1 + m(K2 − k2 )/K2 ))
= Σ Σ f [n, m] exp( j2π( nk1 /K1 + mk2 /K2 )) exp(−j2π( nK1 /K1 + mK2 /K2 ))
= Σ Σ f [n, m] exp( j2π( nk1 /K1 + mk2 /K2 )) e^(−j2π(n + m))
= Σ Σ f [n, m] exp( j2π( nk1 /K1 + mk2 /K2 )), (3.79)

where we used e^(−j2π(n + m)) = 1 because n and m are integers. The expression on the right-hand side of Eq. (3.79) is identical to the expression for F[k1 , k2 ] given by Eq. (3.75) except for the minus sign ahead of j. Hence, for a real-valued image f [n, m],

F∗[k1 , k2 ] = F[K1 − k1 , K2 − k2 ], 1 ≤ k1 ≤ K1 − 1; 1 ≤ k2 ≤ K2 − 1, (3.80)

where (K2 × K1 ) is the order of the 2-D DFT.

3-8.3 Special Cases

A. f [n, m] is real

If f [n, m] is a real-valued image, the following special cases hold:

(1) F[0, 0] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] is real-valued.

(2) F[0, k2 ] = Σ (m = 0 to M − 1) Σ (n = 0 to N − 1) f [n, m] e^(−j(2π/K2 )mk2 ), which is the K2 -point 1-D DFT of Σ (n = 0 to N − 1) f [n, m].

(3) F[k1 , 0] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] e^(−j(2π/K1 )nk1 ), which is the K1 -point 1-D DFT of Σ (m = 0 to M − 1) f [n, m].

B. f [n, m] is real and K1 and K2 are even

If also K1 and K2 are even, then the following relations apply:

(4) F[K1 /2, k2 ] = Σ (m = 0 to M − 1) Σ (n = 0 to N − 1) f [n, m] (−1)^n e^(−j(2π/K2 )mk2 ), which is the K2 -point 1-D DFT of Σ (n = 0 to N − 1) f [n, m] (−1)^n, because e^(−j(2π/K1 )n(K1 /2)) = e^(−jπn) = (−1)^n.

(5) F[k1 , K2 /2] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] (−1)^m e^(−j(2π/K1 )nk1 ), which is the K1 -point 1-D DFT of Σ (m = 0 to M − 1) f [n, m] (−1)^m, because e^(−j(2π/K2 )m(K2 /2)) = e^(−jπm) = (−1)^m.

(6) F[K1 /2, K2 /2] = Σ (n = 0 to N − 1) Σ (m = 0 to M − 1) f [n, m] (−1)^(m + n).

3-9 Computation of the 2-D DFT Using MATLAB

We remind the reader that the notation used in this book represents images f [n, m] defined in Cartesian coordinates, with the origin at the upper left corner, the first element n of the coordinates [n, m] increasing horizontally rightward from the origin, and the second element m of the coordinates [n, m] increasing vertically downward from the origin. To illustrate with an example, let us consider the (3 × 3) image given by

          [ f [0, 0]  f [1, 0]  f [2, 0] ]   [ 3 1 4 ]
f [n, m] = [ f [0, 1]  f [1, 1]  f [2, 1] ] = [ 1 5 9 ]. (3.81)
          [ f [0, 2]  f [1, 2]  f [2, 2] ]   [ 2 6 5 ]

When stored in MATLAB as array X(m′, n′), the content remains the same, but the indices swap roles and their values start at (1, 1):

           [ X(1, 1)  X(1, 2)  X(1, 3) ]   [ 3 1 4 ]
X(m′, n′) = [ X(2, 1)  X(2, 2)  X(2, 3) ] = [ 1 5 9 ]. (3.82)
           [ X(3, 1)  X(3, 2)  X(3, 3) ]   [ 2 6 5 ]

Arrays f [n, m] and X(m′, n′) are displayed in Fig. 3-28. Application of Eq. (3.75) with N = M = 3 and K1 = K2 = 3 to the 3 × 3 image defined by Eq. (3.81) leads to

            [ 36          −9 + j5.2    −9 − j5.2  ]
F[k1 , k2 ] = [ −6 − j1.7   9 + j3.5     1.5 + j0.9 ]. (3.83)
            [ −6 + j1.7   1.5 − j0.9   9 − j3.5   ]

◮ In MATLAB, the command FX=fft2(X,M,N) computes the (M × N) 2-D DFT of array X and stores it in array FX. ◭

Figure 3-28 In common-image format, application of the 2-D DFT to image f [n, m] generates F[k1 , k2 ]. Upon shifting F[k1 , k2 ] along k1 and k2 to center the image, we obtain the center-of-image format represented by Fc [k1c , k2c ]. The corresponding sequence in MATLAB starts with X(m′, n′), proceeds through fft2(X), and concludes with fftshift(fft2(X)), stored as FXC(k2c′, k1c′). The (3 × 3) image is

[ 3 1 4 ]
[ 1 5 9 ],
[ 2 6 5 ]

its 2-D DFT (F[k1 , k2 ] in common-image format, identical in content to MATLAB's FX(k2′, k1′)) is

[ 36          −9 + j5.2    −9 − j5.2  ]
[ −6 − j1.7   9 + j3.5     1.5 + j0.9 ],
[ −6 + j1.7   1.5 − j0.9   9 − j3.5   ]

and the shifted, center-of-image version Fc [k1c , k2c ] (MATLAB's FXC) is

[ 9 − j3.5    −6 + j1.7   1.5 − j0.9 ]
[ −9 − j5.2   36          −9 + j5.2  ].
[ 1.5 + j0.9  −6 − j1.7   9 + j3.5   ]

The corresponding array in MATLAB, designated FX(k2′, k1′) and displayed in Fig. 3-28, has the same content but with MATLAB indices (k2′, k1′). Also, k2′ increases downward and k1′ increases horizontally. The relationships between MATLAB indices (k2′, k1′) and common-image format indices (k1 , k2 ) are identical in form to those given by Eq. (3.69), namely

k2′ = k2 + 1, (3.84)
k1′ = k1 + 1. (3.85)

3-9.1 Center-of-Image Format

In some applications, it is more convenient to work with the 2-D DFT array when arranged in a center-of-image format (Fig. 3-25(c) but with the vertical axis pointing downward) than in the top-left corner format. To convert array F[k1 , k2 ] to a center-of-image format, we need to shift the array elements to the right and downward by an appropriate number of steps so as to locate F[0, 0] in the center of the array. If we denote the 2-D

DFT in the center-of-image format as Fc [k1c , k2c ], then its index k1c extends over the range

−(Ki − 1)/2 ≤ k1c ≤ (Ki − 1)/2, for Ki odd, (3.86)

and

−Ki /2 ≤ k1c ≤ Ki /2 − 1, for Ki even. (3.87)

To obtain Fc [k1c , k2c ] from F[k1 , k2 ] for the array given by Eq. (3.83), we circularly shift the array by one unit to the right and one unit downward, which yields

Fc [k1c , k2c ] =
[ Fc [−1, 1]  = 9 − j3.5     Fc [0, 1]  = −6 + j1.7    Fc [1, 1]  = 1.5 − j0.9 ]
[ Fc [−1, 0]  = −9 − j5.2    Fc [0, 0]  = 36           Fc [1, 0]  = −9 + j5.2  ]. (3.88)
[ Fc [−1, −1] = 1.5 + j0.9   Fc [0, −1] = −6 − j1.7    Fc [1, −1] = 9 + j3.5   ]

◮ In MATLAB, the command FXC=fftshift(FX) shifts array FX to center-of-image format and stores it in array FXC. ◭

In the general case for any integers K1 and K2 , transforming the 2-D DFT F[k1 , k2 ] into the center-of-image format Fc [k1c , k2c ] entails the following recipe. For

K′i = { Ki /2 − 1 if Ki is even; (Ki − 1)/2 if Ki is odd } (3.89)

and 0 ≤ k1c , k2c ≤ K′i :

(a) First Quadrant
Fc [k1c , k2c ] = F[k1 , k2 ], (3.90a)
with k1 = k1c and k2 = k2c ,

(b) Second Quadrant
Fc [−k1c , k2c ] = F[k1 , k2 ], (3.90b)
with k1 = K1 − k1c , k2 = k2c ,

(c) Third Quadrant
Fc [−k1c , −k2c ] = F[k1 , k2 ], (3.90c)
with k1 = K1 − k1c , k2 = K2 − k2c ,

(d) Fourth Quadrant
Fc [k1c , −k2c ] = F[k1 , k2 ], (3.90d)
with k1 = k1c , k2 = K2 − k2c .

To demonstrate the recipe, we use a numerical example with K1 = K2 = 3, and again for K1 = K2 = 4 because the recipe is different for odd and even integers.

3-9.2 Odd and Even Image Examples

A. N = M and K1 = K2 = odd

The (3 × 3) image shown in Fig. 3-28 provides an example of an (M × M) image with M being an odd integer. As noted earlier, when the 2-D DFT is displayed in the center-of-image format, the conjugate symmetry about the center of the array becomes readily apparent.

B. N = M and K1 = K2 = even

Let us consider the (4 × 4) image

          [ 1 2 3 4 ]
          [ 2 4 5 3 ]
f [n, m] = [ 3 4 6 2 ]. (3.91)
          [ 4 3 2 1 ]

The (4 × 4) 2-D DFT F[k1 , k2 ] of f [n, m], displayed in the upper-left corner format, is

    [ F[0, 0]  F[1, 0]  F[2, 0]  F[3, 0] ]   [ 49        −6 − j3   3         −6 + j3 ]
F = [ F[0, 1]  F[1, 1]  F[2, 1]  F[3, 1] ] = [ −5 − j4   2 + j9    −5 + j2   j       ]. (3.92)
    [ F[0, 2]  F[1, 2]  F[2, 2]  F[3, 2] ]   [ 1         −4 + j3   −1        −4 − j3 ]
    [ F[0, 3]  F[1, 3]  F[2, 3]  F[3, 3] ]   [ −5 + j4   −j        −5 − j2   2 − j9  ]

Application of the recipe given by Eq. (3.90) leads to

Fc(k1c, k2c)
= [ F′[−2,−2]  F′[−1,−2]  F′[0,−2]  F′[1,−2] ]
  [ F′[−2,−1]  F′[−1,−1]  F′[0,−1]  F′[1,−1] ]
  [ F′[−2,0]   F′[−1,0]   F′[0,0]   F′[1,0]  ]
  [ F′[−2,1]   F′[−1,1]   F′[0,1]   F′[1,1]  ]

= [ −1        −4 − j3   1         −4 + j3 ]
  [ −5 − j2   2 − j9    −5 + j4   −j      ]
  [ 3         −6 + j3   49        −6 − j3 ]          (3.93)
  [ −5 + j2   j         −5 − j4   2 + j9  ]

Now conjugate symmetry Fc[−k1c, −k2c] = Fc∗[k1c, k2c] applies, but only after omitting the first row and column of Fc[k1c, k2c]. This is because the dc value (49) is not at the center of the array. Indeed, there is no center pixel in an M × M array if M is even. It is customary in center-of-image depictions to place the origin at array coordinates [M/2 + 1, M/2 + 1]. The first row and column are not part of the mirror symmetry about the origin. This is not noticeable in M × M arrays if M is even and large.
Conjugate symmetry applies within the first row and within the first column, since these are 1-D DFTs with M/2 = 2.
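The shift recipe and the even-M symmetry just described can be verified numerically. The following Python sketch (standing in for the MATLAB fftshift workflow; the helper names dft2 and center_shift are ours) transforms the (4 × 4) image of Eq. (3.91) and checks both properties:

```python
import cmath

def dft2(f):
    """Direct (M x M) 2-D DFT, per Eq. (3.75)."""
    M = len(f)
    return [[sum(f[n][m] * cmath.exp(-2j * cmath.pi * (n * k1 + m * k2) / M)
                 for n in range(M) for m in range(M))
             for k2 in range(M)]
            for k1 in range(M)]

def center_shift(F):
    """Circular shift that moves F[0,0] to index [M//2, M//2] (what fftshift does)."""
    M = len(F)
    s = M // 2
    return [[F[(k1 - s) % M][(k2 - s) % M] for k2 in range(M)] for k1 in range(M)]

# The (4 x 4) image of Eq. (3.91)
f = [[1, 2, 3, 4],
     [2, 4, 5, 3],
     [3, 4, 6, 2],
     [4, 3, 2, 1]]
Fc = center_shift(dft2(f))

# The dc value (49) now sits at the customary origin [M/2 + 1, M/2 + 1]
# in 1-based MATLAB indexing, i.e., 0-based index [2, 2] ...
assert abs(Fc[2][2] - 49) < 1e-9
# ... and conjugate symmetry about that origin holds once the first
# row and column (0-based index 0) are omitted.
for a in (-1, 0, 1):
    for b in (-1, 0, 1):
        assert abs(Fc[2 + a][2 + b] - Fc[2 - a][2 - b].conjugate()) < 1e-9
```

The direct O(M⁴) transform is fine at this size; in MATLAB the same check is the one-liner FXC=fftshift(fft2(FX)) given above.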

Exercise 3-15: Why did this book not spend more space on
computing the DSFT?
Answer: Because in practice, the DSFT is computed using
the 2-D DFT.
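To make that answer concrete: the (M × M) 2-D DFT equals the DSFT sampled at Ω1 = 2πk1/M and Ω2 = 2πk2/M. A minimal Python check (helper functions are ours; the book's own code is in MATLAB):

```python
from math import pi
import cmath

def dsft(f, W1, W2):
    """DSFT of a finite-support image: F(W1, W2) = sum f[n,m] e^{-j(W1 n + W2 m)}."""
    return sum(f[n][m] * cmath.exp(-1j * (W1 * n + W2 * m))
               for n in range(len(f)) for m in range(len(f[0])))

def dft2(f):
    """(M x M) 2-D DFT computed directly from its definition."""
    M = len(f)
    return [[sum(f[n][m] * cmath.exp(-2j * pi * (n * k1 + m * k2) / M)
                 for n in range(M) for m in range(M))
             for k2 in range(M)]
            for k1 in range(M)]

f = [[1, 2], [3, 4]]
F = dft2(f)

# The DFT is the DSFT sampled at Omega_i = 2*pi*k_i / M
M = len(f)
for k1 in range(M):
    for k2 in range(M):
        assert abs(F[k1][k2] - dsft(f, 2 * pi * k1 / M, 2 * pi * k2 / M)) < 1e-12
```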

Summary
Concepts
• Many 2-D concepts are generalizations of 1-D counterparts. These include: LSI systems, convolution, sampling, 2-D continuous-space, 2-D discrete-space, and 2-D discrete Fourier transforms.
• The 2-D DSFT is doubly periodic in Ω1 and Ω2 with periods 2π.
• Rotating an image rotates its 2-D continuous-space Fourier transform (CSFT). The CSFT of a radially symmetric image is radially symmetric.
• Continuous-space images can be sampled to discrete-space images, on which discrete-space image processing can be performed.
• Nearest-neighbor interpolation often works well for interpolating sampled images to continuous space. Hexagonal sampling can also be used.
• Discrete-space images can be displayed in several different formats (the location of the origin differs).
• The response of an LSI system with point spread function h(x, y) to image f (x, y) is output g(x, y) = h(x, y) ∗∗ f (x, y), and similarly in discrete space.

Mathematical Formulae

Impulse:
δ(x, y) = δ(x) δ(y) = δ(r)/(πr)

Energy of f(x, y):
E = ∫_{−∞}^{∞} ∫_{−∞}^{∞} | f(x, y)|² dx dy

Convolution (continuous space):
h(x, y) ∗∗ f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(ξ, η) h(x − ξ, y − η) dξ dη

Convolution (discrete space):
h[n, m] ∗∗ f[n, m] = Σ_{i=−∞}^{∞} Σ_{j=−∞}^{∞} h[i, j] f[n − i, m − j]

Fourier transform (CSFT):
F(µ, ν) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) e^{−j2π(µx+νy)} dx dy

Inverse CSFT:
f(x, y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} F(µ, ν) e^{j2π(µx+νy)} dµ dν

Ideal square lowpass filter PSF:
h(x, y) = 4µ0ν0 sinc(2µ0 x) sinc(2ν0 y)

Ideal radial lowpass filter PSF:
h(r) = 4ρ0² jinc(2ρ0 r)

2-D sampling:
Sampling rate 1/∆ > 2B if F(µ, ν) = 0 for |µ|, |ν| > B

2-D sinc interpolation formula:
f(x, y) = Σ_{n=−∞}^{∞} Σ_{m=−∞}^{∞} f(n∆, m∆) sinc((x − n∆)/∆) sinc((y − m∆)/∆)

Discrete-space Fourier transform (DSFT):
F(Ω1, Ω2) = Σ_{n=−∞}^{∞} Σ_{m=−∞}^{∞} f[n, m] e^{−j(Ω1 n + Ω2 m)}

(K2 × K1) 2-D DFT of (M × N) image:
F[k1, k2] = Σ_{n=0}^{N−1} Σ_{m=0}^{M−1} f[n, m] e^{−j2π(nk1/K1 + mk2/K2)}

Inverse 2-D DFT:
f[n, m] = (1/(K1 K2)) Σ_{k1=0}^{K1−1} Σ_{k2=0}^{K2−1} F[k1, k2] e^{j2π(nk1/K1 + mk2/K2)}
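The DFT pair listed above can be checked numerically; a minimal Python sketch (helper names dft2/idft2 are ours, shown for illustration) verifies that the inverse 2-D DFT recovers a small image from its forward transform:

```python
from math import pi
import cmath

def dft2(f, K1, K2):
    """(K2 x K1) 2-D DFT of an (M x N) image, as defined in the table above."""
    N, M = len(f), len(f[0])          # f[n][m] with 0 <= n < N, 0 <= m < M
    return [[sum(f[n][m] * cmath.exp(-2j * pi * (n * k1 / K1 + m * k2 / K2))
                 for n in range(N) for m in range(M))
             for k2 in range(K2)]
            for k1 in range(K1)]

def idft2(F):
    """Inverse 2-D DFT, as defined in the table above."""
    K1, K2 = len(F), len(F[0])
    return [[sum(F[k1][k2] * cmath.exp(2j * pi * (n * k1 / K1 + m * k2 / K2))
                 for k1 in range(K1) for k2 in range(K2)) / (K1 * K2)
             for m in range(K2)]
            for n in range(K1)]

f = [[1, 5, 2],
     [0, 3, 4],
     [7, 1, 6]]
g = idft2(dft2(f, 3, 3))
# Forward followed by inverse recovers the image (to floating-point accuracy)
assert all(abs(g[n][m] - f[n][m]) < 1e-9 for n in range(3) for m in range(3))
```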

Important Terms Provide definitions or explain the meaning of the following terms: aliasing, convolution, CSFT, DFT, DSFT, FFT, linear shift-invariant (LSI), nearest-neighbor interpolation, point spread function, sampled image, sampling theorem, sinc function.

PROBLEMS

Section 3-2: 2-D Continuous-Space Images

3.1 Let f(x, y) be an annulus (ring) with inner radius 3 and outer radius 5, with center at the point (2,4). Express f(x, y) in terms of fDisk(x, y).

Section 3-3: Continuous-Space Systems

3.2 Compute the autocorrelation
r(x, y) = fBox(x, y) ∗∗ fBox(−x, −y)
of fBox(x, y).

Section 3-4: 2-D Continuous-Space Fourier Transform (CSFT)

3.3 Compute the 2-D CSFT F(µ, ν) of a 10 × 6 ellipse f(x, y).

3.4 A 2-D Gaussian function has the form
fg(r) = e^{−r²/(2σ²)} / (2πσ²).
Compute the 2-D CSFT Fg(ρ) of fg(r) using the scaling property of the 2-D CSFT.

3.5 Compute the CSFT of an annulus (ring) f(x, y) with inner radius 3 and outer radius 5, with center at:
(a) the origin (0,0);
(b) the point (2,4).

Section 3-5: 2-D Sampling Theorem

3.6 f(x, y) = cos(2π3x) cos(2π4y) is sampled every ∆ = 0.2. What is reconstructed by a brick-wall lowpass filter with cutoff = 5 in µ and ν and passband gain ∆ = 0.2?

3.7 f(x, y) = cos(2π12x) cos(2π15y) is sampled every ∆ = 0.1. What is reconstructed by a brick-wall lowpass filter with cutoff = 10 in µ and ν and passband gain ∆ = 0.1?

3.8 Antialias filtering: Run MATLAB program P38.m. This lowpass-filters the clown image before sampling it below its Nyquist rate. Explain why the image reconstructed from its samples has no aliasing.

3.9 Nearest-neighbor interpolation: Run MATLAB program P39.m. This samples the clown image and reconstructs from its samples using NN interpolation, but displays spectra at each stage. Explain why the image reconstructed from its samples matches the clown image.

Section 3-6: 2-D Discrete Space

3.10 Compute the 2-D convolution
[ 3 1 ]    [ 5 9 ]
[ 4 1 ] ∗∗ [ 2 6 ]
by hand. Check your answer using MATLAB's conv2.

3.11 An LTI system is described by the equation
g[n, m] = (1/9) f[n−1, m−1] + (1/9) f[n−1, m] + (1/9) f[n−1, m+1]
        + (1/9) f[n, m−1] + (1/9) f[n, m] + (1/9) f[n, m+1]
        + (1/9) f[n+1, m−1] + (1/9) f[n+1, m] + (1/9) f[n+1, m+1].
What is the PSF h[n, m] of the system? Describe in words what it does to its input.

3.12 An LTI system is described by the equation
g[n, m] = 9 f[n−1, m−1] + 8 f[n−1, m] + 7 f[n−1, m+1]
        + 6 f[n, m−1] + 5 f[n, m] + 4 f[n, m+1]
        + 3 f[n+1, m−1] + 2 f[n+1, m] + f[n+1, m+1].
What is the PSF h[n, m] of the system? Use center-of-image notation (Fig. 3-25(c)).

Section 3-7: Discrete-Space Fourier Transform (DSFT)

3.13 Prove the shift property of the DSFT:
f[n − a, m − b] ↔ e^{−j(aΩ1 + bΩ2)} F(Ω1, Ω2).

3.14 Compute the spatial frequency response of the system in Problem 3.11.

3.15 Compute the spatial frequency response of the system in Problem 3.12.

3.16 Show that the 2-D spatial frequency response of the PSF

h[m, n] = [ 1 2 1 ]
          [ 2 4 2 ]
          [ 1 2 1 ]

is close to circularly symmetric, making it a circularly symmetric lowpass filter, by:
(a) displaying the spatial frequency response as an image with dc at the center;
(b) using
cos(Ω) = 1 − Ω²/2! + Ω⁴/4! − ···
and neglecting all terms of degree four or higher.

Section 3-8: 2-D Discrete Fourier Transform (DFT)

3.17 Compute by hand the (2 × 2) 2-D DFT of

F = f[n, m] = [ 1 2 ]
              [ 3 4 ].

Check your answer in MATLAB using FX=fft2(F,2,2).

3.18 Prove that the (M × M) 2-D DFT of a separable image f[n, m] = f1[n] f2[m] is the product of the 1-D M-point DFTs of f1[n] and f2[m]: F[k1, k2] = F1[k1] F2[k2].

3.19 Show that we can extend the definition of the (M × M) 2-D DFT to negative values of k1 and k2 using
F[−k1, −k2] = F[M − k1, M − k2]
for indices 0 < k1, k2 < M. Conjugate symmetry for real f[n, m] is then
F[−k1, −k2] = F[M − k1, M − k2] = F∗[k1, k2].

3.20 Compute the 2-D cyclic convolution
y[n, m] = x1[n, m] ⊛⊛ x2[n, m],
where
x1[n, m] = [ 1 2 ]
           [ 3 4 ]
and
x2[n, m] = [ 5 6 ]
           [ 7 8 ]
using (2 × 2) 2-D DFTs.
Chapter 4
Image Interpolation

Contents
Overview, 129
4-1 Interpolation Using Sinc Functions, 129
4-2 Upsampling and Downsampling Modalities, 130
4-3 Upsampling and Interpolation, 133
4-4 Implementation of Upsampling Using 2-D DFT in MATLAB, 137
4-5 Downsampling, 140
4-6 Antialias Lowpass Filtering, 141
4-7 B-Splines Interpolation, 143
4-8 2-D Spline Interpolation, 149
4-9 Comparison of 2-D Interpolation Methods, 150
4-10 Examples of Image Interpolation Applications, 152
Problems, 156

Objectives

Learn to:
■ Use sinc and Lanczos functions to interpolate a bandlimited image.
■ Perform upsampling and interpolation using the 2-D DFT and MATLAB.
■ Perform downsampling for thumbnails using the 2-D DFT and MATLAB.
■ Use B-spline functions of various orders to interpolate non-bandlimited images.
■ Rotate, magnify, and morph images using interpolation.

In many image processing applications, such as magnification, thumbnails, rotation, morphing, and reconstruction from samples, it is necessary to interpolate (roughly, fill in gaps between given samples). This chapter presents three approaches to interpolation: using sinc or Lanczos functions; upsampling using the 2-D DFT; and use of B-splines. These methods are compared and used on each of the applications listed above.
Overview

Suppose some unknown signal x(t) had been sampled at a sampling rate S to generate a sampled signal x[n] = { x(n∆), n = . . . , −1, 0, 1, . . . }, sampled at times t = n∆, where ∆ = 1/S is the sampling interval. Interpolation entails the use of a recipe or formula to compute a continuous-time interpolated version xint(t) that takes on the given values { x(n∆) } and interpolates between them. In 1-D, interpolation is akin to connecting the dots, represented here by { x(n∆) }, to obtain xint(t). The degree to which the interpolated signal is identical to or a close rendition of the original signal x(t) depends on two factors:

(1) whether or not the sampling rate S used to generate xint(t) from x(t) satisfies the Nyquist criterion, namely S > 2B, where B is the maximum frequency in the spectrum of signal x(t), and

(2) the specific interpolation method used to obtain xint(t).

If the spectrum X(f) of x(t) is bandlimited to B and S > 2B, it should be possible to use the sinc interpolation formula given by Eq. (2.51) to reconstruct x(t) exactly. A simple example is illustrated by the finite-duration sinusoid shown in Fig. 4-1. In practice, however, it is computationally more efficient to perform the interpolation in the frequency domain by lowpass-filtering the spectrum of the sampled signal.

Oftentimes, the sampling rate does not satisfy the Nyquist criterion, as a consequence of which the signal obtained by applying the sinc interpolation formula may be an aliased version of x(t). In such cases, different interpolation methods should be used, several of which are examined in this chapter.

In 2-D, interpolation seeks to fill in the gaps between the image values f[n, m] = { f(n∆, m∆) } defined at discrete locations (x = n∆, y = m∆) governed by the spatial sampling rate S = Sx = Sy = 1/∆. In analogy with the 1-D case, if the spectrum F(µ, ν) of the original image f(x, y) is bandlimited to Bx = By = B and if the sampling rate satisfies the Nyquist criterion (i.e., S > 2B), then it should be possible to interpolate f[n, m] = f(n∆, m∆) to generate f(x, y) exactly.

This chapter explores several types of recipes for interpolating a sampled image f[n, m] = { f(n∆, m∆) } into a continuous-space image fint(x, y). Some of these recipes are extensions of the sinc interpolation formula, while others rely on the use of B-spline functions. As discussed later in Section 4-7, B-splines are polynomial functions that can be designed to perform nearest-neighbor, linear, quadratic, and cubic interpolation of images and signals. We will also explore how interpolation is used to realize image zooming (magnification), rotation, and warping.

4-1 Interpolation Using Sinc Functions

4-1.1 Sinc Interpolation Formula

An image f(x, y) is said to be bandlimited to a maximum spatial
frequency B if its spectrum F(µ, ν) is such that

F(µ, ν) = 0 for |µ|, |ν| ≥ B.

If such an image is sampled uniformly along x and y at sampling rates Sx = Sy = S = 1/∆, and the sampling interval ∆ satisfies the Nyquist rate, namely

∆ < 1/(2B),

then we know from the 2-D sampling theorem (Section 3-5) that f(x, y) can be reconstructed from its samples f[n, m] = { f(n∆, m∆) } using the sinc interpolation formula given by Eq. (3.62), which we repeat here as

f(x, y) = Σ_{n=−∞}^{∞} Σ_{m=−∞}^{∞} f(n∆, m∆) sinc(x/∆ − n) sinc(y/∆ − m),   (4.1)

where for any argument z, the sinc function is defined as

sinc(z) = sin(πz)/(πz).   (4.2)

Figure 4-1 1-D interpolation of samples of a sinusoid using the sinc interpolation formula. [Plot of a finite-duration sinusoid over 0 to 3.5 ms not reproduced.]
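Equations (4.1) and (4.2) can be exercised directly in 1-D, matching the setting of Fig. 4-1. A small Python sketch (the 2 Hz test signal, truncation length, and helper names are our own illustrative choices, not from the text):

```python
from math import pi, sin, cos

def sinc(z):
    """sinc(z) = sin(pi z) / (pi z), with sinc(0) = 1, per Eq. (4.2)."""
    return 1.0 if z == 0 else sin(pi * z) / (pi * z)

# Samples of a 2 Hz cosine taken every delta = 0.1 s (S = 10 > 2B = 4, so Nyquist holds)
delta = 0.1
x = lambda t: cos(2 * pi * 2 * t)

def x_interp(t, N=500):
    """Truncated 1-D sinc interpolation formula (the 1-D analogue of Eq. (4.1))."""
    return sum(x(n * delta) * sinc(t / delta - n) for n in range(-N, N + 1))

# At a sampling instant the formula returns the sample itself
# (only the n-th sinc is nonzero there) ...
assert abs(x_interp(0.3) - x(0.3)) < 1e-9
# ... and between samples it closely reproduces the bandlimited original.
assert abs(x_interp(0.05) - x(0.05)) < 1e-2
```

The residual error between samples comes from truncating the infinite sum, which is the motivation for the Lanczos window discussed next.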

In reality, an image is finite in size, and so is the number of samples { f(n∆, m∆) }. Consequently, for a square image, the ranges of indices n and m are limited to finite lengths M, in which case the infinite sums in Eq. (4.1) become finite sums:

fsinc(x, y) = Σ_{n=0}^{M−1} Σ_{m=0}^{M−1} f(n∆, m∆) sinc(x/∆ − n) sinc(y/∆ − m).   (4.3)

The summations start at n = 0 and m = 0, consistent with the image display format shown in Fig. 3-3(a), wherein location (0, 0) is at the top-left corner of the image.

The sampled image consists of M × M values—denoted here by f(n∆, m∆)—each of which is multiplied by the product of two sinc functions, one along x and another along y. The value fsinc(x, y) at a specified location (x, y) on the image consists of the sum of M × M terms. In the limit for an image of infinite size, and correspondingly an infinite number of samples along x and y, the infinite summations lead to fsinc(x, y) = f(x, y), where f(x, y) is the original image. That is, the interpolated image is identical to the original image, assuming all along that the sampled image is in compliance with the Nyquist criterion of the sampling theorem. When applying the sinc interpolation formula to a finite-size image, the interpolated image should be a good match to the original image, but it is computationally inefficient when compared with the spatial-frequency domain interpolation technique described later in Section 4-3.2.

4-1.2 Lanczos Interpolation

To reduce the number of terms involved in the computation of the interpolated image, the sinc function in Eq. (4.3) can be modified by truncating the sinc pattern along x and y so that it is zero beyond a certain multiple of ∆, such as |x| = 2∆ and |y| = 2∆. Such a truncation is offered by the Lanczos interpolation formula, which replaces each of the sinc functions in Eq. (4.3) by windowed sinc functions, thereby assuming the form

fLanczos(x, y) = Σ_{n=0}^{M−1} Σ_{m=0}^{M−1} f(n∆, m∆)
  × sinc(x/∆ − n) sinc((x/∆ − n)/a) rect((x/∆ − n)/(2a))
  × sinc(y/∆ − m) sinc((y/∆ − m)/a) rect((y/∆ − m)/(2a)),   (4.4)

where the rectangle function, defined earlier in Eq. (2.2), is given by

rect(x/(2a)) = 1 for −a < x < a,   0 otherwise.

Parameter a usually is assigned a value of 2 or 3. In MATLAB, the Lanczos interpolation formula can be exercised using the command imresize and selecting Lanczos from the menu.

In addition to the plot for sinc(x), Fig. 4-2 also contains plots of the sinc function multiplied by the windowed sinc function, with a = 2 and also with a = 3. The rectangle function is zero beyond |x| = a, so the windowed function stops at |x| = 2 for a = 2 and at |x| = 3 for a = 3. The Lanczos interpolation method provides significant computational improvement over the simple sinc interpolation formula.

Figure 4-2 Sinc function sinc(x) (in black) and Lanczos windowed sinc functions sinc(x) sinc(x/2) rect(x/4) (a = 2, in blue) and sinc(x) sinc(x/3) rect(x/6) (a = 3, in red). [Plot not reproduced.]

Concept Question 4-1: What is the advantage of Lanczos interpolation over sinc interpolation?

Exercise 4-1: If sinc interpolation is applied to the samples {x(0.1n) = cos(2π(6)0.1n)}, what will be the result?

Answer: {x(0.1n)} are samples of a 6 Hz cosine sampled at a rate of 10 samples/second, which is below the Nyquist frequency of 12 Hz. The sinc interpolation will be a cosine with frequency aliased to (10 − 6) = 4 Hz (see Section 2-4.3).

4-2 Upsampling and Downsampling Modalities

As a prelude to the material presented in forthcoming sections, we present here four examples of image upsampling and downsampling applications. Figure 4-3 depicts five image configurations, each consisting of a square array of square pixels. The central image is the initial image from which the other four were generated. We define this initial image as the discrete version of a continuous image f(x, y), sampled at (M × M) locations at a sampling interval ∆o along both dimensions:

f[n, m] = { f(n∆o, m∆o), 0 ≤ n, m ≤ M − 1 }.   (4.5)

We assume that the sampling rate S is such that ∆o = 1/S satisfies the Nyquist rate S > 2B or, equivalently, ∆o < 1/(2B), where B is the maximum spatial frequency of f(x, y).

When displayed on a computer screen, the initial (central) image is characterized by four parameters:

Mo × Mo = M² = image array size,
T × T = T² = image physical size on computer screen,
w × w = w² = image pixel area,
∆o × ∆o = ∆o² = true resolution area.

If the displayed image bears a one-to-one correspondence to the array f[n, m], then the brightness of a given pixel corresponds to the magnitude f[n, m] of the corresponding element in the array, and the pixel dimensions w × w are directly proportional to ∆o × ∆o. For simplicity, we set w = ∆o, which means that the image displayed on the computer screen has the same physical dimensions as the original image f(x, y). The area ∆o² is the true resolution area of the image.

4-2.1 Upsampling Modalities

A. Enlarging Physical Size While Keeping Pixel Size Unchanged

The image configuration in part (a) of Fig. 4-3 depicts what happens when the central image f[n, m] is enlarged in size from (T × T) to (T′ × T′) while keeping the pixel size the same. The process, accomplished by an upsampling operation, leads to an (Mu × Mu) image g[n, m], with Mu > Mo and

T′ = (Mu/Mo) T,   w′ = w,   and   ∆u = (Mo/Mu) ∆o.   (4.6)

Since the sampling interval ∆u in the enlarged upsampled image is shorter than the sampling interval in the initial image, ∆o, the Nyquist requirement continues to be satisfied.

B. Increasing Image Array Size While Keeping Physical Size Unchanged

The image displayed in Fig. 4-3(b) is an upsampled version of f[n, m] in which the array size was increased from (M × M) to (Mu × Mu) and the physical size of the image remained the same, but the pixel size is smaller. In the image g[n, m],

T′ = T,   w′ = (Mo/Mu) w,   and   ∆u = (Mo/Mu) ∆o.   (4.7)

◮ The two images in Fig. 4-3(a) and (b) have identical arrays g[n, m], but they are artificially displayed on computer screens with different pixel sizes. ◭

4-2.2 Downsampling Modalities

Figure 4-3 Examples of image upsampling and downsampling. The central image is the original (Mo × Mo) array f[n, m], of physical size T × T and pixel size w. Upsampling (∆u = ∆o/2) produces: (a) an image with w′ = w, Mu = 2Mo, and T′ = 2T; and (b) an image with Mu = 2Mo, T′ = T, and w′ = w/2. Downsampling (∆d = 2∆o) produces: (c) a thumbnail image with w′ = w, Md = Mo/2, and T′ = T/2; and (d) an image with w′ = 2w, T′ = T, and Md = Mo/2. [Diagram not reproduced.]

A. Thumbnail Image

If we apply downsampling to reduce the array size from (Mo × Mo) to (Md × Md), we end up with the downsampled images depicted in Figs. 4-3(c) and (d). In the thumbnail image shown in Fig. 4-3(c), the pixel size of the computer display is the same as that of the original image. Hence,

T′ = (Md/Mo) T,   w′ = w,   and   ∆d = (Mo/Md) ∆o.   (4.8)

B. Reduce Array Size While Keeping Physical Size Unchanged

The final of the transformed images, shown in Fig. 4-3(d), has the same content as the thumbnail image, but the pixel size on the computer screen has been enlarged so that

T = T′,   ∆d = (Mo/Md) ∆o,   and   w′ = (Mo/Md) w.   (4.9)
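The four modalities can be summarized by checking Eqs. (4.6)–(4.9) on concrete numbers. A minimal Python sketch (the specific values Mo = 100, Mu = 200, Md = 50, T = 1 are illustrative assumptions, not from the text):

```python
# Illustrative numbers (our own, not from the text): Mo = 100 samples across a
# T = 1 screen, with pixel size w equal to the true resolution delta_o.
Mo, T, w, delta_o = 100, 1.0, 0.01, 0.01
Mu, Md = 200, 50            # upsampled and downsampled array sizes

# Eq. (4.6): enlarge physical size, keep pixel size
assert (Mu / Mo * T, w, Mo / Mu * delta_o) == (2.0, 0.01, 0.005)
# Eq. (4.7): keep physical size, shrink pixels
assert (T, Mo / Mu * w, Mo / Mu * delta_o) == (1.0, 0.005, 0.005)
# Eq. (4.8): thumbnail -- smaller physical size, same pixel size
assert (Md / Mo * T, w, Mo / Md * delta_o) == (0.5, 0.01, 0.02)
# Eq. (4.9): keep physical size, enlarge pixels
assert (T, Mo / Md * delta_o, Mo / Md * w) == (1.0, 0.02, 0.02)
```

Note that in both upsampled cases ∆u < ∆o, so the Nyquist condition, once satisfied by the original, remains satisfied.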

4-3 Upsampling and Interpolation

Let f(x, y) be a continuous-space image of size T (meters) by T (meters) whose 2-D Fourier transform F(µ, ν) is bandlimited to B (cycles/m). That is,

F(µ, ν) = 0 for |µ|, |ν| > B.   (4.10)

Image f(x, y) is not available to us, but an (Mo × Mo) sampled version of f(x, y) is available. We define it as the original sampled image

f[n, m] = { f(n∆o, m∆o), 0 ≤ n, m ≤ Mo − 1 },   (4.11)

where ∆o is the associated sampling interval. Moreover, the sampling had been performed at a rate exceeding the Nyquist rate, which requires the choice of ∆o to satisfy the condition

∆o < 1/(2B).   (4.12)

Given that the sampled image is T × T in size, the number of samples Mo along each direction is

Mo = T/∆o.   (4.13)

Next, we introduce a new (yet to be created) higher-density sampled image g[n, m], also T × T in physical dimensions but containing Mu × Mu samples—instead of Mo × Mo samples—with Mu > Mo (which corresponds to the scenario depicted in Fig. 4-3(b)). We call g[n′, m′] the upsampled version of f[n, m]. The goal of upsampling and interpolation, which usually is abbreviated to just "upsampling," is to compute g[n′, m′] from f[n, m]. Since Mu > Mo, g[n′, m′] is more finely discretized than f[n, m], and the narrower sampling interval ∆u of the upsampled image is

∆u = T/Mu = (Mo/Mu) ∆o.   (4.14)

Since ∆u < ∆o, it follows that the sampling rate associated with g[n′, m′] also satisfies the Nyquist rate.

The upsampled image is given by

g[n′, m′] = { f(n′∆u, m′∆u), 0 ≤ n′, m′ ≤ Mu − 1 }.   (4.15)

In a later section of this chapter (Section 4-6) we demonstrate how the finer discretization provided by upsampling is used to compute a rotated or warped version of image f[n, m]. Another application is image magnification, but in that case the primary goal is to increase the image size (from T1 × T1 to T2 × T2), as illustrated in Fig. 4-3(a), rather than to decrease the sampling interval.

Upsampling image f[n, m] to image g[n′, m′] can be accomplished either directly in the discrete spatial domain or indirectly in the spatial frequency domain. We examine both approaches in the subsections that follow.

4-3.1 Upsampling in the Spatial Domain

In practice, image upsampling is performed using the 2-D DFT in the spatial frequency domain because it is much faster and easier computationally than performing the upsampling directly in the spatial domain. Nevertheless, for the sake of completeness, we now provide a succinct presentation of how upsampling is performed in the spatial domain using the sinc interpolation formula.

We start by repeating Eq. (4.3) after replacing ∆ with ∆o and M with Mo:

f(x, y) = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] sinc(x/∆o − n) sinc(y/∆o − m).   (4.16)

Here, f[n, m] is the original (Mo × Mo) sampled image available to us, and the goal is to upsample it to an (Mu × Mu) image g[n′, m′], with Mu = LMo, where L is an upsampling factor. In the upsampled image, the sampling interval ∆u is related to the sampling interval ∆o of the original sampled image by

∆u = (Mo/Mu) ∆o = ∆o/L.   (4.17)

To obtain g[n′, m′], we sample f(x, y) at x = n′∆u and y = m′∆u:

g[n′, m′] = f(n′∆u, m′∆u)   (0 ≤ n′, m′ ≤ Mu − 1)
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] sinc(n′∆u/∆o − n) sinc(m′∆u/∆o − m)
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] sinc(n′/L − n) sinc(m′/L − m).   (4.18)

If greater truncation is desired, we can replace the product of sinc functions with the product of the Lanczos functions defined in Eq. (4.4). In either case, application of Eq. (4.18) generates

the upsampled version g[n′, m′] directly from the original sampled version f[n, m]. If ∆u = ∆o/L and L is an integer, then the process preserves the values of f[n, m] while adding new ones in between them. To demonstrate that the upsampling process does indeed preserve f[n, m], let us consider the expression given by Eq. (4.15) for the specific case where n′ = Ln and m′ = Lm:

g[Ln, Lm] = { f(n′∆u, m′∆u), 0 ≤ n′, m′ ≤ Mu − 1 }
  = { f(Ln∆u, Lm∆u), 0 ≤ n, m ≤ Mo − 1 }
  = { f(n∆o, m∆o), 0 ≤ n, m ≤ Mo − 1 } = f[n, m],

where we used the relationships given by Eqs. (4.11) and (4.14). Hence, upsampling by an integer L using the sinc interpolation formula does indeed preserve the existing values of f[n, m], in addition to adding interpolated values between them.

4-3.2 Upsampling in the Spatial Frequency Domain

Instead of using the sinc interpolation formula given by Eq. (4.18), upsampling can be performed much more easily, and with less computation, using the 2-D DFT in the spatial frequency domain. From Eqs. (4.11) and (4.18), the (Mo × Mo) original image f[n, m] and the (Mu × Mu) upsampled image g[n′, m′] are defined as

f[n, m] = { f(n∆o, m∆o), 0 ≤ n, m ≤ Mo − 1 },   (4.19a)
g[n′, m′] = { g(n′∆u, m′∆u), 0 ≤ n′, m′ ≤ Mu − 1 }.   (4.19b)

Note that whereas in the earlier section it proved convenient to distinguish the indices of the upsampled image from those of the original image—so we used [n, m] for the original image and [n′, m′] for the upsampled image—the distinction is no longer needed in the present section, so we will now use indices [n, m] for both images.

From Eq. (3.75), the 2-D DFT of f[n, m] of order (Mo × Mo) and the 2-D DFT of g[n, m] of order (Mu × Mu) are given by

F[k1, k2] = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j(2π/Mo)(nk1+mk2)},   0 ≤ k1, k2 ≤ Mo − 1,   (4.20a)

G[k1, k2] = Σ_{n=0}^{Mu−1} Σ_{m=0}^{Mu−1} g[n, m] e^{−j(2π/Mu)(nk1+mk2)},   0 ≤ k1, k2 ≤ Mu − 1.   (4.20b)

The summations are identical in form, except that the summation for F[k1, k2] extends to (Mo − 1) whereas the summation for G[k1, k2] extends to (Mu − 1).

The goal is to compute g[n, m] by (1) transforming f[n, m] to obtain F[k1, k2], (2) transforming F[k1, k2] to G[k1, k2], and (3) then transforming G[k1, k2] back to the spatial domain to form g[n, m]. Despite the seeming complexity of having to execute a three-step process, the process is computationally more efficient than performing upsampling entirely in the spatial domain.

Upsampling in the discrete frequency domain [k1, k2] entails increasing the number of discrete frequency components from (Mo × Mo) for F[k1, k2] to (Mu × Mu) for G[k1, k2], with Mu > Mo. As we will demonstrate shortly, G[k1, k2] includes all of the elements of F[k1, k2], but it also includes some additional rows and columns filled with zeros.

A. Original Image

Let us start with the sampled image fo(x, y) of continuous image f(x, y) sampled at a sampling interval ∆o and resulting in (Mo × Mo) samples. Per Eq. (3.61), adapted to a finite sum that starts at (0, 0) and ends at (Mo − 1, Mo − 1),

fo(x, y) = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f(n∆o, m∆o) δ(x − n∆o) δ(y − m∆o)
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] δ(x − n∆o) δ(y − m∆o),   (4.21)

where we used the definition for f[n, m] given by Eq. (4.19a).

Using entry #9 in Table 3-1, the 2-D CSFT Fo(µ, ν) of fo(x, y) can be written as

Fo(µ, ν) = F{ fo(x, y) }
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] F{ δ(x − n∆o) δ(y − m∆o) }
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πµn∆o} e^{−j2πνm∆o}.   (4.22)

The spectrum Fo(µ, ν) of sampled image fo(x, y) is doubly periodic in µ and ν with period 1/∆o, as expected.

Extending the relations expressed by Eqs. (2.47) and (2.54) from 1-D to 2-D, the spectrum Fo(µ, ν) of the sampled image is related to the spectrum F(µ, ν) of continuous image f(x, y) by

Fo(µ, ν) = (1/∆o²) Σ_{k1=−∞}^{∞} Σ_{k2=−∞}^{∞} F(µ − k1/∆o, ν − k2/∆o).   (4.23)
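The sample-preserving property derived at the start of this section (g[Ln, Lm] = f[n, m]) can be checked numerically with a direct, if slow, evaluation of Eq. (4.18). A Python sketch (standing in for MATLAB; the function names are ours):

```python
from math import pi, sin

def sinc(z):
    """sinc(z) = sin(pi z) / (pi z), with sinc(0) = 1."""
    return 1.0 if z == 0 else sin(pi * z) / (pi * z)

def upsample_sinc(f, L):
    """Direct evaluation of Eq. (4.18):
    g[n',m'] = sum_n sum_m f[n,m] sinc(n'/L - n) sinc(m'/L - m)."""
    Mo = len(f)
    Mu = L * Mo
    return [[sum(f[n][m] * sinc(np / L - n) * sinc(mp / L - m)
                 for n in range(Mo) for m in range(Mo))
             for mp in range(Mu)]
            for np in range(Mu)]

f = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
g = upsample_sinc(f, 2)   # L = 2, so Mu = 6

# Upsampling by an integer L preserves the original samples: g[Ln, Lm] = f[n, m]
assert all(abs(g[2 * n][2 * m] - f[n][m]) < 1e-9
           for n in range(3) for m in range(3))
```

The in-between entries of g are the interpolated values; the frequency-domain route developed next produces the same kind of result far more efficiently.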

The spectrum Fo(µ, ν) of the sampled image consists of copies of the spectrum F(µ, ν) of f(x, y) repeated every 1/∆o in both µ and ν, and also scaled by 1/∆o².

In (µ, ν) space, µ and ν can be both positive or negative. The relation between the spectrum Fo(µ, ν) of the sampled image fo(x, y) and the spectrum F(µ, ν) of the original continuous image f(x, y) assumes different forms for the four quadrants of (µ, ν) space.

For ease of presentation, let Mo be odd. If Mo is even, then simply replace (Mo − 1)/2 with Mo/2 (see Section 4-4.2).

Next, we sample µ and ν by setting them to

µ = k1/(Mo∆o)   and   ν = k2/(Mo∆o),   0 ≤ |k1|, |k2| ≤ (Mo − 1)/2.   (4.24)

1. Quadrant 1: µ ≥ 0 and ν ≥ 0

At these values of µ and ν, Fo(µ, ν) becomes

Fo(k1/(Mo∆o), k2/(Mo∆o))
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πnk1/Mo} e^{−j2πmk2/Mo}
  = F[k1, k2],   0 ≤ k1, k2 ≤ (Mo − 1)/2.   (4.25)

2. Quadrant 2: µ ≤ 0 and ν ≥ 0

In quadrants 2–4, we make use of the relation

e^{−j2πn(−k1)/Mo} = e^{−j2πnMo/Mo} e^{−j2πn(−k1)/Mo} = e^{−j2πn(Mo−k1)/Mo},

where we used e^{−j2πnMo/Mo} = 1. A similar relation applies to k2.

In quadrant 2, µ is negative and ν is positive, so keeping k1, k2 ≥ 0 and redefining µ as µ = −k1/(Mo∆o) leads to

Fo(−k1/(Mo∆o), k2/(Mo∆o)),   0 ≤ k1, k2 ≤ (Mo − 1)/2
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πn(−k1)/Mo} e^{−j2πmk2/Mo}
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πn(Mo−k1)/Mo} e^{−j2πmk2/Mo}
  = F[Mo − k1, k2],   0 ≤ k1, k2 ≤ (Mo − 1)/2.   (4.26)

3. Quadrant 3: µ ≤ 0 and ν ≤ 0

Redefining µ as µ = −k1/(Mo∆o) and ν as ν = −k2/(Mo∆o) leads to

Fo(−k1/(Mo∆o), −k2/(Mo∆o)),   0 ≤ k1, k2 ≤ (Mo − 1)/2
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πn(−k1)/Mo} e^{−j2πm(−k2)/Mo}
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πn(Mo−k1)/Mo} e^{−j2πm(Mo−k2)/Mo}
  = F[Mo − k1, Mo − k2],   0 ≤ k1, k2 ≤ (Mo − 1)/2.   (4.27)

4. Quadrant 4: µ ≥ 0 and ν ≤ 0

Upon defining µ as in Eq. (4.24) and redefining ν as ν = −k2/(Mo∆o),

Fo(k1/(Mo∆o), −k2/(Mo∆o)),   0 ≤ k1, k2 ≤ (Mo − 1)/2
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πnk1/Mo} e^{−j2πm(−k2)/Mo}
  = Σ_{n=0}^{Mo−1} Σ_{m=0}^{Mo−1} f[n, m] e^{−j2πnk1/Mo} e^{−j2πm(Mo−k2)/Mo}
  = F[k1, Mo − k2],   0 ≤ k1, k2 ≤ (Mo − 1)/2.   (4.28)

The result given by Eqs. (4.25)–(4.28) states that the 2-D CSFT Fo(µ, ν) of the sampled image fo(x, y)—when sampled at the discrete spatial frequency values defined by Eq. (4.24)—is the 2-D DFT of the sampled image f[n, m]. Also, the spectrum Fo(µ, ν) of the sampled image consists of copies of the
spectrum $F(\mu,\nu)$ repeated every $1/\Delta_o$ in both $\mu$ and $\nu$, and also scaled by $1/\Delta_o^2$. Hence, generalizing Eq. (2.54) from 1-D to 2-D gives

\[
F_o\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right)
= \frac{1}{\Delta_o^2}\, F\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right), \tag{4.29}
\]

where $F(\mu,\nu)$ is the 2-D CSFT of the continuous-space image $f(x,y)$ and $0 \le |k_1|,|k_2| \le (M_o-1)/2$.

A. Upsampled Image

Now we repeat this entire derivation using a sampling interval $\Delta_u$ instead of $\Delta_o$, and replacing $M_o$ with $M_u$, but keeping the form of the relations given by Eq. (4.24) the same. Since $\Delta_u < \Delta_o$, the sampled image is now $(M_u \times M_u)$ instead of $(M_o \times M_o)$. Hence, the $(M_u \times M_u)$ sampled image is

\[
f_u(x,y) = \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} f(n\Delta_u,m\Delta_u)\, \delta(x-n\Delta_u)\, \delta(y-m\Delta_u)
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, \delta(x-n\Delta_u)\, \delta(y-m\Delta_u), \tag{4.30}
\]

with $g[n,m]$ as defined in Eq. (4.19b). The associated 2-D CSFT of the sampled image $f_u(x,y)$ is

\[
F_u(\mu,\nu) = \mathcal{F}\{f_u(x,y)\}
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, \mathcal{F}\{\delta(x-n\Delta_u)\,\delta(y-m\Delta_u)\}
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi\mu n\Delta_u}\, e^{-j2\pi\nu m\Delta_u}. \tag{4.31}
\]

1. Quadrant 1: $\mu \ge 0$ and $\nu \ge 0$

In view of the relationship (from Eq. (4.14))

\[
\frac{\Delta_u}{M_o\Delta_o} = \frac{\Delta_u}{M_u\Delta_u} = \frac{1}{M_u},
\]

upon sampling $F_u(\mu,\nu)$ at the rates defined by Eq. (4.24), we obtain

\[
F_u\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right)
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi nk_1/M_u}\, e^{-j2\pi mk_2/M_u}
= G[k_1,k_2], \qquad 0 \le k_1,k_2 \le \frac{M_o-1}{2}. \tag{4.32}
\]

2. Quadrant 2: $\mu \le 0$ and $\nu \ge 0$

\[
F_u\left(\frac{-k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right)
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi n(-k_1)/M_u}\, e^{-j2\pi mk_2/M_u}
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi n(M_u-k_1)/M_u}\, e^{-j2\pi mk_2/M_u}
= G[M_u-k_1,k_2], \qquad 0 \le k_1,k_2 \le \frac{M_o-1}{2}. \tag{4.33}
\]

3. Quadrant 3: $\mu \le 0$ and $\nu \le 0$

\[
F_u\left(\frac{-k_1}{M_o\Delta_o},\frac{-k_2}{M_o\Delta_o}\right)
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi n(-k_1)/M_u}\, e^{-j2\pi m(-k_2)/M_u}
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi n(M_u-k_1)/M_u}\, e^{-j2\pi m(M_u-k_2)/M_u}
= G[M_u-k_1,M_u-k_2], \qquad 0 \le k_1,k_2 \le \frac{M_o-1}{2}. \tag{4.34}
\]
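The quadrant identities above are easy to spot-check numerically: evaluating the CSFT of an impulse-sampled image at the discrete frequencies $k/(M_o\Delta_o)$ reproduces entries of the 2-D DFT. A small NumPy sketch (Python rather than the book's MATLAB; the size `Mo`, the interval `do`, and the random test image are arbitrary stand-ins):

```python
import numpy as np

Mo, do = 5, 0.5                        # hypothetical image size and sampling interval
rng = np.random.default_rng(0)
f = rng.random((Mo, Mo))               # stands in for f[n, m]
F = np.fft.fft2(f)                     # F[k1, k2], axis 0 <-> k1, axis 1 <-> k2

def Fo(mu, nu):
    """CSFT of the impulse-sampled image: sum over n, m of f[n,m] e^{-j2pi mu n do} e^{-j2pi nu m do}."""
    n = np.arange(Mo)
    e1 = np.exp(-2j * np.pi * mu * n * do)
    e2 = np.exp(-2j * np.pi * nu * n * do)
    return e1 @ f @ e2

k1, k2 = 1, 2
q1 = Fo(k1 / (Mo * do), k2 / (Mo * do))     # quadrant 1: should equal F[k1, k2]
q3 = Fo(-k1 / (Mo * do), -k2 / (Mo * do))   # quadrant 3: should equal F[Mo-k1, Mo-k2]
```

Both checks pass to machine precision, mirroring Eqs. (4.25) and (4.27).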
4. Quadrant 4: $\mu \ge 0$ and $\nu \le 0$

\[
F_u\left(\frac{k_1}{M_o\Delta_o},\frac{-k_2}{M_o\Delta_o}\right)
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi nk_1/M_u}\, e^{-j2\pi m(-k_2)/M_u}
= \sum_{n=0}^{M_u-1}\sum_{m=0}^{M_u-1} g[n,m]\, e^{-j2\pi nk_1/M_u}\, e^{-j2\pi m(M_u-k_2)/M_u}
= G[k_1,M_u-k_2], \qquad 0 \le k_1,k_2 \le \frac{M_o-1}{2}. \tag{4.35}
\]

The result given by Eqs. (4.32)–(4.35) states that the 2-D CSFT $F_u(\mu,\nu)$ of the upsampled image $f_u(x,y)$—when sampled at the discrete spatial frequency values defined by Eq. (4.24)—is the 2-D DFT of the sampled image $g[n,m]$. Also, the spectrum $F_u(\mu,\nu)$ of the sampled image consists of copies of the spectrum $F(\mu,\nu)$ of $f(x,y)$ repeated every $1/\Delta_u$ in both $\mu$ and $\nu$, and also scaled by $1/\Delta_u^2$. Thus,

\[
F_u\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right)
= \frac{1}{\Delta_u^2}\, F\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right). \tag{4.36}
\]

From Eq. (4.14), $M_o\Delta_o = M_u\Delta_u$. Combining Eq. (4.29) and Eq. (4.36) shows that

\[
F_u\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right)
= \frac{M_u^2}{M_o^2}\, F_o\left(\frac{k_1}{M_o\Delta_o},\frac{k_2}{M_o\Delta_o}\right). \tag{4.37}
\]

Hence, the $(M_o \times M_o)$ 2-D DFT $F[k_1,k_2]$ of $f[n,m]$ and the $(M_u \times M_u)$ 2-D DFT $G[k_1,k_2]$ of $g[n,m]$ are related by

\[
\begin{aligned}
G[k_1,k_2] &= \frac{M_u^2}{M_o^2}\, F[k_1,k_2],\\
G[M_u-k_1,k_2] &= \frac{M_u^2}{M_o^2}\, F[M_o-k_1,k_2],\\
G[k_1,M_u-k_2] &= \frac{M_u^2}{M_o^2}\, F[k_1,M_o-k_2],\\
G[M_u-k_1,M_u-k_2] &= \frac{M_u^2}{M_o^2}\, F[M_o-k_1,M_o-k_2],
\end{aligned}
\qquad 0 \le k_1,k_2 \le \frac{M_o-1}{2}. \tag{4.38}
\]

This leaves $G[k_1,k_2]$ for $M_o \le k_1,k_2 \le M_u-1$ to be determined. But Eq. (4.38) shows that these values of $G[k_1,k_2]$ are samples of $F_u(\mu,\nu)$ at values of $\mu,\nu$ for which this spectrum of the sampled signal is zero, since sampling the original signal $f(x,y)$ at above its Nyquist rate separates the copies of its spectrum, leaving bands of zero between copies. Thus

\[
G[k_1,k_2] = 0, \qquad M_o \le k_1,k_2 \le M_u-1. \tag{4.39}
\]

4-4 Implementation of Upsampling Using 2-D DFT in MATLAB

In MATLAB, both the image $f[n,m]$ and its 2-D DFT are stored and displayed using the format shown in Fig. 3-25(d), wherein the origin is at the upper left-hand corner of the image, and the indices of the corner pixel are (1, 1).

Image and 2-D DFT Notation

To avoid confusion between the common-image format (CIF) and the MATLAB format, we provide the following list of symbols and definitions:

CIF | MATLAB
Original image: $f[n,m]$ | X(m′, n′)
Upsampled image: $g[n,m]$ | Y(m′, n′)
2-D DFT of $f[n,m]$: $F[k_1,k_2]$ | FX(k2′, k1′)
2-D DFT of $g[n,m]$: $G[k_1,k_2]$ | FY(k2′, k1′)

As noted earlier in Section 3-9, when an image $f[n,m]$ is stored in MATLAB as array X(m′, n′), the two sets of indices are related by

\[
m' = m + 1, \tag{4.40a}
\]
\[
n' = n + 1. \tag{4.40b}
\]

The indices get interchanged in orientation ($n$ represents row number, whereas $n'$ represents column number) and are shifted by 1. For example, $f[0,0] = \mathrm{X}(1,1)$, and $f[0,1] = \mathrm{X}(2,1)$. While the indices of the two formats are different, the contents of array X(m′, n′) are identical to those of $f[n,m]$. That is, for an
138 CHAPTER 4 IMAGE INTERPOLATION
$(M_o \times M_o)$ image

\[
\mathrm{X}(m',n') = f[n,m] =
\begin{bmatrix}
f[0,0] & f[1,0] & \cdots & f[M_o-1,0]\\
f[0,1] & f[1,1] & \cdots & f[M_o-1,1]\\
\vdots & \vdots & \ddots & \vdots\\
f[0,M_o-1] & f[1,M_o-1] & \cdots & f[M_o-1,M_o-1]
\end{bmatrix}. \tag{4.41}
\]

The MATLAB command FX = fft2(X, Mo, Mo) computes the 2-D DFT $F[k_1,k_2]$ and stores it in array FX(k2′, k1′):

\[
\mathrm{FX}(k_2',k_1') = F[k_1,k_2] =
\begin{bmatrix}
F[0,0] & F[1,0] & \cdots & F[M_o-1,0]\\
F[0,1] & F[1,1] & \cdots & F[M_o-1,1]\\
\vdots & \vdots & \ddots & \vdots\\
F[0,M_o-1] & F[1,M_o-1] & \cdots & F[M_o-1,M_o-1]
\end{bmatrix}. \tag{4.42}
\]

The goal of upsampling using the spatial-frequency domain is to compute $g[n,m]$ from $f[n,m]$ by computing $G[k_1,k_2]$ from $F[k_1,k_2]$ and then applying the inverse DFT to obtain $g[n,m]$. The details of the procedure are somewhat different depending on whether the array size parameter $M_o$ is an odd integer or an even integer. Hence, we consider the two cases separately.

4-4.1 Mo = Odd Integer

The recipe for upsampling using the 2-D DFT is as follows:

1. Given: image $\{ f[n,m],\; 0 \le n,m \le M_o-1 \}$, as represented by Eq. (4.41).

2. Compute: the 2-D DFT $F[k_1,k_2]$ of $f[n,m]$ using Eq. (4.20a) to obtain the array represented by Eq. (4.42).

3. Create: an upsampled $(M_u \times M_u)$ array $G[k_1,k_2]$, and then set its entries per the rules of Eq. (4.38).

4. Compute: the $(M_u \times M_u)$ 2-D inverse DFT of $G[k_1,k_2]$ to obtain $g[n,m]$.

In MATLAB, the array FY containing the 2-D DFT $G[k_1,k_2]$ is obtained from the array FX given by Eq. (4.42) by inserting $(M_u - M_o)$ rows of zeros and an equal number of columns of zeros in the "middle" of the array FX. The result is

\[
\mathrm{FY} = \frac{M_u^2}{M_o^2}\times
\begin{bmatrix}
F[0,0] & \cdots & F\big[\tfrac{M_o-1}{2},0\big] & 0\ \cdots\ 0 & F\big[\tfrac{M_o+1}{2},0\big] & \cdots & F[M_o-1,0]\\
\vdots & & \vdots & \vdots & \vdots & & \vdots\\
F\big[0,\tfrac{M_o-1}{2}\big] & \cdots & F\big[\tfrac{M_o-1}{2},\tfrac{M_o-1}{2}\big] & 0\ \cdots\ 0 & F\big[\tfrac{M_o+1}{2},\tfrac{M_o-1}{2}\big] & \cdots & F\big[M_o-1,\tfrac{M_o-1}{2}\big]\\
0 & \cdots & 0 & 0\ \cdots\ 0 & 0 & \cdots & 0\\
\vdots & & \vdots & \vdots & \vdots & & \vdots\\
0 & \cdots & 0 & 0\ \cdots\ 0 & 0 & \cdots & 0\\
F\big[0,\tfrac{M_o+1}{2}\big] & \cdots & F\big[\tfrac{M_o-1}{2},\tfrac{M_o+1}{2}\big] & 0\ \cdots\ 0 & F\big[\tfrac{M_o+1}{2},\tfrac{M_o+1}{2}\big] & \cdots & F\big[M_o-1,\tfrac{M_o+1}{2}\big]\\
\vdots & & \vdots & \vdots & \vdots & & \vdots\\
F[0,M_o-1] & \cdots & F\big[\tfrac{M_o-1}{2},M_o-1\big] & 0\ \cdots\ 0 & F\big[\tfrac{M_o+1}{2},M_o-1\big] & \cdots & F[M_o-1,M_o-1]
\end{bmatrix}, \tag{4.43}
\]

where the middle block consists of $(M_u - M_o)$ columns and $(M_u - M_o)$ rows of zeros.

◮ Note that the $(M_u - M_o)$ columns of zeros start after entry $F[(M_o-1)/2,\,0]$, and similarly the $(M_u - M_o)$ rows of zeros start after $F[0,\,(M_o-1)/2]$. ◭

Once array FY has been established, the corresponding upsampled image $g[n,m]$ is obtained by applying the MATLAB command Y = real(ifft2(FY, N, N)), where N $= M_u$ and the real is needed to eliminate the imaginary part of Y, which may exist because of round-off error in the ifft2.

As a simple example, consider the $(3 \times 3)$ array

\[
\mathrm{FX} =
\begin{bmatrix}
F[0,0] & F[1,0] & F[2,0]\\
F[0,1] & F[1,1] & F[2,1]\\
F[0,2] & F[1,2] & F[2,2]
\end{bmatrix}. \tag{4.44a}
\]

To generate a $(5 \times 5)$ array FY we insert $M_u - M_o = 5 - 3 = 2$ columns of zeros after element $F[(M_o-1)/2,0] = F[1,0]$, and also 2 rows of zeros after $F[0,(M_o-1)/2] = F[0,1]$. The result is

\[
\mathrm{FY} = \frac{5^2}{3^2}
\begin{bmatrix}
F[0,0] & F[1,0] & 0 & 0 & F[2,0]\\
F[0,1] & F[1,1] & 0 & 0 & F[2,1]\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
F[0,2] & F[1,2] & 0 & 0 & F[2,2]
\end{bmatrix}. \tag{4.44b}
\]

Application of the inverse 2-D DFT to FY generates array Y in MATLAB, which is equivalent in content to image $g[n,m]$ in common-image format.
4-4.2 Mo = Even Integer

For a real-valued $(M_o \times M_o)$ image $f[n,m]$ with $M_o$ an odd integer, conjugate symmetry is automatically satisfied for both $F[k_1,k_2]$, the 2-D DFT of the original image, as well as for $G[k_1,k_2]$, the 2-D DFT of the upsampled image. However, if $M_o$ is an even integer, application of the recipe outlined in the preceding subsection will violate conjugate symmetry, so we need to modify it. Recall from Eq. (3.80) that for a real-valued image $f[n,m]$, conjugate symmetry requires that

\[
F^*[k_1,k_2] = F[M_o-k_1,\, M_o-k_2], \qquad 1 \le k_1,k_2 \le M_o-1. \tag{4.45}
\]

Additionally, in view of the definition for $F[k_1,k_2]$ given by Eq. (4.20a), the following two conditions should be satisfied:

\[
F[0,0] = \sum_{n=0}^{M_o-1}\sum_{m=0}^{M_o-1} f[n,m] = \text{real-valued}, \tag{4.46a}
\]
\[
F\left[\frac{M_o}{2},\frac{M_o}{2}\right] = \sum_{n=0}^{M_o-1}\sum_{m=0}^{M_o-1} (-1)^{n+m}\, f[n,m] = \text{real-valued}. \tag{4.46b}
\]

In the preceding subsection, we inserted the appropriate number of rows and columns of zeros to obtain $G[k_1,k_2]$ from $F[k_1,k_2]$. For $M_o$ equal to an odd integer, the conditions represented by Eqs. (4.45) and (4.46) are satisfied for both $F[k_1,k_2]$ and $G[k_1,k_2]$, but they are not satisfied for $G[k_1,k_2]$ when $M_o$ is an even integer. To demonstrate why the simple zero-insertion procedure is problematic, let us consider the $(4 \times 4)$ image

\[
f[n,m] =
\begin{bmatrix}
1 & 2 & 3 & 4\\
2 & 4 & 5 & 3\\
3 & 4 & 6 & 2\\
4 & 3 & 2 & 1
\end{bmatrix}. \tag{4.47a}
\]

The $(4 \times 4)$ 2-D DFT $F[k_1,k_2]$ of $f[n,m]$ is

\[
F[k_1,k_2] =
\begin{bmatrix}
49 & -6-j3 & 3 & -6+j3\\
-5-j4 & 2+j9 & -5+j2 & j\\
1 & -4+j3 & -1 & -4-j3\\
-5+j4 & -j & -5-j2 & 2-j9
\end{bmatrix}. \tag{4.47b}
\]

This $F[k_1,k_2]$ has conjugate symmetry.

Let us suppose that we wish to upsample the $(4 \times 4)$ image $f[n,m]$ to a $(5 \times 5)$ image $g[n,m]$. Inserting one row of zeros and one column of zeros and multiplying by $(5^2/4^2)$ would generate

\[
G[k_1,k_2] = \frac{5^2}{4^2}
\begin{bmatrix}
49 & -6-j3 & 0 & 3 & -6+j3\\
-5-j4 & 2+j9 & 0 & -5+j2 & j\\
0 & 0 & 0 & 0 & 0\\
1 & -4+j3 & 0 & -1 & -4-j3\\
-5+j4 & -j & 0 & -5-j2 & 2-j9
\end{bmatrix}. \tag{4.48}
\]

This $G[k_1,k_2]$ array does not satisfy conjugate symmetry. Applying the inverse 2-D DFT to $G[k_1,k_2]$ generates the upsampled image $g[n,m]$:

\[
g[n,m] = \frac{5^2}{4^2}
\begin{bmatrix}
0.64 & 1.05+j0.19 & 1.64-j0.30 & 2.39+j0.30 & 2.27-j0.19\\
1.05+j0.19 & 2.01+j0.22 & 2.85-j0.20 & 2.8-j0.13 & 1.82-j0.19\\
1.64-j0.3 & 2.38-j0.44 & 3.9+j0.46 & 3.16+j0.08 & 1.31+j0.39\\
2.39+j0.30 & 2.37-j0.066 & 2.88+j0.36 & 2.04-j0.9 & 0.84+j0.12\\
2.27-j0.19 & 1.89-j0.25 & 1.33+j0.26 & 0.83+j0.08 & 1.18+j0.22
\end{bmatrix},
\]

which is clearly incorrect; all of its elements should be real-valued because the original image $f[n,m]$ is real-valued. Obviously, the upsampling recipe needs to be modified.

A simple solution is to split row $F[k_1, M_o/2]$ into 2 rows and to split column $F[M_o/2, k_2]$ into 2 columns, which also means that $F[M_o/2, M_o/2]$ gets split into 4 entries. The recipe preserves conjugate symmetry in $G[k_1,k_2]$.

When applied to $F[k_1,k_2]$, the recipe yields the $(6 \times 6)$ array

\[
G[k_1,k_2] = \frac{6^2}{4^2}
\begin{bmatrix}
F[0,0] & F[1,0] & F[2,0]/2 & 0 & F[2,0]/2 & F[3,0]\\
F[0,1] & F[1,1] & F[2,1]/2 & 0 & F[2,1]/2 & F[3,1]\\
F[0,2]/2 & F[1,2]/2 & F[2,2]/4 & 0 & F[2,2]/4 & F[3,2]/2\\
0 & 0 & 0 & 0 & 0 & 0\\
F[0,2]/2 & F[1,2]/2 & F[2,2]/4 & 0 & F[2,2]/4 & F[3,2]/2\\
F[0,3] & F[1,3] & F[2,3]/2 & 0 & F[2,3]/2 & F[3,3]
\end{bmatrix}
= \frac{6^2}{4^2}
\begin{bmatrix}
49 & -6-j3 & 1.5 & 0 & 1.5 & -6+j3\\
-5-j4 & 2+j9 & -2.5+j1 & 0 & -2.5+j1 & j\\
0.5 & -2+j1.5 & -0.25 & 0 & -0.25 & -2-j1.5\\
0 & 0 & 0 & 0 & 0 & 0\\
0.5 & -2+j1.5 & -0.25 & 0 & -0.25 & -2-j1.5\\
-5+j4 & -j & -2.5-j1 & 0 & -2.5-j1 & 2-j9
\end{bmatrix}. \tag{4.49}
\]

Application of the inverse 2-D DFT to $G[k_1,k_2]$ yields the
upsampled image

\[
g[n,m] = \frac{6^2}{4^2}
\begin{bmatrix}
0.44 & 0.61 & 1.06 & 1.33 & 1.83 & 1.38\\
0.61 & 1.09 & 1.73 & 1.91 & 1.85 & 1.21\\
1.06 & 1.55 & 2.31 & 2.58 & 1.66 & 0.90\\
1.33 & 1.55 & 2.22 & 2.67 & 1.45 & 0.78\\
1.83 & 1.72 & 1.52 & 1.42 & 0.54 & 0.73\\
1.38 & 1.25 & 0.94 & 0.76 & 0.72 & 1.04
\end{bmatrix}
=
\begin{bmatrix}
1.0 & 1.37 & 2.39 & 3.0 & 4.12 & 3.11\\
1.37 & 2.46 & 3.90 & 4.30 & 4.16 & 2.72\\
2.39 & 3.49 & 5.20 & 5.81 & 3.74 & 2.03\\
3.0 & 3.49 & 5.00 & 6.01 & 3.26 & 1.76\\
4.12 & 3.87 & 3.42 & 3.20 & 1.22 & 1.64\\
3.11 & 2.81 & 2.12 & 1.71 & 1.62 & 2.34
\end{bmatrix}, \tag{4.50}
\]

which is entirely real-valued, as it should be. The original $(4 \times 4)$ image given by Eq. (4.47a) and the upsampled $(6 \times 6)$ image given by Eq. (4.50) are displayed in Fig. 4-4. The two images, which bear a close resemblance, have the same physical size but different-sized pixels.

Figure 4-4 Comparison of original and upsampled images: (a) original 4 × 4 image, (b) upsampled 6 × 6 image.

Exercise 4-2: Upsample the length-2 signal $x[n] = \{8, 4\}$ to a length-4 signal $y[n]$.

Answer: The 2-point DFT is $X[k] = \{8+4,\; 8-4\} = \{12, 4\}$. Since $M_o = 2$ is even, we split the 4 and insert a zero in the middle, and multiply by $4/2$ to get $Y[k] = \{12, 2, 0, 2\}$. The inverse 4-point DFT is

\[
\begin{aligned}
y[0] &= \tfrac{2}{4}\,(12+2+0+2) = 8,\\
y[1] &= \tfrac{2}{4}\,(12+2j-0-2j) = 6,\\
y[2] &= \tfrac{2}{4}\,(12-2+0-2) = 4,\\
y[3] &= \tfrac{2}{4}\,(12-2j+0+2j) = 6.
\end{aligned}
\]

Hence $y[n] = \{8, 6, 4, 6\}$.

4-5 Downsampling

Downsampling is the inverse operation to upsampling. The objective of downsampling is to reduce the array size of an image $f[n,m]$ from $(M_o \times M_o)$ samples down to $(M_d \times M_d)$, with $M_d < M_o$. The original image $f[n,m]$ and the downsampled image $g[n,m]$ are defined as:

Original image: $f[n,m] = \{ f(n\Delta_o, m\Delta_o),\; 0 \le n,m \le M_o-1 \}$,
Downsampled image: $g[n,m] = \{ f(n\Delta_d, m\Delta_d),\; 0 \le n,m \le M_d-1 \}$.

Both images are sampled versions of some continuous-space image $f(x,y)$, with image $f[n,m]$ sampled at a sampling interval $\Delta_o$ that satisfies the Nyquist rate. The downsampled image $g[n,m]$ is sampled at $\Delta_d$, with $\Delta_d > \Delta_o$, so it is unlikely that $g[n,m]$ satisfies the Nyquist rate.

The goal of downsampling is to compute $g[n,m]$ from $f[n,m]$. That is, we compute a coarser-discretized $(M_d \times M_d)$ image $g[n,m]$ from the finer-discretized $(M_o \times M_o)$ image $f[n,m]$. Applications of downsampling include computation of "thumbnail" versions of images, as demonstrated in Example 4-1, and shrinking images to fit into a prescribed space, such as columns in a textbook.

4-5.1 Aliasing

It might seem that downsampling by, say, two, meaning that $M_d = M_o/2$ (assuming $M_o$ is even), could be easily accomplished by simply deleting every even-indexed (or odd-indexed) row and column of $f[n,m]$. Deleting every other row and column of
$f[n,m]$ is called decimation by two. Decimation by two would give the result of sampling $f(x,y)$ every $2\Delta_o$ instead of every $\Delta_o$. But if sampling at $S = 1/(2\Delta_o)$ is below the Nyquist rate, the decimated image $g[n,m]$ is aliased. The effect of aliasing on the spectrum of $g[n,m]$ can be understood in 1-D from Fig. 2-6(b). The copies of the spectrum of $f(x,y)$ produced by sampling overlap one another, so the high-frequency parts of the signal become distorted. Example 4-1 gives an illustration of aliasing in 2-D.

Example 4-1: Aliasing

The 200 × 200 clown image shown in Fig. 4-5(a) was decimated to the 25 × 25 image shown in Fig. 4-5(b). The decimated image is a poor replica of the original image and could not function as a thumbnail image.

Figure 4-5 Clown image: (a) original 200 × 200 and (b) decimated 25 × 25 version.

4-6 Antialias Lowpass Filtering

Clearly decimation alone is not sufficient to obtain a downsampled image that looks like a demagnified original image. To avoid aliasing, it is necessary to lowpass filter the image before decimation, eliminating the high-spatial-frequency components of the image, so that when the filtered image is decimated, the copies of the spectra do not overlap. In 1-D, in Fig. 2-6(b), had the spectrum been previously lowpass filtered with cutoff frequency $S/2$ Hz, the high-frequency parts of the spectrum would no longer overlap after sampling. This is called antialias filtering.

The same concept applies in discrete space (and time). The periodicity of spectra induced by sampling becomes the periodicity of the DSFT and DTFT, with periods of $2\pi$ in $(\Omega_1, \Omega_2)$ and $\Omega$, respectively. Lowpass filtering can be accomplished by setting certain high-spatial-frequency portions of the 2-D DFT to zero. The purpose of this lowpass filtering is to eliminate the high-spatial-frequency parts of the discrete-space spectrum prior to decimation. This eliminates aliasing, as demonstrated in Example 4-2.

Example 4-2: Antialiasing

The 200 × 200 clown image shown in Fig. 4-5(a) was first lowpass-filtered to the image shown in Fig. 4-6(a), with the spectrum shown in Fig. 4-6(b), then decimated to the 25 × 25 image shown in Fig. 4-6(c). The decimated image is now a good replica of the original image and could function as a thumbnail image. The decimated image pixel values in Fig. 4-6(c) are all equal to certain pixel values in the lowpass-filtered image in Fig. 4-6(a).

◮ A MATLAB code for this example is available on the book website. ◭

4-6.1 Downsampling in the 2-D DFT Domain

The antialiasing approach works well when downsampling by an integer factor $L$, since decimation can be easily performed by keeping only every $L$th row and column of the antialias-filtered image. But downsampling by a non-integer factor $M_d/M_o$ must be performed entirely in the 2-D DFT domain, as follows.

Antialias lowpass filtering is performed by setting to zero the $M_o - M_d$ center rows and columns of the 2-D DFT, in
MATLAB depiction. Decimation is then performed by deleting those $(M_o - M_d)$ center rows and columns of the 2-D DFT. Of course, there is no reason to set to zero rows and columns that will be deleted anyway, so in practice the first step need not be performed. By "center rows and columns" we mean those rows and columns with DFT indices $k$ (horizontal or vertical; add one to all DFT indices to get MATLAB indices) in the $(M_o \times M_o)$ 2-D DFT of the $(M_o \times M_o)$ original image $f[n,m]$:

• For $M_d$ odd: $(M_d+1)/2 \le k \le M_o - (M_d+1)/2$.

• For $M_d$ even: $M_d/2 \le k \le M_o - M_d/2$, then insert a row or column of zeros at index $k = M_d/2$.

Note that there is no need to subdivide some values of $F[k_1,k_2]$ to preserve conjugate symmetry.

The procedure is best illustrated by an example.

Figure 4-6 Clown image: (a) lowpass-filtered 200 × 200 image, (b) spectrum of the image in (a), and (c) unaliased, decimated 25 × 25 version of (a).

Example 4-3: Downsampling

The goal is to downsample the MATLAB (6 × 6) image array

\[
\mathrm{X}(m,n) =
\begin{bmatrix}
3 & 1 & 4 & 1 & 5 & 9\\
2 & 6 & 5 & 3 & 5 & 8\\
9 & 7 & 9 & 3 & 2 & 3\\
8 & 4 & 6 & 2 & 6 & 4\\
3 & 3 & 8 & 3 & 2 & 7\\
9 & 5 & 0 & 2 & 8 & 8
\end{bmatrix}. \tag{4.51}
\]

The corresponding magnitudes of the (6 × 6) 2-D DFT of this array in MATLAB are

\[
|\mathrm{FX}(k_1,k_2)| =
\begin{bmatrix}
173.00 & 23.81 & 20.66 & 15.00 & 20.66 & 23.81\\
6.93 & 25.00 & 7.55 & 14.00 & 7.94 & 21.93\\
11.14 & 19.98 & 16.09 & 15.10 & 14.93 & 4.58\\
9.00 & 16.09 & 19.98 & 1.00 & 19.98 & 16.09\\
11.14 & 4.58 & 14.93 & 15.10 & 16.09 & 19.98\\
6.93 & 21.93 & 7.94 & 14.00 & 7.55 & 25.00
\end{bmatrix}. \tag{4.52}
\]

Setting to zero the middle three rows and columns of the 2-D DFT magnitudes in MATLAB depiction gives the magnitudes

\[
|\mathrm{FG}(k_1,k_2)| =
\begin{bmatrix}
173.00 & 23.81 & 0 & 0 & 0 & 23.81\\
6.93 & 25.00 & 0 & 0 & 0 & 21.93\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
6.93 & 21.93 & 0 & 0 & 0 & 25.00
\end{bmatrix}. \tag{4.53}
\]
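The center-row/column deletion rule above, together with the $M_d^2/M_o^2$ scaling, can be reproduced with a short NumPy sketch (0-based DFT indices rather than MATLAB's 1-based ones; this Python version is an assumed equivalent of the book-website MATLAB code):

```python
import numpy as np

# Downsample the 6x6 array of Example 4-3 to 3x3 entirely in the 2-D DFT domain.
X = np.array([[3, 1, 4, 1, 5, 9],
              [2, 6, 5, 3, 5, 8],
              [9, 7, 9, 3, 2, 3],
              [8, 4, 6, 2, 6, 4],
              [3, 3, 8, 3, 2, 7],
              [9, 5, 0, 2, 8, 8]], dtype=float)
Mo, Md = 6, 3
FX = np.fft.fft2(X)
# Md odd: delete indices (Md+1)/2 .. Mo-(Md+1)/2, i.e. keep [0, 1, 5] here
keep = np.r_[0:(Md + 1) // 2, Mo - (Md - 1) // 2:Mo]
FG = FX[np.ix_(keep, keep)] * Md**2 / Mo**2
g = np.real(np.fft.ifft2(FG))
```

Because the kept bins still satisfy conjugate symmetry, the inverse DFT is real to within round-off, and the DC bin relation FG[0,0] = (Md²/Mo²) FX[0,0] preserves the image mean.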
Deleting the zero-valued rows and columns in the 2-D DFT magnitudes in MATLAB depiction gives the magnitudes

\[
|\mathrm{FG}(k_1,k_2)| =
\begin{bmatrix}
173.00 & 23.81 & 23.81\\
6.93 & 25.00 & 21.93\\
6.93 & 21.93 & 25.00
\end{bmatrix}. \tag{4.54}
\]

Multiplying by $M_d^2/M_o^2 = 3^2/6^2 = 1/4$ and taking the inverse 2-D DFT gives

\[
g[m,n] =
\begin{bmatrix}
5.83 & 1.58 & 6.00\\
6.08 & 6.33 & 3.00\\
6.25 & 3.50 & 4.67
\end{bmatrix}. \tag{4.55}
\]

This is the result we would have gotten by decimating the antialias-filtered original image, but it was performed entirely in the 2-D DFT domain.

◮ A MATLAB code for this example is available on the book website. ◭

Concept Question 4-2: Why must we delete rows and columns of the 2-D DFT array to perform downsampling?

4-7 B-Splines Interpolation

In the preceding sections we examined several different image interpolation methods, some of which perform the interpolation directly in the spatial domain, and others that perform the interpolation in the spatial frequency domain. Now, we introduce yet another method, known as the B-splines interpolation method, with the distinguishing feature that it is the method most commonly used for image interpolation. Unlike with downsampling, B-spline interpolation has no aliasing issues when the sampling interval $\Delta$ is too large. Moreover, unlike with upsampling, B-spline interpolation need not result in blurred images.

B-splines are a family of piecewise polynomial functions, with each polynomial piece having a degree $N$, where $N$ is a non-negative integer. As we will observe later on in this section, a B-spline of order zero is equivalent to the nearest-neighbor interpolation method of Section 3-5.1, but it is simpler to implement than the sinc interpolation formula. Interpolation with B-splines of order $N = 1$ generates linear interpolation, which is used in computer graphics. Another popular member of the B-spline interpolation family is cubic interpolation, corresponding to $N = 3$. Cubic spline interpolation is used in Adobe® Photoshop® and is invoked in MATLAB through the command imresize.

4-7.1 B-Splines

Splines are piecewise-polynomial functions whose polynomial coefficients change at half-integer or integer values of the independent variable, called knots, so that the function and some of its derivatives are continuous at each knot. In 1-D, a B-spline $\beta_N(t)$ of order $N$ is a piecewise polynomial of degree $N$, centered at $t = 0$. The support of $\beta_N(t)$, which is the interval outside of which $\beta_N(t) = 0$, extends between $-(N+1)/2$ and $+(N+1)/2$.

◮ Hence, the duration of $\beta_N(t)$ is $(N+1)$. ◭

Formally, the B-spline function $\beta_N(t)$ is defined as

\[
\beta_N(t) = \int_{-\infty}^{\infty} \left(\frac{\sin(\pi f)}{\pi f}\right)^{N+1} e^{j2\pi f t}\, df, \tag{4.56}
\]

which is equivalent to the inverse Fourier transform of $\mathrm{sinc}^{N+1}(f)$. Recognizing that (a) the inverse Fourier transform of $\mathrm{sinc}(f)$ is a rectangle function and (b) multiplication in the frequency domain is equivalent to convolution in the time domain, it follows that

\[
\beta_N(t) = \underbrace{\mathrm{rect}(t) * \cdots * \mathrm{rect}(t)}_{N+1 \text{ times}}, \tag{4.57}
\]

with

\[
\mathrm{rect}(t) = \begin{cases} 1 & \text{for } |t| < 1/2,\\ 0 & \text{for } |t| > 1/2. \end{cases} \tag{4.58}
\]

Application of Eq. (4.57) for $N = 0$, 1, 2, and 3 leads to:

\[
\beta_0(t) = \mathrm{rect}(t) = \begin{cases} 1 & \text{for } |t| < 1/2,\\ 0 & \text{for } |t| > 1/2, \end{cases} \tag{4.59}
\]

\[
\beta_1(t) = \beta_0(t) * \beta_0(t) = \begin{cases} 1 - |t| & \text{for } |t| < 1,\\ 0 & \text{for } |t| > 1, \end{cases} \tag{4.60}
\]

\[
\beta_2(t) = \beta_1(t) * \beta_0(t) = \begin{cases} 3/4 - t^2 & \text{for } 0 \le |t| \le 1/2,\\ \tfrac{1}{2}\,(3/2 - |t|)^2 & \text{for } 1/2 \le |t| \le 3/2,\\ 0 & \text{for } |t| > 3/2, \end{cases} \tag{4.61}
\]
\[
\beta_3(t) = \beta_2(t) * \beta_0(t) = \begin{cases} 2/3 - t^2 + |t|^3/2 & \text{for } |t| \le 1,\\ (2 - |t|)^3/6 & \text{for } 1 \le |t| \le 2,\\ 0 & \text{for } |t| > 2. \end{cases} \tag{4.62}
\]

Note that in all cases, $\beta_N(t)$ is continuous over its full duration. For $N \ge 1$, the B-spline function $\beta_N(t)$ is continuous and differentiable $(N-1)$ times at all times $t$. For $\beta_2(t)$, the function is continuous across its full interval $(-3/2, 3/2)$, including at $t = 1/2$. Similarly, $\beta_3(t)$ is continuous over its interval $(-2, 2)$, including at $t = 1$.

Plots of the B-splines of order $N = 0$, 1, 2, and 3 are displayed in Fig. 4-7. From the central limit theorem in the field of probability, we know that convolving a function with itself repeatedly makes the function resemble a Gaussian. This is evident in the present case as well.

Figure 4-7 Plots of $\beta_N(t)$ for $N = 0$, 1, 2, and 3: (a) $\beta_0(t)$, (b) $\beta_1(t)$, (c) $\beta_2(t)$, (d) $\beta_3(t)$.

From the standpoint of 1-D and 2-D interpolation of signals and images, the significance of B-splines is in how we can use them to express a signal or image. To guide us through the process, let us assume we have 6 samples $x(n\Delta)$, as shown in Fig. 4-8, extending between $t = 0$ and $t = 5\Delta$. Our objective is to interpolate between these 6 points so as to obtain a continuous function $x(t)$. An important constraint is to ensure that $x(t) = x(n\Delta)$ at the 6 discrete times $n\Delta$.

For a B-spline of a specified order $N$, the interpolation is realized by expressing the desired interpolated signal $x(t)$ as a linear combination of time-shifted B-splines, all of order $N$:

\[
x(t) = \sum_{m=-\infty}^{\infty} c[m]\, \beta_N\!\left(\frac{t}{\Delta} - m\right). \tag{4.63}
\]

Here, $\beta_N(t/\Delta - m)$ is the B-spline function $\beta_N(t)$, with $t$ scaled by the sampling interval $\Delta$ and delayed by a scaled time integer $m$.

◮ The support of $\beta_N(t/\Delta - m)$ is

\[
\left(m - \frac{N+1}{2}\right) < \frac{t}{\Delta} < \left(m + \frac{N+1}{2}\right). \tag{4.64}
\]

That is, $\beta_N(t/\Delta - m) = 0$ outside that interval. ◭

Associated with each value of $m$ is a constant coefficient $c[m]$ whose value is related to the sampled values $x(n\Delta)$ and the order $N$ of the B-spline. More specifically, the values of $c[m]$ have to be chosen such that the aforementioned constraint requiring that $x(t) = x(n\Delta)$ at discrete times $t = n\Delta$ is satisfied. The process is
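The closed forms of Eqs. (4.59)–(4.62) are easy to evaluate and check numerically, including the unit-area property of Exercise 4-3 below. A Python sketch (the helper function is hypothetical, not from the book):

```python
import numpy as np

def bspline(N, t):
    """Evaluate beta_N(t) for N = 0..3 using the closed forms of Eqs. (4.59)-(4.62)."""
    a = np.abs(np.asarray(t, dtype=float))
    if N == 0:
        return np.where(a < 0.5, 1.0, 0.0)
    if N == 1:
        return np.where(a < 1.0, 1.0 - a, 0.0)
    if N == 2:
        return np.where(a <= 0.5, 0.75 - a**2,
                        np.where(a <= 1.5, 0.5 * (1.5 - a)**2, 0.0))
    if N == 3:
        return np.where(a <= 1.0, 2/3 - a**2 + a**3 / 2,
                        np.where(a <= 2.0, (2.0 - a)**3 / 6, 0.0))
    raise ValueError("closed forms coded only for N = 0..3")

# Riemann-sum check that each beta_N integrates to 1
t = np.linspace(-3, 3, 6001)
areas = [bspline(N, t).sum() * (t[1] - t[0]) for N in range(4)]
```

Evaluating at the integers also yields the coefficients used later in Eqs. (4.71) and (4.72): $\beta_2(0) = 3/4$, $\beta_2(\pm 1) = 1/8$, $\beta_3(0) = 2/3$, $\beta_3(\pm 1) = 1/6$.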
described in forthcoming subsections.

Since each B-spline is a piecewise polynomial of order $N$, continuous and differentiable $(N-1)$ times, any linear combination of time-shifted B-splines also constitutes a piecewise polynomial of order $N$, and will also be continuous and differentiable. Thus, the B-splines form a basis—hence, the "B" in their name—and when the basis is used to express $x(t)$, as in Eq. (4.63), $x(t)$ is continuous and differentiable $(N-1)$ times at the knots $t = m\Delta$ if $N$ is odd and at the knots $t = (m + 1/2)\Delta$ if $N$ is even. This feature of B-splines makes them suitable for interpolation, as well as for general representation of signals and images.

Figure 4-8 Samples $x(n\Delta)$ to be interpolated into $x(t)$.

Exercise 4-3: Show that the area under $\beta_N(t)$ is 1.

Answer: $\int_{-\infty}^{\infty} \beta_N(t)\, dt = B(0)$ by entry #11 in Table 2-4. Set $f = 0$ in $\mathrm{sinc}^{N+1}(f)$ and use $\mathrm{sinc}(0) = 1$.

Exercise 4-4: Why does $\beta_N(t)$ look like a Gaussian function for $N \ge 3$?

Answer: Because $\beta_N(t)$ is $\mathrm{rect}(t)$ convolved with itself $(N+1)$ times. By the central limit theorem of probability, repeatedly convolving any square-integrable function with itself results in a function resembling a Gaussian.

Exercise 4-5: Show that the support of $\beta_N(t)$ is $-(N+1)/2 < t < (N+1)/2$.

Answer: The support of $\beta_N(t)$ is the interval outside of which $\beta_N(t) = 0$. The support of $\mathrm{rect}(t)$ is $-1/2 < t < 1/2$. $\beta_N(t)$ is $\mathrm{rect}(t)$ convolved with itself $N+1$ times, which has duration $N+1$ centered at $t = 0$.

4-7.2 N = 0: Nearest-Neighbor Interpolation

For $N = 0$, Eqs. (4.63) and (4.64) lead to

\[
x(t) = \sum_{m=-\infty}^{\infty} c[m]\, \beta_0\!\left(\frac{t}{\Delta} - m\right)
= \sum_{m=-\infty}^{\infty} c[m]\, \mathrm{rect}\!\left(\frac{t}{\Delta} - m\right)
= \sum_{m=-\infty}^{\infty} c[m] \times \begin{cases} 1 & \text{for } m - \tfrac{1}{2} < \tfrac{t}{\Delta} < m + \tfrac{1}{2},\\ 0 & \text{otherwise.} \end{cases} \tag{4.65}
\]

The expression given in Eq. (4.65) consists of a series of adjoining, but not overlapping, rectangle functions. For $m = 0$, $\mathrm{rect}(t/\Delta)$ is centered at $t/\Delta = 0$ and extends over the range $-\tfrac{1}{2} < t/\Delta < \tfrac{1}{2}$. Similarly, for $m = 1$, $\mathrm{rect}(t/\Delta - 1)$ is centered at $t/\Delta = 1$ and extends over $\tfrac{1}{2} < t/\Delta < \tfrac{3}{2}$. Hence, to satisfy the constraint that $x(t) = x(n\Delta)$ at the sampled locations $t = n\Delta$, we need to set $m = n$ and $c[m] = x(n\Delta)$:

\[
x(t) = x(n\Delta)\, \mathrm{rect}\!\left(\frac{t}{\Delta} - n\right), \qquad n - \frac{1}{2} < \frac{t}{\Delta} < n + \frac{1}{2}. \tag{4.66}
\]

The interpolated function $x(t)$ is shown in Fig. 4-9, along with the 6 samples $\{x(n\Delta)\}$. The B-spline representation given by Eq. (4.66) is a nearest-neighbor (NN) interpolation: $x(t)$ in the interval $\{n\Delta \le t \le (n+1)\Delta\}$ is set to the closer (in time) of $x(n\Delta)$ and $x((n+1)\Delta)$. The resulting $x(t)$ is piecewise constant, as shown in Fig. 4-9.

Figure 4-9 B-spline interpolation for $N = 0$.

4-7.3 Linear Interpolation

For $N = 1$, the B-spline $\beta_1(t/\Delta - m)$ assumes the shape of a triangle (Fig. 4-7(b)) centered at $t/\Delta = m$. Figure 4-10(b) displays the triangles centered at $t/\Delta = m$ for $m = 0$ through 5. Also displayed in the same figure are the values of $x(n\Delta)$. To
Figure 4-10 B-spline linear interpolation ($N = 1$): (a) the samples $x(n\Delta)$, (b) the shifted triangles $\beta_1(t/\Delta - m)$, (c) the interpolated $x(t)$.
satisfy Eq. (4.63) for $N = 1$, namely

\[
x(t) = \sum_{m=-\infty}^{\infty} c[m]\, \beta_1\!\left(\frac{t}{\Delta} - m\right), \tag{4.67}
\]

as well as meet the condition that $x(t) = x(n\Delta)$ at the 6 given points, we should set $m = n$ and select

\[
c[m] = c[n] = x(n\Delta) \qquad (N = 1).
\]

Consequently, for any integer $n$, $x(t)$ at time $t$ between $n\Delta$ and $(n+1)\Delta$ is a weighted average given by

\[
x(t) = x(n\Delta)\left((n+1) - \frac{t}{\Delta}\right) + x((n+1)\Delta)\left(\frac{t}{\Delta} - n\right). \tag{4.68a}
\]

The associated duration is

\[
n\Delta \le t \le (n+1)\Delta. \tag{4.68b}
\]

Application of the B-spline linear interpolation to the given samples $x(n\Delta)$ leads to the continuous function $x(t)$ shown in Fig. 4-10(c). The linear interpolation amounts to setting $\{x(t),\; n\Delta \le t \le (n+1)\Delta\}$ to lie on the straight line connecting $x(n\Delta)$ to $x((n+1)\Delta)$.

Exercise 4-6: Given the samples $\{x(0), x(\Delta), x(2\Delta), x(3\Delta)\} = \{7, 4, 3, 2\}$, compute $x(\Delta/3)$ by interpolation using: (a) nearest neighbor; (b) linear.

Answer: (a) $\Delta/3$ is closer to 0 than to $\Delta$, so $x(\Delta/3) = x(0) = 7$.
(b) $x(\Delta/3) = (2/3)\,x(0) + (1/3)\,x(\Delta) = (2/3)(7) + (1/3)(4) = 6$.

4-7.4 Quadratic Interpolation

For $N \ge 2$, interpolation using B-splines becomes more complicated than for $N = 0$ and 1 because the supports of the basis functions $\{\beta_N(t/\Delta - m)\}$ overlap in time for different values of $m$, as shown in Fig. 4-11 for $N = 2$. Unlike the cases $N = 0$ and $N = 1$, wherein we set $c[m] = x(m\Delta)$, now $c[m]$ is related to values of more than one of the discrete samples $\{x(n\Delta)\}$.

Figure 4-11 B-splines $\beta_2(t/\Delta - m)$ overlap in time.

For $N \ge 2$, the relationships between the coefficients $\{c[m]\}$ and the samples $\{x(n\Delta)\}$ can be derived by starting with Eq. (4.63),

\[
x(t) = \sum_{m=-\infty}^{\infty} c[m]\, \beta_N\!\left(\frac{t}{\Delta} - m\right), \tag{4.69}
\]

and then setting $t = n\Delta$, which gives

\[
x(n\Delta) = \sum_{m=-\infty}^{\infty} c[m]\, \beta_N(n - m) = c[n] * \beta_N(n), \tag{4.70}
\]

where use was made of the discrete-time convolution relation given by Eq. (2.71a).

For $N = 2$, Eq. (4.61) indicates that $\beta_2(n) \ne 0$ only for integers $n = \{-1, 0, 1\}$. Hence, the discrete-time convolution given by Eq. (4.70) simplifies to

\[
x(n\Delta) = c[n-1]\, \beta_2(1) + c[n]\, \beta_2(0) + c[n+1]\, \beta_2(-1)
= \tfrac{1}{8}\, c[n-1] + \tfrac{3}{4}\, c[n] + \tfrac{1}{8}\, c[n+1] \qquad (N = 2). \tag{4.71}
\]

In the second step, the constant coefficients were computed using Eq. (4.61) for $\beta_2(t)$. The sum truncates because $\beta_2(t) = 0$ for $|t| \ge 3/2$, so only three basis functions overlap at any specific time $t$, as is evident in Fig. 4-11.

Similarly, for $N = 3$, $\beta_3(t) = 0$ for $|t| \ge 2$, which also leads to a sum of three terms:

\[
x(n\Delta) = c[n-1]\, \beta_3(1) + c[n]\, \beta_3(0) + c[n+1]\, \beta_3(-1)
= \tfrac{1}{6}\, c[n-1] + \tfrac{4}{6}\, c[n] + \tfrac{1}{6}\, c[n+1] \qquad (N = 3). \tag{4.72}
\]

If $N$ is increased beyond 3, the number of terms increases, but the proposed method of solution to determine the values of $c[n]$ remains the same. Specifically, we offer the following recipe:

(1) Delay $\{x(n\Delta)\}$ to $\{\tilde{x}(n\Delta)\}$ to make it causal. Compute the $N_o$th-order DFT $X[k]$ of $\{\tilde{x}(n\Delta)\}$, where $N_o$ is the number of samples $\{x(n\Delta)\}$.
(2) Delay $\{\beta_2(-1), \beta_2(0), \beta_2(1)\}$ by 1 to $\{\tilde{\beta}_2(t)\}$ to make it causal. Compute the $N_o$th-order DFT $B[k]$ of

\[
\{\beta_2(-1), \beta_2(0), \beta_2(1)\} = \left\{\frac{1}{8}, \frac{3}{4}, \frac{1}{8}\right\}.
\]

(3) Compute the $N_o$th-order inverse DFT to determine $c[n]$:

\[
c[n] = \mathrm{DFT}^{-1}\left\{\frac{X[k]}{B[k]}\right\}. \tag{4.73}
\]

Example 4-4: Quadratic Spline Interpolation

Figure 4-12(a) displays samples $\{x(n\Delta)\}$ with $\Delta = 1$. Obtain an interpolated version $x(t)$ using quadratic splines.

Solution: From Fig. 4-12(a), we deduce that $x(n)$ has 6 nonzero samples and is given by

\[
x(n) = \{x(-3), x(-2), x(-1), x(0), x(1), x(2)\} = \{3, 19, 11, 17, 26, 4\}.
\]

As noted earlier,

\[
\beta_2(n) = \{\beta_2(-1), \beta_2(0), \beta_2(1)\} = \left\{\frac{1}{8}, \frac{3}{4}, \frac{1}{8}\right\}.
\]

Inserting $x(n)$ and $\beta_2(n)$ in Eq. (4.70) establishes the convolution problem

\[
\{3, 19, 11, 17, 26, 4\} = \left\{\frac{1}{8}, \frac{3}{4}, \frac{1}{8}\right\} * c[n].
\]

Following the solution recipe outlined earlier—and demonstrated in Section 2-9—the deconvolution solution is

\[
c[n] = \{24, 8, 16, 32\},
\]

and, therefore, the interpolated continuous function is

\[
x(t) = 24\,\beta_2(t+2) + 8\,\beta_2(t+1) + 16\,\beta_2(t) + 32\,\beta_2(t-1).
\]

A plot of $x(t)$ is shown in Fig. 4-12(b). It is evident that $x(t)$ has the same values as $x(n)$ at $t = \{-3, -2, -1, 0, 1, 2\}$.

◮ Note: The MATLAB code for solving Example 4-4 is available on the book website. ◭

Figure 4-12 (a) Original samples $x(n)$ and (b) interpolated function $x(t)$.

Concept Question 4-3: What is the difference between zero-order-spline interpolation and nearest-neighbor interpolation?

Concept Question 4-4: Why use cubic interpolation, when quadratic interpolation produces smooth curves?
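The deconvolution of Example 4-4 can be verified with a few lines of NumPy (a sketch of the three-step DFT recipe; the book's own MATLAB code is on its website):

```python
import numpy as np

x = np.array([3., 19., 11., 17., 26., 4.])   # x(n) for n = -3..2, delayed to be causal
b = np.zeros(6)
b[:3] = [1/8, 3/4, 1/8]                      # causal beta_2 samples, zero-padded to No = 6
c = np.real(np.fft.ifft(np.fft.fft(x) / np.fft.fft(b)))  # c[n] = DFT^{-1}{X[k]/B[k]}
```

The division is well conditioned here because $|B[k]| = 3/4 + (1/4)\cos(2\pi k/6)$ never vanishes, and the recovered `c` reproduces the book's coefficients $\{24, 8, 16, 32\}$ (followed by zeros).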
Concept Question 4-5: Why do quadratic and cubic interpolation require computation of coefficients, while linear interpolation does not?

Exercise 4-7: In Eq. (4.63), why isn't $c[n] = x(n\Delta)$ for $N \ge 2$?

Answer: Because three of the basis functions $\beta_N(t/\Delta - m)$ overlap for any $t$.

4-8 2-D Spline Interpolation

1-D interpolation using splines generalizes directly to 2-D. The task now is to obtain a continuous function $f(x,y)$ from samples $\{f(n\Delta, m\Delta)\}$.

The 2-D spline functions are separable products of the 1-D spline functions:

\[
\beta_N(x,y) = \beta_N(x)\, \beta_N(y). \tag{4.74}
\]

For example, for $N = 1$ we have

\[
\beta_1(x,y) = \begin{cases} (1-|x|)(1-|y|) & \text{for } 0 \le |x|, |y| \le 1,\\ 0 & \text{otherwise,} \end{cases} \tag{4.75}
\]

which is pyramidal in shape. Interpolation using $\beta_1(x,y)$ is called bilinear interpolation, which is a misnomer because $\beta_1(x,y)$ includes a product term, $|x|\,|y|$.

In 2-D, Eq. (4.63) becomes

\[
f(x,y) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} c[n,m]\, \beta_N\!\left(\frac{x}{\Delta} - n\right) \beta_N\!\left(\frac{y}{\Delta} - m\right). \tag{4.76}
\]

For $N = 0$ and $N = 1$, $c[n,m] = f(n\Delta, m\Delta)$, but for $N \ge 2$, $c[n,m]$ is computed from $\{f(n\Delta, m\Delta)\}$ using a 2-D version of the DFT recipe outlined in Section 4-7.4.

Each image location between samples is surrounded by its four neighbors:

\[
\{\, f(n\Delta, m\Delta),\; f((n+1)\Delta, m\Delta),\; f(n\Delta, (m+1)\Delta),\; f((n+1)\Delta, (m+1)\Delta) \,\}. \tag{4.77}
\]

4-8.2 Bilinear Image Interpolation

Linear interpolation is performed as follows:

(1) Each image location $(x_0, y_0)$ has four nearest sampled values, given by Eq. (4.77) with a unique set of values for $n$ and $m$.

(2) Compute:

\[
f(x_0, m\Delta) = f(n\Delta, m\Delta)\left(n + 1 - \frac{x_0}{\Delta}\right) + f((n+1)\Delta, m\Delta)\left(\frac{x_0}{\Delta} - n\right), \tag{4.78a}
\]
\[
f(x_0, (m+1)\Delta) = f(n\Delta, (m+1)\Delta)\left(n + 1 - \frac{x_0}{\Delta}\right) + f((n+1)\Delta, (m+1)\Delta)\left(\frac{x_0}{\Delta} - n\right), \tag{4.78b}
\]

and then combine them to find

\[
f(x_0, y_0) = f(x_0, m\Delta)\left(m + 1 - \frac{y_0}{\Delta}\right) + f(x_0, (m+1)\Delta)\left(\frac{y_0}{\Delta} - m\right). \tag{4.79}
\]

The preceding computation linearly interpolates in $x$ for $y = m\Delta$, and again for $y = (m+1)\Delta$, and then linearly interpolates in $y$ for $x = x_0$.

4-8.3 Cubic Spline Interpolation

We describe the cubic spline interpolation procedure through an example. Figure 4-13(a) shows a synthetic-aperture radar (SAR) image of a metropolitan area, and part (b) of the figure
shows a magnified version of the central part of the original
image using cubic-spline interpolation. The magnification factor
4-8.1 Nearest-Neighbor (NN) image is 3 along each direction. The interpolation uses
Interpolation
x[n, m] = c[n, m] ∗ ∗(β3[n] β3 [m]) (4.80)
NN interpolation of images works in the same way as NN
interpolation of 1-D signals. After locating the four samples to compute x[n, m] at x = n/3 and y = m/3 for integers n
surrounding a location (x0 , y0 )—thereby specifying the appli- and m. The values of c[n, m] are determined using the DFT
cable values of n and m, the value assigned to f (x, y) at location recipe outlined in Section 4-7.4 in combination with the product
(x0 , y0 ) is the value of the nearest-location neighbor among those of cubic spline functions given by Eq. (4.62) evaluated at
(−1, 0, 1):

[ β3(−1) ]                              [ 1/6 ]                              [ 1  4  1 ]
[ β3(0)  ] [ β3(−1)  β3(0)  β3(1) ]  =  [ 4/6 ] [ 1/6  4/6  1/6 ]  =  (1/36) [ 4 16  4 ] .    (4.81)
[ β3(1)  ]                              [ 1/6 ]                              [ 1  4  1 ]

Exercise 4-8: The "image"

[ 1  2 ]
[ 3  4 ]

is interpolated using bilinear interpolation. What is the interpolated value at the center of the image?

Answer: (1/4)(1 + 2 + 3 + 4) = 2.5.

Figure 4-13 (a) Original SAR image to be magnified; (b) SAR image magnified by 3 using cubic splines. The image in (b) is the (200 × 200) central part of the synthetic-aperture radar (SAR) image in (a) magnified by a factor of 3 along each direction using cubic-spline interpolation.

4-9 Comparison of 2-D Interpolation Methods

In this chapter, we have discussed three image interpolation methods: the sinc interpolation formula in Section 4-1.1, the Lanczos interpolation formula in Section 4-1.2, and the B-spline interpolation method in Section 4-8. To compare the effectiveness of the different methods, we chose the original clown image shown in Fig. 4-14(a) and then downsampled it by a factor of 9 by retaining only 1/9 of the original pixels (1/3 along each direction). The locations of the downsampled pixels and the resulting (67 × 67) downsampled image are shown in parts (b) and (c) of Fig. 4-14, respectively.

Next, we applied the various interpolation methods listed earlier to the downsampled clown image in Fig. 4-14(c) so as to generate an interpolated version of the original clown image. This is equivalent to magnifying the (67 × 67) downsampled clown image in Fig. 4-14(c) by a factor of 3 in each dimension. The results, displayed in Fig. 4-15, deserve the following commentary:

(a) Relative to the original clown image, of the first three interpolated images, the Lanczos with a = 3 is slightly better than that with a = 2, and both are better than the sinc-interpolated image.

(b) Among the B-spline images, significant improvement is realized in image quality as N is increased from N = 0 (nearest neighbor) to N = 1 (bilinear) and then to N = 3 (cubic). MATLAB code for this example is available on the book website.
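The downsampling used to create Fig. 4-14(c) simply retains every third pixel along each direction. In Python-style slicing (a sketch added for illustration, with a stand-in array in place of the clown image):

```python
# Retain 1/9 of the pixels (every third along each axis), as used to
# produce the 67 x 67 downsampled image from a 200 x 200 original.
img = [[r * 200 + q for q in range(200)] for r in range(200)]  # stand-in image
down = [row[::3] for row in img[::3]]
print(len(down), len(down[0]))  # 67 67
```

Sampling indices 0, 3, ..., 198 yields 67 retained samples per axis, hence the (67 × 67) size quoted above.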
Figure 4-15 Comparison of three interpolation methods: (a) sinc interpolation; (b) and (c) Lanczos interpolation with a = 2 and a = 3, respectively; and (d) to (f) B-spline with N = 0 (nearest neighbor), N = 1 (linear), and N = 3 (cubic spline), respectively.
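Bilinear (N = 1) interpolation, one of the methods compared here, reduces to Eqs. (4.78) and (4.79). A minimal pure-Python sketch (an editor's illustration, not the book's MATLAB code), assuming ∆ = 1 and a grid indexed f[m][n]:

```python
def bilinear(f, x0, y0):
    # f: grid of samples with spacing Δ = 1, indexed f[m][n] (m ↔ y, n ↔ x).
    n, m = int(x0), int(y0)                # indices of the lower neighbors
    # Eq. (4.78a): interpolate in x along the row y = mΔ
    fa = f[m][n] * (n + 1 - x0) + f[m][n + 1] * (x0 - n)
    # Eq. (4.78b): interpolate in x along the row y = (m+1)Δ
    fb = f[m + 1][n] * (n + 1 - x0) + f[m + 1][n + 1] * (x0 - n)
    # Eq. (4.79): interpolate in y between the two row results
    return fa * (m + 1 - y0) + fb * (y0 - m)

# Center of the 2 x 2 "image" [[1, 2], [3, 4]] (cf. Exercise 4-8)
print(bilinear([[1, 2], [3, 4]], 0.5, 0.5))  # 2.5
```

At the center of the 2 × 2 image of Exercise 4-8 this returns (1 + 2 + 3 + 4)/4 = 2.5, and at a sample location it returns the sample itself.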
4-10 Examples of Image Interpolation Applications

Example 4-5: Image Rotation

Recall from Eq. (3.12) that rotating an image f (x, y) by an angle θ leads to a rotated image g(x, y) given by

g(x, y) = f (x cos θ + y sin θ, y cos θ − x sin θ).    (4.82)

Sampling g(x, y) at x = n∆ and y = m∆ gives

g(n∆, m∆) = f (n∆ cos θ + m∆ sin θ, m∆ cos θ − n∆ sin θ),    (4.83)

which clearly requires interpolation of f (x, y) at the required points from its given samples f (n∆, m∆). In practice, nearest-neighbor (NN) interpolation is usually sufficient to realize the necessary interpolation.

Figure 4-16(a) displays a zero-padded clown image, and part (b) displays the image after rotation by 45° using NN interpolation. The rotated image bears a very good resemblance to the rotated original. The MATLAB code for this figure is on the book website.

Example 4-6: Exponential Image Warping

Image warping or morphing entails creating a new image g(x, y) from the original image f (x, y), where

g(x, y) = f (Tx(x), Ty(y))    (4.84)

for some 1-D transformations Tx(x) and Ty(y). Image shifting by (xo, yo) can be implemented using

Tx(x) = x − xo  and  Ty(y) = y − yo.

Magnification by a factor of a can be implemented using

Tx(x) = x/a  and  Ty(y) = y/a.

More interesting warping of images can be performed using nonlinear transformations, as demonstrated by the following illustrations.

Figure 4-14 (a) Original clown image; (b) samples to be interpolated; (c) downsampled image. The original clown image in (a) was downsampled to the image in (c) by sampling only 1/9 of the pixels of the original image.
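The rotation of Eq. (4.82) with NN interpolation can be sketched in a few lines. This pure-Python illustration (an editor's sketch, not the book's MATLAB code) assumes rotation about the image center with unit sample spacing; out-of-range source points are set to 0:

```python
import math

def rotate_nn(img, theta_deg):
    # g(x, y) = f(x cos θ + y sin θ, y cos θ − x sin θ), Eq. (4.82),
    # sampled on the output grid and resolved by nearest-neighbor lookup.
    N = len(img)
    c = (N - 1) / 2                       # rotate about the image center
    th = math.radians(theta_deg)
    out = [[0] * N for _ in range(N)]
    for r in range(N):                    # r ↔ y, q ↔ x
        for q in range(N):
            x, y = q - c, r - c
            xs = x * math.cos(th) + y * math.sin(th)
            ys = y * math.cos(th) - x * math.sin(th)
            rs, qs = round(ys + c), round(xs + c)   # nearest source sample
            if 0 <= rs < N and 0 <= qs < N:
                out[r][q] = img[rs][qs]
    return out

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(rotate_nn(rotate_nn(img, 90), -90) == img)  # True: 90° round trip
```

For multiples of 90° every output point lands exactly on a source sample, so a rotation followed by its inverse recovers the original image.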
(a) Warped Clown Image

After shifting the clown image so that the origin [n, m] = [0, 0] is at the center of the image, the image was warped using Tn(n) = n e^(−|n|/300) and Tm(m) = m e^(−|m|/300) and NN interpolation. The warped image is shown in Fig. 4-17(a). Repeating the process with the space constant 300 replaced with 200 leads to greater warping, as shown in Fig. 4-17(b).

Example 4-7: Square-Root and Inverse Image Warping

Another form of image warping is realized by applying a square-root function of the form

Tn(n) = n √|n| / 25  and  Tm(m) = m √|m| / 25.

Repetition of the steps described in the previous example, but using the square-root transformation instead, leads to the images in Fig. 4-18(a). The MATLAB code for this figure is available on the book website.

Another transformation is the inverse function given by

Tn(n) = n|n|/a  and  Tm(m) = m|m|/a.

The result of warping the clown image with a = 300 is shown in Fig. 4-18(b). The MATLAB code for this figure is available on the book website.

Figure 4-16 Clown image before and after rotation by 45°: (a) zero-padded clown image; (b) clown image rotated by 45°.

Concept Question 4-6: Provide four applications of interpolation.

Exercise 4-9: The "image"

[ 1  2 ]
[ 3  4 ]

is rotated counterclockwise 90°. What is the result?

Answer:

[ 2  4 ]
[ 1  3 ]

Exercise 4-10: The "image"

[  4   8 ]
[ 12  16 ]

is magnified by a factor of three. What is the result, using: (a) NN; (b) bilinear interpolation?
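Part (a) of Exercise 4-10 amounts to replicating each sample into a 3 × 3 block. A one-function sketch (an illustration added here, not from the book):

```python
def magnify_nn(img, L):
    # Nearest-neighbor magnification by an integer factor L:
    # every sample is replicated into an L x L block.
    return [[img[r // L][q // L] for q in range(L * len(img[0]))]
            for r in range(L * len(img))]

g = magnify_nn([[4, 8], [12, 16]], 3)
print(g[0], g[3])  # [4, 4, 4, 8, 8, 8] [12, 12, 12, 16, 16, 16]
```

Each of the two input rows becomes three identical output rows, matching the printed answer to part (a).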
Figure 4-17 Nonlinear image warping with space constants of (a) 300 and (b) 200: (a) Tn(n) = n e^(−|n|/300) and Tm(m) = m e^(−|m|/300); (b) Tn(n) = n e^(−|n|/200) and Tm(m) = m e^(−|m|/200).

Figure 4-18 Clown image warped with (a) square-root transformation, Tn(n) = n √|n| / 25 and Tm(m) = m √|m| / 25, and (b) inverse transformation, Tn(n) = n|n|/300 and Tm(m) = m|m|/300.
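The warps in Figs. 4-17 and 4-18 apply a 1-D coordinate transformation along each axis, with NN lookup of the source sample. A 1-D sketch of the idea (the space constant 5 used here is chosen only to make the effect visible on a short signal; it is not a value from the book):

```python
import math

def warp_nn_1d(samples, T):
    # g[n] = f[T(n)] with nearest-neighbor lookup, coordinates measured
    # from the center of the array; out-of-range sources map to 0.
    N = len(samples)
    c = (N - 1) // 2
    out = []
    for n in range(N):
        src = round(T(n - c)) + c
        out.append(samples[src] if 0 <= src < N else 0)
    return out

T = lambda n: n * math.exp(-abs(n) / 5)   # exponential warp, space constant 5
print(warp_nn_1d(list(range(11)), T))
```

Because T(0) = 0, the center sample is always unchanged; samples farther from the center are pulled inward, which is what produces the pinching visible in Fig. 4-17.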
Answer:

(a)

[  4  4  4  8  8  8 ]
[  4  4  4  8  8  8 ]
[  4  4  4  8  8  8 ]
[ 12 12 12 16 16 16 ]
[ 12 12 12 16 16 16 ]
[ 12 12 12 16 16 16 ]

(b)

[ 1  2 1 2  4 2 ]
[ 2  4 2 4  8 4 ]
[ 1  2 1 2  4 2 ]
[ 3  6 3 4  8 4 ]
[ 6 12 6 8 16 8 ]
[ 3  6 3 4  8 4 ]
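The part (b) answer can be reproduced by placing each sample at the center of its 3 × 3 block and weighting it by the separable triangle kernel [1/2, 1, 1/2] along each axis (the linear B-spline weights). Since the blocks do not overlap here, this reduces to per-sample weighting. A sketch (an added illustration; the block-centered grid convention is an assumption made to match the printed answer):

```python
def magnify_linear3(img):
    # Place each sample at the center of a 3 x 3 block and weight it by
    # the separable triangle kernel [1/2, 1, 1/2] along each axis.
    k = [0.5, 1.0, 0.5]
    R, Q = len(img), len(img[0])
    out = [[0.0] * (3 * Q) for _ in range(3 * R)]
    for r in range(R):
        for q in range(Q):
            for dr in range(3):
                for dq in range(3):
                    out[3 * r + dr][3 * q + dq] = img[r][q] * k[dr] * k[dq]
    return out

g = magnify_linear3([[4, 8], [12, 16]])
print(g[0])  # [1.0, 2.0, 1.0, 2.0, 4.0, 2.0]
```

Each 3 × 3 block is the sample value times (1/4)[1 2 1; 2 4 2; 1 2 1], matching the printed matrix row by row.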
Summary

Concepts

• Interpolation is "connecting the dots" of 1-D samples, and "filling in the gaps" of 2-D samples.
• Interpolation can be used to rotate and to warp or "morph" images.
• Upsampling an image can be performed by inserting rows and columns of zeros in the 2-D DFT of the image. Care must be taken to preserve conjugate symmetry in the 2-D DFT.
• Downsampling an image can be performed by deleting rows and columns in the 2-D DFT of the image. Care must be taken to preserve conjugate symmetry in the 2-D DFT. Deleting rows and columns performs lowpass filtering so that the downsampled image is not aliased.
• B-splines are piecewise polynomial functions that can be used to interpolate samples in 1-D and 2-D.
• For N ≥ 2, computation of the coefficients {c[m]} from samples {x(m∆)} can be formulated as a deconvolution problem.
• 2-D interpolation using B-splines is a generalization of 1-D interpolation using B-splines.

Mathematical Formulae

B-spline:
βN(t) = rect(t) ∗ ··· ∗ rect(t)  (N + 1 times)

B-spline:
β0(t) = rect(t) = 1 for |t| < 1/2; 0 for |t| > 1/2

B-spline:
β1(t) = 1 − |t| for |t| ≤ 1; 0 for |t| ≥ 1

B-spline:
β2(t) = 3/4 − t² for 0 ≤ |t| ≤ 1/2; (3/2 − |t|)²/2 for 1/2 ≤ |t| ≤ 3/2; 0 for |t| ≥ 3/2

B-spline:
β3(t) = 2/3 − t² + |t|³/2 for |t| ≤ 1; (2 − |t|)³/6 for 1 ≤ |t| ≤ 2; 0 for |t| ≥ 2

B-spline 1-D interpolation:
x(t) = ∑_{m=−∞..∞} c[m] βN(t/∆ − m)

Nearest-neighbor 1-D interpolation:
x(t) = x(n∆) for |t − n∆| < ∆/2

Linear 1-D interpolation:
x(t) = x(n∆)((n + 1) − t/∆) + x((n + 1)∆)(t/∆ − n) for n∆ ≤ t ≤ (n + 1)∆

Important Terms: Provide definitions or explain the meaning of the following terms: B-spline, downsampling, interpolation, Lanczos function, nearest-neighbor, thumbnail image, upsampling.
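The β2 formula in the table can be checked against the samples { β2(−1), β2(0), β2(1) } = { 1/8, 3/4, 1/8 } used in Example 4-4; a quick sketch (added here for illustration):

```python
def beta2(t):
    # Quadratic B-spline from the summary formulae.
    a = abs(t)
    if a <= 0.5:
        return 0.75 - t * t
    if a <= 1.5:
        return (1.5 - a) ** 2 / 2
    return 0.0

print([beta2(t) for t in (-1, 0, 1)])  # [0.125, 0.75, 0.125]
```

At t = ±1 the middle branch gives (3/2 − 1)²/2 = 1/8, and at t = 0 the first branch gives 3/4, as claimed.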

PROBLEMS

Section 4-3: Upsampling and Interpolation

4.1 Write a MATLAB program that loads the 50 × 50 image in tinyclown.mat and magnifies it by four using upsampling. Note that 50 is an even number.

4.2 Write a MATLAB program that loads the 50 × 50 image in tinyclown.mat, deletes the last row and column to make it 49 × 49, and magnifies it by four using upsampling. This is easier than Problem 4.1 since 49 is an odd number.

4.3 Write a MATLAB program that loads the 64 × 64 image in tinyletters.mat and magnifies it by four using upsampling. Note that 64 is an even number.

4.4 Write a MATLAB program that loads the 64 × 64 image in tinyletters.mat, deletes the last row and column to make it 63 × 63, and magnifies it by four using upsampling. This is easier than Problem 4.3 since 63 is an odd number.

Section 4-5: Downsampling

4.5 Write a MATLAB program that loads the 200 × 200 image in clown.mat, antialias lowpass filters it, and demagnifies it by four using downsampling. (This is how the image in tinyclown.mat was created.)

4.6 Repeat Problem 4.5, but skip the antialias lowpass filter.

4.7 Write a MATLAB program that loads the 256 × 256 image in letters.mat, antialias lowpass filters it, and demagnifies it by four using downsampling. (This is how the image in tinyletters.mat was created.)

4.8 Repeat Problem 4.7, but skip the antialias lowpass filter.

4.9 Show that if the sinc interpolation formula is used to upsample an M × M image f [n, m] to an N × N image g[n, m] by an integer factor L (so that N = ML), then g[nL, mL] = f [n, m], so that the values of f [n, m] are preserved after upsampling.

Section 4-8: 2-D Spline Interpolation

4.10 Write a MATLAB program that loads the 50 × 50 image in tinyclown.mat and magnifies it by four using nearest-neighbor interpolation.

4.14 Another way to derive the formula for linear interpolation is as follows: The goal is to interpolate the four points { f (0, 0), f (1, 0), f (0, 1), f (1, 1) } using a formula f (x, y) = f0 + f1 x + f2 y + f3 xy, where { f0, f1, f2, f3 } are found from the given points. This extends to

{ f (n, m), f (n + 1, m), f (n, m + 1), f (n + 1, m + 1) }

for any integers n, m.

(a) Set up a linear system of equations with unknowns { f0, f1, f2, f3 } and knowns { f (0, 0), f (1, 0), f (0, 1), f (1, 1) }.

(b) Solve the system to obtain a closed-form expression for f (x, y) as a function of { f (0, 0), f (0, 1), f (1, 0), f (1, 1) }.

4.15 The image

[ a b c ]
[ d e f ]
[ g h i ]

is rotated 90° clockwise. What is the result?

4.16 The image

[ a b c ]
[ d e f ]
[ g h i ]

is rotated 45° clockwise and magnified by 2 using linear interpolation.