Image Processing For Engineers
Image Processing For Engineers
For Engineers
: ip.eecs.umich.edu
“book” — 2016/3/15 — 6:35 — page iii — #3
IMAGE
PROCESSING
FOR ENGINEERS
Andrew E. Yagle
The University of Michigan
Fawwaz T. Ulaby
The University of Michigan
Copyright 2018 Andrew E. Yagle and Fawwaz T. Ulaby
This book is published by Michigan Publishing under an agreement with the authors.
It is made available free of charge in electronic form to any student or instructor
interested in the subject matter.
Raw Improved
Image image Image image
Image display
formation processing
Image storage/
Sensor transmission
Image analysis
Figure 1-1 After an image is formed by a sensor, image processing tools are applied for many purposes, including changing its scale and
orientation, improving its information content, or reducing its digital size.
1-1 OPTICAL IMAGERS 3
Frequency (Hz)
Temperature of object
Temperature when
of object radiation
when radiationisismost
mostintense
intense
Converging lens
of focal length f
do di
Blue Green
1.0
V [n1 , m1 ] = {Vred [n1 , m1 ], Vgreen [n1 , m1 ], Vblue [n1 , m1 ]} distribu-
0.8 tion: discrete 2-D array of the voltage outputs of the CCD or
Red photo detector array.
0.6
B[n2 , m2 ] = {Bred [n2 , m2 ], Bgreen [n2 , m2 ], Bblue [n2 , m2 ]} distribu-
0.4
tion: discrete 2-D array of the brightness across the LCD array.
0.2
0
0.40 0.45 0.50 0.55 0.60 0.65 0.70
◮ Our notation uses parentheses ( ) with continuous-space
Wavelength λ ( μm)
signals and images, as in Io (x′ , y′ ), and square brackets [ ]
with discrete-space images, as in V [n, m]. ◭
Figure 1-5 Spectral sensitivity plots for photodetectors.
(Courtesy Nikon Corporation.)
1 µ m, it is necessary to use a filter to block the IR spectrum and The three associated transformations are:
to place red (R), green (G), or blue (B) filters over each pixel
so as to separate the visible spectrum of the incident light into
(1) Optical Transformation: from Io (x′ , y′ ; λ ) to Ii (x, y; λ ).
the three primary colors. Thus, the array elements depicted in
Fig. 1-4 in red respond to red light, and a similar correspondence
applies to those depicted in green and blue. Typical examples (2) Detection Transformation: from Ii (x, y; λ ) to V [n1 , m1 ].
of color sensitivity spectra are shown in Fig. 1-5 for a Nikon
camera.
Regardless of the specific detection mechanism (CCD or (3) Display Transformation: from V [n1 , m1 ] to B[n2 , m2 ].
APS), the array output is transferred to a digital storage device
with specific markers denoting the location of each element of
the array and its color code (R, G, or B). Each array consists of
three subarrays, one for red, another for green, and a third for Indices [n1 , m1 ] and [n2 , m2 ] vary over certain ranges of discrete
blue. This information is then used to synchronize the output values, depending on the chosen notation. For a discrete image,
of the 2-D detector array with the 2-D pixel arrangement on an the two common formats are:
LCD (liquid crystal display) or other electronic displays.
Io (x′ , y′ ; λ ): continuous intensity brightness in the object plane, (2) Corner Coordinate System: In Fig. 1-7(b), indices n and m
with (x′ , y′ ) denoting the coordinates of the object plane. of V [n, m] start at 1 (rather than zero). Image size is M × N.
6 CHAPTER 1 IMAGING SENSORS
Discretized version
of apple
Optical transformation Detection transformation Display transformation
]
n 1,m1
[
V red
Red
Io(x′, y′) Ii(x, y) filter B[n2, m2]
]
m1
[ n 1,
en
V gre
Green
filter
]
,m1
Blue [e n 1
Object plane Image plane filter V blu Display array
Detector array
Figure 1-6 Io (x′ , y′ ; λ ) and Ii (x, y; λ ) are continuous scene brightness and image intensities, whereas V [n1 , m1 ] and B[n2 , m2 ] are discrete
images of the detected voltage and displayed brightness, respectively.
1-1 OPTICAL IMAGERS 7
Image Notation
N columns
(0, M)
M rows
(0,1)
n
(−N,0) (−1,0) (0,0) (1,0) (N,0) m
s
where J1 (γ ) is the first-order Bessel function of the first kind, x2 + y2
and sin θ = , (1.5)
πD x2 + y2 + di2
γ= sin θ . (1.3)
λ and Eq. (1.4) can be rewritten as
Here, λ is the wavelength of the light (assumed to be monochro-
matic for simplicity) and D is the diameter of the converging Ii (x, y) 2J1 (γ ) 2
lens. The normalized form of Eq. (1.2) represents the impulse h(x, y) = = , (1.6)
I0 γ
response h(θ ) of the imaging system,
with s
Ii (θ ) 2J1 (γ ) 2 πD x2 + y2
h(θ ) = = . (1.4) γ= . (1.7)
Io γ λ x2 + y2 + di2
For a 2-D image, the impulse response is called the point spread The expressions given by Eqs. (1.2) through (1.7) pertain to
function (PSF). coherent monochromatic light. Unless the light source is a laser,
Detector arrays are arranged in rectangular grids. For a pixel the light source usually is panchromatic, in which case the
at (x, y) in the image plane (Fig. 1-9), diffraction pattern that would be detected by each of the three-
color detector arrays becomes averaged over the wavelength
range of that array. The resultant diffraction pattern maintains
the general shape of the pattern in Fig. 1-8(b), but it exhibits a
Lens of diameter D gentler variation with θ (with no distinct minima). Here, h(x, y)
denotes the PSF in rectangular coordinates relative to the center
of the image plane.
y
Source s
θ x
πD
γ= sin θ y
λ Along y axis: sin θ = q
y2 + di2
γ s
−5 5 x2 + y2
For an image pixel at (x, y): sin θ =
(b) 1-D profile of imaged response x + y2 + di2
2
λ
∆y′min = 1.22do ∆θmin = 1.22do . (1.9b)
C. Spatial Resolution D
(scene spatial resolution)
Each of the two coherent, monochromatic sources shown in
Fig. 1-10 produces a diffraction pattern. If the two sources are
sufficiently far apart so that their patterns are essentially distinct, This is known as the Rayleigh resolution criterion. Because
then we should be able to distinguish them from one another. But the lens diameter D is in the denominator, using a larger lens
as we bring them closer together, their diffraction patterns in the improves spatial resolution. Thus telescopes are made with very
image plane start to overlap, making it more difficult to discern large lenses and/or mirrors.
their images as those of two distinct sources. These expressions apply to the y and y′ directions at wave-
One definition of the spatial resolution capability of the length λ . Since the three-color detector arrays operate over
imaging system along the y′ direction is the separation ∆y′min different wavelength ranges, the associated angular and spatial
between the two point sources (Fig. 1-10) such that the peak of resolutions are the smallest at λblue ≈ 0.48 µ m and the largest at
the diffraction pattern of one of them occurs at the location of λred ≈ 0.6 µ m (Fig. 1-5). Expressions with identical form apply
the first null of the diffraction pattern of the other one, and vice along the x and x′ direction (i.e., upon replacing y and y′ with x
versa. Along the y direction in the image plane, the first null and x′ , respectively).
occurs when [2J1 (γ )/γ ]2 = 0 or, equivalently, γ = 3.832. Use of
D. Detector Resolution
The inherent spatial resolution in the image plane is
y ∆ymin = di λ /D, but the detector array used to record the
image has its own detector resolution ∆p, which is the pixel
Image pattern of s2
size of the active pixel sensor. For a black and white imaging
camera, to fully capture the image details made possible by the
s1 imaging system, the pixel size ∆p should be, at most, equal to
∆θmin
∆ymin . In a color camera, however, the detector pixels of an
∆ymin
′ ∆θmin ∆ymin individual color are not adjacent to one another (see Fig. 1-4),
s2 so ∆p should be several times smaller than ∆ymin .
do
Image pattern of s1 1-1.2 Thermal IR Imagers
di Density slicing is a technique used to convert a parameter
of interest from amplitude to pseudocolor so as to enhance
Figure 1-10 The separation between s1 and s2 is such that the the visual display of that parameter. An example is shown in
peak of the diffraction pattern due to s1 is coincident with the Fig. 1-11, wherein color represents the infrared (IR) tempera-
first null of the diffraction pattern of s2 , and vice versa. ture of a hot-air balloon measured by a thermal infrared imager.
The vertical scale on the right-hand side provides the color-
10 CHAPTER 1 IMAGING SENSORS
Figure 1-11 IR image of a hot-air balloon (courtesy of Ing.-Buro für Thermografie). The spatial pattern is consistent with the fact that
warm air rises.
1-1 OPTICAL IMAGERS 11
Medium-wave IR
λ = 0.76 μm 2 μm 4 μm 103 μm
Near IR Long-wave IR
Gamma Ultraviolet
X-rays Infrared Microwaves Radio waves
rays rays
1.0 IR lens
Dectector
0.95 array
Spectral emissivity
0.90 Data
processing
0.85 Ocean unit
Vegetation IR
0.80 Desert signal
Snow/ice
0.75 Cooler
Optics
3.3 μm 5 μm 10 μm 20 μm
Wavelength 50 μm
Figure 1-15 Thermal IR imaging systems often use cryogenic
cooling to improve detection sensitivity.
Figure 1-14 Emissivity spectra for four types of terrain.
(Courtesy the National Academy Press.)
B. Imaging System availability and use of a cryogenic agent, such as liquid nitrogen,
as well as placing the detectors in a vacuum-sealed container.
The basic configuration of a thermal IR imaging system Consequently, cooled IR imagers are significantly more expen-
(Fig. 1-15) is similar to that of a visible-light camera, but the sive to construct and operate than uncooled imagers.
lenses and detectors are designed to operate over the intended We close this section with two image examples. Figure 1-16
IR wavelength range of the system. Two types of detectors compares the image of a scene recorded by a visible-light black-
are used, namely uncooled detectors and cooled detectors. By and-white camera with a thermal IR image of the same scene.
cooling a semiconductor detector to very low temperatures, The IR image is in pseudocolor, with red representing high IR
typically in the 50–100 K range, its self-generated thermal noise emission and blue representing (comparatively) low IR emis-
is reduced considerably, thereby improving the signal-to-noise sion. The two images convey different types of information, but
ratio of the detected IR signal emitted by the observed scene. they also have significantly different spatial resolutions. Today,
Cooled detectors exhibit superior sensitivity in comparison with digital cameras with 16 megapixel detector arrays are readily
uncooled detectors, but the cooling arrangement requires the available and fairly inexpensive. In contrast, most standard
1-2 RADAR IMAGERS 13
Concept Question 1-4: Why is an IR imager called a It is important to note that λ of visible light is much shorter
thermal imager? than λ in the microwave region. In the middle of the visible
spectrum, λvis ≈ 0.5 µ m, whereas at a typical microwave radar
14 CHAPTER 1 IMAGING SENSORS
∆y′min
∆θmin
Figure 1-18 Radar imaging of a scene by raster scanning the antenna beam.
frequency of 6 GHz, λmic ≈ 5 cm. The ratio is sion and transmits very short pulses to achieve fine resolution in
the orthogonal dimension. The predecessor to SAR is the real-
λmic 5 × 10−2 aperture side-looking airborne radar (SLAR). A SLAR uses
= = 105 !
λvis 0.5 × 10−6 a rectangular- or cylindrical-shaped antenna that gets mounted
along the longitudinal direction of an airplane, and pointed
This means that the angular resolution capability of an optical partially to the side (Fig. 1-19).
system is on the order of 100,000 times better than the angular Even though the antenna beam in the elevation direction
resolution of a radar, if the lens diameter is the same size as the is very wide, fine discrimination can be realized along the x
antenna diameter. direction in Fig. 1-19 by transmitting a sequence of very short
To fully compensate for the large wavelength ratio, a radar pulses. At any instant in time, the extent of the pulse along x is
antenna would need a diameter on the order of 1 km to produce
an image with the same resolution as a camera with a lens 1 cm cτ
∆x′min = (scene range resolution), (1.11)
in diameter. Clearly, that is totally impractical. In practice, most 2 sin θ
radar antennas are on the order of centimeters to meters in size,
where c is the velocity of light, τ is the pulse width, and θ is
but certainly not kilometers. Yet, radar can image the Earth
the incidence angle relative to nadir-looking. This represents the
surface from satellite altitudes with spatial resolutions on the
scene spatial resolution capability along the x′ direction. At a
order of 1 m—equivalent to antenna sizes several kilometers in
typical angle of θ = 45◦ , the spatial resolution attainable when
extent! How is that possible?
transmitting pulses each 5 µ s in width is
A. Synthetic-Aperture Radar 3 × 108 × 5 × 10−9
∆x′min = ≈ 1.05 m.
As we will see shortly, a synthetic-aperture radar (SAR) uses 2 sin 45◦
a synthesized aperture to achieve good resolution in one dimen-
1-2 RADAR IMAGERS 15
va Transmitte
Tr ter-
ter-
Transmitter-receiver
An
Antenna
Idealized
lized elevation
antenna pattern ly
Short pu
pulse
Trees
Bank edge
Water Truck
x′′
Shadow ∆y′′
∆
∆y
Start sweep
Truck
Sloping edge
Trees
Shadow
End of sweep
Video Water
amplitude
Time
(range)
A-scope display
Scrub growth
(brush, grass, bare earth, etc.)
Figure 1-19 Real-aperture SLAR imaging technique. The antenna is mounted along the belly of the aircraft.
Not only is this an excellent spatial resolution along the x′ Fig. 1-19). The shape of the beam of the cylindrical antenna
direction, but it is also independent of range R (distance between is illustrated in Fig. 1-20. From range R, the extent of the beam
the radar and the surface), which means it is equally applicable along the y direction is
to a satellite-borne radar.
As the aircraft flies along the y direction, the radar beam λ λh
∆y′min ≈ R= , (1.12)
sweeps across the terrain, while constantly transmitting pulses, ly ly cos θ
receiving their echoes, and recording them on an appropriate (real-aperture azimuth resolution)
medium. The sequential echoes are then stitched together to
form an image. where h is the aircraft altitude. This is the spatial resolution
By designing the antenna to be as long as practicable along capability of the radar along the flight direction. For a 3 m long
the airplane velocity direction, the antenna pattern exhibits a antenna operating at λ = 3 cm from an altitude of 1 km, the
relatively narrow beam along that direction (y′ direction in
16 CHAPTER 1 IMAGING SENSORS
NRL
N
Capitol
Pentagon
White House
Washington
Monument
Figure 1-22 SAR image collected over Washington, D.C. Right of center is the Washington Monument, though only the shadow of the obelisk
is readily apparent in the image. [Courtesy of Sandia National Laboratories.]
18 CHAPTER 1 IMAGING SENSORS
given by
Exercise 1-3: With reference to the diagram in Fig. 1-21,
h(x, y) = hx (x) hy (y), (1.14)
suppose the length of the real aperture were to be increased
with hx (x) describing the shape of the transmitted pulse and from 2 m to 8 m. What would happen to (a) the antenna
hy (y) describing the shape of the synthetic antenna-array pat- beamwidth, (b) length of the synthetic aperture, and (c) the
tern. Typically, the pulse shape is like a Gaussian: SAR azimuth resolution?
2 Answer: (a) Beamwidth is reduced by a factor of 4, (b)
hx (x) = e−2.77(x/τ ) , (1.15a)
synthetic aperture length is reduced from 8 km to 2 km, and
where τ is the effective width of the pulse (width between half- (c) SAR resolution changes from 1 m to 4 m.
peak points). The synthetic array pattern is sinc-like in shape,
but the sidelobes may be suppressed further by assigning differ-
ent weights to the processed pulses. For the equally weighted
case, 1-3 X-Ray Computed Tomography
2 1.8y (CT)
hy (y) = sinc , (1.15b)
l
where l is the length of the real antenna, and the sinc function is Computed tomography, also known as CT scan, is a technique
defined such that sinc(z) = sin(π z)/(π z) for any variable z. capable of generating 3-D images of the X-ray attenuation (ab-
sorption) properties of an object, such as the human body. The
X-ray absorption coefficient of a material is strongly dependent
Concept Question 1-5: Why is a SAR called a
on the density of that material. CT has the sensitivity necessary
“synthetic”-aperture radar?
to image body parts across a wide range of densities, from soft
tissue to blood vessels and bones.
Concept Question 1-6: What system parameters deter- As depicted in Fig. 1-24(a), a CT scanner uses an X-ray
mine the PSF of a SAR? source, with a narrow slit to generate a fan-beam, wide enough
to encompass the extent of the body, but only about 1 mm
thick. The attenuated X-ray beam is captured by an array of
∼ 900 detectors. The X-ray source and the detector array are
mounted on a circular frame that rotates in steps of a fraction
of a degree over a full 360◦ circle around the object or patient,
each time recording an X-ray attenuation profile from a different
angular direction. Typically, on the order of 1000 such profiles
are recorded, each composed of measurements by 900 detectors.
For each horizontal slice of the body, the process is completed
in less than 1 second. CT uses image reconstruction algorithms
to generate a 2-D image of the absorption coefficient of that
horizontal slice. To image an entire part of the body, such as
the chest or head, the process is repeated over multiple slices
(layers).
For each anatomical slice, the CT scanner generates on
the order of 9 × 105 measurements (1000 angular orientations
×900 detectors). In terms of the coordinate system shown in
Fig. 1-24(b), we define α (ξ , η ) as the absorption coefficient
of the object under test at location (ξ , η ). The X-ray beam is
directed along the ξ direction at η = η0 . The X-ray intensity
Figure 1-23 High-resolution image of an airport runway received by the detector located at ξ = ξ0 and η = η0 is given
with a plane and helicopter. [Courtesy of Sandia National by
Z ξ0
Laboratories.]
I(ξ0 , η0 ) = I0 exp − α (ξ , η0 ) d ξ , (1.16)
0
1-4 MAGNETIC RESONANCE IMAGING 19
~1 × 10−4 T
Computer
Figure 1-26 B0 is static and approximately uniform within
the cavity. Inside the cavity, B0 ≈ 1.5 T (teslas), compared with
Figure 1-25 Basic diagram of an MRI system. only 0.1 to 0.5 milliteslas outside.
B = B0 + BG + BRF , mI = −1/2
A. Static Field B0
Figure 1-27 Nuclei with spin magnetic number of ±1/2
Field B0 is a strong, static (non–time varying) magnetic field precessing about B0 at the Larmor angular frequency ω0 .
created by a magnet designed to generate a uniform (constant)
distribution throughout the magnetic core (Fig. 1-26). Usually, a
superconducting magnet is used for this purpose because it can
generate magnetic fields with much higher magnitudes than can come magnetized when exposed to a magnetic field. Among the
be realized with resistive and permanent magnets. The direction substances found in a biological material, the hydrogen nucleus
of B0 is longitudinal (ẑ direction in Fig. 1-26) and its magnitude has a strong susceptibility to magnetization, and hydrogen is
is typically on the order of 1.5 teslas (T). The conversion factor highly abundant in biological tissue. For these reasons, a typical
between teslas and gauss is 1 T = 104 gauss. Earth’s magnetic MR image is related to the concentration of hydrogen nuclei.
field is on the order of 0.5 gauss, so B0 inside the MRI core is The strong magnetic field B0 causes the nuclei of the material
on the order of 30,000 times that of Earth’s magnetic field. inside the core space to temporarily magnetize and to spin
Biological tissue is composed of chemical compounds, and (precess) like a top about the direction of B0 . The precession ori-
each compound is organized around the nuclei (protons) of the entation angle θ , shown in Fig. 1-27, is determined by the spin
atoms comprising that compound. Some, but not all, nuclei be- quantum number I of the spinning nucleus and the magnetic
1-4 MAGNETIC RESONANCE IMAGING 21
sin(π N ∆k x)
the vertical direction, the total core volume can be discretized hx (x) = ∆k , (1.20)
sin(π ∆k x)
into horizontal layers called slices, each corresponding to a
different value of f0 (Fig. 1-29). This way, the RF signal can where x is one of the two MR image coordinates, k is a spatial
communicate with each slice separately by selecting the RF frequency, ∆k is the sampling interval in k space, and N is the
frequency to match f0 of that slice. In practice, instead of total number of Fourier samples. A similar expression applies
sending a sequence of RF signals at different frequencies, the RF to hy (y). The spatial resolution of the MR image is equal to the
transmitter sends out a short pulse whose frequency spectrum equivalent width of hx (x), which can be computed as follows:
covers the frequency range of interest for all the slices in the
Z 1/(1 ∆k)
volume, and then a Fourier transformation is applied to the 1 1
response from the biological tissue to separate the responses ∆xmin = hx (x) dx = . (1.21)
h(0) −1/(1 ∆k) N ∆k
from the individual slices.
The gradient magnetic field along the ŷ direction allows dis- The integration was performed over one period (1/∆k) of hx (x).
cretization of the volume into x–y slices. A similar process can According to Eq. (1.21), the image resolution is inversely
be applied to generate x–z and y–z slices, and the combination is proportional to the product N ∆k. The choices of values for N
used to divide the total volume into a three-dimensional matrix and ∆k are associated with signal-to-noise ratio and scan time
of voxels (volume pixels). The voxel size defines the spatial considerations.
resolution capability of the MRI system.
1-4.3 MRI-Derived Information
C. RF System
Generally speaking, MRI can provide three types of information
The combination of the strong static field B0 and the gradient about the imaged tissue:
field BG (whose amplitude is on the order of less than 1%
of B0 ) defines a specific Larmor frequency for the nuclei of (a) The magnetic characteristics of tissues, which are related
every isotope within each voxel. As we noted earlier through to biological attributes and blood vessel conditions.
Table 1-1, at B0 intensities in the 1 T range, the Larmor frequen-
cies of common isotopes are in the MHz range. The RF system (b) Blood flow, made possible through special time-dependent
consists of a transmitter and a receiver connected to separate gradient excitations.
coils, or the same coil can be used for both functions. The
transmitter generates a burst of narrow RF pulses. In practice, (c) Chemical properties discerned from measurements of
many different pulse configurations are used, depending on the small shifts in the Larmor frequency.
1-5 ULTRASOUND IMAGER 23
Transmit pulse
Transmit
Transmitter τ beamforming
unit (time delay
generator) Transducers
System T/R
Display
processor switch
Receive echo
Receive
Data
beamforming
acquisition
(time delay
unit
generator)
Array
axis Array axis
Rf1
Rf2
Focus 1
Focus 2
Figure 1-33 Changing the inter-element time delay across a symmetrical time-delay distribution shifts the location of the focal point in
the range direction.
1-5.3 Spatial Resolution cycles in the pulse. The wavelength is related to the signal
frequency by
v
For a 2-D transducer array of size (Lx × Ly ) and focused at λ= , (1.23)
f
range Rf , as shown in Fig. 1-36 (side Ly is not shown in
the figure), the size of the resolution voxel is given by an where v is the wave velocity and f is the frequency. In biological
axial resolution ∆Rmin along the range direction and by lateral tissue, v ≈ 1540 m/s. For an ultrasound system operating at
resolutions ∆xmin and ∆ymin along the two lateral directions. The f = 5 MHz and generating pulses with N = 2 cycles per pulse,
axial resolution is given by
vN 1540 × 2
∆Rmin = = ≈ 0.3 mm.
λN 2f 2 × 5 × 106
∆Rmin = (axial resolution), (1.22)
2
where λ is the wavelength of the pulse in the material in which
the acoustic waves are propagating and N is the number of
26 CHAPTER 1 IMAGING SENSORS
90 90
70 70
50 50
Lateral distance (mm)
30 30 Lx ∆ xmin
10 10
∆ Rmin
10 10
30 30
50 50 Rf
70 70
90 90 Figure 1-36 Axial resolution ∆Rmin and lateral resolution
0 20 40 60 80 100 0 20 40 60 80 ∆xmin for a transducer array of length Lx focused at range Rf .
Axial distance (mm)
(a) Focused beam (b) Focused beam
with no steering with steering
by 45o
The lateral resolution ∆xmin is given by
Figure 1-34 Simulations of acoustic energy distribution for
(a) a beam focused at Rf = 40 mm by a 96-element array and λ Rf v
∆xmin = Rf = (lateral resolution), (1.24)
(b) a beam focused and steered by 45◦ . Lx Lx f
where Rf is the focal length (range at which the beam is
focused). If the beam is focused at Rf = 5 cm and the array
length Lx = 4 cm and f = 5 MHz, then
5 × 10−2 × 1540
∆xmin = ≈ 0.4 mm,
4 × 10−2 × 5 × 106
(a) Uniform distribution (b) Linear shift (c) Non-uniform (d) Nonlinear shift
symmetrical and non-uniform
distribution distribution
Figure 1-35 Beam focusing and steering are realized by shaping the time-delay distribution.
1-6 COMING ATTRACTIONS 27
which is comparable with the magnitude of the axial resolution 1-6 Coming Attractions
∆Rmin . The resolution along the orthogonal lateral direction,
∆ymin , is given by Eq. (1.24) with Lx replaced with Ly . The size Through examples of image processing products, this section
of the resolvable voxel is presents images extracted from various sections in the book.
In each case, we present a transformed image, along with a
∆V = ∆Rmin × ∆xmin × ∆ymin. (1.25) reference to its location within the text.
199
0 199
50
100
Concept Question 1-10: How does an ultrasound imager
focus its beam in the range direction and in the lateral 150
300
Answer: ∆V = ∆Rmin × ∆xmin × ∆ymin Figure 1-38 Original clown image and nonlinearly warped
= 0.26 mm × 0.41 mm × 0.41 mm. product. [Extracted from Figs. 4-14 and 4-17.]
28 CHAPTER 1 IMAGING SENSORS
200
0 200
(a) Dark clown image
200
0 200
(b) Brightened clown image
0 0
20
40
60
80
100
120
140 659
0 799
160 (a) Original Mariner image
180
0
200
0 20 40 60 80 100 120 140 160 180 200
20
40
60
80 659
0 799
100
(b) Notch-filtered Mariner image
120
140
Figure 1-42 The horizontal lines in (a) are due to sinusoidal
interference in the recorded image. The lines were removed by
160 applying notch filtering. [Extracted from Fig. 6-7.]
180
200
0 20 40 60 80 100 120 140 160 180 200
0
20
40
60
80
100
120
140
224 160
0 274
180
(a) Original motion-blurred image caused by
taking a photograph in a moving car 200
20 40 60 80 100 120 140 160 180 200
(a) A noisy clown image
0
20
40
60
80
100
120
140
160
274 180
0 274
200
(b) Motion deblurred image 20 40 60 80 100 120 140 160 180 200
(b) Wavelet-denoised clown image
Figure 1-43 Motion blurring is removed. [Extracted from
Fig. 6-11.] Figure 1-44 Image denoising. [Extracted from Fig. 7-21.]
1-6 COMING ATTRACTIONS 31
60
80
100
120
140
160
180
200
20 40 60 80 100 120 140 160 180 200 (a) Unfocused MRI image
(a) Image with missing pixels
20
40
60
80
100
160
180
200
20 40 60 80 100 120 140 160 180 200
(b) Inpainted image
1-6.10 Markov Random Fields for Image 1-6.11 Motion-Deblurring of a Color Image
Segmentation Chapters 1–9 consider grayscale (black-and-white) images,
In a Markov random field (MRF) image model, the value of each since color images consist of three (red, green, blue) images.
pixel is stochastically related to its surrounding values. This Motion deblurring of a color image is presented in Section 10-3,
is useful in segmenting images, as presented in Section 9-11. an example of which is shown in Fig. 1-48.
Figure 1-47 illustrates how an MRF image model can improve
the segmentation of an X-ray image of a foot into tissue and
bone.
0
u2 1
1.5 2
784 terminals
3
1 4
Class 1 5
0.5 6
7
0
8
9
−0.5 Class 2 Class 3
−1
Summary
Concepts
• Images may be formed using any of these imaging • Color images are actually triplets of red, green, and blue
modalities: optical, infrared, radar, x-rays, ultrasound, images, displayed together.
and magnetic resonance imaging. • The effect of an image acquisition system on an image
• Image processing is needed to process a raw image, can usually be modelled as 2-D convolution with the
formed directly from data, into a final image, which has point spread function of the system (see below).
been deblurred, denoised, interpolated, or enhanced, all • The resolution of an image acquisition system can be
of which are subjects of this book. computed using various formulae (see below).
Mathematical Formulae
1 1 1
Lens law + = X-ray tomography path attenuation
d0 di f Z ∞Z ∞
p(r, θ ) = a(ξ , η ) δ (r − ξ cos θ − η sin θ ) d ξ d η
Optical point spread function −∞ −∞
λ
2J1 (γ ) 2 πD Optical resolution ∆θmin ≈ 1.22
h(θ ) = , γ= sin θ D
γ λ λ
Radar resolution ∆y′min ≈ R
SAR point spread function D
λN
2 1.8y Ultrasound resolution ∆Rmin =
h(x, y) = e−2.77(x/τ ) sinc2 2
l
sin(π N ∆k x)
MRI point spread function hx (x) = ∆k
sin(π ∆k x)
2-D convolution Z ∞Z ∞
Ii (x, y) = Io (x, y) ∗ ∗ h(x, y) = Io (x − x′ , y − y′) h(x′ , y′ ) dx′ dy′
−∞ −∞
Important Terms Provide definitions or explain the meaning of the following terms:
active pixel sensor liquid crystal display radar X-ray computed tomography
beamforming magnetic resonance imaging (MRI) resolution
charge-coupled device optical imaging synthetic-aperture radar
infrared imaging point spread function ultrasound imaging
36 CHAPTER 1 IMAGING SENSORS
Objectives 0.20
0.15
Learn to:
0.10
39
40 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
1-D Signals
Continuous Time
FT
x(t) X( f )
Signal in Spectrum in
time domain frequency domain
Discrete Time
DTFT DFT
x[n] X(Ω) X[k]
Signal at Spectrum at Spectrum at
2πk
discrete times continuous discrete frequencies Ω =
N
t = n∆ frequency Ω 0≤k≤N−1
2-D Images
Continuous Space
CSFT
f (x,y) F(μ,ν)
Image in Spectrum in
spatial domain frequency domain
Discrete Space
DSFT 2-D DFT
f [n,m] F(Ω1,Ω2) F[k1,k2]
Image in Spectrum in order N Spectrum in
discrete space continuous discrete frequency domain:
frequency domain 2πk1 2πk2
Ω1 = , Ω2 =
N N
2-1 REVIEW OF 1-D CONTINUOUS-TIME SIGNALS 41
Types of Signals
x(t)
A cos (2πt /T )
A
Eternal sinusoid x(t) = A cos(2π f0t + θ ), −∞ < t < ∞
t
−T −T/2 0 T/2 T
t − t0
rect ( )T
T
t − t0 1
Pulse (rectangle) x(t) = rect
( T
t (s)
1 for (t0 − T /2) < t < (t0 + T /2), 0 t0
=
0 otherwise
1 δ(t − t0)
Z ∞
Impulse δ (t) x(t) δ (t − t0 ) dt = x(t0 )
−∞
t
0 t0
Properties
x(t)
y1(t) = x(2t)
10 x(t) y2(t) = x(t / 2)
The scaling property can be interpreted using Eq. (2.5) as A nonzero signal x(t) that is zero-valued outside the interval
follows. For a > 1 the width of the pulse in Eq. (2.5) is
compressed by |a|, reducing its area by a factor of |a|, but its [a, b] = {t : a ≤ t ≤ b},
height is unaltered. Hence the area under the pulse is reduced to
1/|a|. (i.e., x(t) = 0 for t ∈
/ [a, b]), has support [a, b] and duration b − a.
Impulses are important tools used in defining the impulse
responses of 1-D systems and the point-spread functions of 2-D Concept Question 2-1: Why does scaling time in an im-
spatial systems (such as a camera or an ultrasound), as well as pulse also scale its area?
for deriving the sampling theorem.
R∞
Exercise 2-1: Compute the value of −∞ δ (3t − 6) t 2 dt.
2-1.2 Properties of 1-D Signals
Answer: 43 . (See IP )
A. Time Delay
Delaying signal x(t) by t0 generates signal x(t −t0 ). If t0 > 0, the Exercise 2-2: Compute
the energy of the pulse defined by
waveform of x(t) is shifted to the right by t0 , and if t0 < 0, the t−2
x(t) = 5 rect 6 .
waveform of x(t) is shifted to the left by |t0 |. This is illustrated
by the time-delay figure in Table 2-1. Answer: 150. (See IP )
B. Time Scaling
2-2 Review of 1-D Continuous-Time
A signal x(t) time-scaled by a becomes x(at). If a > 1, the
waveform of x(t) is compressed in time by a factor of a. If Systems
0 < a < 1, the waveform of x(t) is expanded in time by a factor
of 1/a, as illustrated by the scaling figure in Table 2-1. If a < 0, A continuous-time system is a device or mathematical model
the waveform of x(t) is compressed by |a| or expanded by 1/|a|, that accepts a signal x(t) as its input and produces another signal
and then time-reversed. y(t) at its output. Symbolically, the input-output relationship is
expressed as
C. Signal Energy
x(t) SYSTEM y(t)
The energy E of a signal x(t) is
Z ∞
Table 2-2 provides a list of important system types and proper-
E= |x(t)|2 dt. (2.7)
−∞ ties.
Property Definition
N N
Linear System (L) If xi (t) L yi (t), then ∑ ci xi (t) L ∑ ci yi (t)
i=1 i=1
N N
Linear Time-Invariant (LTI) If xi (t) LTI yi (t), then ∑ ci xi (t − τi) LTI ∑ ci yi (t − τi)
i=1 i=1
of the above four classes. If a system is moderately nonlinear, exactly the same direction. That is, if
it can be approximated by a linear model, and if it is highly
nonlinear, it may be possible to divide its input-output response
x(t) TI y(t),
into a series of quasi-linear regions. In this book, we limit our
treatment to linear (L) and linear time-invariant (LTI) systems.
then it follows that
in the following two equations: which is known as the convolution integral. Often, the convolu-
tion integral is represented symbolically by
Z ∞
δ (t − τ ) SYSTEM h(t; τ ), (2.10a)
x(t) ∗ h(t) = x(τ ) h(t − τ ) d τ . (2.15)
−∞
and for a time-invariant system,
Combining the previous results leads to the symbolic form
δ (t − τ ) TI h(t − τ ). (2.10b)
x(t) LTI y(t) = x(t) ∗ h(t). (2.16)
Z ∞
x(t) L y(t) = x(τ ) h(t; τ ) d τ . (2.13)
−∞
Concept Question 2-2: What is the significance of a sys-
This integral is called the superposition integral. tem being linear time-invariant?
Z ∞ Z ∞
4. x(τ ) δ (t − τ ) d τ LTI y(t) = x(τ ) h(t − τ ) d τ
−∞ −∞
Z ∞
5. x(t) LTI y(t) = x(τ ) h(t − τ ) d τ = x(t) ∗ h(t)
−∞
Figure 2-2 Derivation of the convolution integral for a linear time-invariant system.
Property Description
3. Distributive x(t) ∗ [h1 (t) + · · · + hN (t)] = x(t) ∗ h1 (t) + · · · + x(t) ∗ hN (t) (2.19c)
Z t
4. Causal ∗ Causal = Causal y(t) = u(t) h(τ ) x(t − τ ) d τ (2.19d)
0
(a) In series
2-3 1-D Fourier Transforms
The continuous-time Fourier transform (CTFT) is a powerful
h1(t) tool for
• computing the spectra of signals, and
x(t) h2(t) y(t)
• analyzing the frequency responses of LTI systems.
and Modulation:
Z ∞
1 − jω t Z ∞
x(t) = X−ω (ω ) e dω . (2.22b) e j2π f0t x(t) = X( f ) e j2π f t e j2π f0t d f
2π −∞
−∞
Geophysicists use different sign conventions for time and space! Z ∞
In addition, some computer programs,√such as Mathematica, = X( f ) e j2π ( f + f0 )t d f
−∞
split the 1/(2π ) factor into factors of 1/ 2π in both the forward −1
and inverse transforms. =F {X( f − f0 )}. (2.25)
Derivative:
◮ In this book, we use the definition of the Fourier trans- Z ∞
dx(t) d j2π f t
form given by Eq. (2.20) exclusively. ◭ = X( f ) e df
dt dt −∞
Z ∞
= X( f ) ( j2π f ) e j2π f t d f
B. Fourier Transform Notation −∞
−1
Throughout this book, we use Eq. (2.20) as the definition of the =F {( j2π f ) X( f )}. (2.26)
Fourier transform, we denote the individual transformations by
Zero frequency:
F {x(t)} = X( f ) Setting f = 0 in Eq. (2.20a) leads to
Z ∞
and X(0) = x(t) dt.
F −1 {X( f )} = x(t), −∞
1. Linearity ∑ ci xi (t) ∑ ci Xi ( f )
1 f
2. Time scaling x(at) X
|a| a
3. Time shift x(t − τ ) e− j2π f τ X( f )
dx
5. Time derivative x′ = j2π f X( f )
dt
6. Reversal x(−t) X(− f )
7. Conjugation x∗ (t) X∗ (− f )
Special FT Relationships
Z ∞
11. Zero frequency X(0) = x(t) dt
−∞
Z ∞
12. Zero time x(0) = X( f ) d f
−∞
Z ∞ Z ∞
13. Parseval’s theorem x(t) y∗ (t) dt = X( f ) Y∗ ( f ) d f
−∞ −∞
x(t) and X( f ) are equal: where the even component xe (t) and the odd component xo (t)
Z ∞ Z ∞ are formed from their parent signal x(t) as follows:
E= |x(t)|2 dt = |X( f )|2 d f . (2.28)
−∞ −∞ xe (t) = [x(t) + x∗ (−t)]/2 (2.30a)
and
xo (t) = [x(t) − x∗(−t)]/2. (2.30b)
A signal is said to have even symmetry if x(t) = x∗ (−t), in
A. Even and Odd Parts of Signals which case x(t) = xe (t) and xo (t) = 0. Similarly, a signal has
odd symmetry if x(t) = −x∗ (−t), in which case x(t) = xo (t) and
xe (t) = 0.
A signal x(t) can be decomposed into even xe (t) and odd x0 (t)
components:
x(t) = xe (t) + x0(t), (2.29)
50 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
d
Sound magnitude Exercise 2-6: Compute the Fourier transform of dt [sinc(t)].
Answer:
(
“oo” as in “cool” d j2π f for | f | < 0.5,
F sinc(t) =
dt 0 for | f | > 0.5.
(See IP )
f (kHz)
1 2 3
(a) “oo” spectrum
2-4 The Sampling Theorem
The sampling theorem is an operational cornerstone of both
Sound magnitude discrete-time 1-D signal processing and discrete-space 2-D
image processing.
1 1 1 and if
x(t) = sin(t) + sin(3t) + sin(5t) + sin(7t) + · · ·
3 5 7 x(t) is sampled at a sampling rate of S samples/s,
Compute output y(t) if then
x(t) can be reconstructed exactly from {x(n∆),
x(t) h(t) = 0.4 sinc(0.4t) y(t). n = . . . , −2, −1, 0, 1, 2, . . .}, provided S > 2B.
mum) sampling rate 2B samples/second is called the Nyquist Dividing by ∆ and recalling that S = 1/∆ gives
rate, and the frequency 2B is called the Nyquist frequency.
∞
xs (t) S ∑ X( f − kS). (2.47)
k=−∞
2-4.2 Sampling Theorem Derivation The spectrum Xs ( f ) of xs (t) consists of a superposition of copies
of the spectrum X( f ) of x(t), repeated every S = 1/∆ and
A. The Sampled Signal xs (t) multiplied by S. If these copies do not overlap in frequency,
we may then recover X( f ) from Xs ( f ) using a lowpass filter,
Given a signal x(t), we construct the sampled signal xs (t) by provided S > 2B [see Fig. 2-6(a)].
multiplying x(t) by the impulse train
∞ X s( f )
δs (t) = ∑ δ (t − n∆). (2.42) S X(0)
n=−∞ f
−B − S B−S −B 0 B S−B S+B
That is, Copy Copy
(a) S > 2B
∞
xs (t) = x(t) δs (t) = ∑ x(t) δ (t − n∆) X s( f )
n=−∞
∞
f
= ∑ x(n∆) δ (t − n∆). (2.43) −B − S B − S0S − B S+B
n=−∞
−B B
(b) S < 2B
B. Spectrum of the sampled signal xs (t) Figure 2-6 Sampling a signal x(t) with maximum frequency B at
a rate of S makes X( f ) change amplitude to S X( f ) and to repeat
Using Fourier series, it can be shown that the Fourier transform in f with period S. These copies (a) do not overlap if S > 2B, but
of the impulse train δs (t) is itself an impulse train in frequency: (b) they do if S < 2B.
∞ ∞
∆ ∑ δ (t − n∆) ∑ δ ( f − k/∆). (2.44)
n=−∞ k=−∞
2-4.3 Aliasing
This result can be interpreted as follows. A periodic signal has
If the sampling rate S does not exceed 2B, the copies of X( f )
a discrete spectrum (zero except at specific frequencies) given
will overlap one another, as shown in Fig. 2-6(b). This is called
by the signal’s Fourier series expansion. By Fourier duality, a
an aliased condition, the consequence of which is that the
discrete signal (zero except at specific times) such as xs (t) has
reconstructed signal will no longer match the original signal
a periodic spectrum. So a discrete and periodic signal such as
x(t).
δs (t) has a spectrum that is both discrete and periodic.
Multiplying Eq. (2.44) by x(t), using the definition for xs (t)
given by Eq. (2.43), and applying property #9 in Table 2-4 leads Example 2-1: Two Sinusoids and Aliasing
to
∞
xs (t) ∆ X( f ) ∗ ∑ δ ( f − k/∆), (2.45)
Two signals, a 2 Hz sinusoid and a 12 Hz sinusoid:
k=−∞
0.8 x1(t)
0.6
x1s(t)
0.4
0.2
0 t (s)
−0.2
−0.4
−0.6
−0.8
−1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
1
x2(t)
0.8
0.6
0.4
x2s(t)
0.2
0 t (s)
−0.2
−0.4
−0.6
−0.8
−1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Figure 2-7 Plots of (a) x1 (t) and x1s (t) and (b) x2 (t) and x2s (t). Sampling rate S = 20 samples/s.
2-4 THE SAMPLING THEOREM 57
0.45
0.4
0.35
0.3
0.25
0.2
0.15
0.1
0.05
0 f (Hz)
−30 −20 −10 0 10 20 30
X1s( f )
X1s( f ) for 2 Hz sinusoid sampled at 20 Hz. Note no aliasing.
10
9
S X1( f )
Copy of S X1( f ) Copy of S X1( f )
8 at −20 Hz at +20 Hz
7
0 f (Hz)
−30 −20 −10 0 10 20 30
S = 20 Hz S = 20 Hz
(b) Spectrum X1s( f ) of sampled signal x1s(t)
Figure 2-8 Spectra (a) X1 ( f ) and (b) X1s ( f ) of the 2 Hz sinusoid and its sampled version, respectively. The spectrum X1s ( f ) consists of
X( f ) scaled by S = 20, plus copies thereof at integer multiples of ±20 Hz. The vertical axes denote areas under the impulses.
58 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
0.45
0.4
0.35
0.3
0.25
0.2
0.15
0.1
0.05
0 f (Hz)
−30 −20 −10 0 10 20 30
4 Overlap Overlap
0 f (Hz)
−30 −20 −10 0 10 20 30
S = 20 Hz S = 20 Hz
(b) Spectrum X2s( f ) of sampled signal x2s(t)
Figure 2-9 Spectra (a) X2 ( f ) and (b) X2s ( f ) of the 12 Hz sinusoid and its sampled version. Note the overlap in (b) between the spectrum
of X2s ( f ) and its neighboring copies. The vertical axes denote areas under the impulses.
2-5 REVIEW OF 1-D DISCRETE-TIME SIGNALS AND SYSTEMS 59
Note that Xs ( f ) is periodic in f with period 1/∆, as it should be. reconstruction x̂1 (t), we apply Eq. (2.57) with ∆ = 1/S = 0.05 s:
In the absence of aliasing,
X̂1 ( f ) = X1s ( f ) ∆ sinc(∆ f ).
1
X( f ) = Xs ( f ) for | f | < S/2. (2.54)
S The sinc function is displayed in Fig. 2-10(a) in red and uses the
vertical scale on the right-hand side, and the spectrum X̂1 ( f ) is
The relationship given by Eq. (2.53)) still requires an infinite displayed in blue using the vertical scale on the left-hand side.
number of samples {x(n∆)} to reconstruct X( f ) at each fre- The sinc function preserves the spectral components at ±2 Hz,
quency f . but attenuates the components centered at ±20 Hz by a factor
of 10 (approximately).
D. Nearest-Neighbor (NN) Interpolation (b) Application of Eq. (2.56) to x1 (n∆) = cos(4π n∆) with
∆ = 1/20 s yields plot x̂1 (t) shown in Fig. 2-10(b).
A common procedure for computing an approximation to x(t)
from its samples {x(n∆)} is the nearest neighbor interpolation.
The signal x(t) is approximated by x̂(t): Concept Question 2-6: Why must the sampling rate of a
signal exceed double its maximum frequency, if it is to be
reconstructed from its samples?
x(n∆)
for (n − 0.5)∆ < t < (n + 0.5)∆,
x̂(t) = x((n + 1)∆) for (n + 0.5)∆ < t < (n + 1.5)∆,
.. .. Concept Question 2-7: Why does nearest-neighbor in-
. . terpolation work as well as it does?
(2.55)
So x̂(t) is a piecewise-constant approximation to x(t), and it is
related to the sampled signal xs (t) by Exercise 2-7: What is the Nyquist sampling rate for a signal
bandlimited to 4 kHz?
x̂(t) = xs (t) ∗ rect(t/∆). (2.56)
Answer: 8000 samples/s. (See IP )
Using the Fourier transform of a rectangle function (entry #4 in
Table 2-5), the spectrum X̂( f ) of x̂(t) is Exercise 2-8: A 500 Hz sinusoid is sampled at 900
samples/s. No anti-alias filter is being used. What is the
X̂( f ) = Xs ( f ) ∆ sinc(∆ f ), (2.57) frequency of the reconstructed continuous-time sinusoid?
where Xs ( f ) is the spectrum of the sampled signal. The zero- Answer: 400 Hz. (See IP )
crossings of the sinc function occur at frequencies f = k/∆ = kS
for integers k. These are also the centers of the copies of the
original spectrum X( f ) induced by sampling. So these copies
are attenuated if the maximum frequency B of X( f ) is such that 2-5 Review of 1-D Discrete-Time
B ≪ S. The factor ∆ in Eq. (2.57) cancels the factor S = 1/∆ in
Eq. (2.47). Signals and Systems
Through direct generalizations of the 1-D continuous-time def-
initions and properties of signals and systems presented earlier,
Example 2-2: Reconstruction of 2 Hz Sinusoid
we now extend our review to their discrete counterparts.
ˆ 1( f )
X sinc(0.05 f )
0.5 1.0
0.45 0.9
0.4 0.8
0.2 0.4
0.15 0.3
0.1 0.2
0.05 0.1
0
0 f (Hz)
−30 −20 −10 0 10 20 30
0.8
0.6
0.4
0.2
−0.2
−0.4
−0.6
−0.8
−1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
1
n 2-5.2 Discrete-Time Eternal Sinusoids
−2 −1 0 1 2 3
A discrete-time eternal sinusoid is defined as
Figure 2-11 Stem plot representation of x[n]. x[n] = A cos(Ω0 n + θ ), −∞ < n < ∞, (2.62)
can be depicted using either the bracket notation with N/D being a rational number. In such a case, the fun-
damental period of the sinusoid is N, provided N/D has been
x[n] = {3, 2, 0, 4}, (2.59) reduced to lowest terms.
∞
Another important property of discrete-time eternal sinusoids
is that the discrete-time frequency Ω0 is periodic, which is not
y[n] = h[n] ∗ x[n] = ∑ h[i] x[n − i]. (2.71a)
i=−∞
true for continuous-time sinusoids. For any integer k, Eq. (2.62) Discrete-time convolution
can be rewritten as
x[n] = A cos(Ω0 n + θ ), −∞ < n < ∞ Most of the continuous-time properties of convolution also
= A cos((Ω0 + 2π k)n + θ ), apply in discrete time.
Real-world signals and filters are defined over specified
= A cos(Ω′0 n + θ ), −∞ < n < ∞, (2.66) ranges of n (and set to zero outside those ranges). If h[n] has
support in the interval [n1 , n2 ], Eq. (2.71a) becomes
with
Ω′0 = Ω0 + 2π k. (2.67) n2
x[n] = cos(2π n) = 1 at (Ω0 = 2π ). (2.70) For causal signals (x[n] and h[n] equal to zero for n < 0), y[n]
assumes the form
Beyond Ω0 = 2π , the oscillatory behavior starts to increase
again, and so on. This behavior has no equivalence in the world n
of continuous-time sinusoids. y[n] = ∑ x[i] h[n − i], n ≥ 0. (2.71d)
i=0
Causal
2-5.3 1-D Discrete-Time Systems
A 1-D discrete-time system accepts an input x[n] and produces For example,
an output y[n]:
{1, 2} ∗ {3, 4} = {3, 10, 8}. (2.72)
x[n] SYSTEM y[n]. The duration of the output is 2 + 2 − 1 = 3.
The definition of LTI for discrete-time systems is identical to 2-5.4 Discrete-Time Convolution Properties
the definition of LTI for continuous-time systems. If a discrete-
time system has impulse response h[n], then the output y[n] With one notable difference, the properties of the discrete-time
can be computed from the input x[n] using the discrete-time convolution are the same as those for continuous time. If (t)
2-5 REVIEW OF 1-D DISCRETE-TIME SIGNALS AND SYSTEMS 63
Table 2-6 Comparison of convolution properties for continuous-time and discrete-time signals.
2. Associative [g(t) ∗ h(t)] ∗ x(t) = g(t) ∗ [h(t) ∗ x(t)] [g[n] ∗ h[n]]∗ x[n] = g[n] ∗ [h[n] ∗ x[n]]
is replaced with [n] and integrals are replaced with sums, the 2-5.5 Delayed-Impulses Computation Method
convolution properties listed in Table 2-3 lead to those listed in
Table 2-6. For finite-duration signals, computation of the convolution sum
The notable difference is associated with property #7. In can be facilitated by expressing one of the signals as a linear
discrete time, the width (duration) of a signal that is zero-valued combination of delayed impulses. The process is enabled by the
outside interval [a, b] is b − a + 1, not b − a. Consider two sampling property (#6 in Table 2-6).
signals, h[n] and x[n], defined as follows: Consider, for example, the convolution sum of the two signals
x[n] = {2, 3, 4} and h[n] = {5, 6, 7}, namely
Signal From To Duration
y[n] = x[n] ∗ h[n] = {2, 3, 4} ∗ {5, 6, 7}.
h[n] a b b−a+1
x[n] c d d−c+1 The sampling property allows us to express x[n] in terms of
impulses,
y[n] a+c b+d (b + d) − (a + c) + 1
x[n] = 2δ [n] + 3δ [n − 1] + 4δ [n − 2],
where y[n] = h[n] ∗ x[n]. Note that the duration of y[n] is
which leads to
(b + d) − (a + c) + 1 = (b − a + 1) + (d − c + 1) − 1
= duration h[n] + duration x[n] − 1. y[n] = (2δ [n] + 3δ [n − 1] + 4δ [n − 2])∗ h[n]
= 2h[n] + 3h[n − 1] + 4h[n − 2].
64 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
2
Given that both x[n] and h[n] are of duration = 3, the duration
of their sum is 3 + 3 − 1 = 5, and it extends from n = 0 y[3] = ∑ x[i] h[3 − i]
i=1
to n = 4. Computing y[0] using the delayed-impulses method
(while keeping in mind that h[i] has a non-zero value for only = x[1] h[2] + x[2] h[1] = 3 × 7 + 4 × 6 = 45,
i = 0, 1, and 2) leads to 2
y[4] = ∑ x[i] h[4 − i] = x[2] h[2] = 4 × 7 = 28,
y[0] = 2h[0] + 3h[−1] + 4h[−2] i=2
The process can then be repeated to obtain the values of y[n] for Hence,
n = 1, 2, 3, and 4. y[n] = {10, 27, 52, 45, 28}.
(b) The convolution sum can be computed graphically
through a four-step process.
Example 2-4: Discrete-Time Convolution Step 1: Replace index n with index i and plot x[i] and h[−i],
as shown in Fig. 2-12(a). Signal h[−i] is obtained from h[i] by
reflecting it about the vertical axis.
Given x[n] = {2, 3, 4} and h[n] = {5, 6, 7}, compute
Step 2: Superimpose x[i] and h[−i], as in Fig. 2-12(b), and
y[n] = x[n] ∗ h[n] multiply and sum them. Their product is 10.
by (a) applying the sum definition and (b) graphically. Step 3: Shift h[−i] to the right by 1 to obtain h[1 − i], as
shown in Fig. 2-12(c). Multiplication and summation of x[i] by
Solution: (a) Both signals have a length of 3 and start at time h[1 − i] generates y[1] = 27. Shift h[1 − i] by one more unit to
zero. That is, x[0] = 2, x[1] = 3, x[2] = 4, and x[i] = 0 for all the right to obtain h[2 − i], and then repeat the multiplication
other values of i. Similarly, h[0] = 5, h[1] = 6, h[2] = 7, and and summation process to obtain y[2]. Continue the shifting and
h[i] = 0 for all other values of i. multiplication and summation processes until the two signals no
By Eq. (2.71d), the convolution sum of x[n] and h[n] is longer overlap.
n
y[n] = x[n] ∗ h[n] = ∑ x[i] h[n − i]. Step 4: Use the values of y[n] obtained in step 3 to generate a
i=0 plot of y[n], as shown in Fig. 2-12(g);
Since h[i] = 0 for all values of i except i = 0, 1, and 2, it follows y[n] = {10, 27, 52, 45, 28}.
that h[n − i] = 0 for all values of i except for i = n, n − 1, and
n − 2. With this constraint in mind, we can apply Eq. (2.71d) at
discrete values of n, starting at n = 0:
Concept Question 2-8: Why are most discrete-time si-
0 nusoids not periodic?
y[0] = ∑ x[i] h[0 − i] = x[0] h[0] = 2 × 5 = 10,
i=0
1 Concept Question 2-9: Why is the length of the convo-
y[1] = ∑ x[i] h[1 − i] lution of two discrete-time signals not equal to the sum of
i=0 the lengths of the two signals?
= x[0] h[1] + x[1] h[0] = 2 × 6 + 3 × 5 = 27,
2 Exercise 2-9: A 28 Hz sinusoid is sampled at 100 sam-
y[2] = ∑ x[i] h[2 − i] ples/s. What is Ω0 for the resulting discrete-time sinusoid?
i=0
What is the period of the resulting discrete-time sinusoid?
= x[0] h[2] + x[1] h[1] + x[2] h[0]
Answer: Ω0 = 0.56π ; N = 25. (See IP )
= 2 × 7 + 3 × 6 + 4 × 5 = 52,
2-5 REVIEW OF 1-D DISCRETE-TIME SIGNALS AND SYSTEMS 65
8 8
n=0 7 n=0
6 x[i] 6 6
5 h[−i]
4 4 4
3
2 2 2
i i
−3 −2 −1 0 1 2 3 4 5 6 −3 −2 −1 0 1 2 3 4 5 6
(a) x[i] and h[−i]
y[0] = 2 × 5 = 10 y[1] = 2 × 6 + 3 × 5 = 27
8 8
h[−i] n=0 h[1 − i] n=1
6 6
4 4
2 x[i] 2 x[i]
i i
−3 −2 −1 0 1 2 3 4 5 6 −3 −2 −1 0 1 2 3 4 5 6
(b) (c)
y[2] = 2 × 7 + 3 × 6 + 4 × 5 = 52 y[3] = 3 × 7 + 4 × 6 = 45
8 8
h[2 − i] n=2 h[3 − i] n=3
6 6
4 4
2 x[i] 2 x[i]
i i
−3 −2 −1 0 1 2 3 4 5 6 −3 −2 −1 0 1 2 3 4 5 6
(d) (e)
y[4] = 4 × 7 = 28 y[n]
8
n=4
6 60
h[4 − i] 52
4 40 45
x[i] 27 28
2 20
10
i n
−3 −2 −1 0 1 2 3 4 5 6 −1 0 1 2 3 4
(f ) (g) y[n]
(a) m 6= n
Exercise 2-10: Compute the output y[n] of a discrete-time
LTI system with impulse response h[n] and input x[n], where
Evaluation of the integral in Eq. (2.74) leads to
h[n] = { 3, 1 } and x[n] = { 1, 2, 3, 4 }.
π
Z π
Answer: { 3, 7, 11, 15, 4 }. (See IP ) 1 jΩ(m−n) e jΩ(m−n)
e dΩ =
2 π −π 2π j(m − n)
−π
e j π (m−n) − e jπ (m−n)
−
=
2-6 Discrete-Time Fourier Transform j2π (m − n)
(DTFT) (−1)m−n − (−1)m−n
=
j2π (m − n)
The discrete-time Fourier transform (DTFT) is the discrete- = 0, (m 6= n). (2.75)
time counterpart to the Fourier transform. It has the same two
functions: (1) to compute spectra of signals and (2) to analyze
the frequency responses of LTI systems.
(b) m = n
2-6.1 Definition of the DTFT
If m = n, the integral reduces to
The DTFT of x[n], denoted X(Ω), and its inverse are defined as Z π Z π
1 1
e jΩ(n−n) dΩ = 1 dΩ = 1. (2.76)
∞
2π −π 2π −π
X(Ω) = ∑ x[n] e− jΩn (2.73a) The results given by Eqs. (2.75) and (2.76) can be combined into
n=−∞
the definition of the orthogonality property given by Eq. (2.74).
and Having verified the validity of the orthogonality property, we
Z π
1 now use it to derive Eq. (2.73b). Upon multiplying the definition
x[n] = X(Ω) e jΩn dΩ. (2.73b)
2π −π for the DTFT given by Eq. (2.73a) by 21π e jΩm and integrating
over Ω, we have
Readers familiar with the Fourier series will recognize that Z π Z π ∞
1 1
the DTFT X(Ω) is a Fourier series expansion with x[n] as the X(Ω) e jΩm dΩ = ∑ x[n] e jΩ(m−n) dΩ
2π −π 2π −π n=−∞
coefficients of the Fourier series. The inverse DTFT is simply Z π
∞
the formula used for computing the coefficients x[n] of the 1
= ∑ x[n] e jΩ(m−n) dΩ
Fourier series expansion of the periodic function X(Ω). 2π n=−∞ −π
We note that the DTFT definition given by Eq. (2.73a) is ∞
the same as the formula given by Eq. (2.53) for computing the = ∑ x[n] δ [m − n] = x[m]. (2.77)
spectrum Xs ( f ) of a continuous-time signal x(t) directly from n=−∞
its samples {x(n∆)}, with Ω = 2π f ∆.
The inverse DTFT given by Eq. (2.73b) can be derived as Equation (2.74) was used in the final step leading to Eq. (2.77).
follows. First, we introduce the orthogonality property Exchanging the order of integration and summation in Eq. (2.77)
is acceptable if the summand is absolutely summable; i.e., if the
Z π
1 DTFT is defined. Finally, replacing the index m with n in the top
e jΩ(m−n) dΩ = δ [m − n]. (2.74) left-hand side and bottom right-hand side of Eq. (2.77) yields
2π −π
the inverse DTFT expression given by Eq. (2.73b).
To establish the validity of this property, we consider two cases,
namely: (1) when m 6= n and (2) when m = n.
2-6 DISCRETE-TIME FOURIER TRANSFORM (DTFT) 67
Table 2-7 Properties of the DTFT. If x[n] is real-valued, then conjugate symmetry holds:
1. Linearity ∑ ci xi [n] ∑ ci Xi (Ω) Parseval’s theorem for the DTFT states that the energy of x[n]
is identical, whether computed in the discrete-time domain n or
2. Time shift x[n − n0] X(Ω) e− jn0 Ω in the frequency domain Ω:
∞ Z π
3. Modulation x[n] e jΩ0 n X(Ω − Ω0) 1
∑ |x[n]|2 = |X(Ω)|2 dΩ (2.80)
n=−∞ 2π −π
4. Time reversal x[−n] X(−Ω)
5. Conjugation x∗ [n] X∗ (−Ω) The energy spectral density is now 21π |X(Ω)|2 .
Finally, by analogy to continuous time, a discrete-time ideal
6. Time h[n] ∗ x[n] H(Ω) X(Ω) lowpass filter with cutoff frequency Ω0 has the frequency
convolution response for |Ω| < π (recall that H(Ω) is periodic with period π )
(
Special DTFT Relationships 1 for |Ω| < Ω0 ,
H(Ω) = (2.81)
0 for Ω0 < |Ω| ≤ π ,
7. Conjugate X∗ (Ω) = X(−Ω)
symmetry
which eliminates frequency components of x[n] that lie in the
8. Zero X(0) = ∑∞
n=−∞ x[n] range Ω0 < |Ω| ≤ π .
frequency Z π
1
9. Zero time x[0] = X(Ω) dΩ
2π −π 2-6.3 Important DTFT Pairs
∞
n
10. Ω = ±π X(±π ) = ∑ (−1) x[n] For easy access, several DTFT pairs are provided in Table 2-8.
n=−∞
∞ Z In all cases, the expressions for X(Ω) are periodic with period
1 π
11. Rayleigh’s ∑ |x[n]|2 = |X(Ω)|2 dΩ 2π , as they should be.
n=−∞ 2π −π
Entries #7 and #8 of Table 2-8 deserve more discussion,
(often called
Parseval’s) which we now present.
theorem
68 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
π ∞
5. sin(Ω0 n)
j ∑ [δ (Ω − Ω0 − 2π k) − δ (Ω + Ω0 − 2π k)]
k=−∞
The impulse response of an ideal lowpass filter is As can be seen in Fig. 2-13(c) and (d), hFIR [n] with N = 10
Z Ω0 provides a good approximation to an ideal lowpass filter, with
Ω0 Ω0 a finite duration. The Hamming-windowed filter belongs to a
h[n] = 1e jΩn dΩ = sinc n . (2.82)
−Ω0 π π group of filters called finite-impulse response (FIR) filters. FIR
filters can also be designed using a minimax criterion, resulting
This is called a discrete-time sinc function. A discrete-time sinc in an equiripple filter. This and other FIR filter design proce-
function h[n] with Ω0 = π /4 is displayed in Fig. 2-13(a), along dures are discussed in discrete-time signal processing textbooks.
with its frequency response H(Ω) in Fig. 2-13(b). Such a filter
is impractical for real-world applications, because it is unstable
and it has infinite duration. To override these limitations, we
can multiply h[n] by a window function, such as a Hamming
window. The modified impulse response hFIR [n] is then given
2-6 DISCRETE-TIME FOURIER TRANSFORM (DTFT) 69
h[n] H(Ω)
Impulse response Frequency response
0.25 1.5
0.20
1
0.10
0 n 0.5
−0.10 0 Ω
−20 −15 −10 −5 0 5 10 15 20 −2π −π −Ω00 Ω0 π 2π
(a) Impulse response h[n] (b) Ideal lowpass filter spectrum H(Ω) with Ω0 = π/4
hFIR[n] HFIR(Ω)
Frequency response
Impulse response 1.5
0.25
0.20 1
0.10
0.5
0 n
−0.05 0
−10 −5 0 5 10 −π 0 π
(c) Impulse response of Hamming-windowed filter (d) Spectrum HFIR(Ω) of Hamming-windowed filter
Figure 2-13 Parts (a) and (b) are for an ideal lowpass filter with Ω0 = π /4, and parts (c) and (d) are for the same filter after multiplying its
impulse response with a Hamming window of length N = 10.
( 1 − e jΩ(2N+1)
hni 1 for |n| ≤ N, = e− jΩN
rect = (2.84) 1 − e jΩ
N 0 for |n| > N. sin((2N + 1)Ω/2)
= . (2.87)
sin(Ω/2)
We note that rect Nn has duration 2N + 1. This differs from the
continuous-time rect
function
rect(t/T ), which has duration T . This is called a discrete (or periodic) sinc function. A rectangu-
The DTFT of rect Nn is obtained from Eq. (2.73a) by setting lar pulse with N = 10 is shown in Fig. 2-14 along with its DTFT.
x[n] = 1 and limiting the summation to the range (−N, N):
N
DTFT{rect[n/N]} = ∑ e− jΩn . (2.85)
n=−N Concept Question 2-10: Why does the DTFT share so
many properties with the CTFT?
Using the formula
N
1 − rN+1 Concept Question 2-11: Why is the DTFT periodic in
∑ rk = 1−r
(2.86) frequency?
k=0
70 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
by 1
N e j2π mk/N and summing over k gives
◮ To avoid confusion between the DTFT and the DFT, the
DFT, being a discrete function of integer k, uses square N−1 N−1 M−1
1 1
brackets, as in X[k], while the DTFT, being a continuous ∑ X[k] e j2π mk/N = ∑ ∑ x[n] e j2π (m−n)k/N
and periodic function of real numbers Ω, uses round paren- N k=0 N k=0 n=0
theses, as in X(Ω). ◭ 1 M−1 N−1
=
N ∑ x[n] ∑ e j2π (m−n)k/N
n=0 k=0
M−1
= ∑ x[n] δ [m − n]
n=0
(
x[m] for 0 ≤ m ≤ M − 1,
=
0 for M ≤ m ≤ N − 1.
(2.91)
Similarly,
X∗ [2] = X[4 − 2] = X[2] = −2,
2-7 DISCRETE FOURIER TRANSFORM (DFT) 73
which is real-valued. This conjugate-symmetry property follows where (n − n1 )N means (n − n1 ) reduced mod N (i.e., reduced
from the definition of the DFT given by Eq. (2.89a): by the largest integer multiple of N without (n − n1 ) becoming
negative).
N−1
X∗ [k] = ∑ x[n] e j2π nk/N (2.99)
n=0
and
N−1 N−1
C. DFT and Cyclic Convolution
X[N − k] = ∑ x[n] e− j2π n(N−k)/N = ∑ x[n] e− j2π ne j2π k/N .
n=0 n=0
(2.100) Because of the mod N reduction cycle, the expression on the
Since n is an integer, e− j2π n = 1 and Eq. (2.100) reduces to right-hand side of Eq. (2.103) is called the cyclic or circular
convolution of signals x1 [n] and x2 [n]. The terminology helps
N−1
distinguish it from the traditional linear convolution of two
X[N − k] = ∑ e j2π nk/N = X∗[k]. nonperiodic signals.
n=0
The symbol commonly used to denote cyclic convolution is
.
c
Combining Eqs. (2.101) and (2.103) leads to
B. Use of DFT for Convolution
N−1
The convolution property of the DTFT extends to the DFT after
some modifications. Consider two signals, x1 [n] and x2 [n], with
yc [n] = x1 [n]
c x2 [n] = ∑ x1[n1 ] x2 [(n − n1)N ]
n1 =0
N-point DFTs X1 [k] and X2 [k]. From Eq. (2.89b), the inverse
DFT of their product is = DFT−1 (X1 [k] X2 [k])
N−1
1 2π
1 N−1
jk 2Nπ n
= ∑ X1 [k] X2 [k] e jk N n . (2.104)
DFT−1 (X1 [k] X2 [k]) = ∑ (X1 [k] X2 [k])e N k=0
N k=0
" # The cyclic convolution yc [n] can certainly by computed by
1 N−1 jk 2π n N−1
− jk 2Nπ n1
= ∑e N
N k=0 ∑ x1 [n1 ] e applying Eq. (2.104), but it can also be computed from the
n1 =0 linear convolution x1 [n]∗ x2[n] by aliasing the latter. To illustrate,
" # suppose x1 [n] and x2 [n] are both of duration N. The linear
N−1
− jk 2Nπ n2
· ∑ x2[n2 ] e . (2.101) convolution of the two signals
n2 =0
y[n] = x1 [n] ∗ x2[n] (2.105)
Rearranging the order of the summations gives
is of duration 2N − 1, extending from n = 0 to n = 2N − 2.
DFT−1 (X1 [k] X2 [k]) = Aliasing y[n] means defining z[n], the aliased version of y[n],
as
1 N−1 N−1 N−1 2π
X1 [k] X2 [k] Next, we zero-pad x1 [n] and x2 [n] so that their durations are
= { 10 × 11, (−2 + j2)(3 − j2), 2 × 3, (−2 − j2)(3 + j2) } equal to or greater than Nc . As we will see in Section 2-8 on how
the fast Fourier transform (FFT) is used to compute the DFT, it
= { 110, −2 + j10, 6, −2 − j10 }. is advantageous to choose the total length of the zero-padded
signals to be M such that M ≥ Nc , and simultaneously M is a
Application of Eq. (2.104) leads to
power of 2.
c x2 [n] = { 28, 21, 30, 31 }.
x1 [n]
The zero-padded signals are defined as
(b) Per Eq. (2.71d), the linear convolution of x′1 [n] = {x1 [n], 0, . . . , 0}, (2.109a)
|{z} | {z }
N1 M−N1
x1 [n] = { 2, 1, 4, 3 }
x′2 [n] = {x2 [n], 0, . . . , 0}, (2.109b)
|{z} | {z }
and N2 M−N2
x2 [n] = { 5, 3, 2, 1 }
and their M-point DFTs are X′1 [k] and X′2 [k], respectively. The
is linear convolution y[n] can now be computed by a modified
version of Eq. (2.104), namely
y[n] = x1 [n] ∗ x2[n]
3 y[n] = x′1 [n] ∗ x′2[n]
= ∑ x1 [i] x2 [n − i]
i=0 = DFT−1 { X′1 [k] X′2 [k] }
= { 10, 11, 27, 31, 18, 10, 3 }. M−1
1
=
M ∑ X′1[k] X′2 [k] e j2π nk/M . (2.110)
Per Eq. (2.106), k=0
z[n] = { y[0] + y[4], y[1] + y[5], y[2] + y[6], y[3] } Note that the DFTs can be computed using M-point DFTs of
x′1 [n] and x′2 [n], since using an M-point DFT performs the zero-
= { 10 + 18, 11 + 10, 27 + 3, 31 } = { 28, 21, 30, 31 }. padding automatically.
2-7 DISCRETE FOURIER TRANSFORM (DFT) 75
and
Example 2-7: DFT Convolution
X′2 [3] = −2 + j2.
From Eq. (2.89a) with N = Nc = 4, the 4-point DFT of which is identical to the answer obtained earlier in part (a).
x′1 [n] = {4, 5, 0, 0} is For simple signals like those in this example, the DFT method
involves many more steps than does the straightforward con-
3 volution method of part (a), but for the type of signals used in
X1 [k] = ∑ x′1[n] e− jkπ n/2, k = 0, 1, 2, 3, practice, the DFT method is computationally superior.
n=0
which gives
and
X′2 [0] = 6,
X′2 [1] = −2 − j2,
X′2 [2] = 2,
76 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
Exercise 2-13: Compute the 4-point DFT of {4, 3, 2, 1}. WN = e− j2π /N , (2.111a)
− j2π nk/N
Answer: {10, (2 − j2), 2, (2 + j2)}. (See IP ) WNnk =e , (2.111b)
and
WN−nk = e j2π nk/N . (2.111c)
2-8 Fast Fourier Transform (FFT)
Using this shorthand notation, the summations for the DFT, and
its inverse given by Eq. (2.89), assume the form
◮ The fast Fourier transform (FFT) is a computational
N−1
algorithm used to compute the discrete Fourier transforms
(DFT) of discrete signals. Strictly speaking, the FFT is X[k] = ∑ x[n] WNnk , k = 0, 1, . . . , N − 1, (2.112a)
n=0
not a transform, but rather an algorithm for computing the
transform. ◭ and
1 N−1
As was mentioned earlier, the fast Fourier transform (FFT) is x[n] = ∑ X[k]WN−nk ,
N k=0
n = 0, 1, . . . , N − 1. (2.112b)
a highly efficient algorithm for computing the DFT of discrete
time signals. An N-point DFT performs a linear transformation
from an N-long discrete-time vector, namely x[n], into an N- In this form, the N-long vector X[k] is given in terms of the
long frequency domain vector X[k] for k = 0, 1, . . . , N − 1. N-long vector x[n], and vice versa, with WNnk and WN−nk acting
Computation of each X[k] involves N complex multiplications, as weighting coefficients.
so the total number of multiplications required to perform the For a 2-point DFT,
DFT for all X[k] is N 2 . This is in addition to N(N − 1) complex
additions. For N = 512, for example, direct implementation of N = 2,
the DFT operation requires 262,144 multiplications and 261,632 W20k = e− j0 = 1,
complex additions. For small N, these numbers are smaller, and
since multiplication by any of { 1, −1, j, − j } does not count as
a true multiplication. W21k = e− jkπ = (−1)k .
2-8 FAST FOURIER TRANSFORM (FFT) 77
Table 2-10 Comparison of number of complex computations required by a standard DFT and an FFT using the formulas in the
bottom row.
N Multiplication Additions
Standard DFT FFT Standard DFT FFT
2 4 1 2 2
4 16 4 12 8
8 64 12 56 24
16 256 32 240 64
.. .. .. .. ..
. . . . .
512 262,144 2,304 261,632 4,608
1,024 1,048,576 5,120 1,047,552 10,240
2,048 4,194,304 11,264 4,192,256 22,528
N
N N2 log2 N N(N − 1) N log2 N
2
divided into four 4-point DFTs, which are just additions and
1 Xe[0] 1 X[0] subtractions. This conquers the 16-point DFT by dividing it into
xe[0] = x[0]
1 1 4-point DFTs and additional MADs.
1
Xe[1] 1 X[1]
xe[1] = x[2]
−1
W14
2-point DFT
A. Dividing a 16-Point DFT
1
xo[0] = x[1] 1 Xo[0] −1 X[2] We now show that the 16-point DFT can be computed for
1 even values of k using an 8-point DFT of (x[n] + x[n + 8]) and
1 1 for odd values of k using an 8-point DFT of the modulated
Xo[1] −W14 X[3] signal (x[n] − x[n + 8])e− j2π n/16. Thus, the 16-point DFT can
xo[1] = x[3]
−1 be computed as an 8-point DFT (for even values of k) and as a
2-point DFT Recomposition modulated 8-point DFT (for odd values of k).
k
X[2k′ ] = ∑ x[n] e− j2π (2k /16)n
X[k] = [xe [0] + (−1) xe [1]] n=0
| {z }
7 15
2-point DFT of xe [n] ′ ′
= ∑ x[n] e− j2π (2k /16)n + ∑ x[n] e− j2π (2k /16)n.
+ W41k [xo [0] + (−1)k xo [1]], (2.119) n=0 n=8
| {z } (2.120)
2-point DFT of xo [n]
= DFT({e− j2π (1/16)n (x[n] − x[n + 8]), n = 0, . . . , 7}). Note the conjugate symmetry in the second and third lines:
(2.123) X[7] = X∗ [1], X[6] = X∗ [2], and X[5] = X∗ [3].
So for odd values of k, the 16-point DFT of x[n] is the 8-point
• This result agrees with direct MATLAB computation using
DFT of { e− j(2π /16)n(x[n] − x[n + 8]), n = 0, . . . , 7 }. The signal
{ x[n] − x[n + 8], n = 0, . . . , 7 } has been modulated through
multiplication by e− j(2π /16)n. The multiplications by e− j(2π /16)n fft([7 1 4 2 8 5 3 6]).
are known as twiddle multiplications (mults) by the twiddle
factors e− j(2π /16)n.
2-8.4 Dividing Up a 2N-Point DFT
We now generalize the procedure to a 2N-point DFT by dividing
Example 2-8: Dividing an 8-Point DFT it into two N-point DFTs and N twiddle mults.
into Two 4-Point DFTs (1) For even indices k = 2k′ we have:
N−1
′
e
avoid one or more zeros in H[k], but it may also introduce new and
ones. It may be necessary to try multiple values of N to satisfy
e 6= 0 for all k.
the condition that H[k] Y[3] = 6(1) + 19( j) + 32(−1) + 21(− j) = −26 − j2.
Y[0] = 6(1) + 19(1) + 32(1) + 21(1) = 78, Solution: The two-trumpets signal time-waveform is shown
Y[1] = 6(1) + 19(− j) + 32(−1) + 21( j) = −26 + j2, in Fig. 2-18(a), and the corresponding spectrum is shown in
Fig. 2-18(b). We note that the spectral lines occur in pairs of
Y[2] = 6(1) + 19(−1) + 32(1) + 21(−1) = −2, harmonics with the lower harmonic of each pair associated with
82 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
{ X(k∆ f ) } if its sampling rate S f = 1/∆ f > 2 T2 = T . In the exponent of Eq. (2.136), the expression becomes
In the sequel, we use the minimum sampling intervals
M
∆t = 1/F and ∆ f = 1/T . Finer discretization can be achieved
Xs (k∆ f ) = ∑ x(n∆t ) e− j2π (k∆ f )(n∆t ) , |k| ≤ M
by simply increasing F and/or T . In practice, F and/or T is (are)
n=−M
increased slightly so that N = FT is an odd integer, which makes
M
the factor M = (N − 1)/2 also an integer (but not necessarily an x(n∆t ) e− j2π nk/(2M+1) ,
odd integer). The factor M is related to the order of the DFT,
= ∑ |k| ≤ M. (2.140)
n=−M
which has to be an integer.
The Fourier transform of the synthetic sampled signal xs (t), This expression looks like a DFT of order 2M + 1. Recall from
defined in Eq. (2.43) and repeated here as the statement in connection with Eq. (2.47) that the spectrum
Xs ( f ) of the sampled signal includes the spectrum X( f ) of the
∞
continuous-time signal (multiplied by the sampling rate St ) plus
xs (t) = ∑ x(n∆t ) δ (t − n∆t ), (2.134)
additional copies repeated every ±St along the frequency axis.
n=−∞
With St = 1/∆t = F in the present case,
was computed in Eq. (2.53), and also repeated here as
Xs ( f ) = FX( f ), for | f | < F, (2.141)
∞
Xs ( f ) = ∑ x(n∆t ) e− j2π f n∆t . (2.135) from which we deduce that
n=−∞
M
Setting f = k∆ f gives X(k∆ f ) = Xs ( f ) ∆t = ∑ x(n∆t ) e− j2π nk/(2M+1) ∆t , |k| ≤ M.
n=−M
∞
(2.142)
Xs (k∆ f ) = ∑ x(n∆t ) e− j2π (k∆ f )(n∆t ) . (2.136) Ironically, this is the same result that would be obtained by
n=−∞
simply discretizing the definition of the continuous-time Fourier
Noting that x(t) = 0 for |t| > T2 and X( f ) = 0 for | f | > F transform! But this derivation shows that discretization gives the
2, we
restrict the ranges of n and k to exact result if x(t) is time- and bandlimited.
T /2 FT N
|n| ≤ = = (2.137a)
∆t 2 2 Example 2-11: Computing CTFT by DFT
and
F/2 FT N
|k| ≤ = = . (2.137b) Use the DFT to compute the Fourier transform of the continuous
∆f 2 2
Gaussian signal
Next, we introduce factor M defined as 1 2
x(t) = √ e−t /2 .
N −1 2π
M= , (2.138)
2
Solution: Our first task is to assign realistic values for the
and we note that if N is an odd integer, M is guaranteed to be signal duration T and the width of its spectrum F. It is an
an integer. In view of Eq. (2.137), the ranges of n and k become “educated” trial-and-error process. At t = 4, x(t) = 0.00013,
n, k = −M, . . . , M. Upon substituting so we will assume that x(t) ≈ 0 for |t| > 4. Since x(t) is
symmetrical with respect to the vertical axis, we assign
1 1 1 1 1
∆t ∆ f = = = = . (2.139)
FT FT N 2M + 1 T = 2 × 4 = 8 s.
2 2
The Fourier transform of x(t) is X( f ) = e−2π f . By trial and
error, we determine that F = 1.2 Hz is sufficient to characterize
X( f ). The combination gives
N = T F = 8 × 1.2 = 9.6.
84 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
X[k]
0.8
0.6
0.4
0.2
k
−4 −3 −2 −1 0 1 2 3 4
Figure 2-19 Comparison of exact (blue circles) and DFT-computed (red crosses) of the continuous-time Fourier transform of a Gaussian
signal.
Summary
Concepts
• Many 2-D concepts can be understood more easily by h(t) to input x(t) is output y(t) = h(t)∗ x(t), and similarly
reviewing their 1-D counterparts. These include: LTI sys- in discrete time.
tems, convolution, sampling, continuous-time; discrete- • The response of an LTI system with impulse response
time; and discrete Fourier transforms. h(t) to input A cos(2π f0t + θ ) is
• The DTFT is periodic with period 2π .
A|H( f )| cos(2π f0t + θ + H( f )),
• Continuous-time signals can be sampled to discrete-time
signals, on which discrete-time signal processing can be where H( f ) is the Fourier transform of h(t), and
performed. similarly in discrete time.
• The response of an LTI system with impulse response
Mathematical Formulae
Impulse Sinc interpolation formula
1 t ∞
δ (x) = lim
ε →0 2ε
rect
2ε
x(t) = ∑ x(n∆) sinc(S(t − n∆))
n=−∞
Energy of x(t) Discrete-time Fourier transform (DTFT)
Z ∞
2 ∞
E= |x(t)| dt x[n] e− jΩn
−∞
X(Ω) = ∑
n=−∞
Convolution Z ∞ Inverse DTFTZ
y(t) = h(t) ∗ x(t) = h(τ )x(t − τ ) d τ 1 π
−∞ x[n] = X(Ω) e jΩn dΩ
2 π −π
Convolution
∞ Discrete-time sinc
y[n] = h[n] ∗ x[n] = ∑ h[i] x[n − i]
h[n] =
Ω0
sinc
Ω0 n
i=−∞ π π
Fourier transform
Z ∞ Discrete sinc
X( f ) = x(t) e− j2π f t dt sin((2N + 1)Ω/2)
−∞ X(Ω) =
sin(Ω/2)
Inverse Fourier transform
Z ∞ Discrete Fourier Transform (DFT)
j2π f t M−1
x(t) = X( f ) e df
−∞ X[k] = ∑ x[n] e− j2π nk/N
n=0
Sinc function
sin(π x) Inverse DFT
sinc(x) =
πx 1 N−1
x[n] = ∑ X[k] e j2π nk/N
N k=0
Ideal lowpass filter impulse response
h(t) = 2 fc sinc(2 fc t) Cyclic convolution
N−1
Sampling theorem yc [n] = x1 [n]
c x2 [n] = ∑ x1 [n1] x2 [(n − n1)N ]
1 n1 =0
Sampling rate S = > 2B if X( f ) = 0 for | f | > B
∆
86 CHAPTER 2 REVIEW OF 1-D SIGNALS AND SYSTEMS
Important Terms Provide definitions or explain the meaning of the following terms:
aliasing FFT Parseval’s theorem spectrum
cyclic convolution Fourier transform Rayleigh’s theorem zero padding
deconvolution frequency response function sampled signal
DFT impulse response convolution sampling theorem
DTFT linear time-invariant (LTI) sinc function
2.4 If x(t) = sin(2t)/(π t), compute the energy of d 2 x/dt 2 . Section 2-5: Review of Discrete-Time Signals and
2.5 Compute the energy of e−t u(t) ∗ sin(t)/(π t). Systems
2.6 Show that Z ∞ 2.12 Compute the following convolutions:
sin2 (at) a
dt = (a) {1, 2} ∗ {3, 4, 5}
−∞ (π t)2 π
if a > 0. (b) {1, 2, 3} ∗ {4, 5, 6}
(c) {2, 1, 4} ∗ {3, 6, 5}
Section 2-4: The Sampling Theorem 2.13 If {1, 2, 3} ∗ x[n] = {5, 16, 34, 32, 21}, compute x[n].
2.7 The spectrum of the trumpet signal for note G (784 Hz) 2.14 Given the two systems connected in series as
is negligible above its ninth harmonic. What is the Nyquist
sampling rate required for reconstructing the trumpet signal x[n] h1 [n] w[n] = 3x[n] − 2x[n − 1],
from its samples?
PROBLEMS 87
(a) x[n] ∗ {3, 1, 4, 2} = {6, 23, 18, 57, 35, 37, 28, 6}.
(b) x[n] ∗ {1, 7, 3, 2} = {2, 20, 53, 60, 53, 54, 21, 10}.
(c) x[n] ∗ {2, 2, 3, 6} =
{12, 30, 42, 71, 73, 43, 32, 45, 42}.
0
1
This chapter extends the 1-D definitions, properties, and trans-
3
formations covered in the previous chapter into their 2-D
equivalents. It also presents certain 2-D properties that have no 5
counterparts in 1-D. 7
9
x
11
3-1 Displaying Images
13
In 1-D, a continuous-time signal x(t) is displayed by plotting 15
x(t) versus t. A discrete-time signal x[n] is displayed using a 17
stem plot of x[n] versus n. Clearly such plots are not applicable
19
for 2-D images.
0 1 3 5 7 9 11 13 15 17 19
Image intensity f (x, y) of a 2-D image can be displayed either y
as a 3-D mesh plot (Fig. 3-1(a)), which hides some features of
the image and is difficult to create and interpret, or as a grayscale (b) Grayscale format
image (Fig. 3-1(b)). In a grayscale image, the image intensity is
scaled so that the minimum value of f (x, y) is depicted in black Figure 3-1 An image displayed in (a) mesh plot format and (b)
and the maximum value of f (x, y) is depicted in white. If the grayscale format.
image is non-negative ( f (x, y) ≥ 0), as is often the case, black in
the grayscale image denotes zero values of f (x, y). If the image
is not non-negative, zero values of f (x, y) appear as a shade of depicting the infrared intensity emitted by a hot air balloon.
gray. We should not confuse a false-color image with a true-color
MATLAB’s imagesc(X),colormap(gray) displays image. Whereas a false-color image is a single grayscale image
the 2-D array X as a grayscale image in which black depicts (1 channel) displayed in color, a true-color image actually is a
the minimum value of X and white depicts the maximum value set of three images (3 channels):
of X.
It is also possible to display an image f (x, y) as a false-color { fred (x, y), fgreen (x, y), fblue (x, y)},
image, in which case different colors denote different values
of f (x, y). The relation between color and values of f (x, y) is representing (here) the three primary colors: red, green and
denoted using a colorbar, to the side or bottom of the image. An blue. Other triplets of colors, such as yellow, cyan and magenta,
example of a false-color display was shown earlier in Fig. 1-11, can also be used. Hence, image processing of color images
90
3-2 2-D CONTINUOUS-SPACE IMAGES 91
x0 x
3-2.1 Fundamental 2-D Images
(a) f Box x − x0 , y − y0 = rect x − x0 rect y − y0
A. Impulse ( ℓx ℓy ) ℓx ( ) ( )ℓy
A 2-D impulse δ (x, y) is simply y
δ (x − ξ , y − η ) = δ (x − ξ ) δ (y − η ). x0 x
(b) f Disk x − x0 , y − y0
The sifting property generalizes directly from 1-D to 2-D. In
2-D ( a a )
Z ∞Z ∞
f (ξ , η ) δ (x − ξ , y − η ) d ξ d η = f (x, y). (3.1) Figure 3-2 (a) Box image of widths (ℓx , ℓy ) and centered at
−∞ −∞
(x0 , y0 ) and (b) disk image of radius a/2 and centered at (x0 , y0 ).
92 CHAPTER 3 2-D IMAGES AND SYSTEMS
C. Disk Image
(0,0) x
Being rectangular in shape, the box-image function is suitable
for applications involving Cartesian coordinates, such as shifting
the box sideways or up and down across the image. Some
applications, however, require the use of polar coordinates, in
which case the disk-image function is more suitable. The disk y
image fDisk (x, y) of radius 1/2 is defined as
( p (a) Origin at top left and y axis downward
1 for px2 + y2 < 1/2,
fDisk (x, y) = (3.3)
0 for x2 + y2 > 1/2. y
3. Image energy
y
Extending the expression for signal energy given by Eq. (3.7)
from 1-D to 2-D leads to x′
y′ y
x′ = x cos θ + y sin θ
Z ∞Z ∞ x′ y′ = −x sin θ + y cos θ
E= | f (x, y)|2 dx dy. (3.5) θ
−∞ −∞ y′
x x
4. Even-odd decomposition
A real-valued image f (x, y) can be decomposed into its even Figure 3-4 Rotation of coordinate system (x, y) by angle θ to
fe (x, y) and odd fo (x, y) components: coordinate system (x′ , y′ ).
To rotate an image by an angle θ , we define rectangular Answer: 2-D impulse and disk are invariant to rotation;
coordinates (x′ , y′ ) as the rectangular coordinates (x, y) rotated box is not invariant to rotation.
by angle θ . Using the sine and cosine addition formulae, the
rotated coordinates (x′ , y′ ) are related to coordinates (x, y) by
(Fig. 3-4)
x′ = x cos θ + y sin θ (3.9)
3-3 Continuous-Space Systems
and A continuous-space system is a device or mathematical model
y′ = −x sin θ + y cos θ . (3.10) that accepts as an input an image f (x, y) and produces as an
94 CHAPTER 3 2-D IMAGES AND SYSTEMS
output an image g(x, y). and if the system also is shift-invariant, the 2-D superposition
integral simplifies to the 2-D convolution given by
Z ∞Z ∞
f (x, y) SYSTEM g(x, y).
g(x, y) = f (ξ , η ) h(x − ξ , y − η ) d ξ d η
−∞ −∞
The image rotation transformation described by Eq. (3.12) is a = f (x, y) ∗ ∗h(x, y), (3.15a)
good example of such a 2-D system.
where the “double star” in f (x, y) ∗ ∗h(x, y) denotes the 2-D
convolution of the PSF h(x, y) with the input image f (x, y). In
symbolic form, the 2-D convolution is written as
3-3.1 Linear and Shift-Invariant (LSI) Systems
f (x, y) LSI g(x, y) = f (x, y) ∗ ∗h(x, y). (3.15b)
The definition of the linearity property of 1-D systems (Section
2-2.1) extends directly to 2-D spatial systems, as does the
definition for the invariance, except that time invariance in 1-D
systems becomes shift invariance in 2-D systems. Systems ◮ A 2-D convolution consists of a convolution in the x
that are both linear and shift-invariant are termed linear shift- direction, followed by a convolution in the y direction, or
invariant (LSI). vice versa. Consequently, the 1-D convolution properties
listed in Table 2-3 generalize to 2-D. ◭
by
◮ The spectrum F(µ , ν ) of an image f (x, y) is its 2-D
Z ∞Z ∞
CSFT. The spatial frequency response H(µ , ν ) of an LSI
F(µ , v) = f (x, y) e− j2π (µ x+vy) dx dy. (3.16a) 2-D system is the 2-D CSFT of its PSF h(x, y). ◭
−∞ −∞
and we call the Fourier transform of h(t) the frequency response B. Separable Images
of the system, H( f ):
◮ The CSFT of a separable image f (x, y) = f1 (x) f2 (y) is
h(t) H( f ). itself separable in the spatial frequency domain:
The analogous relationships for a 2-D LSI system are f1 (x) f2 (y) F1 (µ ) F2 (v). (3.19)
δ (x, y) LSI h(x, y), (3.17a) This assertion follows directly from the definition of the CSFT
given by Eq. (3.16a):
h(x, y) H(µ , v). (3.17b)
Z ∞Z ∞
The CSFT H(µ , v) is called the spatial frequency response of F(µ , v) = f (x, y) e− j2π (µ x+vy) dx dy
−∞ −∞
the LSI system. Z ∞ Z ∞
= f1 (x) e− j2π µ x dx f2 (y) e− j2π vy dy
−∞ −∞
◮ As in 1-D, the 2-D Fourier transform of a convolution = F1 (µ ) F2 (v). (3.20)
of two functions is equal to the product of their Fourier
transforms:
◮ The CSFT pairs listed in Table 3-1 are all separable
functions, and can be obtained by applying Eq. (3.20) to the
f (x, y) LTI g(x, y) = h(x, y) ∗ ∗ f (x, y) 1-D Fourier transform pairs listed in Table 2-5. CSFT pairs
for non-separable functions are listed later in Table 3-2. ◭
implies that
Selected Properties
1. Linearity ∑ ci fi (x, y) ∑ ci Fi (µ , v)
1 µ v
2. Spatial scaling f (ax x, ay y) F ,
|ax ay | ax ay
3. Spatial shift f (x − x0 , y − y0 ) e− j2π µ x0 e− j2π vy0 F(µ , v)
CSFT Pairs
8. δ (x, y) 1
domain, respectively. A grayscale image-display of f (x, y) is By Eq. (3.20), the CSFT of f (x, y) is
shown in Fig. 3-5(a), with pure black representing f (x, y) = −1
and pure white representing f (x, y) = +1. As expected, the F(µ , v) = F1 (µ ) F2 (v)
image exhibits a repetitive pattern along both x and y, with 19 = F { cos(2π µ0 x) } F { cos(2π v0y) }
cycles in 10 cm in the x direction and 9 cycles in 10 cm in 1
the y direction, corresponding to spatial frequencies of µ = 1.9 = [δ (µ − µ0 ) + δ (µ + µ0 )]
4
cycles/cm and v = 0.9 cycles/cm, respectively.
× [δ (v − v0) + δ (v + v0)], (3.21)
0 f (t)
Signal
1
t
0
−T T
5 cm 2 2
(a) Rectangular pulse
|F( μ)|
Magnitude
spectrum T
10 cm x
0 5 cm 10 cm
y
(a) 2-D sinusoidal image f (x, y)
μ
−3 −2 −1 0 1 2 3
v
T T T T T T
13
(b) Magnitude spectrum
Phase ϕ( μ)
spectrum
180o
0 μ
μ
−3 −2 −1 0 1 2 3
T T T T T T
−13 (c) Phase spectrum
−13 0 13
(b) F( μ,v), with axes in cycles/cm Figure 3-6 (a) Rectangular pulse, and corresponding (b) magni-
tude spectrum and (c) phase spectrum.
Figure 3-5 (a) Sinusoidal image f (x, y)=cos(2π µ0 x) cos(2π v0 y)
with µ0 = 1.9 cycles/cm and v0 = 0.9 cycles/cm, and (b) the
corresponding Fourier transform F(µ , v), which consists of four
impulses (4 white dots) at { ±µ0 , ±v0 }.
D. Box Image
F(µ , v) = ℓx e− j2π µ x0 sinc(µ ℓx )ℓy e− j2π vy0 sinc(vℓy ), (3.28) (c) Phase image ϕ( μ,v)
and
−1 Im[F(µ , v)]
φ (µ , v) = tan . (3.29b)
Re[F(µ , v)]
A visual example is shown in Fig. 3-8(a) for a square box of
sides ℓx = ℓy = ℓ, and shifted to the right by L and also downward x
L
by L. Inserting x0 = L, y0 = −L, and ℓx = ℓy = ℓ in Eq. (3.28) l
leads to
FT
μ
(a) Clown face image f (x,y) (b) Magnitude spectrum of clown image F( μ,v)
×v
IFT
μ
(c) Magnified PSF h(x,y) of 2-D LPF (d) Spatial frequency response of 2-D LPF, HLP( μ,v)
with μ0 = 0.44 cycles/mm
=
IFT
μ
(e) Lowpass-filtered clown image g(x,y) (f ) Magnitude spectrum of filtered image G( μ,v)
Figure 3-9 Lowpass filtering the clown image in (a) to generate the image in (e). Image f (x, y) is 40 mm × 40 mm and the magnitude
spectra extend between −2.5 cycles/mm and +2.5 cycles/mm in both directions.
3-4 2-D CONTINUOUS-SPACE FOURIER TRANSFORM (CSFT) 101
G(µ , v) = F(µ , v) HLP (µ , v). (3.34) Figure 3-10 Rotation of axes by θ in (a) spatial domain causes
rotation by the same angle in the (b) spatial frequency domain.
The magnitude of the result is displayed in Fig. 3-9(f). Upon
performing an inverse Fourier transform on G(µ , v), we obtain
g(x, y), the lowpass-filtered image of the clown face shown in
Fig. 3-9(e). Image g(x, y) looks like a blurred version of the
original image f (x, y) because the lowpass filtering smooths out
rapid variations in the image. of the rotated image g(x, y) = f (x′ , y′ ), and Rθ is the rotation
Alternatively, we could have obtained g(x, y) directly by matrix relating (x, y) to (x′ , y′ ):
performing a convolution in the spatial domain:
g(x, y) = f (x, y) ∗ ∗hLP(x, y). (3.35) cos θ sin θ
Rθ = . (3.38)
− sin θ cos θ
Even though the convolution approach is direct and conceptually
straightforward, it is computationally much easier to perform the
filtering by transforming to the angular frequency domain, mul- The inverse relationship between (x′ , y′ ) and (x, y) is given in
tiplying the two spectra, and then inverse transforming back to terms of the inverse of matrix Rθ :
the spatial domain. The actual computation was performed using ′
x −1 x cos θ − sin θ x′
discretized (pixelated) images and the Fourier transformations = Rθ = . (3.39)
y y′ sin θ cos θ y′
were realized using the 2-D DFT introduced later in Section 3-8.
The 2-D Fourier transform of g(x, y) is given by
3-4.2 Image Rotation Z ∞Z ∞
G(µ , v) = g(x, y) e− j2π (µ x+vy) dx dy
−∞ −∞
◮ Rotating an image by angle θ (Fig. 3-10) in the 2-D Z ∞Z ∞
spatial domain (x, y) causes its Fourier transform to also = f (x′ , y′ ) e− j2π (µ x+vy) dx dy. (3.40)
rotate by the same angle in the frequency domain (µ , v). ◭ −∞ −∞
coordinates (ρ , φ ), with
y v
p
(x,y) (μ,v) µ = ρ cos φ ρ = µ 2 + v2
. (3.45)
y v v = ρ sin φ φ = tan−1 (v/µ )
r ρ
θ ϕ The Fourier transform of f (x, y) is given by Eq. (3.16a) as
x x μ Z ∞Z ∞
μ
F(µ , v) = f (x, y) e− j2π (µ x+vy) dx dy. (3.46)
−∞ −∞
(a) Spatial domain (b) Frequency domain
We wish to transform F(µ , v) into polar coordinates so we
Figure 3-11 Relationships between Cartesian and polar coordi- may apply it to circularly symmetric images or to use it in
nates in (a) spatial domain and (b) spatial frequency domain. filtering applications where the filter’s frequency response is
defined in terms of polar coordinates. To that end, we convert
the differential area dx dy in Eq. (3.46) to r dr d θ , and we use
the relations given by Eqs. (3.44) and (3.45) to transform the
exponent in Eq. (3.46):
where we define
′ µ x + vy = (ρ cos φ )(r cos θ ) + (ρ sin φ )(r sin θ )
µ cos θ sin θ µ µ
= = Rθ . (3.42) = ρ r[cos φ cos θ + sin φ sin θ ]
v′ − sin θ cos θ v v
= ρ r cos(φ − θ ). (3.47)
The newly defined spatial-frequency coordinates (µ ′ , v′ ) are
related to the original frequency coordinates (µ , v) by exactly The cosine addition formula was used in the last step. Conver-
the same rotation matrix Rθ that was used to rotate image f (x, y) sion to polar coordinates leads to
to g(x, y). The consequence of using Eq. (3.42) is that Eq. (3.38) Z ∞ Z 2π
now assumes the standard form for the definition of the Fourier F(ρ , φ ) = f (r, θ ) e− j2πρ r cos(φ −θ ) r dr d θ . (3.48a)
transform for f (x′ , y′ ): r=0 θ =0
J0(z)
1
y
3 cm
0.5
0 z 0 x
−0.5
0 2 4 6 8 10 12 14 16 18 20
v
21
Because the integration over θ extends over the range (0, 2π ),
the integrated value is the same for any fixed value of φ . Hence,
for simplicity we set φ = 0, in which case Eq. (3.49) simplifies
to
Z ∞ Z 2 π
− j2πρ r cos θ
F(ρ ) = r f (r) e d θ dr
r=0 θ =0
Z ∞ 0 μ
= 2π r f (r) J0 (2πρ r) dr, (3.50)
r=0
f (r) F(ρ )
δ (r)
1
πr
J1 (πρ )
rect(r)
2ρ
J1 (π r)
rect(ρ )
2r
1 1
r ρ
2 2
e− π r e−πρ (a) Letters image f (x,y)
δ (r − r0 ) 2π r0 J0 (2π r0 ρ )
transform is
Z ∞
F(ρ ) = 2π r δ (r − a) J0 (2πρ r) dr = 2π a J0 (2πρ a). (b) Letters image f (x′,y′) with x′ = ax and y′ = ay,,
r=0 spatially scaled by a = 4
(3.52b)
The image in Fig. 3-13(b) displays the variation of F(ρ ) as a
function of ρ in the spatial frequency domain for a ring with Figure 3-14 (a) Letters image and (b) a scaled version.
a = 1 cm (image size is 6 cm × 6 cm).
Table 3-2 provides a list of Fourier transform pairs of rota-
tionally symmetric images.
C. Gaussian Image
3-4.5 Image Examples
A. Scaling A 2-D Gaussian image is characterized by
x
y
(a) Sinusoidal image (b) Spectrum of image in (a)
x
y
(c) Sinusoidal image rotated by 45◦ (d) Spectrum of rotated sinusoidal image
Figure 3-15 (a) Sinusoidal image and (b) its 2-D spectrum; (c) rotated image and (d) its rotated spectrum.
From standard tables of integrals, we borrow the following The integrals in Eq. (3.54) and Eq. (3.55) become identical if we
identity for any real variable t: set t = r, a2 = π , and b = 2πρ , which leads to
Z ∞
2t 2 1 −b2 /4a2
te−a J0 (bt) dt = e , for a2 > 0. (3.55) 2
0 2a2 F(ρ ) = e−πρ . (3.56)
Gaussian spectrum
106 CHAPTER 3 2-D IMAGES AND SYSTEMS
1
Concept Question 3-3: Why do so many 1-D Fourier
sin(πρ) J1(πρ)
sinc( ρ) = jinc( ρ) = transform properties generalize directly to 2-D?
0.5 πρ 2ρ
2
Exercise 3-7: Compute the 2-D CSFT of f (x, y) = e−π r ,
where r2 = x2 + y2 , without using Bessel functions. Hint:
f (x, y) is separable. y
Answer: f (x, y) = e −π r 2
=e e −π x2 −π y2
is separable, so
Eq. (3.19) and entry #5 of Table 2-5 (see also entry #13
of Table 3-1) give
2 2 2 +ν 2 ) 2
F(µ , ν ) = e−π µ e−πν = e−π (µ = e−πρ .
where ∆ is the sampling length (instead of interval) and sin(π S(x − n∆)) sin(π S(y − m∆))
× . (3.62)
Sx = 1/∆ is the sampling rate in samples/meter. π S(x − n∆) π S(y − m∆)
If the spectrum of image f (x, y) is bandlimited to B—that is,
F(µ , v) = 0 outside the square region defined by As noted earlier in Section 2-4.4 in connection with Eq. (2.51),
accurate reconstruction using the sinc interpolation formula
{ (µ , v) : 0 ≤ |µ |, |v| ≤ B }, is not practical because it requires summations over infinite
number of samples.
then the image f (x, y) can be reconstructed from its samples
f [m, n], provided the sampling rate is such that S > 2B. As in
1-D, 2B is called the Nyquist sampling rate, although the units 3-5.1 Sampling/Reconstruction Examples
are now samples/meter instead of samples/second. The following image examples are designed to illustrate the im-
The sampled signal xs (t) defined by Eq. (2.43) generalizes portant role of the Nyquist rate when sampling an image f (x, y)
directly to the sampled image: (for storage or digital transmission) and then reconstructing it
∞ ∞ from its sampled version fs (x, y). We will use the term image
fs (x, y) = ∑ ∑ f (n∆, m∆) reconstruction fidelity as a qualitative measure of how well
n=−∞ m=−∞ the reconstructed image frec (x, y) resembles the original image
× [δ (x − n∆) δ (y − m∆)]. (3.61) f (x, y).
Reconstruction of frec (x, y) from the sampled image fs (x, y)
The term inside the square brackets (product of two impulse can be accomplished through either of two approaches:
trains) is called the bed of nails function, because it consists (a) Application of nearest-neighbor (NN) interpolation
of a 2-D array of impulses, as shown in Fig. 3-17. (which is a 2-D version of the 1-D nearest-neighbor interpola-
108 CHAPTER 3 2-D IMAGES AND SYSTEMS
tion), implemented directly on image fs (x, y). in part (d). The spectrum of the sampled image contains
(b) Transforming image fs (x, y) to the frequency domain, the spectrum of the original image (namely, the spectrum in
applying 2-D lowpass filtering (LPF) to simultaneously preserve Fig. 3-18(b)), plus periodic copies spaced at an interval S along
the central spectrum of f (x, y) and remove all copies thereof both directions in the spatial frequency domain. To preserve the
(generated by the sampling process), and then inverse transform- central spectrum and simultaneously remove all of the copies,
ing to the spatial domain. a lowpass filter is applied in step (f) of Fig. 3-18. Finally.
Both approaches will be demonstrated in the examples that application of the 2-D inverse Fourier transform to the spectrum
follow, and in each case we will compare an image reconstructed in part (f) leads to the reconstructed image frec (x, y) in part (e).
from an image sampled at the Nyquist rate with an aliased image We note that the process yields a reconstructed image with high-
reconstructed from an image sampled at a rate well below the fidelity resemblance to the original image f (x, y).
Nyquist rate. In all cases, the following parameters apply:
• Size of original (clown) image f (x, y) and reconstructed B. NN Reconstruction
image frec (x, y): 40 mm × 40 mm
Figure 3-19 displays image f (x, y), sampled image fs (x, y), and
• Sampling interval ∆ (and corresponding sampling rate the NN reconstructed image f (x, y). The last step was realized
S = 1/∆) and number of samples N: using a 2-D version of the nearest-neighbor interpolation tech-
nique described in Section 2-4.4D. NN reconstruction provides
• Nyquist-sampled version: ∆ = 0.2 mm, S = 5 sam- image
ples/mm, N = 200 × 200
• Sub–Nyquist-sampled version: ∆ = 0.4 mm, S = 2.5 fˆ(x, y) = fs (x, y) ∗ ∗ rect(x/∆) rect(y/∆), (3.63)
sample/mm, N = 100 × 100
which is a 2-D convolution of the 2-D sampled image fs (x, y)
• Spectrum of original image f (x, y) is bandlimited to with a box function. The spectrum of the NN interpolated signal
B = 2.5 cycles/mm is
sin(π ∆µ ) sin(π ∆v)
F̂(µ , v) = F(µ , v) . (3.64)
• Display πµ πv
• Images f (x, y), fs (x, y), frec (x, y): Linear scale As in 1-D, the zero crossings of the 2-D sinc functions coincide
with the centers of the copies of F(µ , v) induced by sampling.
• Image magnitude spectra: Logarithmic scale (for Consequently, the 2-D sinc functions act like lowpass filters
easier viewing; magnitude spectra extend over a wide along µ and v, serving to eliminate the copies almost completely.
range). Comparison of the NN-interpolated image in Fig. 3-19(c)
with the original image in part (a) of the figure leads to the
Reconstruction Example 1: conclusion that the NN technique works quite well for images
Image Sampled at Nyquist Rate sampled at or above the Nyquist rate.
Our first step is to create a bandlimited image f (x, y). This was
done by transforming an available clown image to the spatial Reconstruction Example 2:
frequency domain and then applying a lowpass filter with a Image Sampled below the Nyquist Rate
cutoff frequency of 2.5 cycles/mm. The resultant image and its
corresponding spectrum are displayed in Figs. 3-18(a) and (b), A. LPF Reconstruction
respectively.
The sequence in this example (Fig. 3-20) is identical with that
A. LPF Reconstruction described earlier in Example 1A, except for one very important
difference: in the present case the sampling rate is S = 2.5
Given that image f (x, y) is bandlimited to B = 2.5 cycles/mm, cycles/mm, which is one-half of the Nyquist rate. Consequently,
the Nyquist rate is 2B = 5 cycles/mm. Figure 3-18(c) displays the final reconstructed image in Fig. 3-20(e) bears a poor
fs (x, y), a sampled version of f (x, y), sampled at the Nyquist resemblance to the original image in part (a) of the figure.
rate, so it should be possible to reconstruct the original im-
age with good fidelity. The spectrum of fs (x, y) is displayed
3-5 2-D SAMPLING THEOREM 109
FT
μ
(c) Nyquist-sampled image fs(x,y) (d) Spectrum Fs( μ,v) of sampled image
Filtering
IFT
μ
Figure 3-18 Reconstruction Example 1A: After sampling image f (x, y) in (a) to generate fs (x, y) in (c), the sampled image is Fourier
transformed [(c) to (d)], then lowpass-filtered [(d) to (f)] to remove copies of the central spectrum) and inverse Fourier transform [(f) to (e)]
to generate the reconstructed image frec (x, y). All spectra are displayed in log scale.
110 CHAPTER 3 2-D IMAGES AND SYSTEMS
B. NN Reconstruction
The sequence in Fig. 3-21 parallels the sequence in Fig. 3-19,
except that in the present case we are working with the sub-
Nyquist sampled image. As expected, the NN interpolation
technique generates a poor-fidelity reconstruction, just like the
LPF reconstructed version.
FT
μ
(c) Sub-Nyquist sampled image fs(x,y) (d) Spectrum Fs( μ,v) of sampled image
Filtering
IFT
μ
Figure 3-20 Reconstruction Example 2A: Image f (x, y) is sampled at half the Nyquist rate (S = 2.5 sample/mm compared with 2B = 5
samples/mm). Consequently, the reconstructed image in (e) bears a poor resemblance to the original image in (a). All spectra are displayed in
log scale.
112 CHAPTER 3 2-D IMAGES AND SYSTEMS
∆rec
∆rec
ρ0
μ
0
ν
3-6 2-D Discrete Space
3-6.1 Discrete-Space Images
A discrete-space image represents a physical quantity that
varies with discrete space [n, m], where n and m are dimension-
ρ0 less integers. Such an image usually is generated by sampling a
continuous-space image f (x, y) at a spatial interval ∆s along the
μ x and y directions. The sampled image is defined by
0
f [n, m] = f (n∆s , m∆s ). (3.65)
A. Image Axes
(b) Spectrum Fs( ρ,ϕ) As noted earlier in connection with continuous-space images,
multiple different formats are used in both continuous- and
discrete-space to define image coordinates. We illustrate the
Figure 3-23 (a) Hexagonal tiling and (b) corresponding spec- most common of these formats in Fig. 3-25. In the top of the
trum Fs (ρ , φ ). figure, we show pixel values for a 10 × 10 array. In parts (a)
114 CHAPTER 3 2-D IMAGES AND SYSTEMS
(a) Radially bandlimited image f (r,θ) (b) Spectrum F( ρ,ϕ) of image f (r,θ)
Sampling at 2ρ0
v
FT
μ
(c) Hexagonally sampled image fs(r,θ) (d) Spectrum Fs( ρ,ϕ) of sampled image
Filtering
IFT
μ
0 0 0 0 0 0 0 0 0 0
0 3 5 7 9 10 11 12 13 14
0 5 10 14 17 20 23 25 27 29
0 8 15 21 26 30 34 37 40 43
0 10 20 27 34 40 45 50 54 57
0 10 20 27 34 40 45 50 54 57
0 8 15 21 26 30 34 37 40 43
0 5 10 14 17 20 23 25 27 29
0 3 5 7 9 10 11 12 13 14
0 0 0 0 0 0 0 0 0 0
Pixel values
Origin
m
0 9
1 8
2 7
3 6
4 5
5 4
6 3
7 2
8 1
9 0
n n
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
m
(a) Top-left corner format (b) Bottom-left corner format
Origin
m
4 1
3 2
2 3
1 4
0 5
−1
n 6
−2 7
−3 8
−4 9
−5 10
−5 −4 −3 −2 −1 0 1 2 3 4 1 2 3 4 5 6 7 8 9 10
n′
m′
(c) Center-of-image format (d) MATLAB format
Figure 3-25 The four color images are identical in pixel values, but they use different formats for the location of the origin and for coordinate
directions for image f [n, m]. The MATLAB format in (d) represents X(m′ , n′ ).
116 CHAPTER 3 2-D IMAGES AND SYSTEMS
through (d), we display the same color maps corresponding to MATLAB format
the pixel array, except that the [n, m] coordinates are defined
differently, namely: In MATLAB, an (M × N) image is defined as
{ X(m′ , n′ ), 1 ≤ m′ ≤ M, 1 ≤ n′ ≤ N }. (3.68)
Top-left corner format
MATLAB uses the top-left corner format, except that its
Figure 3-25(a): [n, m] starts at [0, 0] and both integers extend indices n′ and m′ start at 1 instead of 0. Thus, the top-left corner
to 9, the origin is located at the upper left-hand corner, m in- is (1, 1) instead of [0, 0]. Also, m′ , the first index in X(m′ , n′ ),
creases downward, and n increases to the right. For a (M × N) represents the vertical axis and the second index, n′ , represents
image, f [n, m] is defined as the horizontal axis, which is the reverse of the index notation
represented in f [n, m]. The two notations are related as follows:
{ f [n, m], 0 ≤ n ≤ N − 1, 0 ≤ m ≤ M − 1 } (3.66a)
f [n, m] = X(m′ , n′ ), (3.69a)
or, equivalently,
with
f [n, m] =
m′ = m + 1, (3.69b)
f [0, 0] f [1, 0] . . . f [N − 1, 0]
f [0, 1] f [1, 1] . . . f [N − 1, 1] ′
n = n + 1. (3.69c)
.. .. . (3.66b)
. .
f [0, M − 1] f [1, M − 1] . . . f [N − 1, M − 1]
Column 3 of f [n,m]
Image f [n,m]
Image f [n − 2, m − 1]
Row 5 of f [n,m]
Figure 3-26 Image f [n − 2, m − 1] is image f [n, m] shifted down by 1 and to the right by 2.
3-7 2-D Discrete-Space Fourier ◮ The spectrum of an image f [n, m] is its DSFT F(Ω1 , Ω2 ).
Transform (DSFT) The discrete-space frequency response H(Ω1 , Ω2 ) of an
LSI system is the DSFT of its point spread function (PSF)
The 2-D discrete-space Fourier transform (DSFT) is obtained h[n, m]. ◭
via direct generalization of the 1-D DTFT (Section 2-6) to
2-D. The DSFT consists of a DTFT applied first along m
and then along n, or vice versa. By extending the 1-D DTFT
definition given by Eq. (2.73a) (as well as the properties listed Example 3-2: DSFT of Clown Image
in Table 2-7) to 2-D, we obtain the following definition for the
DSFT F(Ω1 , Ω2 ) and its inverse f [n, m]:
∞ ∞ Use MATLAB to obtain the magnitude image of the DSFT of
F(Ω1 , Ω2 ) = ∑ ∑ f [n, m] e− j(Ω1 n+Ω2 m) , (3.73a) the clown image.
n=−∞ m=−∞
Z π Z π
1 Solution: The magnitude part of the DSFT is displayed in
f [n, m] = F(Ω1 , Ω2 )
4π 2 −π −π Fig. 3-27. As expected, the spectrum is periodic with period 2π
×e j(Ω1 n+Ω2 m)
dΩ1 dΩ2 . (3.73b) along both Ω1 and Ω2 .
The properties of the DSFT are direct 2-D generalizations of Concept Question 3-7: What is the DSFT used for? Give
the properties of the DTFT, and discrete-time generalizations of three applications.
the properties of the 2-D continuous-space Fourier transform.
3-8 2-D DISCRETE FOURIER TRANSFORM (2-D DFT) 119
Table 3-3 Properties of the (K2 × K1 ) 2-D DFT. In the time-shift and modulation properties, (k1 − k1′ ) and (n − n0 ) must be reduced
mod(K 1 ), and (k2 − k2′ ) and (m − m0 ) must be reduced mod(K 2 ).
Selected Properties
linear 2-D convolutions h[n, m] ∗ ∗ f [n, m] can be zero-padded to 3-8.2 Conjugate Symmetry for the 2-D DFT
cyclic convolutions, just as in 1-D.
n − j(2π /K2 )mk2 ,
(K2 − k2), respectively: (4) F[K1 /2, k2 ] = ∑M−1 N−1
m=0 ∑n=0 f [n, m] (−1) e
which is the K2 -point 1-D DFT of ∑N−1 n=0 f [n, m] (−1)
n
F[K1 − k1 , K2 − k2 ] − j(2 π / K1 )n(K1 /2) j π n n
because e = e = (−1) .
N−1 M−1
n(K1 − k1 ) m(K2 − k2)
= ∑ ∑ f [n, m] exp − j2π +
m − j(2π /K1 )nk1
n=0 m=0 K1 K2 (5) F[k1 , K2 /2] = ∑N−1 M−1
n=0 ∑m=0 f [n, m] (−1) e ,
N−1 M−1 M−1
which is the K1 -point 1-D DFT of ∑m=0 f [n, m] (−1)m
nk1 mk2
= ∑ ∑ f [n, m] exp j2π + because e− j(2π /K2)m(K2 /2) = e jπ m = (−1)m .
n=0 m=0 K1 K2
nK1 mK2 (6) F[K1 /2, K2 /2] = ∑N−1 M−1 m+n .
× exp − j2π + n=0 ∑m=0 f [n, m] (−1)
K1 K2
N−1 M−1
nk1 mk2
= ∑ ∑ f [n, m] exp j2π +
n=0 m=0 K1 K2
3-9 Computation of the 2-D DFT Using
× e− j2π (n+m)
N−1 M−1 MATLAB
nk1 mk2
= ∑ ∑ f [n, m] exp j2π
K1
+
K2
, (3.79)
We remind the reader that the notation used in this book
n=0 m=0
represents images f [n, m] defined in Cartesian coordinates, with
where we used e− j2π (n+m) = 1 because n and m are integers. The the origin at the upper left corner, the first element n of the
expression on the right-hand side of Eq. (3.79) is identical to the coordinates [n, m] increasing horizontally rightward from the
expression for F[k1 , k2 ] given by Eq. (3.75) except for the minus origin, and the second element m of the coordinates [n, m]
sign ahead of j. Hence, for a real-valued image f [n, m], increasing vertically downward from the origin. To illustrate
with an example, let us consider the (3 × 3) image given by
F∗ [k1 , k2 ] = F[K1 − k1 , K2 − k2 ], (3.80) f [0, 0] f [1, 0] f [2, 0] 3 1 4
1 ≤ k1 ≤ K1 − 1; 1 ≤ k2 ≤ K2 − 1, f [n, m] = f [0, 1] f [1, 1] f [2, 1] = 1 5 9 . (3.81)
f [0, 2] f [1, 2] f [2, 2] 2 6 5
where (K2 × K1 ) is the order of the 2-D DFT. When stored in MATLAB as array X(m′ , n′ ), the content remains
the same, but the indices swap roles and their values start at
3-8.3 Special Cases (1, 1):
A. f [n, m] is real X(1, 1) X(1, 2) X(1, 3) 3 1 4
X(m′ , n′ ) = X(2, 1) X(2, 2) X(2, 3) = 1 5 9 .
If f [n, m] is a real-valued image, the following special cases X(3, 1) X(3, 2) X(3, 3) 2 6 5
hold: (3.82)
Arrays f [n, m] and X(m′ , n′ ) are displayed in Fig. 3-28.
(1) F[0, 0] = ∑N−1 M−1
n=0 ∑m=0 f [n, m] is real-valued. Application of Eq. (3.75) with N = M = 3 and K1 = K2 = 3
− j(2π /K )mk to the 3 × 3 image defined by Eq. (3.81) leads to
(2) F[0, k2 ] = ∑M−1 N−1
m=0 ∑n=0 f [n, m] e
2 2,
N−1
which is the K2 -point 1-D DFT of ∑n=0 f [n, m]. 36 −9 + j5.2 −9 − j5.2
− j(2π /K )nk F[k1 , k2 ] = −6 − j1.7 9 + j3.5 1.5 + j0.9 . (3.83)
(3) F[k1 , 0] = ∑N−1 M−1
n=0 ∑m=0 f [n, m] e
1 1,
−6 + j1.7 1.5 − j0.9 9 − j3.5
M−1
which is the K1 -point 1-D DFT of ∑m=0 f [n, m].
◮ In MATLAB, the command FX=fft2(X,M,N) com-
B. f [n, m] is real and K1 and K2 are even putes the (M × N) 2-D DFT of array X and stores it in array
If also K1 and K2 are even, then the following relations apply: FX. ◭
122 CHAPTER 3 2-D IMAGES AND SYSTEMS
3 × 3 Image
3 1 4
1 5 9
2 6 5
DFT fft2(X)
9 − j3.5 −6 + j1.7 1.5 − j0.9 9 − j3.5 −6 + j1.7 1.5 − j0.9
′ ′
Fc [k1c , k2c ] = −9 − j5.2 36 −9 + j5.2 FXC(k2c , k1c ) = −9 − j5.2 36 −9 + j5.2
1.5 + j0.9 −6 − j1.7 9 + j3.5 1.5 + j0.9 −6 − j1.7 9 + j3.5
Figure 3-28 In common-image format, application of the 2-D DFT to image f [n, m] generates F[k1 , k2 ]. Upon shifting F[k1 , k2 ] along k1 and k2 to
center the image, we obtain the center-of-image format represented by Fc [k1c , k2c ]. The corresponding sequence in MATLAB starts with X(m′ , n′ ) and
′ , k′ ).
concludes with FXC(k2c 1c
The corresponding array in MATLAB, designated FX(k2′ , k1′ ) 3-9.1 Center-of-Image Format
and displayed in Fig. 3-28, has the same content but with
MATLAB indices (k2′ , k1′ ). Also, k2′ increases downward and In some applications, it is more convenient to work with the
k1′ increases horizontally. The relationships between MATLAB 2-D DFT array when arranged in a center-of-image format
indices (k2′ , k1′ ) and common-image format indices (k1 , k2 ) are (Fig. 3-25(c) but with the vertical axis pointing downward) than
identical in form to those given by Eq. (3.69), namely in the top-left corner format. To convert array F[k1 , k2 ] to a
center-of-image format, we need to shift the array elements to
k2′ = k2 + 1, (3.84)
the right and downward by an appropriate number of steps so as
k1′ = k1 + 1. (3.85) to locate F[0, 0] in the center of the array. If we denote the 2-D
3-9 COMPUTATION OF THE 2-D DFT USING MATLAB 123
DFT in the center-of-image format as Fc [k1c , k2c ], then its index with k1 = K1 − k1c , k2 = K2 − k2c ,
k1c extends over the range
Ki − 1 Ki − 1
− ≤ k1c ≤ , for Ki = odd, (3.86) (d) Fourth Quadrant
2 2
and Fc [k1c , −k2c ] = F[k1 , k2 ], (3.90d)
A. N = M and K1 = K2 = odd
◮ In MATLAB, the command
FXC=fftshift(fft2(FX)) The (3 × 3) image shown in Fig. 3-28 provides an example of an
(M × M) image with M being an odd integer. As noted earlier,
shifts array FX to center-image format and stores it in array when the 2-D DFT is displayed in the center-of-image format,
FXC. ◭ the conjugate symmetry about the center of the array becomes
readily apparent.
In the general case for any integers K1 and K2 , transform-
ing the 2-D DFT F[k1 , k2 ] into the center-of-image format
Fc [k1c , k2c ] entails the following recipe:
For A. N = M and K1 = K2 = even
(
′ Ki /2 − 1 if Ki is even, Let us consider the (4 × 4) image
Ki = (3.89)
(Ki − 1)/2 if Ki is odd,
1 2 3 4
and 0 ≤ k1c , k2c ≤ K′i : 2 4 5 3
f [n, m] = . (3.91)
3 4 6 2
(a) First Quadrant 4 3 2 1
Fc [k1c , k2c ] = F[k1 , k2 ], (3.90a) The (4 × 4) 2-D DFT F[k1 , k2 ] of f [n, m], displayed in the upper-
left corner format, is
with k1 = k1c and k2 = k2c ,
F[0, 0] F[1, 0] F[2, 0] F[3, 0]
(b) Second Quadrant
F[0, 1] F[1, 1] F[2, 1] F[3, 1]
F=
Fc [−k1c , k2c ] = F[k1 , k2 ], (3.90b) F[0, 2] F[1, 2] F[2, 2] F[3, 2]
F[0, 3] F[1, 3] F[2, 3] F[3, 3]
with k1 = K1 − k1c , k2 = k2c ,
49 −6 − j3 3 −6 + j3
−5 − j4 2 + j9 −5 + j2 j
(c) Third Quadrant = . (3.92)
1 −4 + j3 −1 −4 − j3
Fc [−k1c , −k2c ] = F[k1 , k2 ], (3.90c) −5 + j4 −j −5 − j2 2 − j9
124 CHAPTER 3 2-D IMAGES AND SYSTEMS
Fc (k1c , k2c )
′
F [−2, −2] F′ [−1, −2] F′ [0, −2] F′ [1, −2]
′ ′ ′ ′
F [−2, −1] F [−1, −1] F [0, −1] F [1, −1]
= ′
F [−2, 0] F′ [−1, 0] F′ [0, 0] F′ [1, 0]
′ ′
F [−2, −1] F [−1, 1] ′
F [0, 1] F′ [1, 1]
−1 −4 − j3 1 −4 + j3
−5 − j2 2 − j9 −5 + j4 −j
= . (3.93)
3 −6 + j3 49 −6 − j3
−5 + j2 j −5 − j4 2 + j9
Exercise 3-15: Why did this book not spend more space on
computing the DSFT?
Answer: Because in practice, the DSFT is computed using
the 2-D DFT.
3-9 COMPUTATION OF THE 2-D DFT USING MATLAB 125
Summary
Concepts
• Many 2-D concepts are generalizations of 1-D coun- space images, on which discrete-space image processing
terparts. These include: LSI systems, convolution, sam- can be performed.
pling, 2-D continuous-space; 2-D discrete-space; and • Nearest-neighbor interpolation often works well for in-
2-D discrete Fourier transforms. terpolating sampled images to continuous space. Hexag-
• The 2-D DSFT is doubly periodic in Ω1 and Ω2 with onal sampling can also be used.
periods 2π . • Discrete-space images can be displayed in several differ-
• Rotating an image rotates its 2-D continuous-space ent formats (the location of the origin differs).
Fourier transform (CSFT). The CSFT of a radially sym- • The response of an LSI system with point spread
metric image is radially symmetric. function h(x, y) to image f (x, y) is output g(x, y) =
• Continuous-space images can be sampled to discrete- h(x, y) ∗ ∗ f (x, y), and similarly in discrete space.
Mathematical Formulae
Impulse Ideal radial lowpass filter PSF
δ (r) h(r) = 4ρ02 jinc(2ρ0 r)
δ (x, y) = δ (x) δ (y) =
πr
Energy of f (x, y) 2-D Sampling
Z ∞Z ∞ 1
E= | f (x, y)|2 dx dy Sampling interval > 2B if F(µ , ν ) = 0 for |µ |, |ν | > B
∆
−∞ −∞
Important Terms Provide definitions or explain the meaning of the following terms:
aliasing CSFT DSFT linear shift-invariant (LSI) point spread function sampling theorem
convolution DFT FFT nearest-neighbor interpolation sampled image sinc function
126 CHAPTER 3 2-D IMAGES AND SYSTEMS
3.9 Nearest-Neighbor interpolation: Run MATLAB program 3.15 Compute the spatial frequency response of the system in
P39.m. This samples the clown image and reconstructs from Problem 3.12.
PROBLEMS 127
3.16 Show that the 2-D spatial frequency response of the PSF using (2 × 2) 2-D DFTs.
1 2 1
h[m, n] = 2 4 2
1 2 1
y[n, m] = x1 [n, m]
c
c x2 [n, m],
where
1 2
x1 [n, m] =
3 4
and
5 6
x2 [n, m] =
7 8
Chapter 4
4 Image Interpolation
0
Contents
50
Overview, 129
4-1 Interpolation Using Sinc Functions, 129 100
4-2 Upsampling and Downsampling Modalities, 130
4-3 Upsampling and Interpolation, 133 150
4-4 Implementation of Upsampling Using 2-D DFT
in MATLAB, 137 200
4-5 Downsampling, 140
4-6 Antialias Lowpass Filtering, 141 250
4-7 B-Splines Interpolation, 143
4-8 2-D Spline Interpolation, 149 300
4-9 Comparison of 2-D Interpolation Methods, 150
4-10 Examples of Image Interpolation 350
Applications, 152
Problems, 156 399
0 50 100 150 200 250 300 350 399
129
130 CHAPTER 4 IMAGE INTERPOLATION
In reality, an image is finite in size, and so is the number of where the rectangle function, defined earlier in Eq. (2.2), is given
samples { f (n∆, m∆) }. Consequently, for a square image, the by (
ranges of indices n and m are limited to finite lengths M, in x 1 for − a < x < a,
which case the infinite sums in Eq. (4.1) become finite sums: rect =
2a 0 otherwise.
M−1 M−1 x y
Parameter a usually is assigned a value of 2 or 3. In MATLAB,
fsinc (x, y) = ∑ ∑ f (n∆, m∆) sinc
∆
− n sinc
∆
−m .
the Lanczos interpolation formula can be exercised using the
n=0 m=0
(4.3) command imresize and selecting Lanczos from the menu.
The summations start at n = 0 and m = 0, consistent with the In addition to the plot for sinc(x), Fig. 4-2 also contains plots
image display format shown in Fig. 3-3(a), wherein location of the sinc function multiplied by the windowed sinc function,
(0, 0) is at the top-left corner of the image. with a = 2 and also with a = 3. The rectangle function is zero
The sampled image consists of M × M values—denoted here beyond |x| = a, so the windowed function stops at |x| = 2 for
by f (n∆, m∆)—each of which is multiplied by the product of a = 2 and at |x| = 3 for a = 3. The Lanczos interpolation method
two sinc functions, one along x and another along y. The value provides significant computational improvement over the simple
fsinc (x, y) at a specified location (x, y) on the image consists of sinc interpolation formula.
the sum of M × M terms. In the limit for an image of infinite
size, and correspondingly an infinite number of samples along Concept Question 4-1: What is the advantage of Lanc-
x and y, the infinite summations lead to fsinc (x, y) = f (x, y), zos interpolation over sinc interpolation?
where f (x, y) is the original image. That is, the interpolated
image is identical to the original image, assuming all along that
the sampled image is in compliance with the Nyquist criterion Exercise 4-1: If sinc interpolation is applied to the samples
of the sampling theorem. When applying the sinc interpolation {x(0.1n) = cos(2π (6)0.1n)}, what will be the result?
formula to a finite-size image, the interpolated image should be Answer: {x(0.1n)} are samples of a 6 Hz cosine sam-
a good match to the original image, but it is computationally pled at a rate of 10 samples/second, which is below the
inefficient when compared with the spatial-frequency domain Nyquist frequency of 12 Hz. The sinc interpolation will
interpolation technique described later in Section 4-3.2. be a cosine with frequency aliased to (10 − 6) = 4 Hz (see
Section 2-4.3).
1
sinc(x)
0.8
sinc(x) sinc(x/2) rect(x/4)
0.6 sinc(x) sinc(x/3) rect(x/6)
0.4
0.2
−0.2 x
−3 −2 −1 0 1 2 3
Figure 4-2 Sinc function (in black) and Lanczos windowed sinc functions (a = 2 in blue and a = 3 in red).
image is characterized by four parameters: (Mu × Mu ) image g[n, m], with Mu > Mo and
w × w = w2 = image pixel area, Since the sampling interval ∆u in the enlarged upsampled image
2 is shorter than the sampling interval in the initial image, ∆o , the
∆o × ∆o = ∆o = true resolution area. Nyquist requirement continues to be satisfied.
If the displayed image bears a one-to-one correspondence to the
array f [n, m], then the brightness of a given pixel corresponds B. Increasing Image Array Size While Keeping
to the magnitude f [n, m] of the corresponding element in the Physical Size Unchanged
array, and the pixel dimensions w × w are directly proportional
to ∆o × ∆o . For simplicity, we set w = ∆o , which means that the The image displayed in Fig. 4-3(b) is an upsampled version of
image displayed on the computer screen has the same physical f [n, m] in which the array size was increased from (M × M) to
dimensions as the original image f (x, y). The area ∆2o is the true (Mu × Mu ) and the physical size of the image remained the same,
resolution area of the image. but the pixel size is smaller. In the image g[n, m],
Mo Mo
T = T ′, w′ = w , and ∆u = ∆o . (4.7)
Mu Mu
4-2.1 Upsampling Modalities
A. Enlarging Physical Size While Keeping Pixel ◮ The two images in Fig. 4-3(a) and (b) have identical ar-
Size Unchanged rays g[n, m], but they are artificially displayed on computer
screens with different pixel sizes. ◭
The image configuration in part (a) of Fig. 4-3 depicts what
happens when the central image f [n, m] is enlarged in size from
(T × T ) to (T ′ × T ′ ) while keeping the pixel size the same. The 4-2.2 Downsampling Modalities
process, accomplished by an upsampling operation, leads to an
132 CHAPTER 4 IMAGE INTERPOLATION
T′ g[n,m]
T′ = T
Upsampling
Upsampling
Δu = Δo /2
f [n,m] Δu = Δo /2 w′
T
(b) Image with Mu = 2Mo,
T ′ = T, and w′ = w /2
w′
(a) Image with w′ = w, Mu = 2Mo,
and T ′ = 2T Downsampling
Downsampling T′ = T
w
Δd = 2Δo
T′ Δd = 2Δo Original image
Mo × Mo
w′ w′
(c) Thumbnail image with w′ = w, (d) Image with w′ = 2w, T ′ = T,
Md = Mo /2 and T ′ = T/2 and Md = Mo /2
4-3 Upsampling and Interpolation illustrated in Fig. 4-3(a), rather than to decrease the sampling
interval.
Upsampling image f [n, m] to image g[n′ , m′ ] can be accom-
Let f (x, y) be a continuous-space image of size T (meters) by T
plished either directly in the discrete spatial domain or indirectly
(meters) whose 2-D Fourier transform F(µ , v) is bandlimited to
in the spatial frequency domain. We examine both approaches in
B (cycles/m). That is,
the subsections that follow.
F(µ , v) = 0 for |µ |, |ν | > B. (4.10)
Image f (x, y) is not available to us, but an (Mo × Mo ) sampled 4-3.1 Upsampling in the Spatial Domain
version of f (x, y) is available. We define it as the original
In practice, image upsampling is performed using the 2-D
sampled image
DFT in the spatial frequency domain because it is much faster
f [n, m] = { f (n∆o , m∆o ), 0 ≤ n, m ≤ Mo − 1 }, (4.11) and easier computationally than performing the upsampling
directly in the spatial domain. Nevertheless, for the sake of
where ∆o is the associated sampling interval. Moreover, the completeness, we now provide a succinct presentation of how
sampling had been performed at a rate exceeding the Nyquist upsampling is performed in the spatial domain using the sinc
rate, which requires the choice of ∆o to satisfy the condition interpolation formula.
We start by repeating Eq. (4.3) after replacing ∆ with ∆o and
1 M with Mo :
∆o < . (4.12)
2B
Mo −1 Mo −1
x y
Given that the sampled image is T × T in size, the number of f (x, y) = ∑ ∑ f [n, m] sinc − n sinc −m .
samples Mo along each direction is n=0 m=0 ∆o ∆o
(4.16)
T Here, f [n, m] is the original (Mo × Mo ) sampled image available
Mo = . (4.13) to us, and the goal is to upsample it to an (Mu × Mu ) image
∆o
g[n′ , m′ ], with Mu = LMo , where L is an upsampling factor. In
Next, we introduce a new (yet to be created) higher-density the upsampled image, the sampling interval ∆u is related to the
sampled image g[n, m], also T × T in physical dimensions but sampling interval ∆o of the original sampled image by
containing Mu × Mu samples—instead of Mo × Mo samples—
with Mu > Mo (which corresponds to the scenario depicted in Mo ∆o
∆u = ∆o = . (4.17)
Fig. 4-3(b)). We call g[n′ , m′ ] the upsampled version of f [n, m]. Mu L
The goal of upsampling and interpolation, which usually is
abbreviated to just “upsampling,” is to compute g[n′ , m′ ] from To obtain g[n′ , m′ ], we sample f (x, y) at x = n′ ∆u and y = m′ ∆u :
f [n, m]. Since Mu > Mo , g[n′ , m′ ] is more finely discretized than
f [n, m], and the narrower sampling interval ∆u of the upsampled g[n′ , m′ ] = f (n′ ∆u , m′ ∆u ) (0 ≤ n′ , m′ ≤ Mu − 1)
image is Mu −1 Mu −1
∆u =
T
=
Mo
∆o . (4.14)
= ∑ ∑ f [n, m]
n=0 m=0
Mu Mu
′ ∆u ′ ∆u
Since ∆u < ∆o , it follows that the sampling rate associated with × sinc n − n sinc m −m
g[n′ , m′ ] also satisfies the Nyquist rate. ∆o ∆o
The upsampled image is given by Mu −1 Mu −1
′ ′ ′ ′ ′ ′
= ∑ ∑ f [n, m]
g[n , m ] = { f (n ∆u , m ∆u ), 0 ≤ n , m ≤ Mu − 1 }. (4.15) n=0 m=0
′
n′ m
In a later section of this chapter (Section 4-6) we demonstrate × sinc − n sinc −m . (4.18)
L L
how the finer discretization provided by upsampling is used to
compute a rotated or warped version of image f [n, m]. Another If greater truncation is desired, we can replace the product of
application is image magnification, but in that case the primary sinc functions with the product of the Lanczos functions defined
goal is to increase the image size (from T1 × T1 to T2 × T2 ), as in Eq. (4.4). In either case, application of Eq. (4.18) generates
134 CHAPTER 4 IMAGE INTERPOLATION
the upsampled version g[n′ , m′ ] directly from the original sam- tion for F[k1 , k2 ] extends to (Mo − 1) whereas the summation for
pled version f [n, m]. If ∆u = ∆o /L and L is an integer, then the G[k1 , k2 ] extends to (Mu − 1).
process preserves the values of f [n, m] while adding new ones in The goal is to compute g[n, m] by (1) transforming f [n, m] to
between them. To demonstrate that the upsampling process does obtain F[k1 , k2 ], (2) transforming F[k1 , k2 ] to G[k1 , k2 ], and (3)
indeed preserve f [n, m], let us consider the expression given by then transforming G[k1 , k2 ] back to the spatial domain to form
Eq. (4.15) for the specific case where n′ = Ln and m′ = Lm: g[n, m]. Despite the seeming complexity of having to execute a
three-step process, the process is computationally more efficient
g[Ln, Lm] = { f (n′ ∆u , m′ ∆u ), 0 ≤ n′ , m′ ≤ Mu − 1 } than performing upsampling entirely in the spatial domain.
= { f (Ln∆u , Lm∆u ), 0 ≤ n, m ≤ Mo − 1 } Upsampling in the discrete frequency domain [k1 , k2 ] en-
= { f (n∆o , m∆o ), 0 ≤ n, m ≤ Mo − 1 } = f [n, m], tails increasing the number of discrete frequency components
from (Mo × Mo ) for F[k1 , k2 ] to (Mu × Mu ) for G[k1 , k2 ], with
where we used the relationships given by Eqs. (4.11) and (4.14). Mu > Mo . As we will demonstrate shortly, G[k1 , k2 ] includes all
Hence, upsampling by an integer L using the sinc interpola- of the elements of F[k1 , k2 ], but it also includes some additional
tion formula does indeed preserve the existing values of f [n, m], rows and columns filled with zeros.
in addition to adding interpolated values between them.
A. Original Image
4-3.2 Upsampling in the Spatial Frequency
Domain Let us start with the sampled image fo (x, y) of continuous
image f (x, y) sampled at a sampling interval ∆o and resulting
Instead of using the sinc interpolation formula given by in (Mo × Mo ) samples. Per Eq. (3.61), adapted to a finite sum
Eq. (4.18), upsampling can be performed much more easily, that starts at (0, 0) and ends at (Mo − 1, Mo − 1),
and with less computation, using the 2-D DFT in the spatial
frequency domain. From Eqs. (4.11) and (4.18), the (Mo × Mo ) Mo −1 Mo −1
original f [n, m] image and the (Mu × Mu ) upsampled image fo (x, y) = ∑ ∑ f (n∆o , m∆o ) δ (x − n∆o ) δ (y − m∆o)
g[n′ , m′ ] are defined as n=0 m=0
Mo −1 Mo −1
f [n, m] = { f (n∆o , m∆o ), 0 ≤ n, m ≤ Mo − 1 }, (4.19a) = ∑ ∑ f [n, m] δ (x − n∆o) δ (y − m∆o ), (4.21)
n=0 m=0
g[n′ , m′ ] = { g(n′ ∆u , m′ ∆u ), 0 ≤ n′ , m′ ≤ Mu − 1 }. (4.19b)
where we used the definition for f [n, m] given by Eq. (4.19a).
Note that whereas in the earlier section it proved convenient to Using entry #9 in Table 3-1, the 2-D CSFT Fo (µ , ν ) of
distinguish the indices of the upsampled image from those of fo (x, y) can be written as
the original image—so we used [n, m] for the original image and
[n′ , m′ ] for the upsampled image—the distinction is no longer Fo (µ , ν ) = F { fo (x, y)}
needed in the present section, so we will now use indices [n, m] Mo −1 Mo −1
for both images. = ∑ ∑ f [n, m] F {δ (x − n∆o) δ (y − m∆o )}
From Eq. (3.75), the 2-D DFT of f [n, m] of order (Mo × Mo ) n=0 m=0
and the 2-D DFT of g[n, m] of order (Mu × Mu ) are given by Mo −1 Mo −1
= ∑ ∑ f [n, m] e− j2π µ n∆o e− j2πν m∆o . (4.22)
Mo −1 Mo −1 n=0 m=0
− j(2π /Mo )(nk1 +mk2 )
F[k1 , k2 ] = ∑ ∑ f [n, m] e ,
n=0 m=0 The spectrum Fo (µ , ν ) of sampled image fo (x, y) is doubly
0 ≤ k1 , k2 ≤ Mo − 1, (4.20a) periodic in µ and ν with period 1/∆o , as expected.
Mu −1 Mu −1 Extending the relations expressed by Eqs. (2.47) and (2.54)
G[k1 , k2 ] = ∑ ∑ f [n, m] e− j(2π /Mu )(nk1 +mk2 ) , from 1-D to 2-D, the spectrum Fo (µ , ν ) of the sampled image is
n=0 m=0 related to the spectrum F(µ , ν ) of continuous image f (x, y) by
0 ≤ k1 , k2 ≤ Mu − 1. (4.20b) ∞ ∞
1
The summations are identical in form, except that the summa-
Fo ( µ , ν ) =
∆2o k ∑ ∑ F(µ − k1 /∆o , ν − k2 /∆o ). (4.23)
1 =−∞ k2 =−∞
4-3 UPSAMPLING AND INTERPOLATION 135
The spectrum Fo (µ , ν ) of the sampled image consists of copies k1 , k2 ≥ 0 and redefining µ as µ = −k1 /(Mo ∆o ) leads to
of the spectrum F(µ , ν ) of f (x, y) repeated every 1/∆o in both
µ and ν , and also scaled by 1/∆2o. −k1 k2 Mo − 1
Fo , 0 ≤ k1 , k2 ≤
In (µ , ν ) space, µ and ν can be both positive or negative. The Mo ∆o Mo ∆o 2
relation between the spectrum Fo (µ , ν ) of the sampled image Mo −1 Mo −1
fo (x, y) and the spectrum F(µ , ν ) of the original continuous = ∑ ∑ f [n, m] e− j2π n(−k1)/Mo e− j2π mk2 /Mo
image f (x, y) assumes different forms for the four quadrants of n=0 m=0
(µ , ν ) space. Mo −1 Mo −1
For ease of presentation, let Mo be odd. If Mo is even, then = ∑ ∑ f [n, m] e− j2π n(Mo −k1 )/Mo e− j2π mk2 /Mo
simply replace (Mo − 1)/2 with Mo /2 (see Section 4-4.2). n=0 m=0
Next, we sample µ and ν by setting them to Mo − 1
= F[Mo − k1 , k2 ], 0 ≤ k1 , k2 ≤ , (4.26)
2
k1 k2
µ= and ν = , (4.24)
Mo ∆o Mo ∆o 3. Quadrant 3: µ ≤ 0 and ν ≤ 0
Mo − 1
0 ≤ |k1 |, |k2 | ≤ . Redefining µ as µ = −k1 /(Mo ∆o ) and ν as ν = −k2 /(Mo ∆o )
2
leads to
−k1 −k2 Mo − 1
Fo , , 0 ≤ k1 , k2 ≤
Mo ∆o Mo ∆o 2
Mo −1 Mo −1
= ∑ ∑ f [n, m] e− j2π n(−k1)/Mo e− j2π m(−k2 )/Mo
1. Quadrant 1: µ ≥ 0 and ν ≥ 0 n=0 m=0
Mo −1 Mo −1
= ∑ ∑ f [n, m] e− j2π n(Mo −k1 )/Mo e− j2π m(Mo −k2 )/Mo
At these values of µ and ν , Fo (µ , ν ) becomes n=0 m=0
Mo − 1
k1 k2 = F[Mo − k1, Mo − k2 ], 0 ≤ k1 , k2 ≤ . (4.27)
Fo , 2
Mo ∆o Mo ∆o
Mo −1 Mo −1
4. Quadrant 4: µ ≥ 0 and ν ≤ 0
= ∑ ∑ f [n, m] e− j2π nk1 /Mo e− j2π mk2 /Mo
n=0 m=0 Upon defining µ as in Eq. (4.24) and redefining ν as
Mo − 1 ν = −k2 /(Mo ∆o ),
= F[k1 , k2 ], 0 ≤ k1 , k2 ≤ . (4.25)
2
k1 −k2 Mo − 1
Fo , , 0 ≤ k1 , k2 ≤
Mo ∆o Mo ∆o 2
Mo −1 Mo −1
= ∑ ∑ f [n, m] e− j2π nk1 /Mo e− j2π m(−k2 )/Mo
n=0 m=0
Mo −1 Mo −1
2. Quadrant 2: µ ≤ 0 and ν ≥ 0 = ∑ ∑ f [n, m] e− j2π nk1 /Mo e− j2π m(Mo −k2 )/Mo
n=0 m=0
In quadrants 2–4, we make use of the relation Mo − 1
= F[k1 , Mo − k2], 0 ≤ k1 , k2 ≤ , (4.28)
2
− j2π n(−k1 )/Mo − j2π nMo /Mo − j2π n(−k1 )/Mo
e =e e
The result given by Eqs. (4.25)–(4.28) states that the 2-D
= e− j2π n(Mo −k1 )/Mo , CSFT Fo (µ , ν ) of the sampled image fo (x, y)—when sampled
at the discrete spatial frequency values defined by Eq. (4.24)—
where we used e− j2π nMo /Mo = 1. A similar relation applies to k2 . is the 2-D DFT of the sampled image f [n, m]. Also, the spec-
In quadrant 2, µ is negative and ν is positive, so keeping trum Fo (µ , ν ) of the sampled image consists of copies of the
136 CHAPTER 4 IMAGE INTERPOLATION
spectrum F(µ , ν ) repeated every 1/∆o in both µ and ν , and also upon sampling Fu (µ , ν ) at the rates defined by Eq. (4.24), we
scaled by 1/∆2o. Hence, generalizing Eq. (2.54) from 1-D to 2-D obtain
gives
k1 k2
Fu ,
k1 k2 1 k1 k2 Mo ∆o Mo ∆o
Fo , = 2F , , (4.29)
Mo ∆o Mo ∆o ∆o Mo ∆o Mo ∆o Mu −1 Mu −1
= ∑ ∑ g[n, m] e− j2π nk1 /Mu e− j2π mk2 /Mu
where F(µ , ν ) is the 2-D CSFT of the continuous-space image n=0 m=0
f (x, y) and 0 ≤ |k1 |, |k2 | ≤ (Mo − 1)/2. Mo − 1
= G[k1 , k2 ], 0 ≤ k1 , k2 ≤ , (4.32)
2
1. Quadrant 1: µ ≥ 0 and ν ≥ 0 Mo − 1
= G[Mu − k1, Mu − k2 ], 0 ≤ k1 , k2 ≤ . (4.34)
2
In view of the relationship (from Eq. (4.14))
∆u ∆u 1
= = ,
Mo ∆o Mu ∆u Mu
4-4 IMPLEMENTATION OF UPSAMPLING USING 2-D DFT IN MATLAB 137
4. Quadrant 4: µ ≥ 0 and ν ≤ 0 sampled signal is zero, since sampling the original signal f (x, y)
at above its Nyquist rate separates the copies of its spectrum,
leaving bands of zero between copies. Thus
k1 −k2 Mo − 1
Fu , , 0 ≤ k1 , k2 ≤
Mo ∆o Mo ∆o 2 G[k1 , k2 ] = 0, Mo ≤ k1 , k2 ≤ Mu − 1. (4.39)
Mu −1 Mu −1
= ∑ ∑ g[n, m] e− j2π nk1 /Mu e− j2π m(−k2 )/Mu
n=0 m=0
Mu −1 Mu −1
= ∑ ∑ g[n, m] e− j2π nk1 /Mu e− j2π m(Mu −k2 )/Mu
n=0 m=0 4-4 Implementation of Upsampling
Mo − 1
= G[k1 , Mu − k2 ], 0 ≤ k1 , k2 ≤ , (4.35) Using 2-D DFT in MATLAB
2
The result given by Eqs. (4.32)–(4.35) states that the 2-D In MATLAB, both the image f [n, m] and its 2-D DFT are stored
CSFT Fu (µ , ν ) of the upsampled image fu (x, y)—when sampled and displayed using the format shown in Fig. 3-25(d), wherein
at the discrete spatial frequency values defined by Eq. (4.24)—is the origin is at the upper left-hand corner of the image, and the
the 2-D DFT of the sampled image g[n, m]. Also, the spectrum indices of the corner pixel are (1, 1).
Fu (µ , ν ) of the sampled image consists of copies of the spec-
trum F(µ , ν ) of f (x, y) repeated every 1/∆u in both µ and ν ,
and also scaled by 1/∆2u. Thus,
k1 k2 1 k1 k2 Image and 2-D DFT Notation
Fu , = 2F , , (4.36)
Mo ∆o Mo ∆o ∆u Mo ∆o Mo ∆o
To avoid confusion between the common-image format
From Eq. (4.14), Mo ∆o = Mu ∆u . Combining Eq. (4.29) and (CIF) and the MATLAB format, we provide the following
Eq. (4.36) shows that list of symbols and definitions:
k1 k2 M2 k1 k2 CIF MATLAB
Fu , = u2 Fo , . (4.37)
Mo ∆o Mo ∆o Mo Mo ∆o Mo ∆o Original image f [n, m] X(m′ , n′ )
Hence, the (Mo × Mo ) 2-D DFT F[k1 , k2 ] of f [n, m] and the Upsampled image g[n, m] Y(m′ , n′ )
(Mu × Mu ) 2-D DFT G[k1 , k2 ] of g[n, m] are related by 2-D DFT of f [n, m] F[k1 , k2 ] FX(k2′ , k1′ )
2-D DFT of g[n, m] G[k1 , k2 ] FY(k2′ , k1′ )
Mu2
G[k1 , k2 ] = F[k1 , k2 ],
Mo2
Mu2
G[Mu − k1 , k2 ] = F[Mo − k1 , k2 ],
Mo2 As noted earlier in Section 3-9, when an image f [n, m] is stored
M2 in MATLAB as array X(m′ , n′ ), the two sets of indices are related
u
G[k1 , Mu − k1 ] = F[k1 , Mo − k2 ], by
Mo2
Mu2 m′ = m + 1, (4.40a)
G[Mu − k1 , Mu − k2 ] = F[Mo − k1 , Mo − k2 ], ′
Mo2 n = n + 1. (4.40b)
Mo − 1
0 ≤ k1 , k2 ≤ . (4.38) The indices get interchanged in orientation (n represents row
2
number, whereas n′ represents column number) and are shifted
This leaves G[k1 , k2 ] for Mo ≤ k1 , k2 ≤ Mu − 1 to be determined. by 1. For example, f [0, 0] = X(1, 1), and f [0, 1] = X(2, 1). While
But Eq. (4.38) shows that these values of G[k1 , k2 ] are samples the indices of the two formats are different, the contents of
of Fu (µ , ν ) at values of µ , ν for which this spectrum of the array X(m′ , n′ ) is identical with that of f [n, m]. That is, for an
138 CHAPTER 4 IMAGE INTERPOLATION
(Mo × Mo ) image zeros in the “middle” of the array FX. The result is
Mu2
X(m′ , n′ ) = f [n, m] = FY = ×
Mo2
f [0, 0] f [1, 0] ··· f [Mo − 1, 0] (Mu −Mo ) columns
h i z }| { h i
f [0, 1] f [1, 1] ··· f [Mo − 1, 1] Mo −1 Mo +1
. F[0,0]... F 2 ,0 0 ... 0 ... 0 F 2 ,0 ... F[Mo −1,0]
.
.. .. .. ..
.. .. .. .. ..
. . .
.h . . . .
h i i h i h i
f [0, Mo − 1] f [1, Mo − 1] · · · f [Mo − 1, Mo − 1] Mo −1
F 0, Mo2−1 ... F Mo2−1 , Mo2−1 F Mo +1 Mo −1
2 , 2 ... F Mo −1, 2
(4.41)
0. 0.
.. ..
.. .
The MATLAB command FX = fft(X, Mo , Mo ) computes the {
0. . 0 .. 0. }
2-D DFT F[k1 , k2 ] and stores it in array FX(k2′ , k1′ ):
.
.
.
.
h i 0 h i h i 0 h i
FX(k2′ , k1′ ) = F[k1 , k2 ] =
F 0, Mo +1 ... F Mo −1 , Mo +1 Mo +1 Mo +1
Mo +1
2 2 2 F 2 , 2 ... F Mo −1, 2
F[0, 0] F[1, 0] ··· F[Mo − 1, 0] .. .. .. .. ..
.h . . . .
F[0, 1] F[1, 1] ··· F[Mo − 1, 1] i h i
.. .. .. .. . F [0,Mo −1]... F Mo2−1 ,Mo −1 0| 0 0} F Mo −1, 2Mo +1
... F [Mo −1,Mo −1]
{z
. . . . (4.43)
F[0, Mo − 1] F[1, Mo − 1] · · · F[Mo − 1, Mo − 1]
(4.42)
◮ Note that the (Mu − Mo ) columns of zeros start after entry
The goal of upsampling using the spatial-frequency domain F[(Mo − 1)/2, 0], and similarly the (Mu − Mo ) rows of zeros
is to compute g[n, m] from f [n, m] by computing G[k1 , k2 ] from start after F[0, (Mo − 1)/2]. ◭
F[k1 , k2 ] and then applying the inverse DFT to obtain g[n, m].
The details of the procedure are somewhat different depending
on whether the array size parameter M is an odd integer or an Once array FY has been established, the corresponding up-
even integer. Hence, we consider the two cases separately. sampled image g[n, m] is obtained by applying the MATLAB
command Y=real(ifft2(FY,N,N)), where N = Mu and
the “real” is needed to eliminate the imaginary part of Y, which
4-4.1 Mo = Odd Integer may exist because of round-off error in the ifft2.
As a simple example, consider the (3 × 3) array
The recipe for upsampling using the 2-D DFT is as follows:
F[0, 0] F[1, 0] F[2, 0]
1. Given: image { f [n, m], 0 ≤ n, m ≤ Mo −1 }, as represented FX = F[0, 1] F[1, 1] F[2, 1] . (4.44a)
by Eq. (4.41). F[0, 2] F[1, 2] F[2, 2]
In MATLAB, the array FY containing the 2-D DFT G[k1 , k2 ] Application of the inverse 2-D DFT to FY generates array Y in
is obtained from the array FX given by Eq. (4.42) by inserting MATLAB, which is equivalent in content to image g[n, m] in
(Mu − Mo ) rows of zeros and an equal number of columns of common-image format.
4-4 IMPLEMENTATION OF UPSAMPLING USING 2-D DFT IN MATLAB 139
4-4.2 Mo = Even Integer one column of zeros and multiplying by (52 /32 ) would generate
49 −6 − j3 0 3 −6 + j3
52 −5 − j4 2 + j9 0 −5 + j2 j
G[k1 , k2 ] = 2 0 0 0 0 0 .
3 1 −4 + j3 0 −1 −4 − j3
For a real-valued (Mo × Mo ) image f [n, m] with Mo = an odd
−5 + j4 −j 0 −5 − j2 2 − j9
integer, conjugate symmetry is automatically satisfied for both
(4.48)
F[k1 , k2 ], the 2-D DFT of the original image, as well as for This G[k1 , k2 ] array does not satisfy conjugate symmetry. Ap-
G[k1 , k2 ], the 2-D DFT of the upsampled image. However, if plying the inverse 2-D DFT to G[k1 , k2 ] generates the upsampled
Mo is an even integer, application of the recipe outlined in the image g[n, m]:
preceding subsection will violate conjugate symmetry, so we
need to modify it. Recall from Eq. (3.80) that for a real-valued g[n,m] = (52 /32 )×
image f [n, m], conjugate symmetry requires that 0.64 1.05 + j0.19 1.64 − j0.30 2.39 + j0.30 2.27 − j0.19
1.05 + j0.19 2.01 + j0.22 2.85 − j0.20 2.8 − j0.13 1.82 − j0.19
F∗ [k1 , k2 ] = F[Mo − k1 , Mo − k2 ], 1 ≤ k1 , k2 ≤ Mo − 1. 1.64 − j0.3 2.38 − j0.44 3.9 + j0.46 3.16 + j0.08 1.31 + j0.39 ,
2.39 + j0.30 2.37 − j0.066 2.88 + j0.36 2.04 − j0.9 0.84 + j0.12
(4.45) 2.27 − j0.19 1.89 − j0.25 1.33 + j0.26 0.83 + j0.08 1.18 + j0.22
Additionally, in view of the definition for F[k1 , k2 ] given by
Eq. (4.20a), the following two conditions should be satisfied: which is clearly incorrect; all of its elements should be real-
valued because the original image f [n, m] is real-valued. Obvi-
Mo −1 Mo −1 ously, the upsampling recipe needs to be modified.
F[0, 0] = ∑ ∑ f [n, m] = real-valued, (4.46a) A simple solution is to split row F[k1 , Mo /2] into 2 rows and
n=0 m=0 to split column F[Mo /2, k2 ] into 2 columns, which also means
Mo Mo Mo −1 Mo −1 that F[Mo /2, Mo /2] gets split into 4 entries. The recipe preserves
F , = ∑ ∑ (−1)n+m f [n, m] = real-valued. conjugate symmetry in G[k1 , k2 ].
2 2 n=0 m=0 When applied to F[k1 , k2 ], the recipe yields the (6 × 6) array
(4.46b)
G[k1 , k2 ] = (62 /42 )×
In the preceding subsection, we inserted the appropriate number
of rows and columns of zeros to obtain G[k1 , k2 ] from F[k1 , k2 ]. F[0, 0] F[1, 0] F[2, 0]/2 0 F[2, 0]/2 F[3, 0]
For Mo equal to an odd integer, the conditions represented F[0, 1] F[1, 1] F[2, 1]/2 0 F[2, 1]/2 F[3, 1]
by Eqs. (4.45) and (4.46) are satisfied for both F[k1 , k2 ] and F[0, 2]/2 F[1, 2]/2 F[2, 2]/4 0 F[2, 2]/4 F[3, 2]/2
0 0 0 0 0 0 =
G[k1 , k2 ], but they are not satisfied for G[k1 , k2 ] when Mo is
F[0, 2]/2 F[1, 2]/2 F[2, 2]/4 0 F[2, 2]/4 F[3, 2]/2
an even integer. To demonstrate why the simple zero-insertion
procedure is problematic, let us consider the (4 × 4) image F[0, 3] F[1, 3] F[2, 3]/2 0 F[2, 3]/2 F[3, 3]
49 −6 − j3 1.5 0 1.5 −6 + j3
1 2 3 4 −5 − j4 2 + j9 −2.5 + j1 0 −2.5 + j1 j
2 4 5 3
f [n, m] = .
(4.47a) 0.5 −2 + j1.5 −0.25 0 −0.25 −2 − j1.5
3 4 6 2 .
0 0 0 0 0 0
4 3 2 1 0.5 −2 + j1.5 −0.25 0 −0.25 −2 − j1.5
−5 − j4 −j − j0.5 0 −2.5 − j1 2 − j9
The (4 × 4) 2-D DFT F[k1 , k2 ] of f [n, m] is
(4.49)
49 −6 − j3 3 −6 + j3
−5 − j4 2 + j9 −5 + j2 j Application of the inverse 2-D DFT to G[k1 , k2 ] yields the
F[k1 , k2 ] = .
1 −4 + j3 −1 −4 − j3
−5 + j4 −j −5 − j2 2 − j9
(4.47b)
This F[k1 , k2 ] has conjugate symmetry.
Let us suppose that we wish to upsample the (4 × 4) image
f [n, m] to a (5 × 5) image g[n, m]. Inserting one row of zeros and
140 CHAPTER 4 IMAGE INTERPOLATION
4-5 Downsampling
upsampled image Downsampling is the inverse operation to upsampling. The
objective of downsampling is to reduce the array size f [n, m]
0.44 0.61 1.06 1.33 1.83 1.38 of an image from (Mo × Mo ) samples down to (Md × Md ), with
0.61 1.09 1.73 1.91 1.85 1.21 Md < Mo . The original image f [n, m] and the downsampled
62
1.06 1.55 2.31 2.58 1.66 0.90 image g[n, m] are defined as:
g[n, m] = 2
4 1.33 1.55 2.22 2.67 1.45 0.78
1.83 1.72 1.52 1.42 0.54 0.73
Original image f [n, m] = { f (n∆o , m∆o ), 0 ≤ n, m ≤ Mo − 1},
1.38 1.25 0.94 0.76 0.72 1.04 Downsampled image g[n, m] = { f (n∆d , m∆d ), 0 ≤ n, m ≤ Md − 1}.
1.0 1.37 2.39 3.0 4.12 3.11
1.37 2.46 3.90 4.30 4.16 2.72 Both images are sampled versions of some continuous-space
image f (x, y), with image f [n, m] sampled at a sampling interval
2.39 3.49 5.20 5.81 3.74 2.03
= , (4.50) ∆o that satisfies the Nyquist rate, and the downsampled image
3.0 3.49 5.00 6.01 3.26 1.76
4.12 3.87 3.42 3.20 1.22 1.64 g[n, m] is sampled at ∆d , with ∆d > ∆o , so it is unlikely that
3.11 2.81 2.12 1.71 1.62 2.34 g[n, m] satisfies the Nyquist rate.
The goal of downsampling is to compute g[n, m] from f [n, m].
which is entirely real-valued, as it should be. The original (4×4) That is, we compute a coarser-discretized Md × Md image g[n, m]
image given by Eq. (4.47a) and the upsampled (6 × 6) image from the finer-discretized Mo × Mo image f [n, m]. Applications
given by Eq. (4.50) are displayed in Fig. 4-4. The two images, of downsampling include computation of “thumbnail” versions
which bear a close resemblance, have the same physical size but of images, as demonstrated in Example 4-1, and shrinking
different-sized pixels. images to fit into a prescribed space, such as columns in a
textbook.
4-5.1 Aliasing
It might seem that downsampling by, say, two, meaning that
Exercise 4-2: Upsample the length-2 signal x[n] = {8, 4} to
Md = Mo /2 (assuming Mo is even), could be easily accomplished
a length-4 signal y[n]. by simply deleting every even-indexed (or odd-indexed) row
and column of f [n, m]. Deleting every other row and column of
4-6 ANTIALIAS LOWPASS FILTERING 141
f [n, m] is called decimation by two. Decimation by two would 4-6 Antialias Lowpass Filtering
give the result of sampling f (x, y) every 2∆o instead of every
∆o . But if sampling at S = 1/2∆o is below the Nyquist rate, the Clearly decimation alone is not sufficient to obtain a downsam-
decimated image g[n, m] is aliased. The effect of aliasing on the pled image that looks like a demagnified original image. To
spectrum of g[n, m] can be understood in 1-D from Fig. 2-6(b). avoid aliasing, it is necessary to lowpass filter the image before
The copies of the spectrum of f (x, y) produced by sampling decimation, eliminating the high spatial frequency components
overlap one another, so the high-frequency parts of the signal of the image, so that when the filtered image is decimated, the
become distorted. Example 4-1 gives an illustration of aliasing copies of the spectra do not overlap. In 1-D, in Fig. 2-6(b),
in 2-D. had the spectrum been previously lowpass filtered with cutoff
frequency S/2 Hz, the high-frequency parts of the spectrum
would no longer overlap after sampling. This is called antialias
Example 4-1: Aliasing filtering.
The same concept applies in discrete space (and time). The
The 200 × 200 clown image shown in Fig. 4-5(a) was decimated periodicity of spectra induced by sampling becomes the period-
to the 25 × 25 image shown in Fig. 4-5(b). The decimated image icity of the DSFT and DTFT with periods of 2π in (Ω1 , Ω2 )
is a poor replica of the original image and could not function as and Ω, respectively. Lowpass filtering can be accomplished by
a thumbnail image. setting certain high-spatial-frequency portions of the 2-D DFT
to zero. The purpose of this lowpass filtering is now to eliminate
the high-spatial-frequency parts of the discrete-space spectrum,
prior to decimation. This eliminates aliasing, as demonstrated in
Example 4-2.
The 200 × 200 clown image shown in Fig. 4-5(a) was first
lowpass-filtered to the image shown in Fig. 4-6(a), with the
spectrum shown in Fig. 4-6(b), then decimated to the 25 × 25
image shown in Fig. 4-6(c). The decimated image is now a good
replica of the original image and could function as a thumbnail
image. The decimated image pixel values in Fig. 4-6(c) are all
equal to certain pixel values in the lowpass-filtered image in
Fig. 4-6(a).
|FX(k1 , k2 )| =
173.00 23.81 20.66 15.00 20.66 23.81
6.93 25.00 7.55 14.00 7.94 21.93
(b) Spectrum of lowpass-filtered image
11.14 19.98 16.09 15.10 14.93 4.58
9.00 16.09 . (4.52)
19.98 1.00 19.98 16.09
11.14 4.58 14.93 15.10 16.09 19.98
6.93 21.93 7.94 14.00 7.55 25.00
Deleting the zero-valued rows and columns in the 2-D DFT Photoshop
R
and through the command imresize.
magnitudes in MATLAB depiction gives the magnitudes
173.00 23.82 23.82 4-7.1 B-Splines
|FG(k1 , k2 )| = 6.93 25.00 21.93 . (4.54)
6.93 21.93 25.00 Splines are piecewise-polynomial functions whose polynomial
coefficients change at half-integer or integer values of the
Multiplying by Md2 /Mo2 = 32 /62 = 1/4 and taking the inverse independent variable, called knots, so that the function and some
2-D DFT gives of its derivatives are continuous at each knot. In 1-D, a B-spline
βN (t) of order N is a piecewise polynomial of degree N, centered
5.83 1.58 6.00 at t = 0. The support of βN (t), which is the interval outside of
g[m, n] = 6.08 6.33 3.00 . (4.55) which βN (t) = 0, extends between −(N + 1)/2 and +(N + 1)/2.
6.25 3.5 4.67
This is the result we would have gotten by decimating the ◮ Hence, the duration of βN (t) is (N + 1). ◭
antialiased lowpass-filtered original image, but it was performed
entirely in the 2-D DFT domain.
Formally, the B-spline function βN (t) is defined as
◮ A MATLAB code for this example is available on the Z ∞
sin(π f ) N+1 j2π f t
book website. ◭ βN (t) = e d f, (4.56)
−∞ πf
which is equivalent to the inverse Fourier transform of
Concept Question 4-2: Why must we delete rows and
sincN+1 ( f ). Recognizing that (a) the inverse Fourier transform
columns of the 2-D DFT array to perform downsampling?
of sinc(t) is a rectangle function and (b) multiplication in
the frequency domain is equivalent to convolution in the time
domain, it follows that
4-7 B-Splines Interpolation
In the preceding sections we examined several different image βN (t) = rect(t) ∗ · · · ∗ rect(t), (4.57)
| {z }
interpolation methods, some of which perform the interpolation N+1 times
directly in the spatial domain, and others that perform the inter-
polation in the spatial frequency domain. Now, we introduce yet with (
another method, known as the B-splines interpolation method,
1 for |t| < 1/2,
with the distinguishing feature that it is the method most com- rect(t) = (4.58)
monly used for image interpolation. Unlike with downsampling, 0 for |t| > 1/2.
B-spline interpolation has no aliasing issues when the sampling Application of Eq. (4.57) for N = 0, 1, 2, and 3 leads to:
interval ∆ is too large. Moreover, unlike with upsampling,
(
B-spline interpolation need not result in blurred images. 1 for |t| < 1/2,
B-splines are a family of piecewise polynomial functions, β0 (t) = rect(t) = (4.59)
with each polynomial piece having a degree N, where N is 0 for |t| > 1/2,
(
a non-negative integer. As we will observe later on in this 1 − |t| for |t| < 1,
section, a B-spline of order zero is equivalent to the nearest- β1 (t) = β0 (t) ∗ β0(t) = (4.60)
neighbor interpolation method of Section 3-5.1, but it is simpler 0 for |t| > 1,
to implement than the sinc interpolation formula. Interpolation β2 (t) = β1 (t) ∗ β0(t)
with B-splines of order N = 1 generates linear interpolation,
2
which is used in computer graphics. Another popular member 3/4 − t for 0 ≤ |t| ≤ 1/2,
1
of the B-spline interpolation family is cubic interpolation, corre- = 2 (3/2 − |t|)2 for 1/2 ≤ |t| ≤ 3/2, (4.61)
0
sponding to N = 3. Cubic spline interpolation is used in Adobe
R
for |t| > 3/2,
144 CHAPTER 4 IMAGE INTERPOLATION
Note that in all cases, βN (t) is continuous over its full duration.
For N ≥ 1, the B-spline function βN (t) is continuous and
differentiable (N − 1) times at all times t. For β2 (t), the function
is continuous across its full interval (−3/2, 3/2), including at
t = 1/2. Similarly, β3 (t) is continuous over its interval (−2, 2), t
−0.5 0 0.5
including at t = 1.
(a) β0(t)
Plots of the B-splines of order N = 0, 1, 2, and 3 are displayed
in Fig. 4-7. From the central limit theorem in the field of
probability, we know that convolving a function with itself
repeatedly makes the function resemble a Gaussian. This is 1
evident in the present case as well.
From the standpoint of 1-D and 2-D interpolation of signals
and images, the significance of B-splines is in how we can use
them to express a signal or image. To guide us through the
process, let us assume we have 6 samples x(n∆), as shown in
Fig. 4-8, extending between t = 0 and t = 5∆. Our objective
t
is to interpolate between these 6 points so as to obtain a −1 0 1
continuous function x(t). An important constraint is to ensure (b) β1(t)
that x(t) = x(n∆) at the 6 discrete times n∆.
For a B-spline of a specified order N, the interpolation is
realized by expressing the desired interpolated signal x(t) as a 0.8
linear combination of time-shifted B-splines, all of order N:
∞ t
x(t) = ∑ c[m] βN
∆
−m . (4.63) 0.4
m=−∞
Here, βN ∆t − m is the B-spline function βN (t), with t scaled by t
−2 −1 0 1 2
the sampling interval ∆ and delayed by a scaled time integer m.
(c) β2(t)
◮ The support of βN ∆t − m is
0.7
0.6
N +1 t N+1
m− < < m+ . (4.64) 0.4
2 ∆ 2
0.2
That is, βN ∆t − m = 0 outside that interval. ◭
t
−3 −2 −1 0 1 2 3
(d) β3(t)
Associated with each value of m is a constant coefficient c[m]
whose value is related to the sampled values x(n∆) and the order
N of the B-spline. More specifically, the values of c[m] have to Figure 4-7 Plots of βN (t) for N = 0, 1, 2, and 3.
be chosen such that the aforementioned constraint requiring that
x(t) = x(n∆) at discrete times t = n∆ is satisfied. The process is
4-7 B-SPLINES INTERPOLATION 145
4 4
3 3 x(nΔ)
x(t)
2 2
1 1
0 t/Δ 0 t/Δ
0 1 2 3 4 5 0 1 2 3 4 5
Figure 4-8 Samples x(n∆) to be interpolated into x(t). Figure 4-9 B-spline interpolation for N = 0.
0 t/Δ
0 1 2 3 4 5
(a) x(nΔ)
0 t/Δ
−1 0 1 2 3 4 5 6
(b) β1(t/Δ − m)
0
0 1 2 3 4 5
(c) x(t)
Consequently, for any integer n, x(t) at time t between n∆ and Figure 4-11 B-splines β2 (t/∆ − m) overlap in time.
(n + 1)∆ is a weighted average given by
t t
x(t) = x(n∆) (n + 1) − + x((n + 1)∆) − n . (4.68a)
∆ ∆ samples { x(n∆) } can be derived by starting with Eq. (4.63),
The associated duration is ∞ t
n∆ ≤ t ≤ (n + 1)∆. (4.68b)
x(t) = ∑ c[m] βN
∆
−m , (4.69)
m=−∞
Application of the B-spline linear interpolation to the given and then setting t = n∆, which gives
samples x(n∆) leads to the continuous function x(t) shown in
∞
Fig. 4-10(c). The linear interpolation amounts to setting { x(t),
n∆ ≤ t ≤ (n + 1)∆ } to lie on the straight line connecting x(n∆)
x(n∆) = ∑ c[m] βN (n − m) = c[n] ∗ βN (n), (4.70)
m=−∞
to x((n + 1)∆).
where use was made of the discrete-time convolution relation
Exercise 4-6: Given the samples given by Eq. (2.71a).
For N = 2, Eq. (4.61) indicates that β2 (n) 6= 0 only for integers
{ x(0), x(∆), x(2∆), x(3∆) } = { 7, 4, 3, 2 }, n = { −1, 0, 1 }. Hence, the discrete-time convolution given by
Eq. (4.70) simplifies to
compute x(∆/3) by interpolation using: (a) nearest neigh-
bor; (b) linear. x(n∆) = c[n − 1] β2 (1) + c[n] β2 (0) + c[n + 1] β2 (−1)
1
Answer: (a) ∆/3 is closer to 0 than to ∆, so = 8 c[n − 1] + 43 c[n] + 81 c[n + 1] (N = 2). (4.71)
x(∆/3) = x(0) = 7. In the second step, the constant coefficients were computed
using Eq. (4.61) for β2 (t). The sum truncates because β2 (t) = 0
for |t| ≥ 3/2, so only three basis functions overlap at any specific
(b)
time t, as is evident in Fig. 4-11.
x(∆/3) = (2/3)x(0)+(1/3)x(∆) = (2/3)(7)+(1/3)(4) = 6. Similarly, for N = 3, β3 (t) = 0 for |t| ≥ 2, which also leads
to the sum of three terms:
As noted earlier, 15
1 3 1 10
β2 (n) = { β2 (−1), β2 (0), β2 (1) } = , , .
8 4 8
5
Inserting x(n) and β (n) in Eq. (4.70) establishes the convolution
problem 0 t
−5 −4 −3 −2 −1 0 1 2 3 4 5
1 3 1 (b) x(t) interpolated using quadratic splines
{ 3, 19, 11, 17, 26, 4 } = , , ∗ c[n].
8 4 8
Figure 4-12 (a) Original samples x(n) and (b) interpolated
Following the solution recipe outlined earlier—and demon-
function x(t).
strated in Section 2-9—the deconvolution solution is
◮ Note: The MATLAB code for solving Example 4-4 is Concept Question 4-4: Why use cubic interpolation,
available on the book website. ◭ when quadratic interpolation produces smooth curves?
4-8 2-D SPLINE INTERPOLATION 149
four neighbors:
Concept Question 4-5: Why do quadratic and cubic in-
terpolation require computation of coefficients, while linear { f (n∆, m∆), f ((n + 1)∆, m∆), f (n∆, (m + 1)∆),
interpolation does not?
f ((n + 1)∆, (m + 1)∆) }. (4.77)
(−1, 0, 1):
0 β3 (−1)
β3 (0) β3 (−1) β3 (0) β3 (1)
100
β3 (1)
200 1
1 1 4 1
6
300 41 4 1
= 6 6 6 6 = 4 16 4 . (4.81)
36 1 4 1
400 1
6
500
600
1 2
Exercise 4-8: The “image” is interpolated using
700 3 4
bilinear interpolation. What is the interpolated value at the .
800 center of the image?
900 Answer: 14 (1 + 2 + 3 + 4) = 2.5
1000
0 0
20 20
40 40
60 60
80 80
100 100
120 120
140 140
160 160
180 180
199 199
0 40 80 120 160 199 0 40 80 120 160 199
(a) Sinc interpolation (d) B-spline NN interpolation
0 0
20 20
40 40
60 60
80 80
100 100
120 120
140 140
160 160
180 180
199 199
0 40 80 120 160 199 0 40 80 120 160 199
(b) Lanczos interpolation with a = 2 (e) B-spline linear interpolation
0 0
20 20
40 40
60 60
80 80
100 100
120 120
140 140
160 160
180 180
199 199
0 40 80 120 160 199 0 40 80 120 160 199
(c) Lanczos interpolation with a = 3 (f ) B-spline cubic-spline interpolation
Figure 4-15 Comparison of three interpolation methods: (a) sinc interpolation; (b) and (c) Lanczos interpolation with a = 2 and a = 3,
respectively; and (d) to (f) B-spline with N = 0, N = 1, and N = 3, respectively.
152 CHAPTER 4 IMAGE INTERPOLATION
g(n∆, m∆) =
199
0 199 f (n∆ cos θ + m∆ sin θ , m∆ cos θ − n∆ sin θ ), (4.83)
(a) Original clown image
which clearly requires interpolation of f (x, y) at the required
0 points from its given samples f (n∆, m∆). In practice, nearest
neighbor (NN) interpolation is usually sufficient to realize the
necessary interpolation.
Figure 4-16(a) displays a zero-padded clown image, and
part (b) displays the image after rotation by 45◦ using NN
interpolation. The rotated image bears a very good resemblance
to the rotated original. The MATLAB code for this figure is on
the book website.
150
Example 4-7: Square-Root and Inverse Image
200 Warping
250
Another form of image warping is realized by applying a
300 square-root function of the form
p p
350 n |n| m |m|
Tn (n) = and Tm (m) = .
25 25
399
0 50 100 150 200 250 300 350 399 Repetition of the steps described in the previous example, but
(a) Zero-padded clown image using the square-root transformation instead, leads to the images
in Fig. 4-18(a). The MATLAB code for this figure is available
0 on the book website.
Another transformation is the inverse function given by
50
n|n| m|m|
Tn (n) = and Tm (m) = .
100 a a
The result of warping the clown image with a = 300 is shown in
150 Fig. 4-18(b). The MATLAB code for this figure is available on
the book website.
200
300
1 2
Exercise 4-9: The “image” is rotated counter-
3 4
350 ◦
clockwise 90 . What is the result?
399 2 4
0 50 100 150 200 250 300 350 399 Answer:
1 3
(b) Clown image rotated by 45˚
4 8
Figure 4-16 Clown image before and after rotation by 45◦ . Exercise 4-10: The “image” is magnified by a
12 16
factor of three. What is the result, using: (a) NN; (b) bilinear
interpolation?
154 CHAPTER 4 IMAGE INTERPOLATION
0 0
50 50
100 100
150 150
200 200
250 250
300 300
350 350
399 399
0 50 100 150 200 250 300 350 399 0 50 100 150 200 250 300 350 399
(a) Tn(n) = ne−|n|/ 300 and Tm(m) = me−|m|/ 300 (a) Tn(n) = n√|n|/ 25 and Tm(m) = m√|n|/ 25
0 0
50 50
100 100
150 150
200 200
250 250
300 300
350 350
399 399
0 50 100 150 200 250 300 350 399 0 50 100 150 200 250 300 350 399
(b) Tn(n) = ne−|n|/ 200 and Tm(m) = me−|m|/ 200 (b) Tn(n) = n|n|/ 300 and Tm(m) = m|m|/ 300
Figure 4-17 Nonlinear image warping with space constants of Figure 4-18 Clown image warped with (a) square-root transfor-
(a) 300 and (b) 200. mation and (b) inverse transformation.
4-10 EXAMPLES OF IMAGE INTERPOLATION APPLICATIONS 155
Answer:
(a)
4 4 4 8 8 8
4 4 4 8 8 8
4 4 4 8 8 8
12 12 12 16 16 16
12 12 12 16 16 16
12 12 12 16 16 16
(b)
1 2 1 2 4 2
2 4 2 4 8 4
1 2 1 2 4 2
3 6 3 4 8 4
6 12 6 8 16 8
3 6 3 4 8 4
156 CHAPTER 4 IMAGE INTERPOLATION
Summary
Concepts
• Interpolation is “connecting the dots” of 1-D samples, must be taken to preserve conjugate symmetry in the
and “filling in the gaps” of 2-D samples. 2-D DFT. Deleting rows and columns performs lowpass
• Interpolation can be used to rotate and to warp or filtering so that the downsampled image is not aliased.
“morph” images. • B-splines are piecewise polynomial functions that can be
• Upsampling an image can be performed by inserting used to interpolate samples in 1-D and 2-D.
rows and columns of zeros in the 2-D DFT of the image. • For N ≥ 2, computation of the coefficients {c[m]} from
Care must be taken to preserve conjugate symmetry in samples {x(m∆)} can be formulated as a deconvolution
the 2-D DFT. problem.
• Downsampling an image can be performed by deleting • 2-D interpolation using B-splines is a generalization of
rows and columns in the 2-D DFT of the image. Care 1-D interpolation using B-splines.
Mathematical Formulae
B-splines B-spline
2 3
βN (t) = rect(t) ∗ · · · ∗ rect(t) 2/3 − t + |t| /2 for |t| ≤ 1,
| {z } 3
N+1 times β3 (t) = (2 − |t|) /6 for 1 ≤ |t| ≤ 2,
B-spline 0 for |t| ≥ 2
(
1 for |t| < 1/2,
β0 (t) = rect(t) = B-spline 1-D interpolation
0 for |t| > 1/2 ∞ t
x(t) = ∑ c[m] βN −m
B-spline ( m=−∞ ∆
1 − |t| for |t| ≤ 1, Nearest-neighbor 1-D interpolation
β1 (t) =
0 for |t| ≥ 1 x(t) = x(n∆) for |t − n∆| < ∆/2
Linear 1-D interpolation
B-spline t t
2 x(t) = x(n∆) (n + 1) − + x((n + 1)∆) −n
3/4 − t for 0 ≤ |t| ≤ 1/2, ∆ ∆
β2 (t) = (3/2 − |t|)2/2 for 1/2 ≤ |t| ≤ 3/2, for n∆ ≤ t ≤ (n + 1)∆
0 for |t| ≥ 3/2
Important Terms Provide definitions or explain the meaning of the following terms:
B-spline downsampling interpolation Lanczos function nearest-neighbor thumbnail image upsampling
pling. Note that 64 is an even number. 4.14 Another way to derive the formula for linear in-
terpolation is as follows: The goal is to interpolate the
4.4 Write a MATLAB program that loads the 64 × 64 image in
four points { f (0, 0), f (1, 0), f (0, 1), f (1, 1)} using a formula
tinyletters.mat, deletes the last row and column to make
f (x, y) = f0 + f1 x + f2 y + f3 xy, where { f0 , f2 , f2 , f3 } are found
it 63 × 63, and magnifies it by four using upsampling. This is
from the given points. This extends to
easier than Problem 4.3 since 63 is an odd number.
{ f (n, m), f (n + 1, ), f (n, m + 1), f (n + 1, m + 1)}
Section 4-5: Downsampling
for any integers n, m.
4.5 Write a MATLAB program that loads the 200 × 200 image (a) Set up a linear system of equations with unknowns
in clown.mat, antialias lowpass filters it, and demagnifies { f0 , f1 , f2 , f3 } and knowns
it by four using downsampling. (This is how the image in
tinyclown.mat was created.) { f (0, 0), f (1, 0), f (0, 1), f (1, 1)}.
4.6 Repeat Problem 4.5, but skip the antialias lowpass filter.
(b) Solve the system to obtain a closed-form expression for
4.7 Write a MATLAB program that loads the 256 × 256 image f (x, y) as a function of { f (0, 0), f (0, 1), f (1, 0), f (1, 1)}.
in letters.mat, antialias lowpass filters it, and demagnifies
it by four using downsampling. (This is how the image in 4.15 The image
tinyletters.mat was created.) a b c
d e f
4.8 Repeat Problem 4.7, but skip the antialias lowpass filter. g h i
4.9 Show that if the sinc interpolation formula is used to
is rotated 90◦ clockwise. What is the result?
upsample an M × M image f [n, m] to an N × N image g[n, m] by
an integer factor L (so that N = ML), then g[nL, mL] = f [n, m], 4.16 The image
so that the values of f [n, m] are preserved after upsampling. a b c
d e f
Section 4-8: 2-D Spline Interpolation g h i
√
4.10 Write a MATLAB program that loads the 50 × 50 image is rotated 45◦ clockwise and magnified by 2 using linear
in tinyclown.mat and magnifies it by four using nearest- interpolation.