2D Transforms in Image Processing
A 2D transform changes an image from the spatial domain (pixels) to the
frequency domain, where each point represents a specific frequency
component.
Spatial domain: Image is represented in terms of pixel intensities.
Frequency domain: Image is represented in terms of sine and cosine waves
(or complex exponentials).
📘 2. 2D Discrete Fourier Transform (2D DFT)
🔹 Definition:
The 2D DFT converts an image into its frequency components using sine and
cosine waves (complex exponentials).
Key Properties:
Low-frequency components (e.g., smooth areas) are near the origin (u=0,
v=0).
High-frequency components (e.g., edges) are farther from the origin.
Magnitude gives amplitude; phase gives structure/detail.
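To make the low-frequency/high-frequency property concrete, here is a minimal sketch in pure Python. It assumes an unnormalised DFT and two made-up 4x4 test images (`flat` and `checker` are illustrative names, not from these notes). In practice a library FFT would be used instead of the direct sum.

```python
import cmath

def dft2(img):
    """Direct (unnormalised) 2D DFT of a list-of-lists image."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

def magnitudes(img):
    """|F(u,v)| for every (u, v), rounded to hide floating-point dust."""
    return [[round(abs(c), 6) for c in row] for row in dft2(img)]

# Hypothetical 4x4 test images.
flat = [[10] * 4 for _ in range(4)]                     # perfectly smooth
checker = [[10 if (x + y) % 2 == 0 else 0 for y in range(4)]
           for x in range(4)]                           # an edge at every pixel

flat_mag = magnitudes(flat)
checker_mag = magnitudes(checker)

for row in flat_mag:
    print(row)
for row in checker_mag:
    print(row)
```

The smooth image puts all of its energy at (u=0, v=0). The checkerboard keeps a mean term at the origin but adds an equally large term at (2, 2), which is the highest frequency a 4x4 unshifted DFT can represent.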
Applications:
Image filtering (remove noise, blurring)
Edge detection
Image watermarking
Image reconstruction
2D Discrete Cosine Transform (2D DCT)
🔹 Definition:
The 2D DCT is similar to the DFT but uses only cosine functions, resulting in
real-valued output. It is highly effective for image compression.
Key Properties:
Energy compaction: Most image information is concentrated in a few low-
frequency coefficients.
High compression efficiency: discarding the small high-frequency coefficients usually causes little visible loss.
DCT coefficients are real, unlike DFT which gives complex values.
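Energy compaction can be checked numerically. The sketch below assumes the orthonormal DCT-II (the same scaling as the 1D formula with α(u) later in these notes) applied along rows and columns, and a made-up smooth 8x8 block. The helper names are illustrative.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row u is alpha(u) * cos((2x+1) u pi / 2n)."""
    rows = []
    for u in range(n):
        alpha = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        rows.append([alpha * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2(block):
    """Separable 2D DCT of a square block: T * block * T^T."""
    t = dct_matrix(len(block))
    t_transposed = [list(col) for col in zip(*t)]
    return matmul(matmul(t, block), t_transposed)

# Hypothetical smooth 8x8 block: a gentle brightness ramp.
block = [[100 + 4 * x + 2 * y for y in range(8)] for x in range(8)]
coeffs = dct2(block)

total_energy = sum(c * c for row in coeffs for c in row)
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))
fraction_in_low = low_energy / total_energy
print(f"fraction of energy in the 2x2 low-frequency corner: {fraction_in_low:.5f}")
```

For a block this smooth, almost all of the energy sits in 4 of the 64 coefficients. Blocks with strong texture or edges compact less well.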
📷 Applications:
JPEG compression (DCT on 8x8 image blocks)
Image denoising
Feature extraction
Feature        DFT                          DCT
Output         Complex values               Real values
Basis          Sine and cosine              Cosine only
Symmetry       No (periodic extension)      Yes (even extension)
Compression    Less efficient               More efficient (used in JPEG)
Computation    Complex arithmetic           Real arithmetic, simpler
Conclusion:
Use DFT when you need both magnitude and phase information (e.g.,
filtering, analysis).
Use DCT when you want compact representation of image (e.g.,
compression).
1. Two-Dimensional Fourier Transform
🔹 Line: “The 2D Fourier transform of f(x, y) is represented as F(u, v) = …”
F(u,v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) \cdot e^{-j2\pi(ux + vy)} dx dy
This is the continuous 2D Fourier Transform.
f(x,y): the image in the spatial domain.
F(u,v): the transformed image in the frequency domain.
e^{-j2\pi(ux + vy)}: complex exponential basis function.
This tells us how much of a certain frequency component is present in
the image.
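Worked example (a standard result; the side lengths a and b are illustrative): let f(x,y) = 1 inside the rectangle |x| ≤ a/2, |y| ≤ b/2 and 0 elsewhere. The double integral separates into two 1D integrals:
F(u,v) = \int_{-a/2}^{a/2} e^{-j2\pi ux} dx \cdot \int_{-b/2}^{b/2} e^{-j2\pi vy} dy = ab \cdot \frac{\sin(\pi a u)}{\pi a u} \cdot \frac{\sin(\pi b v)}{\pi b v}
At the origin F(0,0) = ab, the area of the rectangle, and a wider rectangle gives a narrower peak around the origin.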
🔹 Line: “Inverse”
f(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u,v) \cdot e^{j2\pi(ux + vy)} du dv
This converts the frequency representation back to the original image f(x,y).
🔹 Line: “F(u,v) = R(u,v) + jI(u,v)”
F(u,v) is a complex number.
R(u,v): real part
I(u,v): imaginary part
🔹 Line: “Amplitude Spectrum”
|F(u,v)| = \sqrt{R^2(u,v) + I^2(u,v)}
It’s similar to finding the length of a vector from its real and imaginary parts.
🟩 2. Phase and Power Spectrum
🔹 Phase Spectrum:
\phi(u,v) = \tan^{-1} \left( \frac{I(u,v)}{R(u,v)} \right)
Important for preserving the structure or edges of an image.
🔹 Power Spectrum:
P(u,v) = |F(u,v)|^2 = R^2(u,v) + I^2(u,v)
Used in image analysis and filtering.
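The three spectra can be computed from a single coefficient. This is a small sketch with a made-up value F(u,v) = 3 + 4j. It uses `math.atan2` rather than a plain arctangent of I/R, because the ratio alone cannot tell opposite quadrants apart.

```python
import cmath
import math

F_uv = complex(3.0, 4.0)                      # hypothetical coefficient: R = 3, I = 4

amplitude = abs(F_uv)                         # sqrt(R^2 + I^2)
phase = math.atan2(F_uv.imag, F_uv.real)      # quadrant-aware tan^-1(I / R)
power = F_uv.real ** 2 + F_uv.imag ** 2       # |F|^2

# Amplitude and phase together recover the original complex value.
rebuilt = cmath.rect(amplitude, phase)

# Why atan2: -3 - 4j has the same ratio I/R as 3 + 4j but the opposite direction.
naive_q3 = math.atan((-4.0) / (-3.0))         # wrongly equals the phase of 3 + 4j
true_q3 = math.atan2(-4.0, -3.0)              # correct third-quadrant angle

print(amplitude, power, phase, rebuilt)
```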
🟩 3. 2D Transform – DFT
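🔹 Forward DFT:
F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y) \cdot e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)}
This is the Discrete Fourier Transform (DFT) for digital images (size M × N).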
Converts spatial image to frequency domain.
🔹 Inverse DFT:
f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v) \cdot e^{j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)}
Converts back from frequency domain to spatial domain.
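A round trip through the forward and inverse DFT can be sketched directly from the sums. One assumption to flag: the inverse formula above has no scale factor, so this sketch puts the full 1/(MN) factor on the forward transform. Other texts and libraries place the factor differently. The 2x3 test image is made up.

```python
import cmath

def dft2(f):
    """Forward 2D DFT, with 1/(MN) on the forward sum (assumed convention)."""
    M, N = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N)) / (M * N)
             for v in range(N)]
            for u in range(M)]

def idft2(F):
    """Inverse 2D DFT with no scale factor, matching the inverse formula above."""
    M, N = len(F), len(F[0])
    return [[sum(F[u][v] * cmath.exp(2j * cmath.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N))
             for y in range(N)]
            for x in range(M)]

image = [[1, 2, 3],
         [4, 5, 6]]                      # hypothetical image, M = 2 rows, N = 3 columns

F = dft2(image)
restored = [[round(c.real, 9) for c in row] for row in idft2(F)]
dc = F[0][0]                             # with this scaling, F(0,0) is the mean intensity

print(restored)
print(dc)
```

The direct sums cost O(M²N²) operations, which is fine for a sketch but is exactly what fast Fourier transform algorithms avoid.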
🔹 If the image is sampled in a square array (M = N):
F(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) e^{-j2\pi\left(\frac{ux + vy}{N}\right)}
Simplified version of the DFT for square images. With 1/N on the forward sum, the inverse carries a matching 1/N factor.
Discrete Cosine Transform (DCT)
🔹 Purpose:
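For mathematical convenience: the coefficients are real, so no complex arithmetic is needed
To extract the relevant information: most of the signal ends up in a few coefficients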
Used widely in image compression (e.g., JPEG)
🔹 1D DCT Formula:
C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos \left( \frac{(2x+1)u\pi}{2N} \right)
Where:
\alpha(u) = \begin{cases} \sqrt{\frac{1}{N}}, & u = 0 \\ \sqrt{\frac{2}{N}}, & u > 0 \end{cases}
🔹 Inverse DCT:
f(x) = \sum_{u=0}^{N-1} \alpha(u) C(u) \cos \left( \frac{(2x+1)u\pi}{2N} \right)
Reconstructs original signal from DCT coefficients.
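The two formulas translate almost line for line into code. This is a minimal sketch in pure Python with made-up test signals. The function names `alpha`, `dct` and `idct` are illustrative and not a library API.

```python
import math

def alpha(u, n):
    """Scale factor from the formula: sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct(f):
    """1D DCT: C(u) = alpha(u) * sum_x f(x) cos((2x+1) u pi / 2N)."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct(c):
    """Inverse 1D DCT: f(x) = sum_u alpha(u) C(u) cos((2x+1) u pi / 2N)."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

constant = [8.0, 8.0, 8.0, 8.0]          # hypothetical constant signal
ramp = [1.0, 2.0, 3.0, 4.0]              # hypothetical ramp signal

constant_coeffs = dct(constant)          # only C(0) should be non-zero
ramp_restored = idct(dct(ramp))          # round trip should return the ramp

print(constant_coeffs)
print(ramp_restored)
```

A constant signal lands entirely in C(0), which here equals sqrt(1/4) · 32 = 16. Every coefficient is a real number, as the properties below state.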
🔹 Properties of DCT:
✔️ Real transform: No complex numbers.
✔️ Excellent energy compaction: Most information is in the first few coefficients.
✔️ Fast computation: Efficient algorithms available.
✔️ Widely used: Especially in image compression (JPEG standard).
Why Use These Spectra in Image Processing?
When we apply the Fourier Transform (FT) to an image, we convert it from
the spatial domain (pixels) to the frequency domain. In the frequency
domain, the output is a complex number at each frequency point.
A complex number has:
A magnitude (length)
A phase (angle)
From this complex output, we derive:
Amplitude spectrum → shows how strong each frequency is.
Phase spectrum → shows where (position or structure) those frequencies
occur.
Power spectrum → shows the energy at each frequency.
🟩 1. Amplitude Spectrum
🔹 What is it?
|F(u,v)| = \sqrt{R^2(u,v) + I^2(u,v)}
R(u,v): real part
I(u,v): imaginary part
Represents the magnitude of each frequency component.
🔹 Why is it useful?
It tells how much of each frequency is present in the image.
Useful for understanding the contrast, edges, and repetitive patterns.
Important in filtering and enhancement.
Example: A high amplitude at high frequency → sharp edges in the
image.
🟩 2. Phase Spectrum
🔹 What is it?
\phi(u,v) = \tan^{-1} \left( \frac{I(u,v)}{R(u,v)} \right)
Measures the phase angle of each frequency component, that is, how far its sinusoid is shifted in space.
Indicates the position and alignment of features in the image.
🔹 Why is it useful?
It preserves the structure and geometry of the image.
Without the phase spectrum, the reconstructed image loses its identity.
Most important for reconstructing an image’s shape and layout.
Fact: If you keep only phase and discard amplitude, the image is still
recognizable. But if you keep only amplitude and discard phase, it
becomes unrecognizable!
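One way to see that position lives in the phase is the shift property of the DFT. The sketch below uses a made-up 4x4 image and an unnormalised direct DFT. It moves the object with a circular shift and compares the two transforms.

```python
import cmath

def dft2(img):
    """Direct (unnormalised) 2D DFT of a list-of-lists image."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

# Hypothetical image: a small bright object in the top-left corner.
img = [[9, 7, 0, 0],
       [5, 3, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]

# Same object moved 2 rows down and 1 column right (circular shift).
shifted = [[img[(x - 2) % 4][(y - 1) % 4] for y in range(4)] for x in range(4)]

F1, F2 = dft2(img), dft2(shifted)
indices = [(u, v) for u in range(4) for v in range(4)]

# Largest change in magnitude, and largest change in the complex coefficient.
max_magnitude_diff = max(abs(abs(F1[u][v]) - abs(F2[u][v])) for u, v in indices)
max_coefficient_diff = max(abs(F1[u][v] - F2[u][v]) for u, v in indices)

print(max_magnitude_diff, max_coefficient_diff)
```

The amplitude spectrum is unchanged by the move, while the coefficients themselves change. Everything about where the object sits is therefore carried by the phase.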
🟩 3. Power Spectrum
🔹 What is it?
P(u,v) = |F(u,v)|^2 = R^2(u,v) + I^2(u,v)
It’s the square of the amplitude spectrum.
Represents the energy at each frequency component.
🔹 Why is it useful?
Shows how much energy (information) is carried by different frequencies.
Helps identify dominant frequency components.
Useful for noise analysis, texture analysis, and image classification.
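As a sketch of finding a dominant frequency, the example below computes the power spectrum of a made-up striped 4x4 image with an unnormalised direct DFT. It also checks that the energy in the spectrum matches the energy in the pixels (Parseval's relation, which for this unnormalised DFT needs a 1/(MN) factor).

```python
import cmath

def dft2(img):
    """Direct (unnormalised) 2D DFT of a list-of-lists image."""
    M, N = len(img), len(img[0])
    return [[sum(img[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]

# Hypothetical image: stripes that alternate from one column to the next.
img = [[10 if y % 2 == 0 else 2 for y in range(4)] for x in range(4)]
M, N = len(img), len(img[0])

power = [[abs(c) ** 2 for c in row] for row in dft2(img)]

# Dominant component, ignoring the mean term at (0, 0).
ac_power = {(u, v): power[u][v]
            for u in range(M) for v in range(N) if (u, v) != (0, 0)}
dominant = max(ac_power, key=ac_power.get)

# Parseval: pixel energy equals spectral energy divided by M*N.
spatial_energy = sum(p * p for row in img for p in row)
spectral_energy = sum(sum(row) for row in power) / (M * N)

print(dominant, spatial_energy, spectral_energy)
```

The stripes vary only along y with a period of two pixels, so the dominant component appears at u = 0 and at the highest v index this image size can represent.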
🟦 Summary Table
Spectrum     Formula                        Represents               Use
Amplitude    \sqrt{R^2(u,v) + I^2(u,v)}     Strength of frequency    Contrast, filtering
Phase        \tan^{-1}(I(u,v) / R(u,v))     Position/structure       Reconstruction, shape
Power        R^2(u,v) + I^2(u,v)            Energy content           Texture, noise analysis
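🔍 Real-World Use Case Example (JPEG Compression):
JPEG uses the DCT rather than the Fourier transform, but the idea is similar:
Keep the large-magnitude coefficients → they carry the important image information
Discard the low-power coefficients → this is where the compression comes from
Preserve the sign and position of the kept coefficients (DCT coefficients are real, so there is no phase angle) → retains image structure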
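The keep/discard idea can be sketched on a single row of eight pixels. This is a simplification and not the JPEG algorithm itself: JPEG works on 8x8 blocks and shrinks coefficients with a quantization table, whereas this sketch simply keeps the three largest coefficients. The pixel values and the choice of three are made up for illustration.

```python
import math

def alpha(u, n):
    """Scale factor: sqrt(1/N) for u = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct(f):
    """1D DCT of a list of samples."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct(c):
    """Inverse 1D DCT."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

row = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical 8-pixel row of intensities
coeffs = dct(row)

keep = 3                                  # illustrative number of coefficients to retain
largest = sorted(range(len(coeffs)), key=lambda u: abs(coeffs[u]), reverse=True)[:keep]
compressed = [c if u in largest else 0.0 for u, c in enumerate(coeffs)]

approx = idct(compressed)
error_energy = sum((a - b) ** 2 for a, b in zip(row, approx))
dropped_energy = sum(c * c for u, c in enumerate(coeffs) if u not in largest)

print([round(a, 1) for a in approx])
print(error_energy, dropped_energy)
```

Because this DCT is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients. Dropping only low-power coefficients therefore keeps the error small.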