Image Processing Notes

This document covers essential image processing techniques, including spatial and frequency domain filtering, detection, and segmentation. It provides theoretical foundations, mathematical equations, practical examples, and discussions on the advantages and disadvantages of various methods. The content is tailored for academic understanding and exam preparation, facilitating confidence in answering conceptual and comparative questions.

Image Processing: Spatial

and Frequency Domain


Filtering, Detection, and
Segmentation
This document provides a detailed explanation of key image processing techniques, including Spatial Domain Filtering,
Frequency Domain Filtering, Point, Line, and Edge Detection, Thresholding, and Region-Based Segmentation.
Each section includes theory, mathematical equations, practical examples with image matrices, advantages,
disadvantages, use cases, and justifications for choosing one method over another. The content is designed for
academic understanding and exam preparation, enabling you to answer conceptual and comparative questions
confidently.

2.1 Spatial Domain Filtering


The Mechanics of Spatial Filtering
Theory: Spatial domain filtering manipulates pixel values directly by applying a filter mask (kernel) over a neighborhood.
The mask slides across the image, computing a weighted sum (or other operation) of pixel values to produce the output
pixel. This localized approach is intuitive and computationally efficient for small masks.

Mathematical Representation: For an image ( f(x, y) ) and a filter mask ( w(s, t) ) of size ( m \times n ), the filtered
image ( g(x, y) ) is: [ g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) f(x+s, y+t) ] where ( a = \lfloor m/2 \rfloor ), ( b = \lfloor n/2 \rfloor ). The
mask is centered at ( (x, y) ), and border pixels may require padding (e.g., zero-padding or replication).

Mechanics:

1. Center the mask at pixel ((x, y)).


2. Multiply neighborhood pixel values by mask weights.
3. Sum the results to compute ( g(x, y) ).
4. Slide the mask to the next pixel and repeat.

Example: Consider a ( 4 \times 4 ) image and a ( 3 \times 3 ) averaging filter: [ f = \begin{bmatrix} 10 & 20 & 30 & 40 \\ 50 & 60 & 70 & 80 \\ 90 & 100 & 110 & 120 \\ 130 & 140 & 150 & 160 \end{bmatrix}, \quad w = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} ] Compute ( g(1, 1) ): [ g(1, 1) = \frac{1}{9} (10 + 20 + 30 + 50 + 60 + 70 + 90 + 100 + 110) = \frac{540}{9} = 60 ]
For border pixels, assume zero-padding or replicate edge values.
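The mechanics above can be sketched in a few lines of numpy (a minimal sketch, assuming replicate padding at the borders; for a symmetric mask this correlation is identical to convolution):

```python
import numpy as np

def spatial_filter(f, w, pad="edge"):
    """Slide mask w over image f, computing the weighted sum at each pixel."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    fp = np.pad(f.astype(float), ((a, a), (b, b)), mode=pad)  # replicate borders
    g = np.empty(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.sum(w * fp[x:x + 2 * a + 1, y:y + 2 * b + 1])
    return g

f = np.array([[10, 20, 30, 40],
              [50, 60, 70, 80],
              [90, 100, 110, 120],
              [130, 140, 150, 160]])
w = np.ones((3, 3)) / 9.0       # 3x3 averaging mask
g = spatial_filter(f, w)
print(g[1, 1])                  # 60.0, matching the worked example
```

Changing `pad` to `"constant"` gives zero-padding instead of replication; interior pixels are unaffected by the choice.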
Advantages:

Simple to implement and computationally efficient for small masks.


Intuitive for localized processing, ideal for real-time applications.
Flexible mask design for various effects (e.g., smoothing, sharpening).

Disadvantages:

Limited to local neighborhoods, missing global image properties.


Sensitive to mask size and shape, which can affect results.
Border handling (padding) may introduce artifacts.

Use Cases:

Noise reduction in digital photography.


Edge enhancement in medical imaging.
Preprocessing for feature extraction in computer vision.

Smoothing Spatial Filters


Smoothing filters reduce noise and intensity variations, producing a blurred effect. They are divided into linear (e.g.,
averaging filter) and non-linear (e.g., median filter).

Linear Filters: Averaging Filter

Theory: The averaging filter replaces each pixel with the weighted average of its neighborhood, smoothing out high-
frequency components like noise. Weights are typically equal but can vary (e.g., Gaussian weights).

Equation: For a ( 3 \times 3 ) averaging filter: [ g(x, y) = \frac{1}{9} \sum_{s=-1}^{1} \sum_{t=-1}^{1} f(x+s, y+t) ] Mask: [ w = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} ]

Example: Using the ( 4 \times 4 ) image above, apply the averaging filter at ( (1, 1) ): [ g(1, 1) = 60 \quad (\text{as computed earlier}) ] The
output image is smoother, with reduced intensity variations.

Advantages:

Simple and fast, ideal for real-time processing.


Effective for Gaussian noise reduction.
Reduces high-frequency components without complex computations.

Disadvantages:

Blurs edges, reducing image sharpness and detail.


Ineffective against impulse noise (e.g., salt-and-pepper).
Uniform weights may oversmooth textured regions.

Use Cases:
Preprocessing for edge detection to reduce noise.
Smoothing low-resolution images for aesthetic purposes.
Noise reduction in video frames.

Order-Statistic Filters: Median Filter

Theory: The median filter, a non-linear method, replaces each pixel with the median value of its neighborhood. It excels
at removing impulse noise (salt-and-pepper) while preserving edges.

Equation: For a ( 3 \times 3 ) neighborhood, sort the 9 pixel values and select the median (5th value in sorted order).

Example: Consider an image with salt-and-pepper noise: [ f = \begin{bmatrix} 10 & 20 & 255 & 40 \\ 50 & 60 & 0 & 80 \\ 90 & 100 & 110 & 120 \\ 130 & 140 & 150 & 160 \end{bmatrix} ] At ( (1, 1) ), the ( 3 \times 3 ) neighborhood is: [ [10, 20, 255, 50, 60, 0, 90, 100,
110] ] Sorted: ( [0, 10, 20, 50, 60, 90, 100, 110, 255] ), median = 60. Thus, ( g(1, 1) = 60 ).
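The sort-and-select procedure can be sketched directly (a minimal sketch, again assuming replicate padding at the borders):

```python
import numpy as np

def median_filter(f, size=3):
    """Replace each pixel with the median of its size x size neighborhood."""
    k = size // 2
    fp = np.pad(f, k, mode="edge")   # replicate edge values at the borders
    g = np.empty_like(f)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = np.median(fp[x:x + size, y:y + size])
    return g

f = np.array([[10, 20, 255, 40],
              [50, 60, 0, 80],
              [90, 100, 110, 120],
              [130, 140, 150, 160]])
g = median_filter(f)
print(g[1, 1], g[1, 2])   # 60 80: the impulse value 0 at (1, 2) is replaced
```

Note how the outliers 255 and 0 never survive into the output: the median discards extreme values instead of averaging them in.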

Advantages:

Robust against impulse noise, eliminating outliers.


Preserves edges better than averaging filters.
Effective for non-Gaussian noise distributions.

Disadvantages:

Computationally expensive due to sorting.


May alter fine textures if overused or with large windows.
Less effective for Gaussian noise compared to averaging.

Use Cases:

Noise removal in medical imaging (e.g., MRI, CT scans).


Preprocessing for object detection where edge preservation is critical.
Cleaning salt-and-pepper noise in satellite imagery.

Application of Median Filtering for Noise Removal: Median filtering is ideal for impulse noise because it discards
extreme values (e.g., 255, 0 in the example), unlike averaging filters, which blend outliers into the output, causing
blurring. For instance, in the above image, the median filter restores the pixel at ((1, 2)) to a value consistent with its
neighbors, preserving the image’s structure.

Justification (Median vs. Averaging):

Why Median over Averaging? Median filters remove impulse noise without blurring edges, making them suitable
for images with salt-and-pepper noise. Averaging filters are better for Gaussian noise but compromise edge
details.
Trade-off: Median filters are slower due to sorting but offer superior edge preservation.
Use Case Example: In medical imaging, median filters are preferred for MRI scans with impulse noise to
maintain anatomical boundaries, while averaging filters suit smoother noise in X-rays.
Sharpening Spatial Filters
Sharpening filters enhance edges and details by amplifying high-frequency components.

The Laplacian

Theory: The Laplacian, a second-order derivative operator, highlights regions of rapid intensity change (edges) by
computing the difference between a pixel and its neighbors.

Equation (Discrete Laplacian): [ \nabla^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y) ] Mask: [ w = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix} ] Sharpened image: [ g(x, y) = f(x, y) - c \cdot \nabla^2 f(x, y) \quad (c > 0) ]

Example: For the ( 4 \times 4 ) image at ( (1, 1) = 60 ): [ \nabla^2 f(1, 1) = (20 + 100 + 50 + 70) - 4 \cdot 60 = 240 - 240 = 0 ] The Laplacian is zero here because the image varies linearly (a ramp); with ( c = 1 ), ( g(1, 1) = 60 - 0 = 60 ), so smooth regions pass through unchanged. At true edges the Laplacian is nonzero, and the resulting negative or out-of-range values require
normalization for display.
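The zero response on a ramp can be verified numerically (a minimal sketch evaluating the mask at one interior pixel):

```python
import numpy as np

# Discrete Laplacian mask from the text
lap = np.array([[0, 1, 0],
                [1, -4, 1],
                [0, 1, 0]])

f = np.array([[10, 20, 30, 40],
              [50, 60, 70, 80],
              [90, 100, 110, 120],
              [130, 140, 150, 160]], dtype=float)

x, y = 1, 1
lap_val = np.sum(lap * f[x - 1:x + 2, y - 1:y + 2])  # Laplacian response
g = f[x, y] - 1.0 * lap_val                          # g = f - c * lap, c = 1
print(lap_val, g)   # 0.0 60.0 -- the ramp image has zero Laplacian
```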

Advantages:

Simple and isotropic (direction-independent).


Effective for enhancing edges and fine details.
Highlights all edge orientations equally.

Disadvantages:

Sensitive to noise, amplifying it alongside edges.


Produces negative values, requiring scaling or clipping.
Limited control over sharpening strength.

Use Cases:

Edge enhancement in medical imaging (e.g., X-ray bone structure).


Text sharpening in scanned documents.
Preprocessing for feature detection in computer vision.

Unsharp Masking and Highboost Filtering

Theory: Unsharp masking enhances edges by subtracting a blurred version of the image from the original, amplifying
high-frequency components. Highboost filtering extends this by increasing the amplification factor.

Equation: [ g(x, y) = f(x, y) + k \cdot [f(x, y) - f_{\text{blur}}(x, y)] ] where ( f_{\text{blur}} ) is the blurred image (e.g., via averaging
filter), and ( k \geq 0 ). If ( k > 1 ), it’s highboost filtering.

Example: Blur the ( 4 \times 4 ) image using the averaging filter: [ f_{\text{blur}}(1, 1) = 60 ] Compute the mask: [ \text{mask} = f -
f_{\text{blur}}, \quad \text{mask}(1, 1) = 60 - 60 = 0 ] If ( k = 1 ): [ g(1, 1) = 60 + 1 \cdot 0 = 60 ] For stronger edges, increase ( k ).
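The blur-subtract-add pipeline can be sketched as follows (a minimal sketch; the 3×3 box blur with edge padding stands in for whatever smoothing filter is used in practice):

```python
import numpy as np

def box_blur(f):
    """3x3 averaging blur with replicate (edge) padding."""
    fp = np.pad(f.astype(float), 1, mode="edge")
    g = np.empty(f.shape, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = fp[x:x + 3, y:y + 3].mean()
    return g

f = np.array([[10, 20, 30, 40],
              [50, 60, 70, 80],
              [90, 100, 110, 120],
              [130, 140, 150, 160]], dtype=float)

k = 1.0                       # k = 1: unsharp masking; k > 1: highboost
mask = f - box_blur(f)        # high-frequency residual
g = f + k * mask
print(mask[1, 1], g[1, 1])    # 0.0 60.0 at the interior ramp pixel
```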

Advantages:

Flexible control over sharpening via ( k ).


Preserves low-frequency content (background).
Effective for subtle edge enhancement.

Disadvantages:

Amplifies noise if ( k ) is too high.


Requires careful tuning of ( k ) and blur parameters.
Blurring step adds computational overhead.

Use Cases:

Enhancing details in digital photography.


Preprocessing for texture analysis in industrial inspection.
Improving visual clarity in video processing.

First-Order Derivatives: The Gradient (Sobel, Prewitt, Roberts)

Theory: First-order derivatives detect edges by computing the gradient, which measures the rate of intensity change.

Gradient: [ \nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} ] Edge
magnitude: [ |\nabla f| = \sqrt{G_x^2 + G_y^2} \approx |G_x| + |G_y| \quad (\text{approximation}) ] Direction: [ \theta = \tan^{-1}(G_y / G_x) ]

Sobel Operator: [ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} ]

Prewitt Operator: [ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}, \quad G_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} ]

Roberts Operator: [ G_x = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad G_y = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} ]

Example (Sobel): For the ( 4 \times 4 ) image at ((1, 1) = 60): [ G_x = (30 \cdot 1 + 70 \cdot 2 + 110 \cdot 1) - (10 \cdot 1
+ 50 \cdot 2 + 90 \cdot 1) = 280 - 200 = 80 ] [ G_y = (90 \cdot 1 + 100 \cdot 2 + 110 \cdot 1) - (10 \cdot 1 + 20 \cdot 2 + 30
\cdot 1) = 400 - 80 = 320 ] [ |\nabla f| \approx 80 + 320 = 400, \quad \theta = \tan^{-1}(320/80) \approx 76^\circ ]
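The Sobel example can be reproduced numerically (a minimal sketch evaluating both masks at one interior pixel):

```python
import numpy as np

Gx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # Sobel horizontal gradient
Gy = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # Sobel vertical gradient

f = np.array([[10, 20, 30, 40],
              [50, 60, 70, 80],
              [90, 100, 110, 120],
              [130, 140, 150, 160]], dtype=float)

x, y = 1, 1
nb = f[x - 1:x + 2, y - 1:y + 2]
gx, gy = np.sum(Gx * nb), np.sum(Gy * nb)
mag = abs(gx) + abs(gy)                  # |Gx| + |Gy| approximation
theta = np.degrees(np.arctan2(gy, gx))
print(gx, gy, mag, round(theta, 1))      # 80.0 320.0 400.0 76.0
```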

Advantages (Sobel vs. Prewitt vs. Roberts):

Sobel: Weighted center pixels reduce noise impact, ideal for noisy images.
Prewitt: Simpler and faster but less robust to noise.
Roberts: Compact ( 2 \times 2 ) mask, good for diagonal edges, but noise-sensitive.

Disadvantages:

All are noise-sensitive; Sobel is least affected.


Roberts struggles with non-diagonal edges due to small mask.
Prewitt’s uniform weights make it less effective in noisy conditions.

Use Cases:

Sobel: Edge detection in autonomous driving for lane detection.


Prewitt: Simple edge detection in low-noise environments.
Roberts: Edge detection in resource-constrained devices (e.g., embedded systems).

Justification (Sobel vs. Prewitt vs. Roberts):

Sobel over Prewitt: Sobel’s weighted center improves noise robustness, making it suitable for real-world images.
Sobel over Roberts: Sobel’s ( 3 \times 3 ) mask captures more context, enhancing edge detection accuracy.
Roberts Use Case: Preferred in low-compute environments due to its smaller mask, but less reliable for complex
edges.

2.2 Frequency Domain Filtering


Introduction to 2-D DFT and Its Application
Theory: The 2-D Discrete Fourier Transform (DFT) transforms an image from the spatial domain to the frequency
domain, representing it as a sum of sinusoids with varying frequencies. This enables filtering by modifying frequency
components.

Equation: For an ( M \times N ) image ( f(x, y) ): [ F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) e^{-j2\pi (ux/M + vy/N)} ] Inverse DFT: [ f(x,
y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v) e^{j2\pi (ux/M + vy/N)} ] where ( u, v ) are frequency variables, and ( F(u, v) ) is the
complex frequency spectrum.

Application:

Low frequencies (near ((0, 0))): Represent smooth regions (e.g., background).
High frequencies (far from center): Represent edges, noise, and details.
Filtering modifies ( F(u, v) ) to enhance or suppress specific frequencies.

Example: For a ( 2 \times 2 ) image: [ f = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} ] Compute ( F(0, 0) ): [ F(0, 0) = 1 \cdot e^{0} + 2 \cdot e^{0} +
3 \cdot e^{0} + 4 \cdot e^{0} = 1 + 2 + 3 + 4 = 10 ] (Full DFT requires computing all ( u, v ), typically via FFT for efficiency.)
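This calculation can be checked with numpy's FFT; the DC coefficient ( F(0, 0) ) is simply the sum of all pixels, and the inverse transform recovers the image:

```python
import numpy as np

f = np.array([[1, 2], [3, 4]], dtype=float)
F = np.fft.fft2(f)                       # 2-D DFT computed via the FFT
print(F[0, 0])                           # (10+0j): the DC term is the pixel sum

f_back = np.fft.ifft2(F).real            # inverse DFT recovers the image
print(np.allclose(f_back, f))            # True
```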

Wavelet Transform: Wavelet transforms decompose an image into multi-resolution subbands (low and high frequencies)
using localized basis functions (wavelets), capturing both spatial and frequency information.

Equation (1-D for simplicity): [ W(a, b) = \int f(t) \psi_{a,b}(t) \, dt, \quad \psi_{a,b}(t) = \frac{1}{\sqrt{|a|}} \psi\left(\frac{t - b}{a}\right) ]
where ( a ) is scale, ( b ) is translation, and ( \psi ) is the wavelet function.

Haar Transform: The simplest wavelet transform, using a step function: [ \psi(t) = \begin{cases} 1 & 0 \leq t < 0.5 \\ -1 & 0.5 \leq t
< 1 \\ 0 & \text{otherwise} \end{cases} ] For a 1-D signal ( [a, b] ):

Average: ((a + b)/\sqrt{2})


Difference: ((a - b)/\sqrt{2})

Example (Haar): For ( [1, 3] ): [ \text{average} = \frac{1+3}{\sqrt{2}} = 2\sqrt{2}, \quad \text{difference} = \frac{1-3}{\sqrt{2}} = -\sqrt{2} ] For a 2-
D image, apply the Haar transform row-wise, then column-wise.
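One level of the 1-D Haar transform is a few lines of numpy (a minimal sketch for even-length signals; a 2-D version would apply it along rows, then along columns):

```python
import numpy as np

def haar_1d(s):
    """One level of the 1-D Haar transform: pairwise averages, then differences."""
    s = np.asarray(s, dtype=float)
    avg = (s[0::2] + s[1::2]) / np.sqrt(2)    # low-pass (approximation) band
    diff = (s[0::2] - s[1::2]) / np.sqrt(2)   # high-pass (detail) band
    return np.concatenate([avg, diff])

coeffs = haar_1d([1, 3])
print(coeffs)   # [2*sqrt(2), -sqrt(2)], matching the worked example
```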
Advantages (Wavelet/Haar):

Captures spatial and frequency information simultaneously.


Haar is computationally simple and orthogonal.
Wavelets enable multi-resolution analysis, ideal for compression.

Disadvantages:

Wavelets are complex for large images or real-time processing.


Haar’s step function lacks smoothness, limiting its effectiveness for natural images.
Requires careful selection of wavelet basis for specific tasks.

Use Cases:

Image compression (e.g., JPEG2000 uses wavelets).


Denoising in signal processing.
Feature extraction in texture analysis.

Justification (Wavelet vs. DFT):

Wavelet over DFT: Wavelets provide localized frequency information, making them better for compression and
denoising. DFT is global and suited for filtering.
Haar Use Case: Preferred in low-compute environments due to simplicity, but less effective for smooth transitions
compared to other wavelets (e.g., Daubechies).

2.3 Frequency Domain Filtering


Fundamentals
Fourier Spectrum and Phase Angle
Theory: The Fourier transform ( F(u, v) ) is complex: [ F(u, v) = R(u, v) + jI(u, v) ]

Magnitude (Spectrum): ( |F(u, v)| = \sqrt{R^2 + I^2} ), represents frequency amplitudes.


Phase Angle: ( \phi(u, v) = \tan^{-1}(I/R) ), preserves structural information (e.g., edge locations).
The spectrum determines intensity, while phase governs spatial arrangement.

Steps for Filtering in the Frequency Domain


1. Compute the DFT of the image: ( F(u, v) ).
2. Design a filter transfer function ( H(u, v) ) (e.g., low-pass, high-pass).
3. Multiply: ( G(u, v) = H(u, v) \cdot F(u, v) ).
4. Compute the inverse DFT to obtain the filtered image ( g(x, y) ).
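The four steps above can be sketched with numpy's FFT (a minimal sketch; the centered distance grid and the Gaussian transfer function with ( \sigma = 2 ) are illustrative choices):

```python
import numpy as np

def gaussian_lowpass(f, sigma=1.0):
    """Frequency-domain filtering: DFT -> multiply by H(u, v) -> inverse DFT."""
    M, N = f.shape
    F = np.fft.fftshift(np.fft.fft2(f))        # step 1: DFT, centered spectrum
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2     # D^2(u, v) from the center
    H = np.exp(-D2 / (2 * sigma ** 2))         # step 2: Gaussian transfer function
    G = H * F                                  # step 3: multiply
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))  # step 4: inverse DFT

rng = np.random.default_rng(0)
f = rng.random((8, 8))
g = gaussian_lowpass(f, sigma=2.0)
print(g.shape)   # (8, 8): same size, smoothed content
```

Because ( H(0, 0) = 1 ) at the center, the mean intensity is preserved while higher frequencies are attenuated.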

Correspondence Between Spatial and Frequency Domains


Theory: Spatial convolution is equivalent to frequency domain multiplication: [ f(x, y) * w(x, y) \leftrightarrow F(u, v) \cdot
W(u, v) ]

Small spatial masks affect a wide range of frequencies.


Large spatial masks correspond to narrow frequency filters.

Example: A ( 3 \times 3 ) averaging filter in the spatial domain acts as a low-pass filter in the frequency domain,
attenuating high frequencies (edges, noise).

Frequency Domain Filters


Smoothing Filters (Low-Pass)

Ideal Low-Pass Filter: [ H(u, v) = \begin{cases} 1 & D(u, v) \leq D_0 \\ 0 & \text{otherwise} \end{cases} ] where ( D(u, v) = \sqrt{u^2 + v^2} ),
and ( D_0 ) is the cutoff frequency.

Disadvantages: Sharp cutoff causes ringing artifacts (Gibbs phenomenon).

Use Case: Basic noise removal, but rarely used due to artifacts.

Butterworth Low-Pass Filter: [ H(u, v) = \frac{1}{1 + [D(u, v)/D_0]^{2n}} ] where ( n ) is the order, controlling
transition sharpness.

Advantages: Smooth transition reduces ringing.

Use Case: General-purpose smoothing with controlled cutoff.

Gaussian Low-Pass Filter: [ H(u, v) = e^{-D^2(u, v)/(2\sigma^2)} ]

Advantages: Smooth attenuation, no ringing, natural results.

Use Case: Preferred for noise reduction in natural images.

Sharpening Filters (High-Pass)

Ideal High-Pass Filter: [ H(u, v) = \begin{cases} 0 & D(u, v) \leq D_0 \\ 1 & \text{otherwise} \end{cases} ]

Disadvantages: Ringing artifacts due to sharp cutoff.

Use Case: Edge enhancement, but rarely used.

Butterworth High-Pass Filter: [ H(u, v) = \frac{1}{1 + [D_0/D(u, v)]^{2n}} ]

Advantages: Controlled sharpening, reduced artifacts.

Use Case: Enhancing details with tunable parameters.

Gaussian High-Pass Filter: [ H(u, v) = 1 - e^{-D^2(u, v)/(2\sigma^2)} ]


Advantages: Smooth edge enhancement, no ringing.

Use Case: Enhancing fine details in medical or scientific imaging.

Example (Gaussian Low-Pass): For a ( 4 \times 4 ) image, compute the DFT, apply: [ H(u, v) = e^{-(u^2 + v^2)/(2\sigma^2)}, \quad \sigma = 1 ] Multiply with ( F(u, v) ), then compute the inverse DFT to get a smoothed image.

Advantages (Frequency Domain Filtering):

Effective for global effects (e.g., periodic noise removal).


Precise control over frequency components.
Ideal for analyzing image frequency content.

Disadvantages:

Computationally intensive (DFT/FFT).


Less intuitive than spatial filtering.
Requires padding to avoid wraparound artifacts.

Use Cases:

Removing periodic noise in scanned images.


Enhancing textures in remote sensing.
Analyzing frequency content in signal processing.

Justification (Gaussian vs. Butterworth vs. Ideal):

Gaussian over Ideal: Gaussian’s smooth transition avoids ringing, producing natural results.
Gaussian over Butterworth: Simpler to implement (no order parameter), widely used for noise reduction.
Butterworth Use Case: Preferred when precise control over transition steepness is needed (e.g., specialized
filtering).
Ideal Limitation: Rarely used due to artifacts, but useful for theoretical studies.

4.1 Point, Line, and Edge Detection


Detection of Isolated Points
Theory: Isolated points are pixels with intensities significantly different from their neighbors, often indicating noise or
small features. Detection uses masks to highlight high-contrast pixels.

Equation: For a ( 3 \times 3 ) mask: [ w = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} ] Response: [ R(x, y) =
\sum_{s=-1}^{1} \sum_{t=-1}^{1} w(s, t) f(x+s, y+t) ] A pixel is an isolated point if ( |R(x, y)| > T ).

Example: For a ( 4 \times 4 ) image: [ f = \begin{bmatrix} 10 & 10 & 10 & 10 \\ 10 & 100 & 10 & 10 \\ 10 & 10 & 10 & 10 \\ 10 & 10 &
10 & 10 \end{bmatrix} ] At ( (1, 1) = 100 ): [ R(1, 1) = (-1)(10 + 10 + 10 + 10 + 10 + 10 + 10 + 10) + 8 \cdot 100 = -80 + 800 = 720 ] If
( T = 500 ), ( |720| > 500 ), so ( (1, 1) ) is an isolated point.
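The point-detection test can be sketched directly (a minimal sketch evaluating the mask at one pixel):

```python
import numpy as np

# Point-detection mask: strong response where a pixel differs from its neighbors
w = np.array([[-1, -1, -1],
              [-1, 8, -1],
              [-1, -1, -1]])

f = np.array([[10, 10, 10, 10],
              [10, 100, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]], dtype=float)

x, y = 1, 1
R = np.sum(w * f[x - 1:x + 2, y - 1:y + 2])
T = 500
print(R, abs(R) > T)   # 720.0 True -- (1, 1) is flagged as an isolated point
```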
Advantages:

Simple and computationally efficient.


Effective for detecting anomalies (e.g., noise, defects).
Works well in low-noise environments.

Disadvantages:

Sensitive to noise, which may be mistaken for points.


Threshold selection is critical and image-dependent.
Limited to small, high-contrast features.

Use Cases:

Defect detection in manufacturing (e.g., circuit boards, textiles).


Noise identification in astronomical imaging.
Preprocessing for feature detection in computer vision.

Line Detection
Theory: Line detection identifies linear structures (e.g., roads, boundaries) using directional masks tuned to specific
orientations (horizontal, vertical, diagonal). The mask with the strongest response indicates the line’s presence and
orientation.

Equation: Common ( 3 \times 3 ) masks:

Horizontal: [ w_h = \begin{bmatrix} -1 & -1 & -1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix} ]
Vertical: [ w_v = \begin{bmatrix} -1 & 2 & -1 \\ -1 & 2 & -1 \\ -1 & 2 & -1 \end{bmatrix} ]
Diagonal (+45°): [ w_{+45} = \begin{bmatrix} -1 & -1 & 2 \\ -1 & 2 & -1 \\ 2 & -1 & -1 \end{bmatrix} ] Apply each mask and select the
orientation with maximum ( |R(x, y)| ).

Example: For an image containing a one-pixel-wide vertical line of bright pixels: [ f = \begin{bmatrix} 10 & 100 & 10 & 10 \\ 10 & 100 & 10 & 10 \\ 10 & 100 & 10 & 10 \\ 10 & 100 & 10 & 10
\end{bmatrix} ] Apply the vertical mask at ( (1, 1) = 100 ): [ R_v(1, 1) = (-1)(10 + 10 + 10) + 2(100 + 100 + 100) - (10 + 10 + 10) =
-30 + 600 - 30 = 540 ] The high response indicates a vertical line.
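Comparing mask responses can be sketched as follows (a minimal sketch on a hypothetical image containing a one-pixel-wide vertical line of 100s; the dominant response identifies the orientation):

```python
import numpy as np

masks = {
    "horizontal": np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]),
    "vertical":   np.array([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]),
}

# Hypothetical test image: a vertical line of 100s in column 1
f = np.array([[10, 100, 10, 10],
              [10, 100, 10, 10],
              [10, 100, 10, 10],
              [10, 100, 10, 10]], dtype=float)

x, y = 1, 1
nb = f[x - 1:x + 2, y - 1:y + 2]
responses = {name: np.sum(m * nb) for name, m in masks.items()}
print(responses)   # vertical response 540 dominates; horizontal is 0
```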

Advantages:

Effective for detecting oriented linear features.


Simple to implement with small masks.
Tunable for specific line orientations.

Disadvantages:

Limited to predefined orientations (e.g., 0°, 45°, 90°).


Sensitive to noise and line thickness variations.
May miss curved or complex lines.

Use Cases:
Road detection in satellite imagery.
Blood vessel detection in angiography.
Crack detection in infrastructure inspection.

Edge Models
Theory: Edges are boundaries between regions with different intensities. Common edge models include:

Step Edge: Abrupt intensity change (ideal but rare in real images).
Ramp Edge: Gradual change over several pixels (common due to blur).
Roof Edge: Intensity peak (e.g., thin lines or ridges). Edge detection uses derivatives to identify these transitions:
First-Order Derivatives (Gradient): Measure intensity change magnitude and direction.
Second-Order Derivatives (Laplacian): Detect zero-crossings at edges.

Canny’s Edge Detection Algorithm


Theory: Canny’s algorithm is a robust edge detection method that optimizes noise reduction, edge localization, and
single-edge response.

Steps:

1. Smoothing: Apply a Gaussian filter to reduce noise: [ G(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)} ]
Convolve: ( f_{\text{smooth}} = f * G ).
2. Gradient Computation: Use Sobel operators to compute gradient magnitude ( M(x, y) ) and direction ( \theta(x,
y) ): [ M(x, y) = \sqrt{G_x^2 + G_y^2}, \quad \theta(x, y) = \tan^{-1}(G_y / G_x) ]


3. Non-Maximum Suppression: Thin edges by suppressing non-maximum gradient values along the gradient
direction.
4. Double Thresholding: Apply thresholds ( T_{\text{low}}, T_{\text{high}} ) to classify edges as strong, weak, or non-edges.
5. Edge Tracking by Hysteresis: Connect weak edges to strong edges if contiguous.

Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 50 & 50 \\ 50 & 100 & 100 & 50 \\ 50 & 100 & 100 & 50 \\ 50 & 50 & 50 & 50
\end{bmatrix} ]

Step 1: Apply Gaussian blur (( \sigma = 1 )) to smooth the image.
Step 2: Compute Sobel gradients at ( (1, 1) = 100 ) (shown on the unsmoothed values for clarity): [ G_x = (50 \cdot 1
+ 100 \cdot 2 + 100 \cdot 1) - (50 \cdot 1 + 50 \cdot 2 + 50 \cdot 1) = 350 - 200 = 150 ] [ G_y = (50 \cdot 1 + 100
\cdot 2 + 100 \cdot 1) - (50 \cdot 1 + 50 \cdot 2 + 50 \cdot 1) = 350 - 200 = 150 ] [ M = \sqrt{150^2 + 150^2} \approx 212, \quad \theta = 45^\circ ]
Step 3: Suppress non-maxima along ( \theta ).
Step 4: Apply ( T_{\text{low}} = 100, T_{\text{high}} = 200 ); ( M \approx 212 ) is a strong edge.
Step 5: Trace connected edges to form continuous boundaries.
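The gradient and double-thresholding stages can be sketched in numpy (a minimal, partial sketch: smoothing, non-maximum suppression, and hysteresis are omitted here, so this is not a full Canny implementation):

```python
import numpy as np

def sobel_magnitude(f):
    """Sobel gradient magnitude with replicate padding at the borders."""
    Gx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    Gy = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
    fp = np.pad(f, 1, mode="edge")
    gx = np.zeros(f.shape)
    gy = np.zeros(f.shape)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            nb = fp[x:x + 3, y:y + 3]
            gx[x, y] = np.sum(Gx * nb)
            gy[x, y] = np.sum(Gy * nb)
    return np.hypot(gx, gy)

f = np.array([[50, 50, 50, 50],
              [50, 100, 100, 50],
              [50, 100, 100, 50],
              [50, 50, 50, 50]], dtype=float)

M = sobel_magnitude(f)
T_low, T_high = 100, 200
strong = M >= T_high                     # definite edges
weak = (M >= T_low) & (M < T_high)       # kept only if linked to strong edges
print(round(M[1, 1], 1))                 # 212.1: a strong edge at the corner
```

In a full pipeline, the `weak` pixels would be promoted to edges only when connected to `strong` pixels (hysteresis).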

Advantages:

Robust to noise due to Gaussian smoothing.


Precise edge localization via non-maximum suppression.
Connects fragmented edges through hysteresis, improving continuity.

Disadvantages:

Computationally intensive due to multiple steps.


Requires tuning of ( \sigma ), ( T_{\text{low}} ), and ( T_{\text{high}} ).
May miss low-contrast edges if thresholds are poorly set.

Use Cases:

Object boundary detection in robotics and autonomous vehicles.


Edge detection in medical imaging (e.g., MRI, ultrasound).
Contour extraction in computer vision applications.

Justification (Canny vs. Sobel):

Canny over Sobel: Canny’s noise reduction, edge thinning, and hysteresis make it more reliable for noisy or
complex images. Sobel is faster but produces thicker, noisier edges.
Sobel Use Case: Preferred for simple, low-noise images or real-time applications where speed is critical.

Edge Linking: Local Processing and Boundary Detection


Local Processing: Theory: Local edge linking connects edge points based on proximity and gradient similarity.

Method: For each edge pixel, check neighbors in a ( 3 \times 3 ) window. Link if gradient direction and magnitude
are similar.
Equation: Link pixels ( (x_1, y_1) ) and ( (x_2, y_2) ) if: [ |\theta(x_1, y_1) - \theta(x_2, y_2)| < T_\theta \quad \text{and}
\quad |M(x_1, y_1) - M(x_2, y_2)| < T_M ]

Boundary Detection Using Regional Processing (Polygonal Fitting): Theory: Fits polygons to edge points to form
closed boundaries, simplifying complex edge contours.

Method: Group edge points into segments, then fit straight lines or curves using least-squares.
Equation (Line Fitting): For points ((x_i, y_i)), minimize: [ E = \sum_i (y_i - (mx_i + c))^2 ] Solve for slope ( m )
and intercept ( c ).

Example: For edge points ((1, 1), (1, 2), (2, 1), (2, 2)), group into a square and fit lines to form a polygon. For a line
segment ((1, 1), (1, 2)), the fitted line is vertical (( x = 1 )).
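The least-squares line fit above can be sketched with numpy (a minimal sketch; the edge points near ( y = 2x + 1 ) are hypothetical illustration data):

```python
import numpy as np

# Hypothetical edge points lying near the line y = 2x + 1
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.1, 2.9, 5.2, 6.8])

# Minimize E = sum (y_i - (m*x_i + c))^2; polyfit solves the normal equations
m, c = np.polyfit(xs, ys, 1)
print(round(m, 2), round(c, 2))   # 1.94 1.09
```

For near-vertical segments (like ( x = 1 ) in the example), the roles of ( x ) and ( y ) should be swapped before fitting, since slope-intercept form cannot represent vertical lines.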

Advantages:

Local Processing: Simple, fast, and effective for small gaps.


Polygonal Fitting: Produces smooth, closed boundaries, ideal for shape analysis.
Combines local and global information for robust linking.

Disadvantages:

Local Processing: Fails with large gaps or noisy edges.


Polygonal Fitting: Oversimplifies complex shapes, losing fine details.
Both require parameter tuning (e.g., thresholds, segment size).

Use Cases:

Object contour extraction in computer vision.


Shape analysis in industrial inspection (e.g., part recognition).
Boundary detection in geographic information systems.

4.2 Thresholding
Foundation
Theory: Thresholding segments an image into regions (e.g., object vs. background) by comparing pixel intensities to a
threshold, producing a binary or labeled image.

Equation: For an image ( f(x, y) ): [ g(x, y) = \begin{cases} 1 & \text{if } f(x, y) \geq T \\ 0 & \text{otherwise} \end{cases} ] where ( T ) is the threshold.

Role of Illumination and Reflectance


Illumination: Affects overall brightness, causing uneven intensity distributions (e.g., shadows).
Reflectance: Determines how objects reflect light, influencing contrast between regions.
Challenge: Non-uniform illumination requires adaptive thresholding to handle varying lighting conditions.

Basic Global Thresholding


Theory: Global thresholding applies a single threshold to the entire image, assuming a bimodal intensity histogram (e.g.,
distinct object and background).

Algorithm:

1. Select an initial threshold ( T ) (e.g., mean intensity).


2. Segment into two groups: ( G_1 (f(x, y) \geq T) ) and ( G_2 (f(x, y) < T) ).
3. Compute means ( m_1 ) and ( m_2 ) of ( G_1 ) and ( G_2 ).
4. Update ( T = (m_1 + m_2) / 2 ).
5. Repeat until ( T ) converges.

Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 100
& 100 \end{bmatrix} ]

Initial ( T = (50 \cdot 8 + 100 \cdot 8) / 16 = 75 ).
( G_1 (\geq 75) ): all 100s, ( m_1 = 100 ).
( G_2 (< 75) ): all 50s, ( m_2 = 50 ).
New ( T = (100 + 50) / 2 = 75 ). Converged.
Output: [ g = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix} ]
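The iterative algorithm can be sketched as follows (a minimal sketch; it assumes both groups are non-empty at every iteration, which holds for images with intensities on both sides of the mean):

```python
import numpy as np

def global_threshold(f, eps=0.5):
    """Iterative global threshold selection (basic global thresholding)."""
    T = f.mean()                        # step 1: initial guess, mean intensity
    while True:
        m1 = f[f >= T].mean()           # mean of group G1
        m2 = f[f < T].mean()            # mean of group G2
        T_new = (m1 + m2) / 2           # step 4: updated threshold
        if abs(T_new - T) < eps:        # step 5: stop on convergence
            return T_new
        T = T_new

f = np.array([[50, 50, 100, 100]] * 4, dtype=float)
T = global_threshold(f)
g = (f >= T).astype(int)
print(T)        # 75.0, matching the worked example
```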
Advantages:

Simple and computationally efficient.


Effective for images with bimodal histograms and uniform illumination.
Suitable for real-time applications.

Disadvantages:

Fails with non-uniform illumination or complex histograms.


Sensitive to noise, which can distort the histogram.
Assumes distinct intensity classes, limiting applicability.

Use Cases:

Document binarization for text extraction (e.g., OCR).


Object segmentation in controlled lighting (e.g., industrial inspection).
Background separation in simple scenes.

Justification (Global vs. Adaptive Thresholding):

Global over Adaptive: Faster and sufficient for uniform illumination and bimodal histograms.
Adaptive Use Case: Necessary for images with varying lighting (e.g., outdoor scenes, medical images).
Trade-off: Global thresholding is simpler but less robust; adaptive thresholding handles complex lighting but is
slower.

4.3 Region-Based Segmentation


Region Growing
Theory: Region growing starts with seed pixels and expands regions by adding neighboring pixels that satisfy a similarity
criterion (e.g., intensity or texture).

Algorithm:

1. Select seed pixels (manually or automatically, e.g., intensity peaks).


2. For each seed, add neighbors if: [ |f(x, y) - f(x_s, y_s)| < T ] or if the region mean ( \mu_R ) satisfies: [ |f(x, y) -
\mu_R| < T ]
3. Update region mean and continue until no more pixels can be added.

Example: For the image: [ f = \begin{bmatrix} 50 & 52 & 100 & 102 \\ 51 & 50 & 101 & 100 \\ 50 & 51 & 100 & 101 \\ 100 & 100 &
102 & 101 \end{bmatrix} ]

Seed at ((0, 0) = 50), ( T = 5 ).


Neighbors ((0, 1) = 52), ((1, 0) = 51), ((1, 1) = 50) satisfy ( |f - 50| < 5 ).
Region: {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)}.
Stop at ((3, 0) = 100), as ( |100 - 50| > 5 ).
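The growth process can be sketched with a breadth-first search (a minimal sketch using 4-connectivity and the fixed-seed similarity criterion ( |f(x, y) - f(x_s, y_s)| < T ); the region-mean variant would update the reference value as pixels are added):

```python
import numpy as np
from collections import deque

def region_grow(f, seed, T):
    """Grow a region from seed, adding 4-neighbors within T of the seed value."""
    H, W = f.shape
    seed_val = f[seed]
    region = {seed}
    q = deque([seed])
    while q:
        x, y = q.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < H and 0 <= ny < W and (nx, ny) not in region
                    and abs(f[nx, ny] - seed_val) < T):
                region.add((nx, ny))
                q.append((nx, ny))
    return region

f = np.array([[50, 52, 100, 102],
              [51, 50, 101, 100],
              [50, 51, 100, 101],
              [100, 100, 102, 101]], dtype=float)

r = region_grow(f, (0, 0), T=5)
print(sorted(r))   # the six 50-ish pixels of the top-left block
```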
Advantages:

Produces connected, homogeneous regions.


Flexible similarity criteria (e.g., intensity, texture).
Effective for small, well-defined regions.

Disadvantages:

Sensitive to seed selection, which affects results.


Computationally intensive for large images or complex criteria.
May merge distinct regions if threshold is too loose.

Use Cases:

Tumor segmentation in medical imaging (e.g., MRI).


Object extraction in homogeneous regions (e.g., agriculture).
Region labeling in scene understanding.

Region Splitting and Merging


Theory: This hierarchical method divides an image into regions and merges similar ones, often using a quadtree
structure for efficiency.

Algorithm (Quadtree Approach):

1. Splitting:
Divide the image into four quadrants.
For each quadrant, check homogeneity (e.g., variance ( \sigma^2 < T )).
If not homogeneous, split further.
2. Merging:
Merge adjacent regions if: [ |\mu_{R_i} - \mu_{R_j}| < T ] where ( \mu_{R_i}, \mu_{R_j} ) are region means.

Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 150 & 150 \\ 50 & 50 & 150
& 150 \end{bmatrix} ]

Split: Each quadrant is homogeneous (( \sigma^2 \approx 0 )): the top-left and bottom-left are all 50s, the top-right all
100s, and the bottom-right all 150s, so no further splitting is needed.
Merge: The two 50s quadrants merge (equal means); the 100s and 150s quadrants stay separate since ( |100 - 150| > T ).
Output: Three regions: {50s}, {100s}, {150s}.
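The splitting phase can be sketched as a recursive quadtree (a minimal sketch using variance as the homogeneity test; the merging pass, which would join adjacent leaves with similar means, is left as a note):

```python
import numpy as np

def split(f, x0, y0, h, w, T, regions):
    """Recursively split a block until its variance falls below T (quadtree)."""
    block = f[x0:x0 + h, y0:y0 + w]
    if block.var() < T or h <= 1 or w <= 1:
        regions.append((x0, y0, h, w, block.mean()))   # homogeneous leaf
        return
    h2, w2 = h // 2, w // 2
    split(f, x0, y0, h2, w2, T, regions)               # four quadrants
    split(f, x0, y0 + w2, h2, w - w2, T, regions)
    split(f, x0 + h2, y0, h - h2, w2, T, regions)
    split(f, x0 + h2, y0 + w2, h - h2, w - w2, T, regions)

f = np.array([[50, 50, 100, 100],
              [50, 50, 100, 100],
              [50, 50, 150, 150],
              [50, 50, 150, 150]], dtype=float)

regions = []
split(f, 0, 0, 4, 4, T=10, regions=regions)
means = sorted(r[4] for r in regions)
print(means)   # [50.0, 50.0, 100.0, 150.0]
# A merging pass joining adjacent leaves with |mean_i - mean_j| < T would
# fuse the two 50-blocks, leaving the three regions of the worked example.
```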

Advantages:

Systematic and hierarchical, handling complex images.


Balances global and local information.
Robust to noise with appropriate homogeneity criteria.

Disadvantages:
Computationally expensive, especially for deep quadtrees.
May produce blocky regions due to rectangular splits.
Requires tuning of homogeneity and merging thresholds.

Use Cases:

Land cover classification in remote sensing.


Scene segmentation in autonomous driving.
Object grouping in complex images.

Justification (Region Growing vs. Splitting/Merging):

Region Growing over Splitting/Merging: Intuitive for small, homogeneous regions but sensitive to seed
selection and less systematic.
Splitting/Merging Use Case: Preferred for large, varied images due to its hierarchical approach, ensuring robust
segmentation.
Trade-off: Region growing is faster for targeted regions; splitting/merging is more comprehensive but slower.

Exam-Ready Tips and Justifications


1. Spatial vs. Frequency Domain Filtering:

Spatial: Faster for small masks, intuitive, and localized. Ideal for real-time applications (e.g., mobile
imaging).
Frequency: Better for global effects (e.g., periodic noise removal), but computationally expensive. Use for
specialized filtering tasks.
Example: Use spatial averaging for quick noise reduction in video; use frequency domain for removing
scanner artifacts.

2. Median vs. Averaging Filter:

Median: Removes impulse noise (salt-and-pepper) while preserving edges, ideal for medical imaging.
Averaging: Better for Gaussian noise but blurs edges, suited for smooth textures.
Example: Apply median filter to MRI scans with impulse noise; use averaging for X-rays with Gaussian
noise.

3. Sobel vs. Prewitt vs. Roberts:

Sobel over Prewitt: Sobel’s weighted center reduces noise, making it reliable for real-world images.
Sobel over Roberts: Sobel’s ( 3 \times 3 ) mask captures more context, improving edge detection.
Roberts Use Case: Use in low-compute devices for diagonal edges, but less robust overall.

4. Canny vs. Sobel:

Canny: Reduces noise, thins edges, and connects fragments, ideal for noisy or complex images.
Sobel: Faster for simple, low-noise images or real-time processing.
Example: Use Canny for object detection in robotics; use Sobel for quick edge detection in embedded
systems.

5. Gaussian vs. Butterworth vs. Ideal Filters:

Gaussian over Ideal: Avoids ringing, producing natural results for noise reduction.
Gaussian over Butterworth: Simpler (no order parameter), widely used for smoothing.
Butterworth Use Case: Use for precise control in specialized filtering (e.g., scientific imaging).

6. Global vs. Adaptive Thresholding:

Global: Simple and fast for uniform illumination and bimodal histograms.
Adaptive: Handles varying lighting, essential for outdoor or medical images.
Example: Use global for document binarization; use adaptive for unevenly lit natural scenes.

7. Region Growing vs. Splitting/Merging:

Region Growing: Best for small, homogeneous regions but requires seed selection.
Splitting/Merging: Systematic for large, varied images, ensuring robust segmentation.
Example: Use region growing for tumor segmentation; use splitting/merging for land cover analysis.

8. Polygonal Fitting for Boundary Detection:

Simplifies complex edges into smooth contours, ideal for shape analysis.
May oversimplify intricate boundaries, losing fine details.
Example: Use for object recognition in industrial inspection; avoid for highly detailed organic shapes.

Additional Notes
Practical Considerations: When implementing these techniques, consider image size, noise characteristics, and
computational constraints. For example, Sobel is preferred in embedded systems due to its simplicity, while
Canny suits high-precision tasks.
Parameter Tuning: Many methods (e.g., Canny, thresholding) require careful parameter selection. Use histogram
analysis for thresholding or cross-validation for filter parameters.
Extensions: Advanced techniques like adaptive median filtering, multi-scale edge detection, or deep learning-
based segmentation can enhance these methods for specific applications.

This comprehensive guide equips you with the knowledge to understand, apply, and compare image processing
techniques effectively. For specific exam-style questions or deeper clarification, feel free to ask!
