Image Processing Notes
Mathematical Representation: For an image ( f(x, y) ) and a filter mask ( w(s, t) ) of size ( m \times n ), the filtered
image ( g(x, y) ) is: [ g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) f(x+s, y+t) ] where ( a = \lfloor m/2 \rfloor ), ( b = \lfloor n/2 \rfloor ). The
mask is centered at ((x, y)), and border pixels may require padding (e.g., zero-padding or replication).
Mechanics:
Example: Consider a ( 4 \times 4 ) image and a ( 3 \times 3 ) averaging filter: [ f = \begin{bmatrix} 10 & 20 & 30 & 40 \\ 50 & 60 &
70 & 80 \\ 90 & 100 & 110 & 120 \\ 130 & 140 & 150 & 160 \end{bmatrix}, \quad w = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 &
1 \end{bmatrix} ] Compute ( g(1, 1) ): [ g(1, 1) = \frac{1}{9} (10 + 20 + 30 + 50 + 60 + 70 + 90 + 100 + 110) = \frac{540}{9} = 60 ]
For border pixels, assume zero-padding or replicate edge values.
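The sum above can be checked numerically. The following is a minimal Python sketch rather than a reference implementation: the helper name `correlate_at` and the choice of zero-padding at the border are assumptions made for illustration.

```python
def correlate_at(f, w, x, y):
    """Apply mask w to image f at pixel (x, y); pixels outside f count as 0."""
    a, b = len(w) // 2, len(w[0]) // 2
    rows, cols = len(f), len(f[0])
    total = 0.0
    for s in range(-a, a + 1):
        for t in range(-b, b + 1):
            i, j = x + s, y + t
            if 0 <= i < rows and 0 <= j < cols:  # zero-padding: skip taps outside f
                total += w[s + a][t + b] * f[i][j]
    return total


f = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
w = [[1 / 9] * 3 for _ in range(3)]  # 3 x 3 averaging mask

g11 = correlate_at(f, w, 1, 1)  # interior pixel, full neighborhood
g00 = correlate_at(f, w, 0, 0)  # corner pixel, five of nine taps fall on padding
print(round(g11, 6), round(g00, 6))
```

With zero-padding the corner value is 140/9 (about 15.56), noticeably darker than its neighbors; replicating edge values instead would avoid that darkening.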
Advantages:
Disadvantages:
Use Cases:
Theory: The averaging filter replaces each pixel with the weighted average of its neighborhood, smoothing out high-
frequency components like noise. Weights are typically equal but can vary (e.g., Gaussian weights).
Equation: For a ( 3 \times 3 ) averaging filter: [ g(x, y) = \frac{1}{9} \sum_{s=-1}^{1} \sum_{t=-1}^{1} f(x+s, y+t) ] Mask: [ w = \frac{1}{9}
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} ]
Example: Using the ( 4 \times 4 ) image above, apply the averaging filter at ((1, 1)): [ g(1, 1) = 60 ] The
output image is smoother, with reduced intensity variations.
Advantages:
Disadvantages:
Use Cases:
Preprocessing for edge detection to reduce noise.
Smoothing low-resolution images for aesthetic purposes.
Noise reduction in video frames.
Theory: The median filter, a non-linear method, replaces each pixel with the median value of its neighborhood. It excels
at removing impulse noise (salt-and-pepper) while preserving edges.
Equation: For a ( 3 \times 3 ) neighborhood, sort the 9 pixel values and select the median (5th value in sorted order).
Example: Consider an image with salt-and-pepper noise: [ f = \begin{bmatrix} 10 & 20 & 255 & 40 \\ 50 & 60 & 0 & 80 \\ 90 & 100 &
110 & 120 \\ 130 & 140 & 150 & 160 \end{bmatrix} ] At ((1, 1)), the ( 3 \times 3 ) neighborhood is: [ [10, 20, 255, 50, 60, 0, 90, 100,
110] ] Sorted: ([0, 10, 20, 50, 60, 90, 100, 110, 255]), median = 60. Thus, ( g(1, 1) = 60 ).
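The sort-and-pick step can be sketched in a few lines of Python. The function name `median_at` is a placeholder, and the sketch assumes the neighborhood lies fully inside the image (no border handling).

```python
from statistics import median


def median_at(f, x, y, size=3):
    """Median of the size x size neighborhood centered on interior pixel (x, y)."""
    r = size // 2
    window = [f[i][j]
              for i in range(x - r, x + r + 1)
              for j in range(y - r, y + r + 1)]
    return median(window)


noisy = [
    [10, 20, 255, 40],
    [50, 60, 0, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

print(median_at(noisy, 1, 1))  # 60, the impulses 255 and 0 land at the ends of the sort
print(median_at(noisy, 1, 2))  # 80, the pepper pixel (value 0) is replaced
```

The impulses 255 and 0 both fall in the first window, yet neither influences the result because they sit at the extremes of the sorted list.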
Advantages:
Disadvantages:
Use Cases:
Application of Median Filtering for Noise Removal: Median filtering is ideal for impulse noise because it discards
extreme values (e.g., 255, 0 in the example), unlike averaging filters, which blend outliers into the output, causing
blurring. For instance, in the above image, the median filter restores the pixel at ((1, 2)) to a value consistent with its
neighbors, preserving the image’s structure.
Why Median over Averaging? Median filters remove impulse noise without blurring edges, making them suitable
for images with salt-and-pepper noise. Averaging filters are better for Gaussian noise but compromise edge
details.
Trade-off: Median filters are slower due to sorting but offer superior edge preservation.
Use Case Example: In medical imaging, median filters are preferred for MRI scans with impulse noise to
maintain anatomical boundaries, while averaging filters suit smoother noise in X-rays.
Sharpening Spatial Filters
Sharpening filters enhance edges and details by amplifying high-frequency components.
The Laplacian
Theory: The Laplacian, a second-order derivative operator, highlights regions of rapid intensity change (edges) by
computing the difference between a pixel and its neighbors.
Equation (Discrete Laplacian): [ \nabla^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y) ] Mask: [ w = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4
& 1 \\ 0 & 1 & 0 \end{bmatrix} ] Sharpened image: [ g(x, y) = f(x, y) - c \cdot \nabla^2 f(x, y) \quad (c > 0) ]
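Example: For the ( 4 \times 4 ) image at ((1, 1) = 60): [ \nabla^2 f(1, 1) = (20 + 100 + 50 + 70) - 4 \cdot 60 = 240 - 240 =
0 ] If ( c = 1 ): [ g(1, 1) = 60 - 0 = 60 ] The image is a linear intensity ramp, so its second derivative vanishes and the pixel
is unchanged. Near edges the Laplacian is nonzero and can be negative, and the sharpened value can fall outside ([0, 255]),
requiring clipping or normalization for display.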
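A small Python sketch of the discrete Laplacian and the sharpening rule ( g = f - c \cdot \nabla^2 f ) follows. It is illustrative only: the function names are placeholders, clipping to [0, 255] is one assumed way to handle out-of-range values, and the `spike` patch is hypothetical data added to show a nonzero response next to the ( 4 \times 4 ) ramp image.

```python
def laplacian_at(f, x, y):
    """Four-neighbor discrete Laplacian at an interior pixel (x, y)."""
    return f[x + 1][y] + f[x - 1][y] + f[x][y + 1] + f[x][y - 1] - 4 * f[x][y]


def sharpen_at(f, x, y, c=1):
    """g = f - c * Laplacian, clipped to the displayable range [0, 255]."""
    g = f[x][y] - c * laplacian_at(f, x, y)
    return max(0, min(255, g))


ramp = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
# Hypothetical patch: one bright pixel on a flat background.
spike = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]

print(laplacian_at(ramp, 1, 1), sharpen_at(ramp, 1, 1))    # 0 60
print(laplacian_at(spike, 1, 1), sharpen_at(spike, 1, 1))  # -360 255
```

On the ramp the four neighbors sum to 240, exactly four times the center value, so the Laplacian is zero. On the spike the unclipped result would be 100 + 360 = 460, which is why clipping or rescaling is needed before display.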
Advantages:
Disadvantages:
Use Cases:
Theory: Unsharp masking enhances edges by subtracting a blurred version of the image from the original, amplifying
high-frequency components. Highboost filtering extends this by increasing the amplification factor.
Equation: [ g(x, y) = f(x, y) + k \cdot [f(x, y) - f_{\text{blur}}(x, y)] ] where ( f_{\text{blur}} ) is the blurred image (e.g., via averaging
filter), and ( k \geq 0 ). If ( k > 1 ), it’s highboost filtering.
Example: Blur the ( 4 \times 4 ) image using the averaging filter: [ f_{\text{blur}}(1, 1) = 60 ] Compute the mask: [ \text{mask} = f -
f_{\text{blur}}, \quad \text{mask}(1, 1) = 60 - 60 = 0 ] If ( k = 1 ): [ g(1, 1) = 60 + 1 \cdot 0 = 60 ] For stronger edges, increase ( k ).
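The same computation can be sketched in Python. The names `box_blur_at` and `highboost_at` are placeholders, the blur is assumed to be the ( 3 \times 3 ) averaging filter, and the `spike` patch is hypothetical data included because the ramp image gives a zero mask at ((1, 1)).

```python
def box_blur_at(f, x, y):
    """3 x 3 average around an interior pixel (x, y)."""
    window = [f[x + s][y + t] for s in (-1, 0, 1) for t in (-1, 0, 1)]
    return sum(window) / 9


def highboost_at(f, x, y, k=1.0):
    """g = f + k * (f - blurred); k = 1 is unsharp masking, k > 1 is highboost."""
    mask = f[x][y] - box_blur_at(f, x, y)
    return f[x][y] + k * mask


ramp = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
# Hypothetical patch: one bright pixel on a flat background.
spike = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]

print(highboost_at(ramp, 1, 1, k=1))   # 60.0, mask is zero on the ramp
print(highboost_at(spike, 1, 1, k=1))  # 180.0, blur is 20 so mask is 80
print(highboost_at(spike, 1, 1, k=2))  # 260.0, exceeds 255 before any clipping
```

The last value shows why larger ( k ) tends to need clipping or rescaling: the boosted detail can push the result past the displayable range.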
Advantages:
Disadvantages:
Use Cases:
Theory: First-order derivatives detect edges by computing the gradient, which measures the rate of intensity change.
Gradient: [ \nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{bmatrix} ] Edge
magnitude: [ |\nabla f| = \sqrt{G_x^2 + G_y^2} \approx |G_x| + |G_y| ] Direction: [ \theta = \tan^{-1}(G_y / G_x) ]
Sobel Operator: [ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 &
2 & 1 \end{bmatrix} ]
Prewitt Operator: [ G_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}, \quad G_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1
& 1 & 1 \end{bmatrix} ]
Roberts Operator: [ G_x = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad G_y = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} ]
Example (Sobel): For the ( 4 \times 4 ) image at ((1, 1) = 60): [ G_x = (30 \cdot 1 + 70 \cdot 2 + 110 \cdot 1) - (10 \cdot 1
+ 50 \cdot 2 + 90 \cdot 1) = 280 - 200 = 80 ] [ G_y = (90 \cdot 1 + 100 \cdot 2 + 110 \cdot 1) - (10 \cdot 1 + 20 \cdot 2 + 30
\cdot 1) = 400 - 80 = 320 ] [ |\nabla f| \approx 80 + 320 = 400, \quad \theta = \tan^{-1}(320/80) \approx 76^\circ ]
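The Sobel example can be reproduced with a short Python sketch. The helper `apply_mask` is a placeholder name and assumes an interior pixel. Both the exact Euclidean magnitude and the ( |G_x| + |G_y| ) approximation are computed so the two can be compared.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_mask(f, w, x, y):
    """Response of a 3 x 3 mask w centered on interior pixel (x, y)."""
    return sum(w[s + 1][t + 1] * f[x + s][y + t]
               for s in (-1, 0, 1) for t in (-1, 0, 1))


f = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

gx = apply_mask(f, SOBEL_X, 1, 1)
gy = apply_mask(f, SOBEL_Y, 1, 1)
mag_approx = abs(gx) + abs(gy)             # |Gx| + |Gy|
mag_exact = math.hypot(gx, gy)             # sqrt(Gx^2 + Gy^2)
theta = math.degrees(math.atan2(gy, gx))   # direction in degrees

print(gx, gy, mag_approx, round(mag_exact, 2), round(theta, 1))
```

The approximation (400) overestimates the Euclidean magnitude (about 329.85) here; it is used because it avoids the square root, not because it is exact.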
Sobel: Weighted center pixels reduce noise impact, ideal for noisy images.
Prewitt: Simpler and faster but less robust to noise.
Roberts: Compact ( 2 \times 2 ) mask, good for diagonal edges, but noise-sensitive.
Disadvantages:
Use Cases:
Sobel over Prewitt: Sobel’s weighted center improves noise robustness, making it suitable for real-world images.
Sobel over Roberts: Sobel’s ( 3 \times 3 ) mask captures more context, enhancing edge detection accuracy.
Roberts Use Case: Preferred in low-compute environments due to its smaller mask, but less reliable for complex
edges.
Equation: For an ( M \times N ) image ( f(x, y) ): [ F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) e^{-j2\pi (ux/M + vy/N)} ] Inverse DFT: [ f(x,
y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v) e^{j2\pi (ux/M + vy/N)} ] where ( u, v ) are frequency variables, and ( F(u, v) ) is the
complex frequency spectrum.
Application:
Low frequencies (near ((0, 0))): Represent smooth regions (e.g., background).
High frequencies (far from center): Represent edges, noise, and details.
Filtering modifies ( F(u, v) ) to enhance or suppress specific frequencies.
Example: For a ( 2 \times 2 ) image: [ f = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} ] Compute ( F(0, 0) ): [ F(0, 0) = 1 \cdot e^0 + 2 \cdot e^0 +
3 \cdot e^0 + 4 \cdot e^0 = 1 + 2 + 3 + 4 = 10 ] (Full DFT requires computing all ( u, v ), typically via FFT for efficiency.)
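For a tiny image the DFT definition can be evaluated directly. The sketch below is a naive, direct-summation implementation using only the standard library (function names `dft2` and `idft2` are placeholders); it is meant for checking small examples by hand, not as a substitute for an FFT.

```python
import cmath


def dft2(f):
    """Direct 2-D DFT of an M x N image given as a list of lists."""
    M, N = len(f), len(f[0])
    return [[sum(f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                 for x in range(M) for y in range(N))
             for v in range(N)]
            for u in range(M)]


def idft2(F):
    """Direct inverse 2-D DFT, including the 1/(MN) factor."""
    M, N = len(F), len(F[0])
    return [[sum(F[u][v] * cmath.exp(2j * cmath.pi * (u * x / M + v * y / N))
                 for u in range(M) for v in range(N)) / (M * N)
             for y in range(N)]
            for x in range(M)]


f = [[1, 2], [3, 4]]
F = dft2(f)
restored = idft2(F)

for row in F:
    print([complex(round(c.real, 6), round(c.imag, 6)) for c in row])
```

For this image the four coefficients are 10, -2, -4, and 0 (up to rounding), and applying `idft2` recovers the original pixel values.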
Wavelet Transform: Wavelet transforms decompose an image into multi-resolution subbands (low and high frequencies)
using localized basis functions (wavelets), capturing both spatial and frequency information.
Equation (1-D for simplicity): [ W(a, b) = \int f(t) \psi_{a,b}(t) dt, \quad \psi_{a,b}(t) = \frac{1}{\sqrt{a}} \psi\left(\frac{t - b}{a}\right) ]
where ( a ) is scale, ( b ) is translation, and ( \psi ) is the wavelet function.
Haar Transform: The simplest wavelet transform, using a step function: [ \psi(t) = \begin{cases} 1 & 0 \leq t < 0.5 \\ -1 & 0.5 \leq t
< 1 \\ 0 & \text{otherwise} \end{cases} ] For a 1-D signal ([a, b]), the approximation coefficient is ( (a + b)/\sqrt{2} ) and the detail
coefficient is ( (a - b)/\sqrt{2} ).
Example (Haar): For ([1, 3]): [ \text{approximation} = \frac{1+3}{\sqrt{2}} = 2\sqrt{2}, \quad \text{detail} = \frac{1-3}{\sqrt{2}} = -\sqrt{2} ] For a
2-D image, apply Haar transform row-wise, then column-wise.
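A one-level Haar step is short enough to sketch directly. The code below assumes even-length signals and uses the orthonormal ( 1/\sqrt{2} ) scaling from the example; the function names and the ( 2 \times 2 ) test image are hypothetical.

```python
import math


def haar_step(signal):
    """One Haar level on an even-length 1-D signal: (approximations, details)."""
    root2 = math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(p + q) / root2 for p, q in pairs]
    detail = [(p - q) / root2 for p, q in pairs]
    return approx, detail


def transform_rows(image):
    """Replace each row by its approximations followed by its details."""
    out = []
    for row in image:
        approx, detail = haar_step(row)
        out.append(approx + detail)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def haar_2d(image):
    """One 2-D Haar level: rows first, then columns."""
    return transpose(transform_rows(transpose(transform_rows(image))))


approx, detail = haar_step([1, 3])
coeffs = haar_2d([[1, 3], [1, 3]])  # hypothetical 2 x 2 image with identical rows
print(approx, detail)
print(coeffs)
```

In the 2-D result the bottom row is zero (up to rounding) because the two image rows are identical, so there is no vertical detail to encode.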
Advantages (Wavelet/Haar):
Disadvantages:
Use Cases:
Wavelet over DFT: Wavelets provide localized frequency information, making them better for compression and
denoising. DFT is global and suited for filtering.
Haar Use Case: Preferred in low-compute environments due to simplicity, but less effective for smooth transitions
compared to other wavelets (e.g., Daubechies).
Example: A ( 3 \times 3 ) averaging filter in the spatial domain acts as a low-pass filter in the frequency domain,
attenuating high frequencies (edges, noise).
Ideal Low-Pass Filter: [ H(u, v) = \begin{cases} 1 & D(u, v) \leq D_0 \\ 0 & \text{otherwise} \end{cases} ] where ( D(u, v) = \sqrt{u^2 + v^2} ),
and ( D_0 ) is the cutoff frequency.
Use Case: Basic noise removal, but rarely used due to artifacts.
Butterworth Low-Pass Filter: [ H(u, v) = \frac{1}{1 + [D(u, v)/D_0]^{2n}} ] where ( n ) is the order, controlling
transition sharpness.
Ideal High-Pass Filter: [ H(u, v) = \begin{cases} 0 & D(u, v) \leq D_0 \\ 1 & \text{otherwise} \end{cases} ]
Example (Gaussian Low-Pass): For a ( 4 \times 4 ) image, compute the DFT, apply: [ H(u, v) = e^{-(u^2 + v^2)/(2\sigma^2)}, \quad \sigma = 1 ]
Multiply with ( F(u, v) ), then compute the inverse DFT to get a smoothed image.
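To compare how the filters roll off, the transfer functions can be evaluated at a few distances from the origin. This sketch covers only ( H(u, v) ), not the full DFT, multiply, inverse-DFT pipeline, and it assumes ( u, v ) are measured from the center of the spectrum. The cutoff ( D_0 = 2 ) and order ( n = 2 ) are arbitrary illustrative values.

```python
import math


def distance(u, v):
    """D(u, v): distance from the origin of the (centered) frequency plane."""
    return math.hypot(u, v)


def ideal_lowpass(u, v, d0):
    return 1.0 if distance(u, v) <= d0 else 0.0


def ideal_highpass(u, v, d0):
    return 1.0 - ideal_lowpass(u, v, d0)


def butterworth_lowpass(u, v, d0, n):
    return 1.0 / (1.0 + (distance(u, v) / d0) ** (2 * n))


def gaussian_lowpass(u, v, sigma):
    return math.exp(-(u * u + v * v) / (2 * sigma * sigma))


# Sample each filter along the u axis (v = 0).
for u in range(5):
    print(u,
          ideal_lowpass(u, 0, d0=2),
          round(butterworth_lowpass(u, 0, d0=2, n=2), 4),
          round(gaussian_lowpass(u, 0, sigma=1), 4))
```

The ideal filter jumps from 1 to 0 at the cutoff, the Butterworth filter passes through 0.5 at ( D = D_0 ), and the Gaussian decays smoothly with no cutoff discontinuity, which is the property behind its lack of ringing.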
Disadvantages:
Use Cases:
Gaussian over Ideal: Gaussian’s smooth transition avoids ringing, producing natural results.
Gaussian over Butterworth: Simpler to implement (no order parameter), widely used for noise reduction.
Butterworth Use Case: Preferred when precise control over transition steepness is needed (e.g., specialized
filtering).
Ideal Limitation: Rarely used due to artifacts, but useful for theoretical studies.
Equation: For a ( 3 \times 3 ) mask: [ w = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} ] Response: [ R(x, y) =
\sum_{s=-1}^{1} \sum_{t=-1}^{1} w(s, t) f(x+s, y+t) ] A pixel is an isolated point if ( |R(x, y)| > T ).
Example: For a ( 4 \times 4 ) image: [ f = \begin{bmatrix} 10 & 10 & 10 & 10 \\ 10 & 100 & 10 & 10 \\ 10 & 10 & 10 & 10 \\ 10 & 10 &
10 & 10 \end{bmatrix} ] At ((1, 1) = 100): [ R(1, 1) = (-1)(10 + 10 + 10 + 10 + 10 + 10 + 10 + 10) + 8 \cdot 100 = -80 + 800 = 720 ] If
( T = 500 ), ( |720| > 500 ), so ((1, 1)) is an isolated point.
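The point-detection test can be run over every interior pixel with a short Python sketch. The function names are placeholders, and border pixels are simply skipped here, which is an assumption rather than part of the method.

```python
POINT_MASK = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]


def response_at(f, w, x, y):
    """Response of a 3 x 3 mask w centered on interior pixel (x, y)."""
    return sum(w[s + 1][t + 1] * f[x + s][y + t]
               for s in (-1, 0, 1) for t in (-1, 0, 1))


def isolated_points(f, threshold):
    """Interior pixels whose absolute mask response exceeds the threshold."""
    return [(x, y)
            for x in range(1, len(f) - 1)
            for y in range(1, len(f[0]) - 1)
            if abs(response_at(f, POINT_MASK, x, y)) > threshold]


f = [
    [10, 10, 10, 10],
    [10, 100, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

print(response_at(f, POINT_MASK, 1, 1))  # 720
print(response_at(f, POINT_MASK, 1, 2))  # -90, a neighbor of the bright pixel
print(isolated_points(f, 500))           # [(1, 1)]
```

Pixels next to the bright point also respond (here with -90), so the threshold has to sit well above those side responses to flag only the point itself.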
Advantages:
Disadvantages:
Use Cases:
Line Detection
Theory: Line detection identifies linear structures (e.g., roads, boundaries) using directional masks tuned to specific
orientations (horizontal, vertical, diagonal). The mask with the strongest response indicates the line’s presence and
orientation.
Horizontal: [ w_h = \begin{bmatrix} -1 & -1 & -1 \\ 2 & 2 & 2 \\ -1 & -1 & -1 \end{bmatrix} ]
Vertical: [ w_v = \begin{bmatrix} -1 & 2 & -1 \\ -1 & 2 & -1 \\ -1 & 2 & -1 \end{bmatrix} ]
Diagonal (+45°): [ w_{+45^\circ} = \begin{bmatrix} -1 & -1 & 2 \\ -1 & 2 & -1 \\ 2 & -1 & -1 \end{bmatrix} ] Apply each mask and select the
orientation with maximum ( |R(x, y)| ).
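Example: For the image: [ f = \begin{bmatrix} 10 & 10 & 10 & 10 \\ 10 & 100 & 100 & 10 \\ 10 & 100 & 100 & 10 \\ 10 & 10 & 10 & 10
\end{bmatrix} ] Apply the vertical mask at ((1, 1) = 100), column by column over the ( 3 \times 3 ) neighborhood: [ R_v(1, 1) = (-1)(10 + 10 + 10)
+ 2(10 + 100 + 100) - (10 + 100 + 100) = -30 + 420 - 210 = 180 ] The bright area here is a ( 2 \times 2 ) block rather than a line, and by
symmetry the horizontal mask also gives 180, so no orientation dominates. For a one-pixel-wide vertical line (center column 100, side
columns 10) the response would be ( -30 + 600 - 30 = 540 ), clearly favoring the vertical mask.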
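Selecting the orientation by maximum ( |R| ) can be sketched as below. The ( 3 \times 3 ) patch is hypothetical data (a one-pixel-wide vertical line on a flat background), and the dictionary keys are labels chosen for this illustration.

```python
LINE_MASKS = {
    "horizontal": [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],
    "vertical": [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],
    "diagonal+45": [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]],
}


def response_at(f, w, x, y):
    """Response of a 3 x 3 mask w centered on interior pixel (x, y)."""
    return sum(w[s + 1][t + 1] * f[x + s][y + t]
               for s in (-1, 0, 1) for t in (-1, 0, 1))


def strongest_orientation(f, x, y):
    """Return (all responses, name of the mask with the largest |R|)."""
    responses = {name: response_at(f, w, x, y) for name, w in LINE_MASKS.items()}
    best = max(responses, key=lambda name: abs(responses[name]))
    return responses, best


# Hypothetical patch: a one-pixel-wide vertical line of 100s on a background of 10s.
patch = [
    [10, 100, 10],
    [10, 100, 10],
    [10, 100, 10],
]

responses, best = strongest_orientation(patch, 1, 1)
print(responses)  # vertical: 540, the other two masks: 0
print(best)       # vertical
```

Each mask sums to zero, so flat regions give no response; only structure aligned with the mask's positive coefficients produces a large value.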
Advantages:
Disadvantages:
Use Cases:
Road detection in satellite imagery.
Blood vessel detection in angiography.
Crack detection in infrastructure inspection.
Edge Models
Theory: Edges are boundaries between regions with different intensities. Common edge models include:
Step Edge: Abrupt intensity change (ideal but rare in real images).
Ramp Edge: Gradual change over several pixels (common due to blur).
Roof Edge: Intensity peak (e.g., thin lines or ridges).
Edge detection uses derivatives to identify these transitions:
First-Order Derivatives (Gradient): Measure intensity change magnitude and direction.
Second-Order Derivatives (Laplacian): Detect zero-crossings at edges.
Steps:
1. Smoothing: Apply a Gaussian filter to reduce noise: [ G(x, y) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)} ]
Convolve: ( f_{\text{smooth}} = f * G ).
2. Gradient Computation: Use Sobel operators to compute gradient magnitude ( M(x, y) ) and direction ( \theta(x, y) ).
Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 50 & 50 \\ 50 & 100 & 100 & 50 \\ 50 & 100 & 100 & 50 \\ 50 & 50 & 50 & 50
\end{bmatrix} ]
Advantages:
Disadvantages:
Use Cases:
Canny over Sobel: Canny’s noise reduction, edge thinning, and hysteresis make it more reliable for noisy or
complex images. Sobel is faster but produces thicker, noisier edges.
Sobel Use Case: Preferred for simple, low-noise images or real-time applications where speed is critical.
Method: For each edge pixel, check neighbors in a ( 3 \times 3 ) window. Link if gradient direction and magnitude
are similar.
Equation: Link pixels ((x_1, y_1)) and ((x_2, y_2)) if: [ |\theta(x_1, y_1) - \theta(x_2, y_2)| < T_\theta \quad \text{and}
\quad |M(x_1, y_1) - M(x_2, y_2)| < T_M ]
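The linking test can be sketched as a scan of the ( 3 \times 3 ) window around an edge pixel. Everything concrete here is an assumption for illustration: the function name, the magnitude and angle arrays, the thresholds, and the use of degrees. Angle wrap-around (e.g., 359° vs. 1°) is deliberately ignored to keep the sketch aligned with the inequality as written.

```python
def linked_neighbors(x, y, magnitude, angle, t_mag, t_angle):
    """Neighbors of (x, y) in its 3 x 3 window with similar magnitude and direction."""
    rows, cols = len(magnitude), len(magnitude[0])
    links = []
    for i in range(max(0, x - 1), min(rows, x + 2)):
        for j in range(max(0, y - 1), min(cols, y + 2)):
            if (i, j) == (x, y):
                continue
            similar_mag = abs(magnitude[x][y] - magnitude[i][j]) < t_mag
            similar_dir = abs(angle[x][y] - angle[i][j]) < t_angle
            if similar_mag and similar_dir:
                links.append((i, j))
    return links


# Hypothetical gradient magnitudes and directions (degrees) for a 3 x 3 patch.
M = [
    [100, 98, 20],
    [97, 100, 15],
    [30, 25, 10],
]
theta = [
    [90, 88, 10],
    [92, 90, 5],
    [45, 40, 0],
]

links = linked_neighbors(1, 1, M, theta, t_mag=10, t_angle=15)
print(links)  # [(0, 0), (0, 1), (1, 0)]
```

Only the three neighbors whose magnitude and direction both stay within the thresholds are linked to the center pixel.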
Boundary Detection Using Regional Processing (Polygonal Fitting): Theory: Fits polygons to edge points to form
closed boundaries, simplifying complex edge contours.
Method: Group edge points into segments, then fit straight lines or curves using least-squares.
Equation (Line Fitting): For points ((x_i, y_i)), minimize: [ E = \sum_i (y_i - (mx_i + c))^2 ] Solve for slope ( m )
and intercept ( c ).
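Example: For edge points ((1, 1), (1, 2), (2, 1), (2, 2)), group into a square and fit lines to form a polygon. For a line
segment ((1, 1), (1, 2)), the fitted line is vertical (( x = 1 )). The slope-intercept form ( y = mx + c ) cannot represent this
case because the slope is undefined, so vertical segments are fitted as ( x = \text{const} ) instead.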
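Minimizing ( E ) has a closed-form solution, sketched below in Python. The function name `fit_line` and the three sample points are hypothetical; the vertical-segment check reflects the fact that ( y = mx + c ) has no solution when all ( x_i ) are equal.

```python
def fit_line(points):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if denom == 0:
        # All x values are equal: the segment is vertical and m is undefined.
        raise ValueError("vertical segment: fit x = const instead")
    m = (n * sxy - sx * sy) / denom
    c = (sy - m * sx) / n
    return m, c


m, c = fit_line([(0, 1), (1, 3), (2, 5)])  # hypothetical collinear points
print(m, c)  # 2.0 1.0

try:
    fit_line([(1, 1), (1, 2)])
    vertical_detected = False
except ValueError:
    vertical_detected = True
print(vertical_detected)  # True
```

A total-least-squares fit (minimizing perpendicular distance) would handle vertical segments uniformly, at the cost of a slightly more involved solution.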
Advantages:
Disadvantages:
Use Cases:
4.2 Thresholding
Foundation
Theory: Thresholding segments an image into regions (e.g., object vs. background) by comparing pixel intensities to a
threshold, producing a binary or labeled image.
Equation: For an image ( f(x, y) ): [ g(x, y) = \begin{cases} 1 & \text{if } f(x, y) \geq T \\ 0 & \text{otherwise} \end{cases} ] where ( T ) is the threshold.
Algorithm:
Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 100
& 100 \end{bmatrix} ]
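A sketch of global thresholding on this image follows. The thresholding rule is the equation above; the threshold-selection routine is one common iterative scheme (average of the two class means, repeated until stable) and is an assumption, not necessarily the algorithm these notes had in mind. Function names and the stopping tolerance `eps` are placeholders.

```python
def iterative_threshold(f, eps=0.5):
    """Pick T by repeatedly averaging the means of the two classes it induces."""
    pixels = [p for row in f for p in row]
    t = sum(pixels) / len(pixels)  # start from the global mean
    while True:
        low = [p for p in pixels if p < t]
        high = [p for p in pixels if p >= t]
        if not low or not high:
            return t  # only one class present; nothing to refine
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


def apply_threshold(f, t):
    """g(x, y) = 1 if f(x, y) >= t, else 0."""
    return [[1 if p >= t else 0 for p in row] for row in f]


f = [
    [50, 50, 100, 100],
    [50, 50, 100, 100],
    [50, 50, 100, 100],
    [50, 50, 100, 100],
]

T = iterative_threshold(f)
g = apply_threshold(f, T)
print(T)  # 75.0
for row in g:
    print(row)  # [0, 0, 1, 1]
```

With a clean bimodal image like this one, the iteration settles immediately at the midpoint between the two intensity levels.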
Disadvantages:
Use Cases:
Global over Adaptive: Faster and sufficient for uniform illumination and bimodal histograms.
Adaptive Use Case: Necessary for images with varying lighting (e.g., outdoor scenes, medical images).
Trade-off: Global thresholding is simpler but less robust; adaptive thresholding handles complex lighting but is
slower.
Algorithm:
Example: For the image: [ f = \begin{bmatrix} 50 & 52 & 100 & 102 \\ 51 & 50 & 101 & 100 \\ 50 & 51 & 100 & 101 \\ 100 & 100 &
102 & 101 \end{bmatrix} ]
Disadvantages:
Use Cases:
1. Splitting:
Divide the image into four quadrants.
For each quadrant, check homogeneity (e.g., variance ( \sigma^2 < T )).
If not homogeneous, split further.
2. Merging:
Merge adjacent regions ( R_i ) and ( R_j ) if: [ |\mu_i - \mu_j| < T ] where ( \mu_i, \mu_j ) are region means.
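Example: For the image: [ f = \begin{bmatrix} 50 & 50 & 100 & 100 \\ 50 & 50 & 100 & 100 \\ 50 & 50 & 150 & 150 \\ 50 & 50 & 150
& 150 \end{bmatrix} ]
Split: The full image is not homogeneous, so it is split into four ( 2 \times 2 ) quadrants: top-left (50s), top-right (100s),
bottom-left (50s), and bottom-right (150s). Each quadrant has ( \sigma^2 = 0 ), so no further splitting is needed.
Merge: Top-left and bottom-left have equal means and merge into one region of 50s; top-right (100s) and bottom-right (150s)
stay separate for any merging threshold ( T \leq 50 ).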
Output: Three regions: {50s}, {100s}, {150s}.
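The split and merge phases can be sketched in Python as follows. This is a simplified illustration under several assumptions: the image is square with a power-of-two side, homogeneity is tested with the variance, merging compares the means of the original leaf blocks (not the running mean of an already merged region), and the thresholds `t_var=1.0` and `t_mean=10` are arbitrary.

```python
def block_stats(f, r0, c0, size):
    """Mean and variance of the size x size block with top-left corner (r0, c0)."""
    vals = [f[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var


def split(f, r0, c0, size, t_var):
    """Recursively split into quadrants until each block is homogeneous."""
    mean, var = block_stats(f, r0, c0, size)
    if var < t_var or size == 1:
        return [(r0, c0, size, mean)]
    half = size // 2
    leaves = []
    for dr in (0, half):
        for dc in (0, half):
            leaves += split(f, r0 + dr, c0 + dc, half, t_var)
    return leaves


def adjacent(a, b):
    """True if two blocks share an edge (diagonal contact does not count)."""
    ra, ca, sa, _ = a
    rb, cb, sb, _ = b
    rows_overlap = ra < rb + sb and rb < ra + sa
    cols_overlap = ca < cb + sb and cb < ca + sa
    touch_vertical = (ra + sa == rb or rb + sb == ra) and cols_overlap
    touch_horizontal = (ca + sa == cb or cb + sb == ca) and rows_overlap
    return touch_vertical or touch_horizontal


def merge(leaves, t_mean):
    """Give adjacent leaves with similar means the same region label."""
    labels = list(range(len(leaves)))
    for i in range(len(leaves)):
        for j in range(i + 1, len(leaves)):
            if adjacent(leaves[i], leaves[j]) and abs(leaves[i][3] - leaves[j][3]) < t_mean:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels


f = [
    [50, 50, 100, 100],
    [50, 50, 100, 100],
    [50, 50, 150, 150],
    [50, 50, 150, 150],
]

leaves = split(f, 0, 0, 4, t_var=1.0)
labels = merge(leaves, t_mean=10)
print(leaves)
print(labels, len(set(labels)))
```

One split yields four homogeneous ( 2 \times 2 ) quadrants; merging then joins the two quadrants of 50s, leaving three regions.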
Advantages:
Disadvantages:
Computationally expensive, especially for deep quadtrees.
May produce blocky regions due to rectangular splits.
Requires tuning of homogeneity and merging thresholds.
Use Cases:
Region Growing over Splitting/Merging: Intuitive for small, homogeneous regions but sensitive to seed
selection and less systematic.
Splitting/Merging Use Case: Preferred for large, varied images due to its hierarchical approach, ensuring robust
segmentation.
Trade-off: Region growing is faster for targeted regions; splitting/merging is more comprehensive but slower.
Spatial: Faster for small masks, intuitive, and localized. Ideal for real-time applications (e.g., mobile
imaging).
Frequency: Better for global effects (e.g., periodic noise removal), but computationally expensive. Use for
specialized filtering tasks.
Example: Use spatial averaging for quick noise reduction in video; use frequency domain for removing
scanner artifacts.
Median: Removes impulse noise (salt-and-pepper) while preserving edges, ideal for medical imaging.
Averaging: Better for Gaussian noise but blurs edges, suited for smooth textures.
Example: Apply median filter to MRI scans with impulse noise; use averaging for X-rays with Gaussian
noise.
Sobel over Prewitt: Sobel’s weighted center reduces noise, making it reliable for real-world images.
Sobel over Roberts: Sobel’s ( 3 \times 3 ) mask captures more context, improving edge detection.
Roberts Use Case: Use in low-compute devices for diagonal edges, but less robust overall.
Canny: Reduces noise, thins edges, and connects fragments, ideal for noisy or complex images.
Sobel: Faster for simple, low-noise images or real-time processing.
Example: Use Canny for object detection in robotics; use Sobel for quick edge detection in embedded
systems.
Gaussian over Ideal: Avoids ringing, producing natural results for noise reduction.
Gaussian over Butterworth: Simpler (no order parameter), widely used for smoothing.
Butterworth Use Case: Use for precise control in specialized filtering (e.g., scientific imaging).
Global: Simple and fast for uniform illumination and bimodal histograms.
Adaptive: Handles varying lighting, essential for outdoor or medical images.
Example: Use global for document binarization; use adaptive for unevenly lit natural scenes.
Region Growing: Best for small, homogeneous regions but requires seed selection.
Splitting/Merging: Systematic for large, varied images, ensuring robust segmentation.
Example: Use region growing for tumor segmentation; use splitting/merging for land cover analysis.
Simplifies complex edges into smooth contours, ideal for shape analysis.
May oversimplify intricate boundaries, losing fine details.
Example: Use for object recognition in industrial inspection; avoid for highly detailed organic shapes.
Additional Notes
Practical Considerations: When implementing these techniques, consider image size, noise characteristics, and
computational constraints. For example, Sobel is preferred in embedded systems due to its simplicity, while
Canny suits high-precision tasks.
Parameter Tuning: Many methods (e.g., Canny, thresholding) require careful parameter selection. Use histogram
analysis for thresholding or cross-validation for filter parameters.
Extensions: Advanced techniques like adaptive median filtering, multi-scale edge detection, or deep learning-
based segmentation can enhance these methods for specific applications.