
Classical Computer Vision Algorithms

A Comprehensive Guide with Code Examples and Interview Questions

Table of Contents
1. Edge Detection
2. Thresholding
3. Contour Detection
4. Morphological Operations
5. Histogram Equalization
6. Interview Questions & Practice Tasks

Edge Detection

Overview
Edge detection is fundamental in computer vision for identifying boundaries between objects and
regions in images. Edges represent significant local changes in intensity.

Sobel Edge Detection

Concept

The Sobel operator uses two 3×3 convolution kernels to detect horizontal and vertical edges:

Sobel X (Vertical edges):

-1 0 1
-2 0 2
-1 0 1

Sobel Y (Horizontal edges):

-1 -2 -1
0 0 0
1 2 1

The gradient magnitude is calculated as: G = √(Gx² + Gy²)

Code Implementation
python

import cv2
import numpy as np
import matplotlib.pyplot as plt

def sobel_edge_detection(image_path):
    # Read image in grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Apply Sobel operator
    sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

    # Calculate gradient magnitude
    sobel_combined = np.sqrt(sobel_x**2 + sobel_y**2)

    # Normalize to 0-255 range
    sobel_combined = np.uint8(sobel_combined / sobel_combined.max() * 255)

    return sobel_x, sobel_y, sobel_combined

# Manual implementation for understanding
def manual_sobel(img):
    sobel_x_kernel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    sobel_y_kernel = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

    rows, cols = img.shape
    sobel_x = np.zeros_like(img, dtype=np.float64)
    sobel_y = np.zeros_like(img, dtype=np.float64)

    for i in range(1, rows-1):
        for j in range(1, cols-1):
            gx = np.sum(img[i-1:i+2, j-1:j+2] * sobel_x_kernel)
            gy = np.sum(img[i-1:i+2, j-1:j+2] * sobel_y_kernel)
            sobel_x[i, j] = gx
            sobel_y[i, j] = gy

    magnitude = np.sqrt(sobel_x**2 + sobel_y**2)
    return sobel_x, sobel_y, magnitude

Canny Edge Detection

Concept

Canny edge detection is a multi-stage algorithm:

1. Gaussian Blur - Noise reduction
2. Gradient Calculation - Find intensity gradients
3. Non-maximum Suppression - Thin edges to single pixels
4. Double Thresholding - Identify strong and weak edges
5. Edge Tracking - Connect weak edges to strong edges

Code Implementation

python
def canny_edge_detection(image_path, low_threshold=50, high_threshold=150):
    # Read image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Apply Canny edge detection
    edges = cv2.Canny(img, low_threshold, high_threshold)

    return edges

# Manual implementation of key steps
def manual_canny(img, low_thresh=50, high_thresh=150):
    # Step 1: Gaussian blur
    blurred = cv2.GaussianBlur(img, (5, 5), 1.4)

    # Step 2: Calculate gradients
    sobel_x = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(blurred, cv2.CV_64F, 0, 1, ksize=3)

    gradient_magnitude = np.sqrt(sobel_x**2 + sobel_y**2)
    gradient_direction = np.arctan2(sobel_y, sobel_x)

    # Step 3: Non-maximum suppression
    suppressed = non_max_suppression(gradient_magnitude, gradient_direction)

    # Step 4: Double thresholding
    strong_edges = suppressed > high_thresh
    weak_edges = (suppressed >= low_thresh) & (suppressed <= high_thresh)

    # Step 5: Edge tracking by hysteresis
    final_edges = edge_tracking(strong_edges, weak_edges)

    return final_edges

def non_max_suppression(gradient_mag, gradient_dir):
    rows, cols = gradient_mag.shape
    suppressed = np.zeros_like(gradient_mag)

    # Convert direction to degrees in [0, 180)
    angle = gradient_dir * 180.0 / np.pi
    angle[angle < 0] += 180

    for i in range(1, rows-1):
        for j in range(1, cols-1):
            # Determine neighbors based on gradient direction
            if (0 <= angle[i, j] < 22.5) or (157.5 <= angle[i, j] <= 180):
                neighbors = [gradient_mag[i, j-1], gradient_mag[i, j+1]]
            elif 22.5 <= angle[i, j] < 67.5:
                neighbors = [gradient_mag[i-1, j+1], gradient_mag[i+1, j-1]]
            elif 67.5 <= angle[i, j] < 112.5:
                neighbors = [gradient_mag[i-1, j], gradient_mag[i+1, j]]
            else:
                neighbors = [gradient_mag[i-1, j-1], gradient_mag[i+1, j+1]]

            # Keep pixel only if it is a local maximum along the gradient
            if gradient_mag[i, j] >= max(neighbors):
                suppressed[i, j] = gradient_mag[i, j]

    return suppressed
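
Note that edge_tracking is called above but not defined in this guide. A minimal sketch of hysteresis tracking, here done by repeatedly dilating the strong-edge set into adjacent weak pixels (one of several reasonable implementations, reusing the imports above):

python
def edge_tracking(strong_edges, weak_edges, max_iter=100):
    # Promote weak pixels that touch the current strong set, until stable
    kernel = np.ones((3, 3), np.uint8)
    final = strong_edges.astype(np.uint8)
    weak = weak_edges.astype(np.uint8)
    for _ in range(max_iter):
        grown = cv2.dilate(final, kernel) & weak
        updated = final | grown
        if np.array_equal(updated, final):
            break
        final = updated
    return final * 255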

Diagram Concept

Original Image → Gaussian Blur → Gradient Calculation → Non-max Suppression → Double Threshold → Edge
Tracking → Final Edges

Q&A: Edge Detection


Q1: What's the difference between Sobel and Canny edge detection? A: Sobel is simpler and faster,
using convolution kernels to find gradients. Canny is more sophisticated with multiple stages, producing
cleaner, single-pixel-wide edges with better noise suppression.

Q2: Why do we use Gaussian blur before edge detection? A: Gaussian blur reduces noise that could
cause false edges. It smooths the image while preserving important edge information.

Q3: What happens if Canny thresholds are too high/low? A: Too high: Miss weak but important edges.
Too low: Include noise as edges. The ratio should typically be 2:1 or 3:1 (high:low).
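
These thresholds can also be chosen automatically from image statistics. A sketch of the common median-based heuristic (sigma=0.33 is a conventional default, not an OpenCV constant):

python
def auto_canny(img, sigma=0.33):
    # Derive low/high thresholds from the median intensity
    med = np.median(img)
    low = int(max(0, (1.0 - sigma) * med))
    high = int(min(255, (1.0 + sigma) * med))
    return cv2.Canny(img, low, high)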

Thresholding

Overview
Thresholding converts grayscale images to binary images by separating pixels into foreground and
background based on intensity values.

Global Thresholding

Concept

Uses a single threshold value for the entire image:

If pixel intensity > threshold → white (255)

If pixel intensity ≤ threshold → black (0)

Code Implementation
python

def global_thresholding(image_path, threshold=127):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Manual thresholding
    binary_manual = np.where(img > threshold, 255, 0).astype(np.uint8)

    # OpenCV thresholding
    ret, binary_cv = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)

    return img, binary_manual, binary_cv

# Example with different threshold types
def threshold_types_demo(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    threshold_value = 127
    max_value = 255

    # Different threshold types
    _, binary = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_BINARY)
    _, binary_inv = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_BINARY_INV)
    _, trunc = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TRUNC)
    _, tozero = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TOZERO)
    _, tozero_inv = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TOZERO_INV)

    return {
        'original': img,
        'binary': binary,
        'binary_inv': binary_inv,
        'trunc': trunc,
        'tozero': tozero,
        'tozero_inv': tozero_inv
    }

Adaptive Thresholding

Concept

Calculates threshold locally for each pixel based on neighborhood statistics. Better for images with
varying illumination.

Code Implementation

python
def adaptive_thresholding(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Adaptive Mean Thresholding
    adaptive_mean = cv2.adaptiveThreshold(
        img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2
    )

    # Adaptive Gaussian Thresholding
    adaptive_gaussian = cv2.adaptiveThreshold(
        img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2
    )

    return img, adaptive_mean, adaptive_gaussian

# Manual adaptive thresholding implementation
def manual_adaptive_threshold(img, block_size=11, C=2):
    rows, cols = img.shape
    binary = np.zeros_like(img)
    offset = block_size // 2

    for i in range(offset, rows - offset):
        for j in range(offset, cols - offset):
            # Extract local neighborhood
            neighborhood = img[i-offset:i+offset+1, j-offset:j+offset+1]

            # Calculate local threshold (mean - C)
            local_mean = np.mean(neighborhood)
            threshold = local_mean - C

            # Apply threshold
            binary[i, j] = 255 if img[i, j] > threshold else 0

    return binary

Otsu's Thresholding

Concept

Automatically finds optimal threshold by minimizing intra-class variance (or maximizing inter-class
variance). Assumes bimodal histogram.

Code Implementation

python
def otsu_thresholding(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Otsu's thresholding
    ret, otsu_binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    print(f"Optimal threshold found by Otsu: {ret}")

    return img, otsu_binary, ret

# Manual Otsu implementation
def manual_otsu(img):
    # Calculate histogram
    hist, bins = np.histogram(img.flatten(), 256, [0, 256])
    total_pixels = img.size

    best_threshold = 0
    max_variance = 0

    for threshold in range(256):
        # Class probabilities
        w0 = np.sum(hist[:threshold]) / total_pixels  # Background
        w1 = np.sum(hist[threshold:]) / total_pixels  # Foreground

        if w0 == 0 or w1 == 0:
            continue

        # Class means
        mu0 = np.sum([i * hist[i] for i in range(threshold)]) / (w0 * total_pixels)
        mu1 = np.sum([i * hist[i] for i in range(threshold, 256)]) / (w1 * total_pixels)

        # Inter-class variance
        variance = w0 * w1 * (mu0 - mu1) ** 2

        if variance > max_variance:
            max_variance = variance
            best_threshold = threshold

    return best_threshold

Q&A: Thresholding

Q1: When should you use adaptive vs global thresholding? A: Use adaptive thresholding for images
with uneven illumination or varying lighting conditions. Use global thresholding for uniformly lit images
with clear bimodal histograms.

Q2: What's the advantage of Otsu's method? A: Otsu's method automatically finds the optimal
threshold without manual tuning, making it robust for images with bimodal intensity distributions.

Q3: What are the parameters in adaptive thresholding? A: Block size (neighborhood size for local
threshold calculation) and C (constant subtracted from mean/weighted mean).

Contour Detection

Overview
Contours are curves joining continuous points of the same color or intensity, representing object
boundaries.

Concept
Contours are detected with a border-following algorithm after binarization. OpenCV uses the
Suzuki-Abe algorithm for contour detection.

Diagram Concept

Original Image → Preprocessing → Binary Image → Contour Detection → Contour Hierarchy

Code Implementation

python
def contour_detection(image_path):
    # Read image
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Preprocessing - thresholding
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

    # Find contours
    contours, hierarchy = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )

    # Draw contours
    img_contours = img.copy()
    cv2.drawContours(img_contours, contours, -1, (0, 255, 0), 2)

    return img, binary, img_contours, contours

def contour_analysis(contours):
    """Analyze contour properties"""
    results = []

    for i, contour in enumerate(contours):
        # Calculate area
        area = cv2.contourArea(contour)

        # Calculate perimeter
        perimeter = cv2.arcLength(contour, True)

        # Approximate contour
        epsilon = 0.02 * perimeter
        approx = cv2.approxPolyDP(contour, epsilon, True)

        # Bounding rectangle
        x, y, w, h = cv2.boundingRect(contour)

        # Convex hull
        hull = cv2.convexHull(contour)

        # Solidity (area / convex hull area)
        hull_area = cv2.contourArea(hull)
        solidity = area / hull_area if hull_area > 0 else 0

        # Aspect ratio
        aspect_ratio = w / h

        results.append({
            'contour_id': i,
            'area': area,
            'perimeter': perimeter,
            'vertices': len(approx),
            'bounding_box': (x, y, w, h),
            'solidity': solidity,
            'aspect_ratio': aspect_ratio
        })

    return results

# Contour filtering based on properties
def filter_contours(contours, min_area=100, max_area=10000):
    """Filter contours based on area"""
    filtered = []
    for contour in contours:
        area = cv2.contourArea(contour)
        if min_area <= area <= max_area:
            filtered.append(contour)
    return filtered

Advanced Contour Operations

python
def advanced_contour_operations(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    result_img = img.copy()

    for contour in contours:
        # Minimum enclosing circle
        (x, y), radius = cv2.minEnclosingCircle(contour)
        center = (int(x), int(y))
        radius = int(radius)
        cv2.circle(result_img, center, radius, (255, 0, 0), 2)

        # Minimum area rectangle
        rect = cv2.minAreaRect(contour)
        box = cv2.boxPoints(rect)
        box = np.int32(box)  # np.int0 was removed in NumPy 2.0
        cv2.drawContours(result_img, [box], 0, (0, 0, 255), 2)

        # Fit ellipse (if contour has enough points)
        if len(contour) >= 5:
            ellipse = cv2.fitEllipse(contour)
            cv2.ellipse(result_img, ellipse, (0, 255, 255), 2)

    return result_img

Q&A: Contour Detection


Q1: What's the difference between RETR_EXTERNAL and RETR_TREE? A: RETR_EXTERNAL retrieves
only outer contours, while RETR_TREE retrieves all contours and creates a full hierarchy of nested
contours.

Q2: When would you use CHAIN_APPROX_SIMPLE vs CHAIN_APPROX_NONE? A:
CHAIN_APPROX_SIMPLE stores only essential points (saves memory), while CHAIN_APPROX_NONE stores
all boundary points (more accurate for some applications).

Q3: How do you distinguish between different shapes using contours? A: Use contour properties like
area, perimeter, aspect ratio, solidity, and number of vertices after approximation.
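
For instance, a minimal shape classifier keyed off those properties (the vertex counts and the 0.8 circularity cutoff are illustrative assumptions):

python
def classify_shape(contour):
    perimeter = cv2.arcLength(contour, True)
    approx = cv2.approxPolyDP(contour, 0.02 * perimeter, True)
    vertices = len(approx)

    if vertices == 3:
        return 'triangle'
    if vertices == 4:
        # Near-unit aspect ratio suggests a square rather than a rectangle
        x, y, w, h = cv2.boundingRect(approx)
        return 'square' if 0.95 <= w / h <= 1.05 else 'rectangle'

    # Circularity 4*pi*A/P^2 equals 1.0 for a perfect circle
    area = cv2.contourArea(contour)
    circularity = 4 * np.pi * area / (perimeter ** 2) if perimeter > 0 else 0
    return 'circle' if circularity > 0.8 else 'polygon'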

Morphological Operations
Overview

Morphological operations process images based on shapes using a structuring element (kernel). Primary
operations are erosion and dilation.

Basic Operations

Erosion

Shrinks foreground objects by removing pixels at boundaries.

Dilation

Expands foreground objects by adding pixels at boundaries.

Code Implementation

python
def morphological_operations(image_path):
    # Read and preprocess image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

    # Define structuring element
    kernel = np.ones((5, 5), np.uint8)

    # Basic operations
    erosion = cv2.erode(binary, kernel, iterations=1)
    dilation = cv2.dilate(binary, kernel, iterations=1)

    # Compound operations
    opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    gradient = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)
    tophat = cv2.morphologyEx(binary, cv2.MORPH_TOPHAT, kernel)
    blackhat = cv2.morphologyEx(binary, cv2.MORPH_BLACKHAT, kernel)

    return {
        'original': img,
        'binary': binary,
        'erosion': erosion,
        'dilation': dilation,
        'opening': opening,
        'closing': closing,
        'gradient': gradient,
        'tophat': tophat,
        'blackhat': blackhat
    }

# Manual implementation
def manual_erosion(img, kernel):
    rows, cols = img.shape
    k_rows, k_cols = kernel.shape
    offset_r, offset_c = k_rows // 2, k_cols // 2

    eroded = np.zeros_like(img)

    for i in range(offset_r, rows - offset_r):
        for j in range(offset_c, cols - offset_c):
            # Extract neighborhood
            neighborhood = img[i-offset_r:i+offset_r+1, j-offset_c:j+offset_c+1]

            # Erosion: minimum value where kernel is 1
            eroded[i, j] = np.min(neighborhood[kernel == 1])

    return eroded

def manual_dilation(img, kernel):
    rows, cols = img.shape
    k_rows, k_cols = kernel.shape
    offset_r, offset_c = k_rows // 2, k_cols // 2

    dilated = np.zeros_like(img)

    for i in range(offset_r, rows - offset_r):
        for j in range(offset_c, cols - offset_c):
            neighborhood = img[i-offset_r:i+offset_r+1, j-offset_c:j+offset_c+1]

            # Dilation: maximum value where kernel is 1
            dilated[i, j] = np.max(neighborhood[kernel == 1])

    return dilated

Advanced Morphological Operations

python
def noise_removal_pipeline(image_path):
    """Complete pipeline for noise removal using morphological operations"""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Remove small noise
    kernel_small = np.ones((3, 3), np.uint8)
    opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel_small, iterations=2)

    # Fill small holes
    kernel_large = np.ones((7, 7), np.uint8)
    closing = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel_large, iterations=2)

    return img, binary, opening, closing

def create_custom_kernels():
    """Create different structuring elements"""
    kernels = {
        'rectangular': cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5)),
        'elliptical': cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)),
        'cross': cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5)),
        'custom_plus': np.array([[0, 1, 0],
                                 [1, 1, 1],
                                 [0, 1, 0]], dtype=np.uint8),
        'diamond': np.array([[0, 0, 1, 0, 0],
                             [0, 1, 1, 1, 0],
                             [1, 1, 1, 1, 1],
                             [0, 1, 1, 1, 0],
                             [0, 0, 1, 0, 0]], dtype=np.uint8)
    }

    return kernels

Operation Descriptions

Opening = Erosion + Dilation

Removes small objects and noise
Separates connected objects
Smooths object boundaries

Closing = Dilation + Erosion

Fills small holes and gaps
Connects nearby objects
Smooths object boundaries

Gradient = Dilation - Erosion

Highlights object boundaries
Creates outline effect

Top Hat = Original - Opening

Highlights small bright details
Removes large structures

Black Hat = Closing - Original

Highlights small dark details
Removes large structures
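
These identities can be checked directly against OpenCV's compound operations; a minimal, self-contained sketch using a synthetic binary image:

python
import cv2
import numpy as np

# Synthetic binary image: white blob on black background
binary = np.zeros((100, 100), np.uint8)
cv2.circle(binary, (50, 50), 25, 255, -1)
kernel = np.ones((5, 5), np.uint8)

dilation = cv2.dilate(binary, kernel)
erosion = cv2.erode(binary, kernel)
opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

# Gradient = Dilation - Erosion
assert np.array_equal(cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel),
                      cv2.subtract(dilation, erosion))

# Top Hat = Original - Opening
assert np.array_equal(cv2.morphologyEx(binary, cv2.MORPH_TOPHAT, kernel),
                      cv2.subtract(binary, opening))

# Black Hat = Closing - Original
assert np.array_equal(cv2.morphologyEx(binary, cv2.MORPH_BLACKHAT, kernel),
                      cv2.subtract(closing, binary))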

Q&A: Morphological Operations


Q1: What's the effect of kernel size on morphological operations? A: Larger kernels create more
dramatic effects - more erosion/dilation, better noise removal in opening, larger hole filling in closing.

Q2: How do you choose the right structuring element? A: Choose based on the shape of features you
want to preserve or remove. Circular kernels for general use, rectangular for specific directional
operations.

Q3: What's the difference between opening and closing? A: Opening removes small objects and
separates connected ones (erosion first). Closing fills gaps and connects nearby objects (dilation first).

Histogram Equalization

Overview
Histogram equalization improves image contrast by redistributing pixel intensities to utilize the full
dynamic range.

Concept
Transforms the image so that its histogram becomes approximately uniform, enhancing contrast
especially in low-contrast images.

Code Implementation

python
def histogram_equalization(image_path):
    # Read image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Apply histogram equalization
    equalized = cv2.equalizeHist(img)

    # Calculate histograms
    hist_original = cv2.calcHist([img], [0], None, [256], [0, 256])
    hist_equalized = cv2.calcHist([equalized], [0], None, [256], [0, 256])

    return img, equalized, hist_original, hist_equalized

# Manual implementation
def manual_histogram_equalization(img):
    # Calculate histogram
    hist, bins = np.histogram(img.flatten(), 256, [0, 256])

    # Calculate cumulative distribution function
    cdf = hist.cumsum()

    # Normalize CDF
    cdf_normalized = cdf * 255 / cdf[-1]

    # Apply transformation
    equalized = np.interp(img.flatten(), bins[:-1], cdf_normalized)
    equalized = equalized.reshape(img.shape).astype(np.uint8)

    return equalized

# Adaptive Histogram Equalization (CLAHE)
def adaptive_histogram_equalization(image_path, clip_limit=2.0, tile_size=(8, 8)):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Create CLAHE object
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_size)

    # Apply CLAHE
    clahe_img = clahe.apply(img)

    return img, clahe_img

# Color image histogram equalization
def color_histogram_equalization(image_path):
    img = cv2.imread(image_path)

    # Method 1: Convert to YUV and equalize the luminance (Y) channel
    img_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img_yuv[:, :, 0] = cv2.equalizeHist(img_yuv[:, :, 0])  # Equalize Y channel
    img_eq_yuv = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2BGR)

    # Method 2: Convert to HSV and equalize the value (V) channel
    img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    img_hsv[:, :, 2] = cv2.equalizeHist(img_hsv[:, :, 2])  # Equalize V channel
    img_eq_hsv = cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR)

    return img, img_eq_yuv, img_eq_hsv

Visualization Code

python
def visualize_histogram_equalization(image_path):
    img, equalized, hist_orig, hist_eq = histogram_equalization(image_path)

    plt.figure(figsize=(15, 10))

    # Original image
    plt.subplot(2, 3, 1)
    plt.imshow(img, cmap='gray')
    plt.title('Original Image')
    plt.axis('off')

    # Equalized image
    plt.subplot(2, 3, 2)
    plt.imshow(equalized, cmap='gray')
    plt.title('Equalized Image')
    plt.axis('off')

    # Original histogram
    plt.subplot(2, 3, 4)
    plt.plot(hist_orig)
    plt.title('Original Histogram')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')

    # Equalized histogram
    plt.subplot(2, 3, 5)
    plt.plot(hist_eq)
    plt.title('Equalized Histogram')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')

    # CDF comparison
    plt.subplot(2, 3, 3)
    cdf_orig = hist_orig.cumsum()
    cdf_eq = hist_eq.cumsum()
    plt.plot(cdf_orig / cdf_orig.max(), label='Original CDF')
    plt.plot(cdf_eq / cdf_eq.max(), label='Equalized CDF')
    plt.title('Cumulative Distribution Functions')
    plt.legend()

    plt.tight_layout()
    plt.show()

Q&A: Histogram Equalization


Q1: When might histogram equalization make an image worse? A: When the image already has good
contrast or when important details are in specific intensity ranges. Over-equalization can create artifacts
and unnatural appearance.

Q2: What's the difference between global and adaptive histogram equalization (CLAHE)? A: Global
applies single transformation to entire image. CLAHE divides image into tiles and applies localized
equalization, preventing over-amplification and preserving local details.

Q3: Why do we equalize only the luminance channel in color images? A: Equalizing color channels
separately can cause color shifts. Working with luminance (Y in YUV or V in HSV) preserves color
information while improving contrast.

Interview Questions & Practice Tasks

Technical Interview Questions

Edge Detection

1. Q: Explain the mathematical foundation of edge detection. A: Edges correspond to high
gradients in image intensity. We use first-order derivatives (gradient) or second-order derivatives
(Laplacian) to detect rapid intensity changes.

2. Q: How would you optimize edge detection for real-time applications? A: Use simpler operators
(Sobel instead of Canny), reduce image resolution, use integral images for fast convolution, or
implement GPU-based processing (see the sketch after this list).

3. Q: What causes false edges in edge detection? A: Noise, compression artifacts, illumination
changes, texture patterns, and inappropriate threshold values. Solutions include preprocessing with
Gaussian blur, proper threshold tuning, and using robust operators like Canny.
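
As a concrete illustration of question 2, a sketch of a real-time-oriented pipeline that downscales, runs Sobel, and upscales the result (the scale factor and threshold are illustrative assumptions; cv2/np imports as above):

python
def fast_edges(img, scale=0.5, thresh=60):
    # Downscale to cut work roughly by scale^2
    small = cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)

    # Cheap gradient magnitude via Sobel
    gx = cv2.Sobel(small, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(small, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)

    # Threshold and restore original resolution
    edges = (mag > thresh).astype(np.uint8) * 255
    return cv2.resize(edges, (img.shape[1], img.shape[0]),
                      interpolation=cv2.INTER_NEAREST)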

Thresholding

4. Q: How do you handle images with non-uniform illumination? A: Use adaptive thresholding,
apply illumination correction (background subtraction), or use local normalization techniques before
global thresholding (see the sketch after this list).

5. Q: Explain the mathematical principle behind Otsu's method. A: Otsu's method finds the
threshold that minimizes intra-class variance (or maximizes inter-class variance) by treating
thresholding as a classification problem with two classes.

6. Q: When would you use multiple thresholds instead of binary thresholding? A: For multi-class
segmentation, when objects have distinct intensity ranges, or for creating hierarchical segmentation
(e.g., different tissue types in medical imaging).
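
For question 4, one common illumination-correction recipe estimates the background with a large morphological closing and divides it out before global thresholding. A sketch assuming dark foreground on a bright background (the 51-pixel kernel is an illustrative choice):

python
def correct_illumination(img, kernel_size=51):
    # Estimate the slowly varying background (closing removes dark foreground)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
    background = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

    # Divide out the background, rescaling to the full 0-255 range
    normalized = cv2.divide(img, background, scale=255)

    # Global Otsu now works despite the original uneven lighting
    _, binary = cv2.threshold(normalized, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary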

Contour Detection
7. Q: How do you handle nested contours and holes in objects? A: Use cv2.RETR_TREE or
cv2.RETR_CCOMP to capture hierarchy information. Analyze the hierarchy array to distinguish
between outer contours, holes, and nested objects.
8. Q: What's the computational complexity of contour detection? A: O(n) where n is the number of
pixels in the binary image, as each pixel is visited once during the border-following algorithm.
9. Q: How do you match contours between different images? A: Use shape descriptors like Hu
moments, contour area, perimeter ratios, Fourier descriptors, or shape context matching (see the sketch below).
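
For question 9, OpenCV exposes Hu-moment-based matching directly through cv2.matchShapes; a minimal sketch:

python
def contour_distance(c1, c2):
    # Lower score means more similar shapes (log-scaled Hu moment comparison)
    return cv2.matchShapes(c1, c2, cv2.CONTOURS_MATCH_I1, 0.0)

def hu_signature(contour):
    # Raw Hu moments, log-transformed so magnitudes are comparable
    hu = cv2.HuMoments(cv2.moments(contour)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)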

Morphological Operations

10. Q: How do you design a morphological operation for a specific noise pattern? A: Analyze the
noise characteristics (size, shape) and design structuring elements that are larger than noise but
smaller than objects of interest.
11. Q: Explain the duality between erosion and dilation. A: Erosion of the foreground equals dilation of
the background with a reflected structuring element. This duality is fundamental to morphological theory
(see the sketch after this list).

12. Q: How do you preserve important features while removing noise? A: Use size-appropriate
structuring elements, combine multiple operations (opening followed by closing), or use conditional
morphology based on feature properties.
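
The duality in question 11 can be verified numerically. A self-contained sketch using a symmetric kernel (so the reflection is a no-op) and a synthetic binary image:

python
import cv2
import numpy as np

# Synthetic binary image: a white square on black
binary = np.zeros((100, 100), np.uint8)
binary[30:70, 30:70] = 255

kernel = np.ones((3, 3), np.uint8)  # symmetric, so reflection changes nothing

# Erosion of the foreground equals the complement of dilating the background
eroded = cv2.erode(binary, kernel)
dilated_bg = cv2.dilate(cv2.bitwise_not(binary), kernel)
assert np.array_equal(eroded, cv2.bitwise_not(dilated_bg))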

Histogram Processing

13. Q: Why might CLAHE be preferred over global histogram equalization? A: CLAHE prevents over-
amplification of noise, preserves local contrast, and avoids the "washed out" appearance that global
equalization can create.
14. Q: How do you evaluate the quality of contrast enhancement? A: Use metrics like contrast
improvement index, edge preservation metrics, structural similarity (SSIM), or task-specific
performance measures.

15. Q: Explain histogram matching vs. histogram equalization. A: Histogram equalization creates a
uniform distribution. Histogram matching transforms one image's histogram to match a reference
histogram, allowing more controlled enhancement (see the sketch below).
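
Histogram matching itself can be sketched with the classic CDF-lookup construction; a minimal grayscale version (not an OpenCV built-in):

python
def match_histogram(source, reference):
    # CDFs of both images
    src_hist = np.histogram(source.flatten(), 256, [0, 256])[0]
    ref_hist = np.histogram(reference.flatten(), 256, [0, 256])[0]
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size

    # For each source level, take the first reference level whose CDF reaches it
    lut = np.searchsorted(ref_cdf, src_cdf).clip(0, 255).astype(np.uint8)
    return lut[source]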

Coding Challenges

Challenge 1: Robust Edge Detection

python
def robust_edge_detection(image_path, noise_level='medium'):
    """
    Implement edge detection pipeline robust to different noise levels

    Args:
        image_path: Path to input image
        noise_level: 'low', 'medium', 'high'

    Returns:
        Processed edge image
    """
    # Your implementation here
    pass

# Test cases:
# - Clean synthetic image
# - Natural image with texture
# - Noisy image (add Gaussian noise)

Challenge 2: Adaptive Thresholding Comparison

python

def compare_thresholding_methods(image_path):
    """
    Compare different thresholding methods and return performance metrics

    Returns:
        Dictionary with results from different methods and quality metrics
    """
    # Your implementation here
    pass

# Metrics to implement:
# - Processing time
# - Number of connected components
# - Foreground/background ratio
# - Visual quality assessment

Challenge 3: Shape Analysis Pipeline

python
def analyze_shapes(image_path):
    """
    Complete pipeline for shape detection, classification, and measurement

    Returns:
        List of detected shapes with properties and classifications
    """
    # Your implementation here
    pass

# Requirements:
# - Detect and classify shapes (circle, rectangle, triangle, etc.)
# - Measure dimensions
# - Handle overlapping shapes
# - Return confidence scores

Challenge 4: Morphological Noise Removal

python

def intelligent_noise_removal(image_path, noise_type='salt_pepper'):
    """
    Adaptive noise removal using morphological operations

    Args:
        noise_type: 'salt_pepper', 'gaussian', 'speckle'

    Returns:
        Cleaned image
    """
    # Your implementation here
    pass

# Test with different noise types and intensities

Challenge 5: Contrast Enhancement Evaluation

python
def optimal_contrast_enhancement(image_path):
    """
    Find optimal contrast enhancement parameters using quality metrics

    Returns:
        Enhanced image and optimal parameters
    """
    # Your implementation here
    pass

# Implement multiple quality metrics:
# - Entropy
# - Average gradient
# - Edge density
# - Local contrast

Practical Applications

Application 1: Document Processing

Problem: Enhance scanned documents with varying illumination
Solution: Combine adaptive thresholding with morphological operations
Key considerations: Preserve text while removing noise and shadows

Application 2: Quality Control in Manufacturing

Problem: Detect defects in manufactured parts
Solution: Edge detection + contour analysis for shape verification
Key considerations: Handle varying lighting conditions and part orientations

Application 3: Medical Image Preprocessing

Problem: Enhance X-ray images for better diagnosis
Solution: CLAHE + selective morphological operations
Key considerations: Preserve diagnostic information while enhancing visibility

Application 4: Biometric Systems

Problem: Preprocess fingerprint images for recognition
Solution: Histogram equalization + ridge enhancement + noise removal
Key considerations: Maintain ridge patterns while removing artifacts

Performance Optimization Tips

Memory Optimization

python
# Use appropriate data types
img_uint8 = np.uint8(img)      # For display
img_float32 = np.float32(img)  # For calculations

# In-place operations when possible
cv2.equalizeHist(img, img)  # Modifies img directly

# Release memory for large images
del large_image_array

Speed Optimization

python
# Use OpenCV optimized functions instead of manual loops
# Vectorized operations with NumPy
# Consider image scaling for real-time applications
# Use multithreading for independent operations
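
A rough illustration of the first two points: timing a Python-loop threshold against its vectorized equivalent (exact numbers depend on hardware; expect orders of magnitude):

python
import time
import numpy as np

img = np.random.randint(0, 256, (1000, 1000), dtype=np.uint8)

# Manual loop
t0 = time.perf_counter()
out_loop = np.zeros_like(img)
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        out_loop[i, j] = 255 if img[i, j] > 127 else 0
t_loop = time.perf_counter() - t0

# Vectorized NumPy
t0 = time.perf_counter()
out_vec = np.where(img > 127, 255, 0).astype(np.uint8)
t_vec = time.perf_counter() - t0

assert np.array_equal(out_loop, out_vec)
print(f"loop: {t_loop:.3f}s, vectorized: {t_vec:.5f}s")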

Real-World Implementation Considerations

Preprocessing Pipeline Design

1. Input validation: Check image format, size, bit depth
2. Noise assessment: Determine appropriate filtering
3. Parameter selection: Adaptive parameter tuning based on image characteristics
4. Quality control: Validate output quality
5. Error handling: Robust error handling and fallback strategies
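
A skeleton of such a pipeline, with each numbered stage marked (the variance-of-Laplacian noise proxy and all thresholds are illustrative assumptions):

python
def preprocess(image_path):
    # 1. Input validation
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")

    # 2. Noise assessment: variance of the Laplacian as a crude proxy
    noise_score = cv2.Laplacian(img, cv2.CV_64F).var()

    # 3. Parameter selection: denoise more aggressively on noisier inputs
    ksize = 5 if noise_score > 500 else 3  # cutoff is an illustrative assumption
    denoised = cv2.GaussianBlur(img, (ksize, ksize), 0)

    # 4. Quality control: fall back to the original if blurring flattened the image
    if denoised.std() < 1.0:
        denoised = img

    # 5. Error handling: covered by the raise above and the fallback here
    return denoised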

Parameter Tuning Guidelines

Edge detection: Start with standard parameters, adjust based on noise level
Thresholding: Use Otsu as baseline, switch to adaptive for uneven illumination
Morphological operations: Match structuring element size to feature scale
Histogram equalization: Monitor for over-enhancement artifacts

Common Pitfalls and Solutions

Edge Detection Pitfalls

Problem: Too many false edges from noise
Solution: Proper preprocessing with Gaussian blur

Problem: Missing weak but important edges
Solution: Adjust Canny thresholds or use multi-scale edge detection

Thresholding Pitfalls

Problem: Loss of important details in shadows
Solution: Use adaptive thresholding or illumination correction

Problem: Sensitivity to parameter selection
Solution: Implement automatic parameter selection based on image statistics

Morphological Operations Pitfalls

Problem: Loss of important small features
Solution: Use appropriate structuring element size and shape

Problem: Insufficient noise removal
Solution: Combine multiple operations or use iterative processing

Histogram Processing Pitfalls

Problem: Over-enhancement creating artifacts
Solution: Use CLAHE with appropriate clip limit

Problem: Color distortion in color images
Solution: Work in appropriate color space (YUV, LAB)

Testing and Validation

Test Image Categories

1. Synthetic images: Perfect geometric shapes, known ground truth
2. Natural images: Real-world scenarios with varying conditions
3. Noisy images: Add controlled noise for robustness testing
4. Edge cases: Extreme lighting, high contrast, low contrast

Performance Metrics

Accuracy: Compare with ground truth annotations
Processing time: Measure computational efficiency
Memory usage: Monitor resource consumption
Robustness: Test with various image types and conditions

Conclusion

Classical computer vision algorithms form the foundation for understanding image processing and
computer vision. While deep learning has revolutionized many applications, these fundamental
techniques remain essential for:

Preprocessing steps in modern pipelines
Real-time applications with limited computational resources
Understanding and debugging complex vision systems
Applications requiring interpretable and controllable processing

Mastering these algorithms provides the conceptual foundation necessary for advanced computer vision
work and enables you to make informed decisions about when to use classical methods versus modern
deep learning approaches.

The key to success with classical computer vision is understanding the underlying mathematical
principles, knowing when to apply each technique, and having the practical skills to tune parameters and
combine methods effectively for robust real-world applications.
