PDF 1 - Classical Computer Vision Algorithms
Table of Contents
1. Edge Detection
2. Thresholding
3. Contour Detection
4. Morphological Operations
5. Histogram Equalization
6. Interview Questions & Practice Tasks
Edge Detection
Overview
Edge detection is fundamental in computer vision for identifying boundaries between objects and
regions in images. Edges represent significant local changes in intensity.
Concept
The Sobel operator uses two 3×3 convolution kernels, one for the horizontal gradient (responding to vertical edges) and one for the vertical gradient (responding to horizontal edges):

Gx (horizontal gradient):
-1  0  1
-2  0  2
-1  0  1

Gy (vertical gradient):
-1 -2 -1
 0  0  0
 1  2  1
Code Implementation
python
import cv2
import numpy as np
import matplotlib.pyplot as plt
def sobel_edge_detection(image_path):
    # Read image in grayscale
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Gradients in the x and y directions
    sobel_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    sobel_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
    # Gradient magnitude, converted back to 8-bit for display
    magnitude = cv2.magnitude(sobel_x, sobel_y)
    return cv2.convertScaleAbs(magnitude)
Concept
Canny is a multi-stage edge detector: Gaussian smoothing, gradient computation, non-maximum suppression, double thresholding, and edge tracking by hysteresis. It produces thin, well-localized edges.
Code Implementation
python
def canny_edge_detection(image_path, low_threshold=50, high_threshold=150):
    # Read image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Smooth first to suppress noise-induced gradients
    blurred = cv2.GaussianBlur(img, (5, 5), 0)
    # cv2.Canny performs gradient computation, non-max suppression,
    # double thresholding, and hysteresis tracking internally
    edges = cv2.Canny(blurred, low_threshold, high_threshold)
    return edges
Diagram Concept
Original Image → Gaussian Blur → Gradient Calculation → Non-max Suppression → Double Threshold → Edge Tracking → Final Edges
Q&A: Edge Detection
Q2: Why do we use Gaussian blur before edge detection? A: Gaussian blur reduces noise that could
cause false edges. It smooths the image while preserving important edge information.
Q3: What happens if Canny thresholds are too high/low? A: Too high: Miss weak but important edges.
Too low: Include noise as edges. The ratio should typically be 2:1 or 3:1 (high:low).
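When manual tuning is impractical, the thresholds can be derived from image statistics. Below is a minimal sketch of the common median-based heuristic; the function name and the sigma default are illustrative assumptions, not from the original text:
python
def auto_canny(img, sigma=0.33):
    # Derive the hysteresis thresholds from the median intensity
    v = np.median(img)
    low = int(max(0, (1.0 - sigma) * v))
    high = int(min(255, (1.0 + sigma) * v))
    return cv2.Canny(img, low, high)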
Thresholding
Overview
Thresholding converts grayscale images to binary images by separating pixels into foreground and
background based on intensity values.
Global Thresholding
Concept
A single threshold value is applied to every pixel: intensities above the threshold map to the maximum value (foreground), the rest to zero (background).
Code Implementation
python
def global_thresholding(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    threshold_value = 127
    max_value = 255
    # Manual thresholding
    binary_manual = np.where(img > threshold_value, 255, 0).astype(np.uint8)
    # OpenCV thresholding variants
    _, binary = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_BINARY)
    _, binary_inv = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_BINARY_INV)
    _, trunc = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TRUNC)
    _, tozero = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TOZERO)
    _, tozero_inv = cv2.threshold(img, threshold_value, max_value, cv2.THRESH_TOZERO_INV)
    return {
        'original': img,
        'binary': binary,
        'binary_inv': binary_inv,
        'trunc': trunc,
        'tozero': tozero,
        'tozero_inv': tozero_inv
    }
Adaptive Thresholding
Concept
Calculates threshold locally for each pixel based on neighborhood statistics. Better for images with
varying illumination.
Code Implementation
python
def adaptive_thresholding(image_path, block_size=11, C=2):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Local threshold = mean of the block_size x block_size neighborhood minus C
    local_mean = cv2.blur(img.astype(np.float32), (block_size, block_size))
    binary = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            threshold = local_mean[i, j] - C
            # Apply threshold
            binary[i, j] = 255 if img[i, j] > threshold else 0
    return binary
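For comparison, OpenCV's built-in adaptive thresholding computes the same local statistics far more efficiently; block size and C map directly onto the parameters above. A minimal sketch (the filename is a placeholder):
python
img = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)
# Mean-based and Gaussian-weighted local thresholds
mean_binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                    cv2.THRESH_BINARY, 11, 2)
gauss_binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 11, 2)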
Otsu's Thresholding
Concept
Automatically finds optimal threshold by minimizing intra-class variance (or maximizing inter-class
variance). Assumes bimodal histogram.
Code Implementation
python
def otsu_thresholding(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Otsu's thresholding
    ret, otsu_binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return ret, otsu_binary

def manual_otsu(img):
    hist, _ = np.histogram(img.flatten(), 256, [0, 256])
    total_pixels = img.size
    best_threshold = 0
    max_variance = 0
    for threshold in range(1, 256):
        # Class weights (fraction of pixels in each class)
        w0 = np.sum(hist[:threshold]) / total_pixels
        w1 = np.sum(hist[threshold:]) / total_pixels
        if w0 == 0 or w1 == 0:
            continue
        # Class means
        mu0 = np.sum([i * hist[i] for i in range(threshold)]) / (w0 * total_pixels)
        mu1 = np.sum([i * hist[i] for i in range(threshold, 256)]) / (w1 * total_pixels)
        # Inter-class variance
        variance = w0 * w1 * (mu0 - mu1) ** 2
        if variance > max_variance:
            max_variance = variance
            best_threshold = threshold
    return best_threshold
Q&A: Thresholding
Q1: When should you use adaptive vs global thresholding? A: Use adaptive thresholding for images
with uneven illumination or varying lighting conditions. Use global thresholding for uniformly lit images
with clear bimodal histograms.
Q2: What's the advantage of Otsu's method? A: Otsu's method automatically finds the optimal
threshold without manual tuning, making it robust for images with bimodal intensity distributions.
Q3: What are the parameters in adaptive thresholding? A: Block size (neighborhood size for local
threshold calculation) and C (constant subtracted from mean/weighted mean).
Contour Detection
Overview
Contours are curves joining continuous points of the same color or intensity along a boundary, representing object outlines.
Concept
Contours are detected on a binarized image using a border-following algorithm; OpenCV uses the Suzuki-Abe algorithm for contour detection.
Code Implementation
python
def contour_detection(image_path):
    # Read image and convert to grayscale
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Preprocessing - thresholding
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    # Find contours
    contours, hierarchy = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    # Draw contours
    img_contours = img.copy()
    cv2.drawContours(img_contours, contours, -1, (0, 255, 0), 2)
    return img_contours, contours, hierarchy
def contour_analysis(contours):
    """Analyze contour properties"""
    results = []
    for i, contour in enumerate(contours):
        # Calculate area
        area = cv2.contourArea(contour)
        # Calculate perimeter
        perimeter = cv2.arcLength(contour, True)
        # Approximate contour
        epsilon = 0.02 * perimeter
        approx = cv2.approxPolyDP(contour, epsilon, True)
        # Bounding rectangle
        x, y, w, h = cv2.boundingRect(contour)
        # Convex hull
        hull = cv2.convexHull(contour)
        # Solidity (area/convex_area)
        hull_area = cv2.contourArea(hull)
        solidity = area / hull_area if hull_area > 0 else 0
        # Aspect ratio
        aspect_ratio = w / h
        results.append({
            'contour_id': i,
            'area': area,
            'perimeter': perimeter,
            'vertices': len(approx),
            'bounding_box': (x, y, w, h),
            'solidity': solidity,
            'aspect_ratio': aspect_ratio
        })
    return results
python
def advanced_contour_operations(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    result_img = img.copy()
    return result_img
Q&A: Contour Detection
Q3: How do you distinguish between different shapes using contours? A: Use contour properties like
area, perimeter, aspect ratio, solidity, and number of vertices after approximation.
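As a concrete illustration of that answer, a simple classifier can branch on the vertex count of the polygonal approximation. This sketch and its cutoff values (0.95-1.05 aspect ratio, 0.8 circularity) are illustrative assumptions:
python
def classify_shape(contour):
    perimeter = cv2.arcLength(contour, True)
    approx = cv2.approxPolyDP(contour, 0.02 * perimeter, True)
    vertices = len(approx)
    if vertices == 3:
        return 'triangle'
    if vertices == 4:
        # Distinguish square from rectangle by aspect ratio
        x, y, w, h = cv2.boundingRect(approx)
        return 'square' if 0.95 <= w / h <= 1.05 else 'rectangle'
    # Many vertices with high circularity suggests a circle
    area = cv2.contourArea(contour)
    circularity = 4 * np.pi * area / (perimeter ** 2) if perimeter > 0 else 0
    return 'circle' if circularity > 0.8 else 'polygon'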
Morphological Operations
Overview
Morphological operations process images based on shapes using a structuring element (kernel). Primary
operations are erosion and dilation.
Basic Operations
Erosion: shrinks foreground regions; a pixel stays foreground only if the structuring element fits entirely inside the foreground around it. Removes small white noise and thins objects.
Dilation: grows foreground regions; a pixel becomes foreground if the structuring element overlaps any foreground pixel. Fills small holes and thickens objects.
Code Implementation
python
def morphological_operations(image_path):
    # Read and preprocess image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
    # Structuring element
    kernel = np.ones((5, 5), np.uint8)
    # Basic operations
    erosion = cv2.erode(binary, kernel, iterations=1)
    dilation = cv2.dilate(binary, kernel, iterations=1)
    # Compound operations
    opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    gradient = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)
    tophat = cv2.morphologyEx(binary, cv2.MORPH_TOPHAT, kernel)
    blackhat = cv2.morphologyEx(binary, cv2.MORPH_BLACKHAT, kernel)
    return {
        'original': img,
        'binary': binary,
        'erosion': erosion,
        'dilation': dilation,
        'opening': opening,
        'closing': closing,
        'gradient': gradient,
        'tophat': tophat,
        'blackhat': blackhat
    }
# Manual implementation
def manual_erosion(img, kernel):
    rows, cols = img.shape
    k_rows, k_cols = kernel.shape
    offset_r, offset_c = k_rows // 2, k_cols // 2
    eroded = np.zeros_like(img)
    for r in range(offset_r, rows - offset_r):
        for c in range(offset_c, cols - offset_c):
            region = img[r - offset_r:r + offset_r + 1, c - offset_c:c + offset_c + 1]
            # Pixel survives only if the kernel fits inside the foreground
            if np.all(region[kernel == 1]):
                eroded[r, c] = 255
    return eroded

def manual_dilation(img, kernel):
    # By duality: dilating the foreground equals eroding the background
    dilated = 255 - manual_erosion(255 - img, kernel)
    return dilated
python
def noise_removal_pipeline(image_path):
    """Complete pipeline for noise removal using morphological operations"""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    # Opening removes small specks, closing fills small holes
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
    return cleaned
def create_custom_kernels():
    """Create different structuring elements"""
    kernels = {
        'rectangular': cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5)),
        'elliptical': cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)),
        'cross': cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5)),
        'custom_plus': np.array([[0, 1, 0],
                                 [1, 1, 1],
                                 [0, 1, 0]], dtype=np.uint8),
        'diamond': np.array([[0, 0, 1, 0, 0],
                             [0, 1, 1, 1, 0],
                             [1, 1, 1, 1, 1],
                             [0, 1, 1, 1, 0],
                             [0, 0, 1, 0, 0]], dtype=np.uint8)
    }
    return kernels
Operation Descriptions
Opening = Erosion followed by Dilation (removes small objects and specks)
Closing = Dilation followed by Erosion (fills small holes and gaps)
Gradient = Dilation - Erosion (extracts object outlines)
Top Hat = Original - Opening (highlights bright details smaller than the kernel)
Black Hat = Closing - Original (highlights dark details smaller than the kernel)
Q&A: Morphological Operations
Q2: How do you choose the right structuring element? A: Choose based on the shape of features you
want to preserve or remove. Circular kernels for general use, rectangular for specific directional
operations.
Q3: What's the difference between opening and closing? A: Opening removes small objects and
separates connected ones (erosion first). Closing fills gaps and connects nearby objects (dilation first).
Histogram Equalization
Overview
Histogram equalization improves image contrast by redistributing pixel intensities to utilize the full
dynamic range.
Concept
Transforms the image so that its histogram becomes approximately uniform, enhancing contrast
especially in low-contrast images.
Code Implementation
python
def histogram_equalization(image_path):
    # Read image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Equalize
    equalized = cv2.equalizeHist(img)
    # Calculate histograms
    hist_original = cv2.calcHist([img], [0], None, [256], [0, 256])
    hist_equalized = cv2.calcHist([equalized], [0], None, [256], [0, 256])
    return img, equalized, hist_original, hist_equalized
# Manual implementation
def manual_histogram_equalization(img):
    # Calculate histogram
    hist, bins = np.histogram(img.flatten(), 256, [0, 256])
    # Cumulative distribution function
    cdf = hist.cumsum()
    # Normalize CDF to the 0-255 range
    cdf_normalized = cdf * 255 / cdf[-1]
    # Apply transformation
    equalized = np.interp(img.flatten(), bins[:-1], cdf_normalized)
    equalized = equalized.reshape(img.shape).astype(np.uint8)
    return equalized
# Contrast Limited Adaptive Histogram Equalization (CLAHE)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
# Apply CLAHE
clahe_img = clahe.apply(img)
Visualization Code
python
def visualize_histogram_equalization(image_path):
    img, equalized, hist_orig, hist_eq = histogram_equalization(image_path)
    plt.figure(figsize=(15, 10))
    # Original image
    plt.subplot(2, 3, 1)
    plt.imshow(img, cmap='gray')
    plt.title('Original Image')
    plt.axis('off')
    # Equalized image
    plt.subplot(2, 3, 2)
    plt.imshow(equalized, cmap='gray')
    plt.title('Equalized Image')
    plt.axis('off')
    # Original histogram
    plt.subplot(2, 3, 4)
    plt.plot(hist_orig)
    plt.title('Original Histogram')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')
    # Equalized histogram
    plt.subplot(2, 3, 5)
    plt.plot(hist_eq)
    plt.title('Equalized Histogram')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')
    # CDF comparison
    plt.subplot(2, 3, 3)
    cdf_orig = hist_orig.cumsum()
    cdf_eq = hist_eq.cumsum()
    plt.plot(cdf_orig / cdf_orig.max(), label='Original CDF')
    plt.plot(cdf_eq / cdf_eq.max(), label='Equalized CDF')
    plt.title('Cumulative Distribution Functions')
    plt.legend()
    plt.tight_layout()
    plt.show()
Q&A: Histogram Equalization
Q2: What's the difference between global and adaptive histogram equalization (CLAHE)? A: Global
applies single transformation to entire image. CLAHE divides image into tiles and applies localized
equalization, preventing over-amplification and preserving local details.
Q3: Why do we equalize only the luminance channel in color images? A: Equalizing color channels
separately can cause color shifts. Working with luminance (Y in YUV or V in HSV) preserves color
information while improving contrast.
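A minimal sketch of that approach, assuming a BGR image as OpenCV loads by default (the filename is a placeholder):
python
img = cv2.imread('photo.jpg')  # hypothetical input file
# Convert to YUV and equalize only the luminance (Y) channel
yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
yuv[:, :, 0] = cv2.equalizeHist(yuv[:, :, 0])
result = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR)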
Interview Questions & Practice Tasks
Edge Detection
2. Q: How would you optimize edge detection for real-time applications? A: Use simpler operators
(Sobel instead of Canny), reduce image resolution, use integral images for fast convolution, or
implement GPU-based processing.
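One way to combine those ideas, sketched under the assumption that a fixed downscale factor is acceptable for the application (the function name and scale are illustrative):
python
def fast_edges(frame, scale=0.5):
    # Process at reduced resolution, then upsample the edge map
    small = cv2.resize(frame, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    # Sobel magnitude is cheaper than the full Canny pipeline
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))
    return cv2.resize(edges, (frame.shape[1], frame.shape[0]))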
3. Q: What causes false edges in edge detection? A: Noise, compression artifacts, illumination
changes, texture patterns, and inappropriate threshold values. Solutions include preprocessing with
Gaussian blur, proper threshold tuning, and using robust operators like Canny.
Thresholding
4. Q: How do you handle images with non-uniform illumination? A: Use adaptive thresholding,
apply illumination correction (background subtraction), or use local normalization techniques before
global thresholding.
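A sketch of the background-subtraction idea, assuming an 8-bit grayscale image whose background varies slowly relative to the objects (the kernel size is an illustrative assumption):
python
def correct_illumination(img, kernel_size=51):
    # Estimate the slowly varying background with a large median blur
    background = cv2.medianBlur(img, kernel_size)
    # Divide out the background, rescaling to the full 8-bit range
    corrected = cv2.divide(img, background, scale=255)
    # A global method like Otsu now works despite the uneven lighting
    _, binary = cv2.threshold(corrected, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary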
5. Q: Explain the mathematical principle behind Otsu's method. A: Otsu's method finds the
threshold that minimizes intra-class variance (or maximizes inter-class variance) by treating
thresholding as a classification problem with two classes.
6. Q: When would you use multiple thresholds instead of binary thresholding? A: For multi-class
segmentation, when objects have distinct intensity ranges, or for creating hierarchical segmentation
(e.g., different tissue types in medical imaging).
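For instance, two thresholds split a grayscale image into three classes. The cut points 85 and 170 below are arbitrary illustrative values, and img is assumed to be a grayscale uint8 array:
python
t1, t2 = 85, 170
segmented = np.zeros_like(img)
segmented[(img > t1) & (img <= t2)] = 128  # mid-intensity class
segmented[img > t2] = 255                  # bright class; dark class stays 0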
Contour Detection
7. Q: How do you handle nested contours and holes in objects? A: Use cv2.RETR_TREE or
cv2.RETR_CCOMP to capture hierarchy information. Analyze the hierarchy array to distinguish
between outer contours, holes, and nested objects.
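A sketch of reading that hierarchy, assuming binary is a thresholded image: with cv2.RETR_CCOMP each hierarchy entry is [next, previous, first_child, parent], and parent == -1 marks an outer boundary:
python
contours, hierarchy = cv2.findContours(binary, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
for i, contour in enumerate(contours):
    parent = hierarchy[0][i][3]
    if parent == -1:
        print(f'contour {i}: outer boundary')
    else:
        print(f'contour {i}: hole inside contour {parent}')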
8. Q: What's the computational complexity of contour detection? A: O(n) where n is the number of
pixels in the binary image, as each pixel is visited once during the border-following algorithm.
9. Q: How do you match contours between different images? A: Use shape descriptors like Hu
moments, contour area, perimeter ratios, Fourier descriptors, or shape context matching.
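OpenCV exposes Hu-moment comparison directly via cv2.matchShapes, where a lower score means more similar shapes. A minimal sketch, assuming c1 and c2 are contours extracted from two images:
python
# Compare two contours using a Hu-moment-based distance (lower = more similar)
score = cv2.matchShapes(c1, c2, cv2.CONTOURS_MATCH_I1, 0)
# The raw Hu moments are also available for custom descriptors
hu = cv2.HuMoments(cv2.moments(c1)).flatten()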
Morphological Operations
10. Q: How do you design a morphological operation for a specific noise pattern? A: Analyze the
noise characteristics (size, shape) and design structuring elements that are larger than noise but
smaller than objects of interest.
11. Q: Explain the duality between erosion and dilation. A: Erosion of foreground equals dilation of
background with reflected structuring element. This duality is fundamental to morphological theory.
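The duality can be checked numerically. For a symmetric kernel the reflected structuring element equals the kernel itself, so (assuming binary is a 0/255 image, and up to OpenCV's default border handling):
python
kernel = np.ones((3, 3), np.uint8)  # symmetric, so its reflection is itself
fg_dilated = cv2.dilate(binary, kernel)
bg_eroded = 255 - cv2.erode(255 - binary, kernel)
assert np.array_equal(fg_dilated, bg_eroded)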
12. Q: How do you preserve important features while removing noise? A: Use size-appropriate
structuring elements, combine multiple operations (opening followed by closing), or use conditional
morphology based on feature properties.
Histogram Processing
13. Q: Why might CLAHE be preferred over global histogram equalization? A: CLAHE prevents over-
amplification of noise, preserves local contrast, and avoids the "washed out" appearance that global
equalization can create.
14. Q: How do you evaluate the quality of contrast enhancement? A: Use metrics like contrast
improvement index, edge preservation metrics, structural similarity (SSIM), or task-specific
performance measures.
15. Q: Explain histogram matching vs. histogram equalization. A: Histogram equalization creates
uniform distribution. Histogram matching transforms one image's histogram to match a reference
histogram, allowing more controlled enhancement.
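A sketch of histogram matching via CDF lookup, assuming 8-bit grayscale source and reference arrays (the function name is illustrative):
python
def match_histograms(source, reference):
    # Normalized CDFs of both images
    s_hist, _ = np.histogram(source.flatten(), 256, [0, 256])
    r_hist, _ = np.histogram(reference.flatten(), 256, [0, 256])
    s_cdf = s_hist.cumsum() / s_hist.sum()
    r_cdf = r_hist.cumsum() / r_hist.sum()
    # Map each source level to the reference level with the closest CDF value
    mapping = np.interp(s_cdf, r_cdf, np.arange(256))
    return mapping[source].astype(np.uint8)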
Coding Challenges
python
def robust_edge_detection(image_path, noise_level='medium'):
    """
    Implement edge detection pipeline robust to different noise levels
    Args:
        image_path: Path to input image
        noise_level: 'low', 'medium', 'high'
    Returns:
        Processed edge image
    """
    # Your implementation here
    pass

# Test cases:
# - Clean synthetic image
# - Natural image with texture
# - Noisy image (add Gaussian noise)
python
def compare_thresholding_methods(image_path):
    """
    Compare different thresholding methods and return performance metrics
    Returns:
        Dictionary with results from different methods and quality metrics
    """
    # Your implementation here
    pass

# Metrics to implement:
# - Processing time
# - Number of connected components
# - Foreground/background ratio
# - Visual quality assessment
python
def analyze_shapes(image_path):
    """
    Complete pipeline for shape detection, classification, and measurement
    Returns:
        List of detected shapes with properties and classifications
    """
    # Your implementation here
    pass

# Requirements:
# - Detect and classify shapes (circle, rectangle, triangle, etc.)
# - Measure dimensions
# - Handle overlapping shapes
# - Return confidence scores
python
def morphological_noise_removal(image_path, noise_type='gaussian'):
    """
    Remove noise using appropriate morphological operations
    Args:
        image_path: Path to input image
        noise_type: 'salt_pepper', 'gaussian', 'speckle'
    Returns:
        Cleaned image
    """
    # Your implementation here
    pass
python
def optimal_contrast_enhancement(image_path):
    """
    Find optimal contrast enhancement parameters using quality metrics
    Returns:
        Enhanced image and optimal parameters
    """
    # Your implementation here
    pass
Practical Applications
Speed Optimization
Edge detection: Start with standard parameters, adjust based on noise level
Thresholding Pitfalls
Performance Metrics
Mastering these algorithms provides the conceptual foundation necessary for advanced computer vision
work and enables you to make informed decisions about when to use classical methods versus modern
deep learning approaches.
The key to success with classical computer vision is understanding the underlying mathematical
principles, knowing when to apply each technique, and having the practical skills to tune parameters and
combine methods effectively for robust real-world applications.