AL701 Computer Vision Complete Notes

The document provides comprehensive notes on Computer Vision, covering its definition, goals, and fundamental concepts such as image representation and processing. It details binary image processing techniques, color spaces, image enhancement methods, edge detection, and segmentation approaches, as well as various applications including gesture recognition and object tracking. The notes emphasize the use of deep learning models and tools like OpenCV for real-time computer vision tasks.

Uploaded by

mani manish
Copyright
© © All Rights Reserved

AL701 – Computer Vision COMPLETE NOTES (All Units + Images Included)

UNIT I – INTRODUCTION TO COMPUTER VISION

Computer Vision is a field of Artificial Intelligence that deals with the extraction, analysis,
and understanding of useful information from images and videos. It enables machines to
interpret visual data like humans. The main goals include object detection, classification,
segmentation, and scene understanding.

Diagram: Image as f(x,y) – 2D Intensity Function

Images are represented as a 2D function f(x,y), where x and y are spatial coordinates and
the value at each pixel is its intensity (or, for color images, a vector of channel values).
Common types include binary, grayscale, and color (for example, RGB) images. Image
processing involves modifying images, while Computer Vision focuses on understanding
them. Basic image operations such as resizing, cropping, rotating, contrast enhancement,
and bitwise operations help prepare images for analysis.
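
The idea of an image as f(x,y) and a few of the basic operations above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the nested-list layout (image[y][x]) and the helper names crop, rotate90_clockwise, and bitwise_and are assumptions made for this example.

```python
# A tiny 4x4 grayscale image: image[y][x] plays the role of f(x, y), values in 0..255.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]


def crop(img, top, left, height, width):
    """Return the sub-image of the given size whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def rotate90_clockwise(img):
    """Rotate by 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def bitwise_and(img, mask):
    """Pixel-wise AND: a mask value of 255 keeps the pixel, 0 clears it."""
    return [[p & m for p, m in zip(row, mask_row)]
            for row, mask_row in zip(img, mask)]


patch = crop(image, 1, 1, 2, 2)          # the central 2x2 region
rotated = rotate90_clockwise(patch)
masked = bitwise_and(patch, [[255, 0], [0, 255]])
print(patch, rotated, masked)
```

In practice a library such as OpenCV would perform these operations on arrays; the sketch only shows what each operation does to the pixel grid.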
UNIT II – BINARY IMAGE PROCESSING

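Binary image processing converts grayscale images into two-level binary images using
thresholding. Techniques include global thresholding (one fixed threshold for the whole
image), Otsu’s optimal thresholding (the threshold is chosen automatically from the
histogram), and adaptive thresholding (a separate threshold is computed for each local
neighborhood). Morphological operations refine the result: erosion shrinks foreground
regions, dilation grows them, opening (erosion followed by dilation) removes small noise,
and closing (dilation followed by erosion) fills small holes.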
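
Otsu's thresholding can be sketched as a search over all candidate thresholds for the one that maximizes the between-class variance of the two resulting pixel groups. The function name otsu_threshold, the flat list of 8-bit pixels, and the convention that values greater than the threshold become foreground are assumptions of this sketch.

```python
def otsu_threshold(pixels):
    """Return the threshold t in 0..255 that maximizes between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of background pixels so far
    sum_bg = 0  # sum of background intensities so far
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no background pixels yet
        if w_fg == 0:
            break     # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


pixels = [10, 12, 11, 13, 200, 210, 205, 198]
t = otsu_threshold(pixels)
binary = [1 if p > t else 0 for p in pixels]
print(t, binary)
```

Global thresholding corresponds to fixing t by hand instead of calling otsu_threshold.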

Diagram: Morphological Operations – Erosion & Dilation
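
Erosion and dilation can be sketched on a small binary image with a 3x3 square structuring element. The helper names and the choice to ignore neighbors that fall outside the image are assumptions of this example; other border rules are possible.

```python
def _apply(img, combine):
    """Combine the in-bounds 3x3 neighborhood of every pixel with `combine`."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if 0 <= y + dy < h and 0 <= x + dx < w
            ]
            out[y][x] = combine(window)
    return out


def erode(img):
    """A pixel stays 1 only if every neighbor in the window is 1."""
    return _apply(img, min)


def dilate(img):
    """A pixel becomes 1 if any neighbor in the window is 1."""
    return _apply(img, max)


def opening(img):
    """Erosion followed by dilation: removes small foreground specks."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(img))


# A 3x3 object plus one isolated noise pixel in the top-left corner.
noisy = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = opening(noisy)
for row in cleaned:
    print(row)
```

Here erosion reduces the object to its center pixel and deletes the noise pixel, and the following dilation restores the object to its original 3x3 size.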

Connected Component Analysis (CCA) labels distinct objects using 4-connectivity or
8-connectivity. Contour analysis extracts shape boundaries, useful for measuring area,
perimeter, and shape classification.
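
The difference between 4-connectivity and 8-connectivity can be illustrated with a small breadth-first labeling routine. The function name label_components and its return format are assumptions of this sketch; production code would normally rely on a library routine.

```python
from collections import deque


def label_components(img, connectivity=4):
    """Label foreground pixels (value 1) of a binary image.

    Returns (labels, count), where labels[y][x] is 0 for background and
    1..count for the component that the pixel belongs to.
    """
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # flood-fill the whole component
                cy, cx = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


img = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
_, count4 = label_components(img, connectivity=4)
_, count8 = label_components(img, connectivity=8)
print(count4, count8)
```

The two pixels that touch only diagonally count as separate objects under 4-connectivity and as one object under 8-connectivity. Counting the pixels that carry each label gives the area of each object.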
UNIT III – COLOR SPACES & IMAGE ENHANCEMENT

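Color spaces provide different ways to represent color data. RGB is device-dependent and
mixes brightness with color in all three channels, whereas HSV, LAB, and YCbCr separate
brightness (V, L, or Y) from color information, which makes color-based segmentation more
robust to illumination changes. Histogram Equalization enhances contrast by redistributing
intensity values so that they span the available range more evenly.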
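
Histogram equalization can be sketched as a lookup table built from the cumulative histogram. The function name equalize and the flat list of pixels are assumptions of this example, which uses the common formula that maps the lowest occupied intensity to 0 and the highest to levels - 1.

```python
def equalize(pixels, levels=256):
    """Return histogram-equalized copies of integer pixels in 0..levels-1."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: cdf[v] = number of pixels with value <= v.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return list(pixels)  # a constant image cannot be stretched

    # Lookup table that spreads the occupied intensities over the full range.
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / (n - cdf_min))
           for c in cdf]
    return [lut[p] for p in pixels]


low_contrast = [50, 50, 51, 52, 53]
print(equalize(low_contrast))
```

The input occupies only four neighboring gray levels, while the output spans the full 0..255 range.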

Diagram: RGB to HSV Conversion Flow
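
The RGB to HSV conversion can be tried with Python's standard colorsys module, which works on channel values scaled to the range 0..1. The wrapper rgb8_to_hsv and its output convention (hue in degrees, saturation and value in 0..1) are assumptions of this sketch.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0..1, value 0..1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360, s, v


# The same reddish surface under bright and dim lighting.
bright = rgb8_to_hsv(200, 50, 50)
dim = rgb8_to_hsv(100, 25, 25)
print(bright)
print(dim)
```

All three RGB channels change between the two samples, but in HSV only the value component changes while hue and saturation stay essentially the same. This is why thresholding on hue is often preferred for color-based segmentation.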

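CLAHE (Contrast Limited Adaptive Histogram Equalization) improves local contrast by
equalizing small tiles of the image and clipping each tile's histogram to limit noise
amplification. Filtering with kernels such as box and Gaussian filters smooths images and
reduces noise, and convolution is the core mathematical operation behind these linear
filters. The median filter is a non-linear filter that replaces each pixel with the median
of its neighborhood, which removes salt-and-pepper noise well.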
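
The contrast between a linear box filter and a median filter can be sketched on a single 3x3 neighborhood that contains one noise spike. The helper names convolve2d and median_filter_valid are assumptions of this example, and only positions where the kernel fits entirely inside the image are computed.

```python
from statistics import median


def convolve2d(img, kernel):
    """2D convolution over the 'valid' region (the kernel is flipped first)."""
    kh, kw = len(kernel), len(kernel[0])
    flipped = [row[::-1] for row in kernel[::-1]]
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - kh + 1):
        out_row = []
        for x in range(w - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += img[y + i][x + j] * flipped[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


def median_filter_valid(img, size=3):
    """Median of each size x size window over the 'valid' region."""
    h, w = len(img), len(img[0])
    return [
        [median(img[y + i][x + j] for i in range(size) for j in range(size))
         for x in range(w - size + 1)]
        for y in range(h - size + 1)
    ]


box = [[1 / 9] * 3 for _ in range(3)]  # 3x3 averaging kernel
spike = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
box_result = convolve2d(spike, box)[0][0]
median_result = median_filter_valid(spike)[0][0]
print(box_result, median_result)
```

The box filter spreads the spike into the average, while the median filter discards it entirely and returns the background value.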
UNIT IV – GRADIENTS, EDGE DETECTION, SEGMENTATION, RECOGNITION

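Image gradients measure how quickly intensity changes from pixel to pixel. First-order
derivative filters such as Sobel, Prewitt, and Roberts detect edges as locations of large
gradient magnitude, while the Laplacian is a second-order operator that marks edges at
zero crossings and is more sensitive to noise. The Canny Edge Detector is a widely used
multi-stage detector that combines Gaussian smoothing, gradient computation, non-maximum
suppression, and hysteresis thresholding.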
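
The Sobel operator can be sketched by applying its two 3x3 kernels at a single pixel and combining the responses into a gradient magnitude. The function name sobel_at is an assumption of this example, and the kernels are applied as a correlation (without flipping), which is the usual convention for edge filters.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_at(img, y, x):
    """Return (gx, gy, magnitude) at the interior pixel (y, x)."""
    gx = sum(img[y + i - 1][x + j - 1] * SOBEL_X[i][j]
             for i in range(3) for j in range(3))
    gy = sum(img[y + i - 1][x + j - 1] * SOBEL_Y[i][j]
             for i in range(3) for j in range(3))
    return gx, gy, math.hypot(gx, gy)


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 0, 100, 100] for _ in range(4)]
flat_response = sobel_at(step, 1, 1)
edge_response = sobel_at(step, 1, 2)
print(flat_response, edge_response)
```

The response is zero in the flat region and large next to the step, with all of the gradient in the x direction because the edge is vertical.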

Diagram: Canny Edge Detection Pipeline

Segmentation techniques divide an image into meaningful regions. Major approaches
include thresholding, region growing, K-means clustering, the watershed algorithm, and
deep learning-based segmentation. Image classification uses CNN architectures such as
VGG, ResNet, and MobileNet. Object detection uses single-stage detectors such as YOLO
and SSD, which are suited to real-time detection, and two-stage detectors such as Faster
R-CNN, which typically trade speed for accuracy.
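
K-means segmentation can be sketched on grayscale intensities alone: pixels are repeatedly assigned to the nearest cluster center, and each center is then moved to the mean of its pixels. The function name kmeans_1d, the fixed iteration count, and the hand-picked initial centers are assumptions of this example; real implementations usually cluster color vectors and choose initial centers randomly.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster scalar values; returns (final centers, label of each value)."""
    centers = list(centers)

    def nearest(v):
        return min(range(len(centers)), key=lambda k: abs(v - centers[k]))

    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            clusters[nearest(v)].append(v)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    labels = [nearest(v) for v in values]
    return centers, labels


pixels = [12, 15, 10, 200, 210, 190, 14, 205]
centers, labels = kmeans_1d(pixels, centers=[0, 255])
print(centers, labels)
```

The labels form a two-region segmentation of the pixel list, with dark pixels in cluster 0 and bright pixels in cluster 1.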
UNIT V – COMPUTER VISION APPLICATIONS

Computer Vision applications include gesture recognition, motion estimation, object
tracking, face detection, and deep learning-based perception. Motion estimation uses
optical flow, block matching, and feature tracking, while object tracking uses algorithms
such as KCF, CamShift, and Deep SORT, often combined with a Kalman filter that predicts
object motion between frames. Face detection uses Haar cascades and deep learning models.
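
Block matching, one of the motion estimation methods named above, can be sketched as an exhaustive search for the displacement that minimizes the sum of absolute differences (SAD) between a block in the previous frame and candidate blocks in the current frame. The function names, the synthetic frames, and the search range are assumptions of this example.

```python
def sad(frame_a, frame_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(frame_a[ay + i][ax + j] - frame_b[by + i][bx + j])
               for i in range(size) for j in range(size))


def block_match(prev, curr, y, x, size, search):
    """Return the (dy, dx) within +/- search that best matches the block."""
    h, w = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ny, nx = y + dy, x + dx
            # Skip candidate blocks that would leave the frame.
            if ny < 0 or nx < 0 or ny + size > h or nx + size > w:
                continue
            cost = sad(prev, curr, y, x, ny, nx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def make_frame(top, left, frame_size=5):
    """Synthetic frame: a bright 2x2 square on a black background."""
    frame = [[0] * frame_size for _ in range(frame_size)]
    for i in range(2):
        for j in range(2):
            frame[top + i][left + j] = 255
    return frame


prev_frame = make_frame(1, 1)
curr_frame = make_frame(2, 3)  # the square moved 1 row down, 2 columns right
motion = block_match(prev_frame, curr_frame, 1, 1, size=2, search=2)
print(motion)
```

The returned displacement is the motion vector of the block between the two frames. A tracker could pass such measurements to a Kalman filter to smooth and predict the object's position.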

Diagram: Face Detection Pipeline

The OpenCV DNN module runs inference with pre-trained deep learning models such as YOLO,
SSD, and MobileNet for real-time computer vision tasks. These applications are widely
used in autonomous driving, robotics, augmented reality, and surveillance systems.
