Freeman Chain Code
Topics: Image representation and description • Representation schemes • Chain codes • Polygonal approximations • Signatures • Boundary segments • The skeleton of a region • Line segment encoding
Region representation:
- based on external characteristics (its boundary)
- based on internal characteristics (the pixels comprising the region)
Generally, an external representation is chosen when the primary focus is on shape characteristics. An internal representation is selected when the primary focus is on reflectivity properties, such as color or texture.
REPRESENTATION SCHEMES
The segmentation techniques provide results in the form of pixels along the boundary or pixels contained in a region.
It is standard practice to use boundary representation schemes that compact this data into representations that are more useful in the computation of descriptors. One such scheme is the chain code.
Chain codes
• represent a boundary by a connected sequence of straight-line segments of specified length and direction
• the direction of each segment is coded by using a numbering scheme
1. Basic Concept: The Freeman chain code assigns a numerical value to each direction in which a contour can extend. For an 8-connected pixel grid, the directions are typically defined as:
o 0: Right (East)
o 1: Up-Right (North-East)
o 2: Up (North)
o 3: Up-Left (North-West)
o 4: Left (West)
o 5: Down-Left (South-West)
o 6: Down (South)
o 7: Down-Right (South-East)
2. Encoding: Starting from a designated boundary pixel, the algorithm follows the contour of the object, recording the direction of movement from one boundary pixel to the next. This sequence of direction numbers forms the chain code.
3. Normalization: To reduce the variability caused by the starting point and orientation, the chain code can be normalized. This might involve choosing a standard starting point or recoding the sequence so that it is invariant to rotation, as sketched below.
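A minimal Python sketch of these two normalizations, assuming the 8-direction code defined above (the example code is the small square traced in the worked example later in this section):

    def normalize_start(code):
        """Starting-point normalization: treat the code as circular and
        pick the rotation that forms the smallest number sequence."""
        rotations = [code[i:] + code[:i] for i in range(len(code))]
        return min(rotations)

    def first_difference(code):
        """Rotation normalization: encode the number of direction changes
        (mod 8) between consecutive elements, treating the code as circular."""
        n = len(code)
        return [(code[(i + 1) % n] - code[i]) % 8 for i in range(n)]

    square = [0, 0, 6, 6, 4, 4, 2, 2]
    print(normalize_start(square))    # [0, 0, 6, 6, 4, 4, 2, 2]
    print(first_difference(square))   # [0, 6, 0, 6, 0, 6, 0, 6]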
Regional Descriptors
Region description:
- boundary descriptors, such as boundary length, diameter, curvature, etc.
- regional descriptors, such as area, perimeter, compactness, mean value, etc.
In addition to boundary representation, regional descriptors can provide information about the area and shape of objects in an image (a computational sketch follows this list):
1. Area: The total number of pixels that make up the object.
2. Perimeter: The length of the boundary of the object, which can be derived from the
chain code.
3. Shape Factors: Various metrics can describe the shape of the object, such as aspect
ratio, circularity, and compactness.
4. Moments: Geometric moments can capture information about the distribution of pixel
values within the region, which helps in object recognition.
5. Centroid: The center of mass of the object region, calculated based on pixel
coordinates.
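For illustration, several of these descriptors can be computed directly from a binary mask with NumPy. This is a minimal sketch, assuming a 0/1 integer mask, a boundary-pixel count as the perimeter estimate, and perimeter squared over area as the compactness measure (one common variant):

    import numpy as np

    def region_descriptors(mask):
        """Compute simple regional descriptors from a binary mask (1 = object)."""
        ys, xs = np.nonzero(mask)
        area = len(xs)                       # number of object pixels
        centroid = (ys.mean(), xs.mean())    # center of mass, (row, col)

        # Boundary pixels: object pixels with at least one background 4-neighbor
        padded = np.pad(mask, 1)
        all_neighbors = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                         padded[1:-1, :-2] & padded[1:-1, 2:])
        perimeter = int(((mask == 1) & (all_neighbors == 0)).sum())

        compactness = perimeter ** 2 / area  # one common shape factor
        return area, perimeter, centroid, compactness

    mask = np.array([[0,0,0,0,0,0],
                     [0,1,1,1,0,0],
                     [0,1,0,1,0,0],
                     [0,1,1,1,0,0],
                     [0,0,0,0,0,0]])
    print(region_descriptors(mask))  # area 8; every object pixel is on the boundary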
Example
Step 1: Binary Image
Consider a binary image where '1' represents the object (foreground) and '0' represents the
background:
000000
011100
010100
011100
000000
Step 2: Identify the Contour
Starting from the top-left boundary pixel (1,1) and tracing clockwise, with directions coded as in the table above (note that the row index increases downward, so moving to a larger row index is "down"):
1. Start at (1, 1).
2. Move right to (1, 2) → Code: 0
3. Move right to (1, 3) → Code: 0
4. Move down to (2, 3) → Code: 6
5. Move down to (3, 3) → Code: 6
6. Move left to (3, 2) → Code: 4
7. Move left to (3, 1) → Code: 4
8. Move up to (2, 1) → Code: 2
9. Move up to (1, 1) → Code: 2 (closing the contour)
Chain Code Representation
The sequence of directions we recorded gives us the Freeman chain code:
00664422
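A minimal Python sketch of how this trace can be automated, using Moore-neighbor boundary tracing and assuming a 0/1 NumPy array, the direction convention above, and a start at the uppermost-leftmost object pixel:

    import numpy as np

    # Offsets for the 8 Freeman directions in (row, col) array coordinates
    # (the row index grows downward, so "North" is a smaller row index).
    OFFSETS = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1),
               4: (0, -1), 5: (1, -1), 6: (1, 0), 7: (1, 1)}
    CODE_OF = {v: k for k, v in OFFSETS.items()}

    def freeman_chain_code(img, start):
        """Trace the outer contour clockwise from `start` (the uppermost-
        leftmost object pixel). Stops on returning to the start pixel,
        which is sufficient for simple closed contours."""
        code = []
        c, dir_to_b = start, 4            # backtrack pixel begins due West
        while True:
            for k in range(8):            # clockwise scan = decreasing codes
                d = (dir_to_b - k) % 8
                r, q = c[0] + OFFSETS[d][0], c[1] + OFFSETS[d][1]
                if 0 <= r < img.shape[0] and 0 <= q < img.shape[1] and img[r, q]:
                    code.append(d)
                    prev = (dir_to_b - k + 1) % 8   # neighbor checked just before
                    vec = (OFFSETS[prev][0] - OFFSETS[d][0],
                           OFFSETS[prev][1] - OFFSETS[d][1])
                    c, dir_to_b = (r, q), CODE_OF[vec]
                    break
            if c == start:
                return code

    img = np.array([[0,0,0,0,0,0],
                    [0,1,1,1,0,0],
                    [0,1,0,1,0,0],
                    [0,1,1,1,0,0],
                    [0,0,0,0,0,0]])
    print(freeman_chain_code(img, (1, 1)))   # [0, 0, 6, 6, 4, 4, 2, 2]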
Applications
● Object Recognition: The Freeman chain code can be used as a feature in object
recognition systems.
● Shape Analysis: It helps in analyzing and classifying shapes based on their contour
information.
● Image Compression: The chain code can serve as a compact representation of object
boundaries, which can be useful in image compression algorithms.
Advantages and Limitations
● Advantages:
o Compact representation of contours.
o Invariance to translation and rotation (when normalized).
● Limitations:
o Sensitive to noise and boundary roughness.
o May not capture all geometric properties of complex shapes.
UNIT 3
Object Recognition:
Object recognition is the technique of identifying the objects present in images and videos. It is one of the most important applications of machine learning and deep learning. The goal of this field is to teach machines to understand (recognize) the content of an image just as humans do.
Object Recognition Using Machine Learning
HOG (Histogram of oriented Gradients) feature Extractor and SVM (Support
Vector Machine) model: Before the era of deep learning, it was a state-of-the-art
method for object detection. It takes histogram descriptors of both positive ( images
that contain objects) and negative (images that does not contain objects) samples
and trains our SVM model on that.
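A minimal sketch of this pipeline, assuming scikit-image and scikit-learn are available; the random arrays stand in for real positive and negative training crops:

    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    pos_images = [rng.random((64, 128)) for _ in range(10)]  # stand-ins for object crops
    neg_images = [rng.random((64, 128)) for _ in range(10)]  # stand-ins for background crops

    def hog_features(images):
        # 9 orientation bins over 8x8-pixel cells is the classic HOG setup
        return np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                             cells_per_block=(2, 2)) for im in images])

    X = np.vstack([hog_features(pos_images), hog_features(neg_images)])
    y = np.array([1] * len(pos_images) + [0] * len(neg_images))

    clf = LinearSVC()   # a linear SVM, trained on the HOG descriptors
    clf.fit(X, y)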
Bag of features model: Just as bag of words treats a document as an orderless collection of words, this approach represents an image as an orderless collection of image features. Examples of such features are SIFT, MSER, etc.
Viola-Jones algorithm: This algorithm is widely used for face detection in images or in real time. It performs Haar-like feature extraction from the image, which generates a large number of features. These features are then passed into a boosting classifier, producing a cascade of boosted classifiers that performs the detection. A window must pass each classifier in the cascade to generate a positive (face found) result. The advantage of Viola-Jones is its speed (a detection time of about 2 fps), which allows it to be used in a real-time face recognition system. A usage sketch follows.
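A short sketch using OpenCV's pretrained frontal-face Haar cascade; 'photo.jpg' is a placeholder path:

    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder image
    # A window must pass every stage of the boosted cascade to count as a face
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        print("face at", (x, y), "size", w, "x", h)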
Object Recognition Using Deep Learning
The Convolutional Neural Network (CNN) is one of the most popular approaches to object recognition. It is widely used, and most state-of-the-art neural networks apply this method to object recognition tasks such as image classification. A CNN takes an image as input and outputs the probability of each class. If the object is present in the image, its class probability is high, while the output probabilities of the rest of the classes are negligible or low. The advantage of deep learning over classical machine learning is that we do not need to do hand-crafted feature extraction; a minimal sketch follows.
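A minimal PyTorch sketch of a CNN classifier, assuming 32x32 RGB inputs and 10 classes; it is illustrative, not a published architecture:

    import torch
    import torch.nn as nn

    class SimpleCNN(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            self.features = nn.Sequential(   # learned feature extractor
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x):                # x: (batch, 3, 32, 32)
            x = self.features(x).flatten(1)
            return self.classifier(x)        # raw class scores (logits)

    model = SimpleCNN()
    logits = model(torch.randn(1, 3, 32, 32))
    probs = torch.softmax(logits, dim=1)     # one probability per class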
Challenges of Object Recognition:
The output generated by the last (fully connected) layer of a CNN is a single class label, so a simple CNN approach will not work if more than one class label is present in the image.
If we want to localize the presence of an object with a bounding box, we need a different approach that outputs not only the class label but also the bounding box location.
Image Classification:
Image classification takes an image as input and outputs the classification label of that image with some metric (probability, loss, accuracy, etc.). For example, an image of a cat can be classified with the class label "cat", or an image of a dog with the class label "dog", with some probability.
Object Localization: This algorithm locates the presence of an object in the image and represents it with a bounding box. It takes an image as input and outputs the location of the bounding box in the form of position, height, and width.
Object Detection:
Object detection algorithms act as a combination of image classification and object localization: they take an image as input and produce one or more bounding boxes, with a class label attached to each bounding box.
Challenges of Object Detection:
● In object detection, the bounding boxes are always rectangular, so they do not help in determining the shape of an object.
● Object detection cannot accurately estimate some measurements, such as the area of an object.
Edge Linking
Edge linking refers to the process of connecting edge fragments in an image that likely
belong to the same object or boundary. After detecting edges using techniques like the
Canny Edge Detector or Sobel Operator, the edges often appear as discontinuous
segments. Edge linking aims to trace and connect these discontinuous edges to form a
coherent boundary of an object.
Key Steps in Edge Linking:
Edge Detection: Initially, edges are detected using algorithms like the Canny Edge detector.
Edge Linking: These detected edges (which may be fragmented) are linked together to form
continuous boundaries using various methods, such as:
o Hough Transform: For detecting straight lines and curves (a sketch follows this list).
o Gradient Orientation and Magnitude: The direction and strength of edges can guide
linking.
o Connected Components Analysis: This can help link neighboring edge pixels based
on proximity and similarity.
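A brief sketch of the first two steps with OpenCV, using the probabilistic Hough transform to link fragmented Canny edges into line segments; 'scene.jpg' is a placeholder path and the parameter values are arbitrary:

    import cv2
    import numpy as np

    gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder image
    edges = cv2.Canny(gray, 50, 150)                      # fragmented edge map

    # The probabilistic Hough transform links collinear edge pixels into
    # segments; maxLineGap bridges small breaks between fragments.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                            minLineLength=30, maxLineGap=10)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            print("segment:", (x1, y1), "->", (x2, y2))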
Boundary Detection
Boundary detection identifies the periphery of an object within an image. The boundary is
typically where there is a significant change in pixel intensity or color. Boundary detection is
crucial for segmenting objects within an image.
Methods of Boundary Detection:
● Gradient-Based Methods: These methods highlight areas of high intensity change (edges),
such as:
o Sobel Operator: Computes the gradient in both x and y directions (a sketch follows this list).
o Prewitt and Laplacian Operators: Used for edge detection and boundary identification.
● Contour Detection (Active Contours or Snakes): A method where a curve (or snake) evolves
around an object, adjusting to the object's boundary by minimizing an energy function.
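A minimal sketch of the gradient-based approach with the Sobel operator; 'object.png' is a placeholder path and the 0.5 threshold factor is an arbitrary choice:

    import cv2
    import numpy as np

    gray = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)  # gradient in x
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)  # gradient in y
    magnitude = np.sqrt(gx ** 2 + gy ** 2)

    # Pixels with a strong intensity change are candidate boundary points
    boundary = magnitude > 0.5 * magnitude.max()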
Region Detection
Region detection aims to partition an image into multiple regions based on characteristics like
color, texture, or intensity. The goal is to group pixels that share similar properties into
distinct regions, which may correspond to objects or areas of interest.
Techniques for Region Detection:
● Region Growing: A pixel is chosen as a seed, and neighboring pixels are added to the region if they have similar properties (e.g., color or intensity); a minimal sketch follows this list.
● Region Splitting and Merging: The image is initially divided into regions (splitting), which are
then merged based on certain criteria (e.g., homogeneity).
● Watershed Algorithm: This method views an image as a topographic surface, where regions
are identified based on the image's intensity values, like water flowing and filling basins.
● Graph-Based Segmentation: Image pixels are treated as nodes in a graph, and segmentation
is achieved by cutting the graph based on similarity measures.
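A minimal region-growing sketch, assuming a grayscale NumPy image, a single seed pixel, and a fixed intensity tolerance (a simplification of the similarity criteria described above):

    import numpy as np
    from collections import deque

    def region_grow(img, seed, tol=10):
        """Grow a region from `seed`, adding 4-neighbors whose intensity
        is within `tol` of the seed value."""
        h, w = img.shape
        region = np.zeros((h, w), dtype=bool)
        seed_val = float(img[seed])
        queue = deque([seed])
        region[seed] = True
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < h and 0 <= nc < w and not region[nr, nc]
                        and abs(float(img[nr, nc]) - seed_val) <= tol):
                    region[nr, nc] = True
                    queue.append((nr, nc))
        return region

    # Example: grow from the centre of a synthetic two-intensity image
    img = np.zeros((5, 5), dtype=np.uint8)
    img[1:4, 1:4] = 200
    print(region_grow(img, (2, 2)).astype(int))  # recovers the bright 3x3 block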
Relationship between Edge Linking, Boundary Detection, and Region
Detection
● Edge Linking and Boundary Detection: Both are concerned with detecting the
boundaries of objects. Edge linking refines the boundaries detected by edge detection,
ensuring that the edges form coherent shapes. Boundary detection identifies these
boundaries as the final step in the segmentation process.
● Boundary Detection and Region Detection: Boundary detection provides the
boundaries that separate different regions within an image. Region detection identifies
areas of interest, and boundary detection marks the separation between those regions.
● Edge Linking and Region Detection: After edge linking, the image's edges are more coherent, which can assist in defining the regions for segmentation; with these boundaries detected, region segmentation becomes more accurate.
IMAGE SEGMENTATION
Image segmentation is one of the key computer vision tasks. It separates objects, boundaries, or structures within the image for more meaningful analysis. Image segmentation plays an important role in extracting meaningful information from images, enabling computers to perceive and understand visual data much as humans do. Below we discuss image segmentation in detail: its types, how it is done, and its use cases in different domains.
Image segmentation is a fundamental technique in digital image processing and computer vision. It involves partitioning a digital image into multiple segments (regions or objects) to simplify analysis by separating the image into meaningful components, which makes image processing more efficient by focusing on specific regions of interest. A typical image segmentation task goes through the following steps:
1. Groups pixels in an image based on shared characteristics like colour, intensity, or
texture.
2. Assigns a label to each pixel, indicating its belonging to a specific segment or object.
3. The resulting output is a segmented image, often visualized as a mask or overlay
highlighting the different segments.
Image segmentation techniques
The traditional image segmentation techniques, which formed the foundation of modern deep-learning-based segmentation methods, use thresholding, edge detection, region-based segmentation, clustering algorithms, and watershed segmentation. These techniques rely on principles of image processing, mathematical operations, and heuristics to separate an image into meaningful regions.
Thresholding: This method involves selecting a threshold value and classifying image pixels as foreground or background based on their intensity values, as in the sketch below.
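A short usage sketch with OpenCV, using Otsu's method to pick the threshold automatically from the intensity histogram; 'cells.png' is a placeholder path:

    import cv2

    gray = cv2.imread("cells.png", cv2.IMREAD_GRAYSCALE)  # placeholder image
    t, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    print("chosen threshold:", t)  # pixels above t become foreground (255)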
Edge Detection: Edge detection methods identify abrupt changes in intensity or discontinuities in the image, using algorithms like the Sobel, Canny, or Laplacian edge detectors.
Region-based segmentation: This method segments the image into smaller regions and iteratively merges them based on predefined attributes such as colour, intensity, and texture, to handle noise and irregularities in the image.
Clustering Algorithms: This method uses algorithms like K-means or Gaussian mixture models to group the pixels in an image into clusters based on similar features like colour or texture, as sketched below.
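A short K-means sketch with OpenCV, treating every pixel as a 3-D colour sample; 'scene.jpg' is a placeholder path and K = 4 is an arbitrary choice:

    import cv2
    import numpy as np

    img = cv2.imread("scene.jpg")                    # placeholder image
    samples = img.reshape(-1, 3).astype(np.float32)  # one row per pixel

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    _, labels, centers = cv2.kmeans(samples, 4, None, criteria, 3,
                                    cv2.KMEANS_RANDOM_CENTERS)

    # Replace each pixel with its cluster centre: one flat colour per segment
    segmented = centers[labels.flatten()].reshape(img.shape).astype(np.uint8)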
Watershed Segmentation: Watershed segmentation treats the image like a topographical map, where the watershed lines are identified based on pixel intensity and connectivity, like water flowing down into different valleys.
These traditional methods offer basic image segmentation techniques with limitations, but they provide the foundation for more advanced methods.
Deep learning image segmentation models
Deep learning image segmentation models are powerful techniques that leverage neural network architectures to automatically divide an image into different segments and extract features from images for accurate analysis and segmentation tasks.
Below are some of the popular deep learning models used for image segmentation:
U-Net: This model uses a U-shaped network to efficiently segment medical images. It is very effective when working with small amounts of data and provides precise segmentation.
Fully Convolutional Network (FCN): This model can process images of any size and output spatial maps. This is achieved by replacing the fully connected layers of a conventional CNN with convolutional layers, which allows the network to segment an entire image pixel by pixel.
SegNet: This model uses an encoder-decoder network for tasks like scene understanding and object recognition. The encoder captures the context in the image, and the decoder performs precise localization and segmentation of objects using that context.
DeepLab: The key feature of DeepLab is the use of atrous (dilated) convolutions, which capture multi-scale context with multiple parallel filters.
Mask R-CNN: This model extends the Faster R-CNN object detection framework by adding a branch for predicting segmentation masks alongside bounding box regression.
Vision Transformer (ViT): A newer model that applies transformers to image segmentation. The image is divided into patches, which are processed as a sequence to capture the global context of the image. A toy encoder-decoder sketch follows.
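To make the encoder-decoder idea shared by FCN, SegNet, and U-Net concrete, here is a toy PyTorch sketch (illustrative only, not any of the published architectures): convolutional downsampling captures context, and transposed convolutions upsample back to a per-pixel class map.

    import torch
    import torch.nn as nn

    class TinySegNet(nn.Module):
        def __init__(self, num_classes=2):
            super().__init__()
            self.encoder = nn.Sequential(   # capture context (downsample)
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.decoder = nn.Sequential(   # recover spatial detail (upsample)
                nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                nn.ConvTranspose2d(16, num_classes, 2, stride=2),
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))  # (batch, classes, H, W)

    out = TinySegNet()(torch.randn(1, 3, 64, 64))
    print(out.shape)   # torch.Size([1, 2, 64, 64]): one score map per class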
Image splitting and merging: important question – refer to the important questions & answers.