ARTI 404: Image Processing
Lecture 06b – Image Segmentation
Dr. Irfan 5/26/2021
[Adapted from Digital Image Processing, 4th Edition, Gonzalez & Woods , 2018]
Felemban, IAU, 2025
Lecture Reading and Reminders
❑ Lecture Reading
❑ Chapter 10
❑ Reminders
Felemban, IAU, 2025
Lecture Outline
❑ Segmentation principals and concepts
Felemban, IAU, 2025
Segmentation Principals and Concepts
Dr. Irfan 5/26/2021
Felemban, IAU, 2025
Image Segmentation
🞑 Image segmentation is the partitioning of an image into distinct regions or
categories that correspond to different objects or parts of objects, where each
region contains pixels with similar attributes, and each pixel in an image is
allocated to one of these categories.
🞑 A good segmentation is typically one in which pixels in the same category have
similar intensity values and form a connected region, whereas the neighboring
pixels that are in different categories have dissimilar values.
🞑 The goal of this is to simplify/change the representation of an image into something
more meaningful and easier to analyze.
Felemban, IAU, 2025
Image Segmentation
🞑 Partition an image into correct segments is often a very challenging problem.
🞑 Segmentation principles and concepts:
🞑 Edge-based segmentation
🞑 Thresholding
🞑 Region extraction
🞑 Morphological watersheds
Felemban, IAU, 2025
Edge-based Segmentation
❑ Edge Detection:
▪ We have discussed edge detection algorithms
(Roberts, Sobel, Prewitt, LoG, etc)
▪ When implemented, there are normally breaks in
line. For this reason, these are generally followed
by linking procedures to assemble edge pixels into
meaningful edges.
❑ Linking Edge Points:
▪ Local processing: This is the simplest approach for linking pixels within a small
neighborhood.
▪ Global processing (Hough transform): Here, we attempt to link edge pixels that lie on
specific curves. The Hough transform is designed to detect lines, using the parametric
representation of a line.
Felemban, IAU, 2025
Linking Edge Points: Local Processing
🞑 Analyze pixels in a small neighborhood Sxy of each edge point
🞑Pixels that are similar are linked
Principal properties used for establishing similarity:
(1) M (x, y) = |∇ f (x, y )|: Magnitude of gradient vector o The magnitude of the gradient vector represents the rate of
change of a function in a particular direction.
(2) α (x, y ): Direction of gradient vector
🞑Edge pixel with coordinates (s, t ) in Sxy is similar in magnitude to pixel at (x, y ) if :
|M (s, t ) − M (x, y )| ≤E , where E is a positive threshold
🞑 Edge pixel with coordinates (s, t ) in Sxy has an angle similar topixel at (x, y ) if:
|α (s, t ) − α (x, y )| ≤A , where A is a positive angle threshold.
🞑 Edge pixel (s, t ) in Sxy is linked with (x, y ) if both criteria are satisfied.
Felemban, IAU, 2025
Linking Edge Points: Local Processing
❑ The above strategy is expensive because all neighbors of every point have to be examined.
❑ Simplification suitable for real-time applications:
1. Compute the gradient magnitude and angle arrays , M (x, y ) and α (x, y ) of input image f (x, y ).
2. Form a binary image:
TM is a threshold, A is a specified angle direction, and ±TA defines a “band” of acceptable directions about A.
3. Scan rows of g and fill (set to 1) all gaps (sets of 0s) in each row that do not exceed a specified
length L.
4. Rotate g by θ and apply step (3). Rotate the result back by − θ.
Note: Image rotation is expensive when linking in numerous directions is required; steps (3) and (4) are combined
into a single, radial scanning procedure.
Felemban, IAU, 2025
Linking Edge Points: Global Processing
🞑 Hough transform: is a feature extraction technique that aims to find instances of objects of a
certain shape (line, circle, ellipse) using a voting procedure carried out in a parameter space.
Original image Edge map
What if we want to find the two wheels of the bicycle?
Issues: Discontinuities, which points to fit it to?, Noise
Source: [Link]
Felemban, IAU, 2025
Hough Transform for Lines
🞑 What is a parameter space?
Image Space Parameter Space
The line (in the image space) that both
points share
this line represents all the straight lines
that pass-through point 𝑥𝑖 , 𝑦𝑖 in the
image space
• 𝑥𝑖 , 𝑦𝑖 and (𝑥𝑗 , 𝑦𝑗 )are two points that belong to a line in an edge
• The equation of the line is 𝑦 = 𝑎𝑥 + 𝑏
• For point 𝑥𝑖 , 𝑦𝑖 we have:
𝑦𝑖 = 𝑎𝑥𝑖 + 𝑏 ⇔ 𝑏 = −𝑎𝑥𝑖 + 𝑦𝑖
• Point in image space maps to a line in the parameter space
• Line in image space maps to a point in the parameter space
Felemban, IAU, 2025
Hough Transform for Lines
🞑 How does voting happen?
(1) Quantize parameter space (a,b) Accumulator array A(a,b)
a
(2) Create accumulator array A(a,b) 0 0 0 0 0
0 0 0 0 0
(3) Set A(a,b)=0 for all (a,b)
0 0 0 0 0
(4) For each edge point 𝑥𝑖 , 𝑦𝑖 : 0 0 0 0 0
The (a, b) values at this maxima
0 0 0 0 0 correspond to the lines in the image
A(a,b)= A(a,b)+1
b
If (a,b) lies on the line: 𝑏 = −𝑎𝑥𝑖 + 𝑦𝑖 a 1 0 0 0 0 a 1 0 0 0 1
0 1 0 0 0 0 1 0 1 0
0 0 1 0 0 0 0 2 0 0
0 0 0 1 0 0 1 0 1 0
0 0 0 0 1 1 0 0 0 1
b b
Felemban, IAU, 2025
Hough Transform: Multiple Lines Detection
a
Felemban, IAU, 2025
🞑 Problem: slope of the line − ∞ ≤ b ≤∞
0
Meaning we need a massive accumulator array (memory, computation)
🞑 Solution: use the normal representation of a line:
x cos θ + y sin θ = 𝜌
0 ≤ θ≤𝜋 , 𝜌 is finite ≤size of the image
Point in image space
maps to a sinusoid in
the parameter space
Felemban, IAU, 2025
Edge map
The strong peaks
correspond to the two
lines
Felemban, IAU, 2025
Hough Transform for Circles
of
🞑 Hough transform can be extended to detect other shapes such as circles.
Assumption: r is known
• Point in image space maps to a circle in the parameter space
• Intersection (point) in parameter space maps to the circle in the image space
Felemban, IAU, 2025
Felemban, IAU, 2025
Segmentation using Thresholding
🞑 Thresholding refersto a familyof algorithms that use a pixel value as a
so
threshold to create a binary image (an image with only black-and-white
pixels)from a grayscale image.
🞑 Itprovides the simplestway to segment objects from a background in an
image.
🞑 The threshold can be chosen manually (by looking at the histogram of
pixel values) or automatically using an algorithm.
Felemban, IAU, 2025
Thresholding and Otsu's segmentation
🞑 Histogram-based thresholding method (with the assumption of a bimodal histogram).
🞑Itcomputes an optimal threshold value and compute the following (which are
separated by that threshold value):
🞑 Maximizing the between-class variance
🞑Minimizing the intra-class variance
The method is optimum in the sense that it maximizes the between-class variance, a well-known measure used in
statistical discriminant analysis.
Felemban, IAU, 2025
20
Felemban, IAU, 2025
Source: [Link] 21
Felemban, IAU, 2025
Region-based Segmentation
🞑 Region segmentation using clustering and superpixels
🞑 Simple linear iterative clustering (SLIC)
IoT
🞑 Region segmentation using graph cuts
🞑 Felzenszwalb's Graph-based Image Segmentation *
* Felzenszwalb, Pedro F., and Daniel P. Huttenlocher. "Efficient graph-based image
segmentation." International journal of computer vision 59 (2004): 167-181.
Felemban, IAU, 2025
Region Segmentation Using Superpixels
🞑 What are Superpixels?
The idea is to replace the standard pixel grid by grouping pixels into
primitive regions that are more perceptually meaningful than individual
pixels.
Felemban, IAU, 2025
Note: boundaries between regions of interest must be preserved in a superpixel image.
a
Felemban, IAU, 2025
RGB example
Felemban, IAU, 2025
SLIC Superpixel Algorithm
Felemban, IAU, 2025
🞑 To represent pixels 5-dimensional vectors are used 𝒛 = 𝑟 𝑔 𝑏 𝑥 𝑦 𝑇 .
(r, g,b) are the color components , and (x, y) are the pixel coordinates.
🞑 Notation:
ntp : Number of pixels in the input image
nsp: Number of superpixels used to segment the input image
ntp/nsp: Approx. size of each superpixel
𝑠= ntp/nsp: For roughly equally sized super pixels there would be a super pixel
center at every grid interval s
🞑 [Link]()
Felemban, IAU, 2025
Felemban, IAU, 2025
Felemban, IAU, 2025
Felemban, IAU, 2025
Felzenszwalb's Graph-based Image Segmentation
🞑 Apply a graph-based approach to segmentation.
🞑 First constructs an undirected graph with the image pixels as vertices (the set to be
segmented) and the weight of an edge between the two vertices being a measure of the
dissimilarity (for example, the difference in intensity).
🞑 In the graph-based approach, the problem of partitioning an image into segments
translates to finding connected components in the constructed graph.
🞑 The algorithm is very similar to Kruskal’s
algorithm for computing the MST for an
undirected graph.
* minimum spanning tree (MST)
Felemban, IAU, 2025
🞑 The edges between two vertices in the same component should have
relatively low weights, and edges between vertices in different components
should have higher weights.
🞑 Felzenszwalb’s preserves the details
in low-variability image regions while
ignores the details in high-variability
regions.
🞑 The algorithm has a single-scale parameter that influences the size of the
segments. The actual size and the number of the segments can vary greatly,
depending on local contrast.
Felemban, IAU, 2025
Source: [Link]
Felemban, IAU, 2025
Segmentation Using Morphological Watersheds
🞑 The concept of a watershed is based on visualizing an image in three dimensions, two
spatial coordinates versus intensity, as in Fig. 2.18(a).
Felemban, IAU, 2025
🞑 Any grayscale image can be represented as a topographic view.
Eng
🞑 In such a “topographic” interpretation, we consider three types of points:
(1) points belonging to a regional minimum
(2) points at which a drop of water, if placed at the location of any of those points, would fall with
certainty to a single minimum
(3) points at which water would be equally likely to fall to more than one such minimum
Source: [Link]
Felemban, IAU, 2025
🞑 If we flood this surface from its minima and, if we prevent the merging of the
waters coming from different sources, we partition the image into two
different sets: the catchment basins and the watershed lines.
🞑 The set of points satisfying
condition (2) is called the catchment
basin or watershed of that minimum.
🞑 The points satisfying condition (3)
form crest lines on the topographic
surface and are referred to as divide
lines or watershed lines.
Source: [Link]
Felemban, IAU, 2025
🞑 If we apply this transformation to the image gradient, the catchment basins should
theoretically correspond to the homogeneous grey level regions of this image.
From top to bottom and from right to left:
• Original image.
• Gradient image.
• Watershed of the gradient image.
• Final contours.
Source: [Link]
Felemban, IAU, 2025
🞑 Problem: over-segmentation due to noise or local irregularities in the gradient
image.
🞑 Solution: flooding the topographic surface from a previously defined set
of markers.
Source: [Link]
Felemban, IAU, 2025
🞑 What are markers?
A marker is a connected component belonging to an image.
Marker selection steps:
(1) Preprocessing
(2) Definition of a set of criteria that markers must satisfy.
Example:
(1) Pre-processing: smoothing filter (Box, Gaussian) .
(2) Define an internal marker as:
1- A region that is surrounded by points of higher “altitude”
2- The points in the region form a connected component
3- All the points in the connected component have the same intensity value.
Felemban, IAU, 2025
In the below example: we did the gradient of the smoothed image and restrict the
algorithm to operate on a single watershed that contains the marker in that
particular region.
Felemban, IAU, 2025
Segmentation Using Different Methods (Implementation)
Source: [Link]
Felemban, IAU, 2025
Explore by Yourself (More to read)
🞑 QuickShift
🞑 Region growing
🞑 Active Contours
Felemban, IAU, 2025
END OF LECTURE 6b
Felemban, IAU, 2025