Image Segmentation
5.1. INTRODUCTION
Segmentation:
The process of subdividing an image into its constituent parts is known as
segmentation. This is the first step in image analysis. Segmentation should be
stopped when the objects of interest in an application have been isolated.
Segmentation algorithms for monochrome images are generally based on two
important properties:
(i) Discontinuity (partitioning an image based on abrupt changes in gray level;
the main areas of interest in this category are the detection of isolated points,
lines, and edges in an image).
(ii) Similarity (the principal approaches are based on thresholding, region
growing, and region splitting and merging).
The concept of segmenting an image based on discontinuity or similarity of the
gray level values of its pixels is applicable to both static and dynamic images.
The various techniques based on gray level discontinuity are given below.
(i) Detection of isolated points,
(ii) Detection of line,
(iii) Detection of edges.
5.2 Digital Image Processing
The way to look for discontinuities is to run a mask through the image. The
representation of a 3 × 3 mask is given below.
w1 w2 w3
w4 w5 w6
w7 w8 w9
wi = Coefficient in the mask.
Zi = Gray level of the pixel associated with mask coefficient wi.
R = The response of mask at any point in an image.
R = w1 z1 + w2 z2 + … + w9 z9
  = Σ (i = 1 to 9) wi zi
It involves computing the sum of products of the mask coefficients with the gray
levels contained in the region encompassed by the mask.
The response of the mask is defined with respect to its center location. When the
mask is centered on a boundary pixel, the response is calculated by using the
appropriate partial neighbourhood.
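The sum-of-products response described above can be sketched directly in NumPy. This is a minimal illustration; the mask (a point-detection style mask) and the test image values are assumptions for the example, not taken from the text.

```python
import numpy as np

# A hypothetical 3x3 mask and a small test image (illustrative values only).
mask = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]])

image = np.array([[10, 10, 10, 10, 10],
                  [10, 10, 10, 10, 10],
                  [10, 10, 90, 10, 10],
                  [10, 10, 10, 10, 10],
                  [10, 10, 10, 10, 10]])

def mask_response(image, mask):
    """Compute R = sum(w_i * z_i) at every interior pixel."""
    h, w = image.shape
    R = np.zeros((h - 2, w - 2), dtype=int)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            region = image[y - 1:y + 2, x - 1:x + 2]
            R[y - 1, x - 1] = np.sum(mask * region)
    return R

R = mask_response(image, mask)
# The isolated bright point produces a strong positive response at its location.
print(R[1, 1])  # → 640
```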
Edge detection is used for detecting meaningful discontinuities in gray level. First
and second order digital derivatives are implemented to detect the edges in an image.
Edge:
An edge is a boundary between two regions with relatively distinct gray level
properties.
Model of an ideal digital edge is shown here.
The second derivative has a zero crossing at the midpoint of the transition in gray
level. The first and second derivatives at any point in an image can be obtained
using the gradient operators and the Laplacian operator at that point.
These are explained below.
P1 P2 P3      –1 –1 –1      –1 0 1
P4 P5 P6       0  0  0      –1 0 1
P7 P8 P9       1  1  1      –1 0 1
3 × 3 image   Prewitt mask for        Prewitt mask for
              horizontal detection    vertical detection
Figure 5.7 Prewitt masks
Gx = (P7 + P8 + P9) – (P1 + P2 + P3)
Gy = (P3 + P6 + P9) – (P1 + P4 + P7)
P1, …, P9 are the pixel values in the subimage. Prewitt masks are simpler to
implement than Sobel operators.
Sobel's masks are given below.
P1 P2 P3      –1 –2 –1      –1 0 1
P4 P5 P6       0  0  0      –2 0 2
P7 P8 P9       1  2  1      –1 0 1
(a) 3 × 3 image  (b) Mask for            (c) Mask for
                 horizontal detection    vertical detection
Figure 5.8 Sobel masks
Mask (b) is used to find Gx and mask (c) is used to find Gy.
By using Roberts operators, the partial derivatives at the center of the 3 × 3 mask
can be implemented as cross differences:
Gx = P9 – P5
Gy = P8 – P6
∇²f = ∂²f/∂x² + ∂²f/∂y²
The Laplacian response is sensitive to noise, so the Laplacian is rarely used on
its own for edge detection.
P1 P2 P3       0 –1  0
P4 P5 P6      –1  4 –1
P7 P8 P9       0 –1  0
3 × 3 image   Laplacian mask
Figure 5.9 Laplacian masks
∇²f = 4 P5 – (P2 + P4 + P6 + P8)
The coefficient associated with the center pixel should be positive, the
coefficients associated with the outer pixels should be negative, and the sum of
the coefficients is zero.
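The Sobel masks of Figure 5.8 can be sketched in code, combining |Gx| + |Gy| as a common approximation to the gradient magnitude. The step-edge test image is an illustrative assumption.

```python
import numpy as np

# Sobel masks as in Figure 5.8: (b) for Gx, (c) for Gy.
sobel_h = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
sobel_v = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

def gradient_magnitude(image):
    """Approximate |grad f| = |Gx| + |Gy| at interior pixels."""
    h, w = image.shape
    g = np.zeros((h - 2, w - 2), dtype=int)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            region = image[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(sobel_h * region)
            gy = np.sum(sobel_v * region)
            g[y - 1, x - 1] = abs(gx) + abs(gy)
    return g

# A vertical step edge: the response is strong along the transition.
step = np.array([[0, 0, 6, 6]] * 4)
print(gradient_magnitude(step))
```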
Practically, the set of pixels detected by the gradient operators will not form a
complete boundary due to noise, non-uniform illumination etc. The following
techniques are used for edge linking and boundary detection.
(i) Local processing,
(ii) Global processing using Hough transform
(iii) Global processing via Graph theoretic approach
Some of these techniques are explained below.
The edge pixels are determined by using the gradient operators. The boundary is
not completely defined by the edge pixels; small gaps remain. To fill these gaps,
we consider the characteristics of pixels in a small neighbourhood. The
neighbourhood pixels similar to the boundary pixels are linked.
Two properties are used to check the similarity of the neighbourhood pixels with
respect to the edge pixels. These are given below.
(i) The strength of the gradient operator response used to produce the edge pixel.
(ii) The direction of the gradient.
(x, y) – Edge pixel
(x1, y1) – Neighbourhood pixel
If neighbourhood pixel is similar to the edge pixel, then,
|∇f(x, y) − ∇f(x1, y1)| ≤ T
T – Non-negative threshold value.
The neighbourhood pixel has a gradient direction similar to that of the edge pixel
if
|α(x, y) − α(x1, y1)| < Q
where
α(x, y) = tan⁻¹(Gy / Gx)
Q – Angle threshold
If the above conditions are satisfied, then the neighbourhood pixel is linked to
the edge pixel. This procedure is repeated for every edge pixel location.
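The two linking conditions can be sketched as a single predicate. The gradient pairs and the threshold values T and Q below are hypothetical illustrations, not values from the text.

```python
import math

def similar(g_edge, g_neigh, T=25.0, Q=math.radians(15)):
    """Link a neighbour to an edge pixel when both the gradient
    magnitudes and the gradient directions are close enough."""
    gx1, gy1 = g_edge
    gx2, gy2 = g_neigh
    m1 = math.hypot(gx1, gy1)     # gradient magnitude at (x, y)
    m2 = math.hypot(gx2, gy2)
    a1 = math.atan2(gy1, gx1)     # alpha(x, y) = tan^-1(Gy / Gx)
    a2 = math.atan2(gy2, gx2)
    return abs(m1 - m2) <= T and abs(a1 - a2) < Q

edge_pixel      = (100.0, 60.0)
close_neighbour = (95.0, 58.0)
far_neighbour   = (0.0, 50.0)    # weak and differently oriented

print(similar(edge_pixel, close_neighbour))  # True
print(similar(edge_pixel, far_neighbour))    # False
```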
This technique is based on representing the edge elements in the form of a graph
and searching the graph for the lowest-cost path.
This technique can provide good result in the presence of noise. The graph is
given as G = (N, V)
N = Set of non-empty nodes
V = Set of unordered pairs of nodes in N.
The pair (Ni, Nj) of V is known as an edge.
A graph in which the edges are directed is known as a directed graph.
Ni is the parent node if an edge is directed from node Ni to node Nj.
Expansion of the node: The process of identifying the successor of a node is
known as expansion of the node.
Start node (or) Root node: The node at level 0.
Goal node: The node in the last level.
c(ni, nj) – the cost associated with the edge between two nodes ni and nj.
The cost of the entire path is given as
C = Σ (i = 2 to K) c(n(i−1), ni)
The cost of an edge can be defined as
c(p, q) = H − [f(p) − f(q)]
H = Highest gray level in the image.
5.4. THRESHOLDING
We can consider the histogram of an image f(x, y) which consists of light objects
on a dark background.
Figure 5.11 Histogram of an image which consists of dark background and a light
object.
This histogram consists of two dominant regions. One for the dark background
and another for the object.
T is used to represent the threshold value; it separates the object and background
regions. If f(x, y) > T, then the corresponding point is known as an object point.
Otherwise, that point is known as a background point.
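The object/background rule above is a one-line test per pixel. The image values and the choice of T below are illustrative assumptions.

```python
import numpy as np

def threshold(image, T):
    """Label each pixel: object if f(x, y) > T, background otherwise."""
    return np.where(image > T, 1, 0)  # 1 = object point, 0 = background

# A light object (gray level ~200) on a dark background (~30); T is chosen
# between the two histogram peaks. Values are illustrative.
img = np.array([[ 30,  25, 200, 210],
                [ 35,  28, 205, 198],
                [ 32,  30,  29,  31]])
print(threshold(img, T=100))
```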
We can also consider the histogram corresponding to two different light objects on
a dark background. Here, two threshold values T1 and T2 are shown.
For Gaussian object and background densities with equal variances (σ1² = σ2² = σ²),
the optimal threshold is
T = (m1 + m2)/2 + [σ²/(m1 − m2)] ln(P2/P1)
If P1 = P2, then the optimal threshold is the average of the two means:
T = (m1 + m2)/2
The threshold value selection depends on the peaks in the histogram. To improve
the histogram, only the pixels that lie on or near the boundary between the
objects and the background are considered. By doing this, the threshold selection
is improved.
A three-level image can be formed using the gradient ∇f and the Laplacian ∇²f at
(x, y). It is given as
G(x, y) = 0,  for ∇f < T
G(x, y) = +,  for ∇f ≥ T and ∇²f ≥ 0
G(x, y) = −,  for ∇f ≥ T and ∇²f < 0
0 is used to represent the pixels that are not on the edge.
+ is used to represent the pixels that are on the dark side of an edge.
– is used to represent the pixels that are on the light side of an edge.
The transition from light background to dark object is represented by the
occurrence of ‘–’ followed by ‘+’.
The transition from object to the background is represented by the occurrence ‘+’
followed by ‘–’.
Use of Motion in segmentation process:
Generally, two approaches are used to apply motion in the segmentation process.
They are
(i) the spatial domain approach and (ii) the frequency domain approach.
Here, a set of seed points is used as starting points. From each starting point, a
region grows by appending to the seed point those neighbouring pixels which have
similar properties.
The algorithm starts with the selection of the seed pixel and a desired predicate
P(R1) for comparison with neighbouring pixels. The initial seed is assigned to
region R1. Then, the eight neighbouring pixels of the seed pixel are assigned to
region R1, if these pixels meet the predicate requirement. The process of
comparing the eight neighbouring pixels continues until no further pixels can be
added to the region. Then,
the region R1 is completely defined. Pixels belonging to this region are removed
from the original image.
A new seed pixel is located within another area of the image and the same
procedure is repeated until all pixels defining region R2 are determined. The same
procedure is repeated again and again until all regions have been defined.
The seed pixel is located by scanning the image pixel by pixel and using the first
pixel found which has the minimum or maximum gray level.
After that, the region growing process is started.
We can see the example for the threshold value of 3 (Th = 3)
1 0 5 6 7
1 0 5 5 5
0 1 6 7 6
1 2 5 6 7
1 1 7 6 5
Here, a 5 × 5 image is shown. The values inside the boxes are the gray levels.
Here, 2 seeds are 2 and 7.
The location of 2 is (4, 2)
The location of 7 is (3, 4)
These are the starting points. Then, the regions are grown here, Th = 3.
Th = Threshold value.
To be appended, the absolute difference between the gray level of a pixel and the
gray level of the seed must be less than the threshold value.
R1 R1 R2 R2 R2
R1 R1 R2 R2 R2
R1 R1 R2 R2 R2
R1 R1 R2 R2 R2
R1 R1 R2 R2 R2
Region 1 Region 2
Figure 5.13 Region based segmentation. Region growing by pixel aggregation
This technique is simple. But it is difficult to select an initial seed that
properly represents the region of interest, and it is difficult to select suitable
properties for including points in the various regions during the region growing
process.
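The region-growing procedure above, applied to the 5 × 5 example, can be sketched as a breadth-first aggregation. The code uses 0-based indices, while the text's seed locations (4, 2) and (3, 4) are 1-based (row, column).

```python
from collections import deque

def region_grow(image, seed, th):
    """Grow a region from a seed: append 8-neighbours whose absolute
    gray-level difference from the seed is below the threshold."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in region
                        and abs(image[nr][nc] - seed_val) < th):
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# The 5x5 image from the text; seeds are gray level 2 at (4, 2) and
# gray level 7 at (3, 4) in the text's 1-based (row, column) indexing.
img = [[1, 0, 5, 6, 7],
       [1, 0, 5, 5, 5],
       [0, 1, 6, 7, 6],
       [1, 2, 5, 6, 7],
       [1, 1, 7, 6, 5]]
r1 = region_grow(img, (3, 1), th=3)   # seed value 2 → region R1
r2 = region_grow(img, (2, 3), th=3)   # seed value 7 → region R2
print(len(r1), len(r2))  # 10 15: the left two and the right three columns
```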
R1 R2
R3 R4
Initially, an image is divided into four regions R1, R2, R3, R4.
The quad tree gives the region division of the image. R is the region of the
original image. In the next level, the original image is divided into four
non-overlapping regions. The next level is the subdivision of each of these four
regions into four smaller regions.
The final level shows the subdivision of R23 into four regions.
After the image has been split into regions that meet the logical predicate
requirement, regions which have the same predicate requirement are merged to form a
new enlarged region. The process of splitting an image into small regions and then
merging connected regions together is known as region segmentation by splitting and
merging.
In image analysis, the first step is image segmentation. The next step is image
representation and description. The output of the segmentation process is raw
pixel data, which may represent either the boundary of a region or all the points
in the region itself.
0 6 6 (b)          0 6 6 (b)          0 6 6 (b)
0 6 0              0 6 0              0 6 0
(a) 6 6 0          (a) 6 0 0          (a) 6 0 0
(i) 4-connectivity (ii) 8-connectivity (iii) m-connectivity
Pixel (a) is a neighbour of pixel (b). Here, 6 is the gray level of the pixel.
The direction of each segment is coded by using a numbering method.
Direction numbers for 4 directional code and 8-directional code are given below
Digital images are processed in a grid format with equal spacing in x and y
directions.
(30103323212110) (7276543220)
Figure 5.18 4-directional chain code    Figure 5.19 8-directional chain code
The accuracy of the resulting code depends on the grid spacing: if the grids are
closely spaced, the accuracy is good. The chain code varies depending on the
starting point. This method has two problems.
(i) The resulting chain of codes is long.
(ii) Any small disturbance along the boundary due to noise produces a change in
the code that may not be related to the shape of the boundary.
This technique merges points along a boundary until the least-squares error of the
line fit to the points merged so far exceeds a preset threshold. Finally, the
intersections of adjacent line segments form the vertices of the polygon.
In this technique, a segment is subdivided into two parts until a required
criterion is satisfied: the maximum perpendicular distance from the boundary
segment to the line joining its two end points must not exceed a preset threshold.
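The splitting rule above can be sketched recursively: find the boundary point farthest from the line joining the end points, and split there while the distance exceeds the threshold. The boundary points are hypothetical, and the segment is assumed open with distinct end points.

```python
import math

def max_perpendicular(points):
    """Farthest point from the line joining the two end points."""
    (x1, y1), (x2, y2) = points[0], points[-1]
    length = math.hypot(x2 - x1, y2 - y1)
    best_i, best_d = 0, 0.0
    for i, (x, y) in enumerate(points):
        # perpendicular distance from (x, y) to the end-point line
        d = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1) / length
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d

def split(points, threshold):
    """Subdivide until the maximum perpendicular distance does not
    exceed the threshold; returns the polygon vertex chain."""
    i, d = max_perpendicular(points)
    if d <= threshold:
        return [points[0], points[-1]]
    left = split(points[:i + 1], threshold)
    right = split(points[i:], threshold)
    return left[:-1] + right

# A hypothetical boundary segment with a corner near (2, 0).
pts = [(0, 0), (1, 0.1), (2, 0), (3, 1), (4, 2)]
print(split(pts, threshold=0.5))  # [(0, 0), (2, 0), (4, 2)]
```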
The region boundary is partitioned by following the contour of S, and points are
marked at each position where a transition is made. This technique is independent
of region size and orientation. The description of a region is based on its area,
the area of its convex deficiency, and the number of components in the convex
deficiency.
The shape number of a boundary is based on the 4-directional (or) 8-directional
chain code. The shape number is defined as the first difference of smallest
magnitude. The number of digits in the shape number is known as the order of the
boundary.
S – Starting point.
Chain code: 0 3 2 1
Difference: 3 3 3 3
Order = 4
Example:
S – Starting point
Chain code: 0 0 0 3 2 2 2 1
Difference: 3 0 0 3 3 0 0 3
Shape No.: 0 0 3 3 0 0 3 3
(shift the difference by one position to obtain the smallest magnitude)
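The first difference and shape number computations above can be sketched as follows: the difference counts counterclockwise transitions between successive (circular) code digits, and the shape number is the rotation of smallest magnitude.

```python
def first_difference(chain, directions=4):
    """Counterclockwise transitions between successive code digits,
    treating the chain code as circular."""
    return [(chain[i] - chain[i - 1]) % directions
            for i in range(len(chain))]

def shape_number(chain, directions=4):
    """Rotation of the first difference with the smallest magnitude."""
    d = first_difference(chain, directions)
    rotations = [d[i:] + d[:i] for i in range(len(d))]
    return min(rotations)

# The square from the first example: chain code 0 3 2 1.
print(first_difference([0, 3, 2, 1]))          # [3, 3, 3, 3]
# The second example: chain code 0 0 0 3 2 2 2 1.
print(shape_number([0, 0, 0, 3, 2, 2, 2, 1]))  # [0, 0, 3, 3, 0, 0, 3, 3]
print(len(shape_number([0, 3, 2, 1])))         # order = 4
```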
S(k) = Σ (u = 0 to N − 1) A(u) e^(j2πuk/N)
Ŝ(k) = Σ (u = 0 to M − 1) A(u) e^(j2πuk/N),  M < N
Example:
For an original boundary with N = 24 descriptors, reconstructions with M = 2,
M = 20 and M = 22 can be compared. When M = 2, M is too low, so the reconstructed
shape is different from the original; when M = 20 or M = 22 (M < N), the shape is
more like the original.
Some basic properties of Fourier descriptors are shown below.
Transformation    Boundary                   Fourier Descriptors
Identity          S(k)                       A(u)
Rotation          Sr(k) = S(k) e^(jθ)        Ar(u) = A(u) e^(jθ)
Translation       St(k) = S(k) + Δxy         At(u) = A(u) + Δxy δ(u)
Scaling           Ss(k) = C · S(k)           As(u) = C · A(u)
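Truncating Fourier descriptors can be sketched with NumPy's FFT, whose conventions place the 1/N normalization on the inverse transform; the square boundary below is an illustrative example, not the N = 24 shape from the text.

```python
import numpy as np

# Boundary points as complex numbers s(k) = x(k) + j*y(k); a square here.
s = np.array([0+0j, 1+0j, 2+0j, 2+1j, 2+2j, 1+2j, 0+2j, 0+1j])
N = len(s)

A = np.fft.fft(s)            # Fourier descriptors A(u)

def reconstruct(A, M):
    """Rebuild the boundary from only the first M descriptors (M <= N)."""
    A_trunc = np.zeros_like(A)
    A_trunc[:M] = A[:M]
    return np.fft.ifft(A_trunc)

s_full = reconstruct(A, N)   # M = N reproduces the boundary exactly
s_low  = reconstruct(A, 2)   # M too low: shape differs from the original
print(np.allclose(s_full, s))  # True
```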
The Euler number E is given by
E = C – H = V – A + F
where C is the number of connected components, H the number of holes, V the number
of vertices, A the number of edges (arcs) and F the number of faces.
Example:
H = 2, C = 1
E = C – H = 1 – 2 = –1
Example 2:
Connected components, C = 2
Holes = 0
So, E = C – H = 2 – 0
E=2
Example 3:
Here, H = 2
Connected components, c = 1
So, E = C – H = 1 – 2
E = –1
Example 4:
Example 5:
Find the number of vertices and holes.
Here, V=6
A=6
C=1
H=1
Example 6:
Find the E value.
Here, H = 2, C = 1
E = C – H = 1 – 2 = –1
Example 7:
Vertices, V = 7
Edges, A = 11
Faces, F = 2
Holes, H = 3
C = 1
E = V – A + F = 7 – 11 + 2 = –2
E = C – H = 1 – 3 = –2
E = –2
Example 8: Obtain the Euler number for the letter A.
H=1
C=1
E=C–H=1–1=0
E=0
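Counting C and H from a binary image can be sketched with flood fills. Taking holes as 4-connected background components that do not touch the (padded) border is an assumption of this sketch, not a method stated in the text.

```python
from collections import deque

def count_components(grid, value, connectivity):
    """Count connected components of `value` pixels via flood fill."""
    rows, cols = len(grid), len(grid[0])
    if connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    seen, count = set(), 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == value and (r, c) not in seen:
                count += 1
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    cr, cc = queue.popleft()
                    for dr, dc in steps:
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == value
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
    return count

def euler_number(grid):
    """E = C - H: 8-connected foreground components minus holes.
    Holes = 4-connected background components minus the outer background."""
    padded = [[0] * (len(grid[0]) + 2)] + \
             [[0] + row + [0] for row in grid] + \
             [[0] * (len(grid[0]) + 2)]
    C = count_components(padded, 1, connectivity=8)
    H = count_components(padded, 0, connectivity=4) - 1
    return C - H

# A ring-shaped region: one component, one hole, so E = 1 - 1 = 0.
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
print(euler_number(ring))  # 0
```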
5.10. TEXTURE
Texture refers to the repetition of basic texture elements known as texels (or
texture primitives, or texture elements). A texel contains several pixels whose
placement may be periodic, quasi-periodic or random. Artificial textures are
periodic; natural textures are random.
This approach is very simple. Here, texture is described using statistical moments
of the gray level histogram of an image.
r = random variable denoting the gray level in the image
P(ri) = corresponding histogram, i = 0, 1, …, L – 1, where L is the number of
distinct gray levels.
The uniformity measure, U = Σ P²(ri), is maximum for an image in which all the
gray levels are equal.
Entropy is also used as a measure of variability. It is zero for constant gray
level images.
Measures of texture computed using the histogram suffer from the limitation that
they carry no information regarding the relative position of pixels with respect
to each other. This problem is overcome by using the co-occurrence matrix method.
In this technique, the distribution of intensity values is used, and for texture
analysis the relative positions of pixels in an image are also considered.
Q is an operator which defines the position of two pixels relative to each other.
The co-occurrence matrix is denoted as C. Cij is the number of times that pixel
pairs with intensities zi and zj occur in the image in the position specified by
Q, 1 ≤ i, j ≤ L.
Example: Let Q be the position operator 'one pixel immediately to the right',
applied to a 4 × 4 image with L = 2 gray levels, so 1 ≤ i, j ≤ 2. The
co-occurrence matrix Cij is given as
C11 C12
C21 C22
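Since the text's 4 × 4 example image is not reproduced here, the sketch below uses a hypothetical two-level image with the same operator Q ("one pixel immediately to the right").

```python
import numpy as np

def cooccurrence(image, levels):
    """C[i][j] = number of pixel pairs (z_i, z_j) related by Q,
    where Q is 'one pixel immediately to the right'."""
    C = np.zeros((levels, levels), dtype=int)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols - 1):
            C[image[r, c] - 1, image[r, c + 1] - 1] += 1
    return C

# A hypothetical 4x4 image with L = 2 gray levels (1 and 2); the actual
# example image from the text is not reproduced here.
img = np.array([[1, 1, 2, 2],
                [1, 2, 2, 1],
                [2, 2, 1, 1],
                [1, 1, 2, 2]])
print(cooccurrence(img, levels=2))
```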
Its other name is the syntactic approach. This technique uses the structure of,
and the interrelationships among, components in a pattern.
The block diagram of structural approach for pattern description is shown below.
The input is divided into simple subpatterns and given to the primitive extraction
block. Good primitives are used as basic elements to provide compact description of
patterns. Grammar construction block is used to generate a useful pattern description
language.
Problems:
What is the order of the shape number of the figure given below?
Solution:
S – Starting point
Chain code: 0 0 0 3 3 2 1 2 3 2 1 1
Difference: 3 0 0 3 0 3 3 1 1 3 3 0
Shape No. (rotate the difference to its smallest magnitude):
0 0 3 0 3 3 1 1 3 3 0 3
Order of the shape number = 12