Robotics and Robot Applications
Lecture 38: High Level Vision - I
Shyamanta M Hazarika
Biomimetic Robotics and Artificial Intelligence Lab
Mechanical Engineering
Indian Institute of Technology Guwahati
Segmentation
• How many “objects” are there in the image below?
• Assuming the answer is “4”, what exactly defines an object?
2 © Shyamanta M Hazarika, ME, IIT Guwahati
Gray Level Thresholding
• Many images consist of two regions that
occupy different gray level ranges.
• Such images are characterized by a bimodal
image histogram.
• An image histogram is a function h defined on
the set of gray levels in a given image.
• The value h(k) is given by the number of pixels
in the image having image intensity k.
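The histogram definition above can be sketched in a few lines of plain Python. This is only an illustration: the function names `histogram` and `threshold`, the toy image, and the threshold value 100 are assumptions, not part of the lecture.

```python
def histogram(image, levels=256):
    """Return h, where h[k] is the number of pixels with gray level k."""
    h = [0] * levels
    for row in image:
        for value in row:
            h[value] += 1
    return h


def threshold(image, t):
    """Binary image: 1 where the gray level exceeds t, else 0."""
    return [[1 if value > t else 0 for value in row] for row in image]


# Toy 3x4 image: dark background near level 10, bright object near level 200.
image = [
    [10, 12, 200, 205],
    [11, 10, 198, 200],
    [10, 11, 12, 200],
]
h = histogram(image)
binary = threshold(image, 100)  # 100 lies in the valley between the two modes
```

For a bimodal histogram, any threshold inside the valley between the two peaks separates the two regions; here every pixel is far from 100, so the choice is not critical.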
Gray Level Thresholding
Binary Image
Segmentation
• Segmentation can be viewed as a process of pixel
classification; the image is segmented into objects or
regions by assigning individual pixels to classes.
• Connected Component Labeling assigns pixels to
specific classes by verifying if an adjoining pixel (i.e.,
neighboring pixel) already belongs to that class.
• There are two “standard” definitions of pixel
connectivity:
• 4-Neighbor Connectivity and
• 8-Neighbor Connectivity.
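The effect of the two connectivity definitions can be seen in a small labeling sketch. It uses a breadth-first flood fill rather than any particular algorithm from the lecture, and the name `label_components` and the test image are assumptions made for illustration.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def label_components(binary, connectivity=4):
    """Label object pixels (value 1); returns (labels, count). 0 = background."""
    offsets = N4 if connectivity == 4 else N8
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1                      # new, still unlabeled component
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


binary = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
_, n4 = label_components(binary, connectivity=4)
_, n8 = label_components(binary, connectivity=8)
```

The two groups of pixels touch only diagonally, so they count as two objects under 4-neighbor connectivity and as one object under 8-neighbor connectivity.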
Segmentation
Computationally Efficient Techniques
• Contour tracking (border following) identifies the
pixels that fall on the boundaries of the objects, i.e.,
object pixels that have a neighbor belonging to the
background class or region.
• There are two “standard” code definitions used to
represent boundaries:
• Crack Code: Based on 4-connectivity
• Chain Code: Based on 8-connectivity.
Segmentation
Boundary Representation: CRACK Code
    3
  0   2
    1
CRACK CODE:
10111211222322333300103300
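A crack code can be decoded back into the corner points it visits, which gives a quick sanity check on the string above. The sketch assumes the direction codes mean 0 = left, 1 = down, 2 = right, 3 = up (read off the direction diagram) with y growing downward; the helper names are hypothetical.

```python
# Direction vectors (dx, dy) with y growing downward: 0=left, 1=down, 2=right, 3=up.
CRACK_STEP = {"0": (-1, 0), "1": (0, 1), "2": (1, 0), "3": (0, -1)}


def crack_vertices(code, start=(0, 0)):
    """Return the list of corner points visited by a crack code string."""
    x, y = start
    points = [(x, y)]
    for digit in code:
        dx, dy = CRACK_STEP[digit]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def enclosed_area(points):
    """Shoelace formula: area enclosed by a closed path of corner points."""
    twice = sum(x0 * y1 - x1 * y0
                for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return abs(twice) // 2


slide_code = "10111211222322333300103300"
path = crack_vertices(slide_code)
is_closed = path[0] == path[-1]   # boundary returns to its starting corner
perimeter = len(slide_code)       # each crack has unit length
```

Because every crack has unit length, the perimeter is simply the length of the code, and the enclosed area follows from the decoded corner points.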
Segmentation
Boundary Representation: CHAIN Code
7 6 5
0 4
1 2 3
CHAIN CODE:
12232445466601760
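The chain code above can be checked the same way: summing the step vectors of a closed boundary must give zero displacement. The sketch assumes the direction numbering of the 3x3 diagram (0 = left, then counterclockwise on screen) with y growing downward; the function names are illustrative only.

```python
import math

# (dx, dy) per chain code, y growing downward, following the slide's layout:
#   7 6 5
#   0 P 4
#   1 2 3
CHAIN_STEP = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def chain_displacement(code):
    """Net (dx, dy) after following every step of a chain code string."""
    dx = sum(CHAIN_STEP[int(d)][0] for d in code)
    dy = sum(CHAIN_STEP[int(d)][1] for d in code)
    return dx, dy


def chain_perimeter(code):
    """Even codes are axial steps (length 1); odd codes are diagonal (sqrt 2)."""
    return sum(1.0 if int(d) % 2 == 0 else math.sqrt(2.0) for d in code)


slide_code = "12232445466601760"
closed = chain_displacement(slide_code) == (0, 0)
length = chain_perimeter(slide_code)
```

Unlike the crack code, a chain code has steps of two different lengths, so the perimeter estimate weights diagonal moves by the square root of 2.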
Contour Tracking Algorithm
Generating Crack Code
• Identify a pixel P that belongs to the class “object”
and a neighboring pixel (4 neighbor connectivity) Q
that belongs to the class “background”.
• Depending on the position of Q relative to P,
identify pixels U and V as follows:

CODE 0    CODE 1    CODE 2    CODE 3
 V Q       Q P       P U       U V
 U P       V U       Q V       P Q
Contour Tracking Algorithm
Generating Crack Code
• Assume that a pixel has a value of “1” if it belongs to
the class “object” and “0” if it belongs to the class
“background”.
• Pixels U and V are used to determine the next
“move” (i.e., the next element of crack code) as
summarized in the following truth table:
U   V   P’  Q’  TURN    CODE*
X   1   V   Q   RIGHT   CODE-1
1   0   U   V   NONE    CODE
0   0   P   U   LEFT    CODE+1
*Implement as a modulo 4 counter
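The truth table translates almost line by line into code. The following is a minimal sketch under stated assumptions: the start pixel has a background pixel directly above it (the CODE 0 configuration), everything outside the image counts as background, and the names `crack_code`, `pixel`, and `FORWARD` are made up for this example.

```python
# Forward step (d_row, d_col) for each crack code: 0=left, 1=down, 2=right, 3=up.
FORWARD = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def pixel(image, r, c):
    """Pixel value, treating everything outside the image as background."""
    if 0 <= r < len(image) and 0 <= c < len(image[0]):
        return image[r][c]
    return 0


def crack_code(image, start):
    """Follow the boundary from object pixel `start`, whose upper neighbor
    (Q) must be background; returns the crack code as a string."""
    p, code = start, 0                        # Q above P corresponds to CODE 0
    first = (p, code)
    out = []
    while True:
        out.append(str(code))
        fr, fc = FORWARD[code]
        qr, qc = FORWARD[(code - 1) % 4]      # Q lies to the right of travel
        q = (p[0] + qr, p[1] + qc)
        u = (p[0] + fr, p[1] + fc)            # pixel ahead of P
        v = (q[0] + fr, q[1] + fc)            # pixel ahead of Q
        if pixel(image, *v) == 1:             # X 1: turn right, P' = V
            p, code = v, (code - 1) % 4
        elif pixel(image, *u) == 1:           # 1 0: go straight, P' = U
            p = u
        else:                                 # 0 0: turn left, P' = P
            code = (code + 1) % 4
        if (p, code) == first:                # back at the first crack: done
            return "".join(out)


image = [
    [0, 1],
    [1, 1],
]
code = crack_code(image, start=(0, 1))
```

Q never has to be stored explicitly, because its position relative to P is fixed by the current code; this is why the table can be implemented as a modulo 4 counter. For the three-pixel L-shaped blob above, the result has eight elements, matching its perimeter of eight cracks.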
Contour Tracking Algorithm
Generating Crack Code
[Figure: crack-code tracking illustration combining the direction codes (0 left, 1 down, 2 right, 3 up), the CODE 0 to CODE 3 pixel configurations, and the truth table from the previous slide.]
Contour Tracking Algorithm
Generating Chain Code
• Identify a pixel P that belongs to the class “object” and
a neighboring pixel (4 neighbor connectivity) R0 that
belongs to the class “background”.
• Assume that a pixel has a value of “1” if it belongs to the
class “object” and “0” if it belongs to the class “background”.
• Assign the 8-connectivity neighbors of P to R0, R1, ..., R7
as follows:
R7 R6 R5
R0 P R4
R1 R2 R3
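The slides give the neighbor layout but not the full stepping rule, so the sketch below fills the gap with a standard Moore-neighbor boundary trace. That choice is an assumption, not necessarily the lecture's exact procedure: neighbors are scanned in the order R0, R1, ..., R7, the first object pixel found becomes the next P, and the scan restarts from the last background neighbor examined. All names are hypothetical.

```python
# Offsets (d_row, d_col) of R0..R7 around P, matching the slide's layout:
#   R7 R6 R5
#   R0 P  R4
#   R1 R2 R3
R = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1)]


def pixel(image, r, c):
    """Pixel value, treating everything outside the image as background."""
    if 0 <= r < len(image) and 0 <= c < len(image[0]):
        return image[r][c]
    return 0


def chain_code(image, start):
    """Trace the 8-connected boundary from `start`, whose R0 (left) neighbor
    must be background. Returns the chain code as a string."""
    p, scan_from = start, 0
    out = []
    first_move = None
    while True:
        for i in range(8):
            k = (scan_from + i) % 8
            if pixel(image, p[0] + R[k][0], p[1] + R[k][1]) == 1:
                break
        else:
            return ""                      # isolated pixel: no boundary steps
        if p == start and k == first_move:
            return "".join(out)            # about to repeat the first step
        if first_move is None:
            first_move = k
        out.append(str(k))
        p = (p[0] + R[k][0], p[1] + R[k][1])
        # Restart the scan at the last background neighbor that was examined.
        scan_from = (k - 2) % 8 if k % 2 == 0 else (k - 3) % 8


square = [
    [1, 1],
    [1, 1],
]
code = chain_code(square, start=(0, 0))
```

Stopping only when the trace is back at the start pixel and about to repeat its first move lets the sketch handle start pixels that the boundary passes through more than once.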
Contour Tracking Algorithm
Generating Chain Code
[Figure: chain-code direction numbering (7 6 5 / 0 4 / 1 2 3) with the R0 to R7 neighbor window shown around P at successive contour positions.]
Object Recognition
Blob Analysis
• Once the image has been segmented into classes
representing the objects in the image, the next step is
to generate a high level description of the various
objects.
• A comprehensive set of form parameters describing
each object or region in an image is useful for object
recognition.
• Ideally the form parameters should be independent of
the object’s position and orientation as well as the
distance between the camera and the object (i.e.,
scale factor).
Object Recognition
Blob Analysis: Form Parameters
• Examples of form parameters that are invariant with
respect to position, orientation, and scale:
• Number of holes in the object
• Compactness or Complexity: (Perimeter)^2 / Area
• Moment invariants
• All of these parameters can be evaluated during
contour following.
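The scale invariance of the compactness measure is easy to confirm numerically. The shapes and sizes below are arbitrary examples chosen for illustration, using ideal geometric perimeters and areas rather than values measured from an image.

```python
import math


def compactness(perimeter, area):
    """Perimeter squared over area; dimensionless, so scale cancels out."""
    return perimeter ** 2 / area


small_square = compactness(perimeter=4 * 1.0, area=1.0 ** 2)
large_square = compactness(perimeter=4 * 5.0, area=5.0 ** 2)
circle = compactness(perimeter=2 * math.pi * 3.0, area=math.pi * 3.0 ** 2)
```

Both squares give 16 regardless of size, while the circle gives 4π (about 12.57), the smallest value any shape can reach.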
Object Recognition
Generalized Moments
• Shape features or form parameters provide a high
level description of objects or regions in an image.
• Many shape features can be conveniently
represented in terms of moments. The (p,q)th
moment of a region R defined by the function f(x,y) is
given by:
$$ m_{pq} = \iint_R x^p y^q f(x, y)\, dx\, dy $$
Object Recognition
Generalized Moments
• In the case of a digital image of size n by m pixels,
this equation simplifies to:
$$ M_{ij} = \sum_{x=1}^{n} \sum_{y=1}^{m} x^i y^j f(x, y) $$
• For binary images the function f(x,y) takes a value of 1
for pixels belonging to class “object” and “0” for class
“background”.
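The double sum can be written directly as a generator expression. This sketch assumes that x indexes columns and y indexes rows, both starting at 1 as in the formula; the function name `moment` and the sample image are illustrative.

```python
def moment(image, i, j):
    """M_ij = sum over x, y of x^i * y^j * f(x, y), with 1-based x and y.

    Assumption: x indexes columns and y indexes rows of `image`.
    """
    return sum((x ** i) * (y ** j) * value
               for y, row in enumerate(image, start=1)
               for x, value in enumerate(row, start=1))


binary = [
    [0, 1, 1],
    [0, 1, 1],
]
area = moment(binary, 0, 0)   # M_00 counts the object pixels
m10 = moment(binary, 1, 0)    # sum of x over object pixels
m01 = moment(binary, 0, 1)    # sum of y over object pixels
```

For a binary image, M_00 is the object's area in pixels, and the first-order moments are the sums of the x and y coordinates of its pixels.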
Object Recognition
Some Useful Moments
• The center of mass of a region can be defined in
terms of generalized moments as follows:
$$ X = \frac{M_{10}}{M_{00}}, \qquad Y = \frac{M_{01}}{M_{00}} $$
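As a small worked example (the object is chosen here for illustration), take a binary object made of the four pixels (x, y) = (2, 1), (3, 1), (2, 2), (3, 2):

```latex
\begin{aligned}
M_{00} &= 4, \qquad M_{10} = 2 + 3 + 2 + 3 = 10, \qquad M_{01} = 1 + 1 + 2 + 2 = 6 \\
X &= \frac{M_{10}}{M_{00}} = \frac{10}{4} = 2.5, \qquad Y = \frac{M_{01}}{M_{00}} = \frac{6}{4} = 1.5
\end{aligned}
```

The center of mass falls at the middle of the 2 by 2 block, as expected.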
Object Recognition
Principal Axis
[Figure: object with its center of mass and principal axis, shown with the X and Y axes.]
Object Recognition
Principal Axis
• The principal axis of an object is the axis passing
through the center of mass which yields the minimum
moment of inertia.
• This axis forms an angle θ with respect to the X axis.
• The principal axis is useful in robotics for determining
the orientation of randomly placed objects.
$$ \tan 2\theta = \frac{2M_{11}}{M_{20} - M_{02}} $$
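The angle formula can be tried on two simple pixel sets. The sketch assumes the second-order moments are taken about the center of mass (central moments), which the slide does not state explicitly but which follows from the axis passing through the center of mass. It also uses `atan2` instead of a plain arctangent, a choice made here to avoid division by zero when the two second-order moments are equal.

```python
import math


def central_moment(pixels, p, q):
    """mu_pq about the centroid; `pixels` is a list of (x, y) object pixels."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    return sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)


def principal_angle(pixels):
    """Angle theta (radians) from tan(2*theta) = 2*mu11 / (mu20 - mu02)."""
    mu11 = central_moment(pixels, 1, 1)
    mu20 = central_moment(pixels, 2, 0)
    mu02 = central_moment(pixels, 0, 2)
    # atan2 keeps the correct quadrant and handles mu20 == mu02.
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


horizontal = [(x, 3) for x in range(1, 6)]
diagonal = [(k, k) for k in range(1, 6)]
theta_h = principal_angle(horizontal)   # bar along the X axis
theta_d = principal_angle(diagonal)     # bar along the line y = x
```

The horizontal bar yields an angle of zero and the diagonal bar yields π/4, measured in the image's own x, y coordinates.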
Object Recognition
Some More Moments
• The minimum and maximum moments of inertia about an
axis passing through the center of mass are given by:
$$ I_{MIN} = \frac{M_{02} + M_{20}}{2} - \frac{\sqrt{(M_{02} - M_{20})^2 + 4M_{11}^2}}{2} $$

$$ I_{MAX} = \frac{M_{02} + M_{20}}{2} + \frac{\sqrt{(M_{02} - M_{20})^2 + 4M_{11}^2}}{2} $$
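A quick numerical check of the two expressions, again assuming the moments are measured about the center of mass; the helper name `inertia_extremes` and the five-pixel row are illustrative choices.

```python
import math


def inertia_extremes(m20, m02, m11):
    """Return (I_min, I_max) from second-order moments about the centroid."""
    mean = (m02 + m20) / 2.0
    half_root = math.sqrt((m02 - m20) ** 2 + 4.0 * m11 ** 2) / 2.0
    return mean - half_root, mean + half_root


# Five pixels in a horizontal row, centered on the centroid: x = -2..2, y = 0.
xs = [-2, -1, 0, 1, 2]
m20 = sum(x * x for x in xs)   # 10
m02 = 0                        # every pixel has y = 0
m11 = 0                        # x * y is zero for every pixel
i_min, i_max = inertia_extremes(m20, m02, m11)
```

The minimum is zero because all pixels lie on the row's own axis, and the two values always sum to M02 + M20.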