International Journal of Computer Theory and Engineering, Vol. 4, No. 1, February 2012
Object Shape Recognition in Image for
Machine Vision Application
Mohd Firdaus Zakaria, Hoo Seng Choon, and Shahrel Azmin Suandi
Abstract—Vision is the most advanced of our senses, so it is not surprising that images play an important role in human perception. The same holds for machine vision, where shape recognition is an important application. This paper proposes a shape recognition method that recognizes circle, square and triangle objects in an image. The method takes the intensity values of the input image and thresholds them with Otsu's method to obtain a binary image. Median filtering is applied to eliminate noise, and the Sobel operator is used to find the edges. Thinning is used to remove unwanted edge pixels, which would otherwise be counted by the perimeter estimation algorithm and increase false detections. The shapes are decided by the compactness of the region. The experimental results show that this method achieves 85% accuracy on a selected database.

Index Terms—Object area, object parameter, and shape recognition.

Manuscript received on October 20, 2010; revised February 5, 2011. This work is supported in part by the Universiti Sains Malaysia Postgraduate Incentive Research Grant No. 1001/PELECT/8021023.
Mohd Firdaus Zakaria and Shahrel Azmin Suandi are with Intelligent Biometric Group, School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, 14300 Nibong Tebal, Seberang Prai Selatan, Penang, Malaysia (e-mail: [email protected], [email protected]).

I. INTRODUCTION
Machine vision is one of the applications of computer vision to industry and manufacturing, whereas computer vision is mainly focused on machine-based image processing. Machine vision usually requires additional digital input or output devices and computer networks to control other manufacturing equipment such as robotic arms. Machine vision is a subfield of engineering that encompasses computer science, optics, mechanical engineering and industrial automation. One of the most common applications of machine vision is the inspection of manufactured goods such as semiconductor chips, automobiles, foods and pharmaceuticals. Just as human inspectors on assembly lines inspect parts visually to judge the quality of workmanship, machine vision systems use input devices such as cameras, together with image processing software, to perform similar inspections.
Machine vision systems are programmed to perform narrowly defined tasks such as shape recognition on a conveyor, reading serial numbers and searching for surface defects. The interaction between human and machine typically consists of the human operator programming and maintaining the machine. As long as the machine acts out preprogrammed behavior, direct interaction between man and machine is not necessary. However, if the machine is to assist a human, for example in a complex assembly operation, it is necessary to have a means of exchanging information about the current scenario between man and machine in real time. The problem cannot be solved if the operator needs to type in the object's coordinates or move the mouse pointer to an image of the object on a screen to enable the machine to detect the objects present on the conveyor. As a result, the machine needs to be equipped with a camera so that it can use the captured image for further processing and identify the types of shape on the conveyor.
Several methods have been developed by past researchers for shape detection, such as the generalized Hough transform [1]–[3] and template matching [4], [5]. However, both of these methods are sensitive to noise and sampling artifacts. To overcome this problem, M. Kass proposed active contour models [6], [7], but this method suffers from complexity and high computational time.
In view of the above, this paper proposes a method for shape recognition, especially for objects on a conveyor, that uses a simple algorithm with low computational time. The proposed method takes the intensity values of the input image and thresholds them with Otsu's method to obtain a binary image. Otsu's method selects the threshold automatically from the grayscale histogram, and the thresholded image contains two regions, i.e., foreground and background. Median filtering is applied to eliminate noise, and the Sobel operator is used to find the edges. Thinning is used to remove unwanted edge pixels, which would otherwise be counted by the perimeter estimation algorithm and increase false detections. The shapes are decided by the compactness of the region. The experimental results show that this method achieves 85% accuracy on a selected database.
The rest of the paper is organized as follows: Section II presents a detailed explanation of the proposed method, Section III presents the results and discussion, and Section IV concludes the paper.

II. PROPOSED METHOD
Fig. 1 shows the block diagram of the proposed method. The input image taken by the input device is first converted to the hue, saturation, and lightness (HSL) color space, where only the L value is processed. The processed L component is used as a template to determine the shape and produce the final output.

A. Color Space Conversion
In the proposed method, the HSL color space is chosen and only one channel, L, is processed instead of the three channels of the RGB color space. The advantages of using one
color channel instead of three channels are that processing time and complexity can be reduced significantly. The L value contains the lightness of the input image, where L is calculated as shown in Eq. (1).

$$L = \frac{\max(R, G, B) + \min(R, G, B)}{2} \qquad (1)$$

where L is the lightness value, R is the red channel of the input image, G is the green channel of the input image and B is the blue channel of the input image.

[Block diagram: RGB to HSL -> Otsu Threshold -> Image Fill -> Median Filtering -> Sobel -> Thinning -> Compactness -> Shape Template -> Output]
Fig. 1. Block diagram of the proposed method

Fig. 2 shows the conversion of the input image from the red, green, blue (RGB) color space to the L channel of the HSL color space. The L image produces good separation between the object and its background.

Fig. 2. RGB color space (left) to L channel (right image) of HSL color space conversion result

B. Otsu's Threshold
Otsu's threshold [8] is a method that selects a threshold automatically from a gray level histogram. It is important to select an adequate gray level threshold to extract the objects from their background. In an ideal case, the histogram has a deep and sharp valley between two peaks representing object and background, respectively, so that the threshold can be chosen at the bottom of this valley as proposed by Prewitt and Mendelsohn [9]. However, for most real images, it is usually difficult to detect the valley bottom precisely, especially when the valley is flat and broad, imbued with noise, or when the two peaks are extremely unequal in height, often producing no traceable valley.
Otsu's method is a nonparametric and unsupervised method of automatic threshold selection for image segmentation. An optimal threshold is selected by the discriminant criterion [10], namely, so as to maximize the separability of the resultant classes in gray level. The procedure is simple, utilizing only the zeroth- and first-order cumulative moments of the gray level histogram.
There are three types of discriminant criteria, and the one used in this paper to obtain an optimal threshold value is shown in Eq. (2).

$$\lambda = \frac{\sigma_B^2}{\sigma_W^2} \qquad (2)$$

where $\lambda$ is the measure of separability of the resultant classes in gray levels, $\sigma_B^2$ is the between-class variance and $\sigma_W^2$ is the within-class variance.
The value of $\lambda$ must be maximized to obtain a suitable threshold value. The optimal threshold value is the one that maximizes the between-class variance, $\sigma_B^2$, or conversely minimizes the within-class variance, $\sigma_W^2$. This directly addresses the problem of evaluating the goodness of a threshold.

Fig. 3. Otsu's threshold

In Fig. 3, the binary image obtained with Otsu's method clearly shows the difference between the objects and the background. The objects are marked with one while the background is marked with zero.

C. Image Fills
Image fill is a function that fills the 'holes' in the binary image of the input image. This method is suitable for eliminating the noise that exists in the image.

Fig. 4. Image fills function

The small circles in Fig. 4 depict the 'holes' in the input image. By applying the image fill algorithm, the 'holes' region is converted to the neighboring value, hence
eliminating the noise.

D. Median Filtering
Median filtering [11] is usually used to reduce 'salt and pepper' noise while preserving edges. In the proposed method, the size of the median filter operator is set to a 10 × 10 matrix.

Fig. 5. 10 × 10 median filtering results

Fig. 5 illustrates the effect of median filtering. In the output image, noise has been reduced to a minimum and some edges have also been smoothed. This process is essential to make sure that all corresponding edges of each object are properly connected so that the perimeter can be computed appropriately.

E. Sobel Operator
The Sobel operator [11] is an operator used in image processing, particularly in edge detection algorithms. It is a discrete differentiation operator that computes an approximation of the gradient of the image intensity function. The result is a two-dimensional map of the gradient at each point, which can be processed and viewed as if it were itself an image, with the areas of high gradient, i.e., the likely edges, visible as white lines.
In the proposed method, the Sobel mask is used to detect the outer edges of each shape, which are needed to compute its perimeter. The perimeter is obtained by counting the white pixels on the edge of a shape. At each image point, the gradient vector points in the direction of the largest possible intensity increase.

Fig. 6. Edge detection results using Sobel operator

Fig. 6 demonstrates edge detection by the Sobel operator. Convolving the Sobel operators with the input image produces the edges: regions of constant value produce zeros, and all other pixels, i.e., the edges, produce ones.

F. Thinning
The morphological thinning operator is the subtraction between the input image and the sup-generating operator with structuring elements A and B. Both structuring elements are rotated by 90° four times, which gives eight structuring elements in total. The result is the input image in which every pixel whose neighborhood matches the pattern specified by A and B is marked as zero. This operation removes the pixels that satisfy the pattern given by the structuring elements A and B [12].
Fig. 7 shows the effect of the thinning process. Thinning is needed here because the pixel count increases if the pixels are not arranged in a straight line.

Fig. 7. Images before (left) and after (right) thinning process

G. Shape Recognition
The proposed method recognizes the shape of an object by computing its compactness [13]. Eq. (3) shows the equation for the compactness calculation.

$$c = \frac{P^2}{A} \qquad (3)$$

where c is the compactness, P is the perimeter and A is the area.
Computing c in this way is applicable to all geometric shapes and is independent of scale and orientation, and its value is dimensionless. In the proposed method, a circle has a compactness in the range of 1 to 14, a square has a compactness from 15 to 19 and a triangle has a compactness from 20 to 40.

Fig. 8. Circle template and circle detection output
Fig. 9. Square template and square detection output
Fig. 10. Triangle template and triangle detection output

Fig. 8, Fig. 9 and Fig. 10 depict the templates of the corresponding shapes: circle, square and triangle, respectively. Each template is determined by the compactness value and applied to the input image in RGB color space to produce the output image.
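The compactness rule of Section II-G can be illustrated with a short sketch. It assumes the compactness is the squared perimeter divided by the area, c = P²/A, which is consistent with the quoted ranges: an ideal circle gives 4π ≈ 12.57, a square gives 16 and an equilateral triangle gives about 20.78. The function names `compactness` and `classify_shape` are hypothetical, and treating the gaps between the integer ranges (for example between 14 and 15) as half-open intervals is an assumption, not something the paper specifies.

```python
import math


def compactness(perimeter, area):
    """Dimensionless compactness, assumed here to be P^2 / A."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / area


def classify_shape(perimeter, area):
    """Map a region to a shape label using the ranges quoted in the text.

    Circle: 1 to 14, square: 15 to 19, triangle: 20 to 40.
    Values between the integer ranges are assigned to the lower range
    (an assumption made for this sketch).
    """
    c = compactness(perimeter, area)
    if 1 <= c < 15:
        return "circle"
    if 15 <= c < 20:
        return "square"
    if 20 <= c <= 40:
        return "triangle"
    return "unknown"


if __name__ == "__main__":
    r, s = 10.0, 10.0
    # Ideal circle of radius r: P = 2*pi*r, A = pi*r^2
    print(classify_shape(2 * math.pi * r, math.pi * r ** 2))
    # Ideal square of side s: P = 4*s, A = s^2
    print(classify_shape(4 * s, s ** 2))
    # Ideal equilateral triangle of side s: P = 3*s, A = sqrt(3)/4 * s^2
    print(classify_shape(3 * s, math.sqrt(3) / 4 * s ** 2))
```

In practice the perimeter is a pixel count along a thinned edge, so measured values deviate from these ideal figures, which is presumably why the ranges are fairly wide.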
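The first two stages of Section II (lightness extraction and Otsu thresholding) can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes 8-bit channels, the standard HSL lightness (max + min) / 2, and objects brighter than the background. The helper names `lightness`, `histogram`, `otsu_threshold` and `binarize` are hypothetical. The threshold search maximizes the between-class variance using only the cumulative pixel count and cumulative gray-level sum, i.e., the zeroth- and first-order cumulative moments of the histogram.

```python
def lightness(r, g, b):
    """HSL lightness of one 8-bit RGB pixel, rounded down to an integer."""
    return (max(r, g, b) + min(r, g, b)) // 2


def histogram(gray_rows, levels=256):
    """Count how many pixels fall on each gray level."""
    counts = [0] * levels
    for row in gray_rows:
        for value in row:
            counts[value] += 1
    return counts


def otsu_threshold(hist):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    Ties are resolved in favor of the lowest such t.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    n0 = 0  # cumulative pixel count (zeroth-order moment)
    s0 = 0  # cumulative gray-level sum (first-order moment)
    for t, count in enumerate(hist):
        n0 += count
        s0 += t * count
        if n0 == 0 or n0 == total:
            continue  # one class is empty, variance undefined
        num = (total_sum * n0 - s0 * total) ** 2
        var_between = num / (total ** 2 * n0 * (total - n0))
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_rows, t):
    """Mark pixels above t as object (1) and the rest as background (0)."""
    return [[1 if value > t else 0 for value in row] for row in gray_rows]


if __name__ == "__main__":
    image = [[50, 50, 200], [200, 50, 200]]
    t = otsu_threshold(histogram(image))
    print(t, binarize(image, t))
```

Working with integer counts instead of normalized probabilities avoids accumulating floating-point error in the cumulative sums; the division is done once per candidate threshold.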
III. RESULTS AND DISCUSSIONS
The proposed method is tested on a database consisting of 70 images of size 640 × 480. The dataset can be divided into four groups: images that contain only one object, three objects of the same shape, three different objects, and multiple different objects. Fig. 11 illustrates the categories in the dataset and Table I shows the corresponding results.
TABLE I: PROPOSED METHOD ACCURACY
Number of Objects            Objects Shape               Number of Objects   Accuracy %
One Object                   Circle                      3                   100
One Object                   Square                      7                   100
One Object                   Triangle                    4                   100
Three Same Objects           Circle                      6                   50
Three Same Objects           Square                      9                   67
Three Same Objects           Triangle                    7                   86
Three Different Objects      Circle, Square, Triangle    18                  100
Multiple Different Objects   Circle, Square, Triangle    16                  75

(a) Input image (b) Circles detection (c) Squares detection (d) Triangles detection
Fig. 12. Example of successful detection

(a) Input image (b) Triangles detection
Fig. 13. Example of inaccurate detection
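The paper does not state how the overall 85% figure is aggregated from Table I. The sketch below simply compares two plausible aggregations, the unweighted mean of the eight rows and the mean weighted by the count column, using the numbers from the table. The names `TABLE_I`, `unweighted_mean_accuracy` and `weighted_mean_accuracy` are hypothetical, and reading the count column as the weight is an assumption.

```python
# Rows of Table I as (group, shape, count, accuracy in percent).
TABLE_I = [
    ("One Object", "Circle", 3, 100),
    ("One Object", "Square", 7, 100),
    ("One Object", "Triangle", 4, 100),
    ("Three Same Objects", "Circle", 6, 50),
    ("Three Same Objects", "Square", 9, 67),
    ("Three Same Objects", "Triangle", 7, 86),
    ("Three Different Objects", "Circle, Square, Triangle", 18, 100),
    ("Multiple Different Objects", "Circle, Square, Triangle", 16, 75),
]


def unweighted_mean_accuracy(rows):
    """Plain average of the per-row accuracies."""
    return sum(acc for _, _, _, acc in rows) / len(rows)


def weighted_mean_accuracy(rows):
    """Average of the per-row accuracies weighted by the count column."""
    total = sum(count for _, _, count, _ in rows)
    return sum(count * acc for _, _, count, acc in rows) / total


if __name__ == "__main__":
    print(sum(count for _, _, count, _ in TABLE_I))
    print(round(unweighted_mean_accuracy(TABLE_I), 2))
    print(round(weighted_mean_accuracy(TABLE_I), 2))
```

Both aggregations land between 84% and 85%, close to the reported overall accuracy, and the counts sum to 70, matching the number of images in the database.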
(a) One object (b) Three same objects (c) Three different objects (d) Multiple different objects
Fig. 11. The four categories of the dataset

Fig. 12 shows an example of successful detection by the proposed method. Fig. 13 shows an example of incorrect detection by the proposed method. There are several reasons why the proposed method produces undesirable detections:
- Because the input image has uneven intensity, the image is not thresholded properly and thus the shapes cannot be detected.
- Some of the objects touch each other, which leads to inaccurate perimeter and area estimation.
- Noise is not totally eliminated, and the remaining noise is detected as objects.
Fig. 14 depicts the advantage of using the HSL color space to obtain the L channel over using the typical grayscale level from the RGB color space.

(a) Otsu's threshold output using gray scale image (b) Otsu's threshold output using lightness channel
Fig. 14. Advantages of HSL color space

IV. CONCLUSION
A shape detection method has been proposed in this paper. Its main objective is to differentiate basic shapes such as circle, square and triangle in a given input image by employing computer vision techniques alone. The method uses compactness as the shape indicator, where the compactness of a circle is fixed from 1 to 14, the compactness of a square is in the range 15 to 19 and the compactness of a triangle is from 20 to 40. From the results in Section III, the proposed method achieved 85% detection accuracy on the selected database. However, the method is sensitive to noise and lighting conditions. An image taken under poor lighting complicates Otsu's threshold algorithm and leads to an undesirable result.
REFERENCES
[1] R. O. Duda and P. E. Hart, “Use of the Hough transformation to detect lines and curves in pictures,” Comm. ACM, vol. 15, pp. 11–15, 1972.
[2] D. H. Ballard, “Generalizing the Hough transform to detect arbitrary shapes,” Pattern Recognition, vol. 13, no. 2, pp. 111–122, 1981.
[3] D. Shi, L. Zheng, and J. Liu, “Advanced Hough transforms using a multilayer fractional Fourier method,” IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1558–1566, 2010.
[4] J. P. Lewis, “Fast template matching,” in Proc. of Canadian Image Processing and Pattern Recognition Society, Quebec, 1995, pp. 120–123.
[5] R. Brunelli, Template Matching Techniques in Computer Vision: Theory and Practice, Wiley, ISBN 978-0-470-51706-2, 2009.
[6] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: active contour models,” International Journal of Computer Vision, vol. 1, no. 4, pp. 321–331, 1987.
[7] C. Xu and J. L. Prince, “Snakes, shapes, and gradient vector flow,” IEEE Transactions on Image Processing, vol. 7, no. 3, pp. 359–369, 1998.
[8] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[9] J. M. S. Prewitt and M. L. Mendelsohn, “The analysis of cell images,” Annals of the New York Academy of Sciences, vol. 128, pp. 1035–1053, 1966.
[10] K. Fukunaga, Introduction to Statistical Pattern Recognition, New York: Academic Press, pp. 225–257, 1972.
[11] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., New Jersey: Prentice Hall, 2002.
[12] V. E. Duro, “Fingerprints thinning algorithms,” IEEE Aerospace and Electronic Systems Magazine, vol. 18, no. 9, pp. 28–30, 2003.
[13] M. Pomplun. [2007] Compactness. [Online]. Available: http://www.cs.umb.edu/~marc/cs675/cv09-11.pdf.