Vehicle Detection and Tracking using
Gaussian Mixture Model and Kalman Filter
Indrabayu¹, Rizki Yusliana Bakti², Intan Sari Areni³, A. Ais Prayogi⁴
¹,²,⁴ Informatics Study Program
³ Electrical Engineering Study Program
Hasanuddin University
Makassar, Indonesia
[email protected],
[email protected],
[email protected],
[email protected]

978-1-5090-5548-7/16/$31.00 ©2016 IEEE

Abstract—An Intelligent Transport System (ITS) is a method used in traffic management to make the road transport system more efficient. One ITS application is the detection and tracking of vehicles. In this research, the Gaussian Mixture Model (GMM) method was applied for vehicle detection and the Kalman Filter method was applied for object tracking. The data are vehicle videos recorded under two conditions: light traffic and heavy traffic. The detection system is validated using Receiver Operating Characteristic (ROC) analysis. Under light traffic, the system achieves 100% precision, 94.44% sensitivity, 100% specificity, and 97.22% accuracy. Under heavy traffic, it achieves 75.79% precision, 88.89% sensitivity, 70.37% specificity, and 79.63% accuracy. The average consistency of the Kalman Filter for object tracking is 100%.

Keywords—Intelligent Transport System, Gaussian Mixture Model, Kalman Filter, Video

I. INTRODUCTION

Technology is advancing rapidly in many fields, including transport, where it has given rise to the Intelligent Transport System (ITS). ITS is a method used in traffic management to make the road-based transport system more efficient, and it has been applied in developed countries. One example of an ITS application is the use of CCTV cameras for surveillance. With it, the transportation authority and decision-makers can easily obtain data for traffic engineering, such as the number of vehicles and vehicle speed.

To obtain the number of vehicles and vehicle speed from CCTV, the first task is to detect the vehicles. Several methods can be used for vehicle detection, such as Histogram of Oriented Gradients (HOG), Viola-Jones, and GMM [1]-[3]. Object detection through CCTV works by distinguishing the object to be detected from other objects. HOG and Viola-Jones are detection methods that rely on an existing database during the detection process. This database consists of two kinds of data: a positive database and a negative database. The positive database is a collection of data that contains the object to be detected, while the negative database is a collection of data that does not. This kind of data distinction is usually applied to detect objects in images. GMM is a detection method that separates foreground objects (moving objects) from background objects (stationary objects). This approach is usually applied to detect objects in video.

The second task, after vehicle detection, is tracking the object in CCTV traffic surveillance. One method that can be used for object tracking is the Kalman Filter [4]-[6].

In this research, vehicle detection is performed using the GMM method, combined with object tracking using the Kalman Filter. Combining these techniques is expected to achieve higher detection accuracy. The remaining sections of this paper are Methodology, Result, and Conclusions.

II. METHODOLOGY

This research used video in .mov format as input, with a frame rate of 25 fps and a resolution of 640 x 480. The data were recorded from the top of a pedestrian bridge with a static camera position. The steps of the research are shown in the following block diagram.

FIGURE 1. SYSTEM BLOCK DIAGRAM

As shown in Figure 1, the first step is preparing the video as input for the system. The next step is vehicle detection using the GMM method, in which foreground objects and background objects are separated. Each object detected as a vehicle is marked with a bounding box. The last step is tracking the detected object in each frame using the Kalman Filter.

Figure 2 shows the complete flowchart of the vehicle detection system. The steps shown in the flowchart are discussed below.

A. Extract Frame
The first step is to process the vehicle video. A video is a sequence of frames: the longer the video, the more frames it contains. The video is then extracted into individual frames.
The extracted frames are processed sequentially, one by one, until the last frame in the video, as shown in Figure 2.

FIGURE 2. FLOWCHART OF CAR DETECTION SYSTEM

B. ROI Segmentation
A Region of Interest (ROI) is the area that contains the object to be detected [6]. ROI segmentation is needed to limit the area to be processed. The first step in ROI segmentation is determining the pixel positions of the polygon that will cover the area outside the observation zone. The second step is covering the Non-ROI area with that polygon. The Non-ROI area is treated as background, so objects passing through it are ignored. Figure 3 shows the ROI segmentation on a frame of the input video. Figure 3(a) is the original image before the area is limited. Figure 3(b) shows the boundary between the ROI and Non-ROI areas.

FIGURE 3. ROI SEGMENTATION. (A) ORIGINAL IMAGE, (B) REGION OF INTEREST AREA

C. Gaussian Mixture Model (GMM)
GMM is a density model that consists of several Gaussian component functions. It performs well for background extraction because it is robust to changes in lighting and conditions during repeated object detection [3]. Each pixel in the video scene is modeled with Gaussian distributions, and each pixel in a frame is compared with the model formed by the GMM. Pixels whose values fall within the standard deviation of a component with a high weight factor are considered background, while pixels with a larger deviation and a lower weight factor are considered foreground [7].

Each pixel is then assigned to one of the GMM candidate models. If the color of the pixel matches a background model, the pixel is given the value zero (0), or black. If the pixel does not match any background model, it is considered foreground and given the value one (1), or white. The resulting binary image is then processed further. The foreground is a moving object that changes position from frame to frame (dynamic), while the background is an object whose position does not change between frames (static) [3].

After the foreground object is detected, a filtering step fills the holes in the foreground object. This research uses a morphology process to filter noise and fill the holes in the detected object.

FIGURE 4. MORPHOLOGY PROCESS FOR FOREGROUND DETECTION. (A) FOREGROUND AND BACKGROUND, (B) FILTER.
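As an illustration of the hole-filling step, the following is a minimal sketch of a morphological closing (dilation followed by erosion) on a small binary mask. The 3 x 3 neighbourhood, the function names, and the sample mask are assumptions made for illustration. The paper does not state which structuring element or operation order was used.

```python
def _filter3x3(mask, rule):
    """Apply `rule` (any or all) to the 3x3 neighbourhood of every pixel."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbourhood clipped at the image border.
            window = [
                mask[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = 1 if rule(window) else 0
    return out


def dilate(mask):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return _filter3x3(mask, any)


def erode(mask):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return _filter3x3(mask, all)


def close_holes(mask):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask))


# Hypothetical 7x7 foreground mask: a 3x3 blob with a one-pixel hole in its centre.
mask = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        mask[r][c] = 1
mask[3][3] = 0

closed = close_holes(mask)
for row in closed:
    print("".join(str(v) for v in row))
```

In this sketch the dilation fills the hole and grows the blob by one pixel, and the erosion shrinks it back to its original outline, so only the hole changes. Reversing the order (erosion, then dilation) gives an opening, which removes isolated noise pixels instead.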
Figure 4(a) shows the detected foreground object and background. In Figure 4(a), there are holes in the foreground object. To make the detected foreground object clearer, a morphology process is performed. A morphology operation is a filter that combines erosion and dilation on a binary or grayscale image. The result of morphological filtering is shown in Figure 4(b).

D. Car Detection
The detected foreground object is compared with the blob area. An object that matches the blob area is detected as a vehicle and marked with a bounding box, while an object that is detected as foreground but does not match the blob area is ignored and is not marked with a bounding box. Figure 5 shows a detected vehicle marked with a bounding box.

FIGURE 5. BOUNDING BOX

E. Kalman Filter
After the vehicle is detected, the system proceeds with object tracking. Object tracking is a computer vision method for finding the location of a detected object [8]. In this research, the Kalman Filter method was used for object tracking. The Kalman filter is a recursive method that performs well for tracking objects in video frames [4],[9]. It uses information about the detected object in the previous frame to estimate the object's new position.

The Kalman filter consists of two steps, prediction and correction [6], [9]. The prediction step projects the current object position forward to estimate its future state. The correction step provides feedback: it combines the actual measurement with the prior estimate to obtain an improved posterior estimate [6].

III. RESULT
This research uses video data in .mov format with a resolution of 640 x 480 pixels. The data were collected on an urban roadway under two conditions: light traffic and heavy traffic.

The methods used are the Gaussian Mixture Model (GMM) for object detection and the Kalman filter for object tracking. Object detection in the video is determined by the foreground size. Causes of detection errors in the GMM method include vehicle shadows being detected as objects and two adjacent vehicles being treated as a single object [10]. For this reason, data collection was done during the day, taking lighting into account.

The ROI boundary is determined before object detection, a step called ROI segmentation, to filter out unneeded areas. In this study, the area outside the ROI boundary is called the Non-ROI area. Moving objects in the Non-ROI area are ignored and treated as background. ROI segmentation determines the boundary pixel positions of the Non-ROI area.

FIGURE 6. PIXEL POSITIONS OF NON-ROI AREA BOUNDARY

Figure 6 shows the pixel positions of the Non-ROI area boundary. After these positions are determined, a polygon is created to cover the area. The next step is object detection using GMM.

FIGURE 7. OBJECT DETECTION IN THE LIGHT TRAFFIC CONDITION. (A) ORIGINAL IMAGE, (B) BINARY IMAGE
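The per-pixel background test used in the GMM detection step can be sketched for a single grayscale pixel as follows. This is a simplified version of the commonly used adaptive mixture formulation, not the authors' implementation. The number of components, the learning rate, the 2.5 standard deviation match threshold, the background weight ratio, and the class name `PixelGMM` are all assumptions, since the paper does not report its parameters.

```python
import math


class PixelGMM:
    """Adaptive mixture of Gaussians over the intensity history of one pixel."""

    def __init__(self, k=3, alpha=0.02, match_sigmas=2.5,
                 background_ratio=0.7, initial_var=225.0, min_var=4.0):
        self.k = k                                # maximum number of components
        self.alpha = alpha                        # learning rate (assumed)
        self.match_sigmas = match_sigmas          # match if within this many std devs
        self.background_ratio = background_ratio  # weight mass treated as background
        self.initial_var = initial_var
        self.min_var = min_var                    # keeps components from collapsing
        self.components = []                      # each entry: [weight, mean, variance]

    def _ranked(self):
        # High weight and low spread first: the most likely background models.
        return sorted(self.components,
                      key=lambda c: c[0] / math.sqrt(c[2]), reverse=True)

    def apply(self, value):
        """Update the model with one intensity; return 0 (background) or 1 (foreground)."""
        matched = None
        for comp in self._ranked():
            if abs(value - comp[1]) <= self.match_sigmas * math.sqrt(comp[2]):
                matched = comp
                break

        for comp in self.components:              # decay every weight
            comp[0] *= 1.0 - self.alpha
        if matched is not None:                   # reinforce the matched component
            diff = value - matched[1]
            matched[0] += self.alpha
            matched[1] += self.alpha * diff
            matched[2] = max(self.min_var,
                             matched[2] + self.alpha * (diff * diff - matched[2]))
        else:                                     # start a new low-weight component
            new = [self.alpha, float(value), self.initial_var]
            if len(self.components) < self.k:
                self.components.append(new)
            else:
                weakest = min(range(self.k), key=lambda i: self.components[i][0])
                self.components[weakest] = new

        total = sum(c[0] for c in self.components)
        for comp in self.components:              # renormalise the weights
            comp[0] /= total

        # Background = top-ranked components whose weights together exceed the ratio.
        mass, background = 0.0, []
        for comp in self._ranked():
            background.append(comp)
            mass += comp[0]
            if mass > self.background_ratio:
                break
        is_background = matched is not None and any(matched is c for c in background)
        return 0 if is_background else 1


model = PixelGMM()
# Hypothetical intensity history: road (100), a passing vehicle (200), road again.
history = [100] * 50 + [200] * 3 + [100] * 3
labels = [model.apply(v) for v in history]
print(labels[48:])  # [0, 0, 1, 1, 1, 0, 0, 0]
```

Running one such model per pixel and collecting the returned labels gives the binary image described in the methodology. With these assumed settings, an object that stays in place long enough gains weight and is eventually absorbed into the background.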
Figure 7 shows vehicle detection using GMM in the light traffic condition. The GMM method detects a moving object based on the blob size of the detected foreground. When the foreground meets the blob size, it is treated as an object and given a bounding box. Otherwise, it is ignored, as shown in Figure 7(b). In the light traffic condition, the vehicles are clearly separated, so no detection errors occur.

FIGURE 8. OBJECT DETECTION IN THE HEAVY TRAFFIC CONDITION. (A) ORIGINAL IMAGE, (B) BINARY IMAGE

Figure 8 shows vehicle detection in the heavy traffic condition. Figure 8(a) shows a vehicle overlapping another vehicle, so the two adjacent vehicles are detected as a single object, as shown in Figure 8(b).

The performance of the detection system is measured by Receiver Operating Characteristic (ROC) analysis. The parameters in the ROC analysis are TP (True Positive), FN (False Negative), FP (False Positive), and TN (True Negative). The system performance is determined by the equations below.

Precision / Positive Predictive Value (PPV):
$$\mathrm{PPV} = \frac{TP}{TP + FP}$$

Specificity / True Negative Rate (TNR):
$$\mathrm{TNR} = \frac{TN}{TN + FP}$$

Sensitivity / Recall / True Positive Rate (TPR):
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$

Accuracy:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

TABLE I. THE RESULT OF DETECTION SYSTEM

Table I shows that precision is 100% for light traffic and 75.79% for heavy traffic. Sensitivity is 94.44% for light traffic and 88.89% for heavy traffic. Specificity is 100% for light traffic and 70.37% for heavy traffic. System accuracy is 97.22% for light traffic and 79.63% for heavy traffic. These results indicate that the GMM method performs better in the light traffic condition.

FIGURE 9. DETECTED VEHICLE IN EACH FRAME

Figure 9 shows the detected vehicle in each frame. The vehicle is detected from frame 80 to frame 90. Object tracking establishes that the object detected in the first frame is the same object detected in the following frames.

Object tracking is validated by the consistency of the tracked object ID predictions, expressed as a percentage and calculated with a formula defined in terms of the following quantities:
Pn = number of consistent predictions for the nth ID
Dn = number of data points for the nth ID
N = total number of IDs in the video
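As a worked check of the evaluation formulas, the sketch below computes the four detection metrics and a tracking consistency score. The counts are hypothetical: the text does not list the raw TP, FN, FP, and TN values, and the ones used here were chosen only because they reproduce the reported light-traffic percentages. The consistency formula is also an assumption. One reading that fits the definitions of Pn, Dn, and N is the mean per-ID ratio, $\frac{100\%}{N}\sum_{n=1}^{N} P_n / D_n$, and that is what `tracking_consistency` implements.

```python
def detection_metrics(tp, fn, fp, tn):
    """Return precision, sensitivity, specificity and accuracy as percentages."""
    return {
        "precision": 100.0 * tp / (tp + fp),
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + fp + tn),
    }


def tracking_consistency(consistent, total):
    """Assumed formula: mean over all IDs of P_n / D_n, as a percentage."""
    ratios = [p / d for p, d in zip(consistent, total)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical counts that happen to match the reported light-traffic results.
light = detection_metrics(tp=17, fn=1, fp=0, tn=18)
print({name: round(value, 2) for name, value in light.items()})
# {'precision': 100.0, 'sensitivity': 94.44, 'specificity': 100.0, 'accuracy': 97.22}

# Hypothetical track: one ID observed in 11 frames and predicted consistently in all.
print(tracking_consistency(consistent=[11], total=[11]))  # 100.0
```

The functions do not guard against empty denominators, so a condition with no positive detections or no negative samples would need separate handling.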
The test results for the prediction consistency of the object tracking system using the Kalman filter method are shown in Table II.
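The prediction and correction steps of the Kalman filter can be sketched for one coordinate of a tracked vehicle's centroid as follows. This is a minimal constant-velocity filter under stated assumptions, not the configuration used in the paper. The motion model, the noise variances, the initial uncertainty, and the class name `Kalman1D` are all chosen for illustration.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate of a centroid."""

    def __init__(self, position, dt=1.0, process_var=1e-2, measurement_var=4.0):
        self.p, self.v = float(position), 0.0      # state: position and velocity
        self.P = [[10.0, 0.0], [0.0, 10.0]]        # state covariance (assumed start)
        self.dt, self.q, self.r = dt, process_var, measurement_var

    def predict(self):
        """Project the state and its covariance one frame ahead."""
        dt = self.dt
        (a, b), (c, d) = self.P
        self.p += dt * self.v
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.p

    def correct(self, z):
        """Blend the prediction with a measured position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                             # innovation variance
        k0, k1 = a / s, c / s                      # Kalman gain for position, velocity
        residual = z - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        # P = (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]
        return self.p


tracker = Kalman1D(position=100.0)
# Hypothetical centroid x-coordinates of a vehicle moving 5 pixels per frame.
for frame in range(1, 21):
    tracker.predict()                      # expected position before the detection
    tracker.correct(100.0 + 5.0 * frame)   # detected position for this frame
next_x = tracker.predict()                 # prediction for frame 21
print(round(tracker.v, 2), round(next_x, 1))
```

A second filter of the same form would handle the other image coordinate. In a tracker, the predicted position is what allows a detection in the next frame to be assigned to the same object ID.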
TABLE II. THE RESULT OF TRACKING OBJECT SYSTEM

Table II shows that the prediction consistency of the Kalman filter method reaches 100%. This indicates that the method is appropriate for object tracking.

IV. CONCLUSIONS
This research was conducted using vehicle video data divided into two conditions, light traffic and heavy traffic. Object detection uses the Gaussian Mixture Model method and object tracking uses the Kalman filter method. The detection system is validated using ROC analysis with the parameters precision, sensitivity, specificity, and accuracy. The light traffic condition obtains a precision of 100%, sensitivity of 94.44%, specificity of 100%, and accuracy of 97.22%. The heavy traffic condition obtains a precision of 75.79%, sensitivity of 88.89%, specificity of 70.37%, and accuracy of 79.63%. The consistency of object tracking reaches 100%. The results show that the GMM method works properly in light traffic.

REFERENCES
[1] R. Y. Bakti, Indrabayu, and I. S. Areni, “Cascade Classification for Car Detection,” SNATIKA, 2015. [Indonesian]
[2] D. Djamaluddin, T. Indrabulan, Andani, Indrabayu, and S. W. Sidehabi, “The simulation of vehicle counting system for traffic surveillance using Viola Jones method,” Makassar Int. Conf. Electr. Eng. Inform. MICEEI, pp. 130–135, 2014.
[3] Indrabayu, Basri, A. Achmad, I. Nurtanio, and F. Mayasari, “Blob Modification in Counting Vehicles using Gaussian Mixture Models Under Heavy Traffic,” Asian Res. Publ. Netw. ARPN, vol. 10, 2015.
[4] W. L. Khong, W. Y. Kow, H. T. Tan, H. P. Yoong, and K. T. K. Teo, “Kalman Filtering Based Object Tracking in Surveillance Video System,” CUTSE Int. Conf., 2011.
[5] C. S. Rao and P. Darwin, “Frame Difference and Kalman Filter Techniques for Detection of Moving Vehicles in Video Surveillance,” Int. J. Eng. Res. Appl. IJERA, vol. 2, no. 6, pp. 1168–1170, 2012.
[6] S. Han and N. Vasconcelos, “Object-Based Regions of Interest for Image Compression,” University of California, San Diego.
[6] C. Li, L. Guo, and Y. Hu, “A New Method Combining HOG and Kalman Filter for Video-based Human Detection and Tracking,” Int. Congr. Image Signal Process. CISP2010, 2010.
[7] A. Nurhadiyatna, B. Hardjono, A. Wibisono, I. Sina, W. Jatmiko, M. A. Ma’sum, and P. Mursanto, “Improved Vehicle Speed Estimation Using Gaussian Mixture Model and Hole Filling Algorithm,” ICACSIS, 2013.
[8] I. A. Siradjuddin, M. R. Widyanto, and T. Basaruddin, “Particle Filter with Gaussian Weighting for Human Tracking,” TELKOMNIKA, vol. 10, 2012.
[9] H. S. Parekh, D. G. Thakore, and U. K. Jaliya, “A Survey on Object Detection and Tracking Methods,” Int. J. Innov. Res. Comput. Commun. Eng. IJIRCCE, vol. 2, no. 2, 2014.
[10] R. Lim, R. Sutjiadi, and E. Setyati, “Adaptive Background Extraction-Gaussian Mixture Models Method for Vehicle Counting Application in video base,” National Conference of Informatics Engineering and Information System (SeTISI), pp. 19, 2011. [Indonesian]