Pothole Detection Using Computer Vision and Learning
by using an accelerometer and GPS. The proposed system lacks accuracy regarding the isolation of potholes from other road anomalies.

C. 2D-Vision-Based Methods

Vision-based methods use two-dimensional (2D) image or video data, captured using a digital camera, and process these data using 2D image or video processing techniques [21], [22]. The choice of the applied image processing techniques depends strongly on the application for which the 2D images are being processed.
Koch and Brilakis [8] proposed a method aiming at a separation of defect and non-defect regions in an image using a histogram-shape-based threshold. The authors consider the shape of a pothole to be approximately elliptical in a perspective view. They emphasize using machine learning in future work, and claim that the proposed work already results in 86% Accuracy along with 86% Recall and 82% Precision, with the common definitions

    Precision = TP / (TP + FP)                          (1)

    Recall = TP / (TP + FN)                             (2)

    Accuracy = (TP + TN) / (TP + TN + FP + FN)          (3)

    F1 = 2 · (Precision · Recall) / (Precision + Recall)  (4)

where TP is the number of true positives, FP of false positives, TN of true negatives, and FN of false negatives.
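These measures translate directly into code; the following minimal sketch assumes the counts tp, fp, tn, and fn have already been obtained by matching detections against ground truth:

    def detection_metrics(tp, fp, tn, fn):
        """Measures (1)-(4) from raw detection counts.

        Assumes all denominators are nonzero; a production version
        should guard against empty classes.
        """
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, accuracy, f1

    # Illustrative counts only (not taken from any cited experiment)
    print(detection_metrics(tp=86, fp=18, tn=80, fn=14))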
Tedeschi and Benedetto [10] recently suggested a system for automatic pavement distress recognition (APDR) which is able to perform in real time by identifying road distress including fatigue cracks, longitudinal and transversal cracks, and potholes. The authors used a combination of technologies of the OpenCV library; for the classification of the three different types of road distress, three classifiers based on local binary pattern (LBP) features have been used; they achieved more than 70% for Precision, Recall, and the F1-measure.

The authors discussed difficulties in defining the severity of the considered kinds of road distress. For texture classification they used Haralick's features [23], based on gray-level co-occurrence matrices (GLCMs), and then classified image regions using a tool from [24].
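As an illustration of such texture descriptors (a sketch, not the exact pipeline of [10]; window size, quantization, and neighbourhood parameters are assumptions), the following computes GLCM statistics in the spirit of Haralick's features [23] together with an LBP histogram, using scikit-image:

    import numpy as np
    from skimage.feature import (graycomatrix, graycoprops,
                                 local_binary_pattern)  # scikit-image >= 0.19 spelling

    def texture_descriptor(region):
        """Describe one gray-scale image region (2D uint8 array) by
        GLCM statistics and a uniform-LBP histogram."""
        # Co-occurrence matrices at distance 1 over four directions
        glcm = graycomatrix(region, distances=[1],
                            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=256, symmetric=True, normed=True)
        haralick = [graycoprops(glcm, p).mean()
                    for p in ('contrast', 'homogeneity', 'energy', 'correlation')]

        # Uniform LBP with 8 neighbours at radius 1 yields values 0..9
        lbp = local_binary_pattern(region, P=8, R=1, method='uniform')
        hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
        return np.concatenate([haralick, hist])

Such a descriptor can then be passed to any off-the-shelf classifier, e.g., the machine-learning tools of [24].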
Ryu et al. [26] proposed a method to detect potholes on both asphalt and concrete road surfaces using 2D images collected by an optical device mounted on a survey vehicle. The system mainly works in the three steps of image segmentation, candidate region extraction, and decision. The system fails to detect potholes in darker images (image regions) due to shadows (e.g., of trees or cars) present in real-world road recordings.

Powell and Satheeshkumar [27] present a method for the detection of potholes by segmenting images into defected and non-defected regions. After extracting the texture information from defected regions, this texture information is compared with texture information obtained from non-defected regions. The proposed system considers shadow effects on the road and aims to remove those effects using a shadow-removal algorithm. The system is unable to perform in rainy weather. The authors concluded that the system should be further extended to also work on video data, as it was only tested on 2D images collected using an iPhone camera with a 5-megapixel image resolution.

Bashkar and Manohar [44] propose a methodology for estimating a pothole's mean depth by using SURF features on uncalibrated stereo pairs of images (without employing disparity images). A particular methodology has been developed for this purpose, but it appears to suffer from uncalibrated stereo rectification; it is far from providing good results.

Ying et al. [46] proposed a system which can detect the road surface based on a feature detector which is optimized for shadow occurrence. This system uses a connected-component-analysis algorithm and other morphological algorithms, and is demonstrated on images of the datasets provided by KITTI [47] and ROMA [48].

Thekkethala et al. [25] used two (stereoscopic) cameras and applied stereo matching to estimate the depth of a pothole on an asphalt pavement surface. After performing binarization and morphological operations, a skeleton of a pothole is estimated. The system was tested on 24 images, and no estimates of depth have been provided. The system can detect skeletons of potholes of great depression. The authors did not estimate the road manifold.

D. 3D Scene Reconstruction-Based Methods

3D scene reconstruction is the process of capturing the shape, depth, and appearance of objects in the real world; it relies on 3D surface reconstruction, which typically demands more computation than 2D vision. Rendering of surface elevations helps to understand accuracy during the design of 3D vision systems. 3D scene reconstruction can be based on various types of sensors, such as a Kinect [28], stereo-vision cameras, or a 3D laser. Kinect sensors are mainly used in the fields of (indoor) robotics or gaming.

3D lasers define an advanced road-survey technology; compared to camera-based systems it still comes with higher costs; [30], [31] report survey cycles of (usually) once in four years. A 3D laser uses a laser source to illuminate the surface and a scan camera for capturing the created light patterns. [32] applied the common laser-line projection; the recorded laser line deforms when it strikes an obstacle (and thus supports the 3D reconstruction), but does not work well, e.g., on wet roads or in potholes filled with water.

Stereo-vision cameras are considered to be cost-effective compared to other sensors. Stereo vision aims at effective and accurate disparities, calculated from left-and-right image pairs, to be used for estimating depth or distance; see, for example, [33]. Commonly, the canonical left-right calibrated stereo camera setup is used when aiming at a reconstruction of dense 3D surfaces. A disparity map represents per-pixel correspondences for a rectified stereo pair. Figure 2 illustrates a recorded 3D scene with a calculated (color-encoded) disparity map.
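For orientation, disparity maps of this kind are commonly computed with semi-global matching [29]; the following OpenCV sketch uses illustrative parameters only (block size, disparity range, and the focal length f and baseline b are assumptions, not values of any stereo rig reported here):

    import cv2
    import numpy as np

    # Rectified left and right images of a canonical stereo pair
    left = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)
    right = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)

    block = 5
    sgbm = cv2.StereoSGBM_create(
        minDisparity=0, numDisparities=128, blockSize=block,
        P1=8 * block * block, P2=32 * block * block,
        uniquenessRatio=10, speckleWindowSize=100, speckleRange=2)

    # OpenCV returns fixed-point disparities scaled by 16
    disparity = sgbm.compute(left, right).astype(np.float32) / 16.0

    # Depth from disparity: Z = f * b / d (f in pixels, b in metres)
    f, b = 700.0, 0.30
    with np.errstate(divide='ignore', invalid='ignore'):
        depth = np.where(disparity > 0, f * b / disparity, 0.0)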
TABLE III. Examples of 3D Reconstruction-Based Methods and Used Sensors
TABLE IV. Examples of CNNs for Image Segmentation, and Used Data Sets
level of accuracy. A video of the dataset can be seen here: https://vimeo.com/337886918; this video runs at 29 fps.
Fig. 10. Detected "potholes" using the LM2 method, shown in two columns with the original image on the left and the predicted results on the right.
Fig. 12. Detected "potholes" from the validation dataset using the LM1 method, shown in two columns with the original image on the left and the predicted results on the right. Top to bottom, left to right: ten frames in the order listed in Table V.
Fig. 13. Examples of false detections from the PNW test dataset using LM1. Pothole detections, including false positives, for frames 3070 to 3076 are shown in the top two rows.
promising. Figure 12 shows that, in the validation dataset, pothole instances are correctly identified, while a false positive has been detected in the third image (from the bottom): as a pothole is of arbitrary shape, under bright sunshine a tree is misclassified as a pothole in this case (this could be excluded by identifying a ground manifold first).
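A common way to identify such a ground manifold is a v-disparity analysis [68]: row-wise disparity histograms in which the road surface shows up as a dominant line, so that detections lying far off the fitted road profile can be discarded. A minimal sketch, assuming a disparity map as input (the simple weighted line fit stands in for the more robust fitting a real system would use):

    import numpy as np

    def road_profile_from_v_disparity(disparity, d_max=128):
        """Fit a linear road profile d(v) = a * v + c in v-disparity
        space [68]; disparity holds integer disparities, invalid <= 0."""
        rows = disparity.shape[0]
        v_disp = np.zeros((rows, d_max), dtype=np.int32)
        for v in range(rows):
            d = disparity[v]
            d = d[(d > 0) & (d < d_max)].astype(np.int32)
            np.add.at(v_disp[v], d, 1)   # row-wise disparity histogram

        v = np.arange(rows)
        d_mode = v_disp.argmax(axis=1)   # dominant disparity per image row
        w = v_disp.max(axis=1)           # histogram support as fit weight
        keep = w > 0
        a, c = np.polyfit(v[keep], d_mode[keep], deg=1, w=w[keep])
        return a, c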
Fig. 14. The pothole marked in red presents a very complex situation, as it is filled with water and covered by the shadow of a tree.
TABLE VI. Comparative Evaluation of the Proposed LM1, SV2, and SV1 (in %)
Fig. 16. Detected "potholes" using the LM1 method, shown in two columns with the original image on the left and the predicted results on the right.
TABLE VIII. Evaluation Measures (in %) for LM2 Results Shown in Fig. 12; Note That BB1 Represents Bounding Box 1, and BB2 Bounding Box 2, for Different Instances of Potholes

Fig. 17. The potholes marked in purple can be perceived as one big pothole or can be counted separately (two potholes in the left image, six potholes in the right image); other potholes are not marked here, but were considered in the experiments.

evaluation measure. However, as shown in Fig. 17, potholes that are adjacent to each other can be identified either as one big pothole or as multiple small potholes. In conclusion, we did not include pothole counts as an evaluation measure because they depend on individual counting.

Using the LM2 method, the developed model is applicable in real-time scenarios. In Table VIII, the IoU is greater than 50%; thus we may say that the results are promising. The PNW test frames are divided into the same number of grid cells as selected during training, i.e., 13 × 13. The model can predict multiple bounding boxes in each grid cell, so we keep the one with the highest IoU value. This leads to an enforcement of spatial diversity in making predictions.
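The IoU criterion used here can be sketched as follows (a generic implementation with boxes given as (x1, y1, x2, y2) corner coordinates, not the authors' code):

    def iou(box_a, box_b):
        """Intersection over union of two axis-aligned boxes given
        as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
        x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    def best_box_per_cell(candidates, truth):
        """Keep, per grid cell, the candidate box with the highest IoU
        against the ground-truth box, as described above."""
        return max(candidates, key=lambda box: iou(box, truth))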
VI. CONCLUSION

The gravity of pothole-related accidents can be understood from the increasing numbers of accidents around the world due to potholes. In this research, four different techniques are proposed and tested against each other. Each technique has its own benefits and can provide different pathways to a number of applications. The LM1 model can identify a pothole under challenging weather conditions with good precision and recall, whereas the LM2 model is capable of real-time pothole identification. The SV2 approach can identify potholes and road manifolds with very high accuracy when used with stereo-vision cameras; it can also be used to track a pothole from one frame to another, and is relatively easy to implement.

The findings that we have presented here suggest that it is very difficult to define the irregular shape of a pothole, which further makes it difficult to annotate ground truth. This, in turn, makes matching results against ground truth a complex process. To date, there is no platform or benchmark available for pothole identification. As a result of conducting this research, we also put forward six datasets specifically for pothole identification, and discussed applications of two different areas of research, namely computer vision and deep learning.
It would be fruitful to pursue further research combining the output of LM1 for annotating pothole data, and using it to train further LM2-type models in order to increase detection accuracy for real-time purposes.

ACKNOWLEDGMENT

The authors acknowledge fruitful discussions with Hsiang-Jen Chien, Auckland, New Zealand, which have been very helpful at various steps of the reported work.

REFERENCES

[1] A. Heaton, "Potholes and more potholes: Is it just us," Tech. Rep., Mar. 2018. [Online]. Available: https://medium.com
[2] (2018). Pothole Facts. [Online]. Available: www.pothole.info/the-facts
[3] N. Dwivedi, "The pothole proposition," Tech. Rep., Aug. 2018. [Online]. Available: https://medium.com
[4] (2018). Christchurch Report. [Online]. Available: www.stuff.co.nz/the-press/news/100847641/christchurch-the-pothole-capital-of-new-zealand/
[5] H. Kong, J.-Y. Audibert, and J. Ponce, "General road detection from a single image," IEEE Trans. Image Process., vol. 19, no. 8, pp. 2211–2220, Aug. 2010.
[6] X. Ai, Y. Gao, J. G. Rarity, and N. Dahnoun, "Obstacle detection using U-disparity on quadratic road surfaces," in Proc. Int. Conf. Intell. Transp. Syst., Oct. 2013, pp. 1352–1357.
[7] F. Oniga, S. Nedevschi, M. M. Meinecke, and T. B. To, "Road surface and obstacle detection based on elevation maps from dense stereo," in Proc. Int. Conf. Intell. Transp. Syst., Oct. 2007, pp. 859–865.
[8] C. Koch and I. Brilakis, "Pothole detection in asphalt pavement images," Adv. Eng. Inform., vol. 25, no. 3, pp. 507–515, 2011.
[9] (2018). Driverless Car Market Watch. [Online]. Available: http://www.driverless-future.com/?page_id=384
[10] A. Tedeschi and F. Benedetto, "A real-time automatic pavement crack and pothole recognition system for mobile Android-based devices," Adv. Eng. Inform., vol. 32, pp. 11–25, Apr. 2017.
[11] B.-H. Lin and S.-F. Tseng, "A predictive analysis of citizen hotlines 1999 and traffic accidents: A case study of Taoyuan City," in Proc. Int. Conf. Big Data Smart Comput., Feb. 2017, pp. 374–376.
[12] D. Santani et al., "CommuniSense: Crowdsourcing road hazards in Nairobi," in Proc. Int. Conf. Hum.-Comput. Interact. Mobile Devices Services, Aug. 2015, pp. 445–456.
[13] D. O'Carroll, "For the love of pizza, Domino's is now fixing potholes in roads," Wellington, New Zealand, Tech. Rep., Jun. 2018. [Online]. Available: https://stuff.co.nz
[14] A. Dhiman, H.-J. Chien, and R. Klette, "Road surface distress detection in disparity space," in Proc. Int. Conf. Image Vis. Comput. New Zealand, Dec. 2017, pp. 1–6.
[15] A. Dhiman, H.-J. Chien, and R. Klette, "A multi-frame stereo vision-based road profiling technique for distress analysis," in Proc. ISPAN, Oct. 2018, pp. 7–14.
[16] A. Dhiman, S. Sharma, and R. Klette, Identification of Road Potholes. Stratford, U.K.: MIND, 2019.
[17] A. Mednis, G. Stardins, R. Zviedris, G. Kanonirs, and L. Selavo, "Real time pothole detection using Android smartphones with accelerometers," in Proc. Int. Conf. Distrib. Comput. Sensor Syst. Workshops, Jun. 2011, pp. 1–6.
[18] M. Ghadge, D. Pandey, and D. Kalbande, "Machine learning approach for predicting bumps on road," in Proc. Int. Conf. Appl. Theor. Comput. Commun. Technol., Oct. 2015, pp. 481–485.
[19] F. Seraj, B. J. van der Zwaag, A. Dilo, T. Luarasi, and P. Havinga, "RoADS: A road pavement monitoring system for anomaly detection using smart phones," in Proc. Int. Workshop Modeling Social Media, Jan. 2014, pp. 128–146.
[20] J. Ren and D. Liu, "PADS: A reliable pothole detection system using machine learning," in Proc. Int. Conf. Smart Comput. Commun., Jan. 2016, pp. 327–338.
[21] K. Georgieva, C. Koch, and M. König, "Wavelet transform on multi-GPU for real-time pavement distress detection," in Proc. Comput. Civil Eng., May 2015, pp. 99–106.
[22] K. Doycheva, C. Koch, and M. König, "Implementing textural features on GPUs for improved real-time pavement distress detection," Real-Time Image Process., vol. 33, pp. 1–12, Sep. 2016.
[23] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst., Man, Cybern., vol. SMC-3, no. 6, pp. 610–621, Nov. 1973.
[24] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. San Mateo, CA, USA: Morgan Kaufmann, 2016.
[25] M. V. Thekkethala and S. Reshma, "Pothole detection and volume estimation using stereoscopic cameras," in Proc. Int. Conf. Mixed Design Integr. Circuits Syst., 2016, pp. 47–51.
[26] S.-K. Ryu, T. Kim, and Y.-R. Kim, "Image-based pothole detection system for ITS service and road management system," Math. Problems Eng., vol. 2015, 2015, Art. no. 968361.
[27] L. Powell and K. G. Satheeshkumar, "Automated road distress detection," in Proc. Int. Conf. Emerg. Technol. Trends, 2016, pp. 1–6.
[28] A. Rasheed, K. Kamal, T. Zafar, S. Mathavan, and M. Rahman, "Stabilization of 3D pavement images for pothole metrology using the Kalman filter," in Proc. Int. Conf. Intell. Transp. Syst., 2015, pp. 2671–2676.
[29] H. Hirschmüller, "Accurate and efficient stereo processing by semi-global matching and mutual information," in Proc. Int. Conf. Comput. Vis. Pattern Recognit., 2005, pp. 807–814.
[30] Q. Li, M. Yao, X. Yao, and B. Xu, "A real-time 3D scanning system for pavement distortion inspection," Meas. Sci. Technol., vol. 21, no. 8, pp. 015702-1–015702-8, 2010.
[31] X. Yu and E. Salari, "Pavement pothole detection and severity measurement using laser imaging," in Proc. Int. Conf. Electro/Inf. Technol., 2011, pp. 1–5.
[32] K. K. Vupparaboina, R. R. Tamboli, P. M. Shenu, and S. Jana, "Laser-based detection and depth estimation of dry and water-filled potholes: A geometric approach," in Proc. Nat. Conf. Commun., 2015, pp. 1–6.
[33] R. Klette, Concise Computer Vision: An Introduction Into Theory and Algorithms. London, U.K.: Springer, 2014.
[34] Z. Zhang, X. Ai, C. K. Chan, and N. Dahnoun, "An efficient algorithm for pothole detection using stereo vision," in Proc. Int. Conf. Acoust., Speech Signal Process., 2014, pp. 564–568.
[35] W. Khan, "Accuracy of stereo-based object tracking in a driver assistance context," Ph.D. dissertation, Dept. Comput. Sci., Auckland Univ., Auckland, New Zealand, 2013.
[36] H. Youquan, W. Jian, Q. Hanxing, Z. Wei, and X. Jianfang, "A research of pavement potholes detection based on three-dimensional projection transformation," in Proc. Int. Conf. Image Signal Process., 2011, pp. 1805–1808.
[37] C. Zhang and A. Elaksher, "An unmanned aerial vehicle-based imaging system for 3D measurement of unpaved road surface distresses," Comput.-Aided Civil Infrastruct. Eng., vol. 27, no. 2, pp. 118–129, 2012.
[38] Y.-W. Hsu, J. W. Perng, and Z.-H. Wu, "Design and implementation of an intelligent road detection system with multisensor integration," in Proc. Int. Conf. Mach. Learn. Cybern., 2016, pp. 219–225.
[39] T. Naidoo, D. Joubert, T. Chiwewe, A. Tyatyantsi, B. Rancati, and A. Mbizeni, "Visual surveying platform for the automated detection of road surface distresses," in Proc. Int. Conf. Sensors MEMS Electro-Optic Syst., 2014, Art. no. 92570D.
[40] F. Orhan and P. E. Eren, "Road hazard detection and sharing with multimodal sensor analysis on smartphones," in Proc. Int. Conf. Next Gener. Mobile Apps Services Technol., 2013, pp. 56–61.
[41] T. Garbowski and T. Gajewski, "Semi-automatic inspection tool of pavement condition from three-dimensional profile scans," Intell. Transp. Syst., vol. 172, pp. 310–318, Jan. 2017.
[42] FEMat Project. Accessed: May 20, 2019. [Online]. Available: www.fematproject.pl/index.html
[43] T. Shen, G. Schamp, and M. Haddad, "Stereo vision based road surface preview," in Proc. Int. Conf. Intell. Transp. Syst., 2014, pp. 1843–1849.
[44] V. A. Bashkar and G. T. Manohar, "Surface pothole depth estimation using stereo mode of image processing," Advance Res. Eng. Technol., vol. 4, pp. 1169–1177, Jan. 2016.
[45] Y.-H. Tseng, S.-C. Kang, J.-R. Chang, and C.-H. Lee, "Strategies for autonomous robots to inspect pavement distresses," Autom. Construct., vol. 20, no. 8, pp. 1156–1172, 2011.
[46] Z. Ying, G. Li, X. Zang, R. Wang, and W. Wang, "A novel shadow-free feature extractor for real-time road detection," in Proc. Int. Conf. Pervas. Ubiquitous Comput., 2016, pp. 611–615.
[47] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," Int. J. Robot. Res., vol. 32, no. 11, pp. 1231–1237, 2013.
[48] T. Veit, J.-P. Tarel, P. Nicolle, and P. Charbonnier, "Evaluation of road marking feature extraction," in Proc. Int. Conf. ITSC, Beijing, China, 2008, pp. 174–181.
[49] M. Staniek, "Neural networks in stereo vision evaluation of road pavement condition," in Proc. Int. Symp. Non-Destructive Test. Civil Eng., 2015, pp. 15–17.
[50] L. K. Suong and K. Jangwoo, "Detection of potholes using a deep convolutional neural network," Universal Comput. Sci., vol. 24, no. 9, pp. 1244–1257, 2018.
[51] V. Pereira, S. Tamura, S. Hayamizu, and H. Fukai, "A deep learning-based approach for road pothole detection in Timor Leste," in Proc. Int. Conf. Service Oper. Logistics, Informat., 2018, pp. 279–284.
[52] K. E. An, S. W. Lee, S.-K. Ryu, and D. Seo, "Detecting a pothole using deep convolutional neural network models for an adaptive shock observing in a vehicle driving," in Proc. Int. Conf. Consumer Electron., 2018, pp. 1–2.
[53] Y. Bhatia, R. Rai, V. Gupta, N. Aggarwal, and A. Akula, "Convolutional neural networks based potholes detection using thermal imaging," J. King Saud Univ., Comput. Inf. Sci., to be published, doi: 10.1016/j.jksuci.2019.02.004.
[54] B. Cyganek and J. P. Siebert, An Introduction to 3D Computer Vision Techniques and Algorithms. Hoboken, NJ, USA: Wiley, 2011.
[55] A. Mikhailiuk and N. Dahnoun, "Real-time pothole detection on TMS320C6678 DSP," in Proc. Int. Conf. Imag. Syst. Techn., 2016, pp. 123–128.
[56] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 4, pp. 834–848, Apr. 2018.
[57] G. Lin, A. Milan, C. Shen, and I. D. Reid, "RefineNet: Multi-path refinement networks for high-resolution semantic segmentation," CoRR, vol. abs/1611.06612, 2016.
[58] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," CoRR, vol. abs/1612.01105, 2016.
[59] C. Peng, X. Zhang, G. Yu, G. Luo, and J. Sun, "Large kernel matters—Improve semantic segmentation by global convolutional network," CoRR, vol. abs/1703.02719, 2017.
[60] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. CVPR, 2015, pp. 3431–3440.
[61] V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 12, pp. 2481–2495, 2017.
[62] H. Song, K. Baek, and Y. Byun, "Pothole detection using machine learning," Adv. Sci. Technol. Lett., vol. 150, pp. 151–155, Feb. 2018.
[63] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proc. CVPR, 2016, pp. 2818–2826.
[64] H. Maeda, Y. Sekimoto, T. Seto, T. Kashiyama, and H. Omata, "Road damage detection using deep neural networks with images captured through a smartphone," 2018, arXiv:1801.09454. [Online]. Available: https://arxiv.org/abs/1801.09454
[65] J. Huang et al., "Speed/accuracy trade-offs for modern convolutional object detectors," in Proc. CVPR, 2017, pp. 7310–7311.
[66] A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861. [Online]. Available: https://arxiv.org/abs/1704.04861
[67] A. Zhang et al., "Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network," Comput.-Aided Civil Infrastruct. Eng., vol. 32, no. 10, pp. 805–819, 2017.
[68] R. Labayrade, D. Aubert, and J.-P. Tarel, "Real time obstacle detection in stereovision on non flat road geometry through 'v-disparity' representation," in Proc. IEEE Intell. Vehicles Symp., Jun. 2002, pp. 646–651.
[69] N. H. Saleem, H.-J. Chien, M. Rezaei, and R. Klette, "Improved stixel estimation based on transitivity analysis in disparity space," in Proc. Int. Conf. Comput. Anal. Images Patterns, 2017, pp. 28–40.
[70] J. Serra, Image Analysis and Mathematical Morphology. Orlando, FL, USA: Academic, 1983.
[71] R. Klette and A. Rosenfeld, Digital Geometry. San Francisco, CA, USA: Morgan Kaufmann, 2004.
[72] V. Lepetit, F. Moreno-Noguer, and P. Fua, "EPnP: An accurate O(n) solution to the PnP problem," Int. J. Comput. Vis., vol. 81, no. 2, pp. 155–166, 2009.
[73] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," in Proc. Eur. Conf. Comput. Vis., 2006, pp. 404–417.
[74] H.-J. Chien and R. Klette, "Regularised energy model for robust monocular egomotion estimation," in Proc. Int. Joint Conf. Comput. Vis. Imag. Comput. Graph. Theory Appl., vol. 6, 2011, pp. 361–368.
[75] K. A. Levenberg, "A method for the solution of certain non-linear problems in least squares," Quart. Appl. Math., vol. 2, no. 2, pp. 164–168, 1944.
[76] OpenMP. Accessed: Nov. 25, 2018. [Online]. Available: www.openmp.org/mp-documents/openmp-4.5.pdf
[77] S. Pan and Q. Yang, "A survey on transfer learning," IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, pp. 1345–1359, Oct. 2010.
[78] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," CoRR, vol. abs/1703.06870, 2017.
[79] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in Proc. CVPR, 2014, pp. 580–587.
[80] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, "Selective search for object recognition," Int. J. Comput. Vis., vol. 104, no. 2, pp. 154–171, Apr. 2013.
[81] R. Girshick, "Fast R-CNN," in Proc. ICCV, 2015, pp. 1440–1448.
[82] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," CoRR, vol. abs/1506.01497, 2015.
[83] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. CVPR, 2016, pp. 770–778.
[84] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proc. CVPR, vol. 1, no. 2, 2017, p. 4.
[85] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," in Proc. ECCV, 2014, pp. 740–755.
[86] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," 2016, arXiv:1612.08242. [Online]. Available: https://arxiv.org/abs/1612.08242
[87] J. Hui, "mAP (mean average precision) for object detection," Tech. Rep., Mar. 2018. [Online]. Available: https://medium.com/
[88] D. Scharstein, H. Hirschmüller, Y. Kitajima, G. Krathwohl, N. Nesic, X. Wang, and P. Westling, "High-resolution stereo datasets with subpixel-accurate ground truth," in Proc. Int. Conf. GCPR (Lecture Notes in Computer Science, vol. 8753), 2014, pp. 31–42.
[89] T. Vaudrey, C. Rabe, R. Klette, and J. Milburn, "Differences between stereo and motion behavior on synthetic and real-world stereo sequences," in Proc. Int. Conf. Image Vis. Comput. New Zealand, 2008, pp. 1–6.
[90] R. Guzmán, J.-B. Hayet, and R. Klette, "Towards ubiquitous autonomous driving: The CCSAD dataset," in Proc. Int. Conf. Comput. Anal. Images Patterns, 2015, pp. 582–593.
[91] A. Börner et al., "IPS—A vision-aided navigation system," Adv. Opt. Technol., vol. 6, no. 2, pp. 121–129, 2017.
[92] D. Grießbach, D. Baumbach, and S. Zuev, "Stereo-vision-aided inertial navigation for unknown indoor and outdoor environments," in Proc. Indoor Positioning Indoor Navigat., 2014, pp. 709–716.
[93] S. Nienaber, M. J. Booysen, and R. S. Kroon, "Detecting potholes using simple image processing techniques and real-world footage," in Proc. Southern Afr. Transp. Conf., Jul. 2015.
[94] PNW Dataset. Accessed: May 25, 2019. [Online]. Available: www.youtube.com/watch?v=BQo87tGRM74

Amita Dhiman received the master's degree in computer science. She is currently pursuing the Ph.D. degree with the Auckland University of Technology, where she is also a Teaching Assistant. She has coauthored papers in image processing, stereo vision, and deep learning.

Reinhard Klette is a Professor with the Auckland University of Technology. He has coauthored more than 300 publications in peer-reviewed journals or conferences and books on computer vision, image processing, geometric algorithms, and panoramic imaging. He is a Fellow of the Royal Society of New Zealand. He is on the Honorary Board of the International Journal of Computer Vision. He was an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 2001 to 2008.