AlexNet for Automated Casting Defect Detection
All content following this page was uploaded by Shiron Thalagala on 03 November 2021.
Abstract - Automated inspection of surface defects is beneficial for casting product manufacturers in terms of inspection cost and time, which ultimately affect overall business performance. Intelligent systems that are capable of image classification are widely applied in visual inspection as a major component of modern smart manufacturing. Image classification tasks performed by Convolutional Neural Networks (CNNs) have recently shown significant performance gains over conventional machine learning techniques. In particular, the AlexNet CNN architecture, which was proposed at the early stages of the development of CNN architectures, shows outstanding performance. In this paper, we investigate the application of AlexNet CNN architecture-based transfer learning for the classification of casting surface defects. We used a dataset containing casting surface defect images of a pump impeller for testing the performance. We examined four experimental schemes where the degree of the knowledge obtained from the pre-trained model is varied in each experiment. Furthermore, using a simple grid search method, we explored the best overall setting for two crucial hyperparameters. Our results show that, despite the simple architecture, AlexNet with transfer learning can be successfully applied for the recognition of casting surface defects of the pump impeller.
Keywords - automated inspection, casting defect detection, convolutional neural networks, hyperparameters, transfer learning
I. INTRODUCTION
Cost- and time-effective quality management [1] in a manufacturing operation is a significant aspect regardless of the domain. Nevertheless, producing higher quality products that yield higher customer satisfaction with the least cost and time has been a challenging task for manufacturing firms. Product visual inspection for defects, being a crucial element in quality management, is increasingly automated in present manufacturing firms due to numerous benefits [2] which ultimately result in higher business performance.
Metal casting is a manufacturing process where molten metals are solidified in a mold to obtain the required shape [3]. Though metal casting processes span a wide variety of metals and several specific techniques, the most common defect types can be categorized as blowholes, shrinkages, cracks, sand inclusions, defective surfaces, and mismatches [4]. Proper identification of casting defects is vital, as unnoticed defective finished products that reach the customers’ hands can cause fatal mechanical failures [5]. Automating the process of visual inspection of metal castings with the aid of intelligent systems [6] is beneficial in terms of accuracy, inspection time, and cost. In particular, it avoids the use of human labor in hazardous environments, including the costly concerns of the safety of such employees.
The visual identification process of defects in metal castings needs to satisfy two main requirements during the process of inspection. One is the identification of surface defects on the casting, and the other is the identification of defects located inside the cast product, which are not visible to the naked eye. The latter is relatively complicated and expensive, and is commonly accomplished by non-destructive testing (NDT) methods such as ultrasonic testing, eddy-current testing, magnetic particle testing, and radiographic (X-ray) testing [7].
The main purpose of non-destructive testing is to identify defects located inside the test object, which are not visible to the naked eye, without damaging the object. X-ray computed tomography (XCT) is a widely used non-destructive casting inspection method that generates two-dimensional/three-dimensional images of the object interior structure [8]. Inspecting such interior images along with the inspection of casting surfaces of every manufactured product is necessary to maintain lower defect levels. Not only the interior images generated by XCT but also conventional photographs of the casting surfaces can be fed into intelligent systems that use image processing and machine learning techniques for recognition, categorization, and localization of casting defects [6].
Convolutional neural networks (CNNs), which lie in the domain of machine learning, have been well studied for their appropriateness in computer vision applications [9]. The structure of CNNs is analogous to the connectivity pattern in the visual cortex of the human brain. CNNs are capable of extracting features by themselves, so there is no need to perform manual feature extraction on the input images, a step that is essential in some primitive machine learning techniques. Fig. 1 illustrates the difference in image classification approach between primitive machine learning methods and CNNs. Hence, over the last decade, CNNs have been successfully applied for automated inspection of casting defects with varying performance [10]–[12]. Since the onset of CNNs, numerous architectures have been generated by carrying out structural reformulations, regularizations, parameter optimizations, etc. [13]. AlexNet [14] is a prominent CNN architecture that performs competently in image recognition tasks. While CNNs perform better on images than traditional machine learning techniques, some common hindrances arising from the lack of generalization of models have not been fully overcome by research. Specifically, models trained for the same feature space and the same distribution drastically reduce their performance when
Smart Computing and Systems Engineering, 2021
Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka
tested on a different dataset with a different feature distribution.
Fig. 1. Difference in image classification approach between conventional machine learning techniques and CNNs
Transfer learning has significantly addressed the issue of using a single CNN model for recognition tasks in different image fields. Transfer learning in CNNs is the use of knowledge gained by training a model in one domain on another, dissimilar domain [15]. It helps not only to mitigate the computational cost of training but also to generalize CNN models over different domains. Moreover, transfer learning is beneficial in situations where adequate data is lacking for learning from scratch. Despite the successful applications of transfer learning in automated recognition of casting defects, selection of the unique CNN model parameters (hyperparameters) [16] relevant to each casting image dataset is still necessary.
This paper focuses on: (1) investigating the application of an AlexNet CNN model which is pre-trained on an entirely different larger dataset to recognize images of casting surface defects, and (2) optimizing hyperparameters for best performance. The pivot of this study is a classification task to segregate faulty casting products in a manufactured batch through pattern recognition. Further classification of defect types or localization of defects, however, is out of the scope of this study. The dataset [17] used in the study comprises only two classes named ‘defect’ and ‘defect-free’, representing images with one or more defects and images without any visible defect, respectively.
II. RELATED WORK
Recognition and localization of manufacturing defects using machine learning techniques have been explored in numerous studies over recent years, with the focus on achieving high-performing robust models. Several primitive computer vision techniques were used by several authors at the early stages of the pattern recognition field. A background subtraction method followed by a thresholding algorithm is proposed in [18]. The idea is to generate, using low-pass filtering [19], an image with the same pixel intensities as the original image except in defective regions. The newly constructed image is then subtracted from the original image, resulting in a residual image containing only defective regions. In [20], the Modified Median filter, MODAN-Filter, is proposed to identify contours of the casting defects from non-defective areas with a function to calculate the pixel values of the background image. Furthermore, the equations of the MODAN-Filter are generalized in [21] to achieve higher robustness. These filtering-based methods, which depend on optimum filter parameters, can however be unreliable when substantial image noise is present. In [22], the wavelet transform method is described as a potential technique to identify certain casting defect types.
Feature-based detection of casting defects is another trending approach that can be seen applied in [10], [23]. During this process, each pixel is classified as a defect or not based on features calculated using sets of nearby pixels. Common features include statistical descriptors such as mean, standard deviation, skewness, kurtosis, energy, and entropy [24]. In [25], hierarchical and non-hierarchical linear classifiers have been implemented based on six geometric and gray value features, namely contrast, position, aspect ratio, width-area ratio, length-area ratio, and roundness. A fuzzy logic-based method for the detection and classification of defects that appear in radiographic images is proposed in [11].
Many modern studies have tested numerous CNN architectures in terms of the performance and accuracy of casting defect recognition tasks. Among those, Region-Based Convolutional Neural Networks (R-CNNs) are used extensively for the automatic localization of casting defects [12]. R-CNNs are capable of setting bounding boxes around categorical patches in images, which can easily be used to mark the defects in casting defect images. In [10], a new CNN architecture called Xnet-II is introduced which comprises five convolutional and fully connected layers. Moreover, they have used a dataset generated through simulation using Generative Adversarial Networks (GAN) [27] instead of real casting defect images.
Lack of sufficient data is a common problem in the machine learning domain. Data augmentation, where new images are generated by augmenting the existing images of casting defects efficiently and accurately with low background noise, is proposed in [28]. This mechanism is based on a traditional image enlargement technique, forcing the CNN to learn more in the regions of the image that need high attention in order to perform better in the classification task. On the other hand, transfer learning is effective not only in scenarios that lack data but also with respect to the robustness of the model. In [5], the authors use the ResNet CNN architecture for the recognition of casting defects. When compared to AlexNet, due to its architectural complexity, ResNet needs a significantly larger number of computations, which ultimately consumes more computational resources.
III. METHODOLOGY
In this section, we explain the approach used to recognize casting surface defects of an industrial product using AlexNet CNN architecture and transfer learning.
Improving the accuracy and the robustness of the AlexNet architecture using transfer learning in the context of casting defect detection is the major objective of this study.
A. Description of the dataset
The dataset, obtained from Kaggle datasets [17], consists of images of a submersible pump impeller which is manufactured as a casting product. All the images depict the top view of the impeller and belong to two classes. The images that exhibit at least one casting defect on the surface of the impeller are labeled as defect while all the other images, conversely, are labeled as defect-free. That is, any casting defect on the surface that cannot be identified by the naked eye from the images is labeled as defect-free.
This dataset was collected under stable lighting conditions with a Canon EOS 1300D DSLR camera. The dataset contains a total of 1300 grayscale images, each with dimensions of (512×512) pixels. Among those, 781 images are labeled as defect, and the remaining 519 images are labeled as defect-free. Fig. 2 shows eight sample images (the size and resolution are altered in order to adhere to paper guidelines) and the corresponding labels, randomly picked from the two classes. All the images acquired for this study from the original dataset are only the raw images, and the augmentation is done as a part of this study.
B. Image augmentation
In this section, we discuss the image data augmentation techniques applied to the dataset before the experimentation. As in [29], several classical techniques that belong to geometrical and color-based transformations were applied randomly to yield higher variability. As geometric transformations, rotation, shearing, mirroring, scaling (zoom-in/out), and translation were applied. Color space transformations, however, were limited to a change of apparent brightness, as the dataset already contains grayscale images. Moreover, the apparent brightness change (performed randomly) in each pixel intensity of an image was restricted to a maximum of 20% (either increase or decrease) of the current intensity. This prevents introducing new defect regions which were not in the original image, and prevents significant low-intensity regions of the image from disappearing when the intensity is decreased further.
Fig. 2. Eight randomly picked sample images from the dataset, annotated as defect and defect-free
Fig. 3 shows one sample image (annotated as defect-free) and the corresponding images synthesized by augmenting that image using all the techniques used in this study. Among the synthesized images, 5814 are annotated as defect-free and 7668 are annotated as defect. Finally, all the images were resized to (224×224) pixels. Throughout the experimentation, the training and validation data split is varied by changing the amount of training data to 20%, 40%, 50%, 60%, and 80% to understand the generalization capacity of the models used [30]. Hereinafter, the ratio between the training image set and the validation image set will be referred to as the train-test split ratio.
C. Non-parametric classification using the k-nearest neighbor algorithm
The K-Nearest Neighbor (KNN) algorithm, which is a basic supervised machine learning algorithm, is used to investigate the capability of performing the classification task using raw pixel intensities as the input and without any sophisticated feature extraction techniques.
In the context of computer vision, the KNN algorithm performs classification of the data points (pixel values) based on the distance between them and with the assumption that similar features exist nearby. Common methods of calculating the distance include the Euclidean distance:
$d(p, q) = \sqrt{\sum_{i=1}^{N} (q_i - p_i)^2}$ (1)
and the Manhattan/city block distance:
$d(p, q) = \sum_{i=1}^{N} |q_i - p_i|$ (2)
where $d(p, q)$ is the distance between two points $p$ and $q$ in the image spatial domain with $N$ pixels.
In this study, the KNN algorithm is performed with the raw pixel intensities of casting images, without any feature extraction, using the Manhattan distance metric and a k value equal to five. The variation of precision, recall, and f1-score is observed by varying the train-test split ratio.
D. CNN architecture
Despite the emerging CNN architectures, we base our model around the AlexNet architecture for three reasons: (1) to the best of our knowledge, the application of AlexNet-based transfer learning in recognition of casting defects is not addressed in past literature, (2) AlexNet is applied in a diverse set of deep learning problems with promising results [30], [31], and (3) AlexNet, which was proposed in 2012, is regarded as the first deep CNN architecture which showed pioneering results in image recognition and classification tasks [32]. We show that AlexNet is sufficiently deep and reliable for a modest classification of casting surface defects when compared to other deeper sophisticated architectures born after AlexNet, if hyperparameters are properly optimized.
AlexNet consists of five 2D convolutional layers (Conv2D) followed by three fully connected layers (FC). The build of the AlexNet architecture is illustrated in Table I and it is constructed with several common CNN components.
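As a minimal sketch of the k-nearest-neighbor baseline described in Section III-C, the following pure-Python example classifies a flattened image by a majority vote among its k = 5 nearest training images under the Manhattan distance of (2). The function names and the tiny 2×2 toy "images" are illustrative assumptions, not the authors' implementation or data.

```python
from collections import Counter


def manhattan(p, q):
    """City-block distance between two flattened images, as in (2)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def euclidean(p, q):
    """Euclidean distance between two flattened images, as in (1)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def knn_predict(train_x, train_y, query, k=5, dist=manhattan):
    """Return the majority label among the k training samples closest to query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data invented for illustration: flattened 2x2 grayscale "images".
TRAIN_X = [
    [10, 12, 11, 9], [12, 10, 9, 11], [11, 11, 10, 10], [9, 12, 10, 11],
    [200, 190, 210, 205], [198, 202, 195, 207],
    [205, 199, 201, 190], [190, 210, 200, 200],
]
TRAIN_Y = ["defect-free"] * 4 + ["defect"] * 4

if __name__ == "__main__":
    print(knn_predict(TRAIN_X, TRAIN_Y, [11, 10, 12, 10]))
    print(knn_predict(TRAIN_X, TRAIN_Y, [201, 200, 199, 204]))
```

Passing `dist=euclidean` selects the distance in (1) instead; on real 224×224 images each sample would be a vector of 50,176 raw pixel intensities.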
Fig. 3. Six transformations applied to a single original image (the symbols ‘x’ and ‘o’ in red are used to show each transformation with respect to the
original image). Relevant transformation is labeled on top of the image.
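The brightness constraint and the mirroring transformation described in Section III-B can be sketched as follows. This is only an illustration under stated assumptions: the image is a nested list of 8-bit grayscale values, one random factor is drawn per image (the paper does not say whether the change is drawn per image or per pixel), and the helper names are hypothetical.

```python
import random


def adjust_brightness(image, max_change=0.20, rng=random):
    """Scale every pixel by one random factor in [1 - max_change, 1 + max_change].

    Assumption: a single factor per image; results are clipped to 0..255.
    """
    factor = 1.0 + rng.uniform(-max_change, max_change)
    return [[min(255, max(0, round(px * factor))) for px in row] for row in image]


def mirror(image):
    """Horizontal mirroring: reverse the pixel order within each row."""
    return [row[::-1] for row in image]


if __name__ == "__main__":
    sample = [[0, 50, 100], [150, 200, 250]]  # toy 2x3 image, invented values
    print(adjust_brightness(sample, rng=random.Random(0)))
    print(mirror(sample))
```

Capping the change relative to the current intensity keeps dark regions dark and bright regions bright, which is the stated reason for the 20% limit.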
a) Convolution layers
Each convolutional layer consists of a set of filters known as convolutional kernels, where each neuron plays the role of a kernel. The kernel is a matrix of weights that are multiplied with the corresponding values of a subset of pixels of the input image. The selected subset of pixels of the input image has the same dimensions as the kernel. Then, the resulting values are summed up to generate one value that represents the value of a pixel in the output (feature map). The kernel strides across the input image, producing the output (feature map of the entire image) of the convolution layer. In each layer, the kernel strides over a varying number of pixels at a time in both dimensions (height and width). The convolution process can be mathematically expressed as [33]:
$f_l^k(p, q) = \sum_{c} \sum_{x, y} i_c(x, y) \cdot e_l^k(u, v)$ (3)
where $i_c(x, y)$ is an element of the input image tensor with $x$ and $y$ coordinates, which is element-wise multiplied by the $e_l^k(u, v)$ index of the $k$-th convolutional kernel of the $l$-th layer. $u$ and $v$ are the rows and columns of the kernel matrix. $f_l^k(p, q)$ is the corresponding output feature map with $p$ columns and $q$ rows, while $c$ is the image channel index.
b) Pooling layers
The pooling operation summarizes similar information in a local region of the feature map generated by a convolutional layer and outputs a single value for that region [34]. AlexNet contains three pooling layers, placed after the first, second, and last convolution layers.
c) Activation function
TABLE I. LAYERS OF THE ALEXNET ARCHITECTURE
(f = no. of feature maps, k = kernel size, s = strides, act = activation function)
Layer ID | Layer Type          | Parameters                                     | Size of Feature Map
---------------------------------------------------------------------------------------------------
0        | Input layer         | Input image size=(224x224) pixels, Channels=1  | -
1        | Conv2D              | f=96, k=(11 x 11), s=4, act=ReLU               | 55×55×96
2        | Max Pool            | f=96, k=(3 x 3), s=2                           | 27×27×96
3        | Batch normalization | N/A                                            | 27×27×96
4        | Conv2D              | f=256, k=(5 x 5), s=1, act=ReLU                | 27×27×256
5        | Max Pool            | f=256, k=(3 x 3), s=2                          | 13×13×256
6        | Batch normalization | N/A                                            | 13×13×256
7        | Conv2D              | f=384, k=(3 x 3), s=1, act=ReLU                | 13×13×384
8        | Batch normalization | N/A                                            | 13×13×384
9        | Conv2D              | f=384, k=(3 x 3), s=1, act=ReLU                | 13×13×384
10       | Batch normalization | N/A                                            | 13×13×384
11       | Conv2D              | f=256, k=(3 x 3), s=1, act=ReLU                | 13×13×256
12       | Max Pool            | f=256, k=(3 x 3), s=2                          | 6×6×256
13       | Batch normalization | N/A                                            | 6×6×256
14       | Dropout             | Rate=0.5                                       | 6×6×256
15       | FC                  | f, k, s are N/A, act=ReLU                      | 4096
16       | Dropout             | Rate=0.5                                       | 4096
17       | FC                  | f, k, s are N/A, act=ReLU                      | 1024
18       | Dropout             | Rate=0.5                                       | 1024
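The spatial sizes listed in Table I (55, 27, 13, and 6) can be checked with the usual output-size relation for convolution and pooling, $\lfloor (n + 2p - k)/s \rfloor + 1$, where $n$ is the input size, $k$ the kernel size, $s$ the stride, and $p$ the padding. Table I does not list padding, so the padding values below are assumptions chosen so that the arithmetic reproduces the table; treat this as a sanity check rather than the authors' configuration.

```python
def out_size(n, k, s, p=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (n + 2 * p - k) // s + 1


# (layer name with Table I ID, kernel, stride, padding).
# Kernel and stride come from Table I; padding is an ASSUMPTION.
LAYERS = [
    ("Conv2D-1", 11, 4, 2),
    ("MaxPool-2", 3, 2, 0),
    ("Conv2D-4", 5, 1, 2),
    ("MaxPool-5", 3, 2, 0),
    ("Conv2D-7", 3, 1, 1),
    ("Conv2D-9", 3, 1, 1),
    ("Conv2D-11", 3, 1, 1),
    ("MaxPool-12", 3, 2, 0),
]


def trace_sizes(n=224):
    """Propagate the input size through the layers; return (name, size) pairs."""
    sizes = []
    for name, k, s, p in LAYERS:
        n = out_size(n, k, s, p)
        sizes.append((name, n))
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes():
        print(f"{name}: {size}x{size}")
```

Batch normalization and dropout leave the spatial size unchanged, so they are omitted from the trace.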
Fig. 4. (a) and (b) are the accuracies of training process and the validation process, respectively over different experimental configurations (ECs)
(which are mentioned under methodology of this paper) and varying train-test split ratios.
performance and for the safety of the end-users who consume products with critical mechanical components fabricated by casting. Automated inspection of casting defects leads to shorter inspection times and circumvents safety problems of employees working in hazardous environments.
In this paper, we discussed the application of AlexNet CNN architecture-based transfer learning for automated inspection of surface defects of a submersible pump impeller manufactured by casting. Over the last decade, for the task of casting defect recognition, numerous sophisticated architectures were proposed with higher architectural complexity and better performance compared to the AlexNet architecture. Using the results of our study, we show (limited to the dataset used) that a simpler architecture like AlexNet can perform better when it is implemented with transfer learning and optimized model parameters. As future work, the methods discussed in this study can be tested on other datasets containing images of casting surface defects of different products.
Over the several experimental configurations tested, the use of the exact feature extractor of the pre-trained model for training demonstrated the best performance in terms of training accuracy and training time (although training with weights initialized from the pre-trained model resulted in the overall highest accuracy, its training time is higher than when using the entire feature extractor).
Several recommendations for a casting surface defect detection system can be made based on the results of this study. Nevertheless, as future work, the practical usability of such a system needs to be tested prior to implementation, as several dataset-specific parameters still need to be adjusted depending on the circumstance. The process of capturing the surface images of the casting products is vital, including, but not limited to: (1) adhering to proper lighting conditions, and (2) maintaining a unique and plain background when capturing. As shown in the results, transfer learning can be implemented to reduce the training time and enhance the robustness of the model. Moreover, transfer learning is beneficial when the number of training images is low. Specifically, using the feature extractor from the pre-trained model and limiting the training to the classification layers (fully connected layers) with casting defect data is advantageous compared to using all the parameters of the pre-trained model. Furthermore, fine-tuning the model hyperparameters is crucial for obtaining better results.
REFERENCES
[1] W. Barkman, In-process quality control for manufacturing. CRC Press, 1989.
[2] R. T. Chin and C. A. Harlow, “Automated visual inspection: A survey,” IEEE transactions on pattern analysis and machine intelligence, no. 6, pp. 557–573, 1982.
[3] M. Sahoo, Principles of metal casting. McGraw-Hill Education, 2014.
[4] T. V. Sai, T. Vinod, and G. Sowmya, “A critical review on casting types and defects,” Engineering and Technology, vol. 3, no. 2, pp. 463–468, 2017.
[5] M. K. Ferguson, A. Ronay, Y.-T. T. Lee, and K. H. Law, “Detection and segmentation of manufacturing defects with convolutional neural networks and transfer learning,” Smart and sustainable manufacturing systems, vol. 2, 2018.
[6] D. Mery, T. Jaeger, and D. Filbert, “A review of methods for automated recognition of casting defects,” Insight, vol. 44, no. 7, pp. 428–436, 2002.
[7] S. Gholizadeh, “A review of non-destructive testing methods of composite materials,” Procedia Structural Integrity, vol. 1, pp. 50–57, 2016.
[8] Q. Wan, H. Zhao, and C. Zou, “Effect of micro-porosities on fatigue behavior in aluminum die castings by 3D X-ray tomography inspection,” ISIJ international, vol. 54, no. 3, pp. 511–515, 2014.
[9] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv preprint arXiv:1511.08458, 2015.
[10] H. Strecker, “A local feature method for the detection of flaws in automated X-ray inspection of castings,” Signal Processing, vol. 5, no. 5, pp. 423–431, 1983, doi: https://doi.org/10.1016/0165-1684(83)90005-1.
[11] Z. Górny, S. Kluska-Nawarecka, D. Wilk-Kołodziejczyk, and K. Regulski, “Diagnosis of casting defects using uncertain and incomplete knowledge,” Archives of Metallurgy and Materials, vol. 55, no. 3, pp. 827–836, 2010.
[12] M. Ferguson, R. Ak, Y.-T. T. Lee, and K. H. Law, “Automatic localization of casting defects with convolutional neural networks,” in 2017 IEEE international conference on big data (big data), 2017, pp. 1726–1735.
[13] L. Alzubaidi et al., “Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions,” Journal of big Data, vol. 8, no. 1, pp. 1–74, 2021.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 25, pp. 1097–1105, 2012.
[15] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on knowledge and data engineering, vol. 22, no. 10, pp. 1345–1359, 2009.
[16] Koutsoukas, K. J. Monaghan, X. Li, and J. Huan, “Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data,” Journal of cheminformatics, vol. 9, no. 1, pp. 1–13, 2017.
[17] R. Dabhi, “Casting product image data for quality inspection,” Kaggle.com. https://kaggle.com/ravirajsinh45/real-life-industrial-dataset-of-casting-product (accessed Jun. 14, 2021).
[18] Gayer, A. Saya, and A. Shiloh, “Automatic recognition of welding defects in real-time radiography,” Ndt International, vol. 23, no. 3, pp. 131–136, 1990.
[19] Eckelt, N. Meyendorf, W. Morgner, and U. Richter, “Use of automatic image processing for monitoring of welding processes and weld inspection,” in Non-destructive testing, Elsevier, 1989, pp. 37–41.
[20] D. Filbert, R. Klatte, W. Heinrich, and M. Purschke, “Computer aided inspection of castings,” in IEEE-IAS Annual Meeting, 1987, pp. 1087–1095.
[21] D. Mery, “New approaches for defect recognition with X-ray testing,” Insight, vol. 44, no. 10, pp. 614–15, 2002.
[22] X. Li, S. K. Tso, X.-P. Guan, and Q. Huang, “Improving automatic detection of defects in castings by applying wavelet technique,” IEEE Transactions on Industrial Electronics, vol. 53, no. 6, pp. 1927–1934, 2006.
[23] Kehoe and G. A. Parker, “An intelligent knowledge based approach for the automated radiographic inspection of castings,” NDT & E International, vol. 25, no. 1, pp. 23–36, 1992.
[24] D. Wang, B. Wang, H. Yao, H. Liu, and F. Tombari, “Local image descriptors with statistical losses,” in 2018 25th IEEE International Conference on Image Processing (ICIP), 2018, pp. 1208–1212. doi: 10.1109/ICIP.2018.8451855.
[25] R. R. Da Silva, M. H. S. Siqueira, L. P. Calôba, and J. M. Rebello, “Radiographics pattern recognition of welding defects using linear classifiers,” Insight, vol. 43, no. 10, pp. 669–74, 2001.
[26] A. Creswell, T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, “Generative adversarial networks: An overview,” IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 53–65, 2018.
[27] L. Jiang, Y. Wang, Z. Tang, Y. Miao, and S. Chen, “Casting defect detection in X-ray images using convolutional neural networks and attention-guided data augmentation,” Measurement, vol. 170, p. 108736, 2021.
[28] C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, no. 1, pp. 1–48, 2019.
[29] S. P. Mohanty, D. P. Hughes, and M. Salathé, “Using deep learning for image-based plant disease detection,” Frontiers in plant science, vol. 7, p. 1419, 2016.