
AlexNet for Automated Casting Defect Detection

This paper investigates the use of AlexNet convolutional neural network architecture for automated recognition of casting surface defects through transfer learning. The study utilizes a dataset of pump impeller images to classify defects and optimize hyperparameters for improved performance. Results indicate that AlexNet can effectively identify casting surface defects, demonstrating its potential in enhancing automated inspection processes in manufacturing.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/355580587

Application of AlexNet convolutional neural network architecture-based transfer learning for automated recognition of casting surface defects

Conference Paper · September 2021


DOI: 10.1109/SCSE53661.2021.9568315


2 authors:

Shiron Thalagala, University of Macau
Chamila Walgampaya, University of Peradeniya

All content following this page was uploaded by Shiron Thalagala on 03 November 2021.



Smart Computing and Systems Engineering, 2021
Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka

Paper No: SC-21 Smart Computing

Application of AlexNet convolutional neural network architecture-based transfer learning for automated recognition of casting surface defects

Shiron Thalagala*
Dept. of Electromechanical Engineering
University of Macau, China
[email protected]

Chamila Walgampaya
Dept. of Engineering Mathematics
University of Peradeniya, Sri Lanka
[email protected]

Abstract - Automated inspection of surface defects is beneficial for casting product manufacturers in terms of inspection cost and time, which ultimately affect overall business performance. Intelligent systems that are capable of image classification are widely applied in visual inspection as a major component of modern smart manufacturing. Image classification tasks performed by Convolutional Neural Networks (CNNs) have recently shown significant performance gains over conventional machine learning techniques. In particular, the AlexNet CNN architecture, which was proposed at the early stages of the development of CNN architectures, shows outstanding performance. In this paper, we investigate the application of AlexNet CNN architecture-based transfer learning for the classification of casting surface defects. We used a dataset containing casting surface defect images of a pump impeller for testing the performance. We examined four experimental schemes in which the degree of knowledge obtained from the pre-trained model is varied in each experiment. Furthermore, using a simple grid search method, we explored the best overall setting for two crucial hyperparameters. Our results show that, despite its simple architecture, AlexNet with transfer learning can be successfully applied for the recognition of casting surface defects of the pump impeller.

Keywords - automated inspection, casting defect detection, convolutional neural networks, hyperparameters, transfer learning

I. INTRODUCTION

Cost- and time-effective quality management [1] in a manufacturing operation is a significant aspect regardless of the domain. Nevertheless, producing higher quality products that yield higher customer satisfaction with the least cost and time has been a challenging task for manufacturing firms. Product visual inspection for defects, being a crucial element in quality management, is increasingly automated in present manufacturing firms due to numerous benefits [2] which ultimately result in higher business performance.

Metal casting is a manufacturing process where molten metals are solidified in a mold to obtain the required shape [3]. Though metal casting processes span a wide variety of metals and several specific techniques, the most common defect types can be categorized as blowholes, shrinkages, cracks, sand inclusions, defective surfaces, and mismatches [4]. Proper and effective identification of casting defects is vital, as unnoticed defective finished products that reach the customers' hands can cause fatal mechanical failures [5]. Automating the process of visual inspection of metal castings with the aid of intelligent systems [6] is beneficial in terms of accuracy, inspection time, and cost. In particular, it avoids the use of human labor in hazardous environments, along with the costly safety concerns for such employees.

The visual identification process of defects in metal castings needs to meet two main requirements during the process of inspection. One is the identification of surface defects on the casting, and the other is the identification of defects located inside the cast product which are not visible to the naked eye. The latter is relatively complicated and expensive, and is commonly accomplished by non-destructive testing (NDT) methods such as ultrasonic testing, eddy-current testing, magnetic particle testing, and radiographic (X-ray) testing [7].

The main purpose of non-destructive testing is to identify defects located inside the test object, which are not visible to the naked eye, without damaging the object. X-ray computed tomography (XCT) is a widely used non-destructive casting inspection method that generates two-dimensional/three-dimensional images of the object interior structure [8]. Inspecting such interior images along with the inspection of casting surfaces of every manufactured product is necessary to maintain lower defect levels. Not only the interior images generated by XCT but also the conventional photographs of the casting surfaces can be fed into intelligent systems that use image processing and machine learning techniques for recognition, categorization, and localization of casting defects [6].

Convolutional neural networks (CNNs), which lie in the domain of machine learning, have been well studied for their appropriateness in computer vision applications [9]. The structure of CNNs is analogous to that of the connectivity pattern in the visual cortex of the human brain. CNNs are capable of extracting features by themselves, so there is no need to perform manual feature extraction on the input images, which, however, is essential in some primitive machine learning techniques. Fig. 1 illustrates the difference in image classification approach between primitive machine learning methods and CNNs. Hence, over the last decade, CNNs have been successfully applied for automated inspection of casting defects with varying performances [10]–[12]. Since the onset of CNNs, numerous architectures have been generated by carrying out structural reformulations, regularizations, parameter optimizations, etc. [13]. AlexNet [14] is a prominent CNN architecture that performs competently in the tasks of image recognition. While CNNs perform better than traditional machine learning techniques in the realm of images, some common hindrances, such as the lack of generalization of models, have not yet been fully overcome. Specifically, models trained for the same feature space and the same distribution drastically reduce their performance when
tested on a different dataset with a different feature distribution.

Fig. 1. Difference in image classification approach between conventional machine learning techniques and CNNs

Transfer learning has significantly addressed the issue of using a single CNN model for recognition tasks in different image fields. Transfer learning in CNNs is the use of knowledge gained by training a model in one domain on another, dissimilar domain [15]. It helps not only to mitigate the computational cost of training but also to generalize the CNN models over different domains. Moreover, transfer learning is beneficial in situations where adequate data is lacking for learning from scratch. Despite the successful applications of transfer learning in automated recognition of casting defects, selection of the unique CNN model parameters (hyperparameters) [16] relevant to each casting image dataset is still necessary.

This paper focuses on: (1) investigating the application of an AlexNet CNN model which is pre-trained on an entirely different, larger dataset to recognize images of casting surface defects, and (2) optimizing hyperparameters for best performance. The pivot of this study is a classification task to segregate faulty casting products in a manufactured batch through pattern recognition. Further classification of defect types or localization of defects, however, is out of the scope of this study. The dataset [17] used in the study comprises only two classes named 'defect' and 'defect-free', representing images with one or more defects and images without any visible defect, respectively.

II. RELATED WORK

Recognition and localization of manufacturing defects using machine learning techniques have been explored in numerous studies over recent years with the focus of achieving high-performing robust models. Several primitive computer vision techniques were used by several authors at the early stages of the pattern recognition field. A background subtraction method followed by a thresholding algorithm is proposed in [18]. The idea is to generate an image with the same pixel intensities as the original image except in defective regions using low-pass filtering [19]. The newly constructed image is then subtracted from the original image, resulting in a residual image containing only defective regions. In [20], the Modified Median filter, MODAN-Filter, is proposed to identify contours of the casting defects from non-defective areas with a function to calculate the pixel values of the background image. Furthermore, equations of the MODAN-Filter are generalized in [21] to achieve higher robustness. These filtering-based methods, which depend on optimum filter parameters, can however be unreliable when substantial image noise is present. In [22], the wavelet transform method is described as a potential technique to identify certain casting defect types.

Feature-based detection of casting defects is another trending approach that can be seen applied in [10], [23]. During this process, each pixel is classified as a defect or not based on the features calculated using sets of nearby pixels. Common features include statistical descriptors such as mean, standard deviation, skewness, kurtosis, energy, and entropy [24]. In [25], a hierarchical and a non-hierarchical linear classifier have been implemented based on six geometric and gray value features, namely contrast, position, aspect ratio, width-area ratio, length-area ratio, and roundness. A fuzzy logic-based method for the detection and classification of defects that appear in radiographic images is proposed in [11].

Many modern studies have tested numerous CNN architectures in terms of the performance and accuracy of casting defect recognition tasks. Among those, Region-Based Convolutional Neural Networks (R-CNNs) are widely used for the automatic localization of casting defects [12]. R-CNNs are capable of setting bounding boxes around categorical patches in the images, which can be used easily to mark the defects in casting defect images. In [10], a new CNN architecture called Xnet-II is introduced which comprises five convolutional and fully connected layers. Moreover, they have used a dataset generated through simulation using Generative Adversarial Networks (GAN) [27] instead of real casting defect images.

Lack of sufficient data is a common problem in the machine learning domain. Data augmentation, where new images are generated by augmenting the existing images of casting defects efficiently and accurately with low background noise, is proposed in [28]. This mechanism is based on a traditional image enlargement technique, forcing the CNN to learn more in the regions of the image that need high attention in order to perform better in the classification task. On the other hand, transfer learning is effective not only in scenarios where data is lacking but also with respect to the robustness of the model. In [5], the authors use the ResNet CNN architecture for the recognition of casting defects. When compared to AlexNet, due to its architectural complexity, ResNet needs a significantly larger number of computations, which ultimately consumes more computational resources.

III. METHODOLOGY

In this section, we explain the approach used to recognize casting surface defects of an industrial product using AlexNet CNN architecture and transfer learning.
Improving the accuracy and the robustness of the AlexNet architecture using transfer learning in the context of casting defect detection is the major objective of this study.

A. Description of the dataset

The dataset, obtained from Kaggle datasets [17], consists of images of a submersible pump impeller which is manufactured as a casting product. All the images depict the top view of the impeller and belong to two classes. The images that exhibit at least one casting defect on the surface of the impeller are labeled as defect, while all the other images are labeled as defect-free; i.e., any image whose surface defect cannot be identified by the naked eye is labeled as defect-free.

This dataset was collected under stable lighting conditions with a Canon EOS 1300D DSLR camera. The dataset contains a total of 1300 grayscale images, each with dimensions of (512×512) pixels. Among those, 781 images are labeled as defect, and the remaining 519 images are labeled as defect-free. Fig. 2 shows eight sample images (size and resolution are altered in order to adhere to paper guidelines) and corresponding labels, randomly picked from the two classes. All the images acquired for this study from the original dataset are only the raw images, and the augmentation is done as a part of this study.

B. Image augmentation

In this section, we discuss the image data augmentation techniques applied to the dataset before the experimentation. As in [29], several classical techniques that belong to geometrical and color-based transformations were applied randomly to yield higher variability. For geometric transformations, rotation, shearing, mirroring, scaling (zoom-in/out), and translation were applied. Color space transformations, however, were limited to a change of apparent brightness, as the dataset already contains grayscale images. Moreover, the apparent brightness change (performed randomly) in each pixel intensity of an image was restricted to a maximum of 20% (either increase or decrease) of the current intensity. This prevents introducing new defect regions which were not in the original image, and prevents significant low-intensity regions of the image from disappearing when the intensity is decreased further.

Fig. 2. Eight randomly picked sample images from the dataset annotated as defect and defect-free

Fig. 3 shows one sample image (annotated as defect-free) and corresponding images synthesized by augmenting that image using all the techniques used in this study. Among the synthesized images, 5814 are annotated as defect-free and 7668 are annotated as defect. Finally, all the images were resized to (224×224) pixels. Throughout the experimentation, the training and validation data split is varied by changing the amount of training data to 20%, 40%, 50%, 60%, and 80% to understand the generalization capacity of the models used [30]. Hereinafter, the ratio between the training image set and the validation image set will be referred to as the train-test split ratio.

C. Non-parametric classification using the k-nearest neighbor algorithm

The K-Nearest Neighbor (KNN) algorithm, which is a basic supervised machine learning algorithm, is used to investigate the capability of performing the classification task using raw pixel intensities as the input and without any sophisticated feature extraction techniques.

In the context of computer vision, the KNN algorithm performs classification of the data points (pixel values) based on the distance between them, with the assumption that similar features exist nearby. Common methods of calculating the distance include the Euclidean distance:

$$d(x, y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2} \qquad (1)$$

and the Manhattan/city block distance:

$$d(x, y) = \sum_{i=1}^{N} |x_i - y_i| \qquad (2)$$

where $d(x, y)$ is the distance between two points $x$ and $y$ in the image spatial domain with $N$ pixels.

In this study, the KNN algorithm is performed on the raw pixel intensities of casting images, without any feature extraction, using the Manhattan distance metric and a k value equal to five. The variation of precision, recall, and f1-score is observed by varying the train-test split ratio.

D. CNN architecture

Despite the emerging CNN architectures, we base our model on the AlexNet architecture for three reasons: (1) to the best of our knowledge, the application of AlexNet-based transfer learning in recognition of casting defects is not addressed in past literature, (2) AlexNet is applied in a diverse set of deep learning problems with promising results [30], [31], and (3) AlexNet, which was proposed in 2012, is regarded as the first deep CNN architecture which showed pioneering results in image recognition and classification tasks [32]. We show that AlexNet is sufficiently deep and reliable for a modest classification of casting surface defects when compared to other deeper, more sophisticated architectures that came after AlexNet, if hyperparameters are properly optimized.

AlexNet consists of five 2D convolutional layers (Conv2D) followed by three fully connected layers (FC). The build of the AlexNet architecture is illustrated in Table I, and it is constructed from several common CNN components, which are described below.
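As a rough sanity check on the feature-map sizes listed in Table I, the spatial dimensions can be traced layer by layer with the standard output-size formula floor((n + 2p - k) / s) + 1. The sketch below is illustrative only: the padding values are assumptions borrowed from common AlexNet implementations (the paper does not state them), and the names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical. Batch normalization and dropout layers are omitted because they do not change the feature-map shape.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, channels).
# Kernel sizes, strides, and channel counts follow Table I;
# the padding values are ASSUMED, not taken from the paper.
LAYERS = [
    ("conv1", 11, 4, 2, 96),
    ("pool1", 3, 2, 0, 96),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, 256),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, 256),
]


def trace_shapes(input_size=224):
    """Return (name, height, width, channels) after each layer."""
    size = input_size
    shapes = []
    for name, kernel, stride, padding, channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes


if __name__ == "__main__":
    for name, h, w, c in trace_shapes():
        print(f"{name}: {h}x{w}x{c}")
```

Under these assumed paddings, a 224×224 input gives side lengths of 55, 27, 27, 13, 13, 13, 13, and 6 pixels, so the final 6×6×256 map flattens to 9216 features before the first fully connected layer.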
Fig. 3. Six transformations applied to a single original image (the symbols ‘x’ and ‘o’ in red color are used to understand the transformation in respect to the
original image). Relevant transformation is labeled on top of the image.
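The brightness constraint and the mirroring step described in Section III-B can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline: it assumes 8-bit grayscale images stored as nested lists and a single random factor per image (the paper does not say whether the factor is drawn per image or per pixel), and the function names `adjust_brightness` and `mirror` are hypothetical.

```python
import random


def adjust_brightness(image, max_change=0.20, rng=random):
    """Scale every pixel by one random factor in [1 - max_change, 1 + max_change].

    ASSUMPTION: one factor per image, with 8-bit intensities clipped to [0, 255].
    """
    factor = 1.0 + rng.uniform(-max_change, max_change)
    return [
        [min(255, max(0, round(pixel * factor))) for pixel in row]
        for row in image
    ]


def mirror(image):
    """Horizontal mirroring: reverse the pixel order of every row."""
    return [row[::-1] for row in image]


if __name__ == "__main__":
    sample = [[0, 50, 100], [150, 200, 255]]
    print(adjust_brightness(sample, rng=random.Random(0)))
    print(mirror(sample))
```

Because the change is proportional to the current intensity, dark pixels move by only a few gray levels, which is consistent with the stated aim of not creating or erasing defect-like regions.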

TABLE I. LAYERS OF THE ALEXNET ARCHITECTURE

| Layer ID | Layer Type | Parameters (f=no. of feature maps, k=kernel size, s=strides, act=activation function) | Size of Feature Map |
|---|---|---|---|
| 0 | Input layer | Input image size=(224×224) pixels, Channels=1 | |
| 1 | Conv2D | f=96, k=(11×11), s=4, act=ReLU | 55×55×96 |
| 2 | Max Pool | f=96, k=(3×3), s=2 | 27×27×96 |
| 3 | Batch normalization | N/A | 27×27×96 |
| 4 | Conv2D | f=256, k=(5×5), s=1, act=ReLU | 27×27×256 |
| 5 | Max Pool | f=256, k=(3×3), s=2 | 13×13×256 |
| 6 | Batch normalization | N/A | 13×13×256 |
| 7 | Conv2D | f=384, k=(3×3), s=1, act=ReLU | 13×13×384 |
| 8 | Batch normalization | N/A | 13×13×384 |
| 9 | Conv2D | f=384, k=(3×3), s=1, act=ReLU | 13×13×384 |
| 10 | Batch normalization | N/A | 13×13×384 |
| 11 | Conv2D | f=256, k=(3×3), s=1, act=ReLU | 13×13×256 |
| 12 | Max Pool | f=256, k=(3×3), s=2 | 6×6×256 |
| 13 | Batch normalization | N/A | 6×6×256 |
| 14 | Dropout | Rate=0.5 | 6×6×256 |
| 15 | FC | f, k, s are N/A, act=ReLU | 4096 |
| 16 | Dropout | Rate=0.5 | 4096 |
| 17 | FC | f, k, s are N/A, act=ReLU | 1024 |
| 18 | Dropout | Rate=0.5 | 1024 |
| 19 | FC | f, k, s are N/A, act=softmax | 2 |

a) Convolution layers

Each convolutional layer consists of a set of filters known as convolutional kernels, where each neuron plays the role of a kernel. The kernel is a matrix of weights that are multiplied with the corresponding values of a subset of pixels of the input image. The selected subset of pixels of the input image has the same dimensions as the kernel. Then, the resulting values are summed up to generate one value that represents the value of a pixel in the output (feature map). The kernel strides across the input image, producing the output (feature map of the entire image) of the convolution layer. In each layer, the kernel strides over a varying number of pixels at a time in both dimensions (height and width). The convolution process can be mathematically expressed as [33]:

$$f_l^k(p, q) = \sum_{c} \sum_{x, y} i_c(x, y) \cdot e_l^k(u, v) \qquad (3)$$

where $i_c(x, y)$ is an element of the input image tensor with $x$ and $y$ coordinates, which is element-wise multiplied by the $e_l^k(u, v)$ index of the $k$-th convolutional kernel of the $l$-th layer. $u$ and $v$ are the rows and columns of the kernel matrix. $f_l^k(p, q)$ is the corresponding output feature map with $p$ columns and $q$ rows, while $c$ is the image channel index.

b) Pooling layers

The pooling operation summarizes similar information in a local region of the feature map generated by a convolutional layer and outputs a single value for that region [34]. AlexNet contains three pooling layers, placed after the first, second, and last convolutional layers.

c) Activation function

The use of the Rectified Linear Unit (ReLU) as the non-linear activation function of each layer is a significant characteristic of AlexNet. The ReLU activation function is:

$$f(x) = \max(0, x) \qquad (4)$$

where $x$ is the function input and $f(x)$ is the function output, which is equal to the input when the input is positive and equal to zero otherwise.

d) Batch normalization

As a countermeasure against overfitting, batch normalization is performed after several layers of the AlexNet.

e) Fully connected layer

At the end of the feature extraction stage (accomplished by convolutional layers), three fully connected layers are introduced which perform classification globally [35].

f) Dropout

To achieve generalization, some units or connections with a certain probability within the network are randomly
skipped (dropout) [36]. The AlexNet model executes dropout after several of its fully connected layers.

g) Output layer

The final layer of the AlexNet architecture, which acts as the output layer, uses the softmax activation function [37]. The softmax function is given by:

$$\sigma(z_i) = \frac{\exp(z_i)}{\sum_{j=1}^{K} \exp(z_j)} \qquad (5)$$

where $z_i$ is the $i$-th element of the input vector and $K$ is the number of classes, which in our case is two: defect and defect-free.

In our study, four modifications were carried out on the original AlexNet model, creating an AlexNet variant. The modifications are: (1) the number of channels in the input convolutional layer is changed from three to one, as our dataset consists of only grayscale images, (2) dropout is imposed after each fully connected layer, (3) batch normalization is performed after the third and fourth convolutional layers, and (4) the number of output features of the second fully connected layer is changed from 4096 to 1024.

E. Application of transfer learning and optimizing model hyperparameters

The ImageNet dataset [38] is used for pre-training of the AlexNet model, and the influence of transfer learning is tested using four experimental configurations (EC):

• EC1: AlexNet is trained with the casting surface defect dataset without any pre-training, with weights initialized randomly (training from scratch).
• EC2: the same process as in the previous configuration is repeated, but the weights are initialized with the ones found from the pre-trained model instead of random weights.
• EC3: the exact weights of all the feature extraction layers pre-trained on the ImageNet dataset are used.
• EC4: all the model parameters (including the parameters of both convolutional and fully connected layers) of the model pre-trained on the ImageNet dataset are used on the casting surface defect dataset.

In each configuration, two types of hyperparameters, the optimizer [39] and the learning rate, are optimized using the grid search method to achieve higher accuracy with modest robustness. In the grid search method, all the possible combinations of the selected hyperparameters are tested in multiple trials. The grid search method suffers from the curse of dimensionality [40], where the number of trials grows exponentially with the number of hyperparameters. Nevertheless, other, more sophisticated optimization methods are not used, as we obtained sufficient accuracies by varying only the two aforementioned hyperparameters.

F. Implementation

Training of the AlexNet model is accomplished using the Google Colaboratory tool, a free online Python programming environment specially designed for machine learning tasks. The CPU is a single-core hyper-threaded Intel Xeon processor at 2.3 GHz with 13 GB of RAM, while the GPU is a Tesla K80 with 12 GB of GDDR5 VRAM.

For the implementation of the AlexNet model and KNN, the TensorFlow [41] and Scikit-learn [42] open-source tools are used. TensorFlow is an open-source framework designed for the implementation and experimentation of machine learning-related tasks, while Scikit-learn is a high-level machine learning library for the Python programming language. Furthermore, pre-trained models including the weights are acquired using PyTorch, an open-source deep learning framework [43].

All the experiments ran for ten epochs, where an epoch is one training iteration in which the neural network completes one pass over the dataset. The selection of ten epochs is based on the empirical observation that the training in each experiment always converged within ten epochs with optimal hyperparameters.

TABLE II. PRECISION, RECALL AND F1-SCORE OF THE TWO CLASSES OBTAINED AFTER CLASSIFICATION USING KNN ALGORITHM

| Test:Train | Defect Precision | Defect Recall | Defect F1-Score | Defect-free Precision | Defect-free Recall | Defect-free F1-Score |
|---|---|---|---|---|---|---|
| 0.2:0.8 | 0.86 | 0.87 | 0.88 | 0.86 | 0.81 | 0.83 |
| 0.4:0.6 | 0.85 | 0.88 | 0.86 | 0.84 | 0.79 | 0.81 |
| 0.6:0.4 | 0.85 | 0.88 | 0.86 | 0.84 | 0.79 | 0.81 |
| 0.8:0.2 | 0.84 | 0.82 | 0.83 | 0.77 | 0.79 | 0.78 |

IV. RESULTS AND DISCUSSIONS

This section presents the results obtained by following the methods discussed in the previous section, along with related interpretations.

A. Classification without learning

The results of the KNN classification of the casting surface defect dataset are presented in this section. Table II shows the precision, recall, and f1-score corresponding to each class (defect and defect-free) obtained after performing the KNN algorithm with varying train-test split ratios. With the reduction of the training set percentage, there is no significant gradual change in the accuracy, as no learning occurs during the training process of the KNN algorithm, unlike the learning models discussed in this paper.

The overall average accuracy of the classification of casting surface image data using the KNN algorithm is relatively lower when compared to the results of the CNN models discussed in the following sections. This lower accuracy reveals that classification using raw pixel intensities and their proximities to neighbor values in the casting surface defect images is not effective. This phenomenon discloses that all the images in each class are unique to a certain extent with respect to pixel intensities, which in turn indicates the importance of feature extraction. On the
other hand, when observed from the perspective of the accuracies (i.e., all the accuracies are around 0.8, which is regarded as a significant performance in image classification tasks), it reveals that the image dataset has lower levels of noise.

B. Classification with learning

Classification using CNNs yielded higher accuracies when compared to the classification performed by the KNN algorithm. Fig. 4 illustrates the variation of accuracy with different train-test split ratios and different experimental configurations.

Fig. 4. (a) and (b) are the accuracies of the training process and the validation process, respectively, over different experimental configurations (ECs) (which are described under the methodology of this paper) and varying train-test split ratios.

For each experimental configuration, the training accuracy (as shown in Fig. 4-a) drops when the training image portion decreases and the number of validation images increases. This demonstrates the common idea that less training in deep learning models leads to lower accuracy. Nevertheless, the size of the drop is negligible, as all the accuracies are above 0.9 (or equal to 0.9) in each scenario. The highest overall accuracy is achieved when the training weights are initialized from the pre-trained model (EC2) instead of random initialization (EC1). Specifically, even with 20% training images, the use of the exact feature extractor of the pre-trained model for training (EC3) yielded higher accuracy than training from scratch. In the instance where both feature extractor weights and classifier weights (weights of the fully connected layers) of the pre-trained model are used for training, an overall accuracy of 0.9 is achieved.

In contrast, the validation accuracy (as shown in Fig. 4-b) does not fluctuate considerably over the variation of the train-test split ratio regardless of the experimental configuration, except where training is done from scratch. All the transfer learning schemes (EC2, EC3, and EC4) show improved validation accuracies when compared to training from scratch (EC1) on the casting surface image dataset.

Table III indicates the possible combinations of the hyperparameters used for the grid search method and related accuracies for EC3 with 20% of training images. During optimization of hyperparameters, we first picked a random learning rate (0.0001) and performed a grid search with seven optimizer types. The best performance is gained by setting the optimizer to the RMSprop algorithm [39]. Fixing the optimizer as the RMSprop algorithm, we then tested several learning rates, which resulted in 0.0001 as the optimum value. The overall best hyperparameters (i.e., optimizer type and learning rate) found by the grid search method, together with the other hyperparameters taken from the literature, were standardized as shown in Table IV over the final run of each experiment.

TABLE III. RESULTS OF THE GRID SEARCH METHOD PERFORMED TO FIND BEST OPTIMIZER AND LEARNING RATE

Search 1: Learning Rate is Randomly Selected (=0.0001) and Fixed to Test Several Optimizer Types

| Learning Rate | Optimizer | Training accuracy | Training time (seconds) |
|---|---|---|---|
| 0.0001 | Adam | 0.94 | 742 |
| 0.0001 | Adadelta | 0.57 | 757 |
| 0.0001 | AdamW | 0.90 | 484 |
| 0.0001 | Adamax | 0.89 | 518 |
| 0.0001 | ASGD | 0.57 | 505 |
| 0.0001 | RMSprop | 0.93 | 635 |
| 0.0001 | SGD | 0.58 | 744 |

Search 2: Best Optimizer (RMSprop) from Search 1 is Fixed and Tested Several Learning Rates

| Optimizer | Learning rate | Training accuracy | Training time (seconds) |
|---|---|---|---|
| RMSprop | 0.1 | 0.55 | 630 |
| RMSprop | 0.01 | 0.57 | 634 |
| RMSprop | 0.001 | 0.94 | 641 |
| RMSprop | 0.0001 | 0.93 | 637 |
| RMSprop | 0.00001 | 0.93 | 642 |

TABLE IV. OPTIMIZED HYPERPARAMETER SETTINGS/VALUES STANDARDIZED THROUGHOUT ALL EXPERIMENTS

| Hyperparameter | Setting/Value | Obtained with Grid Search (GS) Method/Using Literature (LT) |
|---|---|---|
| Optimizer | RMSprop | GS |
| Learning Rate | 0.0001 | GS |
| Learning rate policy | Step (decay over epoch) | LT |
| Momentum | 0.9 | LT |
| Batch Size | 16 | LT |

V. CONCLUSIONS AND FUTURE WORK

Maintaining quality standards is vital in the casting product manufacturing industry for better business

Smart Computing and Systems Engineering, 2021
Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka

performance and for the safety of the end-users who consume products with critical mechanical components fabricated by casting. Automated inspection of casting defects leads to shorter inspection times and circumvents safety problems of employees working in hazardous environments.

In this paper, we discussed the application of AlexNet CNN architecture-based transfer learning for automated inspection of surface defects of a submersible pump impeller manufactured by casting. Over the last decade, for the task of casting defect recognition, numerous sophisticated architectures were proposed with higher architectural complexity and better performance compared to the AlexNet architecture. Using the results of our study, we show (limited to the dataset used) that a simpler architecture like AlexNet can perform better when it is implemented with transfer learning and optimized model parameters. As future work, the methods discussed in this study can be tested over other datasets containing images of casting surface defects of different products.

Over the several experimental configurations tested, the use of the exact feature extractor of the pre-trained model for training demonstrated the best performance in terms of training accuracy and training time (although training with weights initialized from the pre-trained model resulted in the overall highest accuracy, its training time is higher than when using the entire feature extractor).

Several recommendations for a casting surface defect detection system can be made based on the results of this study. Nevertheless, as future work, the practical usability of such a system needs to be tested prior to implementation, as several dataset-specific parameters still need to be adjusted depending on the circumstance. The process of capturing the surface images of the casting products is vital; this includes, but is not limited to, (1) adhering to proper lighting conditions and (2) maintaining a consistent, plain background when capturing. As shown in the results, transfer learning can be implemented to reduce the training time and enhance the robustness of the model. Moreover, transfer learning is beneficial when the number of training images is lower. Specifically, using the feature extractor from the pre-trained model and limiting training on the casting defect data to the classification layers (fully connected layers) is advantageous compared with using all the parameters of the pre-trained model. Furthermore, fine-tuning the model hyperparameters is crucial for obtaining better results.

REFERENCES
[1] W. Barkman, In-process quality control for manufacturing. CRC Press, 1989.
[2] R. T. Chin and C. A. Harlow, “Automated visual inspection: A survey,” IEEE transactions on pattern analysis and machine intelligence, no. 6, pp. 557–573, 1982.
[3] M. Sahoo, Principles of metal casting. McGraw-Hill Education, 2014.
[4] T. V. Sai, T. Vinod, and G. Sowmya, “A critical review on casting types and defects,” Engineering and Technology, vol. 3, no. 2, pp. 463–468, 2017.
[5] M. K. Ferguson, A. Ronay, Y.-T. T. Lee, and K. H. Law, “Detection and segmentation of manufacturing defects with convolutional neural networks and transfer learning,” Smart and sustainable manufacturing systems, vol. 2, 2018.
[6] D. Mery, T. Jaeger, and D. Filbert, “A review of methods for automated recognition of casting defects,” Insight, vol. 44, no. 7, pp. 428–436, 2002.
[7] S. Gholizadeh, “A review of non-destructive testing methods of composite materials,” Procedia Structural Integrity, vol. 1, pp. 50–57, 2016.
[8] Q. Wan, H. Zhao, and C. Zou, “Effect of micro-porosities on fatigue behavior in aluminum die castings by 3D X-ray tomography inspection,” ISIJ international, vol. 54, no. 3, pp. 511–515, 2014.
[9] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv preprint arXiv:1511.08458, 2015.
[10] H. Strecker, “A local feature method for the detection of flaws in automated X-ray inspection of castings,” Signal Processing, vol. 5, no. 5, pp. 423–431, 1983, doi: https://doi.org/10.1016/0165-1684(83)90005-1.
[11] Z. Górny, S. Kluska-Nawarecka, D. Wilk-Kołodziejczyk, and K. Regulski, “Diagnosis of casting defects using uncertain and incomplete knowledge,” Archives of Metallurgy and Materials, vol. 55, no. 3, pp. 827–836, 2010.
[12] M. Ferguson, R. Ak, Y.-T. T. Lee, and K. H. Law, “Automatic localization of casting defects with convolutional neural networks,” in 2017 IEEE international conference on big data (big data), 2017, pp. 1726–1735.
[13] L. Alzubaidi et al., “Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions,” Journal of big Data, vol. 8, no. 1, pp. 1–74, 2021.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 25, pp. 1097–1105, 2012.
[15] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on knowledge and data engineering, vol. 22, no. 10, pp. 1345–1359, 2009.
[16] A. Koutsoukas, K. J. Monaghan, X. Li, and J. Huan, “Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data,” Journal of cheminformatics, vol. 9, no. 1, pp. 1–13, 2017.
[17] R. Dabhi, “Casting product image data for quality inspection,” Kaggle.com. https://kaggle.com/ravirajsinh45/real-life-industrial-dataset-of-casting-product (accessed Jun. 14, 2021).
[18] Gayer, A. Saya, and A. Shiloh, “Automatic recognition of welding defects in real-time radiography,” Ndt International, vol. 23, no. 3, pp. 131–136, 1990.
[19] Eckelt, N. Meyendorf, W. Morgner, and U. Richter, “Use of automatic image processing for monitoring of welding processes and weld inspection,” in Non-destructive testing, Elsevier, 1989, pp. 37–41.
[20] D. Filbert, R. Klatte, W. Heinrich, and M. Purschke, “Computer aided inspection of castings,” in IEEE-IAS Annual Meeting, 1987, pp. 1087–1095.
[21] D. Mery, “New approaches for defect recognition with X-ray testing,” Insight, vol. 44, no. 10, pp. 614–15, 2002.
[22] X. Li, S. K. Tso, X.-P. Guan, and Q. Huang, “Improving automatic detection of defects in castings by applying wavelet technique,” IEEE Transactions on Industrial Electronics, vol. 53, no. 6, pp. 1927–1934, 2006.
[23] Kehoe and G. A. Parker, “An intelligent knowledge based approach for the automated radiographic inspection of castings,” NDT & E International, vol. 25, no. 1, pp. 23–36, 1992.
[24] D. Wang, B. Wang, H. Yao, H. Liu, and F. Tombari, “Local image descriptors with statistical losses,” in 2018 25th IEEE International Conference on Image Processing (ICIP), 2018, pp. 1208–1212. doi: 10.1109/ICIP.2018.8451855.
[25] R. R. Da Silva, M. H. S. Siqueira, L. P. Calôba, and J. M. Rebello, “Radiographics pattern recognition of welding defects using linear classifiers,” Insight, vol. 43, no. 10, pp. 669–74, 2001.
[26] A. Creswell, T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, “Generative adversarial networks: An overview,” IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 53–65, 2018.
[27] L. Jiang, Y. Wang, Z. Tang, Y. Miao, and S. Chen, “Casting defect detection in X-ray images using convolutional neural networks and attention-guided data augmentation,” Measurement, vol. 170, p. 108736, 2021.
[28] C. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of Big Data, vol. 6, no. 1, pp. 1–48, 2019.
[29] S. P. Mohanty, D. P. Hughes, and M. Salathé, “Using deep learning for image-based plant disease detection,” Frontiers in plant science, vol. 7, p. 1419, 2016.


[30] Abd Almisreb, N. Jamil, and N. M. Din, “Utilizing AlexNet deep transfer learning for ear recognition,” in 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP), 2018, pp. 1–5.
[31] M. Z. Alom et al., “The history began from alexnet: A comprehensive survey on deep learning approaches,” arXiv preprint arXiv:1803.01164, 2018.
[32] A. Khan, A. Sohail, U. Zahoora, and A. S. Qureshi, “A survey of the recent architectures of deep convolutional neural networks,” Artificial Intelligence Review, vol. 53, no. 8, pp. 5455–5516, 2020.
[33] C.-Y. Lee, P. W. Gallagher, and Z. Tu, “Generalizing pooling functions in convolutional neural networks: Mixed, gated, and tree,” in Artificial intelligence and statistics, 2016, pp. 464–472.
[34] W. Rawat and Z. Wang, “Deep convolutional neural networks for image classification: A comprehensive review,” Neural computation, vol. 29, no. 9, pp. 2352–2449, 2017.
[35] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, “Improving neural networks by preventing co-adaptation of feature detectors,” arXiv preprint arXiv:1207.0580, 2012.
[36] W. Liu, Y. Wen, Z. Yu, and M. Yang, “Large-margin softmax loss for convolutional neural networks,” in ICML, 2016, vol. 2, no. 3, p. 7.
[37] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE conference on computer vision and pattern recognition, 2009, pp. 248–255.
[38] S. R. Labhsetwar, S. Haridas, R. Panmand, R. Deshpande, P. A. Kolte, and S. Pati, “Performance Analysis of Optimizers for Plant Disease Classification with Convolutional Neural Networks,” arXiv preprint arXiv:2011.04056, 2020.
[39] J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” Journal of machine learning research, vol. 13, no. 2, 2012.
[40] M. Abadi et al., “Tensorflow: A system for large-scale machine learning,” in 12th symposium on operating systems design and implementation, 2016, pp. 265–283.
[41] F. Pedregosa et al., “Scikit-learn: Machine learning in Python,” the Journal of machine Learning research, vol. 12, pp. 2825–2830, 2011.
[42] A. Paszke et al., “Pytorch: An imperative style, high-performance deep learning library,” Advances in neural information processing systems, vol. 32, pp. 8026–8037, 2019.
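
The two-stage search reported in Table III (several optimizer types compared at one fixed learning rate, then several learning rates compared for the selected optimizer) can be sketched as below. This is only an illustrative outline, not the authors' code: the function names `two_stage_search` and `toy_evaluate` are hypothetical, the scoring function returns made-up values rather than results from the paper, and picking the candidate with the highest score is an assumption about the selection rule.

```python
import math
from typing import Callable, Iterable, Tuple


def two_stage_search(
    evaluate: Callable[[str, float], float],
    optimizers: Iterable[str],
    learning_rates: Iterable[float],
    initial_lr: float = 1e-4,
) -> Tuple[str, float]:
    """Pick an optimizer at a fixed learning rate, then a learning rate for it."""
    # Search 1: hold the learning rate fixed and compare optimizer types.
    best_optimizer = max(optimizers, key=lambda name: evaluate(name, initial_lr))
    # Search 2: hold the chosen optimizer fixed and compare learning rates.
    best_lr = max(learning_rates, key=lambda lr: evaluate(best_optimizer, lr))
    return best_optimizer, best_lr


def toy_evaluate(optimizer: str, lr: float) -> float:
    """Hypothetical stand-in for one training run; the scores are invented."""
    base = {"Adam": 0.6, "RMSprop": 0.7, "SGD": 0.5}[optimizer]
    penalty = 0.05 * abs(math.log10(lr) + 4)  # toy score peaks at lr = 1e-4
    return base - penalty


OPTIMIZERS = ["Adam", "RMSprop", "SGD"]
LEARNING_RATES = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]

best = two_stage_search(toy_evaluate, OPTIMIZERS, LEARNING_RATES)
print(best)
```

In a real experiment, `evaluate` would train the network with the given optimizer and learning rate and return the accuracy of interest. With seven optimizers and five learning rates, as in Table III, this procedure needs 7 + 5 = 12 training runs, whereas an exhaustive grid over both hyperparameters would need 7 × 5 = 35.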
