
Hindawi

Scientific Programming
Volume 2022, Article ID 1188974, 12 pages
https://doi.org/10.1155/2022/1188974

Research Article
Machine Vision-Based Object Detection Strategy for Weld Area

Chenhua Liu , Shen Chen, and Jiqiang Huang


School of Mechanical Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China

Correspondence should be addressed to Chenhua Liu; [email protected]

Received 27 February 2022; Revised 18 March 2022; Accepted 21 March 2022; Published 11 April 2022

Academic Editor: Tongguang Ni

Copyright © 2022 Chenhua Liu et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
In noisy industrial environments, welded parts develop various types of defects in the weld area during the welding process and must be polished; manual polishing suffers from low efficiency and high labor intensity, so machine vision is used to automate polishing and achieve continuous, efficient work. In this study, the two-stage Faster R-CNN object detection algorithm is applied to a V-shaped welded thick plate as the research object. A workpiece dataset is established under different lighting conditions and angles, six region proposal networks are used for transfer learning, the convergence behavior of different Batch and Mini-Batch settings is compared, and the relationship between FLOPs and the number of network parameters, and their effect on the model, is explored. The optimal learning rate is selected for training to form a weld area object detection network for the weld plate workpiece under few samples. The study shows that the VGG16 model performs best in weld seam area recognition, with 91.68% average precision and 25.02 ms average detection time on the validation set; it can effectively identify weld seam areas in various industrial environments and provide location information for subsequent automatic grinding by robotic arms.

1. Introduction

The rapid development of welding has promoted the progress of related industries. Weld seam quality directly affects the structural performance and the service life of the product [1]. Still, the weld seam inevitably develops defects after welding, such as spatter, weld tumors, leakage, and porosity. Manual grinding of the weld area is required to eliminate welding defects, but it suffers from subjectivity and low efficiency. Because the workpiece considered here is a large V-shaped welded thick plate, automated weld area detection must be introduced to locate the weld area [2]. At present, weld area detection at home and abroad is based mainly on traditional methods and on deep learning. Traditional image-based weld region extraction algorithms determine the weld seam location by constructing key points and descriptors that encode image feature information, such as ORB (Oriented FAST and Rotated BRIEF) [3], Hu moment invariant features [4], and AdaBoost weak learners [5]. Laser weld area detection and 3D laser reconstruction of the weld area are also widely used in current research [6]. In such methods, features are extracted manually, and their poor generalization and low detection accuracy are unavoidable drawbacks.

Given that weld seam formation results from the nonlinear interaction of multiple welding parameters, convolutional neural networks (CNNs), which are themselves nonlinear, are also gradually being applied to the welding industry [7], including object detection of weld seam areas. Mainstream object detection methods divide, according to the detection process, into one-stage detection algorithms based on regression and represented by the SSD and YOLO series, and two-stage detection algorithms based on candidate regions and represented by the Faster R-CNN series [8]. The former pursue speed; the latter pursue accuracy.

Although various object recognition methods have achieved significant results in workpiece recognition, the V-shaped weld plate workpiece provides only a small number of samples, which cannot meet deep network training requirements, and weld seam region extraction algorithms for few-sample weld plate workpieces are lacking. Because the few-sample data for the V-shaped weld plate workpiece contain very little labeled data, this study builds on a mature, classical object detection method and adopts a transfer learning strategy for few-sample learning [9]: a model trained on a large-scale source-domain dataset supplies the parameters that initialize the target-domain model, which is then fine-tuned on the small-scale workpiece dataset. For weld seam detection accuracy, the two-stage Faster R-CNN network is applied as the object recognition framework, with fine-tuned transfer learning of the source domain performed using the VGG16 network. The resulting model, evaluated on the test set, achieves an accuracy rate that meets the needs of industrial use.

Figure 1: Diagram of the CAD model of the welded plate and the area of the weld (the highlighted area is the weld area; labels: V-shaped welding plate, weld plate bevel).
2. Object Detection of Weld Seam with Few Samples

2.1. Creation of Weld Area Dataset. Transfer learning of the network model is performed on the ImageNet dataset, and an Intel RealSense D435i RGB-D camera is used to capture the V-shaped weld plate workpiece with its weld seam; the views of the weld plate workpiece are shown in Figure 1. The eye-in-hand calibration strategy [10] is used to find the coordinate conversion relationship, as shown in Figure 2, so that the coordinates of the weld area can be converted into the robot coordinate frame to realize vision-based automated welding and grinding [11]. Images of the weld plate workpieces are collected under backlight, normal light, and multiple angles, 1000 images in total. To improve network feature learning and training speed, the images are resized to 400 × 300 pixels. To learn the workpiece features at every angle and reduce network overfitting [12], Gaussian noise with different variance values and Gaussian kernels is added to high-frequency images captured under unbalanced lighting, and a data enhancement scheme of random horizontal flipping, salt-and-pepper noise, arbitrary-angle rotation, and random cropping is used to expand the dataset to 5000 samples [13], since the workpiece samples constitute few-sample data. Following the image data format of the NEU dataset (a dataset of steel plate surface defects produced by Northeastern University in China), the LabelImg annotation tool is used to export annotations to XML, ensuring that each annotated bounding box contains only one weld feature. As shown in Figure 3, 80% of the images in the dataset were randomly selected as the training set, and 10% each were assigned to the validation and test sets.

Figure 2: Diagram of the welding plate, robot arm, and camera (6-DOF UR robot arm; Intel RealSense D435i RGB-D camera with IMU; V-shaped welding plate).
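To make the augmentation pipeline concrete, the following is a minimal sketch in PyTorch/torchvision (the framework listed in Table 1). The noise amplitudes, crop size, and probabilities are illustrative assumptions rather than the exact settings used in this study, and for detector training the geometric transforms would also have to be applied to the annotated bounding boxes:

```python
import random
import torch
from torchvision import transforms

def add_salt_and_pepper(img, amount=0.02):
    """Corrupt a CHW float tensor in [0, 1] with salt-and-pepper noise."""
    noisy = img.clone()
    mask = torch.rand(img.shape[1:])          # one mask shared across channels
    noisy[:, mask < amount / 2] = 0.0         # pepper
    noisy[:, mask > 1 - amount / 2] = 1.0     # salt
    return noisy

def add_gaussian(img, sigma=0.05):
    """Additive Gaussian noise with a chosen standard deviation."""
    return (img + sigma * torch.randn_like(img)).clamp(0.0, 1.0)

# Resize to 400 x 300 pixels and apply the random geometric augmentations.
augment = transforms.Compose([
    transforms.Resize((300, 400)),            # (height, width)
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=180),   # arbitrary-angle rotation
    transforms.RandomCrop((280, 380), pad_if_needed=True),
    transforms.Resize((300, 400)),
    transforms.ToTensor(),
])

def augment_image(pil_img):
    x = augment(pil_img)
    if random.random() < 0.5:
        x = add_gaussian(x, sigma=random.choice([0.02, 0.05, 0.1]))
    else:
        x = add_salt_and_pepper(x)
    return x
```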
2.2. Transfer Learning Model Building Based on the Welded Plate Workpiece Dataset. The two-stage Faster R-CNN [14] is used to build a workpiece-oriented object recognition network for few samples, using the weights and biases pretrained on the ImageNet dataset, freezing all parameters except the fully connected layers, and modifying the Softmax classifier to train on the weld seam features of the weld plate workpiece. A simplified schematic of the network structure is shown in Figure 4.

The network model uses shared convolutional layers to extract the feature map of the weld region image [15], which is fed to the RPN (Region Proposal Network) and the ROI (Region of Interest) pooling layer, respectively. The feature extraction layer consists of 13 convolutional layers, 13 ReLU activation layers, and four pooling layers; the RPN determines whether each feature map location belongs to the foreground or the background via a Softmax classifier, and its operation is shown in Figure 5.

A 3 × 3 mask slides as a window over the feature map, and the position on the original image corresponding to the mask center is used as the center point (Figure 5(a)). Nine anchors with different scales and aspect ratios are generated at each position of the feature map (Figure 5(b)), and each anchor is assigned a class label (positive label: foreground weld area; negative label: background). The bounding box regression algorithm obtains the weld area bounding box values and outputs them to the ROI pooling layer for dimensionality reduction. The input to the ROI pooling layer is the feature map generated by the last convolutional layer together with the candidate region boxes generated by the RPN.
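A sketch of such a model in torchvision's documented custom-backbone pattern is shown below. The anchor sizes and the choice to freeze the whole convolutional trunk are illustrative assumptions, and torchvision's MultiScaleRoIAlign stands in for the ROI pooling layer described above:

```python
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

# VGG16 convolutional trunk (13 conv + 13 ReLU + pooling layers), ImageNet-pretrained.
backbone = torchvision.models.vgg16(pretrained=True).features
backbone.out_channels = 512           # channel count of VGG16's last conv block

# Freeze the pretrained feature extractor; only the new heads will train.
for p in backbone.parameters():
    p.requires_grad = False

# Nine anchors per location: 3 scales x 3 aspect ratios (Figure 5(b)).
anchor_generator = AnchorGenerator(
    sizes=((128, 256, 512),),
    aspect_ratios=((0.5, 1.0, 2.0),),
)

roi_pooler = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)

# Two classes: weld region vs. background.
model = FasterRCNN(
    backbone,
    num_classes=2,
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=roi_pooler,
)
```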

Figure 3: Image data enhancement and dataset division. Raw image data (1000 images) are expanded to 5000 images and split into a training set (80%, 4000 images), a validation set (10%, 500 images), and a test set (10%, 500 images). Panels: (a) original image, (b) Gaussian-filtered image, (c) image with salt-and-pepper noise, (d) image with random rotation, (e) image with random flip, and (f) image with random crop.

Figure 4: Schematic diagram of the Faster R-CNN network structure: an input image (400 × 300) passes through the convolutional layers to a feature map; the RPN produces proposals, which are combined with the feature map in the ROI pooling layer and passed through fully connected layers to the classification and bounding box regression layers that produce the output image (400 × 300).

The final output of the ROI pooling layer is a fixed-size ROI feature map. Finally, the fully connected layers and the Softmax classifier determine whether each candidate region is the weld region and output the exact location of its bounding box [16].

Six convolutional neural networks pretrained on the NEU dataset, VGG16 [17], VGG19 [18], GoogLeNet [19], ResNet50 [20], AlexNet [21], and LeNet [22], are used as RPNs for transfer learning, so that the Faster R-CNN model first obtains the underlying feature weights of the images and then transfers this feature information to the task of weld region recognition, achieving the goal of recognizing accurate weld regions from a small number of workpiece samples.

Initial training was conducted using SGD (stochastic gradient descent) with the learning rate set to 0.1, the momentum factor set to 0.9, the maximum number of epochs set to 30, and dropout set to 0.1; the six feature region candidate networks were trained in turn. During training, the loss function $L$ is given by equations (1), (2), and (3):

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*), \quad (1)$$

$$L_{cls}(p_i, p_i^*) = -\log\left[p_i p_i^* + (1 - p_i)(1 - p_i^*)\right], \quad (2)$$

$$L_{reg}(t_i, t_i^*) = R(t_i - t_i^*), \quad (3)$$

Figure 5: Sliding window diagram: (a) a 3 × 3 sliding window and its center on the feature map; (b) anchor boxes generated at the window center.

where $t_i = (t_x, t_y, t_w, t_h)$ is the predicted coordinate value and offset of the bounding box, $t_i^*$ denotes the actual (ground-truth) coordinate value and offset, $N_{cls}$ is the number of classification samples, $\{p_i\}$ and $\{t_i\}$ are the output values, $p_i$ denotes the predicted probability of the weld region, and $p_i^*$ denotes the anchor discriminant value: when it equals 1, the anchor carries a positive label, indicating the weld region [23]; when it equals 0, it carries a negative label, indicating the weld plate background. $L_{cls}$ denotes the classification loss, and $N_{reg}$ denotes the number of regression samples. Since smooth L1 loss has higher noise immunity and is less sensitive to numerical fluctuations than L2 loss, it achieves better results when regressing the bounding box of the target, so smooth L1 loss is used as the regression term $R$ in equation (1) [24]:

$$R = \mathrm{smooth}_{L_1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise}. \end{cases} \quad (4)$$
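Equation (4) matches the elementwise smooth L1 loss available in PyTorch; a brief sketch for reference (the residual values are arbitrary examples):

```python
import torch
import torch.nn.functional as F

def smooth_l1(x: torch.Tensor) -> torch.Tensor:
    """Piecewise regression loss from equation (4)."""
    absx = x.abs()
    return torch.where(absx < 1.0, 0.5 * x ** 2, absx - 0.5)

diff = torch.tensor([-2.0, -0.5, 0.0, 0.3, 1.5])   # t_i - t_i* residuals
print(smooth_l1(diff))
# Matches PyTorch's built-in elementwise version (default beta = 1.0):
print(F.smooth_l1_loss(diff, torch.zeros_like(diff), reduction="none"))
```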
positive samples, the recall responds to how well the clas-
Iterative training and testing were performed on pro- sifier covers the positive samples, and the average precision
fessional graphics workstations with the training and testing is the integral of the precision-recall curve.
environments, as shown in Table 1. This study ensures that different candidate zone net-
works are tested and analyzed in a unified hardware-soft-
ware environment to ensure a consistent FLOPS (Floating-
2.3. Evaluation Method for the Performance of Weld Area Point Operations Per Second) [27]; the metric is used to
Identification Models. Precision P (Precision), Recall R measure to estimate the execution performance of the
(ReCall), and Average Precision AP (Average Precision) [25] computing platform. The number of parameters of the RPN
are used as evaluation metrics for object detection in the is only related to the network structure, and the memory
weld area to assess the degree of merit of the model. The occupation of the model is approximately four times the
calculation is shown in equations (5), (6), and (7): memory occupation of the number of parameters. The
Scientific Programming 5

Table 1: Hardware parameters and deep learning environment.

Environment name | Version
Operating system | Ubuntu 16.04 LTS
CPU | Intel(R) Core(TM) i9-12900K @ 5.0 GHz
GPU | GeForce RTX 3090 × 2
RAM/ROM | Fury DDR4 2666 MHz 16 GB × 2 / 2 TB 970 EVO Plus M.2
Deep learning framework/Python | PyTorch 1.9.1 (stable)/3.8.1
IDE | PyCharm Professional 2020.2
CUDA | 11.4
cuDNN | 8.1
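For reference, a minimal sketch of how equations (5)-(7) can be computed from ranked detections (a standard VOC-style evaluation; the IoU matching rule used to decide true positives is an assumption, since the paper does not state it):

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """AP as the area under the precision-recall curve, equations (5)-(7).

    scores: confidence of each detection; is_true_positive: whether the
    detection matched a ground-truth weld box (e.g., IoU >= 0.5).
    """
    order = np.argsort(-np.asarray(scores))          # rank by confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(fp)
    precision = cum_tp / (cum_tp + cum_fp)           # equation (5)
    recall = cum_tp / num_ground_truth               # equation (6)
    # Integrate P over R (equation (7)) with a rectangle rule.
    return float(np.sum(np.diff(np.concatenate(([0.0], recall))) * precision))

# Example: 5 ranked detections against 4 ground-truth weld regions.
print(average_precision([0.93, 0.91, 0.88, 0.75, 0.60],
                        [True, True, False, True, False], 4))
```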

The number of parameters [28] in the convolutional and fully connected layers is calculated as shown in equations (8), (9), and (10):

$$Paras_{\mathrm{Conv}} = n \times (h \times w \times c + 1), \quad (8)$$

$$Paras_{\mathrm{Full}} = Weight_{\mathrm{in}} \times Weight_{\mathrm{out}}, \quad (9)$$

$$Paras_{\mathrm{Num}} = \sum_i \left(Paras_{\mathrm{Conv}_i} + Paras_{\mathrm{Full}_i}\right), \quad (10)$$

where $c$ is the number of input channels, $n$ is the number of output channels, and $h$ and $w$ are the height and width of the convolutional kernel. The pooling layers contribute no parameters. Too few parameters prevent the model from capturing the weld area features, making the model underfit; too many parameters make the model occupy too much memory and increase the memory access cost (MAC). To measure the model complexity of the candidate region networks, the number of floating-point operations, FLOPs, is introduced.

To compute FLOPs, this study assumes that convolution is implemented as a sliding window [29] and that the nonlinearity is computed for free. For a specified convolution kernel, FLOPs are calculated as in equation (11):

$$\mathrm{FLOPs} = 2HW\left(C_{\mathrm{in}}K^2 + 1\right)C_{\mathrm{out}}, \quad (11)$$

where $H$ and $W$ are the height and width of the output feature map, $C_{\mathrm{in}}$ is the number of input channels, $K$ is the kernel width (assumed symmetric), and $C_{\mathrm{out}}$ is the number of output channels [30]. For a fully connected layer, FLOPs are calculated as shown in equation (12):

$$\mathrm{FLOPs} = (2D_{\mathrm{in}} - 1)D_{\mathrm{out}}, \quad (12)$$

where $D_{\mathrm{in}}$ is the input dimensionality and $D_{\mathrm{out}}$ is the output dimensionality. The performance parameters of each RPN are shown in Table 2.
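A sketch of applying equations (8)-(12) to a PyTorch model is given below; the feature_hw mapping is a placeholder that depends on the input resolution and pooling schedule and is not taken from the paper:

```python
import torch.nn as nn

def count_params_and_flops(model, feature_hw):
    """Layer-by-layer application of equations (8)-(12).

    feature_hw: maps the running index of each Conv2d layer to the
    (H, W) of its output feature map, which depends on the input size
    (400 x 300 here) and on the pooling schedule of the network.
    """
    params, flops, conv_idx = 0, 0, 0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            kh, kw = m.kernel_size
            params += m.out_channels * (kh * kw * m.in_channels + 1)  # eq. (8)
            H, W = feature_hw[conv_idx]
            flops += 2 * H * W * (m.in_channels * kh * kw + 1) * m.out_channels  # eq. (11)
            conv_idx += 1
        elif isinstance(m, nn.Linear):
            params += m.in_features * m.out_features                  # eq. (9)
            flops += (2 * m.in_features - 1) * m.out_features         # eq. (12)
    return params, flops  # eq. (10) is the running sum over all layers
```

In practice, the raw parameter count alone can be obtained directly with `sum(p.numel() for p in model.parameters())`.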
3. Result and Analysis

3.1. Recognition Results for Weld Areas by Different Feature Extraction Networks. This study uses a V-shaped welded steel plate with dimensions of 30.00 cm × 17.00 cm × 0.50 cm and a V-shaped opening angle of 45°, with the weld seam formed under the following welding parameters: the steel plate material is mild steel Q215, the welding method is gas metal arc welding (GMAW) with multilayer, multipass welding, the shielding gas is argon, the welding current is 200 A, the welding wire diameter is 2 mm, and the welding speed is 2 mm per second. Based on the recognition results for the weld seam on the same V-shaped welding plate, the convolutional neural network with the best performance is selected as the RPN, and the model is fine-tuned on this basis for comparative analysis of weld seam recognition in different work scenarios.

With FLOPS held consistent, the network accuracy and training loss curves are plotted using the initial training parameters set in Section 2.2, with the six different RPNs evaluated on the training and validation sets, respectively; the results are shown in Figure 6. The statistical training results obtained by changing the RPN in the Faster R-CNN model are summarized in Table 3, where AP (%)-T is the average precision on the training set and AP (%)-V is the average precision on the validation set.

Sets of 100, 200, 300, 400, and 500 different weld plate images are taken from the weld plate test set as input images, and their average detection times are recorded to evaluate the operational efficiency of the algorithm (Figure 7).

ResNet50 used as the RPN gives the highest average precision on the training set, and its score on the recognized weld area reaches 94.34% (Figure 8(d)). The ResNet series provides a shortcut connection mechanism, which ensures a reasonable recognition rate even though ResNet50 is 50 layers deep. The increased number of layers raises the FLOPs to 4 GFLOPs, and the ResNet50 average detection time is 92.03 ms. When AlexNet is used as the RPN, the FLOPs are 727 MFLOPs and the average detection time is as short as 18.02 ms; but because it has only 5 convolutional layers, its learning is not as effective as that of the deep networks (Figure 8(b)), and its average recognition accuracy is 75.50% (Figure 8(a)). During training, the AlexNet accuracy and loss functions converge gradually, but the validation loss curve fails to approach the training loss curve, so the resulting weld candidate boxes include large background (weld plate) regions. VGG19 can identify the weld area more completely (Figure 8(f)), but with 20 GFLOPs and 20,483,904 parameters, as shown in Table 2, the result is a memory occupation of 548 MB, a bloated model, and a detection time of 70.50 ms.

Table 2: Performance parameters of RPNs.

Region proposal network | Number of network layers | Number of parameters/memory size (MB) | FLOPs
VGG16 | 16 | 62,378,344/(528 MB) | 16 GFLOPs
VGG19 | 19 | 20,483,904/(548 MB) | 20 GFLOPs
ResNet50 | 50 | 25,000,000/(78.1 MB) | 4 GFLOPs
AlexNet | 11 | 60,965,128/(233 MB) | 727 MFLOPs
GoogLeNet | 22 | 6,990,270/(51 MB) | 2 GFLOPs
LeNet | 7 | 431,080/(0.07 MB) | 2.32 MFLOPs

Figure 6: Training and validation accuracy and loss curves (over 30 epochs) of the different RPNs: (a) LeNet, (b) GoogLeNet, (c) VGG16, (d) VGG19, (e) ResNet50, and (f) AlexNet.

Table 3: AP values of the six RPN networks.

RPN | VGG16 | VGG19 | ResNet50 | AlexNet | GoogLeNet | LeNet
AP (%)-T | 91.55 | 85.34 | 94.34 | 75.50 | 84.55 | 65.23
AP (%)-V | 86.45 | 70.05 | 84.50 | 66.50 | 73.60 | 45.60

Figure 7: Average detection time (ms) of the six RPNs with different numbers of test images (100, 200, 300, 400, and 500).

Figure 8: Recognition results of different RPN models for the same weld plate workpiece, with detection scores: (a) AlexNet (0.8045), (b) GoogLeNet (0.8566), (c) LeNet (0.6752), (d) ResNet50 (0.9326), (e) VGG16 (0.9130), and (f) VGG19 (0.9225).

The worst performer is LeNet, which has the shortest average detection time, the lowest complexity at 2.32 MFLOPs, and the smallest number of parameters among the six networks. Because it has only 7 layers, it cannot learn the weld region's image features in depth; it is better suited to scenarios requiring only a shallow network and, in this study, could identify only part of the features in the weld region (Figure 8(c)).

The average precision of VGG16, 91.55%, is slightly lower than that of ResNet50. VGG16 uses several consecutive 3 × 3 convolutional kernels instead of the larger kernels in AlexNet; for a given receptive field, stacked small kernels outperform large kernels because the additional nonlinear layers increase the network depth and allow more complex patterns to be learned. The FLOPs of VGG16 are lower than those of VGG19 in the same series, while its number of parameters is increased to secure the learning effect and reduce the occupied memory space. Its average elapsed time is lower, only 25.02 ms, the confidence level of the identified weld region reaches 91.30%, and the entire weld region is framed (Figure 8(e)). Therefore, the Faster R-CNN model with the VGG16 candidate region network is selected as the model to identify the weld region.

3.2. Optimization of Models. As shown in the VGG16 results in Figure 6, the Faster R-CNN weld area detection network based on VGG16 achieves 91.55% accuracy on the training set, while the accuracy on the validation set cannot converge to the same level. To drive the loss value to convergence on both sets, different learning rates and different numbers of epochs are tried, and, because Mini-Batch training is better suited to few-sample datasets, the model is optimized on this basis. For comparison with the initial training parameters in Section 2.2, the Mini-Batch size is set to 128; the training progress is shown in Table 4.

Table 4: Mini-Batch training of VGG16.

Epoch | Mini-Batch training accuracy (%) | Validation accuracy (%) | Mini-Batch training loss | Validation loss
1 | 38.52 | 45.85 | 1.36 | 1.28
5 | 79.20 | 55.98 | 0.76 | 0.95
10 | 88.10 | 64.50 | 0.51 | 0.84
15 | 90.56 | 78.64 | 0.45 | 0.80
20 | 92.35 | 82.65 | 0.34 | 0.76
25 | 93.48 | 83.62 | 0.31 | 0.71
30 | 93.85 | 85.62 | 0.28 | 0.69
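A sketch of the corresponding fine-tuning configuration follows (PyTorch; the DataLoader and optimizer arguments mirror the stated Mini-Batch size, momentum, and explored learning rate, while the dataset objects and worker count are placeholders):

```python
import torch
from torch.utils.data import DataLoader

# Hypothetical dataset objects for the 4000/500 train/validation images.
# train_set, val_set = ...

def make_finetune_loader(train_set):
    # Mini-Batch of 128 samples per step, reshuffled every epoch.
    return DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

def make_optimizer(model, lr=1e-4):
    # SGD with momentum 0.9 as in Section 2.2, but with the reduced
    # learning rate explored in Table 5; only unfrozen parameters train.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=lr, momentum=0.9)
```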

Table 5: Accuracy results of models trained with different numbers of epochs and different learning rates; entries are training accuracy (validation accuracy) in %.

Learning rate | Epochs = 30 | 35 | 40 | 45 | 50 | 55 | 60
0.1 | 89.57 (81.63) | 91.18 (82.34) | 93.88 (83.65) | 92.03 (84.55) | 92.86 (85.64) | 92.01 (84.63) | 91.61 (83.52)
0.01 | 93.85 (84.56) | 94.10 (86.78) | 94.33 (87.12) | 94.85 (89.73) | 95.17 (91.23) | 94.01 (90.55) | 93.51 (89.77)
0.001 | 94.56 (86.64) | 94.84 (88.50) | 95.25 (89.66) | 95.36 (90.16) | 95.66 (92.38) | 93.83 (90.56) | 92.84 (89.71)
0.0001 | 95.75 (89.31) | 95.89 (89.65) | 95.91 (91.68) | 95.85 (92.36) | 95.93 (92.45) | 94.59 (91.63) | 93.88 (91.09)
0.00001 | 94.79 (88.71) | 94.97 (89.08) | 95.48 (90.16) | 95.12 (91.74) | 95.25 (91.65) | 94.77 (91.02) | 93.87 (91.87)

The plasticity of the VGG16-based Faster R-CNN weld region detection model is verified by adjusting the number of epochs and the learning rate to obtain the optimal network model; the results are shown in Table 5.

In Table 5, with VGG16 as the RPN for transfer learning at different learning rates, the model's accuracy gradually improves with more training and begins to converge on both the training and validation sets. Reducing the learning rate effectively improves the average accuracy of the overall model for the same number of training sessions. When the learning rate is 0.1 or 0.01, the network converges very slowly, which lengthens the search for the optimum and fails to reach optimal training accuracy. When the learning rate is 0.0001 and the number of epochs is 30, the accuracy on the training set reaches 95.75%; as the epochs increase further, the training accuracy keeps converging, but growth is slow, the average accuracy after 45 iterations is lower than before, and continued iterative training would increase the time complexity of the algorithm without a noticeable improvement in training accuracy. When the learning rate is 0.00001, the effect is not as good as with 0.0001, because the model hovers around the optimum, does not converge, and shows overfitting. Every learning rate shows an overall decreasing trend after 40 epochs, which is due to overfitting during training. Therefore, in this study, VGG16 is used as the RPN, and the two-stage Faster R-CNN is trained on the weld plate workpiece dataset with the other parameters of Section 2.2 unchanged, the Batch replaced by a Mini-Batch of size 128, a learning rate of 0.0001, and 40 epochs; the 500 weld plate images in the test set were used to construct the Precision-Recall curve. Using a non-maximum suppression mechanism, the initial confidence threshold was set to 0.8500 and then decremented over the recall interval from the point [0, 0.8500]; the model labels results above the threshold as positive samples and those below it as negative samples. As can be seen in Figure 9, as the threshold is gradually reduced, more and more weld samples are predicted as positive, and the curve reaches the (1, 0) coordinate point, indicating that the object detection model can identify the weld seams in the image data as positive samples. As in equation (7), the area enclosed by the curve and the coordinate axes is the AP. As shown in Figure 9, this area occupies almost the entire plot, indicating the strong recognition performance of the model.

As shown in Figure 10, even when the background color is similar to that of the weld plate workpiece at different angles and in different working environments, the network still identifies the weld areas of both weld plate workpieces with scores above 90%. As shown in Figure 11, in a complex industrial environment recognized from different angles and with a large field of view, with multiple weld parts in the image, the model network recognizes the complete weld areas, and even some smaller weld areas in the lower-left corner of the image are accurately recognized.
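The threshold sweep that produces this Precision-Recall curve can be sketched as follows (the 0.8500 starting threshold follows the text; the step size and input arrays are illustrative assumptions):

```python
import numpy as np

def pr_curve(scores, is_true_positive, num_ground_truth,
             start=0.85, step=0.005):
    """Sweep the confidence threshold downward and record (recall, precision)."""
    scores = np.asarray(scores)
    tp_flags = np.asarray(is_true_positive, dtype=bool)
    points = []
    for thr in np.arange(start, -step, -step):        # 0.85 down to 0.0
        keep = scores >= thr                          # positives at this threshold
        tp = np.sum(tp_flags & keep)
        fp = np.sum(~tp_flags & keep)
        if tp + fp == 0:
            continue
        points.append((tp / num_ground_truth, tp / (tp + fp)))
    return points  # plot recall on the x-axis, precision on the y-axis (Figure 9)
```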
Figure 9: Precision-Recall curve based on VGG16 (precision on the Y-axis against recall on the X-axis).

Figure 10: Recognition effect at tilt angles (scores 0.9003 and 0.9330).


SC 565
OR
0.8

ES

SC
O
0.9 RES
045

SCO
0.9 RES
466
76 S
0.8 RE
5
O
SC

54 S
0.9 ORE
5
SC

0.8 RES
0.7 ES R
O

156
SC

O
6
96

SC
0.7 ES R
O
SC

0
41

Figure 11: Multiple weld plate workpiece identification effect.

In Figure 12, the model is tested in different working scenarios. Figures 12(a), 12(b), and 12(c) show the weld detection results at different viewing angles: the recognition accuracy is almost always above 90% and close to the accuracy obtained on the training set. Although the material color in the background of the scene is similar to that of the foreground weld plate, the object detection network still accurately frames the weld area's location. Figures 12(d), 12(e), and 12(f) show weld detection under the influence of a side light source at a normal overhead angle; the weld area is identified relatively completely, with an accuracy of about 92%. Figures 12(g), 12(h), and 12(i) show the weld seam detection results under laser interference and multiangle recognition. The detection network can still frame the weld seam in this noisy environment, with a maximum recognition accuracy of 93.17% under laser and multiangle interference. The experimental results in Figure 12 demonstrate the robustness of this neural network model for identifying weld areas in different working environments.
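For completeness, a sketch of querying the trained detector for weld locations at inference time, following torchvision's detection API; the model checkpoint, image path, and score cutoff are placeholders:

```python
import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor

@torch.no_grad()
def detect_weld_areas(model, image_path, score_threshold=0.85):
    """Return (box, score) pairs for weld regions in one image."""
    model.eval()
    img = to_tensor(Image.open(image_path).convert("RGB"))
    output = model([img])[0]          # torchvision detection models take a list
    results = []
    for box, score in zip(output["boxes"], output["scores"]):
        if score >= score_threshold:
            results.append((box.tolist(), float(score)))  # [x1, y1, x2, y2]
    return results

# model = ...  # the Faster R-CNN built in Section 2.2, with trained weights
# print(detect_weld_areas(model, "weld_plate.png"))
```

The returned box corners give the location information passed on to the robotic arm for automatic grinding.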

Figure 12: Multiangle and multienvironment weld seam recognition effect: (a)-(c) different viewing angles, (d)-(f) a side light source at a normal overhead angle, and (g)-(i) laser interference with multiangle recognition (detection scores approximately 0.89-0.94).

4. Conclusion

The six networks LeNet, ResNet50, GoogLeNet, VGG16, VGG19, and AlexNet were selected as the RPN of the Faster R-CNN model and trained with the transfer learning method, the recognition performance of the six resulting Faster R-CNN models was compared and analyzed, and the following conclusions can be drawn:

(1) Faster R-CNN based on VGG16 gives the best weld region detection. With a model learning rate of 0.01 and 30 training epochs, the model reaches an average precision of 91.55% on the training set and 86.45% on the validation set, with an average detection time of 25.02 ms on the test set. To speed up the convergence of network training, the Batch is changed to a Mini-Batch of size 128, which further improves the model's accuracy on the training and validation sets and lowers the converged loss.

(2) When the learning rate is 0.0001, accuracy on the training set decreases as the epochs increase, which is an overfitting phenomenon, and after 40 epochs the model accuracy grows only slowly. To prevent an increase in time complexity, this study adopts a learning rate of 0.0001, a Mini-Batch of 128, and 40 epochs as training parameters and applies the resulting network to weld area recognition. As shown in the figures, the recognition accuracy and coverage meet industrial requirements, and the model accurately frames the weld seam under mixed multiplate conditions, multiangle welding conditions, and laser light source interference.

(3) From the real-time point of view, when the object detection model uses the VGG16 convolutional neural network as the RPN, it meets the requirements of both detection accuracy and detection time. The model achieves industrially useful recognition efficiency for weld area extraction in different environments, whether under an ideal viewpoint or in a complex environment, so the industrial requirements for real-time operation can be met.

Data Availability

All data are from the authors' own laboratory collection.

Conflicts of Interest

The authors declare that they have no conflicts of interest and agree on the order of attribution.

Acknowledgments

The authors thank the Chinese Ministry of Science and Technology for supporting this research and the professors who participated in this paper, and thank the second author for technical support during the collection of the dataset. This work was funded by the National Key Research and Development Program of China (2018YFB1306904).

References

[1] L. Liu, H. Chen, and S. Chen, "Quality analysis of CMT lap welding based on welding electronic parameters and welding sound," Journal of Manufacturing Processes, vol. 74, pp. 1–13, 2022.
[2] G. Peng, B. Chang, G. Wang et al., "Vision sensing and feedback control of weld penetration in helium arc welding process," Journal of Manufacturing Processes, vol. 72, pp. 168–178, 2021.
[3] H. Sharif and M. Hölzel, "A comparison of prefilters in ORB-based object detection," Pattern Recognition Letters, vol. 93, pp. 154–161, 2017.
[4] Z. Wu, S. Jiang, X. Zhou et al., "Application of image retrieval based on convolutional neural networks and Hu invariant moment algorithm in computer telecommunications," Computer Communications, vol. 150, pp. 729–738, 2020.
[5] J.-F. Ge and Y.-P. Luo, "A comprehensive study for asymmetric AdaBoost and its application in object detection," Acta Automatica Sinica, vol. 35, no. 11, pp. 1403–1409, 2009.
[6] K. Zhang, M. Yan, T. Huang, J. Zheng, and Z. Li, "3D reconstruction of complex spatial weld seam for autonomous welding by laser structured light scanning," Journal of Manufacturing Processes, vol. 39, pp. 200–207, 2019.
[7] Z. Zhang, G. Wen, and S. Chen, "Weld image deep learning-based on-line defects detection using convolutional neural networks for Al alloy in robotic arc welding," Journal of Manufacturing Processes, vol. 45, pp. 208–216, 2019.
[8] Q. Wang, L. Zhang, Y. Li, and K. Kpalma, "Overview of deep-learning based methods for salient object detection in videos," Pattern Recognition, vol. 104, article 107340, 2020.
[9] X. Zhu, K. Chen, B. Anduv, X. Jin, and Z. Du, "Transfer learning based methodology for migration and application of fault detection and diagnosis between building chillers for improving energy efficiency," Building and Environment, vol. 200, 2021.
[10] P. Martinez, M. Al-Hussein, and R. Ahmad, "Online vision-based inspection system for thermoplastic hot plate welding in window frame manufacturing," Procedia CIRP, vol. 93, pp. 1316–1321, 2020.
[11] S. Sharifzadeh, I. Biro, and P. Kinnell, "Robust hand-eye calibration of 2D laser sensors using a single-plane calibration artefact," Robotics and Computer-Integrated Manufacturing, vol. 61, 2020.
[12] A. Araújo, "Polynomial regression with reduced over-fitting—the PALS technique," Measurement, vol. 124, pp. 515–521, 2018.
[13] S. Yi and Y. Zhou, "Parametric reversible data hiding in encrypted images using adaptive bit-level data embedding and checkerboard based prediction," Signal Processing, vol. 150, pp. 171–182, 2018.
[14] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: towards real-time object detection with region proposal networks," arXiv:1506.01497 [cs.CV], https://arxiv.org/abs/1506.01497.
[15] H. Huang, H. Zhou, X. Yang, L. Zhang, L. Qi, and A.-Y. Zang, "Faster R-CNN for marine organisms detection and recognition using data augmentation," Neurocomputing, vol. 337, pp. 372–384, 2019.
[16] L. Chen, M. Zhou, W. Su, M. Wu, J. She, and K. Hirota, "Softmax regression based deep sparse autoencoder network for facial emotion recognition in human-robot interaction," Information Sciences, vol. 428, pp. 49–61, 2018.
[17] Z. Song, L. Fu, J. Wu, Z. Liu, R. Li, and Y. Cui, "Kiwifruit detection in field images using Faster R-CNN with VGG16," IFAC-PapersOnLine, vol. 52, no. 30, pp. 76–81, 2019.
[18] N. Dey, Y.-D. Zhang, V. Rajinikanth, R. Pugalenthi, and N. S. Madhava Raja, "Customized VGG19 architecture for pneumonia detection in chest X-rays," Pattern Recognition Letters, vol. 143, pp. 67–74, 2021.
[19] P. Tang, H. Wang, and S. Kwong, "G-MS2F: GoogLeNet based multi-stage feature fusion of deep CNN for scene recognition," Neurocomputing, vol. 225, pp. 188–197, 2017.
[20] A. Deshpande, V. V. Estrela, and P. Patavardhan, "The DCT-CNN-ResNet50 architecture to classify brain tumors with super-resolution, convolutional neural network, and the ResNet50," Neuroscience Informatics, vol. 1, no. 4, 2021.
[21] A. Unnikrishnan, V. Sowmya, and K. P. Soman, "Deep AlexNet with reduced number of trainable parameters for satellite image classification," Procedia Computer Science, vol. 143, pp. 931–938, 2018.
[22] S. Lin, L. Cai, X. Lin, and R. Ji, "Masked face detection via a modified LeNet," Neurocomputing, vol. 218, pp. 197–202, 2016.
[23] P. Wang, Y. Lei, Y. Ying, and H. Zhang, "Differentially private SGD with non-smooth losses," Applied and Computational Harmonic Analysis, vol. 56, pp. 306–336, 2022.
[24] D. Cheng, Y. Gong, X. Chang, W. Shi, A. Hauptmann, and N. Zheng, "Deep feature learning via structured graph Laplacian embedding for person re-identification," Pattern Recognition, vol. 82, pp. 94–104, 2018.
[25] Z. Avdeeva, E. Grebenyuk, and S. Kovriga, "Combined approach to forecasting of manufacturing system target indicators in a changing external environment," Procedia Computer Science, vol. 159, pp. 943–952, 2019.
[26] A. Berger and S. Guda, "Threshold optimization for F measure of macro-averaged precision and recall," Pattern Recognition, vol. 102, 2020.
[27] Z. Guo, Y. Xiao, W. Liao, P. Veelaert, and W. Philips, "FLOPs-efficient filter pruning via transfer scale for neural network acceleration," Journal of Computational Science, vol. 55, 2021.
[28] P. Molchanov, S. Tyree, T. Karras, and T. Aila, "Pruning convolutional neural networks for resource efficient transfer learning," 2016.
[29] C. R. Brodie, A. Constantin, and A. Lukas, "Flops, Gromov-Witten invariants and symmetries of line bundle cohomology on Calabi-Yau three-folds," Journal of Geometry and Physics, vol. 171, 2022.
[30] M. Wang, X. Fan, W. Zhang et al., "Balancing memory-accessing and computing over sparse DNN accelerator via efficient data packaging," Journal of Systems Architecture, vol. 117, 2021.