
Applied Sciences
Article
Fatigue Crack Detection Based on Semantic Segmentation
Using DeepLabV3+ for Steel Girder Bridges
Xuejun Jia 1,2 , Yuxiang Wang 3, * and Zhen Wang 4, *

1 College of Transportation Engineering, Nanjing Technology University, Nanjing 211899, China; [email protected]
2 China Construction Second Engineering Bureau Co., Ltd., Central China Branch, Wuhan 430062, China
3 School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology,
Wuhan 430074, China
4 School of Civil Engineering and Architecture, Wuhan University of Technology, Wuhan 430070, China
* Correspondence: [email protected] (Y.W.); [email protected] (Z.W.)

Abstract: Artificial intelligence technology is receiving more and more attention in structural health
monitoring. Fatigue crack detection in steel box girders in long-span bridges is an important and
challenging task. This paper presents a semantic segmentation network model for this task based
on DeepLabv3+, ResNet50, and active learning. First, the classification network ResNet50
is re-tuned using the crack image dataset. Second, with the re-tuned ResNet50 as the backbone
network, a crack semantic segmentation network is constructed based on DeepLabv3+ and
trained with the assistance of active learning. Finally, the probability threshold of
the pixel category is optimized to improve the pixel-level detection accuracy. Tests show that,
compared with the crack detection network based on conventional ResNet50, this model can improve
MIoU from 0.6181 to 0.7241.

Keywords: semantic segmentation; DeepLabv3+; crack detection; threshold; active learning

Citation: Jia, X.; Wang, Y.; Wang, Z. Fatigue Crack Detection Based on Semantic Segmentation Using DeepLabV3+ for Steel Girder Bridges. Appl. Sci. 2024, 14, 8132. https://doi.org/10.3390/app14188132
Academic Editor: José António Correia
Received: 23 July 2024
Revised: 31 August 2024
Accepted: 3 September 2024
Published: 10 September 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
With the continuous advancements in bridge design and construction technology, long-span steel bridges have developed rapidly. Among these, steel box-girder bridges have become popular due to their light weight, high torsional rigidity, and other advantages. However, due to the coupling effect between initial material defects and dynamic vehicle loads, fatigue cracks often occur in steel bridge joints, especially around welded joints.
Fatigue cracks in steel girder bridges pose significant safety risks and can lead to catastrophic failures if not detected and repaired in time. Traditional inspection methods, which rely heavily on manual visual inspections, are labor-intensive, time-consuming, and prone to human error. In recent years, advancements in artificial intelligence (AI) have revolutionized the field of structural health monitoring, offering more efficient, accurate, and automated solutions for fatigue crack detection. Techniques employed for fatigue crack detection in steel girder bridges include machine learning-based techniques, convolutional neural networks (CNNs), and semantic segmentation networks, among others.
Machine learning (ML) has been widely applied in structural health monitoring for feature extraction and pattern recognition. Various ML algorithms, such as support vector machines (SVMs), k-nearest neighbor (KNN), and decision trees, have been employed to classify and detect cracks in steel structures. For example, Zhang et al. [1] used SVM to classify features extracted from acoustic emission signals, achieving a high detection accuracy for fatigue cracks in steel beams. Similarly, Li et al. [2] implemented a KNN-based approach to analyze strain data collected from sensors on steel girder bridges and successfully identified crack initiation and propagation phases.

Appl. Sci. 2024, 14, 8132. https://doi.org/10.3390/app14188132 https://www.mdpi.com/journal/applsci



CNNs have gained popularity due to their superior performance in image-based
applications. CNNs can automatically extract features from raw image data, making them
highly effective for crack detection in bridge structures. By processing images captured from
inspections, CNNs can identify and classify cracks with high precision. Kim et al. [3] devel-
oped a CNN model for crack detection in steel bridges using images taken by drones. The
model achieved an accuracy of over 95% in identifying fatigue cracks. Furthermore, Yang
and Yao [4] designed a multi-scale CNN to detect cracks of varying sizes and orientations,
demonstrating robust performances under different lighting and environmental conditions.
Semantic segmentation networks, such as DeepLabv3+ [5] and U-Net, are advanced
deep learning models that provide pixel-level classification, allowing for the precise local-
ization and segmentation of cracks in images. These networks are particularly useful for
identifying small and complex crack patterns that might be missed by traditional methods.
Chen et al. [6] introduced DeepLabv3+, which utilizes atrous spatial pyramid pooling
to capture multi-scale contextual information. This model has been applied to crack de-
tection, offering high accuracy in segmenting cracks from background noise. Similarly,
Ronneberger et al. [7] proposed the U-Net architecture, which has been widely adopted for
medical image segmentation and recently adapted for structural health monitoring tasks.
Transfer learning involves pre-training a model on a large dataset and fine-tuning it
on a smaller, specific dataset. This approach is beneficial when the labeled data for crack
detection are limited, as it leverages the knowledge gained from related tasks. Xu et al. [8]
applied transfer learning by fine-tuning a pre-trained ResNet model on a dataset of bridge
crack images. The transfer learning approach significantly improved the detection accuracy,
especially in scenarios with limited training data.
The integration of AI techniques into fatigue crack detection for steel girder bridges
has significantly advanced the field of structural health monitoring. However, developing
more robust models that can handle diverse environmental disturbances and provide more
accurate prediction results requires further investigation. Aiming to achieve this objective,
this paper presents a semantic segmentation model for fatigue crack detection based on
DeepLabv3+, ResNet50, and active learning.
This paper is organized as follows. Section 2 provides a brief introduction to DeepLabv3+.
The proposed fatigue crack detection model is presented in Section 3. In Section 4, relevant
models are compared in terms of validation accuracy and mean intersection over union
(MIoU). To further improve the model performance and robustness, the pixel category
threshold optimization is discussed in Section 5. Section 6 examines and presents the
compound model by comparing the predicted fatigue cracks with the ground truth. Finally,
the main contents are summarized in Section 7.

2. Overview of Semantic Segmentation Model: DeepLabv3+


In the field of computer vision, deep learning is mainly used for three tasks:
image classification, object detection, and semantic segmentation. Image classification
predicts labels for an image. Object detection focuses on providing labels and locations for
objects in the images. The semantic segmentation technique involves labeling each pixel in
the image and hence segmenting the image into several regions with different attributes
and categories. Lately, semantic segmentation has been receiving more attention, since it
can provide more accurate locations and other information. This paper studies the fatigue
crack detection method in steel bridges based on DeepLabv3+ semantic segmentation
technology, which can determine the location, length, and shape of the crack.
DeepLabv3+ adopts an encoding–decoding architecture, as shown in Figure 1. The
encoder applies multiple dilated convolution modules with different dilation rates to
obtain semantic information at multiple scales. In the decoder, the features output by the
encoder are first up-sampled and then concatenated with the feature layer of the same
spatial resolution output by the low-level network. Finally, bilinear interpolation
up-sampling is used to restore the output feature map to the spatial resolution of the input
image, leading to semantic segmentation outputs.

Figure 1. DeepLabv3+ model.
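To make the role of the dilation rate concrete, the following is a minimal one-dimensional sketch of a dilated (atrous) convolution in plain Python. It is an illustration only, not the layer used in the paper: the function names `dilated_conv1d` and `receptive_field`, the toy signal, the kernel, and the rates 1, 2, and 3 are all assumptions chosen for readability. It shows that the same three-tap kernel covers a wider input window as the rate grows, which is how multi-scale context can be gathered without adding weights.

```python
def dilated_conv1d(signal, kernel, rate):
    """Valid-mode 1D dilated convolution (cross-correlation form).

    Kernel taps are spaced `rate` samples apart, so a larger rate
    widens the window seen by each output without adding weights.
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        acc = 0.0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * rate]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Number of input samples covered by one output sample."""
    return (kernel_size - 1) * rate + 1


# Hypothetical toy input: a ramp 0..9 and a simple difference kernel.
signal = [float(v) for v in range(10)]
kernel = [1.0, 0.0, -1.0]

for rate in (1, 2, 3):  # illustrative rates, not the paper's settings
    print(rate, receptive_field(len(kernel), rate),
          dilated_conv1d(signal, kernel, rate))
```

On the ramp input, the difference kernel returns a constant whose magnitude grows with the rate, because the two outer taps sit further apart.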


In order to accurately assess the quality of semantic segmentation results, evaluation
indicators [9], such as category accuracy, intersection over union (IoU), and mean
intersection over union (MIoU), are often employed. For readers' convenience, the latter two are
described herein. The IoU of crack pixels can be written as follows:

$$\mathrm{IoU}_i = \frac{TP_i}{TP_i + FP_i + FP_j} \tag{1}$$

where $TP_i$ is the number of pixels for true crack prediction, $FP_i$ is the number of pixels for
false crack prediction, and $FP_j$ is the number of pixels for false background (non-crack)
prediction. MIoU is defined as the average of the IoU of two categories, namely:

$$\mathrm{MIoU} = \frac{\mathrm{IoU}_i + \mathrm{IoU}_j}{2} \tag{2}$$
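Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: labels are flat lists with 1 for crack and 0 for background, the function names are hypothetical, and the background IoU is taken to be the mirror image of Equation (1) with the roles of the two classes swapped.

```python
def confusion_counts(pred, truth):
    """Count pixels per outcome; label 1 = crack (class i), 0 = background (class j)."""
    tp_i = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true crack
    fp_i = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false crack
    fp_j = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false background
    tp_j = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)  # true background
    return tp_i, fp_i, fp_j, tp_j


def iou_and_miou(pred, truth):
    """Return (IoU_crack, IoU_background, MIoU) following Equations (1) and (2)."""
    tp_i, fp_i, fp_j, tp_j = confusion_counts(pred, truth)
    union_i = tp_i + fp_i + fp_j
    union_j = tp_j + fp_j + fp_i
    iou_i = tp_i / union_i if union_i else 0.0
    iou_j = tp_j / union_j if union_j else 0.0
    return iou_i, iou_j, (iou_i + iou_j) / 2


# Hypothetical 10-pixel example: 3 crack pixels, one missed, one false alarm.
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
iou_crack, iou_background, miou = iou_and_miou(pred, truth)
print(iou_crack, iou_background, miou)
```

Because crack pixels are rare, a single missed or spurious crack pixel moves the crack IoU far more than the background IoU, which is why the two are reported separately before averaging.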
3. Proposed Fatigue Crack Detection Model Based on Semantic Segmentation
3.1. Overview of the Proposed Model
The semantic segmentation model for fatigue crack detection proposed in this paper
is shown in Figure 2. First, based on the ResNet50 network [10], the image sample dataset
is used for training the classification network ResNet50_crack. Compared with ResNet50,
ResNet50_crack captures crack features and facilitates semantic segmentation network
training in the next step. Therefore, the DeepLabv3+ semantic segmentation network,
with ResNet50_crack as the backbone network, can treat crack semantic segmentation in
complex backgrounds with smaller generalization errors and more accurate results. Active
learning [11,12] is employed to screen more valuable training images. The new dataset is
chosen to train the semantic segmentation network DeepLabv3+, and finally, the category
threshold of the model is optimized, as shown in Figure 2. The following section will
describe in detail the method of dividing the dataset, the screening method for effective
samples, and the training of the classification network ResNet50_crack and the semantic
segmentation network DeepLabv3+.
Figure 2. Fatigue crack detection model.

3.2. Classification Network ResNet50_crack
(1) Initial dataset for classification network
The original image dataset includes 120 pictures with a size of 4928 × 3264 pixels or
5152 × 3864 pixels. A total of 100 images are randomly selected for training, while the
remaining 20 images are used for testing the network. Generally, the image size has a
certain impact on the training results of the deep learning model [13,14]. Subsequently, these
large images and ground truths are cropped into samples with a size of 224 × 224 pixels,
resulting in 32,128 small image samples and corresponding ground truths. However, most
of these samples are intact and less informative for crack detection. For improving the
computation efficiency of the training process, these intact samples can be removed from
the dataset. Therefore, pixel variances in samples are evaluated and sorted, and then
the 60% of samples with the smallest variance are discarded. Subsequently, samples are
classified into crack and background datasets according to their ground truth. This process
is schematically shown in Figure 3a.
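The cropping and variance-based screening described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes grayscale images stored as nested lists, uses the population variance of pixel values (the paper does not state which variance estimator was used), and all function names are hypothetical. The 60% discard fraction comes from the text; the tiny 2 × 2 patch size is used only so the example stays readable, whereas the paper crops 224 × 224 samples.

```python
from statistics import pvariance


def crop_patches(image, size):
    """Split a 2D image (list of rows) into non-overlapping size x size patches."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches


def pixel_variance(patch):
    """Population variance of all pixel values in a patch (assumed estimator)."""
    return pvariance([value for row in patch for value in row])


def keep_high_variance(patches, discard_percent=60):
    """Sort patches by variance and drop the lowest `discard_percent` of them."""
    ranked = sorted(patches, key=pixel_variance)
    n_discard = len(ranked) * discard_percent // 100
    return ranked[n_discard:]


# Hypothetical 2 x 10 grayscale image giving five 2 x 2 patches:
# three flat (intact) patches and two textured ones.
image = [
    [0, 0, 5, 5, 0, 9, 1, 1, 0, 3],
    [0, 0, 5, 5, 9, 0, 1, 1, 3, 0],
]
patches = crop_patches(image, size=2)
kept = keep_high_variance(patches)
print(len(patches), len(kept))
```

Flat patches have zero variance and are discarded first, which matches the intent of removing intact, uninformative samples before training.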
(2) Image screening based on active learning
The concept of active learning [11,12] originally referred to the case of semi-supervised
learning (that is, a part of the data would be labeled, with the remaining parts unlabeled),
where the algorithm could actively select the samples that were more informative and
representative. These samples were artificially labeled and then added to the dataset
for training. This paper adopts this idea of active learning to screen the most important
samples (namely, those with larger cross-entropy) for re-tuning the classification network
ResNet50_crack. The specific process is shown in Figure 3b.
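The screening idea, keeping the samples on which the current classifier has the largest cross-entropy, can be sketched as below. This is an assumption-laden illustration rather than the procedure of Figure 3b: the sample tuples, the function names, and the `keep` count are hypothetical, and the paper does not state how many samples are retained.

```python
import math


def cross_entropy(prob_crack, label, eps=1e-12):
    """Binary cross-entropy of one sample; label 1 = crack, 0 = background."""
    p = min(max(prob_crack, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)


def select_informative(samples, keep):
    """Return ids of the `keep` samples with the largest cross-entropy.

    `samples` is a list of (sample_id, predicted_crack_probability, label).
    """
    ranked = sorted(samples, key=lambda s: cross_entropy(s[1], s[2]), reverse=True)
    return [sample_id for sample_id, _, _ in ranked[:keep]]


# Hypothetical classifier outputs for four samples.
samples = [
    ("a", 0.95, 1),  # confident and correct: low loss
    ("b", 0.40, 1),  # crack predicted with low probability: high loss
    ("c", 0.10, 0),  # confident and correct: low loss
    ("d", 0.70, 0),  # background predicted as crack: highest loss
]
selected = select_informative(samples, keep=2)
print(selected)
```

Samples that the classifier already handles confidently contribute little to further tuning, so ranking by loss concentrates the training budget on hard cases.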

(3) Training classification network ResNet50_crack


Dataset 2 in Figure 3c is randomly divided into two sets, namely, the training and
validation sets. This ensures that the training set and the validation set are consistent in
distribution. The final fully connected layer of ResNet50, originally sized for 1000 categories,
is modified to output 2 categories to accommodate this crack classification. The stochastic
gradient descent method is adopted to train the network, with an initial learning rate of
0.001, a maximum epoch number of 20, and a mini-batch size of 32.
The tuning process for the classification network to obtain ResNet50_crack is depicted
in Figure 3.

Figure 3. Tuning process of classification network ResNet50_crack. (a): dataset construction; (b): active learning; (c): tuning process for ResNet50_crack.

3.3. Semantic Segmentation Network DeepLabv3+ for Crack Detection


The training process for the semantic segmentation network DeepLabv3+ is shown
in Figure 4. Since this figure is very clear, the description here is omitted.

Figure 4. The training process of the semantic segmentation network DeepLabv3+.

4. Comparative Studies of Different Models
This section describes the comparison results of different models. In order to verify
the effectiveness of the improved strategy proposed in this paper, we conducted comparative
experiments based on the DeepLabv3+ algorithm. The first and second models are
DeepLabv3+ with ResNet50 and MobileNet-v2 [15] as backbone networks, respectively. They are
compared with the proposed one. It should be noted that the difference between the second
network and the proposed one is that the proposed one is based on ResNet50, which is
re-tuned as a classification network using a crack detection dataset. Every model was
trained for 20 epochs.
This comparison is carried out on the test dataset, which contains 20 raw pictures. All
investigations are performed on a Lenovo P910 workstation equipped with an Nvidia
GeForce GTX 1080 Ti GPU (Nvidia, Santa Clara, CA, USA) and MATLAB 2020a.
The training results are shown in Table 1. The lowest validation accuracy, 99.00%, is
obtained with MobileNet-v2, and the highest, 99.26%, with ResNet50_crack. Compared
with MobileNet-v2, ResNet50_crack and ResNet50 are deeper and thus have stronger
generalization ability for samples with complex backgrounds and finer cracks. Compared
with ResNet50, ResNet50_crack has undergone crack classification pre-training, and thus
the training of the semantic segmentation network is easier and reaches higher global and
validation accuracy. Table 1 also shows that the training process is slow (12.25 to 18.9 h)
but acceptable.

Table 1. Statistics for network training.

| Network Type | Time (h) | Training Accuracy of Crack Pixels | Training Accuracy of Background Pixels | Global Accuracy | Validation Accuracy |
| --- | --- | --- | --- | --- | --- |
| Model 1: DeepLabv3+, ResNet50 | 18.9 | 78.52% | 99.74% | 99.71% | 99.11% |
| Model 2: DeepLabv3+, MobileNet-v2 | 14.7 | 70.09% | 99.67% | 99.64% | 99.00% |
| Proposed Model: DeepLabv3+, ResNet50_crack, Active Learning | 12.25 | 75.59% | 99.89% | 99.86% | 99.26% |

(1) Comparison of semantic segmentation indicators
In order to measure the role and contribution of the segmentation system, its
performance needs to be rigorously evaluated. Moreover, the evaluation must use standard
and recognized methods to ensure fairness. The commonly used indicators of
semantic segmentation include accuracy, intersection over union (IoU), and mean
intersection over union (MIoU).
The performance indicators of three models for the test dataset are summarized in
Table 2. Clearly, the semantic segmentation network based on MobileNet-v2 performs the
worst in terms of IoU for crack pixels and MIoU for the model. Conversely, the proposed
model outperforms the others, which can be attributed to re-tuning the classification
network ResNet50 and active learning.
The crack distribution predicted by Model 1 and the proposed model are compared in
Figure 5 together with the ground truth. Clearly, the proposed model can provide results
that are more consistent with the ground truth.

Table 2. Semantic segmentation indicators for the test dataset.

| Network Type | IoU (Crack) | IoU (Background) | MIoU |
| --- | --- | --- | --- |
| Model 1: DeepLabv3+, ResNet50 | 0.2391 | 0.9971 | 0.6181 |
| Model 2: DeepLabv3+, MobileNet-v2 | 0.1910 | 0.9964 | 0.5937 |
| Proposed Model: DeepLabv3+, ResNet50_crack, Active Learning | 0.3897 | 0.9986 | 0.6942 |
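As a quick consistency check, the MIoU column of Table 2 can be recomputed from the two IoU columns using the two-class average of Equation (2). The worked arithmetic below uses only the values reported in the table; the rounding to four decimals is an assumption about how the reported figures were obtained.

```latex
\begin{align*}
\text{Model 1:}  \quad & \mathrm{MIoU} = \frac{0.2391 + 0.9971}{2} = 0.6181 \\
\text{Model 2:}  \quad & \mathrm{MIoU} = \frac{0.1910 + 0.9964}{2} = 0.5937 \\
\text{Proposed:} \quad & \mathrm{MIoU} = \frac{0.3897 + 0.9986}{2} = 0.69415 \approx 0.6942
\end{align*}
```

Since the background IoU is above 0.99 for all three models, the differences in MIoU are driven almost entirely by the crack IoU.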


Figure 5. Cracks predicted by different models: (a,d,g) Model 1; (b,e,h) proposed model; (c,f,i) ground truth.

5. Pixel Category Threshold Optimization
The abovementioned analysis shows that the proposed model tends to classify more
pixels as crack ones, indicating a relatively good classification accuracy and poor IoU for
the crack pixels. Therefore, in order to further improve the effect of semantic segmentation,
this paper sets the crack pixel category determination threshold to be greater than 0.5.
Prior to setting this appropriate threshold, one must examine whether the relationships
between the model performance and category threshold exhibit similarity for the training
and test datasets. This similarity implies its generalization; an optimal threshold chosen
from the training dataset can be adopted for the test dataset. To reveal this, the classification
accuracy and IoU for the crack pixels and the MIoU for the model are evaluated with
different thresholds using the training and test datasets, which are depicted in Figure 6.
It can be observed that (i) the performance indicators for the training dataset are always
superior to those for the test dataset; (ii) the ideal similarity exists between two datasets,
and thus good generalization is expected for this threshold; (iii) as the threshold increases,
the accuracy of the crack pixel detection always decreases, whereas the IoU and the MIoU
increase with the threshold <0.9 and decline with the threshold >0.9. Actually, a larger
threshold means that fewer pixels are classified as crack, indicating fewer true positive
crack pixels and lower accuracy. As defined in Equation (1), the IoU increases in that
its denominator is effectively reduced. Notably, the threshold step herein is 0.05 to save
computation time.
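The threshold sweep described above can be sketched as follows. This is a toy illustration, not the authors' MATLAB procedure: the probabilities and labels are invented, the function names are hypothetical, and crack pixel accuracy is assumed to mean the fraction of true crack pixels that are classified as crack. Thresholds are generated from integer percentages to avoid floating-point drift; the 0.05 step matches the coarse sweep in the text.

```python
def metrics_at_threshold(probs, truth, threshold):
    """Return (crack accuracy, crack IoU, MIoU) for one category threshold."""
    pred = [1 if p >= threshold else 0 for p in probs]
    tp_i = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp_i = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fp_j = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tp_j = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    crack_total = tp_i + fp_j
    union_i = tp_i + fp_i + fp_j
    union_j = tp_j + fp_j + fp_i
    accuracy = tp_i / crack_total if crack_total else 0.0
    iou_i = tp_i / union_i if union_i else 0.0
    iou_j = tp_j / union_j if union_j else 0.0
    return accuracy, iou_i, (iou_i + iou_j) / 2


def sweep_thresholds(probs, truth, start_pct=50, stop_pct=100, step_pct=5):
    """Evaluate thresholds start_pct/100, ..., below stop_pct/100."""
    rows = []
    for pct in range(start_pct, stop_pct, step_pct):
        threshold = pct / 100
        rows.append((threshold,) + metrics_at_threshold(probs, truth, threshold))
    return rows


# Hypothetical per-pixel crack probabilities and ground-truth labels.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.99, 0.92, 0.60, 0.88, 0.70, 0.55, 0.20, 0.05]

results = sweep_thresholds(probs, truth)
best = max(results, key=lambda row: row[3])  # row = (threshold, acc, IoU, MIoU)
print(best)
```

A finer sweep around the coarse optimum can reuse `sweep_thresholds` with `step_pct=1`, mirroring the 0.01 step the paper applies near the peak. In practice, the threshold would be selected on the training data and then applied unchanged to the test data.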

Figure 6. Semantic segmentation indicators with different category thresholds.

It can be seen from Figure 6 that the maximum values of the IoU and MIoU must
correspond to a threshold around 0.9. In order to precisely identify the optimal threshold,
a smaller sampling step, namely, 0.01, is chosen and adopted for analysis, with the results
plotted in Figure 7. The results show that when the threshold is 0.89, the highest MIoU is
0.7466 for the training dataset and 0.7231 for the test dataset. This means that 0.89 is the
optimal threshold value.

Figure 7. Semantic segmentation indicators around the indicator peak.
6. Fatigue Crack Detection Results Using the Proposed Model
Since fatigue cracks in steel box girders in long-span bridges are slender, we use the
original map, ground truth map, and prediction map to visualize the results of crack
detection, as shown in Figure 8. It can be seen that the predicted results are considerably
consistent with the ground truth, although the raw pictures are very complex in terms of
detection owing to various disturbances. This illustrates the favorable performances of the
proposed model.

Figure 8. Selected fatigue crack detection results using the proposed model (columns: raw pictures, ground truth, prediction).

7. Conclusions
This paper presents a semantic segmentation network model for fatigue crack detection
based on DeepLabv3+, ResNet50, and active learning. The classification network ResNet50
is re-tuned using the crack image dataset, leading to ResNet50_crack, which is adopted
as the backbone network for DeepLabv3+ to construct a crack semantic segmentation
network. This network was trained with the assistance of active learning, followed by the
optimization of the probability threshold of the pixel category. Compared with the crack
detection network based on conventional ResNet50, this model can improve MIoU from
0.6181 to 0.7241. In future research, we aim to further improve the detection accuracy of the
model for small cracks and to achieve the automatic and fast calculation of crack width,
length, and other information based on the Transformer network.

Author Contributions: Conceptualization, X.J.; Writing—original draft, Y.W.; Writing—review and
editing, X.J. and Z.W. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets presented in this article are not readily available because
the data are part of an ongoing study.
Conflicts of Interest: Author Xuejun Jia was employed by the company China Construction Second
Engineering Bureau Co., Ltd. The remaining authors declare that the research was conducted in the
absence of any commercial or financial relationships that could be construed as a potential conflict
of interest.

References
1. Zhang, Y.; Li, H.; Wang, Y. Fatigue Crack Detection in Steel Beams Using Support Vector Machines. J. Struct. Health Monit. 2020,
15, 234–246.
2. Li, Z.; Zhao, J.; Chen, Q. K-Nearest Neighbors-Based Crack Detection Using Strain Data from Steel Girder Bridges. Struct. Control
Health Monit. 2019, 26, e2467.
3. Kim, S.; Park, J.; Choi, J. Drone-Based Crack Detection in Steel Bridges Using Convolutional Neural Networks. Autom. Constr.
2021, 126, 103675.
4. Yang, Y.; Yao, X. Multi-Scale Convolutional Neural Network for Crack Detection in Steel Bridges. Eng. Struct. 2023, 259, 114245.
5. Chen, L.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep
Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
[CrossRef]
6. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic
Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September
2018; Springer: Cham, Switzerland, 2018; pp. 801–818.
7. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image
Computing and Computer-Assisted Intervention (MICCAI); Springer: Cham, Switzerland, 2015; Volume 9351, pp. 234–241.
8. Xu, K.; Zhang, J.; Li, W. Transfer Learning for Crack Detection in Steel Bridges. Comput.-Aided Civ. Infrastruct. Eng. 2023, 38,
265–278.
9. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 15 October 2015; pp. 3431–3440.
10. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 12 December 2016; pp. 770–778.
11. Settles, B. Active Learning Literature Survey; Computer Sciences Technical Report 1648; University of Wisconsin-Madison: Madison,
WI, USA, 2009.
12. Wang, Z.; Xu, G.; Ding, Y.; Wu, B.; Lu, G. A vision-based active learning convolutional neural network model for concrete surface
crack detection. Adv. Struct. Eng. 2020, 23, 2952–2964. [CrossRef]
13. Rukundo, O. Effects of Image Size on Deep Learning. Electronics 2023, 12, 985. [CrossRef]
14. Rukundo, O. Evaluation of extra pixel interpolation with mask processing for medical image segmentation with deep learning.
Signal Image Video Process. 2024, 18, 1–8. [CrossRef]
15. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
