An Explainable Artificial Intelligence-Based Approach For Reliable Damage Detection in Polymer Composite Structures Using Deep Learning
DOI: 10.1002/pc.29055
RESEARCH ARTICLE
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided
the original work is properly cited.
© 2024 The Author(s). Polymer Composites published by Wiley Periodicals LLC on behalf of Society of Plastics Engineers.
KEYWORDS
composite structures, damage detection, explainable artificial intelligence, laminated
composites, polymer composites, structural health monitoring, vision transformer
FIGURE 1 Proposed workflow for developing an explainable ViT-based deep learning model for damage detection in laminated
composites.
typical CNN with local receptive fields. This is achieved by directing attention towards distinct regions of images and combining feature information from the whole image. Due to these advantages, attention-based ViT models have been utilized in numerous other engineering applications.36,40,41 However, this study is the first implementation of the ViT model for damage detection in polymer composite structures.

To recognize the principle and structure of the ViT model, it is important to first understand the single attention mechanism. For a given input sequence $X = (x_1, x_2, x_3, \ldots, x_n)$, the self-attention mechanism acquires knowledge about the entities of the sequence and provides the output sequence $Z = (z_1, z_2, z_3, \ldots, z_n)$. Each input sequence vector is transformed into three separate vectors q, k, and v, called the query vector, key vector, and value vector, respectively. Therefore, the attention function (AF) with key vector dimension $d_k$ can be expressed as:

$$\mathrm{AF}(q, k, v) = \mathrm{SoftMax}\!\left(\frac{q k^{T}}{\sqrt{d_k}}\right) v \qquad (1)$$

where non-linearity is introduced in the AF using the SoftMax activation to capture intricate patterns and representations.42
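As an illustrative aid, Equation (1) can be sketched in a few lines of NumPy; the sequence length, vector dimension, and random weight matrices below are arbitrary choices for demonstration and do not reproduce the implementation used in this study.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention of Eq. (1): SoftMax(q k^T / sqrt(d_k)) v."""
    d_k = k.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)           # similarity of every query with every key
    return softmax(scores, axis=-1) @ v       # attention-weighted sum of value vectors

# Example: a sequence of n = 16 embeddings of dimension d = 8
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
z = attention(x @ Wq, x @ Wk, x @ Wv)         # output sequence Z with shape (16, 8)
```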
This study utilized the advantages of a ViT-based deep learning model using a 2D image-based dataset. The conceptual framework of the image-based ViT model for this study is shown in Figure 2. In this case, the input image $x \in \mathbb{R}^{H \times W \times K}$ is divided into a series of smaller patches $x_P \in \mathbb{R}^{N \times (q^2 \cdot K)}$ (such as 16 patches in a $4 \times 4$ grid). Herein, H and W represent the height and width of the raw image in pixels, K is the number of image channels, and $(q \times q)$ represents the patch size in pixels. Thus, the length of the input sequence for the ViT network equals the total number of small patches, given as:

$$N = \frac{HW}{q^2} \qquad (2)$$
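For concreteness, the patch count of Equation (2) and the patch extraction itself can be sketched as follows; the 112 × 112 image size and 28 × 28 patch size mirror the configuration adopted later in this work, while the reshaping route shown is only one convenient way to form the patches.

```python
import numpy as np

H, W, K = 112, 112, 3        # image height, width, and channels
q = 28                       # patch size in pixels

N = (H * W) // q**2          # Eq. (2): number of patches, here 16 (a 4 x 4 grid)

image = np.zeros((H, W, K))
# Split into N non-overlapping q x q patches, each flattened to q*q*K values
patches = (image.reshape(H // q, q, W // q, q, K)
                .transpose(0, 2, 1, 3, 4)
                .reshape(N, q * q * K))
print(N, patches.shape)      # 16 (16, 2352)
```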
Next, the aforementioned single attention function is implemented H times through the multi-head attention mechanism. The resulting multi-head attention ($\mathrm{MH}_{attention}$) function is expressed as:

$$\mathrm{MH}_{attention}(q, k, v) = \mathrm{Concatenate}(h_1, h_2, h_3, \ldots, h_H)\, W^{O} \qquad (3)$$

$$h_i = \mathrm{Attention}\!\left(q W_i^{q}, k W_i^{k}, v W_i^{v}\right), \quad i = 1, 2, 3, \ldots, H \qquad (4)$$

where $W_i^{q}$, $W_i^{k}$, $W_i^{v}$, and $W^{O}$ are the learnable weight matrices of the ViT module, and H represents the number of attention heads.
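A corresponding sketch of Equations (3) and (4), reusing the attention function from the earlier sketch; the head count and projection sizes are illustrative assumptions.

```python
import numpy as np

def multi_head_attention(q, k, v, Wq, Wk, Wv, Wo):
    """Eqs. (3)-(4): run attention once per head, concatenate, and project with W_O."""
    heads = [attention(q @ Wq[i], k @ Wk[i], v @ Wv[i]) for i in range(len(Wq))]
    return np.concatenate(heads, axis=-1) @ Wo

# Example with H = 2 heads on a (16, 8) input sequence
rng = np.random.default_rng(1)
x = rng.normal(size=(16, 8))
n_heads, d, d_head = 2, 8, 4
Wq = [rng.normal(size=(d, d_head)) for _ in range(n_heads)]
Wk = [rng.normal(size=(d, d_head)) for _ in range(n_heads)]
Wv = [rng.normal(size=(d, d_head)) for _ in range(n_heads)]
Wo = rng.normal(size=(n_heads * d_head, d))
out = multi_head_attention(x, x, x, Wq, Wk, Wv, Wo)   # shape (16, 8)
```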
Meanwhile, a positional encoding layer is used to prevent the loss of position information from the input image. The position encoding process involves mapping an integer to a vector so that each vector can contain both the information and the location of the patch. Moreover, an extra class embedding vector is also introduced to track the image class during learning. Ultimately, the data is fed into a conventional transformer architecture to capture global correlations and features for each class, which also consists of a multi-layer perceptron (MLP) layer that helps introduce non-linearity to extract complex features. These features are then input to the classifier, which may contain several layers to detect and classify damage.

2.2 | Explainable artificial intelligence (XAI)

In recent years, XAI has gained significant attention as a research topic in the area of artificial intelligence to comprehend the reasons behind the decisions of AI models. Generally, the explanation of AI models is divided into four axes using a hierarchical classification scheme: (1) data explanation, (2) model explanation, (3) post-hoc explanation, and (4) assessment of explanations.43 Data explainability encompasses a collection of methodologies that aim to better understand the data used to develop AI models. The objective of model explainability is to develop models that possess inherent clarity or are considered to be intrinsically explainable. Post-hoc explainability refers to the retrospective process of interpreting and providing human-understandable explanations for a trained model's decisions, offering insight into factors influencing specific predictions.
FIGURE 3 Converting the ViT model to an explainable ViT model to develop a reliable and interpretable model for the end user.
TABLE 1 Nomenclature and description of health conditions of composites.

State of the composites    Nomenclature    No. of samples    No. of experiments
Healthy                    H               5                 10
Delamination-1             D1
Delamination-2             D2
FIGURE 4 Fabrication of CFRP laminates: (A) orientation of the plies, (B) hot press compression molding machine used to prepare samples, and (C) health states of the composite where delaminations are seeded in the midplane of the composite at different locations.
FIGURE 5 Experimental setup comprising various units for collecting vibrational data from laminated composites.
obtain an extensive dataset. The ply orientation, fabrication setup, and damage configuration are shown in Figure 4. During fabrication, the healthy composite was developed by pressing eight layers in the compression molding machine. In the case of delaminated composites, delaminations were seeded in the midplane by inserting a PTFE (Teflon) film with a thickness of 0.3 mm. For the vibration testing, a cantilever beam configuration with one fixed and one free end was adopted. During the tests, the fixed end was clamped to the shaker, simulating two delamination cases: (1) delamination near the clamped end, and (2) delamination near the free end, as shown in Figure 4C. An accelerometer was affixed to the top surface of the composite sample to record vibration signals. The excitation system was placed in front of the shaker, while the data acquisition system was placed after the accelerometer. The excitation system consisted of an excitation data acquisition system (E-DAS) that acquired data from a LabVIEW PC using MATLAB Simulink, followed by an amplifier and the shaker. The acquisition system consisted of the accelerometer, followed by the amplifier and another data acquisition system to collect the response data. The experimental configuration and its components are shown in Figure 5. The proposed experimental setup was built to illustrate the variability of vibrational data when delamination damages of the same size are positioned at distinct locations.
Therefore, to increase the reliability and accuracy of SHM systems, it is essential to comprehend how the location of delamination impacts vibrational signals to prevent damage to the composite structure. Additionally, to enhance the practicality and dependability of the suggested methodology, 10 responses were obtained from each specimen in each health state.

3.2 | Data pre-processing

3.2.1 | Signal-to-image conversion
Continuous wavelet transform (CWT) is an effective method for representing time series signals as time–frequency scalograms.44 Feature extraction is commonly conducted in either the temporal or spectral domain; however, CWT gives considerable discriminative features in both domains. Additionally, the use of CWT-based images improves diagnostic accuracy through automatic feature extraction, eliminating the need for manual feature engineering and extraction.18,45 The mother wavelet function $\varphi(t)$, belonging to the $L^2(\mathbb{R})$ space and subjected to the Fourier transformation, meets the following criterion:

$$\int_{-\infty}^{+\infty} \frac{|\hat{\varphi}(\omega)|^2}{\omega}\, d\omega < \infty \qquad (5)$$

where $\hat{\varphi}(\omega)$ denotes the Fourier transform of $\varphi(t)$, and $\omega$ represents the frequency. Moreover, the set of wavelet functions can be derived from the following fundamental wavelet function:

$$\varphi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \varphi\!\left(\frac{t-b}{a}\right), \quad a, b \in \mathbb{R},\; a > 0 \qquad (6)$$

$$t = a t' + b \qquad (7)$$

where $\varphi_{a,b}(t)$ denotes the mathematical wavelet, while a and b represent the scale factor and shift factor, respectively. In this case, the scale and shift of the corresponding wavelet function are controlled in the spectral and temporal domains, respectively, by the scale and shift factors. For an arbitrary signal of finite length $x(t) \in L^2(\mathbb{R})$, the CWT can be obtained from the formula:

$$\mathrm{CWT}_x(a, b) = \langle x(t), \varphi_{a,b}(t)\rangle = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \varphi\!\left(\frac{t-b}{a}\right) dt \qquad (8)$$

$$\varphi(t) = C e^{-\frac{t^2}{2}} \cos(5t) \qquad (9)$$

Figure 6 illustrates the process of converting temporal data into a wavelet-enhanced representation. Initially, a sliding window of 1875 data points with a shift of 625 data points is selected from the signals of all health states. This process produces a dataset of 2980 signal windows for each health state. These signal windows are then processed by CWT to obtain scalogram images. The signal windows for all health states are mapped to scalogram (time–frequency) images, resulting in 2980 images for each health state. The scalograms are then resized to 112 × 112 dimensions, which are suitable for feature extraction and classification models.
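A hedged sketch of this window-to-scalogram step using PyWavelets is given below. The paper specifies only the window length (1875 samples), the shift (625 samples), and the final 112 × 112 image size; the Morlet wavelet name ("morl"), the scale range, and the use of PIL for resizing are assumptions made for illustration.

```python
import numpy as np
import pywt
from PIL import Image

def signal_to_scalograms(x, win=1875, shift=625, scales=np.arange(1, 65),
                         wavelet="morl", out_size=(112, 112)):
    """Slide a window over the raw vibration signal and convert each window
    into a resized CWT scalogram image (Eq. 8)."""
    images = []
    for start in range(0, len(x) - win + 1, shift):
        coeffs, _ = pywt.cwt(x[start:start + win], scales, wavelet)
        mag = np.abs(coeffs)
        mag = (255 * (mag - mag.min()) / (mag.max() - mag.min() + 1e-12)).astype(np.uint8)
        images.append(np.array(Image.fromarray(mag).resize(out_size)))
    return np.stack(images)

# Example on a synthetic vibration-like signal
t = np.linspace(0, 10, 20000)
scalograms = signal_to_scalograms(np.sin(2 * np.pi * 50 * t))
print(scalograms.shape)   # (number of windows, 112, 112)
```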
3.2.2 | Data augmentation

A frequently observed issue in deep learning-based SHM methods is insufficient data size. As deep learning approaches increase in scale and complexity, a significant dataset is required to obtain optimal performance from these models. Several data augmentation techniques are available to increase the size of the training data by producing artificial samples. In this study, data augmentation was performed by adding zero-mean Gaussian noise. Adding noise to the signals acts as regularization that prevents overfitting, resulting in improved generalization capability and robustness of the deep learning model.47
FIGURE 6 Illustration of extracting a window from the raw signal of multiple health states, converting it into scalogram images, and augmenting the data by adding zero-mean Gaussian noise.
The Gaussian noise has a probability density function (PDF) similar to the normal distribution, and to augment the dataset, a random Gaussian function $G(t)$ can be incorporated into the original data to introduce noise.48 For the vibration signal $x(t)$, the PDF can be formulated as:

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \qquad (10)$$

where $\sigma$ is the standard deviation of the augmentation filter and $\mu$ is the zero mean. The augmented data is then obtained by adding the noise data to the original data as:

$$\tilde{x}(t) = x(t) + G(t) \qquad (11)$$

Due to the equal amount of positive and negative noise, the zero-mean value of the noise does not contribute to the net disturbance and is ultimately eliminated from the system. Moreover, a standard deviation, or noise level, of 10% was chosen during data augmentation as it has been beneficial in detecting damage in laminated composites.49 The data augmentation process is presented in Figure 6. Another 2980 images were obtained for each class through the data augmentation process.
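A minimal sketch of the augmentation in Equations (10) and (11) is shown below; scaling the 10% noise level by the signal's standard deviation is an assumption, as the reference quantity for the noise level is not stated explicitly.

```python
import numpy as np

def augment_with_gaussian_noise(x, noise_level=0.10, seed=None):
    """Return x_tilde(t) = x(t) + G(t) (Eq. 11), with G ~ N(0, sigma^2) and
    sigma set to `noise_level` times the standard deviation of the signal."""
    rng = np.random.default_rng(seed)
    sigma = noise_level * np.std(x)
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)

# Example: double the dataset by pairing each signal with a noisy copy
signals = np.random.default_rng(0).normal(size=(10, 1875))
augmented = np.concatenate([signals, augment_with_gaussian_noise(signals, seed=1)])
print(augmented.shape)    # (20, 1875)
```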
4 | RESULTS AND DISCUSSION

After the pre-processing stage, corresponding to the D1, D2, and H classes, a total of 17,880 scalograms are obtained. To develop the proposed ViT-based explainable damage detection model, the entire dataset is split into training, testing, and validation data. The first seven random responses, that is, 70% of the scalograms (12,516), are taken into account for training; two random responses, that is, 20% of the scalograms (3588), are used for testing; and the remaining single random response, that is, 10% of the scalograms (1776), is used to validate the ViT network. This partitioned dataset is used in the next sections to develop the damage detection model for composites. The model was developed on a system equipped with an AMD Ryzen 5 3500X processor, Windows 10 operating system, DDR4 RAM (32 GB), and an NVIDIA GeForce GTX 1650 (4 GB) graphics card.

4.1 | ViT model architecture and training

This section describes the process of developing the ViT model architecture and defining optimal hyperparameters for damage detection of composites. The hyperparameters of the ViT model include the number of patches, dimensionality of the model (DM), number of transformer heads (NTH), dimension of the intermediate layer of the multilayer perceptron (MLP) model (DMLP), and number of transformer blocks. Moreover, the number and nature of layers in the classifier also act as hyperparameters of the model. The optimal architecture of the proposed ViT model, obtained after several iterations, is presented in Figure 7 along with its hyperparameters. In the ViT model, the input image is first divided into smaller patches with non-overlapping regions. These patches allow the ViT model to process images in a more scalable and computationally efficient manner, especially for large images.39 In our study, a 4 × 4 grid consisting of 16 square patches of pixel size (28 × 28) was found to be optimal, providing effective discriminative features in the time-frequency domain.
FIGURE 7 Detailed ViT model architecture with optimal hyperparameters and data dimensions for each stage of the damage detection process.
Increasing the number of patches may provide enhanced features but would increase the computational cost; thus, further increasing the number of patches was not considered beneficial. The patches are then fed into a positional encoding block where the position of each patch is preserved. An additional class embedding is also introduced here to learn the class of the data. Both the positional and class embeddings are flattened and then mapped to the DM dimension. In this study, DM = 8 was found to be optimal for the embedding space. Thus, instead of 3 RGB channels, an 8-dimensional vector was found to effectively represent each patch, resulting in (16 × 784 × 8) as the model representation space. The encoded feature space is used as input to the transformer block, where it is normalized before being fed into the multi-head attention. NTH = 2 is used in the multi-head attention block to compute attention weights across multiple subspaces of the input. This is followed by another normalization layer and an MLP block that processes the output of the attention mechanism. The MLP block consists of a feed-forward neural network with DMLP = 16, controlling the complexity and capacity of the network to capture non-linear relationships in the data while maintaining the computational efficiency of the model. Two such transformer blocks are added sequentially to extract high-quality discriminative features autonomously. The output of the transformer blocks contains the features from the input data with dimensions of 16 × 784 × 8. These feature vectors are then processed by the classifier to perform damage diagnosis. The classifier block contains normalization, flattening, dropout, and several dense layers. The dimensions of the respective layers are shown in Figure 7. Herein, a dropout layer with a 10% dropout rate was utilized to avoid overfitting the classifier. Moreover, the dimension of the final dense layer, only (16 × 3), demonstrates that an exceptionally small final feature vector is required. Such a small dimension of the final layer indicates that the ViT model can extract high-quality features without the need for excessive computational resources.
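The following Keras sketch approximates the architecture described above (28 × 28 patches, DM = 8, NTH = 2, DMLP = 16, two transformer blocks, 10% dropout, three-class SoftMax output). The framework choice, the convolutional patch projection, the omission of the class token, and the width of the hidden dense layer are assumptions, since the paper does not report these implementation details.

```python
import tensorflow as tf
from tensorflow.keras import layers

class PatchEncoder(layers.Layer):
    """Adds a learnable positional embedding to the projected patches."""
    def __init__(self, num_patches, dim):
        super().__init__()
        self.pos_emb = layers.Embedding(num_patches, dim)
        self.positions = tf.range(num_patches)

    def call(self, patches):
        return patches + self.pos_emb(self.positions)

def build_vit(image_size=112, patch=28, dm=8, n_heads=2, d_mlp=16,
              n_blocks=2, n_classes=3):
    n_patches = (image_size // patch) ** 2             # 16 patches (Eq. 2)
    inputs = layers.Input((image_size, image_size, 3))

    # Non-overlapping patches, linearly projected to the model dimension DM = 8
    x = layers.Conv2D(dm, kernel_size=patch, strides=patch)(inputs)
    x = layers.Reshape((n_patches, dm))(x)
    x = PatchEncoder(n_patches, dm)(x)                  # positional embedding

    for _ in range(n_blocks):                           # two transformer blocks
        h = layers.LayerNormalization()(x)
        h = layers.MultiHeadAttention(num_heads=n_heads, key_dim=dm)(h, h)
        x = layers.Add()([x, h])
        h = layers.LayerNormalization()(x)
        h = layers.Dense(d_mlp, activation="relu")(h)   # MLP with DMLP = 16
        h = layers.Dense(dm)(h)
        x = layers.Add()([x, h])

    # Classifier: normalization, flattening, 10% dropout, dense layers
    x = layers.LayerNormalization()(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.1)(x)
    x = layers.Dense(64, activation="relu")(x)          # assumed hidden width
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_vit()
model.summary()
```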
Finally, the model was trained for 40 epochs using the Adam optimizer at a learning rate of $5 \times 10^{-5}$.
FIGURE 8 Accuracy and loss curves of the ViT model using training and testing datasets.
To introduce non-linearity into the model, ReLU activation is used in all dense layers except the last dense layer (the classification layer), which uses SoftMax activation. Moreover, due to the multi-class problem, the categorical cross-entropy loss function is utilized during model compilation. The progression of the accuracy and loss curves with respect to the number of epochs for the training and testing data is shown in Figure 8. Both the accuracy and loss curves showed that the ViT model learns a substantial amount of information in the initial epochs. After that, the model starts to converge after 25 epochs, and the curves tend to flatten out. The trained ViT model demonstrated training and testing accuracy of 99.30% and 97.77%, respectively. Similarly, a loss of $2.06 \times 10^{-2}$ is observed for the training data, and $6.38 \times 10^{-2}$ for the test data. This trained model is saved and validated on the unseen validation dataset in the next section.
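Continuing the Keras sketch above, the compilation and training settings reported here can be expressed as follows; the dummy arrays only stand in for the scalogram splits, and the batch size is an assumption.

```python
import numpy as np
import tensorflow as tf

# Dummy arrays standing in for the 70/20/10 scalogram splits described earlier
x_train = np.random.rand(64, 112, 112, 3).astype("float32")
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 3, 64), 3)
x_test = np.random.rand(16, 112, 112, 3).astype("float32")
y_test = tf.keras.utils.to_categorical(np.random.randint(0, 3, 16), 3)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(x_train, y_train,
                    validation_data=(x_test, y_test),   # curves of Figure 8
                    epochs=40, batch_size=32)           # batch size assumed
```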
4.2 | Validation of the ViT model

During training and testing, the weights of the ViT model were being learned and updated. However, during validation, all model weights are fixed, and the saved model is used to predict the unseen validation data. Moreover, instead of using only accuracy as the evaluation metric, a comprehensive evaluation is performed through additional metrics such as the confusion matrix, precision, recall, F1-score, receiver operating characteristic (ROC) curve, and area under the curve (AUC). The confusion matrix with precision (P), recall (R), and F1-scores (F1) for each class predicted by the ViT model is shown in Figure 9. The results demonstrate that all classes are well predicted by the ViT model. The prediction accuracy of D1, D2, and H is 98.82%, 98.48%, and 99.16%, respectively. Similarly, the other evaluation metrics such as P, R, and F1 were also found to be higher than 98%. Moreover, it should be noted that the ViT model performs exceptionally well not only in identifying the healthy state but also in differentiating the same type of D1 and D2 delaminations present at different locations. The ROC curve also shows that the true positive rate (sensitivity), which is the ratio of accurately predicted positive observations to all actual positives, is high (≈1), and the false positive rate, which is the ratio of inaccurately predicted positive observations to all actual negatives in the data, is low (≈0). This indicates that the ViT model achieves almost perfect sensitivity (all positives correctly identified) with no false positives. According to the ROC curve analysis, the AUC was determined to be 99.95%, 99.89%, and 99.98% for D1, D2, and H, respectively. Thus, the proposed approach showed improved performance for detecting damage in polymer composites as compared to the existing state-of-the-art models, as shown in Table 2. Moreover, t-distributed stochastic neighbor embedding (t-SNE) is applied to the unseen validation data to visualize the features extracted by the ViT model and compared with the original data, as shown in Figure 10. It can be observed that the original data is significantly scattered and cannot be separated, especially in the case of D1 and D2. However, the features extracted from the ViT model form good clusters and show clear separation not only between healthy and damaged cases but also in the identification of D1 and D2. These results therefore confirm that the proposed ViT model provides better damage detection performance for polymer composite structures.
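The metric computations referred to above can be sketched with scikit-learn as follows, continuing the earlier Keras example; `x_val` and `y_val` are placeholders for the unseen validation split and its one-hot labels.

```python
import tensorflow as tf
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.manifold import TSNE

probs = model.predict(x_val)              # class probabilities from the trained ViT
y_pred = probs.argmax(axis=1)
y_true = y_val.argmax(axis=1)

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=["D1", "D2", "H"]))
print("Per-class AUC:", roc_auc_score(y_val, probs, average=None))

# 2-D t-SNE embedding of penultimate-layer features (a Figure 10-style view)
feature_model = tf.keras.Model(model.input, model.layers[-2].output)
embedded = TSNE(n_components=2, random_state=0).fit_transform(
    feature_model.predict(x_val))
```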
4.3 | Explainable vision transformer (X-ViT)

In this study, a novel patch attention technique was proposed to facilitate the explanation of the ViT model.
FIGURE 9 Evaluation of ViT model predictions based on unseen validation data: (A) confusion matrix, precision, recall, and F1-score, (B) ROC curve for D1, (C) ROC curve for D2, and (D) ROC curve for H.
TABLE 2 The performance comparison of the proposed approach with existing state-of-the-art models for damage detection of polymer composites.

Model             Accuracy    Precision    Recall    F1-score    Explainability
SVM [42]          85.00       84.87        84.46     84.50
XGBoost [42]      84.67       84.50        84.42     84.34
CNN [42]          95.44       95.57        95.49     95.40
VGG-16 [50]       93.66       93.67        93.67     94.00
VGG-19 [50]       91.33       91.33        91.33     90.33
Xception [50]     84.66       85.00        84.67     84.33
ResNet [50]       96.67       96.67        96.00     96.33
Proposed X-ViT    98.82       98.82        98.82     98.82
FIGURE 10 Visual comparison of model features using t-SNE for (A) the original validation dataset, and (B) features of validation data from the last layer of the ViT model.
FIGURE 11 End-user interpretation of the X-ViT model: (A) normalized patch attentions, and (B) patch attentions overlaid on each health state of the composites.
Generally, attention maps are used to interpret deep learning models and provide a non-linear explanation for predictions.31,51 This implies that attention maps provide localized attention that may not effectively capture global dependencies, focusing more on local regions in a non-linear manner. Moreover, while attention maps provide insight into the model's decision-making process, interpreting non-linear ratings can be challenging for the end user, especially for complex models with multiple layers. Therefore, this study aims to provide a linear and easy-to-understand interpretation based on patch attention information. Since patching is an essential part of the ViT model, patch attention identifies patches or regions that contain significant characteristics of a specific health state. In our study, patch attention is calculated using the weights of each layer of the ViT model. Generally, the attention mappings in each layer of a multi-layer ViT model are learned individually, adding a significant number of parameters and limiting the generalization of the model. However, in the proposed X-ViT model, patch attentions from various layers are connected, and the subsequent layer directly uses the information from earlier layers to create a more beneficial dependency structure. Next, this knowledge is aggregated across the entire dataset, and significant patches for each health state are identified, establishing the X-ViT model. The normalized patch attention and its overlap with each health state of the composites are shown in Figure 11. The figure indicates that the patch attention is high for the first four patches in all health states, which means that most of the composite possesses a healthy status. None of the other patches contributed sufficiently to the healthy state H. However, in the case of D1 and D2, several other patches receive high attention. In the case of D1, patches 4, 5, and 7 also contribute 0.07, 0.06, and 0.06, respectively, to the D1 delamination classification, while patches 5–8 and 12 contribute to the D2 classification. Moreover, the combined eight patches in the middle contribute significantly to the damage cases (0.42 for D1 and 0.52 for D2) but only 0.01 for H, indicating that the middle region of the scalogram images is important for delamination detection. Similarly, the bottom figures highlight the importance of these patches in a way that the end user can understand while making decisions. The increased transparency shows the area of interest for each class. High transparency can be observed in the first four patches of the healthy state, indicating that only the top region is sufficient to predict H. However, the transparency of the eight patches in the middle increases in the scalograms with delamination, indicating that the middle patches contain the knowledge of the delaminations. The bottom four patches are highly opaque for all classes, which indicates that they do not contribute to the prediction of any of the health states D1, D2, or H. Thus, these visual interpretations are beneficial in increasing the transparency and reliability of deep learning models, making it easier for the end user to make a decision.
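The exact patch-attention construction follows Figure 11 and the description above; as one plausible reading, the sketch below chains head-averaged attention matrices across layers, averages the resulting per-patch scores over all samples of each health state, and normalizes them. The attention matrices are assumed to have been extracted from the trained model beforehand (for example, by calling the attention layers with return_attention_scores=True in Keras); this is an illustrative aggregation, not the authors' exact procedure.

```python
import numpy as np

def aggregate_patch_attention(layer_attn, labels, n_classes=3):
    """layer_attn: (n_samples, n_layers, n_patches, n_patches) head-averaged
    attention weights. Chains layers by matrix product so later layers reuse
    earlier-layer information, averages per-patch scores over the samples of
    each class, and normalizes each class profile to sum to one."""
    n_samples, n_layers, n_patches, _ = layer_attn.shape
    chained = np.tile(np.eye(n_patches), (n_samples, 1, 1))
    for l in range(n_layers):
        chained = layer_attn[:, l] @ chained        # connect successive layers
    per_patch = chained.mean(axis=1)                # average over query positions
    scores = np.zeros((n_classes, n_patches))
    for c in range(n_classes):
        scores[c] = per_patch[labels == c].mean(axis=0)
        scores[c] /= scores[c].sum()                # normalized patch attention
    return scores

# Example: random attention maps for 16 patches, 2 layers, and 3 health states
rng = np.random.default_rng(0)
attn = rng.random((30, 2, 16, 16)); attn /= attn.sum(-1, keepdims=True)
labels = rng.integers(0, 3, 30)
print(aggregate_patch_attention(attn, labels).shape)   # (3, 16)
```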
5 | CONCLUSION

In this study, significant progress has been made in the SHM of laminated composites by integrating explainable AI. Unlike existing black-box AI models, this study introduces an X-ViT model that addresses the critical need for reliability, interpretability, and transparency of AI models deployed for SHM of composite structures. To achieve this objective, first, the ViT model was developed using a transformer architecture utilizing the attention mechanism and positional encoding. The ViT model was then optimized by choosing hyperparameters that provide optimal damage detection results for laminated composites and validated on experimental data comprising various health states, including delamination damage cases. Moreover, the performance and generalization ability of the model were improved by transforming the raw vibration signals into images using CWT and adding Gaussian noise to augment the data. The results indicated that the proposed ViT model performs well compared to existing state-of-the-art methods in terms of numerous evaluation metrics, including accuracy, precision, recall, and F1-score. Overall, the proposed model showed an accuracy of 98.82%, 98.48%, and 99.16% for D1, D2, and H, respectively, using the unseen validation dataset. The developed ViT-based deep learning model can then be explained by analyzing the patch attention in each layer of the model. By aggregating patch attention and identifying key patches influencing predictions, the X-ViT model offers valuable insight into the decision-making process, enhancing trust and reliability among end users. Therefore, the interpretable X-ViT model signifies a major contribution to the field by offering a robust and transparent solution for damage detection in composite laminates. However, since the current research focuses on a single dataset covering three health states, future work will focus on developing another dataset with complex damage scenarios, to not only detect damage but also perform explainable damage localization for laminated composite structures.

AUTHOR CONTRIBUTIONS
Muhammad Muzammil Azad: Conceptualization, Data curation, Methodology, Validation, Writing—original draft. Heung Soo Kim: Supervision, Writing—review & editing, Funding acquisition.

ACKNOWLEDGMENTS
This work was supported by a National Research Foundation of Korea (NRF) grant, funded by the Korea government (MSIT) (No. 2020R1A2C1006613).

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

ORCID
Muhammad Muzammil Azad https://orcid.org/0000-0002-1216-979X

REFERENCES
1. Sharma A, Mukhopadhyay T, Rangappa SM, Siengchin S, Kushvaha V. Advances in computational intelligence of polymer composite materials: machine learning assisted modeling, analysis and design. Arch Comput Methods Eng. 2022;29(5):3341-3385. doi:10.1007/s11831-021-09700-9
2. Azad MM, Shah AU, Prabhakar MN, Kim HS. Deep learning-based fracture mode determination in composite laminates. Journal of the Computational Structural Engineering Institute of Korea. 2024;37(4):225-232. doi:10.7734/COSEIK.2024.37.4.225
3. Gayathri P, Umesh K, Ganguli R. Effect of matrix cracking and material uncertainty on composite plates. Reliab Eng Syst Saf. 2010;95(7):716-728. doi:10.1016/j.ress.2010.02.004
4. Khan A, Azad MM, Sohail M, Kim HS. A review of physics-based models in prognostics and health management of laminated composite structures. Int J Precis Eng Manuf-Green Technol. 2023;10(6):1615-1635. doi:10.1007/s40684-023-00509-4
5. Khan A, Kim HS. A brief overview of delamination localization in laminated composites. Multiscale Sci Eng. 2022;4(3):102-110. doi:10.1007/s42493-022-00085-w
6. Giannakeas IN, Mazaheri F, Bacarreza O, Khodaei ZS, Aliabadi FMH. Probabilistic residual strength assessment of smart composite aircraft panels using guided waves. Reliab Eng Syst Saf. 2023;237:109338. doi:10.1016/j.ress.2023.109338
7. Yang F, Wu D, Feng H. A new probe-based electromagnetic non-destructive testing method for carbon fiber-reinforced polymers utilizing eddy current loss measurements. Polym Compos. 2023;44(10):6661-6675. doi:10.1002/pc.27587
8. Khan A, Ko DK, Lim SC, Kim HS. Structural vibration-based classification and prediction of delamination in smart composite laminates using deep learning neural network. Compos Part B Eng. 2019;161:586-594. doi:10.1016/j.compositesb.2018.12.118
9. Khan A, Kim H. Active vibration control of a piezo-bonded laminated composite in the presence of sensor partial debonding and structural delaminations. Sensors. 2019;19(3):540. doi:10.3390/s19030540
10. Sohn JW, Choi SB, Kim HS. Vibration control of smart hull structure with optimally placed piezoelectric composite actuators. Int J Mech Sci. 2011;53(8):647-659. doi:10.1016/j.ijmecsci.2011.05.011
11. Yang JS, Liu ZD, Schmidt R, Schröder KU, Ma L, Wu LZ. Vibration-based damage diagnosis of composite sandwich panels with bi-directional corrugated lattice cores. Compos Part A Appl Sci Manuf. 2020;131:105781. doi:10.1016/j.compositesa.2020.105781
12. Essedik LM, Rachid T, Madjid E, et al. Active vibration control of piezoelectric multilayers FG-CNTRC and FG-GPLRC plates. Polym Compos. 2024;11:9573-9587. doi:10.1002/pc.28428
13. Azad MM, Kim S, Bin CY, Kim HS. Intelligent structural health monitoring of composite structures using machine learning, deep learning, and transfer learning: a review. Adv Compos Mater. 2024;33(2):162-188. doi:10.1080/09243046.2023.2215474
14. Jakkamputi L, Devaraj S, Marikkannan S, et al. Experimental and computational vibration analysis for diagnosing the defects in high performance composite structures using machine learning approach. Appl Sci. 2022;12(23):12100. doi:10.3390/app122312100
15. Das S, Chattopadhyay A, Srivastava AN. Classifying induced damage in composite plates using one-class support vector machines. AIAA J. 2010;48(4):705-718. doi:10.2514/1.37282
16. Tang L, Li Y, Bao Q, et al. Quantitative identification of damage in composite structures using sparse sensor arrays and multi-domain-feature fusion of guided waves. Measurement. 2023;208:112482. doi:10.1016/j.measurement.2023.112482
17. Qiao S, Huang M, Liang Y, Zhang S, Zhou W. Damage mode identification in carbon/epoxy composite via machine learning and acoustic emission. Polym Compos. 2023;44(4):2427-2440. doi:10.1002/pc.27254
18. Rautela M, Senthilnath J, Monaco E, Gopalakrishnan S. Delamination prediction in composite panels using unsupervised-feature learning methods with wavelet-enhanced guided wave representations. Compos Struct. 2022;291:115579. doi:10.1016/j.compstruct.2022.115579
19. Ullah S, Ijjeh AA, Kudela P. Deep learning approach for delamination identification using animation of lamb waves. Eng Appl Artif Intell. 2023;117:105520. doi:10.1016/j.engappai.2022.105520
20. Wang MH, Lu SD, Hsieh CC, Hung CC. Fault detection of wind turbine blades using multi-channel CNN. Sustainability. 2022;14(3):1781. doi:10.3390/su14031781
21. Rautela M, Gopalakrishnan S. Ultrasonic guided wave based structural damage detection and localization using model assisted convolutional and recurrent neural networks. Expert Syst Appl. 2021;167:114189. doi:10.1016/j.eswa.2020.114189
22. Yoon S, (Song-Kyoo) Kim A, Cantwell WJ, et al. Defect detection in composites by deep learning using solitary waves. Int J Mech Sci. 2023;239:107882. doi:10.1016/j.ijmecsci.2022.107882
23. Modarres C, Astorga N, Droguett EL, Meruane V. Convolutional neural networks for automated damage recognition and damage type identification. Struct Control Health Monit. 2018;25(10):e2230. doi:10.1002/stc.2230
24. Lee H, Lim HJ, Skinner T, Chattopadhyay A, Hall A. Automated fatigue damage detection and classification technique for composite structures using lamb waves and deep autoencoder. Mech Syst Signal Process. 2022;163:108148. doi:10.1016/j.ymssp.2021.108148
25. Park S, Song J, Kim HS, Ryu D. Non-contact detection of delamination in composite laminates coated with a mechanoluminescent sensor using convolutional AutoEncoder. Mathematics. 2022;10(22):4254. doi:10.3390/math10224254
26. Rudin C, Radin J. Why are we using black box models in AI when we don't need to? A lesson from an explainable AI competition. Harvard Data Sci Rev. 2019;1(2):1-9. doi:10.1162/99608f92.5a8a3a3d
27. Naser MZ. An engineer's guide to eXplainable artificial intelligence and interpretable machine learning: navigating causality, forced goodness, and the false perception of inference. Autom Constr. 2021;129:103821. doi:10.1016/j.autcon.2021.103821
28. Bogue R. What are the prospects for robots in the construction industry? Ind Robot An Int J. 2018;45(1):1-6. doi:10.1108/IR-11-2017-0194
29. Guilleme M, Masson V, Roze L, Termier A. Agnostic local explanation for time series classification. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). IEEE; 2019:432-439. doi:10.1109/ICTAI.2019.00067
30. Kumar P, Hati AS. Deep convolutional neural network based on adaptive gradient optimizer for fault detection in SCIM. ISA Trans. 2021;111:350-359. doi:10.1016/j.isatra.2020.10.052
31. Noh YR, Khalid S, Kim HS, Choi SK. Intelligent fault diagnosis of robotic strain wave gear reducer using area-metric-based sampling. Mathematics. 2023;11(19):4081. doi:10.3390/math11194081
32. Yagin FH, Cicek IB, Alkhateeb A, et al. Explainable artificial intelligence model for identifying COVID-19 gene biomarkers. Comput Biol Med. 2023;154:106619. doi:10.1016/j.compbiomed.2023.106619
33. Liu H, Wang Y, Fan W, et al. Trustworthy AI: a computational perspective. ACM Trans Intell Syst Technol. 2023;14(1):1-59. doi:10.1145/3546872
34. Holzinger A. The next frontier: AI we can really trust. 2021:427-440. doi:10.1007/978-3-030-93736-2_33
35. Manzari ON, Ahmadabadi H, Kashiani H, Shokouhi SB, Ayatollahi A. MedViT: a robust vision transformer for generalized medical image classification. Comput Biol Med. 2023;157:106791. doi:10.1016/j.compbiomed.2023.106791
36. Liang P, Yu Z, Wang B, Xu X, Tian J. Fault transfer diagnosis of rolling bearings across multiple working conditions via subdomain adaptation and improved vision transformer network. Adv Eng Inform. 2023;57:102075. doi:10.1016/j.aei.2023.102075
37. Wu H, Triebe MJ, Sutherland JW. A transformer-based approach for novel fault detection and fault classification/diagnosis in manufacturing: a rotary system application. J Manuf Syst. 2023;67:439-452. doi:10.1016/j.jmsy.2023.02.018
38. Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Advances in Neural Information Processing Systems. NeurIPS Proceedings; 2017.
39. Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16×16 words: transformers for image recognition at scale. International Conference on Learning Representations; 2021. doi:10.48550/arXiv.2010.11929
40. Ding Y, Jia M, Miao Q, Cao Y. A novel time–frequency transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings. Mech Syst Signal Process. 2022;168:108616. doi:10.1016/j.ymssp.2021.108616
41. Alshahrani H, Sharma G, Anand V, et al. An intelligent attention-based transfer learning model for accurate differentiation of bone marrow stains to diagnose hematological disorder. Life. 2023;13(10):2091. doi:10.3390/life13102091
42. Azad MM, Kim HS. Hybrid deep convolutional networks for the autonomous damage diagnosis of laminated composite structures. Compos Struct. 2024;329:117792. doi:10.1016/j.compstruct.2023.117792
43. Ali S, Abuhmed T, El-Sappagh S, et al. Explainable artificial intelligence (XAI): what we know and what is left to attain trustworthy artificial intelligence. Inf Fusion. 2023;99:101805. doi:10.1016/j.inffus.2023.101805
44. Zhou J, Li Z, Chen J. Damage identification method based on continuous wavelet transform and mode shapes for composite laminates with cutouts. Compos Struct. 2018;191:12-23. doi:10.1016/j.compstruct.2018.02.028
45. Sun S, Zhang T, Li Q, et al. Fault diagnosis of conventional circuit breaker contact system based on time–frequency analysis and improved AlexNet. IEEE Trans Instrum Meas. 2021;70:1-12. doi:10.1109/TIM.2020.3045798
46. Liu Y, Li Z, Zhang W. Crack detection of fibre reinforced composite beams based on continuous wavelet transform. Nondestruct Test Eval. 2010;25(1):25-44. doi:10.1080/10589750902744992
47. Zhu S, Yu C, Hu J. Regularizing deep neural networks for medical image analysis with augmented batch normalization. Appl Soft Comput. 2024;154:111337. doi:10.1016/j.asoc.2024.111337
48. Soltanieh S, Etemad A, Hashemi J. Analysis of augmentations for contrastive ECG representation learning. In: 2022 International Joint Conference on Neural Networks (IJCNN). IEEE; 2022:1-10. doi:10.1109/IJCNN55064.2022.9892600
49. Sikdar S, Ostachowicz W, Kundu A. Deep learning for automatic assessment of breathing-debonds in stiffened composite panels using non-linear guided wave signals. Compos Struct. 2023;312:116876. doi:10.1016/j.compstruct.2023.116876
50. Azad MM, Kumar P, Kim HS. Delamination detection in CFRP laminates using deep transfer learning with limited experimental data. J Mater Res Technol. 2024;29:3024-3035. doi:10.1016/j.jmrt.2024.02.067
51. Innat M, Hossain MF, Mader K, Kouzani AZ. A convolutional attention mapping deep neural network for classification and localization of cardiomegaly on chest X-rays. Sci Rep. 2023;13(1):6247. doi:10.1038/s41598-023-32611-7

How to cite this article: Azad MM, Kim HS. An explainable artificial intelligence-based approach for reliable damage detection in polymer composite structures using deep learning. Polym Compos. 2025;46(2):1536-1551. doi:10.1002/pc.29055