Cao Thesis 2022
Learning Algorithms
By
Lan Cao
Master of Science
in Mechanical Engineering
University of Houston
December 2022
Copyright 2022, Lan Cao
ACKNOWLEDGEMENTS
I would like to express my great appreciation to Dr. Gangbing Song for all of his
support during my graduate study at the University of Houston (UH). When I decided
to apply for graduate school back in 2020, I planned to attend an online Master's degree
persuaded me to apply to the one at the University of Houston and do the thesis option
instead of the full courses option. During the two and a half years of my academic and
research study at UH, Dr. Song presented to me his previous research on smart materials
and ongoing research on solving mechanical engineering problems with machine learning
methods, which I had assumed belonged only to computer science applications, such
as YouTube video watching, Google translation, and video gaming. Dr. Song's guidance
really expanded my view of the integration of machine learning into many fields, not limited
to mechanical engineering, but also the civil, chemical, and medical industries. I
was astonished that the technology had made such great progress since then.
Throughout my thesis work, Dr. Song gave me close guidance and many suggestions
Dr. Zheng Chen, Dr. Xuemin Chen and Dr. Matthew A. Franchek for taking their time to
Last but not least, thanks to Mr. Ji’An Chen, a Ph.D. student in Dr. Song’s Smart
Material and Structure Lab. We did experiments together and he helped to train me on
ABSTRACT
Elbows are widely used in many industries, especially in the oil and gas industry. The
purpose of an elbow is to change the flow direction in pipeline systems. In some severe
medium. With the increase of the service time, the wall thickness of the elbow will
become thinner due to erosion and wear, which may lead to piercing or bursting of the
high-pressure piping system and cause negative impacts on both the economy and the
environment.
A novel method using percussion and machine learning to detect the level of
elbow erosion was developed and is discussed in this thesis. Three sets of elbow and pipe
assemblies were used as test specimens. Six different erosion levels were simulated
by grinding off mass from the internal wall of the elbows. The bottom of the elbow,
where the simulated erosion was located, was tapped to generate the percussion sound, which
was recorded by a smartphone. The power spectral density (PSD) and mel-frequency
cepstral coefficients (MFCCs) were employed to extract features from the percussion
sound.
The k-nearest neighbor (KNN), the decision tree (DT), and the support vector
machine (SVM) were implemented with PSD features to learn the training samples and
predict test samples. By using the above three basic machine learning methods, the
experiment achieved an average of 90% accuracy on training data and 80% on testing
data. Then, the recurrent neural network (RNN), a deep learning method, was
implemented with MFCC features to learn and train the data. This method achieved
100% accuracy on training data and 97% on testing data. Finally, the unsupervised
clustering algorithms, k-means and Gaussian mixture model (GMM), were implemented
with transformed MFCC features. The accuracy of k-means algorithm varied in a range
from 49% to 68%, while the GMM clustering method achieved an accuracy of 76%.
The results of this work demonstrate the feasibility of using
percussion and machine learning to detect the level of elbow erosion in pipelines.
Compared with conventional methods, the proposed method does not require the
installation of sensors or extra signal acquisition instruments. Erosion detection using
percussion and machine learning has great potential to contribute to pipeline operating
safety assurance.
Table of Contents
ABSTRACT
1 Introduction
2.1 The basic understanding of elbow erosion and relevant prevailing detection methods
3.5 Mel-frequency cepstral coefficients
References
List of Tables
Table 1: Components to control the cell state and hidden state
List of Figures
Figure 2: Photographs illustrating the elbow section of the pipe where the leak was found
Figure 3: Schematic diagram of elbow erosion by liquid-solid two phase flow
Figure 12: Percussion performed on elbow and three sets of elbow and pipe assembly
Figure 14: The elbows after grinding on the last step
Figure 15: Plot of six erosion rates of power spectrum of signals in frequency domain
Figure 16: 3D plot showing scattering data on PSD features (2nd, 5th and 7th)
Figure 17: 3D plot showing scattering data on PSD features (3rd, 5th, and 9th)
Figure 18: MFCC features extracted from single frame sound signal
Figure 24: 3D plot showing results of k-means clustering on selected transformed MFCC features
Figure 25: 3D plot showing results of clustering with GMM on selected transformed
1 Introduction
Elbows are widely used in many industries, especially in the oil and gas industry. The
purpose of an elbow is to change the flow direction in pipeline systems. In some severe
medium, such as fracturing fluid [1]. Studies show that the erosion of the elbow is about
50 times more serious than that of straight pipe [2]. With the increase of the service time,
the wall thickness of the elbow will become thinner due to erosion and wear, which may
lead to piercing or bursting of the high-pressure piping system [1]. Figure 1 shows an
Figure 2: Photographs illustrating the elbow section of the pipe where the leak was found
As another example, Kusmono et al. investigated the failure
of an elbow that leaked after two months in service. The details of the leakage are
shown in Figure 2 [4]. Figure 2a shows the leak area, which is an elbow of a pipeline.
Figure 2b shows the elbow after being cut from the pipeline, with the leakage area identified.
Figure 2c shows the inner area of the elbow and the wall thinning caused by erosion and
corrosion. Any leakage of pipeline would cause an unscheduled shutdown and result in
since elbow erosion monitoring is of great significance to the safety of pipeline systems
and maintenance personnel. Often, the elbow erosion monitoring is based on the
detection of the wall thickness and the detailed literature review on the detection method
This work proposes a new elbow erosion monitoring method, namely the
percussion method aided by machine learning algorithms. To simulate different
degrees of elbow erosion, a grinder was used to remove mass from the elbow at the
curved bottom, where erosion would be most severe. The mass reduction of the elbow
was measured by a high-precision electronic scale and recorded. The sound generated by tapping the curved
bottom of the elbow was collected by a smartphone and pre-processed by power
features. Machine learning algorithms, including the k-nearest neighbor and recurrent
neural network, are employed to classify the different levels of mass loss. The results show that the
newly proposed method for elbow wall thickness erosion detection is effective.
2 An Overview of Related Work
learning algorithm and their applications. Basic knowledge of elbow erosion and
2.1 The basic understanding of elbow erosion and relevant prevailing detection
methods
When the fluid passes through the elbow, the flow direction changes sharply, and
solid particles impact intensively on local areas of the inner wall at a certain speed and
angle, causing local mass loss and wall thickness thinning [5], as shown in Figure 3.
There are various nondestructive testing methods for pipeline defect detection,
including optical testing method, radiography technology, magnetic flux leakage testing
technology and ultrasonic testing technology [6–8]. However, these methods have some
limitations for the real-time monitoring of elbow erosion. Optical
detection technology needs to send a test probe into the pipeline and, through the
acquisition of images, directly displays the defect status. This method is not effective in
environments with high pressure, multiphase media and pipeline vibration [1]. The
X-ray digital real-time imaging testing method and the infrared nondestructive testing method
are both radiographic technologies [9], which can be used to detect local corrosion of
pipelines and measure wall thickness with the help of a standard image characteristics
display instrument. However, the X-ray method involves a radiation hazard, and the
thickness of the pipe wall cannot be read directly. The main ultrasonic methods used in
nondestructive testing of pipeline defects include the ultrasonic pulse reflection method and
the ultrasonic guided wave method [10]. However, these methods are mainly used for
multi-point discontinuous measurement, which requires the surface of the tested part to be flat
and uncoated. The irregular structure of the elbow causes modal transformation of the guided
waves, which affects the propagation of the guided waves and the defect detection accuracy.
The aforementioned methods are mostly based on non-destructive testing and may not be
suitable for real-time monitoring [1]. To realize real-time monitoring, structural
health monitoring (SHM), which utilizes permanently installed sensors [11,12] integrated
with communication and signal processing algorithms to report the structure’s health
status in real time, has achieved great progress [13,14]. In SHM, lead zirconate titanate
(PZT), with its strong piezoelectric effect, is a commonly used transducer material that can be
integrated with a structure for real-time monitoring [15–17]. Li et al. proposed a new
elbow erosion monitoring method, which combines PZT-enabled
active sensing with the fractional Fourier transform (FrFT), and takes the
fractional-order energy peak of the stress wave signal as the damage index. The results
show that there is a one-to-one relationship between the mass of erosion loss and the damage
index. Compared with the traditional time domain signal energy method, this method has
the advantage of eliminating the saturation phenomenon. However, the above-mentioned
In recent years, the percussion method, which does not require the installation of
sensors on the structure, has attracted attention for detecting looseness
of various mechanical joints, such as spatial bolt-ball joints [18], bolted flanges [19], and
cup-lock scaffolds [20]. The percussion method analyzes a structure’s sound response
to a low-level impact or tap to detect the structure’s
health status [21–24]. The percussion method has great potential for integration with
robotics technology to enable fully automated structural inspection. The percussion
method can even be used to detect bolt looseness of a subsea flange using speech recognition
technology and a least squares support vector machine [25]. Research has also been
conducted on the detection of shear loading on bolts in bridge structures [26], sand deposition
in pipelines [27] and bolt head corrosion [28]. Next, selected research work using the
Kong et al. proposed a percussion method to detect or monitor the bolt looseness.
The bolted connection was set at different tightening levels with a torque wrench
followed by percussions, and sound signals were collected. With selected ranges in the
PSD as features, a basic machine learning algorithm, decision tree, can successfully
classify the different tightening levels of the bolted connection [19]. Wang et al.
developed a novel robotic-assisted percussion method for spatial bolt-ball joint looseness
plus CNN-based deep machine learning method were used. In this experiment, a robotic
arm instead of manual tapping was used, which explored the potential for implementation in
industry. The experimental results showed that the proposed method was effective in
monitoring the connection looseness [18]. Wang et al. conducted an exploratory study
of one issue that can affect the stability of scaffolding systems, namely
the looseness of the cup-lock joint. In this paper, to detect looseness of cup-lock scaffold,
was used to craft characteristics from MFCC features, and a bi-directional long short-
term memory architecture (BLSTM) was used to improve classification accuracy [20].
Wang et al. developed a percussion method to detect shear loading of through bolts in
loading. The paper also compared the effectiveness of using two technologies, which
experimental results showed that the percussion method using machine learning
The percussion method is also used in applications other than structural looseness
detection. Cheng et al. developed a new non-destructive approach using the percussion
method and voice recognition with a support vector machine to detect the sand deposits in
deposit monitoring model can estimate the deposits in the pipeline with high accuracy
[27]. Wang et al. developed a novel entropy-enhanced acoustic emission (AE) method
with the help of machine learning to detect bolt head corrosion [28].
From the literature review in Section 2.2, we can see that percussion-based
detection methods are often used with machine learning (ML) algorithms. ML algorithms
at both basic learning and deep learning levels have seen great advances in the past few
decades [29–31]. ML techniques have also been used for voice and sound processing
[32–34] and normally consist of the following steps: data processing after collecting
sound signals, extraction of features that are sensitive to target, and the feature
classification via a certain classifier (e.g., KNN, DT, SVM or CNN/RNN) [26]. Even
though the classification accuracy through percussion and ML methods is relatively high,
the hand-crafted features require professional knowledge and extensive labor hours [26].
the other experiment on different structures. Researchers often need to test different
combinations of features and ML algorithms to find the best fit for the particular scenario
being explored.
Among these machine learning algorithms, neural networks, especially deep
learning methods, are often used for sound signal classification. MFCC
features combined with neural network algorithms have demonstrated their superiority in
speech recognition [35,36]. For instance, Liu et al. proposed an MFCC-CNN hybrid
method for short utterance speaker recognition [37], and Rejaibi et al. proposed to use a
to predict its severity level from speech [38]. Similar research has been done on detecting
bowel sounds with an LSTM neural network using MFCC features, proposed by Liu et al. [39].
In this thesis, a deep learning method based on RNN with help from MFCC is
proposed to effectively detect erosion levels of an elbow after exploring basic learning
3 Research Methodology and Methods
The general research methodology of the thesis is to use percussion-induced sound
on pipeline elbows with machine learning methods to estimate the level of erosion when the
data are labeled (supervised learning), i.e., when baseline data are available. In the absence
of baseline data (unsupervised learning), clustering methods are used to detect
severe erosion.
This chapter introduces the feature selection methods and the machine learning
methods used in this thesis. The power spectral density (PSD) of the recorded signal
within selected frequency segments was used as signal features, and these PSD features
were applied with basic learning algorithms, including KNN, DT, and SVM, to classify
(MFCCs) were used as features for the deep learning algorithm, recurrent neural network
(RNN). Finally, transformed MFCC features were used in clustering, including k-means
and Gaussian mixture model (GMM), which are unsupervised learning algorithms. Figure
Figure 4: Flow chart of proposed methods in this thesis
The power spectral density, or simply power spectrum, of the input signal can be
estimated using the fast Fourier transform (FFT). The FFT is a discrete Fourier transform
algorithm that reduces the number of computations needed for $N$ points from $2N^2$ to
$2N\log_2 N$, where $\log_2$ is the base-2 logarithm. For each feature, the total energy of PSD
segment was computed by the summation of the PSD sampling values at the
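The discrete Fourier transform (DFT) can be used to decompose any signal into a
sum of simple sine and cosine waves whose frequency, amplitude
$$X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N} = \sum_{n=0}^{N-1} x_n \left[\cos\left(\frac{2\pi k n}{N}\right) - i \sin\left(\frac{2\pi k n}{N}\right)\right] \tag{1}$$
where
$N$ = number of samples
$n$ = current sample
$k$ = current frequency, where $k \in [0, N-1]$
$x_n$ = the signal value at sample $n$
$X_k$ = the DFT, which includes information on both amplitude and phase
$$\mathrm{amp} = \frac{|X_k|}{N} = \frac{\sqrt{\mathrm{Re}(X_k)^2 + \mathrm{Im}(X_k)^2}}{N}, \qquad \mathrm{phase} = \mathrm{atan2}\left(\mathrm{Im}(X_k), \mathrm{Re}(X_k)\right) \tag{2}$$
where $\mathrm{Im}(X_k)$ and $\mathrm{Re}(X_k)$ are the imaginary and real parts of the complex number, atan2 is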
3.2 K-nearest neighbor algorithm
about the grouping of an individual data point. The goal of KNN algorithm is to identify
the nearest neighbors of a given query point, so that we can assign a class label to that
point [42].
Using the below formula, it measures a straight line between the query point and
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (y_i - x_i)^2}. \tag{3}$$
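The nearest-neighbor vote with the straight-line (Euclidean) distance can be sketched in a few lines. This is only an illustrative Python sketch, not the MATLAB implementation used in the thesis; the function names `euclidean` and `knn_predict` and the toy feature values are assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y)))

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D features (hypothetical values, not thesis data).
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["class 1", "class 1", "class 1", "class 2", "class 2", "class 2"]
prediction = knn_predict(train_x, train_y, (0.15, 0.15), k=3)
```

With k = 3, the three closest training points decide the label, so a small k follows local structure while a larger k smooths the decision.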
The k value in the KNN algorithm defines how many neighbors will be checked to
for both classification and regression tasks [43]. It has a hierarchical, tree structure, which
consists of a root node, branches, internal nodes and leaf nodes. A final output is based
on a series of decisions made from multiple conditions. The flow chart is shown in Figure
5 [43].
Figure 5: Decision tree algorithm chart
classification and regression. The objective of the SVM algorithm is to find a hyperplane
that, to the best degree possible, separates data points of one class from those of another
class [44]. “Best” is defined as the hyperplane with the largest margin between the
two classes, represented by plus versus minus in Figure 6 below. The margin is
the maximal width of the slab parallel to the hyperplane that has no interior data points.
Only for linearly separable problems can the algorithm find such a hyperplane, and for
most practical problems the algorithm maximizes the soft margin allowing a small
number of misclassifications.
Figure 6: Maximum-margin hyperplane and margins for an SVM [45]
(ECOC) classifier for multiclass learning, where the classifier consists of multiple binary
learners such as SVMs [46]. This function was used in the experiment in this thesis.
automatic speech and speaker recognition. The method was introduced by Davis and
Mermelstein in the 1980s for processing of voice data [47]. The implementation
steps are as follows [48].
Step 2 is to apply the discrete Fourier transform to obtain the power spectrum in the
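$$S_i(k) = \sum_{n=1}^{N} s_i(n)\, h(n)\, e^{-j 2\pi k n / N}, \quad 1 \le k \le K, \tag{4}$$
where $s_i(n)$ is the $i$-th time-domain frame, $h(n)$ is an $N$-sample-long analysis window (e.g. Hamming window), and $K$ is the
length of the DFT. The periodogram-based power spectral estimate for the speech frame
is given by
$$P_i(k) = \frac{1}{N} \left|S_i(k)\right|^2. \tag{5}$$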
Step 3 is to convert the frequency (Hertz scale) to the mel scale through a filter bank. The
first filter is very narrow and gives an indication of how much energy exists near 0 Hertz.
As the frequencies get higher our filters get wider as we become less concerned about
variations. The mel scale tells us exactly how to space our filter banks and how wide to
make them. The formula for converting from frequency to mel scale is
$$M(f) = 1125 \ln\left(1 + \frac{f}{700}\right). \tag{6}$$
$$M^{-1}(m) = 700 \left(\exp\left(\frac{m}{1125}\right) - 1\right)$$
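$$H_m(k) = \begin{cases} 0, & k < f(m-1) \\ \dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\ \dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) \le k \le f(m+1) \\ 0, & k > f(m+1) \end{cases} \tag{7}$$
where $M$ is the number of filters we want, and $f(\cdot)$ is the list of $M+2$ mel-spaced
frequencies.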
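The mel conversion and the triangular filters of Step 3 can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the thesis implementation: the filters are evaluated on continuous frequencies in Hz instead of FFT bin indices, the 0 to 8000 Hz range is a hypothetical choice, and the helper names are invented for this example. The count of 26 filters follows the number of filter-bank coefficients mentioned in the text.

```python
import math

def hz_to_mel(f_hz):
    """Frequency in Hz to mel scale, M(f) = 1125 ln(1 + f/700)."""
    return 1125.0 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse mapping, 700 (exp(m/1125) - 1)."""
    return 700.0 * (math.exp(m / 1125.0) - 1.0)

def mel_points(low_hz, high_hz, n_filters):
    """Return n_filters + 2 frequencies (Hz) equally spaced on the mel scale."""
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]

def triangular_filter(k, f_prev, f_center, f_next):
    """Weight of one filter at frequency k, with edges f(m-1), f(m), f(m+1)."""
    if k < f_prev or k > f_next:
        return 0.0
    if k <= f_center:
        return (k - f_prev) / (f_center - f_prev)
    return (f_next - k) / (f_next - f_center)

# Hypothetical analysis range of 0 to 8000 Hz with 26 filters.
points = mel_points(0.0, 8000.0, n_filters=26)
# Width of each filter in Hz: the distance between its two outer edges.
widths = [points[i + 2] - points[i] for i in range(26)]
```

Because the points are evenly spaced in mel but not in Hz, the entries of `widths` grow with frequency, which matches the statement that the filters get wider as the frequencies get higher.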
Step 4 is to take the logarithm of the filter bank energies once we have them. The
technique.
Step 5, the final step, is to compute the discrete cosine transform (DCT) of the log filter
.
∑ cos
. (8)
∑ | |
There are two main reasons why this is performed. Since the filter banks are all
overlapping, the filter bank energies are quite correlated with each other. The DCT
decorrelates the energies, which means diagonal covariance matrices can be used to
model the features in an HMM (hidden Markov model) classifier. Only 12-14 of the 26
DCT coefficients are kept. This is because the higher DCT coefficients represent fast
changes in the filter bank energies and it turns out that these fast changes actually
i.e. transformed MFCC. The MFCC in this thesis has 14 rows of features and 24 columns
of time steps. The transformed MFCC is a vector of size 1x28. For each
row, the mean and the standard deviation are calculated, which formed the new
nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal
dynamic behavior [49]. Derived from feedforward neural networks, RNNs can use their
internal state (memory) to process variable length sequences of inputs. This makes them
recognition [49]. RNNs are distinguished by their “memory” as they take information
from prior inputs to influence the current input and output. RNN is a type of deep
network that can learn long-term dependencies between time steps of sequence data [50].
This diagram illustrates the architecture of a simple LSTM network for classification.
The network, as shown in Figure 7, starts with a sequence input layer followed by an
LSTM layer. To predict class labels, the network ends with a fully connected layer, a
This diagram in Figure 8 illustrates the flow of a time series X with C features
(channels) of length S through an LSTM layer. In the diagram, ht and ct denote the output
(also known as the hidden state) and the cell state at time step t, respectively [51].
The first LSTM block uses the initial state of the network and the first time step of
the sequence to compute the first output and the updated cell state. At time step t, the
block uses the current state of the network (ct−1, ht−1) and the next time step of the
sequence to compute the output and the updated cell state ct. The state of the layer
consists of the hidden state (also known as the output state) and the cell state. The hidden
state at time step t contains the output of the LSTM layer for this time step. The cell state
contains information learned from the previous time steps. At each time step, the layer
adds information to or removes information from the cell state. The layer controls these
updates using gates. The following components, as illustrated in Table 1, control the cell
Component Purpose
Output gate (o) Control level of cell state added to hidden state
This diagram in Figure 9 illustrates the flow of data at time step t. The diagram highlights
how the gates forget, update, and output the cell and hidden states [51].
Figure 9: Diagram of an individual cell
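The gate mechanism described above can be sketched as a single LSTM time step. This is a minimal pure-Python sketch assuming the standard LSTM update with sigmoid gates and a tanh state activation; the function names, the `params` dictionary layout and the toy sizes are assumptions for illustration and are not taken from the thesis or from MATLAB.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, R, h, b):
    """Return W x + R h + b for list-of-rows matrices W, R and vectors x, h, b."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(r * hi for r, hi in zip(R_row, h))
        + b_j
        for W_row, R_row, b_j in zip(W, R, b)
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps each gate name to its (W, R, b)."""
    pre = {}
    for gate in ("i", "f", "g", "o"):
        W, R, b = params[gate]
        pre[gate] = affine(W, x, R, h_prev, b)
    i = [sigmoid(v) for v in pre["i"]]      # input gate
    f = [sigmoid(v) for v in pre["f"]]      # forget gate
    g = [math.tanh(v) for v in pre["g"]]    # cell candidate
    o = [sigmoid(v) for v in pre["o"]]      # output gate
    # Forget part of the old cell state and add the gated candidate.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # The hidden state is the gated, squashed cell state.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

# Hypothetical sizes: 2 input features, 1 hidden unit, all weights zero.
zeros = ([[0.0, 0.0]], [[0.0]], [0.0])
params = {gate: zeros for gate in ("i", "f", "g", "o")}
h1, c1 = lstm_step([0.3, -0.2], h_prev=[0.0], c_prev=[1.0], params=params)
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate to 0, so the step simply halves the previous cell state; trained weights make the gates depend on the input and the previous hidden state.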
The learnable weights of an LSTM layer are the input weights W,
the recurrent weights R, and the bias b. The matrices W, R, and
b are concatenations of the input weights, the recurrent weights, and the bias of each
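$$W = \begin{bmatrix} W_i \\ W_f \\ W_g \\ W_o \end{bmatrix}, \quad R = \begin{bmatrix} R_i \\ R_f \\ R_g \\ R_o \end{bmatrix}, \quad b = \begin{bmatrix} b_i \\ b_f \\ b_g \\ b_o \end{bmatrix}, \tag{9}$$
where i, f, g, and o denote the input gate, forget gate, cell candidate, and output gate,
respectively. The cell state and the hidden state at time step $t$ are given by
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t \tag{10}$$
$$h_t = o_t \odot \sigma_c(c_t) \tag{11}$$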
where σc denotes the state activation function. The ‘lstmLayer’ function, by default, uses
the hyperbolic tangent function (tanh) to compute the state activation function. The
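Component Formula
Input gate $i_t = \sigma_g(W_i x_t + R_i h_{t-1} + b_i)$
Forget gate $f_t = \sigma_g(W_f x_t + R_f h_{t-1} + b_f)$
Cell candidate $g_t = \sigma_c(W_g x_t + R_g h_{t-1} + b_g)$
Output gate $o_t = \sigma_g(W_o x_t + R_o h_{t-1} + b_o)$
function, by default, uses the sigmoid function given by $\sigma(x) = (1 + e^{-x})^{-1}$ to compute the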
unsupervised machine learning method. The algorithm calculates the best category based
on the similarity of the distance between sample points: the closer two objects
are, the more similar they are. Its core idea is to divide samples into different categories
through an iterative process. This algorithm needs the number of clusters to be specified, and
The MATLAB function ‘kmeans’ partitions data into k mutually exclusive
clusters and returns the index of the cluster to which it assigns each observation, and it
also treats each observation in the data as an object that has a location in space [52]. The
function finds a partition in which objects within each cluster are as close to each other as
possible, and as far from objects in other clusters as possible. Like many clustering
methods, k-means clustering requires you to specify the number of clusters k before clustering
[52].
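The iterative assign-and-update idea behind k-means can be sketched as follows. This is an illustrative pure-Python version of Lloyd's algorithm, not the MATLAB ‘kmeans’ function itself; starting from explicitly supplied centroids and the toy points are assumptions chosen to keep the example deterministic.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Toy 2-D points forming two well-separated groups (hypothetical values).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The number of clusters is fixed by the number of initial centroids, which mirrors the requirement to choose k before clustering; the result can depend on the starting centroids.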
Gaussian mixture model (GMM), shown in Figure 11, is often used for data
clustering. To perform clustering, the GMM assigns query data points to the multivariate
normal components that maximize the component posterior probability. That is, given a
fitted GMM, the cluster assigns query data to the component yielding the highest
posterior probability.
GMM clustering can accommodate clusters that have different sizes and
can be more appropriate than methods such as k-means clustering [55]. GMM can also be
integrated with other methods, such as HMM (Hidden Markov Model), to achieve more
comprehensive tasks [56,57]. Like many clustering methods, GMM clustering requires
you to specify the number of clusters before fitting the model. The number of clusters
specifies the number of components in the GMM. For GMMs, the following best
practices apply:
Consider the component covariance structure. You can specify diagonal or full
covariance matrices, and whether all components have the same covariance
matrix.
Specify initial conditions. The Expectation-Maximization (EM) algorithm fits the
GMM.
Implement regularization.
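$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{T} \Sigma^{-1} (x-\mu)\right) \tag{12}$$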
where x represents our data points, and D is the number of dimensions of each data point.
μ and Σ are the mean and covariance, respectively. If we have a dataset comprised of N =
vector, and Σ will be a 3 × 3 matrix. For later purposes, we will also find it useful to take
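$$\ln \mathcal{N}(x \mid \mu, \Sigma) = -\frac{D}{2}\ln 2\pi - \frac{1}{2}\ln|\Sigma| - \frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu). \tag{13}$$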
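The log-density and the posterior-based cluster assignment can be illustrated with a small Python sketch. It assumes diagonal covariance matrices (one of the two structures mentioned in the best practices) so that no matrix inversion is needed, and it takes the mixture parameters as given instead of fitting them with EM; all names and numbers are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """ln N(x | mean, diag(var)) for a D-dimensional point."""
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    maha = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mean, var))
    return -0.5 * d * math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * maha

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def assign_cluster(x, weights, means, variances):
    """Index of the component with the highest posterior probability."""
    post = posteriors(x, weights, means, variances)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical two-component mixture in two dimensions.
weights = [0.5, 0.5]
means = [(0.0, 0.0), (5.0, 5.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
cluster = assign_cluster((0.5, 0.2), weights, means, variances)
```

Working with log-densities avoids underflow for points far from every component; the posteriors are also a soft measure of how confidently a point belongs to its cluster.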
4 Experiment Setup and Data Collection
to detect the different erosion rate of the elbow with the assistance of machine learning
algorithms. After the impacts, the signals were input into MATLAB and pre-processed. Three
strategies were then used for data analysis. The first strategy was to process the input signal
by PSD and train and test the processed data with basic machine learning methods, namely
KNN, DT and SVM. The second strategy was to process the input signal by MFCC
and use a deep learning method, RNN, to train and test the data. The third strategy was to use
transformed MFCC as features with unsupervised machine learning methods, i.e. k-means
In the experiment, there were three identical sets of elbow and pipe assembly, as
Figure 12: Percussion performed on elbow and three sets of elbow and pipe assembly
An impact hammer with a metal tip was used to repeatedly tap on the bottom of
the elbow. Meanwhile, the hit sound was recorded by a smartphone. The
percussion-induced sound signals were pre-processed and used to train different machine learning
algorithms. Cross-validation data were also used to test the trained models.
collected as shown in Figure 13. Scenario one is an elbow only, hung from a
string; Scenario two is an assembly of an elbow with one pipe connected to one
end of the elbow; Scenario three is an assembly of an elbow with two pipes
connected to both ends of the elbow; Scenario four is the same assembly as Scenario
The erosion rate for each scenario has six classes. Class 1 uses the original elbow
with no mass loss, and classes 2 to 6 involve a mass loss of 0.4 g at each step. The details
of the erosion rates corresponding to the class numbers are listed in Table 3 below.
Figure 14 also shows the elbows after the last grinding step and one zoomed-in
picture of an elbow. The tools included a hammer with a metal tip and a smartphone, which
was used to record signals. The tapping point is on the bottom of the elbow. The smartphone
5 Data Analyses and Results Discussion
The signals were input and processed by PSD, from which the most sensitive nine
frequency segments were manually selected as features. They form a vector that is
$$\text{edges} = [1000, 1500, 1600, 2150, 5750, 6200, 6250, 7070, 7200, 7700, 9000, 9300, 10000, 11000, 13000, 14000, 17000, 18000]. \tag{14}$$
The PSD plots for the 6 different erosion levels are printed in Figure 15.
Figure 15: Plot of six erosion rates of power spectrum of signals in frequency domain
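The extraction of nine PSD features from the edges vector can be sketched as below. This is a hedged Python illustration, not the MATLAB code used in the thesis: it assumes that consecutive pairs of edge values are the lower and upper limits of the nine segments, uses a direct DFT periodogram instead of an FFT, and runs on a synthetic tone with a hypothetical 44.1 kHz sampling rate.

```python
import cmath
import math

EDGES_HZ = [1000, 1500, 1600, 2150, 5750, 6200, 6250, 7070, 7200, 7700,
            9000, 9300, 10000, 11000, 13000, 14000, 17000, 18000]
# Assumption: consecutive pairs are the (low, high) limits of the nine segments.
BANDS = list(zip(EDGES_HZ[0::2], EDGES_HZ[1::2]))

def periodogram(signal):
    """One-sided periodogram |X_k|^2 / N from a direct DFT (O(N^2), toy sizes only)."""
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        xk = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        power.append(abs(xk) ** 2 / n)
    return power

def band_energies(signal, fs):
    """Sum the periodogram values whose frequencies fall inside each band."""
    power = periodogram(signal)
    n = len(signal)
    freqs = [k * fs / n for k in range(len(power))]
    return [sum(p for f, p in zip(freqs, power) if lo <= f <= hi) for lo, hi in BANDS]

fs = 44100   # hypothetical sampling rate in Hz
n = 441      # 10 ms of audio, giving 100 Hz bin spacing
tone = [math.sin(2 * math.pi * 1200 * i / fs) for i in range(n)]
features = band_energies(tone, fs)   # nine-element feature vector
```

For the synthetic 1200 Hz tone, almost all of the energy falls in the first segment (1000 to 1500 Hz); a real percussion recording would spread energy over several segments, and those nine sums form the feature vector.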
All processed signals of the six classes were labelled. Three PSD features
from elbow A, the 2nd, 5th and 7th features, were randomly selected to plot a 3D figure showing the
scatter of all labelled data, as in Figure 16 below. The plot shows that, from
scenario 1 to scenario 4, the data became increasingly separated and therefore easier to
classify. Three other features, i.e. the 3rd, 5th, and 9th features, were also selected and
plotted to see whether the data were better separated. Figure 17 shows that scenarios 2 and 3 had
better results when using the other set of features. From the above observation the
Figure 16: 3D plot showing scattering data on PSD features (2nd, 5th and 7th)
Figure 17: 3D plot showing scattering data on PSD features (3rd, 5th, and 9th)
By doing so, we obtained a data matrix of size 48 x 10, with the last
column as the y-label. For the training data, we combined data from elbows A, B, and C into an
864 x 10 matrix, i.e. 864 samples in total. Then we partitioned the whole data into 80/20
Firstly, we use KNN classification to train on the combined data set of elbows A, B
and C. Setting the K value to 3 gives the highest accuracy. Then we use the trained
model to test the cross-validation data, which is 20% of the partitioned data. Secondly, we
use DT classification to perform the same training and testing as stated above. Thirdly,
we use SVM classification to train the model. Since the standard SVM classification
learner only handles binary classification, we use a MATLAB built-in function, fitcecoc, to
We observe that the accuracy on the training data is above 80% for the KNN and
DT methods, while the SVM method has the best result. We also observe that the accuracy on
the cross-validation testing data from these three classification methods is above 60%.
We can also conclude that scenario 4 has the best training and testing results. Among these
basic machine learning methods, SVM has the best overall accuracy.
implemented for data classification. Here we adopted MFCC + RNN method. After
collecting sound via percussion of elbows, we employed the MFCC to process the signals
and obtained a time sequence feature. An example of MFCC feature is displayed below
in Figure 18, which shows the data, the 20th signal, from all six classes of elbow A and
scenario 4.
Figure 18: MFCC features extracted from single frame sound signal
Data from each class are processed by MFCC, generating a 48 x 1 cell array, which
contains 48 MFCC feature matrices. Each feature matrix has 14 rows of features and 24
columns of time steps. The MFCC matrices of all classes are labelled correspondingly. To train
a deep neural network to classify each time step of sequence data, a
sequence-to-sequence LSTM network was used. The input size is 14 since there are 14 features at
each time step. The number of hidden units is set to 200. The number of classes is 6, as
there are six classes in each scenario. The data are also divided into 80% for training
and 20% for testing. The training accuracy reached 95% within the first 12 iterations and
quickly reached 100% after a short computation time. The accuracy on testing data reached
95% or higher in all four scenarios. The training and testing process for scenarios 1 to 4
Figure 20: The training process of RNN for scenario 2
Figure 22: The training process of RNN for scenario 4
The confusion matrices for all four scenarios are displayed in Tables 5 to 8.
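Table 5: Confusion matrix of RNN method for scenario 1
confusion_matrix_test_set_scenario 1
True     Predicted class              Accuracy    Error
class    C1   C2   C3   C4   C5   C6       (%)      (%)
C1       34    0    0    0    0    0    100.00     0.00
C2        2   34    0    0    0    0     94.44     5.56
C3        0    1   31    0    0    0     96.88     3.13
C4        0    0    3   16    0    0     84.21    15.79
C5        0    0    0    0   22    0    100.00     0.00
C6        0    0    0    0    0   29    100.00     0.00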
Table 6: Confusion matrix of RNN method for scenario 2
confusion_matrix_test_set_scenario 2
True     Predicted class              Accuracy    Error
class    C1   C2   C3   C4   C5   C6       (%)      (%)
C1       30    0    1    0    0    0     96.77     3.23
C2        0   27    0    0    0    0    100.00     0.00
C3        0    0   27    0    0    0    100.00     0.00
C4        0    0    0   31    0    0    100.00     0.00
C5        0    0    0    0   24    3     88.89    11.11
C6        0    0    0    0    0   29    100.00     0.00
Table 7: Confusion matrix of RNN method for scenario 3
confusion_matrix_test_set_scenario 3
True     Predicted class              Accuracy    Error
class    C1   C2   C3   C4   C5   C6       (%)      (%)
C1       29    0    0    0    0    1     96.67     3.33
C2        0   25    1    0    0    0     96.15     3.85
C3        0    0   30    0    0    0    100.00     0.00
C4        0    0    0   22    2    1     88.00    12.00
C5        0    0    0    0   25    0    100.00     0.00
C6        0    1    0    0    0   35     97.22     2.78
Table 8: Confusion matrix of RNN method for scenario 4
confusion_matrix_test_set_scenario 4
True     Predicted class              Accuracy    Error
class    C1   C2   C3   C4   C5   C6       (%)      (%)
C1       33    0    0    0    0    0    100.00     0.00
C2        1   28    0    0    0    0     96.55     3.45
C3        0    0   28    0    0    0    100.00     0.00
C4        0    0    0   24    0    0    100.00     0.00
C5        0    0    0    1   24    0     96.00     4.00
C6        0    0    0    1    0   32     96.97     3.03
Through experiments in this phase, better results are achieved by applying the
deep learning method. The detailed training and testing results are shown in Table 9.
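The accuracy and error columns of the confusion matrices follow directly from the counts. As a worked example, the short Python sketch below recomputes them for the scenario 1 test set, assuming (as the row-wise percentages indicate) that each row is a true class. The function names are illustrative only.

```python
# Counts copied from the scenario 1 test-set confusion matrix (rows = true class).
confusion = [
    [34, 0, 0, 0, 0, 0],
    [2, 34, 0, 0, 0, 0],
    [0, 1, 31, 0, 0, 0],
    [0, 0, 3, 16, 0, 0],
    [0, 0, 0, 0, 22, 0],
    [0, 0, 0, 0, 0, 29],
]

def per_class_accuracy(matrix):
    """Percentage of each true class (row) that lies on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Percentage of all test samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

class_acc = per_class_accuracy(confusion)
overall = overall_accuracy(confusion)
for i, acc in enumerate(class_acc, start=1):
    print(f"C{i}: accuracy {acc:.2f}%, error {100.0 - acc:.2f}%")
print(f"overall accuracy: {overall:.2f}%")
```

For scenario 1, 166 of the 172 test samples lie on the diagonal, which is consistent with the testing accuracy of 95% or higher reported above.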
5.3 Severe erosion detection using clustering method
The last phase of experiments is to find out what degree of accuracy we can get if
we use unsupervised learning methods (the data are not labeled). Here we use two clustering
methods. One is the k-means method, and the other is the Gaussian mixture model (GMM). K-means
assigns each sample to a category based on the distance between sample points:
the closer two objects are, the more similar they are [40]. Gaussian
mixture models, by contrast, also account for covariance. For example, in two dimensions, covariance
To use these two clustering methods, we chose data from three classes,
i.e., the first, fourth, and sixth classes. We then transformed the MFCC features
by taking the mean and standard deviation at every time step, i.e., over the data in the 24 columns of the
MFCC matrix. A mean and a standard deviation are calculated from all the data in each row.
Each MFCC matrix thus yields 14 mean values and 14 standard deviation values, which form a
feature vector of dimension 1 x 28. With these transformed MFCC features, both
displayed in Figure 24. The data are combined from elbows A, B, and C, and three randomly
selected features, the 3rd, 15th, and 24th, are plotted. The results of clustering with
GMM are displayed in Figure 25, where the three randomly selected features were the 3rd, 11th,
and 21st. Both figures show that the three classes of data were clustered.
However, in this thesis we only grouped the data; we could not tell which cluster corresponds to
the most severe erosion. This work will be included in future research.
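The feature transformation described above, from a 14 x 24 MFCC matrix to a 1 x 28 vector, can be sketched in a few lines of Python. The matrix below is random stand-in data, the function name `summarize_mfcc` is hypothetical, and two details are assumptions not stated in the text: the means are placed before the standard deviations, and the sample (N - 1) standard deviation is used.

```python
import random
import statistics

N_TIME_STEPS = 14  # rows of one MFCC matrix, as stated in the text
N_FEATURES = 24    # columns of one MFCC matrix, as stated in the text

def summarize_mfcc(mfcc):
    """Collapse one MFCC matrix into [row means..., row standard deviations...].

    Assumptions: means come first, and the sample standard deviation is used.
    """
    means = [statistics.fmean(row) for row in mfcc]
    stds = [statistics.stdev(row) for row in mfcc]
    return means + stds

# Random stand-in for one MFCC matrix (not measured data).
rng = random.Random(0)
mfcc = [[rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        for _ in range(N_TIME_STEPS)]

feature_vector = summarize_mfcc(mfcc)
print(len(feature_vector))  # 14 means followed by 14 standard deviations
```

Each percussion signal is thereby reduced to a fixed-length vector, which is the form that distance-based clustering methods such as k-means and GMM expect.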
Figure 24: 3D plot showing results of K-means clustering on selected transformed MFCC
features
Figure 25: 3D plot showing results of clustering with GMM on selected transformed
MFCC features
From the experimental results we can see that both clustering methods
grouped the elbow data at severe erosion rates; however, the GMM method is
more accurate than the k-means method. The k-means method reached an overall accuracy
of 49% to 68%, while the GMM method improved the overall accuracy to between 69% and 78%
on scenarios 2 to 4 and to 90% on scenario 1. Clustering results are listed in Table 10.
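To make the k-means procedure concrete, the following Python sketch implements the standard Lloyd iteration on synthetic three-dimensional points. It is an illustration, not the MATLAB routine used for the results in this section. The data are stand-ins, and the deterministic farthest-first seeding in `farthest_first` is our own choice to keep the example reproducible.

```python
import math
import random

def farthest_first(points, k):
    """Deterministic seeding (an assumption): start at points[0], then keep
    adding the point that is farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = farthest_first(points, k)
    assignment = None
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the iteration has converged
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Synthetic stand-in data: three well-separated groups of 20 points each.
rng = random.Random(1)
centres = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
points, truth = [], []
for label, centre in enumerate(centres):
    for _ in range(20):
        points.append([c + rng.uniform(-1.0, 1.0) for c in centre])
        truth.append(label)

assignment, centroids = kmeans(points, k=3)
print(sorted(set(assignment)))
```

Note that the cluster indices returned by k-means are arbitrary: the algorithm separates the groups but does not say which group corresponds to which erosion level, which is the limitation noted above.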
Table 10: Classification results of unsupervised learning methods
              K-MEANS                                               GMM-CLUSTERING
              Class1  Class2  Class3  Single class       Overall    Class1  Class2  Class3  Single class       Overall
                                      accuracy (%)  accuracy (%)                            accuracy (%)  accuracy (%)
Scenario 1        99      44       1         68.75                     144       0       0        100.00
                  30     103      11         71.53         68.06        17     103      24         71.53         90.28
                  19      33      92         63.89                       1       0     143         99.31
Scenario 2        53      84       7         36.81                      92      50       2         63.89
                   3      63      78         43.75         49.31        46      94       4         65.28         69.21
                  33      14      97         67.36                      31       0     113         78.47
Scenario 3        61      56      27         42.36                      92      46       6         63.89
                   9     103      32         71.53         67.36         3     127      14         88.19         77.78
                   8       9     127         88.19                       0      27     117         81.25
Scenario 4        58      86       0         40.28                      95      49       0         65.97
                  15      98      31         68.06         66.44        49      95       0         65.97         77.31
                   7       6     131         90.97                       0       0     144        100.00
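Because cluster indices carry no class meaning, scoring a clustering result requires matching clusters to classes first. How this matching was done for Table 10 is not stated. The Python sketch below shows one common approach, assumed here for illustration: try every one-to-one matching and keep the one with the most agreeing samples. It is applied to the scenario 1 counts of Table 10, reading each row as a class and each column as a cluster.

```python
from itertools import permutations

# Scenario 1 counts copied from Table 10 (rows = classes, columns = clusters).
kmeans_counts = [[99, 44, 1], [30, 103, 11], [19, 33, 92]]
gmm_counts = [[144, 0, 0], [17, 103, 24], [1, 0, 143]]

def best_cluster_mapping(counts):
    """Try every one-to-one class-to-cluster matching and keep the best one.

    counts[i][j] is the number of samples of class i placed in cluster j.
    Returns (mapping, accuracy_percent); mapping[i] is the cluster matched
    to class i.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    best = max(permutations(range(n)),
               key=lambda perm: sum(counts[i][perm[i]] for i in range(n)))
    matched = sum(counts[i][best[i]] for i in range(n))
    return best, 100.0 * matched / total

kmeans_mapping, kmeans_acc = best_cluster_mapping(kmeans_counts)
gmm_mapping, gmm_acc = best_cluster_mapping(gmm_counts)
print(f"k-means: mapping {kmeans_mapping}, overall accuracy {kmeans_acc:.2f}%")
print(f"GMM:     mapping {gmm_mapping}, overall accuracy {gmm_acc:.2f}%")
```

For both methods the best matching is the diagonal one, and the resulting overall accuracies agree with the scenario 1 values listed in Table 10. The exhaustive search is practical only for a small number of clusters, since the number of matchings grows factorially.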
6 Conclusion and Future Work
6.1 Conclusion
grinding off mass on the elbow’s internal wall and applied different machine learning
algorithms, namely KNN, DT, SVM, RNN, and clustering. For KNN, DT, and SVM,
we obtained training and testing accuracies in the range of 70% to 80%. Then,
using the MFCC matrix with its time sequence as the feature, we applied a deep neural
network, RNN, and achieved a very high accuracy of above 95%. Finally,
unsupervised learning in the form of clustering was explored. The mean and standard deviation of all
features at each MFCC time step were calculated as input features, and k-means
clustering and clustering with a Gaussian mixture model were used. For k-means
clustering, the final accuracy was not stable, varying widely from 49% to 68%. To
improve the results, GMM clustering was applied to the same features and achieved better
performance, with accuracy ranging from 70% to 90%.
To conclude, among these machine learning algorithms, the MFCC plus deep
learning algorithm, RNN, had the best accuracy on both training and testing data.
During the experiments, we observed one limitation of the percussion
method in detecting the erosion rate of the elbow’s wall thickness. The percussion method is
very sensitive to mass loss from the inner wall of the elbow. Any tiny difference in mass or
grinding location affects the PSD of the collected sound signals. This limitation
decreases the accuracy, especially on independent testing data. Given this limitation,
we recommend that this percussion method be used to detect critical erosion
Besides the limitation mentioned in the above paragraph, other
factors, such as corrosion, line pressure, and sediment in the pipe, also affect the detection
results for the elbow’s wall thickness. Among these factors, corrosion and erosion normally
occur together and jointly affect the elbow’s wall thickness. Static and dynamic line pressure
can also affect the energy of the percussion sound, and this factor needs to be considered when
evaluating the rate of combined erosion and corrosion. Sediment in pipeline is
thickness. All the above factors need to be examined and compensated for when determining
the real wall thickness of elbows in service. Research in these areas can be performed
in the future.
also widely used in data classification. LDA has an analytical solution while SVM has a
numerical solution. Using LDA to analyze and train multi-class percussion data and
References
[1] Li, N., Wang, F., and Song, G. “A Feasibility Study on Elbow Erosion Monitoring
Intelligent Material Systems and Structures, Vol. 32, No. 5, 2021, pp. 584–596.
https://doi.org/10.1177/1045389X20963172.
[2] Peng, W., and Cao, X. “Numerical Prediction of Erosion Distributions and Solid
https://doi.org/10.1016/j.jngse.2016.02.008.
[3] Hassani, S. Solid Particle Erosion, Sand Monitoring and Transport in Oil and Gas
Production. https://www.linkedin.com/pulse/solid-particle-erosion-sand-
[5] Zhang, J., Kang, J., Fan, J., and Gao, J. “Study on Erosion Wear of Fracturing
Pipeline under the Action of Multiphase Flow in Oil & Gas Industry.”
Journal of Natural Gas Science and Engineering, Vol. 32, 2016, pp. 334–346.
https://doi.org/10.1016/j.jngse.2016.04.056.
[6] Wang, Z. D., Gu, Y., and Wang, Y. S. “A Review of Three Magnetic NDT
[7] Datta, S., and Sarkar, S. “A Review on Different Pipeline Fault Detection
Methods.” Journal of Loss Prevention in the Process Industries, Vol. 41, 2016, pp.
97–106. https://doi.org/10.1016/j.jlp.2016.03.010.
https://doi.org/10.1016/j.measurement.2017.07.058.
[9] Udod, V. A., Van, Ya., Osipov, S. P., Chakhlov, S. v., Usachev, E. Yu., Lebedev,
[10] Alobaidi, W. M., Alkuam, E. A., Al-Rizzo, H. M., and Sandgren, E. “Applications
Journal of Operations Research, Vol. 05, No. 04, 2015, pp. 274–287.
https://doi.org/10.4236/ajor.2015.54021.
[11] Tsangouri, E., Karaiskos, G., Aggelis, D. G., Deraemaeker, A., and van
Healing System Using Embedded Piezoelectric Transducers.” Structural Health
https://doi.org/10.1177/1475921715596219.
[12] Liu, T., Zou, D., Du, C., and Wang, Y. “Influence of Axial Loads on the Health
https://doi.org/10.1177/1475921716670573.
[13] Peairs, D. M., Park, G., and Inman, D. J. “Improving Accessibility of the
Material Systems and Structures, Vol. 15, No. 2, 2004, pp. 129–139.
https://doi.org/10.1177/1045389X04039914.
[14] Zhang, T., Biswal, S., and Wang, Y. “SHMnet: Condition Assessment of Bolted
https://doi.org/10.1177/1475921719881237.
[15] Farinholt, K. M., Miller, N., Sifuentes, W., MacDonald, J., Park, G., and Farrar, C.
Sensor Nodes.” Structural Health Monitoring, Vol. 9, No. 3, 2010, pp. 269–280.
https://doi.org/10.1177/1475921710366647.
[16] Talakokula, V., Bhalla, S., and Gupta, A. “Monitoring Early Hydration of
https://doi.org/10.1016/j.ymssp.2017.05.042.
[17] Li, W., Fan, S., Ho, S. C. M., Wu, J., and Song, G. “Interfacial Debonding
[18] Wang, F., Mobiny, A., van Nguyen, H., and Song, G. “If Structure Can Exclaim:
Looseness Detection.” Structural Health Monitoring, Vol. 20, No. 4, 2021, pp.
1597–1608. https://doi.org/10.1177/1475921720923147.
[19] Kong, Q., Zhu, J., Ho, S. C. M., and Song, G. “Tapping and Listening: A New
[20] Wang, F., and Song, G. “Looseness Detection in Cup-Lock Scaffolds Using
103266. https://doi.org/10.1016/j.autcon.2020.103266.
[21] Yuan, R., Lv, Y., Kong, Q., and Song, G. “Percussion-Based Bolt Looseness
Smart Materials and Structures, Vol. 28, No. 12, 2019, p. 125001.
https://doi.org/10.1088/1361-665X/ab3b39.
[22] Zhou, Y., Wang, S., Zhou, M., Chen, H., Yuan, C., and Kong, Q. “Percussion
Reconstruction.” Structural Control and Health Monitoring, Vol. 29, No. 2, 2022.
https://doi.org/10.1002/stc.2876.
[23] Chen, L., Xiong, H., Sang, X., Yuan, C., Li, X., and Kong, Q. “An Innovative
Columns Using Percussion Sound.” Structural Health Monitoring, Vol. 21, No. 3,
[24] Kong, Q., Ji, K., Gu, J., Chen, L., and Yuan, C. “A CNN-Integrated Percussion
https://doi.org/10.1177/14759217221082007.
[25] Wang, F., Chen, X., and Song, G. Percussion-Based Detection of Bolt Looseness
Using Speech Recognition Technology and Least Square Support Vector Machine.
2020.
[26] Wang, F., Song, G., and Mo, Y. “Shear Loading Detection of through Bolts in
https://doi.org/10.1111/mice.12602.
[27] Cheng, H., Wang, F., Huo, L., and Song, G. “Detection of Sand Deposition in
https://doi.org/10.1177/1475921720918890.
[28] Wang, F., and Zhu, R. “Detection of Bolt Head Corrosion under External
https://doi.org/10.1007/s11071-022-07390-x.
[29] Yuan, F.-G., Zargar, S. A., Chen, Q., and Wang, S. Machine Learning for
[30] Malekloo, A., Ozer, E., AlHamaydeh, M., and Girolami, M. “Machine Learning
and Structural Health Monitoring Overview with Emerging Technology and High-
Dimensional Data Source Highlights.” Structural Health Monitoring, Vol. 21, No.
[31] Bao, Y., and Li, H. “Machine Learning Paradigm for Structural Health
Monitoring.” Structural Health Monitoring, Vol. 20, No. 4, 2021, pp. 1353–1372.
https://doi.org/10.1177/1475921720972416.
[32] Purwins, H., Li, B., Virtanen, T., Schlüter, J., Chang, S., and Sainath, T. “Deep
https://doi.org/10.1109/JSTSP.2019.2908700.
[33] Latif, S., Cuayáhuitl, H., Pervez, F., Shamshad, F., Ali, H. S., and Cambria, E. “A
[36] Hossan, Md. A., Memon, S., and Gregory, M. A. A Novel Approach for MFCC
[37] Liu, Z., Wu, Z., Li, T., Li, J., and Shen, C. “GMM and CNN Hybrid Method for
https://doi.org/10.1109/TII.2018.2799928.
[38] Rejaibi, E., Komaty, A., Meriaudeau, F., Agrebi, S., and Othmani, A. “MFCC-
and Assessment from Speech.” Biomedical Signal Processing and Control, Vol.
[39] Liu, J., Yin, Y., Jiang, H., Kan, H., Zhang, Z., Chen, P., Zhu, B., and Wang, Z.
Bowel Sound Detection Based on MFCC Feature and LSTM Neural Network.
2018.
2022.
20, 2022.
https://towardsai.net/p/programming/decision-trees-explained-with-a-practical-
[45] Support Vector Machine. https://en.wikipedia.org/wiki/Support_vector_machine.
[46] Multiclass Model for Support Vector Machines (SVMs) and Other Classifiers -
MATLAB.
https://www.mathworks.com/help/stats/classificationecoc.html;jsessionid=c634e18
[47] Ayvaz, U., Gürüler, H., Khan, F., Ahmed, N., Whangbo, T., and Akmalbek
https://doi.org/10.32604/cmc.2022.023278.
http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-
[51] Long Short-Term Memory Networks - MATLAB & Simulink.
https://www.mathworks.com/help/deeplearning/ug/long-short-term-memory-
20, 2022.
https://www.mathworks.com/help/stats/clustering-using-gaussian-mixture-
[56] Zhang, M., Chen, X., and Li, W. “A Hybrid Hidden Markov Model for Pipeline
https://doi.org/10.3390/app11073138.
[57] Zhang, M., Chen, X., and Li, W. “Hidden Markov Models for Pipeline Damage
021-00481-0.