Fault Detection in AHU Sensors Using ML
Behrad Bezyan, Radu Zmeureanu
Centre for Net-Zero Energy Buildings Studies, Department of Building, Civil and Environmental Engineering, Concordia University, Montreal, Quebec, Canada
Abstract

This paper presents the development of machine learning models for multiple fault detection and diagnosis of the air temperature sensors of an air handling unit (AHU). For this purpose, air temperatures at critical points of the AHU, such as the mixed air temperature and the air temperature after the heating coil, are predicted. A compound artificial neural network (ANN) model is proposed for the prediction of the air temperature sensor values; a fault is detected if the difference between the measured and expected values exceeds the defined threshold. For fault diagnosis, a Recurrent Neural Network (RNN), as a deep learning model, and a shallow Feedforward Neural Network are developed to predict the air temperature at the current time step (t) from the previous measurements of that target sensor; faults are diagnosed if the residuals exceed the threshold. This paper uses synthetic hourly data from the simulation of an institutional building with the eQuest program as a proxy for real measurements. The models developed in this paper will be used in future work and tested with real measurements for multiple fault detection and diagnosis (MFDD) purposes.

Introduction

In the United States, almost 40% of the total annual energy use belongs to the building sector (Beiter, Elchinger, Tian, 2017). It is therefore important to design and operate energy-efficient buildings. The building performance should be monitored in order to detect any potential fault in the system and to diagnose the sources of malfunction in the heating, ventilation, and air conditioning (HVAC) systems. About 15 to 30% of the energy in commercial buildings is wasted if HVAC systems are not maintained regularly or are inappropriately controlled, and if the system degradation is not detected at an early stage (Katipamula and Brambley, 2005). In order to monitor the building energy performance and record the HVAC system operation in real time, a building energy management system (BEMS) should be used, and databases for the operation of each piece of equipment should be maintained and used.

A malfunction in the system can be detected by applying two fault detection and diagnosis approaches that use the historical databases of building performance (Katipamula and Brambley, 2005): 1) process history-based, and 2) qualitative model-based. History-based models are data-driven models, classified into supervised and unsupervised learning-based models, that use historical measured data of the system operation. Qualitative models are knowledge-based models of the fundamental physics-based operation of the system (Zhao, Li, Zhang, Zhang, 2019).

Machine learning, a data-driven method in the artificial intelligence field, gives the computer the ability to learn without being explicitly programmed (Samuel, 1959). An artificial neural network (ANN) is a machine learning technique that is used for prediction and forecasting purposes.

Many researchers have used ANN models for the prediction of variables or indices of performance in building HVAC systems. Such models can be used to predict the reference or typical operating performance, which is useful for fault detection. If the difference between the measurement in an HVAC system and the ANN prediction exceeds the defined threshold, a fault is detected.

Some researchers have focused on the development of different ANN algorithms, such as single, auxiliary, and backpropagation ANN, using MATLAB and Python. They considered the input time lags and the number of hidden neurons for the prediction or forecasting of sensor values of the HVAC system or of building energy use, for instance, forecasting the electrical use in commercial buildings (Chae, Horesh, Hwang, Lee, 2016). Another example is the prediction of sensor values, such as the air temperature in an AHU with a variable air volume (VAV) system, for fault detection and diagnosis purposes, and their comparison with measurements (Lee, House, Park, Kelly, 1996). An ANN model was also combined with the subtractive clustering method, and the fault detection performance was reported to be acceptable (Du, Fan, Jin, Chi, 2014).

A sensitivity analysis was conducted in order to select as inputs the variables that have the most impact on the accuracy of predictions (Fan, Du, Jin, Yang, Guo, 2010). Adaptive ANN is another model that was used for the estimation of the solar hot water system with application of the
generated data set from a TRNSYS simulation (He, Menicucci, Caudell, Mammoli, 2011), (He, Caudell, Menicucci, Mammoli, 2012).

The Levenberg-Marquardt algorithm for ANN was applied to predict the values in a vapor compression refrigeration system (Koçyigit, 2015). The combination of ANN and a fuzzy inference system found eight faults in the system.

Two hyperparameters, the number of hidden layers and the number of hidden layer neurons, were considered in a deep neural network model for fault detection and diagnosis (FDD) by (Heo and Lee, 2018). They concluded that increasing the network size does not improve the fault detection accuracy beyond about 97.26%.

Different ANN algorithms, such as feedforward and cascade, were used for the prediction of outdoor air temperature using the measured data set from four European cities. By comparing the predicted and measured outdoor air temperature, the authors concluded that the developed models have accurate prediction performance over the prediction time horizon of 4 to 24 hours, with RMSE less than 2°C and R² higher than 90% (Papantoniou and Kolokotsa, 2016).

Auto-Associative ANN (AANN) was developed by (Elnour, Meskin, Al-Naemi, 2020) for sensor prediction and then compared with a PCA-based algorithm for fault detection and diagnosis. They concluded that AANN would improve the accuracy of multiple and simultaneous fault detection.

Geyer et al. (2018) proposed a component-based approach using ANN for the prediction of energy performance in heating and cooling modes. They used the physical properties of building components as the inputs to the ANN of each component in order to predict the conduction heat transfer rate. Furthermore, they used the summation of the outputs, along with other input parameters, as the input to the next ANN for the prediction of heating and cooling loads.

Recurrent neural networks (RNN) were developed for simultaneous fault detection and diagnosis of sensors in the HVAC system using simulation and real measured databases (Shahnazari, Mhaskar, House, Salsbury, 2019).

ANN models work very well for the prediction of air temperature and for FDD in the AHU (Hou, Lian, Yao, Yuan, 2006). A shallow feedforward ANN is proposed for fault detection, as a robust machine learning algorithm for the prediction of the sensor values in the AHU, as compared with other models presented in (Chae, Horesh, Hwang, Lee, 2016).

In this paper, first, a feedforward compound ANN model is used for fault detection in an air handling unit. A fault is detected if the difference between the measurements of the target values (e.g., the mixed air temperature) and the ANN predictions exceeds the pre-defined threshold. Second, a Recurrent Neural Network (RNN) model, using long short-term memory (LSTM) as a deep learning approach, is used for fault diagnosis. The RNN is developed to predict each regressor (e.g., the outdoor air temperature, the return air temperature) in order to compare the predicted values with the measurements; if the difference exceeds a pre-defined threshold, the corresponding sensor is identified as the faulty sensor.

A shallow feedforward neural network (FFNN) is also used for the fault diagnosis and compared with the RNN model. Other machine learning models are also used, such as support vector regression (SVR), decision tree, and random forest. Their prediction performance for air temperature is compared with that of the ANN model. The proposed machine learning models are developed using scikit-learn (Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss, Dubourg, Vanderplas, Passos, Cournapeau, Brucher, Perrot, 2011), Keras (Chollet, 2015) and TensorFlow (Abadi, Barham, Chen, Chen, Davis, Dean, Devin, Ghemawat, Irving, Isard, 2015), which are open-source libraries, in Spyder with Python 3.8 (Python, 3.8.1).

Case study

In this study, synthetic hourly data from the eQuest simulation program (Mihai, 2014) of an institutional building located at Concordia University, Montreal, over the heating season from January 2nd to March 31st, are extracted and used for the development and validation of the models. The dataset is available online in a GitHub repository (Bezyan and Zmeureanu, 2020). Figure 1 shows the schematic of the AHU, and Table 1 lists the available variables for this study.

The synthetic data replace the measurements as the first step in the development and testing of the prediction models because they are free of errors. The model prediction performance should normally be higher when using synthetic data than when using measurements from existing buildings. Therefore, these results should be considered as the upper threshold of performance.

The target values used for the fault detection are the mixed air temperature (T_ma), the air temperature after the heating coil (T_ahc) and the supply air temperature (T_sa) from the AHU.

Table 1: List of AHU variables.
No.  Variable   Description                               Unit
1    T_oa_db    Outdoor air dry-bulb temperature          °C
2    V_oa       Outdoor air volumetric flow rate          m³/s
3    T_ra       Return air temperature                    °C
4    V_ra       Return air volumetric flow rate           m³/s
5    T_ma       Mixed air temperature                     °C
6    V_ma       Mixed air volumetric flow rate            m³/s
7    T_ahc      Air temperature after heating coil        °C
8    Q_HC       Electric input for heating coil (Q̇)       kW
9    ΔT_s,fan   Air temperature rise over supply fan      °C
10   T_sa       Supply air temperature                    °C
Figure 1: AHU schematic.

Method

Fault detection

A compound ANN architecture of the air handling unit (AHU) is proposed (Figure 2) that is composed of two ANNs. ANN#1 predicts the mixed air temperature (T_ma) at time (t) using the related input variables at time (t). The output of ANN#1 is an input to ANN#2, which predicts the air temperature after the heating coil (T_ahc) at the same time (t). Then the supply air temperature (T_sa) at that time (t) is predicted.

The measurements of the regressors at the current time (t) ($x_{1,m}^t$, $x_{2,m}^t$, …) are the inputs to the ANN, and the target value ($Y_p^t$) at the same time (t) is the output (Equation 1).

$$Y_p^t = f\left(x_{1,m}^t, x_{2,m}^t, \dots, x_{n,m}^t\right) \tag{1}$$

where $x_{i,m}^t$ are the measured values of the regressors at time (t), and $Y_p^t$ is the predicted target variable at time (t).

Figure 2: Compound ANN for prediction of T_ma and T_ahc in heating mode of AHU.

Feedforward ANN is used for both ANN#1 and ANN#2, using Python 3.8.

The following steps are taken:

Several ANN architectures are compared, with the number of training days from 1 to 30 and the number of hidden layer neurons from 1 to 10; in all cases, there is only one hidden layer. The Sigmoid activation function is used for the hidden layer neurons and the output node.

1. Each architecture is evaluated ten times; the average RMSE (between the predictions and measurements) of […] 2018) based on previous measurements of the same AHU.
   The number of hidden layer neurons (N) is selected based on (Heaton, 2008).

$$N = \frac{N_{Input} + N_{Output}}{2} \tag{2}$$

   where $N_{Input}$ is the number of inputs and $N_{Output}$ is the number of outputs.
2. The optimum ANN#1 architecture for the prediction of T_ma over the time horizon of three days uses 72 hours of training data and has three hidden layer neurons (Figure 3).
3. The mixed air temperature T_ma is predicted at time (t) by using the regressors that are measured at the same time (t). If the difference between the measurement of T_ma and the prediction of T_ma obtained with the proposed ANN#1 does not exceed the threshold value, there is no fault with the T_ma sensor, and the measured T_ma is used as an input to ANN#2. If the difference exceeds the threshold value, there is a fault with the T_ma sensor, and the predicted T_ma is used as an input to ANN#2.
4. The same approach is used for ANN#2, which is trained by using the measurements over the past 72 hours. The optimum ANN#2 for the prediction of T_ahc at the current time (t) requires 72 hours of training data and two hidden layer neurons (Figure 3).
5. The air temperature T_ahc after the heating coil is predicted at time (t) by using the regressors V_ma and Q_HC that are measured at the same time (t), and T_ma, predicted or measured (as discussed in item 3 above). If the difference between the measured T_ahc and the ANN#2 prediction of T_ahc does not exceed the threshold value, there is no fault with the T_ahc sensor. If the difference exceeds the threshold value, the predicted T_ahc is used as an input to predict T_sa.
6. The supply air temperature (T_sa) is predicted at the same time (t) by adding the air temperature rise over the supply fan (ΔT_s,fan) to the T_ahc value. In this study, ΔT_s,fan = 1.8°C, based on previous measurements of the same AHU (Zibin, 2014).

[Figure 3 (schematic labels): ANN#1 with inputs T_oa_db, T_ra, V_ma, V_ra, V_oa at time (t), 72 hours training and 3 hidden neurons predicts T_ma at the same time step; ANN#2 with inputs V_ma, Q_HC and T_ma, 72 hours training and 2 hidden neurons predicts T_ahc; T_sa = T_ahc + ΔT_s,fan (1.8 °C).]
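The detection steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `ann1` and `ann2` stand for any trained predictors (here replaced by hypothetical stand-in functions), the function and class names are invented for this sketch, the threshold values are placeholders because the source does not give them, and the use of the measured T_ahc when no fault is detected is assumed by analogy with item 3.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

DT_SUPPLY_FAN = 1.8  # °C, air temperature rise over the supply fan quoted in the text

Predictor = Callable[[Sequence[float]], float]


def heaton_hidden_neurons(n_inputs: int, n_outputs: int) -> int:
    """Equation 2. Rounding to an integer is an assumption of this sketch."""
    return round((n_inputs + n_outputs) / 2)


@dataclass
class DetectionResult:
    t_ma_fault: bool
    t_ahc_fault: bool
    t_ma_used: float   # value of T_ma passed on to ANN#2
    t_ahc_used: float  # value of T_ahc used to compute T_sa
    t_sa_pred: float


def detect_faults(ann1: Predictor, ann2: Predictor,
                  regressors_ann1: Sequence[float],
                  v_ma: float, q_hc: float,
                  t_ma_meas: float, t_ahc_meas: float,
                  thr_t_ma: float, thr_t_ahc: float) -> DetectionResult:
    # Item 3: compare measured and predicted T_ma; pass the predicted
    # value to ANN#2 only when the T_ma sensor is flagged as faulty.
    t_ma_pred = ann1(regressors_ann1)
    t_ma_fault = abs(t_ma_meas - t_ma_pred) > thr_t_ma
    t_ma_used = t_ma_pred if t_ma_fault else t_ma_meas

    # Item 5: same test for T_ahc, with inputs Q_HC, T_ma, V_ma.
    t_ahc_pred = ann2([q_hc, t_ma_used, v_ma])
    t_ahc_fault = abs(t_ahc_meas - t_ahc_pred) > thr_t_ahc
    # Assumption: the measured T_ahc is used when no fault is detected.
    t_ahc_used = t_ahc_pred if t_ahc_fault else t_ahc_meas

    # Item 6: add the fixed temperature rise over the supply fan.
    return DetectionResult(t_ma_fault, t_ahc_fault, t_ma_used,
                           t_ahc_used, t_ahc_used + DT_SUPPLY_FAN)


# Hypothetical stand-ins for trained networks and hypothetical readings.
ann1_stub = lambda x: 18.0
ann2_stub = lambda x: x[1] + 10.0
example = detect_faults(ann1_stub, ann2_stub, [-5.0, 22.0, 10.0, 7.0, 3.0],
                        v_ma=10.0, q_hc=120.0,
                        t_ma_meas=20.0, t_ahc_meas=28.1,
                        thr_t_ma=0.5, thr_t_ahc=0.5)
```

With five inputs and one output, Equation 2 gives three hidden neurons, and with three inputs and one output it gives two, which agrees with the ANN#1 and ANN#2 architectures reported in items 2 and 4.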
Fault diagnosis

Two different approaches are developed and compared for the multi-fault diagnosis: 1) Recurrent Neural Network (RNN) with long short-term memory (LSTM), and 2) Feedforward Neural Network (FFNN). Both methods were developed in Python (3.8.1).

The application of the fault diagnosis method is explained by assuming a case in which an abnormal measurement is detected from the T_ma sensor at 2 P.M. Either the T_ma sensor is faulty, or one or more readings of the regressors (i.e., T_oa, T_ra, V_ma, V_ra, V_oa) (Figure 3) are faulty. The predicted values of each regressor at 2 P.M., which are obtained under normal operation conditions based on past monitored values, are compared with the measured values. If the differences between predicted and measured values for one or several sensors exceed the defined threshold, then the faulty regressor sensor(s) is diagnosed. If none of the regressors are faulty, then it can be assumed that the T_ma sensor is faulty.

For fault diagnosis, a single network for the prediction of each variable at time (t) is developed for each of the FFNN and RNN models (Figures 4 and 5). The past monitored values of the target variable at (t-1, t-2, …, t-n) are used as the input values. For example, the FFNN model uses the recorded values of T_ra over the past five hours (t-1, t-2, …, t-5) to predict T_ra at time (t) (Figure 5).

Recurrent Neural Network (RNN)

The Recurrent Neural Network (RNN), a deep learning model with long short-term memory (LSTM) architecture, is applied to the normalized data set (Equation 3). The RNN model is developed and used for the prediction of $X_p^t$ at time (t) by using as inputs the measurements of the previous 10 hours (i.e., t-1, t-2, …, t-10) (Equation 4).

An RNN uses the previous sequential information to learn and to predict the present values based on the trained model. The LSTM architecture has a chain-like structure of neural networks and is able to learn long-term dependencies. The LSTM is capable of adding, storing and removing the information that is helpful for the prediction (Hochreiter and Schmidhuber, 1997).

$$X_{normalized} = \frac{X - \min(X)}{\max(X) - \min(X)} \tag{3}$$

where $X$ is the value at time (t); $\min(X)$ and $\max(X)$ are the minimum and maximum values of the selected data set.

$$X_p^t = f\left(X_m^{t-1}, X_m^{t-2}, \dots, X_m^{t-10}\right) \tag{4}$$

where $X_m$ is the measured value at the previous time steps (t-n).

The RNN model architecture, which was developed by trial and error, consists of one input layer, four hidden layers with 50 hidden neurons in each hidden layer, and one output layer. The Sigmoid activation function is used on the hidden layer neurons and the output layer node.

The RNN model is trained using a sliding window technique over 62 hours to predict T_ra at time (t). The RNN prediction model architecture for the return air temperature T_ra is presented as an example in Figure 4.

The results of the RNN model are compared with those from a shallow feedforward neural network (FFNN), which is presented in the following section.

Figure 4: Recurrent Neural Network model for prediction of T_ra at time (t) in heating mode of AHU. [Schematic labels: inputs T_ra at t-1, t-2, …, t-10; 62 hours training (sliding window); 4 hidden layers with 50 hidden neurons in each layer; target T_ra at (t).]

Feedforward Neural Network (FFNN)

The FFNN predicts $X_p^t$ at time (t) by using as inputs the measurements of the previous 5 hours (i.e., t-1, t-2, …, t-5) (Equation 5). The schematic of the FFNN for T_ra is illustrated in Figure 5.

$$X_p^t = f\left(X_m^{t-1}, X_m^{t-2}, \dots, X_m^{t-5}\right) \tag{5}$$

where $X_m$ is the measured value of the regressor at the previous time steps (t-n).

Figure 5: Feedforward Neural Network for prediction of T_ra at time (t) in heating mode of AHU. [Schematic labels: inputs T_ra at t-1, t-2, …, t-5; 67 hours training (sliding window); 1 hidden layer with 3 hidden neurons; target T_ra at (t).]

The FFNN architecture, which was obtained by trial and error, consists of one input layer with five inputs, one hidden layer with 4 to 6 hidden layer neurons (depending on the regressor), and one output neuron.
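The data preparation implied by Equations 3 to 5 can be illustrated with a short NumPy sketch: min-max normalization followed by the construction of lagged input windows. The function names are hypothetical, and the sketch assumes a single hourly series with a non-constant range; it is not taken from the authors' code.

```python
import numpy as np


def min_max_normalize(x):
    """Equation 3: scale a series to [0, 1] using its own min and max.

    Assumes max(x) != min(x); a constant series would divide by zero.
    """
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min), x_min, x_max


def make_lag_windows(series, n_lags):
    """Build inputs (t-1, ..., t-n_lags) and targets (t), as in Equations 4 and 5.

    Column 0 of X holds the value at t-1, column 1 the value at t-2, and so on.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_lags
    if n_samples <= 0:
        raise ValueError("series must be longer than n_lags")
    X = np.stack([series[i:i + n_lags][::-1] for i in range(n_samples)])
    y = series[n_lags:]
    return X, y


# Hypothetical hourly T_ra values, for illustration only.
t_ra = np.array([21.0, 21.2, 21.5, 21.4, 21.8, 22.0, 22.1])
t_ra_norm, t_ra_min, t_ra_max = min_max_normalize(t_ra)
X_ffnn, y_ffnn = make_lag_windows(t_ra_norm, n_lags=5)  # FFNN-style inputs
```

For 72 hourly values, this windowing yields 62 samples with 10 lags and 67 samples with 5 lags. These counts coincide with the training lengths shown in Figures 4 and 5, although the source does not state that the numbers arise this way. For an LSTM layer in Keras, the input array would additionally need a trailing feature axis, for example `X[..., np.newaxis]`.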
Prediction performance

The performances of the proposed prediction models are evaluated with statistical indices (Equations 6 to 10): Coefficient of determination (R²), Root-Mean-Squared-Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Bias Error (MBE), and Maximum Absolute Error (ME_max).

$$R^2 = \left[1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}\right] \cdot 100 \tag{6}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{n}} \tag{7}$$

$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \cdot 100 \tag{8}$$

[…] TN is the number of true negatives, e.g., the readings correctly labelled as no-fault, as represented in a confusion matrix (Table 2).

Table 2: Confusion matrix for fault detection
                                 Actual faults
                                 Negative (0)   Positive (1)
Predicted faults  Negative (0)   TN             FP
                  Positive (1)   FN             TP

Results and discussions

The selected compound ANN architecture (Figure 3) for the prediction of T_ma and T_ahc is applied over 24 hours, using the sliding window technique, and the statistical indices of prediction performance are presented in Table 3.

Table 3: Predictions of the compound ANN model.
(ANN architecture with one hidden layer and 72 hours of training; prediction over validation data set for 24 hours.)
Mode     Target  Input variables                  Hidden layer neurons  R² (%)  RMSE (°C)  MAPE (%)  MBE (°C)  ME_max (°C)
Heating  T_ma    T_oa_db, T_ra, V_ma, V_ra, V_oa  3                     99.06   0.053      0.28      0.04      0.10
Heating  T_ahc   Q_HC, T_ma, V_ma                 2                     97.52   0.130      0.43      0.11      0.24

[…]
m³/s  100    0.000527  0.0195  0.0005  0.0005
m³/s  100    0.001476  0.2194  0.0015  0.0015
m³/s  100    0.000908  0.0447  0.0009  0.0009
kW    84.13  1.700485  3.8161  1.2275  4.8029
°C    83.76  0.307907  0.8015  0.2178  0.7697

The performance indices of the FFNN and RNN models are presented in Tables 5 and 6, respectively. In the FFNN and RNN models, the target prediction variables are the air temperature sensors, the volumetric air flow rates, and the electric input for heating at time (t), and the inputs for each model are the past monitored values of that target variable (t-1, t-2, …, t-n).

Both FFNN and RNN models give acceptable prediction performance for deployment in the fault diagnosis approach. For example, for the prediction of T_ra, the RMSE and MAPE are approximately 0.0202°C and 0.074% using the FFNN model, and 0.0207°C and 0.069% using the RNN model.

Table 6: Predictions of the Recurrent Neural Network.
(RNN model; prediction performance over validation data set for 24 hours. Target variable labels in the table: T_oa, T_ra, V_ma, V_ra, V_oa, Q_HC, T_sa.)
[…]
m³/s  100.00  0.00025  0.0091  0.0002  0.0002
m³/s  100.00  0.00129  0.1913  0.0013  0.0013
m³/s  100.00  0.00190  0.0935  0.0019  0.0019

[…] reading in T_oa at the same time steps. Values for T_ma are generated using grey-box model 1 (Equation 14).

Table 7. Statistical indices of the developed models.
(Training data set: 24 hours on Jan. 4; test data set: 24 hours on Jan. 5.)
No.  Model                            Parameter  Value  Unit   Training R² (%)  Training RMSE (°C)  Test R² (%)  Test RMSE (°C)
1    T_ma = a·T_oa + b·T_ra           a          0.232  -      99.69            0.031               99.68        0.075
                                      b          0.747  -
2    T_ahc = c·Q̇/(d·V_ma) + e·T_ma    c          0.038  °C/kW  99.87            0.091               98.65        0.124
                                      d          0.345  s/m³
                                      e          1.037  -

[…] and expected (predicted) values exceeds the threshold, a fault is detected.
are calculated, and where the residual is higher than the defined threshold, the faulty performance is diagnosed (Figure 6). Hence, the T_ra sensor is diagnosed as the faulty sensor, which results in the abnormal readings of the T_ma sensor.
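The diagnosis rule described in the Fault diagnosis section can be summarized in a short sketch: once an abnormal target reading has been detected, each regressor's residual is tested against its threshold, and the target sensor itself is blamed only if no regressor is flagged. The function name, the readings and the threshold values below are hypothetical and serve only to illustrate the logic; they are not results from the paper.

```python
def diagnose(measured, predicted, thresholds, target="T_ma"):
    """Return the sensors diagnosed as faulty after a fault was detected on `target`.

    measured, predicted, thresholds: dicts keyed by regressor name.
    A regressor is flagged when |measured - predicted| exceeds its threshold.
    If no regressor is flagged, the target sensor is assumed to be faulty.
    """
    faulty = [
        name for name in measured
        if name != target
        and abs(measured[name] - predicted[name]) > thresholds[name]
    ]
    return faulty if faulty else [target]


# Hypothetical readings at the hour when the abnormal T_ma value was detected.
measured = {"T_oa": -5.0, "T_ra": 24.0, "V_ma": 10.0, "V_ra": 7.0, "V_oa": 3.0}
predicted = {"T_oa": -5.1, "T_ra": 22.0, "V_ma": 10.0, "V_ra": 7.0, "V_oa": 3.0}
thresholds = {"T_oa": 0.5, "T_ra": 0.5, "V_ma": 0.1, "V_ra": 0.1, "V_oa": 0.1}

faulty_sensors = diagnose(measured, predicted, thresholds)
```

In this invented example only the T_ra residual exceeds its threshold, so T_ra is returned as the faulty sensor, mirroring the case discussed in the text.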
Figure 7: Difference between prediction and measurement of faulty T_ma and T_ahc over validation data set.

Conclusions

In this paper, the development and application of multiple fault detection and diagnosis models for the air temperature sensors of an AHU were presented. For the fault detection, the compound ANN architecture was proposed for the prediction of the mixed air temperature (T_ma) and the air temperature after the heating coil (T_ahc) at the current time (t). The differences between the predicted and real measured values were calculated, and once the differences exceeded the defined threshold (sensor uncertainty), the fault was detected.
Table 8. Accuracy, precision and sensitivity of the ANN models for MFDD.
(ANN architecture with one hidden layer and 72 hours of training; MFDD over validation data set for 24 hours.)
Mode     Target  Input variables  Hidden layer neurons  Accuracy (%)  Precision (%)  Sensitivity (%)
Heating          T_oa_db, T_ra,

The results reveal a good compound prediction model that can be used as a tool for fault detection of air temperature sensors of an AHU in heating mode. The performance of the ANN was compared with three machine learning models: SVR, decision tree, and random forest. The results revealed that the ANN has better prediction performance and lower RMSE compared with the other models.

For the fault diagnosis, the recurrent neural network (RNN)