0% found this document useful (0 votes)

30 views15 pages

Artificial Intelligence Algorithm For Optimal Time

Uploaded by

qobilov.sirojiddin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

30 views15 pages

Artificial Intelligence Algorithm For Optimal Time

Uploaded by

qobilov.sirojiddin

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.2981488, IEEE Access

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/[Link] Number

Artificial intelligence algorithm for optimal time

series data model
Kang Wang

School of Humanities and Media, Pingxiang University, Pingxiang Jiangxi, 337055, China.

Corresponding Author: Kang Wang, E-mail: kaobowang@[Link]

ABSTRACT In order to solve the limitation of a large number of literatures on the study of modeling,
simulation and prediction of time series data, there is no model selection, and a certain model is directly
used for analysis. For three types of artificial intelligence models often applied to time series analysis:
hidden Markov Carrier model, artificial neural network model and autoregressive moving average model
are used to study model selection based on simulation comparison method. The study of nonlinear
integration methods, using intelligent system methods to learn the weighting mode, has made the model's
generalization ability and the degree is of fit to the sample data have been significantly improved. At the
same time, numerical simulations are performed on various models, and the characteristics of the time
series generated by various models are investigated. Based on the characteristics, the theory and algorithm
of model selection are proposed. The model selection theory and algorithm in this paper is used for
empirical analysis. For the artificial intelligence models commonly used in time series analysis such as
autoregressive moving average model, artificial neural network model, hidden Markov model, etc., when
selecting the research model, the method of simulation comparison can be used. The experimental results
show that the time series data generated by various models have different mathematical and physical
characteristics, which provide a basis for model selection. At the same time, the selection theory is practical.
The model selected by the theory has a good fit and prediction effect. The generation of different models
has different mathematical characteristics of time series data, which also provides a basis for selecting
models.

INDEX TERMS Time series; data model; artificial intelligence algorithm; weight pattern; generalization
ability

I. INTRODUCTION and natural language processing, etc. [[1]. Many intelligent

Time series prediction is an interesting and challenging system technologies such as artificial neural network (ANN),
research topic. The motivation of time series prediction is to support vector machine (SVM), genetic algorithm (GA), and
discover the rules and patterns hidden in the sequence, and to genetic programming have also been widely used in the
make scientific estimates of the future development trend of financial field [2, 3]. Especially the ANN method, in the past
the research object based on the learned knowledge. many years, ANN has been successfully applied to the
Artificial intelligence has universal approximation ability, research topics of time series analysis in the financial field
and is an ideal rule and pattern learning machine that can be [4-7]. A large number of studies have shown that
used to develop more advanced forms of predictors. This feedforward neural networks (BPNN) with a hidden layer
paper is devoted to improving the accuracy of prediction, and can usually approximate any non-linear function with
studying the scalability and advantages of artificial arbitrary accuracy [8-10]. This feature also makes BPNN
intelligence and its combination models on time series widely used to predict complex non-linear systems. In
prediction problems. addition, compared with traditional models, the superior
Artificial intelligence technology has been widely used in learning ability of neural networks for dynamic systems also
robot systems, graphic recognition, knowledge engineering, makes it a more powerful tool for studying financial time

VOLUME XX, 2017 1

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see [Link]
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.2981488, IEEE Access

series. The prediction accuracy of multilayer perceptron and improve the robustness of the iterative estimation model,
(1VB, P), radial basis function network (RBF) and the multiple support vector regression (MSVR) model is used
conditional heteroscedasticity model is empirically compared. in the research of iterative time series analysis [43-46], and it
The results show that in the prediction of exchange rate time is used in three benchmark data sets. The performance of the
series, neural network model and conditional model is compared with SVR model and SVR direct model,
heteroscedasticity model are each of them can give effective which proves the validity of the MSVR model. A multi-
predictions [11-14], but it is clear that the overall output support vector regression based on the multi-input
performance of the neural network is better than the output strategy is proposed creatively [47-49], namely M-
conditional heteroscedastic model. However, neural networks SVR based on the 1VBM0 strategy, and the effectiveness of
tend to fall into a local minimum in the time series prediction the method is simulated with the help of simulated data sets
problem. In response to this defect, BPNN and Adaptive and real data sets. In addition, from the perspective of
Differential Evolution Algorithm (ADE) are combined to prediction accuracy and calculation cost, the performance of
present a hybrid model [15-19], namely ADE- BPNN to three SVR models based on different strategies is compared
improve the goodness of fit of sample data for time series and analyzed. The analysis results show that among the three
analysis. And using two real data sets, the operability and models compared, the M-SVR based on the 1VBM0 strategy
good performance of the proposed hybrid method are is acceptable. At the cost of computing, the best model
confirmed, and the proposed ADE-BPNN method can accuracy can be achieved in multi-step timing analysis
significantly increase the good fitting performance compared problems. Many application literatures directly apply one or
with separate models such as BPNN or ARIMA. Time series more of the models to directly analyze time series data
forecasting has always been a research problem of interest in without establishing a more comprehensive model selection
many application fields, such as stock price forecasting, theory. Before establishing a time series model, analyze and
temperature forecasting, hydrological time series forecasting, compare which data is suitable for use. Class model. In fact,
power load forecasting, network traffic forecasting, and so on. time series data are very different in terms of their own
Time series prediction is to predict the future data or trends characteristics. For example, in terms of autocorrelation,
from historical and current data by analyzing the rules or there are short-term, medium-term, and long-term differences.
trends of time series over time. Time series prediction A model that can only describe short-term correlation is
methods include classical time series analysis [20-23], neural obviously used to analyze time series with long correlation.
networks [24-26], and expert systems [27-30]. Time series The data is inappropriate.
prediction can be performed by mining time series, When model-based data mining methods are used to
discovering sequence rules, and using rule knowledge to process streaming data, the challenging problems are:
predict. An algorithm for discovering sequence rules was automatic selection of models, easy updating of models, and
proposed. The idea of Apriori algorithm in association training samples that appear in streaming form. In the process
analysis was used to mine sequence rules. Three algorithms, of streaming data, data-driven models have a lot of
AprioriAll, AprioriSome, and DynamicSome, were proposed limitations, because the parameters of these models only
[31-33]. An algorithm for finding numerical association rules appear as fitted parameters, without considering the internal
from multiple synchronized data streams is proposed [34]. mechanism that constitutes the data. Time series data are
The clustering method is used to symbolize the time series very different in terms of their own characteristics. For
first, a sequence pattern mining algorithm is used to find the example, in terms of autocorrelation, there are short-term,
rules in the symbol [35, 36]. An evolutionary rule based on medium-term, and long-term differences. A model that can
expert system was proposed, which combined fuzzy logic only describe short-term correlation is obviously used to
and rule inference for the analysis of stock market activities analyze time-series data with long correlation suitable. The
[37, 38]. Using the methods of fuzzy logic, ANN, and purpose of this article is to answer what kind of time series
evolutionary computation, the trend of the Nasdaq-100 index the above three models can describe, so as to select the time
value and the Nasdaq-100 index of six other companies were series data analysis model based on this, and establish a
predicted [39]. A time series association rule discovery computer intelligent time series model selection method. The
algorithm based on the cross-correlation succession tree selection theory of this model is helpful to guide researchers
model was proposed. The method of using sequence rules to to better perform data pre-analysis and pre-processing, and
make predictions is limited by the knowledge of domain improve the efficiency of establishing time series analysis
experts and has certain limitations [40-42]. In general models. Based on computer intelligence, the analysis of time
iterative methods in timing analysis, multi-step estimation is series data is better promoted, making the modeling more
an iteration based on one-step estimation. However, even if targeted and the results more accurate.
the one-step prediction model is very accurate, repeating the
iterative process of one-step prediction will accumulate II. CHARACTERISTIC ANALYSIS OF TIME SERIES
prediction errors, resulting in poor prediction performance. In DATA GENERATED BY VARIOUS MODELS
order to reduce the cumulative error in the iterative process

VOLUME XX, 2017 9

A. Three types of models generating observations. A complete HMM model consists of

(1) Hidden Markov Model 5 parts: a hidden state set (denoted as S), a probability
Hidden Markov models have been successfully applied to transition matrix between hidden states (denoted as A), a
speech recognition, DNA sequence analysis, text information phenotype set of the model output (denoted as V), and the
extraction, network path analysis, time series analysis and corresponding state to generate these performance Type
other fields. It is essentially a double random process, probability matrix (denoted as s) and the probability
consisting of a Markov chain random process (transition distribution of the initial hidden state (denoted as p). The
process) with hidden states and a random process of topology is shown in Figure 1.

FIGURE 1 Topological structure of hidden Markov model

memory of past patterns and the memory of past noise. The

(2) Artificial neural network model basic model is given by:
Artificial neural networks have developed different yt  c  1 yt 1  ... p yt  p  ...q  t q (1)
network organization forms such as mapping networks, self- Where  t (t  1, 2...) -white noise, yt (t  1, 2...) a data set
organizing networks, recursive networks, and time for regression analysis (which can be in a multidimensional
feedforward networks. Its training algorithm starts from the form), i (i  1, 2.. p), j ( j  0,1...q) represents the
back propagation algorithm ((BP algorithm), and its basic
coefficient to be sought. For the calculation of coefficients in
structure is shown in Figure 2.
this model, the most widely used algorithms are the
maximum likelihood estimation method and the moment
method.

B. Characteristic analysis of time series data

The method of simulation analysis is used to confirm the
following conjectures about the characteristics of the time
FIGURE 2 Structure of neural network (MLP type) series data generated by the three models: Hidden Markov
Based on the network model, it has been applied to the models are short-term related and are not stable; artificial
research of numerical calculation problems, engineering neural networks will use historical information due to the
control problems, and signal processing problems, including trained algorithm, which is reflected in Among the weights,
time series data modeling and prediction problems, and some the results should show long-term correlation, and it is
practical results have been obtained. difficult to determine whether it is stable or not.
(3) Autoregressive moving average model Autoregressive moving average models have been
Auto-regression and moving average (ARMA) is a model theoretically proven that under certain parameters, it
specifically designed for trend fitting and prediction of time generates stable short-term correlation time series. Since the
series data. It was proposed by Box and Jenkins for the autoregressive moving average model has theories and a lot
modeling of stationary time series. Its model is a combination of research to reveal its data characteristics, the following is
of an autoregressive model (AR) and a moving average to design the model and verify the conjecture of the data
model (MA), which respectively describes the system's characteristics of the first two models by simulation. Since
the autoregressive moving average model has theories and a

VOLUME XX, 2017 9

lot of research to reveal its data characteristics, the following

is to design the model and verify the conjecture of the data
characteristics of the first two models by simulation.
(1) Features of time series data generated by hidden
Markov models
Given a hidden Markov model with 3 hidden states, the
transition probability is given by the following matrix:
 0.5, 0.3, 0.2 
 0.1, 0.7, 0.2  (2)
 
 0.2, 0.4, 0.4 
Let the initial probability distribution be ((0.3,0.3,0.4)T,
and the probability distribution of the data generated by
hidden state 1 satisfy the probability distribution with a mean
of 1 and a standard deviation of 0.5, that is, N (-1,0.5). Also
define The probability distribution of data generated by FIGURE 4 Artificial Neural Network 1 Generates Time Series Data
Connections
hidden state 2 satisfies N (0, 0.1), and the probability
Similar to the above analysis process, the autocorrelation,
distribution of data generated by hidden state 3 satisfies N
partial correlation analysis and stationary analysis are
(1,0.25). A data generated by the above-mentioned hidden
performed on the time series data, and the following
Markov model is shown in Figure 3.
conclusions are obtained: the data generated by the neural
network model is long memory, showing long-range
autocorrelation, and no attenuation trend. The same KPSS
test was applied, which passed the stationary test at a
confidence level of 0.1. However, if the above neural
network weight matrix is changed to the following form.
 0.2, +0.45, 0.1 
w23    , w12  2, 4  (4)
 +0.3, 0.2, 0.6 
The initial value is also taken as (0.3 0.2 0.9)T. The
connection diagram of the time series is shown in Figure 5.

FIGURE 3 Hidden Markov Model Generates Sequence Data

(2) Characteristics of time series data by artificial neural
network model
Construct a two-layer neural network with 3 input nodes, 2
hidden nodes, and 1 output node. The activation function of
the hidden node is a hyperbolic tangent function, and the
output node is a proportional linear function. White noise is
with distribution N (0, 0.25). The two weight matrices take
the following form:
 0.2, 0.45, 0.1  FIGURE 5 Artificial Neural Network 2 connection for generating time
w23    , w12  2, 4  (3) series data
 0.3, 0.2, 0.6 Its autocorrelation graph and partial correlation graph are
similar to the above. It pays attention to the stationarity test.
The initial value is taken as (0.3 0.2 0.9)T to obtain the It is found that the KPSS test does not meet the original
time series data of Figure. 4. hypothesis of stationarity at a confidence level of 0.1. It can
be seen that the stationarity of the sequence generated by the
neural network depends on the network structure.

C. Model comparison and model selection theory

VOLUME XX, 2017 9

Through discussion, we can find that the data generated by (AR model), moving average model (MA model), and hybrid
these three types of models have the following rules. The model (ARMA model). In essence, the ARMA method is a
data generated by Hidden Markov is related to the first period. linear model with limited parameters. The principle
The KPSS stationarity test is applied, which is not stable. The framework of stationary time series analysis that meets this
model generated by the neural network is long-term related. condition has been perfected, and it is also widely used in
The stability of the generated data depends on the structure of many fields. The ARMA model uses a linear model with
the model; the data generated by the autoregressive moving limited parameters to characterize the autocorrelation of time
average model is short-term correlated and meets the stability series. Not only is it conducive to sequence analysis and
(under certain parameter conditions). Based on the above structure processing, but the finite parameter one-time model
description, the model selection theory here is based on the can also describe very common random phenomena, and the
following premise: the data suitable for a model should be accuracy of the actual fitting can meet the needs of reality. In
consistent with the characteristics of the data generated by addition, the linear prediction theory can be extended from
the model. And this is a necessary condition, because if a the structure of the primary model with limited parameters.
time series data is long-term related, it cannot be generated Therefore, the study of timing analysis problems based on
by a hidden Markov model. If a hidden Markov model is the ARMA model has a theoretically important position in
established for such data, there will not be a good modeling the fields of signal processing, economic prediction, state
effect and prediction. Therefore, the following model estimation, control, and pattern recognition.
selection algorithm is proposed, as shown in Table 1: M

TABLE 1 TIME SERIES DATA MODEL SELECTION PROGRAM PSEUDO CODE ri  0   i ri 1  i  i 1   i
i 1, R i 1
(5)
(1) Input data;
(2) Perform correlation test and stationarity test on the data;  i  ui hi (6)
If the data is a period related then p q

If the data is not stable then hi  k   Ai i21   Gi hi 1 (7)

i 1 i 1
Select hidden Markov model;
The above model is referred to as the
Else if the data is smooth then
ARMA( R , M )  GARCH ( p , q ) model. Where ri is the
Choose an autoregressive moving average model;
End if return on assets,  i is the sequence of random error terms,
Else if the data is short-term related then ii is the coefficient term of the conditional mean equation,
If the data is stationary then ht is the conditional variance, ut is the sequence of iid
Choose an autoregressive moving average model;
random variables, and the mean is 0, the variance is 1, and it
Else if data is not stable, then do data transformations, such as
is independent of ht , k is the intercept term, Ai is the
difference, into stable data, and apply autoregressive moving average model;
End if coefficient of the ARCH term, and Gi is the coefficient of
Else if data is a long-term relevant then directly choose neural network the GARCH term hi21 .
model Based on the ordering theory given by Box and Jenkins,
End if the specific operation of ordering with the help of
(3) End the model selection process autocorrelation coefficients and partial autocorrelation
coefficients can be summarized as follows: When the ACF is
tailed, and the PACF is truncated at the p-th order, consider
III. Data model for optimal time series using the lag order p Autoregressive model, namely AR (p)
A time series is a set of observations. These observations are model. When PACF is tailing and ACF is truncating at q-th
the realization of the same variable at different time points in order, consider using the MA model and q-th order moving
a certain period, such as X :  x1 , x2 ...x n  . They are arranged average model. For mixed-order definite order, EACF can be
in chronological order, and the intervals between time points used to judge the lag order p and q of ARMA method.
are equal. The basic idea of time series analysis can be
summarized as follows: the ultimate goal is to estimate and B. S-BPNN model
infer future time series values, and this goal is to capture the The principle of feedforward neural network can be
laws behind the historical data of time series and use this law summarized as follows: At the input layer of the neural
to predict the future [50, 51]. network, a large amount of non-linear input information is
received by many input network nodes. These input
information are called input vectors. The input signal is
A. Autoregressive moving average model transmitted, analyzed, and weighed in the neuron connection,
The ARMA model, the autoregressive moving average and finally the output signal is generated. The hidden layer is
model, is the most widely used stationary time series analysis all layers composed of all network nodes and connections
model. It has three basic forms, namely autoregressive model between the input layer and the output layer. A neural

VOLUME XX, 2017 9

network can contain multiple hidden layers, and the number Suppose there are m p-dimensional vectors ( x1 , x2 ...xm ) .
of nodes or neurons in the hidden layer is more or less, but The set of all possible linear combinations of these vectors,
the larger the number, the more obvious the nonlinearity of namely k1 x1  ...k m xm , forms a linear space, which is called
the neural network, and the more robust the network model.
the space of x sheets. Similarly, a vector composed of
For a neural network with a network structure of n-m-l, different lag orders of the original time series can also be
that is, it contains n input nodes, m hidden nodes, and l expanded into a new vector space, which is called a set of
output nodes, in the forward propagation process, the input basis that constitutes the prediction space. Proper selection of
signal first flows into the input layer, and then passes along the basis of the prediction space can help solve many
each hidden layer in turn. Finally, it is transmitted to the problems encountered in the application of neural network
output layer, and an output signal is formed at the output end. methods. Specifically, an appropriate set of basis can not
If the output signal meets a given output requirement, the only capture the potential characteristics of the input
calculation ends; otherwise, it turns to the second stage of the variables, but also avoid the computational difficulties caused
training process, which is the back propagation error link. by the non-uniqueness or multicollinearity of the parameters.
The calculation of the forward propagation process is as Therefore, as a dimensionality reduction technique with the
follows. For the hidden layer, there are: goal of finding several principal components that can explain
n most of the sample variance, principal component analysis
net j   vij xi   j , j  1, 2..m (PCA) becomes a natural choice in this case. In PCA, the
i 1 (8)
original variables are transformed into new variables that are
y j  f (net j ), j  1, 2...m orthogonal to each other, which is very advantageous for
For the output layer, there is simplifying the calculation process, especially in those cases
m where the original variables are highly correlated. In addition,
netk   w jk y j  k , k  1, 2..l after the data is reduced in dimension, some noise may be
j 1 (9)
reduced, and more information containing fundamental
ok  f (netk ), k  1, 2...l characteristics is identified, which is beneficial to the next
The excitation functions corresponding to the hidden layer data analysis.
and the output layer are both unipolar Sigmoid functions, that About the first part of the BPNN model, that is,
1 determining the connection weight between the input layer
is f ( x)  . The Sigmoid function f (x) is continuously
1  e x and the hidden layer, the PCA-based solution involves two
differentiable and f ' ( x )  f ( x )[1  f ( x )] . When the actual aspects. First, the initial matrix of weights or loads contains
the correlation of all variables and factors. These factor loads
output of the network model is not consistent with the
represent the degree of agreement between the variables and
expected output, Will produce an output error E, which has
the principal components. Second, by associating a subset of
the following expression:
the original variables with a principal component, the
1 l
E   (d k  ok ) 2 (10) resulting variables will reflect the characteristics of the data
2 k 1 formation process.
Expanding from the error in the above expression to the In addition, mapping raw data into a low-dimensional
hidden layer, there are: space can greatly improve the performance of pattern
1 l recognition or prediction in a high-dimensional space.
E   [d k  f (net k )] 2 (11)
2 K 1 Although this mapping may result in loss of information, it is
The network error is a function of the weight w jk , vij of the ultimate goal to build a new set of bases that minimize
the number of variables while expanding into the original
each layer, so the magnitude of the error E will change as the
data space. It is more appropriate to use PCA to achieve this
weight changes. When correcting the weights, the error
goal, because PCA, as a dimension reduction method, can
should decrease at the fastest speed, so the weights should be
achieve the purpose of dimension reduction while extracting
corrected along the negative gradient direction of the weight,
that is, the correction amount of the weight is proportional to the main features of the prediction space. In addition, based
the negative gradient direction of the error, that is, on the variation of the prediction space explained by each
E principal component, a natural ordering of the input variables
w jk   , j  1, 2...m; k  1, 2..l is given, which allows the non-linear function of the original
w jk
(12) variables to be considered without losing overall degrees of
E freedom in the parameter estimation process.
vij   , i  1, 2...n; j  1, 2...m
vij The BPNN model also aims to modify the model
Where  is the learning rate, which is a preset constant, parameters with the goal of minimizing the sum of squared
errors. For testing purposes, an extra Gaussian error structure
usually 0    1 is taken. is assumed during the parameter estimation process:
yt  g (rt ,  )   t (13)

VOLUME XX, 2017 9

In the stage of neural network model structure selection, it In the theoretical framework of the generalized world, a
is necessary to evaluate the contribution of each basic new strategy is proposed in the field of statistical learning,
variable from the perspective of explanatory ability. This can which is different from the empirical risk minimization
be achieved by choosing between two specific model settings, criterion. Specifically, a function subset sequence is first
that is, the use of hypothesis testing to determine the optimal constructed from a function set, and then the subset sequence
number of hidden nodes in the neural network: is arranged according to the size of the VC dimension. Then
H 0 : yt  g (rt ,  )   t based on the empirical risk, find the minimum value in each
(14) subset sequence, and finally select the subset with the
H1 : yt  g (rt ,  )   h(rt ,  )   t
smallest sum of the minimum empirical risk and confidence
range among all the subset sequences. SVM transforms the
C. EMD-LSSVM model problem of finding an optimal hyperplane between two
different classes into a maximum classification interval
When implementing EMD decomposition, first of all, the problem. The maximum interval problem is actually a
following two prerequisites must be met: time series data quadratic programming problem with inequality constraints.
must meet the following: first, the number of local extreme
values of the time series data is the same as the number of D. Nonlinear integrated prediction model
zero points or the difference between the two is only 1; The
Assuming p training sample set ( x u , y u )(u  1, 2..m) , our
local mean of the sequence is zero, that is, the time series
signal is locally symmetric about the time axis, that is, the goal is to find the most appropriate functional relationship f =
upper envelope generated by the local maximum and the f (x) for prediction. There are n separate prediction
lower envelope generated by the local minimum, respectively, techniques that can be used to predict. For any x, the output
and the mean of the two is zero. The data sequence x (t) (t = of the i-th prediction technique is f i ( x ) . Below we will
1, 2… n) can be EMD decomposed according to the combine n separate prediction techniques to perform
following process: integrated prediction, the general form of which can be
(1) Find all local extreme values in x (t). Use cubic spline expressed as follows:
n
interpolation to connect all local maxima to generate the f ( x )   wi f i ( x ) (18)
upper envelope xu (t ) . Similarly, connect all local minima to i 1
produce the lower envelope xl (t ) . Where f (x) is the integrated prediction result, and the
(2) Calculate the average envelope value based on the weight of each individual prediction technology in the
upper and lower envelopes obtained in (1): integrated prediction is wi (i  1, 2..n) .
x (t )  xl (t ) The nonlinear integrated prediction model does not simply
m1 (t )  u (15)
2 assign weights to the prediction results of each integrated
(3) The original data sequence x (t) minus the average prediction member, but rather learns the weight pattern by
value of the upper and lower envelopes m1 (t ) will generate means of artificial intelligence to maximize the information
the first component d1 (t ) : contained in the data after fitting a single prediction model.
d1 (t )  x (t )  m1 (t ) (16) Reflected in the weights learned, this is better than the linear
integration model. To sum up, the nonlinear integrated
(4) Check whether d1 (t ) meets the requirements of the
prediction model has certain advantages over the other
eigenmode function. models mentioned above in terms of fitting accuracy and
n
[d k 1 (t )  d k (t )] 2 generalization ability. Following the above symbolic
t 1 d k21 (t )
(17)
assumptions, the general steps of nonlinear integrated
When the number of training samples is small, the modeling can be summarized as follows:
confidence range will increase with the increase of the VC ① Take the data set: Obtain the expected output d (x) and
dimension of the learning machine, that is, based on the the prediction results of the individual models constituting
actual risk, the deviation from the empirical risk will the integrated prediction model, that is, f i ( x ), i  1, 2...n ,
gradually increase. Therefore, choosing a learning machine constitute the data set of the nonlinear integrated modeling,
that is too complex, that is, a VC with too high a neural and divide the data set into two parts, the training set And test
network, often fails to get good results. This "over-learning" set;
phenomenon mainly occurs because in a small sample ② Training weight mode: Use the training set obtained in
situation, once the network structure or algorithm is not step ①, and use non-linear technology, such as the artificial
designed properly, it will lead to a large confidence range. intelligence method used in this article, including BPNN and
Even though the risk of experience may be small, the SVM, to train the weight mode of the nonlinear integrated
increased confidence will greatly reduce the ability to prediction model to determine each individual prediction The
promote. weight of technology, that is, wi ( x ), i  1, 2...n , and finally
determine the optimal model structure;

VOLUME XX, 2017 9

③ Test model performance: Use the test set obtained in class, merging data objects from bottom to top until a certain
step ① and the optimal structure of the model obtained in end condition is met or all objects have been merged into one
step ② to test the performance of the nonlinear prediction class. The data sequence space is mapped to the model space,
model to quantify the prediction effect or performance of the and various existing clustering algorithms are applied in the
nonlinear integrated model. model space. A partitioning and hierarchical combination
Generally, we can think of a nonlinear integrated clustering algorithm is proposed. Table 2 and Table 3
prediction model as a nonlinear information processing describe clustering algorithms based on HMM's time series
system. Assuming that the prediction results of n individual hierarchical clustering, partitioning, and hierarchical
prediction technologies is yi , i  1, 2...n , the nonlinear combination.
integrated prediction model in this paper can be described by
the following formula: TABLE 2 HIERARCHICAL CLUSTERING ALGORITHM BASED ON HMM
y  g ( y1 , y 2 ... y n ) (19) Input: O  O1 , O2 ...On 
Output: results of clustering
Where g is a non-linear function and y  g ( y1 , y 2 ... y n ) is
Method:
the input vector of the model. In the BPNN non-linear 1) Train each sequence Oi as an HMM i ;
integrated prediction model, the weight of the integrated 2) Construct the distance matrix P (Oi | i ) by the likelihood
model is determined by the BPNN to realize this non-linear
D  D (Oi , O j ) or the distance between the models;
mapping. At this time, the input of the neural network is the 3) Use agglomerative hierarchical clustering algorithm to cluster by distance
prediction result yi of each individual prediction technology, matrix D;
and the model output is the result of the BPNN nonlinear TABLE 3 CLUSTERING ALGORITHM BASED ON HMM-BASED PARTITIONING
AND LAYERING
integrated prediction. The expected output is the
corresponding sample true value. Input: O  O1 , O2 ...On 
Output: results of clustering
Method:
IV. HMM-based time series artificial intelligence 1) Class division: set time series. Divided into k clusters;
algorithm 2) Train each cluster as a HMM i
Time series are different from static data, whose data change 3) Construct the distance matrix D  D(i ,  j ) ;
over time. Time series exist in a wide range of fields, from 4) Use agglomerative hierarchical clustering algorithm to cluster by distance
scientific computing, engineering, business, finance, matrix D;
economics, health care to government departments. Cluster Suppose G and C are data sets with k classes. The
analysis of time series has also been extensively studied. similarity measure of clustering is defined as:
These studies include clustering analysis based on original 1 k
time series, feature-based time series clustering, and model- Sim(G , C )   maxSim(G I , C j ) (20)
k i 1 1 j  k
based time series clustering.
HBHCTS (HMM-Based Hierarchical Clustering Time-
Series) algorithm is mainly divided into three parts: the
A. Time series clustering formation of initial partitions, hierarchical aggregation and
Common time series clustering algorithms mainly include automatic selection of clustering results. The initial partition
partitioning (dividing) method and layering method. is formed by scanning the time series set in a single pass,
Partition-based clustering randomly selects k objects as the comparing the currently accessed time series with the
initial class center, calculates the distance from each object to existing model (partition). If there is a suitable model, add it,
the class center, and assigns it to the nearest class, then otherwise create a new model. Judging whether the model is
recalculates the new class center, and so on until the object is suitable is determined by the distance min value. We can
not Change again. Hierarchical clustering method organizes obtain relevant prior knowledge by counting the distribution
data objects into a tree. According to whether the hierarchical of this threshold, which is easier to determine than specifying
structure is top-down or bottom-up, the method can be the initial number of partitions in advance. After the initial
divided into two types: splitting and agglomeration. The split partitions are formed, the hierarchical clustering is used to
method treats all objects as belonging to the same class, and merge the partitions. The evaluation of the clustering results
gradually splits down into more and smaller classes until is similar to the Dunn Index method, and the largest one is
each object becomes its own class or meets an end condition. the optimal clustering result. The algorithm flowchart is
The aggregation rule treats each object as an independent shown in Figure 6:

VOLUME XX, 2017 9

FIGURE 6 HBHCTS algorithm flowchart

O14 6. 52E-169 4. 67E-160

O15 5. 68E-168 7.38E-160

B. Analysis of Time Series HBHCTS Algorithm
O16 9. 42E-167 2. 63E-160
The advantages of the HBHCTS algorithm: (1) no need
to specify the initial number of clusters and corresponding O17 3. 72E-167 3. 23E-160
initialization; (2) combined with the CBIC method can O18 1. 85E-165 1. 35E-160
automatically determine the number of HMM hidden states 1. 83E-163 6. 55E-159
O19
and corresponding initialization; (3) can clearly give the
O20 1. 48E-167 8. 26E-160
class's (4) is not sensitive to the length of the sequence; (5)
when a new sequence is added, it just compares the new O21 1. 48E-221 1. 65E-252
sequence with the existing set of models (models), and does O22 1. 84E-214 5. 08E-216
not need to perform partition clustering for all sequences 1. 91E-212 9. 75E-243
O23
again. It is easy to implement incremental clustering, and
O24 4. 66E-230 1. 22E-262
initial partition clustering is suitable for time series stream
data processing. O25 9.58E-230 1. 35E-263
The sequences are O1  O10 in the experiment are from O26 3. 22E-237 1. 84E-271
HMM1, O11  O20 is from HMM2, and O21  O30 is from O27 1. 31E-213 3.84E-255
model 3. Table 4 shows the probabilities of these sequences O28 3. 90E-242 6. 76E-264
in HMM 1 and HMM2, respectively: 7. 70E-233 1. 81E-254
TABLE 4 EXAMPLES OF PROBABILITIES OF 30 SEQUENCES IN HMM1 AND O29
HMM2 O30 3. 52E-212 1. 17E-255
Sequence HMM1 HMM2
O 1. 68E-160 2. 23E-168 In the experiments on HBHCTS, the HMM hidden state
1
number is determined using two strategies. One uses the
O2 1. 66E-159 8. 88E-167
specified hidden state number, which ranges from 2 to 8, and
O3 1. 23E-160 1. 44E-167 is expressed by HBHCTS (2-8). The rate is expressed by
O4 1. 25E-160 8. 98E-167 HBHCTS (CBIC). For comparison, the experiment was
1. 53E-159 9. 76E-166
repeated 20 times. The average correct rate of clustering is
O5 shown in Figure 7. From Figure 7, we can see that the
O6 4. 05E-160 3. 91E-168 HBHCTS (CBIC) method can reach the clustering results of
O7 1. 19E-160 9. 41E-167 the Hier-moHMMs method. In HBHCTS, the HMM hidden
1. 13E-160 1. 73E-165 state number and initial partition number do not need to be
O8
specified in advance, and the Hier-moHMMs method needs
O9 7. 39E-160 1. 06E-165
to specify the hidden state number and initial partition
O10 2. 05E-159 2. 04E-165 number in advance, where the HMM hidden state number is
2. 27E-165 9. 82E-159 specified as 5, and the initial partition number is specified Is
O11
6, the accuracy of clustering is greatly affected by these two
O12 2. 96E-165 5. 06E-161
parameters.
O13 9. 63E-169 2. 76E-160

VOLUME XX, 2017 9

FIGURE 7 Corrected-rate of different clustering methods on the FIGURE 8 Histogram of random variable x
synthesized data
Regarding the distribution of the random variable X, we
In the above experiment, the distance threshold sfit of have also experimented with other models. Three models are
HBHCTS was set to 0.08. This threshold can be used to selected. The first model is a HMM model with five hidden
decide whether to build a new model for the new sequence. states, where the mean of the hidden state output distribution
This is achieved by calculating the distance between the is randomly set between 0-5 and the variance is randomly set
sequence and the existing HMM model. If the threshold is set between 0-1. The second model is similar to the first model,
too large, sequences of different classes will be merged by except that the variance is randomly set between 0-10. The
mistake. The smaller the threshold is, the more the number of third model selects the previously mentioned HMM 1 model.
classes generated in the initial partition is, which increases Generate a random sequence for these models to count the
the computational complexity of subsequent hierarchical distribution of the random variable X, calculate the kurtosis
clustering. value of the random variable X, and repeat 10 random
Experiments show that the distribution has the experiments. The experimental results are shown in Table 5.
characteristics of a normal distribution. As shown in Figure 8, Their kurtosis values are all close to 3.0. It can be seen that,
the statistical characteristics have a mean value of 0.007, a for the sequences of the three models, the random variable x
standard deviation of 0.025, a minimum and maximum value approximately follows a normal distribution.
TABLE 5 KURTOSIS VALUES OF THE RANDOM VARIABLE X OBTAINED
of -3.2 and 3.48, and a kurtosis value of 19.97. FROM DIFFERENT SEQUENCES
According to the 3 rule of normal distribution, the Experiment Model 1 Model 2 Model 3
points that fall into the (u  3 , u  3 ) interval account for 1 3. 0466 2. 9967 3. 0439
99% of the entire distribution. Therefore, it can be considered 2 3. 0712 3. 1822 3. 1086
that these points belong to a normal distribution with a 99% 3 2. 9072 3. 0495 3. 0819
confidence level. Therefore, we set the threshold sfit to 3 4 3. 1131 3. 1148 3. 0132
can merge the sequences as much as possible while ensuring 5 3. 0039 3. 0066 3. 0261
the correctness of the initial partitioning of the clustering 6 3. 0049 3. 1095 3. 0485
algorithm. 7 3. 0661 3. 0849 2. 9713
8 3. 0333 3. 0272 2. 9836
9 3. 0983 3. 0592 2. 9796
10 2. 9786 2. 9831 3. 0326
If in the initial partition, the sequences of different
classes are divided into the same region, this introduces
wrong partitioning. Because the hierarchical division
clustering does not consider class splitting, such as HBHCTS
and Hier-moHMMs methods, the initial partition It is
important to ensure that sequences belonging to the same
class are partitioned in the same region. In order to test the
effectiveness of HBHCTS, we have performed experiments
on the error rate of the partition sequence of the initial
partition of HBHCTS. As shown in Figure 9, as the distance
threshold increases, , The initial partition error rate also

VOLUME XX, 2017 9

increased, and after the distance threshold is greater than 0.08,

the confidence level of the sequence merging to the same
class decreases, resulting in a rapid increase in the initial
partition error rate from about 10% to about 30%, which is
mainly Because the sequences of HMM 1 and HMM2 are
easily divided into the same region. It can be seen from
Figure 9 that the number of partitions at this time is about 11.
In addition, we can also see from Figure 9 that when the
distance threshold is less than 0.08, the initial number of
partitions exceeds 10, especially when the distance threshold
is 0.03. The number of initial partitions actually exceeds 25.
This is because there are many models of Model 3 in which
each sequence is divided into a region, and it also shows that
the model 3 sequence is the main factor affecting the
accuracy of clustering.
FIGURE 10 Autocorrelation diagram of time series

FIGURE 9 Number of initial partitions at different distance thresholds FIGURE 11 Partial autocorrelation diagram of time series
In the S-BPNN model, in order to avoid the subjectivity
of artificially selecting the number of input nodes, the
IV. Experimental verification number of input nodes is set to 6 according to the
In the empirical research in this paper, the models used, autocorrelation analysis and partial autocorrelation analysis
ARMA-GARCH model, BPNN model, EMD-SVR model in the ARMA model modeling process, that is, 6 input
and nonlinear integrated prediction model are all realized by variables are selected. PCA was applied to these 6 variables
using matlab software developed by Mathworks simultaneously, and 6 principal components were obtained,
Experimental Company. as shown below. The hypothesis test with a confidence level
According to the analysis results of autocorrelation of   99% described in the previous section is used to
analysis, partial autocorrelation analysis, and determination determine whether a principal component remains in the
by the AIC criterion, in the ARMA model, the autoregressive model. In the test statistic L of the hypothesis test, n = 2264,
n
order p is set to 4 and the moving average order q is set to 6.
s = 1, k = 3, SSE   (d i  oi ) 2 . As a result, the two
The ARCH order and GARCH order of the GARCH model i 1
are selected as 3 and 2, respectively. The autocorrelation principal components remain in the network model of
diagram is shown in Figure 10 below, and Figure 11 is a conditional mean prediction, in other words, the optimal
partial autocorrelation diagram. network structure of the obtained neural network is 6-2-1.
By introducing PCA and hypothesis testing in the
process of neural network model selection, the resulting
BPNN model has several satisfactory features. First, the
BPNN model does not need to make any assumptions about
the functional relationship between lagging returns and future
returns. Secondly, by orthogonalizing the input space, the
possible multicollinearity is eliminated, and the uniqueness
of hidden nodes is guaranteed. Third, the step-by-step

VOLUME XX, 2017 9

selection process selects a most streamlined model to ensure As can be seen from Table 6, three separate prediction
that the training data does not overfit. Finally, it reduces the models, namely the ARMA (4,6) -GARCH (3,2) model, the
computational cost required to find the best model structure. S-BPNN model, the EMD-LS S VM model, and three
The RBF kernel function has two hyperparameters, C integrated prediction models, namely the simple average The
and  , namely the penalty factor and the inverse of the integrated model, BPNN integrated prediction model, and
Gaussian kernel bandwidth. For a specific problem, naturally, SVM integrated prediction model. The prediction
the optimal values of C and  cannot be determined in performance of the SVM integrated prediction model is the
advance, so it is absolutely necessary to perform model best among the six models. Not only is the NMSE the
selection, that is, the process of parameter (C ,  ) search. smallest, but the Dstat is the largest.
Choosing the best-performing parameter pair from the many In the sample, the difference between the fitted data and
optional parameter pairs is the ultimate goal of model the real data is shown in Figure 12, and it can be found that
selection, and the best-performing parameter pair refers to the effect is better, the maximum difference is 0.8, and the
the parameter that enables the support vector classifier to average fluctuates within the range of ± 0.8.
make the most accurate prediction of the test data Correct. In
the method to achieve this goal, the cross-validation method
can avoid the over-fitting phenomenon, and control the
variance of the model performance to ensure the stability of
the model performance. Cross-validation methods are
commonly used to determine tuning parameters and compare
model performance.
In the nonlinear integrated prediction model, the first
205 prediction sample values are used as the training set, and
the last 50 prediction sample values are used as the test set.
Based on simple average integration, the S-BPNN and
LSSVM are used to allocate the integrated time series
analysis model. Baseline integrated prediction model, the
arithmetic average of the results of simple average integrated
model. The BPNN and SVM nonlinear integrated prediction
FIGURE 12 Differences between model data and real data
models use neural network method and support vector
machine technology to determine the weights of three The following is a comparison group. Based on the data
separate prediction models in the integrated model. In used as the test set for the fitting set above, the model
addition, considering that the neural network method has the mechanism for establishing an autoregressive moving
disadvantage of easily falling into a local optimum, when the average is as follows. First, the model order is determined.
S-BPNN nonlinear integrated modeling is performed, the Generally, the smaller AIC and BIC are selected as the model
average value of the results of running the program 100 times according to the AIC and Schwarz criteria the lag order is
is taken as the final S-BPNN nonlinear integrated prediction shown in Table 7 in several cases.
TABLE 7 SELECTION OF LAG ORDER OF ARMA MODEL
model. The performance of the ARMA-GARCH model, the AR
Numbering MA AIC Schwarz
S-BPNN model, the EMD-LS S VM model, and the three 6 -2.813 -2.661
1 3
integrated prediction models, the simple average integration -2.782 -2.656
2 5 3
model, the BPNN integrated time series analysis model, and -2.772 -2.651
3 4 2
the SVM integrated time series analysis model are shown in -2.749 -2.690
4 3 2
Table 6. -2.712 -2.660
5 2 1
TABLE 6 MODEL PERFORMANCE OF EACH TIME SERIES PREDICTION MODEL The following examines the effect of model
Evaluation index NMSE Rank D Rank extrapolation, and the application model examines the
stat
ARMA(4,6)- 3.2243 6 48% 4 remaining 19 data differences (using real data minus the data
GARCH(3,2} generated by the model) as shown in Figure 13.
S-BPNN 1.0086 5 46% 5
EMD-SSVM 1.5324 4 54% 2
Simple average 1.2026 3 51% 3
integrated prediction
(benchmark)
Nonlinear Integrated 1.1129 2 52% 6
Prediction (BPNN)
Nonlinear Integrated 0.7384 1 63% 1
Prediction (SVM)

VOLUME XX, 2017 9

REFERENCES

[1] Antonio M. Durán-Rosal, Pedro A. Gutiérrez, Salcedo-

Sanz S , et al. Dynamical memetization in coral reef
optimization algorithms for optimal time series
approximation[J]. Progress in Artificial Intelligence,
vol.8, pp.253-262, March 2019.
[2] Kewen Li, Lu Liu, Jiannan Zhai. The improved grey
model based on particle swarm optimization algorithm
for time series prediction:[J]. Engineering Applications
of Artificial Intelligence, vol.55, pp.285-291, July 2016.
[3] Aihua Y , Yine C . Data Mining Technology Based
Financial Time Series Forecasting Algorithm[J]. Journal
of Computational & Theoretical Nanoscience, vol.12,
no.12, pp.6296-6303, Dec 2015.
[4] Talaverallames R L , PérezChacón, Rubén,
FIGURE 13 Differences between model data and real data
MartínezBallesteros, Maria, et al. A Nearest
In general, the neural network model has a good Neighbours-Based Algorithm for Big Time Series Data
prediction effect, especially the first 4 predictions, with a Forecasting.[J]. Lecture Notes in Computer Science,
small deviation, the magnitude is 10 2 , and the maximum vol.14, pp.174-185, April 2016.
[5] Zuhaimy Ismail, Riswan Efendi, Mustafa Mat Deris.
deviation is 0.06. The maximum difference between the real Application of Fuzzy Time Series Approach in Electric
data and the predicted data of the ARMA model is 0.08, and Load Forecasting[J]. New Mathematics & Natural
there are 3 All of them exceed 0.06, and the accuracy of the Computation, vol.11, no.3, pp.229-248, May 2015.
model fitting and prediction is not as good as the neural [6] Yongquan Zhou, Rui Wang. An Improved Flower
Pollination Algorithm for Optimal Unmanned Undersea
network model. By comparing these two models, we can find Vehicle Path Planning Problem[J]. International Journal
that the neural network extrapolation prediction has a better of Pattern Recognition & Artificial Intelligence, vol.30,
effect. From this example, we can prove that the model no.4, pp.135-148, Feb 2016.
selection theory in this paper has its rationality. [7] Weihang S , Nan L . A Joint Classification Learning
Algorithm for Feature Sequences of Time-series Data[J].
Computer Engineering, vol.42, no.6, pp.196-200, June
V. Conclusion 2016.
This paper points out that the three models are suitable for [8] Wang W , Pedrycz W , Liu X . Time series long-term
forecasting model based on information granules and
modeling different types of time series data. As a result, the fuzzy clustering[J]. Engineering Applications of
question arises of how to select a suitable model for time Artificial Intelligence, vol.41, pp.1016-1028, May 2015.
series data analysis and prediction. Based on the full analysis [9] Nesrine Baklouti, Ajith Abraham, Adel M. Alimi. A
Beta basis function Interval Type-2 Fuzzy Neural
of various models, through the pre-analysis and pre- Network for time series applications[J]. Engineering
processing of the data, the intelligent technology of the Applications of Artificial Intelligence, vol.7, pp.259-274,
computer is applied to automatically complete the selection May 2018.
process of the corresponding model, forming an artificial [10] Tank, Alex, Foti, Nicholas, Fox, Emily. Bayesian
Structure Learning for Stationary Time Series[J].
intelligence method of model selection. Through empirical Statistics, vol.24, no.5, pp.70-78, Aug 2015.
analysis of the method, it is found that the method is practical [11] Stopar L , Skraba P , Grobelnik M , et al. StreamStory:
and helps to select a suitable model in a targeted manner to Exploring Multivariate Time Series on Multiple
Scales[J]. IEEE Transactions on Visualization &
achieve high prediction accuracy. Empirical results show that: Computer Graphics, vol.3, pp.78-98, April 2018.
from the perspective of normalized mean square error or [12] Stuart Edward Lacy, Stephen Leslie Smith, Michael
direction change statistics, the prediction performance of the Adam Lones. Using Echo State Networks for
Classification : A Case Study in Parkinson's Disease
EMD-LS S VM model is the best among three separate yield Diagnosis[J]. Artificial Intelligence in Medicine, vol.86,
time series prediction models; integrated time series The pp.53-59, March 2018.
overall performance of the analysis model is better than that [13] Yang Zhena, Zhang Wanpeng, Liu Hongfu. Real-time
Strategy Game Tactical Recommendation Based on
of the single model. Among the three integrated time series
Bayesian Network[J]. Journal of Physics Conference
analysis models, the SVM nonlinear integrated prediction Series, vol. 1168, pp.32018-32029, June 2019.
model performs most prominently. In the next step, based on [14] A. Patra, S. Das, S. N. Mishra. An adaptive local linear
the full analysis of various models, through the pre-analysis optimized radial basis functional neural network model
for financial time series prediction[J]. Neural Computing
and pre-processing of the data, the intelligent technology of & Applications, vol.28, no.1, pp.101-110, Aug 2015.
the computer is used to automatically complete the selection [15] Jalalkamali, A, Moradi, M, Moradi, N. Application of
process of the corresponding model to form an artificial several artificial intelligence models and ARIMAX
model for forecasting drought using the Standardized
intelligence method of model selection. It helps to select a Precipitation Index[J]. International Journal of
suitable model in order to achieve high prediction accuracy. Environmental Science & Technology vol.12,
no.4 ,pp.1201-1210, Nov 2014.

VOLUME XX, 2017 9

[16] Ling Teng, Junwu Zhu, Bin Li. A Voting Aggregation relationship.[J]. Lecture Notes in Computer Science,
Algorithm for Optimal Social Satisfaction[J]. Mobile vol.6948, pp.250-264, Sep 2017.
Networks & Applications, vol.23, pp.344-351, Sep 2017. [32] Felix G , Gonzalo Nápoles, Falcon R , et al. A Review
[17] R. Arunkumar, V. Jothiprakash, Kirty Sharma. Artificial on Methods and Software for Fuzzy Cognitive Maps[J].
Intelligence Techniques for Predicting and Mapping Artificial Intelligence Review, vol.52, pp.1707-1737,
Daily Pan Evaporation[J]. Journal of the Institution of Aug 2017.
Engineers, vol.98, no.3, pp.219-231, Aug 2017. [33] Miquel L. Alomar, Vincent Canals, Nicolas Perez-Mora.
[18] Mahmud M S , Meesad P . An innovative recurrent FPGA-Based Stochastic Echo State Networks for Time-
error-based neuro-fuzzy system with momentum for Series Forecasting[J]. Computational Intelligence &
stock price prediction[J]. Soft Computing, vol.20, no.10, Neuroscience, 2016, vol.2016, pp.892-901, Aug 2016.
pp.4173-4191, June 2015. [34] Nikolaos Kariotoglou, Maryam Kamgarpour, Tyler H.
[19] Andrzej Janusz, Marek Grzegorowski, Marcin Michalak. Summers. The Linear Programming Approach to Reach-
Predicting seismic events in coal mines based on Avoid Problems for Markov Decision Processes[J].
underground sensor measurements[J]. Engineering Mathematics, 2017, vol.60, pp.263-285, Oct 2016.
Applications of Artificial Intelligence, vol.64, pp.83-94, [35] Haiyang Yu, Zhihai Wu, Dongwei Chen. Probabilistic
Sep 2017. Prediction of Bus Headway Using Relevance Vector
[20] Mabel González, Christoph Bergmeir, Isaac Triguero. Machine Regression[J]. IEEE Transactions on
Self-labeling techniques for semi-supervised time series Intelligent Transportation Systems, 2017, vol.18, no.7,
classification: an empirical study[J]. Knowledge & pp.1772-1781, Jan 2016.
Information Systems, vol.55, pp.493-518, Aug 2017. [36] Nema M K , Khare D , Chandniha S K . Application of
[21] Ozoegwu C G . The solar energy assessment methods artificial intelligence to estimate the reference
for Nigeria: The current status, the future directions and evapotranspiration in sub-humid Doon valley[J].
a neural time series method[J]. Renewable & Applied Water Science, 2017,vol.7, no.5, pp.3903-3910,
Sustainable Energy Reviews, vol.92, pp.146-159, Sep Mar 2017.
2018. [37] Philipp Grohs, Fabian Hornung, Arnulf Jentzen. A proof
[22] Selmo Eduardo Rodrigues Júnior, Ginalber Luiz de that artificial neural networks overcome the curse of
Oliveira Serra. A novel intelligent approach for state dimensionality in the numerical approximation of Black-
space evolving forecasting of seasonal time series[J]. Scholes partial differential equations[J]. Papers, vol.2,
Engineering Applications of Artificial Intelligence, no.2, pp.1314-1325, Sep 2018.
vol.64, pp.272-285, Sep 2017. [38] Tawfek Mahmoud, Zhao Yang Dong, Jin Ma. Advanced
[23] Diana M. Sánchez-Silva, Héctor G. Acosta-Mesa, Tania method for short-term wind power prediction with
Romo-González. Semi-Automatic Analysis for multiple observation points using extreme learning
Unidimensional Immunoblot Images to Discriminate machines[J]. Journal of Engineering, vol.1, no.1, pp.29-
Breast Cancer Cases Using Time Series Data Mining[J]. 38, Oct 2017.
International Journal of Pattern Recognition and [39] Priya Nayar, Bhim Singh, Sukumar Mishra. Neural
Artificial Intelligence, vol.32, no.1, pp.18604-18621, Network based Control of SG based Standalone
March 2018. Generating System with Energy Storage for Power
[24] Pritpal Singh. Neuro-Fuzzy Hybridized Model for Quality Enhancement[J]. Journal of the Institution of
Seasonal Rainfall Forecasting: A Case Study in Stock Engineers, 2016, vol.98, no.4, pp.405-413, Sep 2016.
Index Forecasting[J]. Studies in Computational [40] Albrecht S V , Beck J C , Buckeridge D L , et al.
Intelligence, vol.611, pp.361-385, Aug 2016. Reports on the 2015 AAAI Workshop Series[J]. Ai
[25] Xuemin Xing, Debao Wen, Hsing-Chung Chang. Magazine, vol.36, no.2, pp.90-101, June 2015.
Highway Deformation Monitoring Based on an [41] Shikha Gupta, Nikita Basant, Premanjali Rai. Modeling
Integrated CRInSAR Algorithm — Simulation and Real the binding affinity of structurally diverse industrial
Data Validation[J]. International Journal of Pattern chemicals to carbon using the artificial intelligence
Recognition & Artificial Intelligence, vol.32, no.8, approaches[J]. Environmental Science & Pollution
pp.185036-185051, May 2018. Research International, vol.22, no.22, pp.17810-17821,
[26] Xiao-Xia Yin, Sillas Hadjiloucas, Yanchun Zhang. July 2015.
Pattern identification of biomedical images with time [42] Wuhui Chen, Incheon Paik, Zhenni Li. Tology-Aware
series: Contrasting THz pulse imaging with DCE- Optimal Data Placement Algorithm for Network Traffic
MRIs[J]. Artificial Intelligence in Medicine, 2016, Optimization[J]. IEEE Transactions on Computers,
vol.67, no.3, pp.1-23, Feb 2016. vol.65, no.8, pp.2603-2617, May 2016.
[27] Wanjawa B W . Evaluating the Performance of ANN [43] Raul Cristian Scarlat, Georg Heygster, Leif Toudal
Prediction System at Shanghai Stock Market in the Pedersen. Experiences With an Optimal Estimation
Period [J]. vol.5, no.3, pp.124-145, Dec 2016. Algorithm for Surface and Atmospheric Parameter
[28] Hirata T , Kuremoto T , Obayashi M , et al. Deep Belief Retrieval From Passive Microwave Data in the Arctic[J].
Network Using Reinforcement Learning and Its IEEE Journal of Selected Topics in Applied Earth
Applications to Time Series Forecasting[J]. Lecture Observations & Remote Sensing, vol.3, pp.1-14, Sep
Notes in Computer Science, 2016, vol.6, no.8, pp.30-37, 2017.
Sep 2016. [44] Mohammad Altaher, Omaima Nomir. Euler–Lagrange
[29] Atencia M, Sandoval F, Prieto A. Advances in as Pseudo-metric of the RRT algorithm for optimal-time
computational intelligence: Selected and improved trajectory of flight simulation model in high-density
papers of the 12th International Work-Conference on obstacle environment[J]. Robotica, vol.5, pp.1-19, Dec
Artificial Neural Networks (IWANN 2013)[J]. vol.164, 2015.
no.21, pp.1-4, Sep 2015. [45] Ankita Sinha, Prasanta K. Jana. MRF: MapReduce
[30] Patty Kostkova, Jane Mani-Saada, Gemma Madle. based Forecasting Algorithm for Time Series Data[J].
Agent-Based Up-to-date Data Management in National Procedia Computer Science, vol.132, pp.92-102, Dec
electronic Library for Communicable Disease[J]. 2018.
Concurrency Practice and Experience, vol.34, pp.105- [46] Goran Klepac, Robert Kopal, Leo Mršić. REFII Model
124, May 2018. as a Base for Data Mining Techniques Hybridization
[31] Wen J , Kow Y M , Chen Y . Online games and family with Purpose of Time Series Pattern Recognition[J].
ties: Influences of social networking game on family

VOLUME XX, 2017 9

Studies in Computational Intelligence, vol.611, pp.237-

270, Dec 2016.
[47] Wang T , Xia Y , Muppala J , et al. Achieving Energy
Efficiency in Data Centers Using an Artificial
Intelligence Abstraction Model[J]. IEEE Transactions
on Cloud Computing, vol.5, pp.35-48, Dec 2015.
[48] Adit B. Sanghvi, Erastus Z. Allen, Keith M. Callenberg.
Performance of an artificial intelligence algorithm for
reporting urine cytopathology[J]. Cancer Cytopathology,
vol.127, no.10, pp.241-154, Aug 2019.
[49] Chen, Xin, Xing, Pei, Luo, Yong. Surface temperature
dataset for North America obtained by application of
optimal interpolation algorithm merging tree-ring
chronologies and climate model output[J]. Theoretical &
Applied Climatology, vol.127, pp.533-549, Oct 2015.
[50] Zhang Qi, Hu Yupeng, Ji Cun. Edge Computing
Application:Real-Time Anomaly Detection Algorithm
for Sensing Data[J]. Journal of Computer Research &
Development, vol.3, pp.1009-1021, Jan 2018.
[51] H. Zheng, H. Li, J. Xiao. Research on model and
algorithm for optimal power flow of large-scale high
voltage direct current transmission system[J]. Zhongguo
Dianji Gongcheng Xuebao/proceedings of the Chinese
Society of Electrical Engineering, vol.35, no.9, pp.2162-
2169, Sep 2015.
[52] P. Sengottuvelan, N. Prasath. BAFSA: Breeding
Artificial Fish Swarm Algorithm for Optimal Cluster
Head Selection in Wireless Sensor Networks[J].
Wireless Personal Communications, vol.94, no.4,
pp.1979-1991, April 2016.

Kang Wang, Graduated from Wuhan

University in 2016. Currently works at
the School of Humanities and Media,
Pingxiang University. His research
interests include Cognitive Science,
Philosophy of Mind and Artificial
Intelligence.

VOLUME XX, 2017 9

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see [Link]

Time Series Data Mining A Case Study With Big
No ratings yet
Time Series Data Mining A Case Study With Big
7 pages
بحث حسن
No ratings yet
بحث حسن
28 pages
A Hybrid Time Series Forecasting Method Based On Neutrosophic Log
No ratings yet
A Hybrid Time Series Forecasting Method Based On Neutrosophic Log
23 pages
Load Forecasting Using Time Series Techn
No ratings yet
Load Forecasting Using Time Series Techn
15 pages
A Study On Neural Networks Approach To Time-Series Analysis
No ratings yet
A Study On Neural Networks Approach To Time-Series Analysis
4 pages
A Hybrid Time Series Forecasting Method Based On Neutrosophic Logic With Applications in Financial Issues
No ratings yet
A Hybrid Time Series Forecasting Method Based On Neutrosophic Logic With Applications in Financial Issues
21 pages
Prediction of Time Series Data Using GA-BPNN Based Hybrid ANN Model
No ratings yet
Prediction of Time Series Data Using GA-BPNN Based Hybrid ANN Model
6 pages
Fulltext PDF
No ratings yet
Fulltext PDF
28 pages
Fulltext PDF
No ratings yet
Fulltext PDF
28 pages
Wang3 1 PDF
No ratings yet
Wang3 1 PDF
34 pages
NN5 Time Series Forecasting Model
No ratings yet
NN5 Time Series Forecasting Model
29 pages
Entropy: A Labeling Method For Financial Time Series Prediction Based On Trends
No ratings yet
Entropy: A Labeling Method For Financial Time Series Prediction Based On Trends
27 pages
A Survey On Forecasting of Time Series Data
No ratings yet
A Survey On Forecasting of Time Series Data
8 pages
1 s2.0 S2314728817300715 Main
No ratings yet
1 s2.0 S2314728817300715 Main
7 pages
Notes
No ratings yet
Notes
37 pages
Hybrid ARIMA-ANN Model for Time Series Forecasting
No ratings yet
Hybrid ARIMA-ANN Model for Time Series Forecasting
14 pages
Financial Time Series Forecasting Applying Deep Learning Algorithms
No ratings yet
Financial Time Series Forecasting Applying Deep Learning Algorithms
16 pages
Time Series Forecasting Using Back Propagation Neural Network With ADE Algorithm
No ratings yet
Time Series Forecasting Using Back Propagation Neural Network With ADE Algorithm
5 pages
A Novel Hybridization of Artificial Neural Networks and ARIMA Models For Time Series Forecasting
No ratings yet
A Novel Hybridization of Artificial Neural Networks and ARIMA Models For Time Series Forecasting
12 pages
Time Series 1
No ratings yet
Time Series 1
134 pages
Expert Systems With Applications: Erdal Kayacan, Baris Ulutas, Okyay Kaynak
No ratings yet
Expert Systems With Applications: Erdal Kayacan, Baris Ulutas, Okyay Kaynak
6 pages
Enhancing Time Series Forecasting Accuracy With Deep Learning Models: A Comparative Study
No ratings yet
Enhancing Time Series Forecasting Accuracy With Deep Learning Models: A Comparative Study
10 pages
Automated ML Model Selection for Time-Series
No ratings yet
Automated ML Model Selection for Time-Series
16 pages
Artificial Neural Networks in Time Series Forecasting: A Comparative Analysis
No ratings yet
Artificial Neural Networks in Time Series Forecasting: A Comparative Analysis
21 pages
To Study and Analyze To Foresee Market Using Data Mining Technique
No ratings yet
To Study and Analyze To Foresee Market Using Data Mining Technique
4 pages
Review of Time Series Forecasting Pipelines
No ratings yet
Review of Time Series Forecasting Pipelines
19 pages
A Comparative Study and Analysis of Time
No ratings yet
A Comparative Study and Analysis of Time
7 pages
Training SubsetSelection To Improve Prediction Accuracy Ininvestment Ranking
No ratings yet
Training SubsetSelection To Improve Prediction Accuracy Ininvestment Ranking
5 pages
Forward Forecast of Stock Price Using Sliding-Window Metaheuristic-Optimized Machine-Learning Regression
No ratings yet
Forward Forecast of Stock Price Using Sliding-Window Metaheuristic-Optimized Machine-Learning Regression
11 pages
Ijirt155434 Paper
No ratings yet
Ijirt155434 Paper
5 pages
MBA Analytics For Finance 11
No ratings yet
MBA Analytics For Finance 11
12 pages
Applsci 09 05574 v2
No ratings yet
Applsci 09 05574 v2
20 pages
Neural Network for Financial Predictions
No ratings yet
Neural Network for Financial Predictions
7 pages
Development of Trading Strategies Using Time Series Based On
No ratings yet
Development of Trading Strategies Using Time Series Based On
14 pages
Ncaa D 25 02195
No ratings yet
Ncaa D 25 02195
13 pages
Financial Time Series Prediction Using Deep Learning: Ariel Navon, Yosi Keller
No ratings yet
Financial Time Series Prediction Using Deep Learning: Ariel Navon, Yosi Keller
21 pages
1330-Article Text-124126336-1-10-20211222
No ratings yet
1330-Article Text-124126336-1-10-20211222
8 pages
An Artificial Neural Network P D Q Model For Times
No ratings yet
An Artificial Neural Network P D Q Model For Times
12 pages
Advanced Fuzzy Model for Time Series
No ratings yet
Advanced Fuzzy Model for Time Series
19 pages
Jmis 8 3 183
No ratings yet
Jmis 8 3 183
8 pages
Forecast Methods For Time Series Data A Survey
No ratings yet
Forecast Methods For Time Series Data A Survey
18 pages
Parallel Multivariate Deep Learning Models For Time-Series Prediction: A Comparative Analysis in Asian Stock Markets
No ratings yet
Parallel Multivariate Deep Learning Models For Time-Series Prediction: A Comparative Analysis in Asian Stock Markets
12 pages
Retail Profit Forecasting Model
No ratings yet
Retail Profit Forecasting Model
10 pages
Time Series Forecasting Models
No ratings yet
Time Series Forecasting Models
14 pages
1 s2.0 S0957417422009447 Main
No ratings yet
1 s2.0 S0957417422009447 Main
15 pages
7 Tsa Ri
No ratings yet
7 Tsa Ri
18 pages
Scopus q4 Anggota - cm-3367（机构已修改）Bkd Kuz Nugroho
No ratings yet
Scopus q4 Anggota - cm-3367（机构已修改）Bkd Kuz Nugroho
13 pages
Causal and ANN Models for Forecasting
No ratings yet
Causal and ANN Models for Forecasting
6 pages
Deep Learning and Optimisation For Quality of Service Modelling
No ratings yet
Deep Learning and Optimisation For Quality of Service Modelling
10 pages
An Adaptive Hybrid Algorithm For Time Series Prediction in Healthcare
No ratings yet
An Adaptive Hybrid Algorithm For Time Series Prediction in Healthcare
6 pages
Gratis: Generating Time Series With Diverse and Controllable Characteristics
No ratings yet
Gratis: Generating Time Series With Diverse and Controllable Characteristics
23 pages
Meta-Learning How To Forecast Time Series
No ratings yet
Meta-Learning How To Forecast Time Series
38 pages
Applsci 13 10782
No ratings yet
Applsci 13 10782
12 pages
2
No ratings yet
2
4 pages
اسلوب مقترح لمسالة اختيار افضل نموذج تكهن في السلاسل الزمنية حالة
No ratings yet
اسلوب مقترح لمسالة اختيار افضل نموذج تكهن في السلاسل الزمنية حالة
20 pages
Visvesvaraya Technological University Belagavi-590018: "Machine Learning Algorithm For Time Series Data"
No ratings yet
Visvesvaraya Technological University Belagavi-590018: "Machine Learning Algorithm For Time Series Data"
10 pages
FE Roblox Egor (Can Customize)
No ratings yet
FE Roblox Egor (Can Customize)
4 pages
Saudi General Aviation Airshow (Sand & Fun) : Single Day Pass L Promo 09:30 Tue 19 Nov 2024
No ratings yet
Saudi General Aviation Airshow (Sand & Fun) : Single Day Pass L Promo 09:30 Tue 19 Nov 2024
5 pages
Comptia Network n10 008 Exam Cram 1
No ratings yet
Comptia Network n10 008 Exam Cram 1
50 pages
Lab Da-4
No ratings yet
Lab Da-4
25 pages
SEO Content Strategy Guide
No ratings yet
SEO Content Strategy Guide
9 pages
Vulcan 2022.4 Highlights
No ratings yet
Vulcan 2022.4 Highlights
5 pages
Best Hackathon Idea For Amaravati Quantum Valley H
No ratings yet
Best Hackathon Idea For Amaravati Quantum Valley H
6 pages
Pioneer Deh-1100 1150 1190mp MPG SM
No ratings yet
Pioneer Deh-1100 1150 1190mp MPG SM
67 pages
Bbs 1
No ratings yet
Bbs 1
1 page
Azzurro PH Hyd zp1 en
No ratings yet
Azzurro PH Hyd zp1 en
1 page
LEAP PPT Nicole Cusap - 102232
No ratings yet
LEAP PPT Nicole Cusap - 102232
10 pages
Script NPC
No ratings yet
Script NPC
18 pages
Advanced Drones for Agriculture
No ratings yet
Advanced Drones for Agriculture
2 pages
Overview of Forging Processes
No ratings yet
Overview of Forging Processes
31 pages
WavLM: Advancements in Speech Processing
No ratings yet
WavLM: Advancements in Speech Processing
14 pages
2019-20 ICSE Class 10 Computer SQP
No ratings yet
2019-20 ICSE Class 10 Computer SQP
6 pages
Paper 17 Modern Regularization Methods For Inverse Problems
No ratings yet
Paper 17 Modern Regularization Methods For Inverse Problems
97 pages
3D Cube Model Specifications and Details
No ratings yet
3D Cube Model Specifications and Details
7 pages
DW Components
No ratings yet
DW Components
30 pages
Media and Information Literacy (Mil)
No ratings yet
Media and Information Literacy (Mil)
52 pages
Praveen DBT and Snowflake Training 9703180969 - Jan 31
No ratings yet
Praveen DBT and Snowflake Training 9703180969 - Jan 31
10 pages
TCS NQT First Level Shortlisted - Sairam
No ratings yet
TCS NQT First Level Shortlisted - Sairam
12 pages
HP Compaq
No ratings yet
HP Compaq
34 pages
Track Consignment
No ratings yet
Track Consignment
1 page
a&s Security 50 - Top 50 2024
No ratings yet
a&s Security 50 - Top 50 2024
1 page
LRM-C 230911 135749
No ratings yet
LRM-C 230911 135749
5 pages
FANUC AC Servo Motor β Series Overview
No ratings yet
FANUC AC Servo Motor β Series Overview
26 pages
Lab Programs 12,13,14,15,16
No ratings yet
Lab Programs 12,13,14,15,16
3 pages
Chapter1 - Introduction To Modern Facilities Planning and Design
No ratings yet
Chapter1 - Introduction To Modern Facilities Planning and Design
62 pages
Protection Layers Sis
No ratings yet
Protection Layers Sis
18 pages

Artificial Intelligence Algorithm For Optimal Time

Uploaded by

Artificial Intelligence Algorithm For Optimal Time

Uploaded by

This article has been accepted for publication in a future issue of this journal, but has not been

Artificial intelligence algorithm for optimal time

Corresponding Author: Kang Wang, E-mail: kaobowang@[Link]

I. INTRODUCTION and natural language processing, etc. [[1]. Many intelligent

VOLUME XX, 2017 1

VOLUME XX, 2017 9

A. Three types of models generating observations. A complete HMM model consists of

FIGURE 1 Topological structure of hidden Markov model

memory of past patterns and the memory of past noise. The

B. Characteristic analysis of time series data

VOLUME XX, 2017 9

lot of research to reveal its data characteristics, the following

FIGURE 3 Hidden Markov Model Generates Sequence Data

C. Model comparison and model selection theory

VOLUME XX, 2017 9

If the data is not stable then hi  k   Ai i21   Gi hi 1 (7)

VOLUME XX, 2017 9

VOLUME XX, 2017 9

VOLUME XX, 2017 9

VOLUME XX, 2017 9

FIGURE 6 HBHCTS algorithm flowchart

O14 6. 52E-169 4. 67E-160

O15 5. 68E-168 7.38E-160

VOLUME XX, 2017 9

VOLUME XX, 2017 9

increased, and after the distance threshold is greater than 0.08,

VOLUME XX, 2017 9

VOLUME XX, 2017 9

[1] Antonio M. Durán-Rosal, Pedro A. Gutiérrez, Salcedo-

VOLUME XX, 2017 9

VOLUME XX, 2017 9

Studies in Computational Intelligence, vol.611, pp.237-

Kang Wang, Graduated from Wuhan

VOLUME XX, 2017 9

You might also like