

Hidden Markov Models for Stock Market Prediction


Luigi Catello, Ludovica Ruggiero, Lucia Schiavone, and Mario Valentino

arXiv:2310.03775v1 [eess.SY] 5 Oct 2023

Abstract—The stock market presents a challenging environment for accurately predicting future stock prices due to its intricate and ever-changing nature. However, the utilization of advanced methodologies can significantly enhance the precision of stock price predictions. One such method is Hidden Markov Models (HMMs). HMMs are statistical models that can be used to model the behavior of a partially observable system, making them suitable for modeling stock prices based on historical data. Accurate stock price predictions can help traders make better investment decisions, leading to increased profits.
In this article, we trained and tested a Hidden Markov Model for the purpose of predicting a stock closing price based on its opening price and the preceding day's prices. The model's performance has been evaluated using two indicators: Mean Absolute Percentage Error (MAPE), which specifies the average accuracy of our model, and Directional Prediction Accuracy (DPA), a newly introduced indicator that accounts for the number of fractional change predictions that are correct in sign.

Index Terms—Hidden Markov Models, Stock market, forecasting.

I. INTRODUCTION

Prediction of the stock market, with its inherent volatility and potential for substantial financial gains, has long captivated the attention of institutional investors, hedge funds, and proprietary trading firms. These sophisticated market participants are driven by the desire to make accurate predictions about future price movements and trends in order to gain a competitive edge and maximize their investment returns.
Institutional investors, such as pension funds and insurance companies, manage large pools of capital on behalf of their clients or beneficiaries. The primary objective of these investors is to generate consistent returns over the long term to fulfill their financial obligations. Accurate predictions in the stock market allow them to identify opportunities and mitigate risks, enhancing portfolio performance and meeting their fiduciary responsibilities [1].
Hedge funds, on the other hand, are private investment partnerships that pool capital from accredited investors. Hedge fund managers seek to generate significant absolute returns, often irrespective of market conditions, by employing diverse investment strategies. Accurate predictions empower hedge funds to identify mispriced securities, exploit market inefficiencies, and construct profitable trading strategies, ultimately attracting investors and earning substantial profits [2].
These sophisticated market participants employ a wide range of quantitative and qualitative methods to make predictions in the stock market.
Algorithmic strategies rely on computer algorithms that systematically analyze vast amounts of market data, identify patterns, and generate trade signals. These algorithms are programmed to execute trades automatically, often leveraging complex mathematical models and statistical techniques. It is estimated that 50 percent of stock trading volume in the U.S. is currently driven by algorithmic trading [3].
While algorithmic trading offers significant advantages for institutional investors, retail investors should exercise caution when relying solely on algorithmic strategies. The success of algorithmic trading is often attributed to extensive research, robust infrastructure, and substantial resources that may not be readily available to individual retail investors. Moreover, market conditions can change rapidly, rendering algorithms ineffective or even exacerbating losses. Retail investors, lacking the same level of resources and risk management capabilities, face greater risks when depending solely on algorithmic trading.
In recent years, many strategies have been developed for algorithmic trading. Since Hidden Markov Models (HMMs) have emerged as powerful tools for the prediction of time series data, we expect them to give promising results in the context of stock market prediction.
This paper is organized as follows: in Section II, we briefly review various stock market prediction approaches; Section III outlines our theoretical approach with relevant definitions and mathematical equations; Section IV describes our technical implementation and solutions to encountered challenges; in Section V we present our results, compare them with other models, and introduce a novel evaluation metric; finally, in Section VI we summarize key findings and suggest directions for further developments.

II. PREVIOUS WORKS

Various approaches have been explored in the quest for reliable prediction models, as accurate forecasting of stock prices and identification of trends are crucial for investors and financial institutions seeking informed decisions.
Before the emergence of artificial intelligence, probabilistic techniques formed the foundation of stock market forecasting. These theories encompassed random walk models [4], [5], correlation-based methods [6], [7], scaling properties [8], [9], stock market volatility and investor sentiment [10], [11], probability distributions of stock price returns [12], [13], and other relevant approaches.
With the advancement of machine learning and the increasing computational power of computers, several techniques utilizing big data and artificial intelligence algorithms gained popularity. Support Vector Machines (SVM) were quickly recognized as promising for time series prediction [14], and were first employed for financial forecasting in 2001 [15]. Hybrid models combining ARIMA (Autoregressive Integrated Moving Average) and SVM were introduced by Pai et al. in 2005 [16].

Neural networks, including specific types coupled with Genetic Algorithms, have been utilized to detect temporal patterns [17], [18]. Bollen et al. scraped data from Twitter to forecast the stock market based on user mood in 2011 [19].
In 2005, Hassan et al. employed Hidden Markov Models for time series prediction [20], building upon the concepts introduced in Rabiner's influential tutorial published in 1989 [21]. Their work demonstrated the effectiveness of HMMs, leading to their widespread adoption in subsequent studies [22], [23], [24].
In this project, we aimed to replicate and extend the findings reported by Gupta et al. in their 2012 study [25]. We used the MathWorks Statistics and Machine Learning Toolbox™. Moreover, we decided to make our project open-source [26], hoping to foster collaboration and encourage the research community to build upon our work, replicate our experiments, and explore further enhancements to advance the field.

III. APPROACH

A. Markov Chains

A Markov chain is a stochastic model that represents a sequence of events or states. In our analysis, we will specifically focus on first-order Markov chains, which adhere to the Markov property. Let S = {S_1, S_2, ..., S_n} be the set of all possible states, and X = {X_k | X_k ∈ S, k = 1, ..., T} be the states time series. The Markov property states that for any k ≥ 0 and states X_0, ..., X_k:

P(X_{k+1} = S_j | X_k = S_i, X_{k-1}, ..., X_0) = P(X_{k+1} = S_j | X_k = S_i)    (1)

In other words, the probability of transitioning to a certain state S_j at time step k+1 depends only on the current state S_i at time step k and not on any previous states. This property allows us to compute the probability distribution of the Markov chain at any future time step based exclusively on its current state.
Formally, a first-order Markov chain is defined by the set of states S and a transition probability matrix A = [a_ij], where a_ij represents the probability of transitioning from state S_i to state S_j in one step.

Fig. 1. Example of a Markov Chain with three states.

The elements of the transition matrix must satisfy the standard stochastic constraints:
• 0 ≤ a_ij ≤ 1, ∀i, j;
• Σ_{j=1}^{n} a_ij = 1, ∀i.
Let x_k ∈ R^n be the vector containing the probabilities of being in each state at time k; the system evolves according to:

x_{k+1}^T = x_k^T A    (2)

We refer to the initial distribution of probabilities as π = {π_1, π_2, ..., π_n}.
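To make Equation 2 concrete, the short MATLAB sketch below propagates an initial state distribution through a transition matrix for a few steps. The three-state matrix is a made-up example for illustration, not one of our trained models.

% Hypothetical 3-state transition matrix A (each row sums to 1)
A = [0.7 0.2 0.1;
     0.3 0.5 0.2;
     0.2 0.3 0.5];
x = [1 0 0];          % initial distribution pi: the chain starts in S1
for k = 1:5
    x = x * A;        % Equation 2: x_{k+1}' = x_k' * A
end
disp(x)               % probability of being in each state after 5 steps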

B. Hidden Markov Models

The model presented above implicitly assumes that each state corresponds to an observable (physical) event. Markov chains have proven to be valuable tools for modeling sequential data in various domains. However, many real-world scenarios involve underlying states that influence observations but are not directly observable. This limitation led to the development of Hidden Markov Models, which extend the basic Markov chain model by introducing hidden or unobservable states that affect the observed data.
The hidden state process of an HMM is a Markov chain, where each state generates an observation having a certain probability distribution that depends only on the state itself. Let O = {O_k | O_k ∈ 𝒪, k = 1, ..., T} be the observed sequence, where 𝒪 is the set of possible observations. The hidden states are assumed to evolve according to Equation 2.
The probability of symbol o ∈ 𝒪 being emitted by state S_i is described by the emission probability function b_i : o ↦ b_i(o) ∈ [0, 1], ∀i.
Both in the current and following sections, we refer to the emission probability matrix B = {b_ij = b_i(o_j)}.

Fig. 2. Hidden Markov Chain Example with two states and two possible outputs.

C. Applications of HMMs: Common Problems and Solutions

The most common problems analyzed using HMMs include the evaluation problem, the decoding problem, and the parameter estimation problem. These problems and their corresponding solutions have been extensively studied and documented in the literature [21]. In the following paragraphs, we provide an overview of these problems and their solutions.
1) Evaluation Problem: The evaluation problem in HMMs involves computing the probability of a known observation sequence, given the model. Specifically, given an HMM with the parameters π, A, and B, and an observed sequence O, we want to compute P(O | π, A, B).

The procedure used to solve this problem involves computing the forward variable, denoted as α_k(i), which represents the probability of being in state S_i at time k and observing the sequence up to time k. It is computed recursively for each time step k and state S_i.
At the initialization step, the forward variable for the initial time step k = 1 is calculated as α_1(i) = π_i · b_i(O_1), ∀i, where π_i is the initial probability of being in state S_i and b_i(O_1) is the emission probability for the first observation O_1.
For each subsequent time step k > 1, the forward variable α_k(i) is computed as the sum of the probabilities of all possible paths that could have generated the observations up to time k and reached state S_i. This is expressed by the following formula:

α_k(i) = [ Σ_{j=1}^{n} α_{k-1}(j) · a_ji ] · b_i(O_k), ∀i    (3)

By recursively calculating the forward variables for each time step, the algorithm evaluates the probability of observing the entire sequence, given the HMM parameters.
Finally, to obtain the overall probability of the observed sequence, the forward procedure computes the sum of the forward variables at the final time step T:

P(O | π, A, B) = Σ_{i=1}^{n} α_T(i)    (4)
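A minimal MATLAB implementation of the forward recursion in Equations 3 and 4 is sketched below. It assumes p0 is a 1-by-n vector of initial probabilities, A is the n-by-n transition matrix, B is the n-by-m emission matrix, and O is a vector of observation indices; a practical implementation would work in log space or with scaling to avoid underflow on long sequences.

% Forward procedure: returns P(O | p0, A, B)
function p = forwardProb(p0, A, B, O)
    n = size(A, 1);
    T = numel(O);
    alpha = zeros(T, n);
    alpha(1, :) = p0 .* B(:, O(1))';                      % alpha_1(i) = pi_i * b_i(O_1)
    for k = 2:T
        alpha(k, :) = (alpha(k-1, :) * A) .* B(:, O(k))'; % Equation 3
    end
    p = sum(alpha(T, :));                                 % Equation 4
end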
2) Decoding Problem: The decoding problem involves determining the most likely sequence of hidden states given an observation sequence and the model. Given an HMM with the parameters π, A, and B, and an observed sequence O, we want to find the hidden state sequence X that maximizes P(X | O, π, A, B).
The solution to the decoding problem is commonly addressed using the Viterbi algorithm.
The algorithm works by iteratively calculating the most likely path to each state at each time step, considering both the current observation and the previous states' probabilities. At each time step, it computes the probability of being in each state and tracks the most probable path leading to that state.
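The Viterbi recursion can be written compactly; the sketch below is a bare-bones version with the same argument conventions as the forward procedure above (p0, A, B, and a vector of observation indices O). The toolbox also provides hmmviterbi, which, like the other hmm* functions, assumes the chain starts in state 1.

% Viterbi decoding: most likely state sequence for observations O
function path = viterbiPath(p0, A, B, O)
    n = size(A, 1);
    T = numel(O);
    logd = zeros(T, n);                          % best log-probability of a path ending in each state
    back = zeros(T, n);                          % back-pointers
    logd(1, :) = log(p0) + log(B(:, O(1))');
    for k = 2:T
        for j = 1:n
            [best, arg] = max(logd(k-1, :) + log(A(:, j))');
            logd(k, j) = best + log(B(j, O(k)));
            back(k, j) = arg;
        end
    end
    path = zeros(1, T);
    [~, path(T)] = max(logd(T, :));
    for k = T-1:-1:1
        path(k) = back(k+1, path(k+1));          % follow the back-pointers
    end
end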
3) Parameter Estimation Problem: The parameter estimation problem in HMMs involves adjusting the model parameters to maximize the probability of an observed sequence. Given an observation sequence O, we want to estimate the optimal values for the parameters π, A, and B that maximize P(O | π, A, B).
The solution to this problem is typically addressed using the forward-backward algorithm, also known as the Baum-Welch algorithm.
The Baum-Welch algorithm is a specific implementation of the Expectation-Maximization (EM) algorithm, tailored for HMMs. It iteratively performs three main steps: the forward step, the backward step, and the update step. In the forward step, the algorithm computes the probability of being in a specific state at each time step, given the observed sequence up to that point. This is the evaluation problem explained before.
In the backward step, the algorithm computes the probability of observing the remaining part of the sequence from a given state at each time step. It complements the forward procedure by computing the backward variable, denoted as β_k(i), which represents the probability of being in state S_i at time k and observing the remaining sequence from time k+1 to the end. The backward variable is computed recursively for each time step k and state S_i. At the initialization step, the backward variable for the final time step is set as β_T(i) = 1, ∀i.
For each time step k < T, the backward variable β_k(i) is computed as the sum of the probabilities of all possible paths that can be generated from state S_i at time k to the end of the sequence. This is expressed by the following formula:

β_k(i) = Σ_{j=1}^{n} β_{k+1}(j) · a_ij · b_j(O_{k+1})    (5)

After computing the forward and backward variables, they are used to refine the parameters of the HMM based on the observed sequence. In fact, let ξ_k(i, j) be the probability of being in state S_i at time k and state S_j at time k+1, given the model and the observation sequence. It can be computed as:

ξ_k(i, j) = P(X_k = i, X_{k+1} = j, O | π, A, B) / P(O | π, A, B)
          = α_k(i) · a_ij · b_j(O_{k+1}) · β_{k+1}(j) / Σ_{r=1}^{n} Σ_{h=1}^{n} α_k(r) · a_rh · b_h(O_{k+1}) · β_{k+1}(h)    (6)

Let us also define γ_k(i) as the probability of being in state S_i at time k, given the observation sequence and the model:

γ_k(i) = P(X_k = i | O, π, A, B) = α_k(i) · β_k(i) / Σ_{j=1}^{n} α_k(j) · β_k(j)    (7)

We can relate those quantities:

γ_k(i) = Σ_{j=1}^{n} ξ_k(i, j)    (8)

Moreover, Σ_{k=1}^{T-1} γ_k(i) represents the expected number of transitions from state S_i, and Σ_{k=1}^{T-1} ξ_k(i, j) represents the expected number of transitions from state S_i to state S_j.
The update step involves estimating the new values for the initial state probabilities, transition probabilities, and emission probabilities. For the initial state probabilities, the updated values are calculated as the normalized forward-backward probabilities at the initial time step:

π_i = γ_1(i)    (9)

The transition probabilities are updated based on the expected number of transitions between states, normalized over the expected number of transitions from each state:

a_ij = Σ_{k=1}^{T-1} ξ_k(i, j) / Σ_{k=1}^{T-1} γ_k(i)    (10)

The estimate for the emission probability b_j(o), the probability of observing symbol o from state S_j, is calculated by evaluating the expected number of times in which o is emitted from state S_j. This value is then normalized by the expected number of times the model is in state S_j:

b_j(o) = Σ_{k=1}^{T} γ_k(j) · δ(O_k, o) / Σ_{k=1}^{T} γ_k(j)    (11)

where δ(O_k, o) is the Kronecker delta function, which evaluates to 1 when O_k is equal to the observed symbol o, and 0 otherwise.
This process iteratively refines the model parameters until convergence, where they stabilize, meaning that the likelihood of the sequence is maximized.
Choosing the initial conditions wisely is crucial, as it can significantly impact the estimation of the HMM parameters. By selecting appropriate initial conditions, the Baum-Welch algorithm is more likely to converge to parameters that provide accurate modeling of the underlying dynamics.
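For completeness, a single Baum-Welch re-estimation pass over one observation sequence could be sketched as follows, again assuming p0, A, B, and an index-coded sequence O, and ignoring the numerical scaling that a practical implementation needs. In our project we did not hand-roll this loop: the toolbox function hmmtrain performs these updates internally.

% One Baum-Welch update for a single sequence (no scaling, for illustration)
function [Anew, Bnew] = baumWelchStep(p0, A, B, O)
    [n, m] = size(B);  T = numel(O);
    alpha = zeros(T, n);  beta = ones(T, n);
    alpha(1, :) = p0 .* B(:, O(1))';
    for k = 2:T
        alpha(k, :) = (alpha(k-1, :) * A) .* B(:, O(k))';        % forward step
    end
    for k = T-1:-1:1
        beta(k, :) = (A * (B(:, O(k+1)) .* beta(k+1, :)'))';     % backward step, Eq. 5
    end
    gamma = alpha .* beta ./ sum(alpha(T, :));                   % Eq. 7 (denominator equals P(O))
    Anew = zeros(n);  Bnew = zeros(n, m);
    for k = 1:T-1
        xi = (alpha(k, :)' .* A) .* (B(:, O(k+1)) .* beta(k+1, :)')';  % Eq. 6 numerator
        Anew = Anew + xi ./ sum(xi(:));                          % accumulate normalized xi_k
    end
    Anew = Anew ./ sum(gamma(1:T-1, :), 1)';                     % Eq. 10
    for o = 1:m
        Bnew(:, o) = sum(gamma(O == o, :), 1)' ./ sum(gamma, 1)'; % Eq. 11
    end
end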

D. Gaussian Mixture Models

Gaussian Mixture Models (GMMs) are powerful probabilistic models used to represent complex probability distributions. A GMM is a weighted sum of multiple Gaussian distributions, where each component represents a subpopulation within the data.
When a single observation is composed of multiple values (multivariate data), GMMs are employed as multivariate mixture models. Each Gaussian component in a GMM represents a multivariate distribution with its own mean vector and covariance matrix. The probability density function of a multivariate GMM is given by:

p(y) = Σ_{i=1}^{K} c_i N(y | μ_i, Σ_i)    (12)

where K is the number of Gaussian components, c_i represents the weight associated with the i-th component, and N(y | μ_i, Σ_i) denotes the multivariate Gaussian distribution with mean vector μ_i and covariance matrix Σ_i.
The weights c_i satisfy the constraint Σ_{i=1}^{K} c_i = 1, ensuring that the probabilities sum up to one. These weights control the contribution of each Gaussian component to the overall distribution.
GMMs are particularly useful when modeling complex multivariate data that exhibit multiple modes or clusters, because they can capture the inherent structure and variability present in the data.
In the context of Hidden Markov Models, GMMs are employed to initialize the emission probabilities of the hidden states. The GMM is fitted on the training dataset: the training algorithm estimates the parameters of each Gaussian component, i.e. the mean vectors and covariance matrices. These parameters are used to initialize the emission probabilities of the corresponding hidden states.

IV. TECHNICAL IMPLEMENTATION

In our study, three specific functions provided by the Statistics and Machine Learning Toolbox™ in MATLAB were utilized for the training and testing of the HMM: hmmtrain, hmmdecode and fitgmdist.
The model was trained and tested on the historical daily prices of Apple, IBM and Dell stocks that are publicly available on the web. Each observation in our dataset comprises three distinct values representing the daily fractional change, fractional high, and fractional low prices:

O_k = ( (Close − Open)/Open, (High − Open)/Open, (Open − Low)/Open )
O_k := (fracChange, fracHigh, fracLow)    (13)

The array O_k is a three-dimensional array consisting of real values. Since the probability of guessing any real value is mathematically zero, it was necessary to discretize the observations. The number of points used for the discretization is specified in Table I.

TABLE I
NUMBER OF DISCRETE POINTS FOR EACH DIMENSION

Variable      Number of points
fracChange    n_fC = 50
fracHigh      n_fH = 10
fracLow       n_fL = 10

We used dynamic edges for the discretization: at each training run, the maximum and minimum values of fracChange, fracHigh, and fracLow are computed. We then generated three linearly spaced vectors for the edges, using the minima as the lowest values and the maxima as the highest values. Finally, we assigned to each discretized value its corresponding index in the edges array.
Furthermore, compatibility with our monovariate Hidden Markov Model required mapping the three-dimensional space to a one-dimensional space. The mapping process was accomplished by enumerating the points within the discrete three-dimensional set, as described in Equation 14; the inverse mapping was achieved using Equations 15:

n = (z − 1) · (x_max · y_max) + (y − 1) · x_max + x    (14)

z = ⌊(n − 1) / (x_max · y_max)⌋ + 1
y = ⌊(n − 1 − (z − 1) · x_max · y_max) / x_max⌋ + 1    (15)
x = (n − 1) mod x_max + 1
A. Training

During the initial training runs we assumed four underlying hidden states, where each state generates outputs represented by a GMM with four components. We then re-trained our model varying these values; the specific parameters used for each training run can be found in our GitHub repository [26]. The parameters of the GMMs were estimated using the MATLAB function fitgmdist, which optimizes the model using the Expectation-Maximization (EM) algorithm. The initial parameters of the GMM are obtained through k-means clustering.

The resulting probability density function served as the initial estimate for the emission matrix. Moreover, we assigned a uniform distribution of probabilities as the initial estimate for the transition matrix.
The training data for the HMM is constructed using a rolling window approach. In this approach, each observation sequence spans a fixed duration of 10 days; we refer to this duration as the latency. The window is shifted incrementally along the training period: the first sequence captures observations from the initial time point, while each following sequence incorporates new observations by sliding the window by one day. The dataset is then provided as input to the MATLAB function hmmtrain, which uses the Baum-Welch algorithm to estimate the transition and emission matrices. These estimates are initialized with the initial guesses discussed previously.
cussed previously. multiple consecutive days becomes even more difficult and
often approaches the realm of impossibility. Recognizing
B. Prediction the potential utility of predicting whether the stock value
Following the training phase, we proceeded to test our will increase or decrease during the day, we introduced a
model by predicting the stock daily close prices for different novel evaluation metric called Directional Prediction Accuracy
time frames. For each day d within the target period, the (DPA). DPA measures the percentage of correct directional
prediction process involved the following steps: predictions, providing valuable information about the accuracy
1) We considered the last latency − 1 = 9 observations of our model.
np
available. These observations represent the preceding 9 1 X 
days. DPA = δ sgn(pi − si ), sgn(ci − si ) · 100% (17)
np i=1
2) Next, we appended each possible output for the current
day d, creating a 10-day sequence. This sequence now In Equation 17, np is the total number of predictions, δ is
encompasses the 9 historical observations and one po- the Kronecker delta function, pi is the predicted stock closing
tential observation for the next day. There are nfc ·nfH ·nfL price, ci the actual closing price, and si is the stock opening
possibilities for the current day. price. Table III demonstrates that MAPE and DPA are not
3) We computed the probability for each sequence to be equivalent measures as they are in general not correlated.
generated from our trained model. Finally, we selected
the observation with the highest emission probability as TABLE III
D IRECTIONAL P REDICTION ACCURACY
the observation for the next day.
In certain cases, there may arise situations where the prob- Stock Name MAPE DPA Training Testing
2003-02-10 2004-09-13
ability of emitting the historical observations, along with any Apple Inc. 1.50% 52.11%
2004-09-10 2005-01-21
hypothetical observation, becomes 0 or very close to 0. This 1 2003-02-10 2004-10-13
1.73% 63.27%
can occur due to numerical errors or limitations inherent the 2004-09-10 2005-01-21
2021-01-04 2023-01-03
model. However, we have found that incorporating a dynamic 1.05% 53.06%
2022-01-03 2023-06-30
window can enhance the performance of our model. 2017-01-03 2023-01-03
1.01% 40.57%
To address the issue, we have modified the prediction 2019-01-03 2023-06-30
2003-02-10 2004-10-13
algorithm as follows: if the highest probability obtained is 0, IBM Corp. 0.77% 54.55%
2004-09-10 2005-01-21
we repeat the prediction algorithm while gradually reducing 2003-02-10 2004-10-13
0.82% 57.58%
the latency by one day. By reducing the latency, we aim to find 2004-09-10 2005-01-21
2003-02-10 2004-09-13
a viable solution where the emitted probabilities are non-zero. 0.68% 60.23%
2004-09-10 2005-01-21
This process of reducing latency is repeated iteratively until a 2 2021-01-04 2023-01-04
0.88% 52.73%
solution is found, while ensuring that the historical sequence 2023-01-03 2023-07-11
2021-01-04 2023-01-03
remains sufficiently long. In our case, we have set a minimum Dell Inc. 1.45% 60.32%
2022-01-03 2023-07-11
requirement of four days for the historical sequence.
V. RESULTS

We used multiple metrics to evaluate the performance of our models. The Mean Absolute Percentage Error (MAPE) is calculated between the actual and predicted stock closing prices:

MAPE = (1/n_p) Σ_{i=1}^{n_p} |p_i − c_i| / |c_i| · 100%    (16)

On day i, p_i is the predicted stock closing price, c_i is the actual stock closing price, and n_p is the total number of predictions.
Table II shows our results on two different stocks, compared to the results from other papers [25], [24].

TABLE II
COMPARISON OF MEAN ABSOLUTE PERCENTAGE ERROR

Stock Name    Our HMM    HMM-MAP [25]    HMM + Fuzzy Model [24]    ARIMA / ANN
Apple Inc.    1.50       1.51            1.77                      1.80
IBM Corp.     0.68       0.61            0.78                      0.97

Predicting the movement of stock prices for a single day is a challenging task in itself. Achieving accuracy in predicting multiple consecutive days becomes even more difficult and often approaches the realm of impossibility. Recognizing the potential utility of predicting whether the stock value will increase or decrease during the day, we introduced a novel evaluation metric called Directional Prediction Accuracy (DPA). DPA measures the percentage of correct directional predictions, providing valuable information about the accuracy of our model:

DPA = (1/n_p) Σ_{i=1}^{n_p} δ( sgn(p_i − s_i), sgn(c_i − s_i) ) · 100%    (17)

In Equation 17, n_p is the total number of predictions, δ is the Kronecker delta function, p_i is the predicted stock closing price, c_i the actual closing price, and s_i is the stock opening price.
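Both metrics are straightforward to compute from the prediction log; a minimal MATLAB sketch follows, assuming p, c, and s are vectors of predicted closes, actual closes, and opening prices for the test period.

% MAPE (Equation 16) and DPA (Equation 17)
mape = mean(abs(p - c) ./ abs(c)) * 100;
dpa  = mean(sign(p - s) == sign(c - s)) * 100;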
Table III demonstrates that MAPE and DPA are not equivalent measures, as they are in general not correlated.

TABLE III
DIRECTIONAL PREDICTION ACCURACY

Stock Name    MAPE     DPA      Training                   Testing
Apple Inc.    1.50%    52.11%   2003-02-10 to 2004-09-10   2004-09-13 to 2005-01-21
          ¹   1.73%    63.27%   2003-02-10 to 2004-09-10   2004-10-13 to 2005-01-21
              1.05%    53.06%   2021-01-04 to 2022-01-03   2023-01-03 to 2023-06-30
              1.01%    40.57%   2017-01-03 to 2019-01-03   2023-01-03 to 2023-06-30
IBM Corp.     0.77%    54.55%   2003-02-10 to 2004-09-10   2004-10-13 to 2005-01-21
              0.82%    57.58%   2003-02-10 to 2004-09-10   2004-10-13 to 2005-01-21
              0.68%    60.23%   2003-02-10 to 2004-09-10   2004-09-13 to 2005-01-21
          ²   0.88%    52.73%   2021-01-04 to 2023-01-03   2023-01-04 to 2023-07-11
Dell Inc.     1.45%    60.32%   2021-01-04 to 2022-01-03   2023-01-03 to 2023-07-11

¹ Figure 3 shows the predicted and actual close values for this training run. Green lines indicate predictions that correctly guessed the sign of the fractional change with respect to the opening price; therefore, 63.27% of the predictions are correct in sign. The high MAPE reflects that the predictions often deviate from the actual close value, despite being correct in sign.
² The results of this training run are shown in Figure 5: since the MAPE is low, the predicted close values are, on average, more accurate.

Fig. 3. Actual vs Forecasted close prices for AAPL.

Fig. 4. Candlestick chart along with predicted close values for AAPL.

Figure 4 highlights the difference between the two indicators: the candlestick chart shows the actual prices for AAPL during the testing period 2004-10-13 to 2005-01-21, and the dots show the predicted close values for the corresponding day. When the dots are green, they contribute to increasing the DPA; they only improve the MAPE when they get closer to the actual close value.

VI. CONCLUSIONS AND FURTHER DEVELOPMENTS

In this study, we have undertaken a comprehensive implementation of the Stock Prediction Hidden Markov Model originally proposed by Gupta et al. [25]. Building upon their foundational work, we aimed to assess the model's performance on diverse datasets and explore its effectiveness in predicting stock market movements. By carefully replicating and extending their approach, we have conducted a rigorous evaluation, comparing the results with benchmark models to gain valuable insights into the model's capabilities and limitations.
Our model was trained on a time period of one to two years and used to make predictions on a different time span, demonstrating its flexibility and re-usability. The complete source code, along with pre-trained models and a summary of their characteristics, is available on GitHub [26].
Our implementation underwent rigorous testing on various stocks, and we compared the results with those obtained from HMM-MAP, HMM-Fuzzy, ARIMA, and ANN models. We conducted a thorough exploration of different hyperparameters, measuring their impact in search of optimal values. The outcomes demonstrated a significantly lower Mean Absolute Percentage Error compared to the HMM-Fuzzy, ARIMA, and ANN models when trained on the same years as those in [25] and [24]. Furthermore, to gain additional insights into the efficiency of our model, we developed a novel evaluation metric named Directional Prediction Accuracy (DPA). The DPA allowed us to assess the accuracy of our predictions in capturing stock price movements, providing valuable information for model performance assessment.
For further improvement, we propose fine-tuning the emission and transition matrices on the latest time window each day before making predictions. This approach involves training the main model on historical data, then updating the matrices with data from a more recent period just before making each prediction. For example, we could train the main model from 2021-01-01 to 2023-01-01 and fine-tune the matrices using data from 2022-06-01 to 2023-06-01 before making a prediction for 2023-06-02. This process could potentially capture more recent market trends and improve the accuracy of predictions.
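One possible realization of this fine-tuning step, under the assumption that the previously estimated matrices are simply used as warm-start initial guesses for a further hmmtrain run on the recent window, is sketched below; recentSeqs stands for the rolling-window sequences built from the fine-tuning period.

% Daily fine-tuning: warm-start Baum-Welch from the main model's estimates
[ftTrans, ftEmit] = hmmtrain(recentSeqs, estTrans, estEmit, 'Maxiterations', 20);
% ftTrans and ftEmit would then replace estTrans and estEmit for the next day's prediction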
In conclusion, our implementation of the Stock Prediction HMM exhibited strong predictive capabilities, outperforming other benchmark models. The proposed fine-tuning approach holds promise for enhancing future predictions and warrants further investigation to capitalize on recent market dynamics.

Fig. 5. Actual vs Forecasted close prices for IBM

Our work contributes to the growing field of stock market prediction and provides valuable insights for traders and financial analysts.

July 2023

REFERENCES

[1] "Institutional investors definition — Nasdaq." [Online]. Available: https://www.nasdaq.com/glossary/i/institutional-investors
[2] "Investor bulletin: Hedge funds." [Online]. Available: https://www.investor.gov/introduction-investing/investing-basics/investment-products/private-investment-funds/hedge-funds
[3] "Algo or algorithmic trading definition — Nasdaq." [Online]. Available: https://www.nasdaq.com/glossary/a/algo-trading
[4] E. F. Fama, "Random walks in stock market prices," Financial Analysts Journal, vol. 51, no. 1, pp. 75–80, 1995.
[5] A. W. Lo and A. C. MacKinlay, "Stock market prices do not follow random walks: Evidence from a simple specification test," The Review of Financial Studies, vol. 1, no. 1, pp. 41–66, 1988.
[6] F. Longin and B. Solnik, "Is the correlation in international equity returns constant: 1960–1990?" Journal of International Money and Finance, vol. 14, no. 1, pp. 3–26, 1995.
[7] Z. Ding, C. W. J. Granger, and R. F. Engle, "A long memory property of stock market returns and a new model," Journal of Empirical Finance, vol. 1, no. 1, pp. 83–106, 1993.
[8] T. D. Matteo, T. Aste, and M. M. Dacorogna, "Scaling behaviors in differently developed markets," Physica A: Statistical Mechanics and its Applications, vol. 324, no. 1, pp. 183–188, 2003.
[9] R. N. Mantegna and H. E. Stanley, "Scaling behaviour in the dynamics of an economic index," Nature, vol. 376, no. 6535, pp. 46–49, 1995.
[10] K. R. French, G. W. Schwert, and R. F. Stambaugh, "Expected stock returns and volatility," Journal of Financial Economics, vol. 19, no. 1, pp. 3–29, 1987.
[11] W. Y. Lee, C. X. Jiang, and D. C. Indro, "Stock market volatility, excess returns, and the role of investor sentiment," Journal of Banking & Finance, vol. 26, no. 12, pp. 2277–2299, 2002.
[12] X. Gabaix, P. Gopikrishnan, V. Plerou, and H. E. Stanley, "A theory of power-law distributions in financial market fluctuations," Nature, vol. 423, no. 6937, pp. 267–270, 2003.
[13] R. N. Mantegna, Z. Palágyi, and H. E. Stanley, "Applications of statistical mechanics to finance," Physica A: Statistical Mechanics and its Applications, vol. 274, no. 1, pp. 216–221, 1999.
[14] S. Mukherjee, E. Osuna, and F. Girosi, "Nonlinear prediction of chaotic time series using support vector machines," in Neural Networks for Signal Processing VII: Proceedings of the 1997 IEEE Signal Processing Society Workshop, 1997, pp. 511–520.
[15] F. E. Tay and L. Cao, "Application of support vector machines in financial time series forecasting," Omega, vol. 29, no. 4, pp. 309–317, 2001.
[16] P.-F. Pai and C.-S. Lin, "A hybrid ARIMA and support vector machines model in stock price forecasting," Omega, vol. 33, no. 6, pp. 497–505, 2005.
[17] H.-j. Kim and K.-s. Shin, "A hybrid approach based on neural networks and genetic algorithms for detecting temporal patterns in stock markets," Applied Soft Computing, vol. 7, no. 2, pp. 569–576, 2007.
[18] K.-j. Kim and I. Han, "Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index," Expert Systems with Applications, vol. 19, no. 2, pp. 125–132, 2000.
[19] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, no. 1, pp. 1–8, 2011.
[20] M. R. Hassan and B. Nath, "Stock market forecasting using hidden Markov model: a new approach," in 5th International Conference on Intelligent Systems Design and Applications (ISDA'05), 2005, pp. 192–196.
[21] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
[22] M. Zhang, X. Jiang, Z. Fang, Y. Zeng, and K. Xu, "High-order hidden Markov model for trend prediction in financial time series," Physica A: Statistical Mechanics and its Applications, vol. 517, pp. 1–12, 2019.
[23] Y. Zhou and J. Zheng, "Research on stock market forecasting model based on data mining and Markov chain," in 2022 IEEE 2nd International Conference on Mobile Networks and Wireless Communications (ICMNWC), 2022, pp. 1–5.
[24] M. R. Hassan, "A combination of hidden Markov model and fuzzy model for stock market forecasting," Neurocomputing, vol. 72, no. 16, pp. 3439–3446, 2009.
[25] A. Gupta and B. Dhingra, "Stock market prediction using hidden Markov models," in 2012 Students Conference on Engineering and Systems, 2012, pp. 1–4.
[26] valentinomario/HMM-Stock-Market-Prediction, 2023. [Online]. Available: https://github.com/valentinomario/HMM-Stock-Market-Prediction
