Steps for implementing Time Series Data EDA
1. Data Ingestion
2. EDA of the Data
3. Preprocessing of the Data
4. Model Building
5. Model Evaluation
Data Ingestion Steps:
1. Import the required libraries such as numpy, pandas, matplotlib, seaborn, etc.
2. Load the time series data into a pandas DataFrame
3. Set the datetime column as the index of the DataFrame
4. Check the datatype of the index and convert it to datetime if necessary
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
import sys
import warnings
warnings.filterwarnings('ignore')
df=pd.read_csv('TSLA.csv')
df
Date Open High Low Close Volume Dividends Stock Splits
0 2023-01-01 102.264052 102.844516 102.016732 102.375100 190884 0.0 0.0
1 2023-01-02 103.164210 103.568883 103.072105 103.268399 144529 0.0 0.0
2 2023-01-03 104.642948 104.945523 104.396706 104.661726 114590 0.0 0.0
3 2023-01-04 107.383841 107.749974 107.409781 107.514532 144406 0.0 0.0
4 2023-01-05 109.751399 109.687393 108.002799 109.147197 152652 0.0 0.0
... ... ... ... ... ... ... ... ...
360 2023-12-27 274.683259 274.739668 274.622839 274.681922 198906 0.0 0.0
361 2023-12-28 275.187029 275.220635 274.802580 275.070082 171058 0.0 0.0
362 2023-12-29 276.618878 277.740538 276.938281 277.099232 108824 0.0 0.0
363 2023-12-30 277.458843 278.365180 277.325499 277.716507 119610 0.0 0.0
364 2023-12-31 277.943161 278.736790 276.368373 277.682775 106382 0.0 0.0
365 rows × 8 columns
df.isnull().sum()
Date 0
Open 0
High 0
Low 0
Close 0
Volume 0
Dividends 0
Stock Splits 0
dtype: int64
Now we perform univariate analysis on the closing price.
df = df[['Date','Close']]
df
Date Close
0 2023-01-01 102.375100
1 2023-01-02 103.268399
2 2023-01-03 104.661726
3 2023-01-04 107.514532
4 2023-01-05 109.147197
... ... ...
360 2023-12-27 274.681922
361 2023-12-28 275.070082
362 2023-12-29 277.099232
363 2023-12-30 277.716507
364 2023-12-31 277.682775
365 rows × 2 columns
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 365 entries, 0 to 364
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 365 non-null object
1 Close 365 non-null float64
dtypes: float64(1), object(1)
memory usage: 5.8+ KB
df["Date"]=pd.to_datetime(df.Date)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 365 entries, 0 to 364
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 365 non-null datetime64[ns]
1 Close 365 non-null float64
dtypes: datetime64[ns](1), float64(1)
memory usage: 5.8 KB
stock_df=df.set_index("Date")
stock_df
Close
Date
2023-01-01 102.375100
2023-01-02 103.268399
2023-01-03 104.661726
2023-01-04 107.514532
2023-01-05 109.147197
... ...
2023-12-27 274.681922
2023-12-28 275.070082
2023-12-29 277.099232
2023-12-30 277.716507
2023-12-31 277.682775
365 rows × 1 columns
Why we convert this column into the index:
1. Retrieval of the data becomes easy
2. Visualization becomes easy
3. Time series libraries such as statsmodels and scipy expect the data to have a datetime index
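As a quick illustration of points 1 and 2, a DatetimeIndex makes label-based retrieval and resampling direct. A minimal sketch on a small synthetic frame (the values here are made up, not the TSLA data):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for stock_df: 10 days of fake closing prices
idx = pd.date_range("2023-01-01", periods=10, freq="D")
demo = pd.DataFrame({"Close": np.arange(100.0, 110.0)}, index=idx)

# With a DatetimeIndex we can slice by date strings (both ends inclusive):
first_week = demo.loc["2023-01-01":"2023-01-07"]
print(len(first_week))  # 7

# Resampling (e.g. weekly mean) also requires a datetime index:
weekly = demo["Close"].resample("W").mean()
print(len(weekly))  # 3 weekly bins
```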
EDA of the Data
1. Summary statistics such as mean, median, mode, etc.
2. Visualize the time series data
3. Stationarity check using the augmented Dickey-Fuller test
4. Check for autocorrelation using the autocorrelation function (ACF)
5. Check for outliers
6. Check the partial autocorrelation function (PACF), which helps choose ARIMA orders
Preprocessing of the data
1. Fill the missing values (not required here)
2. Convert the data into a stationary time series
3. Normalize the data if necessary (not required here)
4. Split the data into train and test sets
5. Clean the data by removing outliers (not required here)
stock_df.describe()
Close
count 365.000000
mean 199.661626
std 51.101389
min 102.375100
25% 147.327615
50% 205.663111
75% 238.942848
max 277.716507
stock_df.head()
Close
Date
2023-01-01 102.375100
2023-01-02 103.268399
2023-01-03 104.661726
2023-01-04 107.514532
2023-01-05 109.147197
plt.plot(stock_df)
plt.show()
plt.hist(stock_df)
plt.show()
sns.histplot(stock_df["Close"], kde=True)  # distplot is deprecated in recent seaborn versions
plt.show()
# plotting close price
plt.style.use('ggplot')
plt.figure(figsize=(18,8))
plt.grid(True)
plt.xlabel('Dates', fontsize = 20)
plt.xticks(fontsize = 15)
plt.ylabel('Close Prices', fontsize = 20)
plt.yticks(fontsize = 15)
plt.plot(stock_df['Close'], linewidth = 3, color = 'blue')
plt.title('Tesla Stock Closing Price', fontsize = 30)
plt.show()
# plotting close price
plt.style.use('ggplot')
plt.figure(figsize=(18,8))
plt.grid(True)
plt.xlabel('Dates', fontsize = 20)
plt.xticks(fontsize = 15)
plt.ylabel('Close Prices', fontsize = 20)
plt.yticks(fontsize = 15)
plt.hist(stock_df['Close'], linewidth = 3, color = 'blue')
plt.title('Tesla Stock Closing Price', fontsize = 30)
plt.show()
# Style and figure size
plt.style.use('ggplot')
plt.figure(figsize=(18, 8))
# Labeling
plt.xlabel('Dates', fontsize=20)
plt.xticks(fontsize=15)
plt.ylabel('Close Prices', fontsize=20)
plt.yticks(fontsize=15)
# Plotting the distribution (Kernel Density Estimate plot)
sns.kdeplot(stock_df['Close'], color='blue', linewidth=3)
# Title
plt.title('Tesla Stock Closing Price Distribution', fontsize=30)
plt.grid(True)
plt.show()
stock_df["Close"]
Close
Date
2023-01-01 102.375100
2023-01-02 103.268399
2023-01-03 104.661726
2023-01-04 107.514532
2023-01-05 109.147197
... ...
2023-12-27 274.681922
2023-12-28 275.070082
2023-12-29 277.099232
2023-12-30 277.716507
2023-12-31 277.682775
365 rows × 1 columns
dtype: float64
# Rolling mean with a window size of 120
rolemean=stock_df["Close"].rolling(120).mean()
rolemean
Close
Date
2023-01-01 NaN
2023-01-02 NaN
2023-01-03 NaN
2023-01-04 NaN
2023-01-05 NaN
... ...
2023-12-27 254.024091
2023-12-28 254.389435
2023-12-29 254.769750
2023-12-30 255.152673
2023-12-31 255.535278
365 rows × 1 columns
dtype: float64
# Rolling standard deviation with a window size of 120
rolestd=stock_df["Close"].rolling(120).std()
rolestd
Close
Date
2023-01-01 NaN
2023-01-02 NaN
2023-01-03 NaN
2023-01-04 NaN
2023-01-05 NaN
... ...
2023-12-27 14.628715
2023-12-28 14.602063
2023-12-29 14.594201
2023-12-30 14.588376
2023-12-31 14.572035
365 rows × 1 columns
dtype: float64
plt.plot(stock_df.Close)
plt.plot(rolemean)
plt.plot(rolestd)
plt.show()
from statsmodels.tsa.stattools import adfuller
adft=adfuller(stock_df['Close'])
pd.Series(adft[0:4],index=["test stats","p value","lag","data points"])
test stats -1.893196
p value 0.335269
lag 0.000000
data points 364.000000
dtype: float64
# Null hypothesis: the data is not stationary
# Alternate hypothesis: the data is stationary
# If p < 0.05: reject the null hypothesis (stationary)
# If p > 0.05: fail to reject the null hypothesis (not stationary)
# Here p = 0.335269 > 0.05, so we fail to reject the null hypothesis: the series is not stationary
def test_stationarity(timeseries):
# Determining rolling statistics
rolmean = timeseries.rolling(48).mean() # rolling mean
rolstd = timeseries.rolling(48).std() # rolling standard deviation
# Plotting rolling statistics
plt.figure(figsize=(18, 8))
plt.grid('both')
plt.plot(timeseries, color='blue', label='Original', linewidth=3)
plt.plot(rolmean, color='red', label='Rolling Mean', linewidth=3)
plt.plot(rolstd, color='black', label='Rolling Std', linewidth=4)
plt.legend(loc='best', fontsize=20, shadow=True, facecolor='lightgray',edgecolor='k')
plt.title('Rolling Mean and Standard Deviation', fontsize=25)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
plt.show(block=False)
# Perform Dickey-Fuller test
print("Results of Dickey-Fuller Test:")
adft = adfuller(timeseries, autolag='AIC')
# Displaying the output of the Dickey-Fuller test
output = pd.Series(adft[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
for key, value in adft[4].items():
output[f'Critical Value ({key})'] = value
print(output)
test_stationarity(stock_df.Close)
Results of Dickey-Fuller Test:
Test Statistic -1.893196
p-value 0.335269
#Lags Used 0.000000
Number of Observations Used 364.000000
Critical Value (1%) -3.448443
Critical Value (5%) -2.869513
Critical Value (10%) -2.571018
dtype: float64
# Check for outliers
sns.boxplot(stock_df.Close)
plt.show()
#Time series decomposition
from statsmodels.tsa.seasonal import seasonal_decompose
result = seasonal_decompose(stock_df[["Close"]],period=12)
result.plot()
plt.show()
result.seasonal
seasonal
Date
2023-01-01 -0.049962
2023-01-02 0.098094
2023-01-03 -0.012132
2023-01-04 0.071651
2023-01-05 0.282969
... ...
2023-12-27 -0.049962
2023-12-28 0.098094
2023-12-29 -0.012132
2023-12-30 0.071651
2023-12-31 0.282969
365 rows × 1 columns
dtype: float64
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
plot_acf(stock_df.Close)  # correlation of the series with itself at different lags
plot_pacf(stock_df.Close)
plt.show()
df_close=stock_df["Close"]
df_close
Close
Date
2023-01-01 102.375100
2023-01-02 103.268399
2023-01-03 104.661726
2023-01-04 107.514532
2023-01-05 109.147197
... ...
2023-12-27 274.681922
2023-12-28 275.070082
2023-12-29 277.099232
2023-12-30 277.716507
2023-12-31 277.682775
365 rows × 1 columns
dtype: float64
df_close=df_close.diff()
df_close=df_close.dropna()
test_stationarity(df_close)
Results of Dickey-Fuller Test:
Test Statistic -5.281090
p-value 0.000006
#Lags Used 7.000000
Number of Observations Used 356.000000
Critical Value (1%) -3.448853
Critical Value (5%) -2.869693
Critical Value (10%) -2.571114
dtype: float64
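Since the model will be trained on first differences, its forecasts are differences too; to report them as price levels we have to undo the differencing. A minimal sketch with made-up prices:

```python
import pandas as pd

# Hypothetical price series (not the TSLA data)
prices = pd.Series([100.0, 102.5, 101.0, 104.0, 103.5])

diffed = prices.diff().dropna()   # what we actually model

# Invert the differencing: last known level + cumulative sum of differences
recovered = prices.iloc[0] + diffed.cumsum()
print(recovered.tolist())  # [102.5, 101.0, 104.0, 103.5]
```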
Perform a train/test split of the time series
df_close[0:-60]#training data
Close
Date
2023-01-02 0.893299
2023-01-03 1.393326
2023-01-04 2.852806
2023-01-05 1.632665
2023-01-06 0.547885
... ...
2023-10-28 -0.800922
2023-10-29 2.113884
2023-10-30 0.623045
2023-10-31 -0.615555
2023-11-01 1.324551
304 rows × 1 columns
dtype: float64
df_close[-60:]#testing data
Close
Date
2023-11-02 -0.166372
2023-11-03 -1.044071
2023-11-04 -0.820504
2023-11-05 1.426642
2023-11-06 0.892228
2023-11-07 -0.178966
2023-11-08 1.583613
2023-11-09 -0.351853
2023-11-10 -0.247485
2023-11-11 -0.109094
2023-11-12 0.598929
2023-11-13 0.740941
2023-11-14 0.324071
2023-11-15 0.351657
2023-11-16 0.286591
2023-11-17 -0.604990
2023-11-18 0.541170
2023-11-19 0.030313
2023-11-20 -0.329250
2023-11-21 -0.608250
2023-11-22 0.342926
2023-11-23 0.701763
2023-11-24 1.912550
2023-11-25 0.010383
2023-11-26 1.710726
2023-11-27 1.372687
2023-11-28 -0.686699
2023-11-29 0.954674
2023-11-30 -0.466640
2023-12-01 -2.042970
2023-12-02 0.638001
2023-12-03 -1.164504
2023-12-04 1.029772
2023-12-05 0.085211
2023-12-06 1.994400
2023-12-07 2.125879
2023-12-08 -0.086252
2023-12-09 -0.577820
2023-12-10 -0.268866
2023-12-11 -0.369897
2023-12-12 -0.074961
2023-12-13 0.421395
2023-12-14 0.703201
2023-12-15 1.114858
2023-12-16 0.921257
2023-12-17 -0.380721
2023-12-18 -1.068875
2023-12-19 1.768739
2023-12-20 0.322271
2023-12-21 -0.512357
2023-12-22 0.072897
2023-12-23 -1.323132
2023-12-24 0.106720
2023-12-25 -0.330315
2023-12-26 1.021908
2023-12-27 1.472730
2023-12-28 0.388160
2023-12-29 2.029151
2023-12-30 0.617275
2023-12-31 -0.033732
dtype: float64
# split the data into train and test sets
train_data=df_close[0:-60]
test_data=df_close[-60:]
plt.figure(figsize=(18,8))
plt.grid(True)
plt.xlabel('Dates', fontsize = 20)
plt.ylabel('Closing Prices', fontsize = 20)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
plt.plot(train_data, 'green', label='Train data', linewidth = 5)
plt.plot(test_data, 'blue', label='Test data', linewidth = 5)
plt.legend(fontsize = 20, shadow=True, facecolor='lightpink', edgecolor = 'k')
plt.show()
Model Building in Time Series
# this time we use the ARIMA model
stock_df["Close"]
Close
Date
2023-01-01 102.375100
2023-01-02 103.268399
2023-01-03 104.661726
2023-01-04 107.514532
2023-01-05 109.147197
... ...
2023-12-27 274.681922
2023-12-28 275.070082
2023-12-29 277.099232
2023-12-30 277.716507
2023-12-31 277.682775
365 rows × 1 columns
dtype: float64
365-60  # note: df_close lost one row to .diff(), so train_data actually has 304 rows
305
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
history = [x for x in train_data]
history
[0.8932991355088831,
 1.3933263239742928,
 2.852806278873757,
 ...
 0.6230446831372092,
 -0.6155548195604865,
 1.3245506334627066]
# train an ARIMA model, passing the training data as history
model=ARIMA(history,order=(1,1,1))
model=model.fit()
model.summary()
SARIMAX Results
Dep. Variable: y No. Observations: 304
Model: ARIMA(1, 1, 1) Log Likelihood -444.626
Date: Sat, 02 Nov 2024 AIC 895.251
Time: 00:05:55 BIC 906.392
Sample: 0 HQIC 899.708
- 304
Covariance Type: opg
coef std err z P>|z| [0.025 0.975]
ar.L1 -0.1192 0.054 -2.195 0.028 -0.226 -0.013
ma.L1 -0.9370 0.023 -41.528 0.000 -0.981 -0.893
sigma2 1.0933 0.091 11.989 0.000 0.915 1.272
Ljung-Box (L1) (Q): 0.01 Jarque-Bera (JB): 0.09
Prob(Q): 0.91 Prob(JB): 0.95
Heteroskedasticity (H): 1.07 Skew: 0.00
Prob(H) (two-sided): 0.75 Kurtosis: 2.91
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
len(history)
304
model.forecast()
array([0.56016095])
mean_squared_error([test_data[0]],model.forecast())
0.527849948848813
np.sqrt(mean_squared_error([test_data[0]],model.forecast()))
0.7265328270964864
def train_arima_model(x, y, arima_order):
# prepare training dataset
# make predictions list
history = list(x)
predictions = list()
for t in range(len(y)):
model = ARIMA(history, order=arima_order)
model_fit = model.fit()
yhat = model_fit.forecast()[0]
predictions.append(yhat)
history.append(y[t])
# calculate out of sample error
rmse = np.sqrt(mean_squared_error(y, predictions))
return rmse
# evaluate different combinations of p, d and q values for an ARIMA model to get the best order for ARIMA Model
def evaluate_models(dataset, test, p_values, d_values, q_values):
dataset=dataset.astype('float32')
best_score, best_cfg = float("inf"), None
for p in p_values:
for d in d_values:
for q in q_values:
order = (p,d,q)
try:
rmse=train_arima_model(dataset, test, order)
if rmse<best_score:
best_score, best_cfg = rmse, order
print('ARIMA%s RMSE=%.3f' % (order,rmse))
except Exception:
continue
print('Best ARIMA%s RMSE=%.3f'%(best_cfg,best_score))
p_values=range(0,3)
d_values=range(0,3)
q_values=range(0,3)
evaluate_models(train_data,test_data,p_values,d_values,q_values)
ARIMA(0, 0, 0) RMSE=0.932
ARIMA(0, 0, 1) RMSE=0.940
ARIMA(0, 0, 2) RMSE=0.940
ARIMA(0, 1, 0) RMSE=1.237
ARIMA(0, 1, 1) RMSE=0.933
ARIMA(0, 1, 2) RMSE=0.958
ARIMA(0, 2, 0) RMSE=2.140
ARIMA(0, 2, 1) RMSE=1.239
ARIMA(0, 2, 2) RMSE=0.938
ARIMA(1, 0, 0) RMSE=0.941
ARIMA(1, 0, 1) RMSE=0.941
ARIMA(1, 0, 2) RMSE=0.953
ARIMA(1, 1, 0) RMSE=1.097
ARIMA(1, 1, 1) RMSE=0.955
ARIMA(1, 1, 2) RMSE=0.968
ARIMA(1, 2, 0) RMSE=1.604
ARIMA(1, 2, 1) RMSE=1.098
ARIMA(1, 2, 2) RMSE=0.959
ARIMA(2, 0, 0) RMSE=0.940
ARIMA(2, 0, 1) RMSE=0.953
ARIMA(2, 0, 2) RMSE=0.913
ARIMA(2, 1, 0) RMSE=1.045
ARIMA(2, 1, 1) RMSE=0.960
ARIMA(2, 1, 2) RMSE=0.957
ARIMA(2, 2, 0) RMSE=1.303
ARIMA(2, 2, 1) RMSE=1.047
ARIMA(2, 2, 2) RMSE=0.965
Best ARIMA(2, 0, 2) RMSE=0.913
history=[x for x in train_data]
predictions=list()
for i in range(len(test_data)):
model=ARIMA(history,order=(2,0,0))  # note: the grid search above found (2, 0, 2) best
model=model.fit()
fc=model.forecast()  # one-step-ahead forecast
predictions.append(fc)
history.append(test_data[i])
print(f"my RMSE {np.sqrt(mean_squared_error(test_data,predictions))}")
my RMSE 0.9404476463697088
plt.figure(figsize=(18,8))
plt.grid(True)
plt.plot(range(len(test_data)), test_data,label='True Test Close Value',linewidth = 5)
plt.plot(range(len(predictions)), predictions, label = 'Predictions on test data', linewidth = 5)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
plt.legend(fontsize = 20, shadow=True, facecolor='lightpink', edgecolor = 'k')
plt.show()
fc_series=pd.Series(predictions,index=test_data.index)
#plot
plt.figure(figsize=(12,5), dpi=100)
plt.plot(train_data, label='Training', color = 'blue')
plt.plot(test_data, label='Test', color = 'green', linewidth = 3)
plt.plot(fc_series, label='Forecast', color = 'red')
plt.title('Forecast vs Actuals on test data')
plt.legend(loc='upper left', fontsize=8)
plt.show()
# forecast the next 60 days with plot_predict and a 95% confidence interval
from statsmodels.graphics.tsaplots import plot_predict
fig=plt.figure(figsize=(18,8))
ax1=fig.add_subplot(111)  # these are the forecasted next 60 days of data
plot_predict(result=model,start=1,end=len(df_close)+60,ax=ax1)
plt.grid("both")
plt.legend(['Forecast', 'Close', '95% confidence interval'], fontsize = 20, shadow=True, facecolor='lightblue', edgecolor = 'k')
plt.show()
history= [x for x in train_data]
predictions = list()
conf_list = list()
for t in range(len(test_data)):
model=sm.tsa.statespace.SARIMAX(history, order = (0,1,0), seasonal_order = (1,1,1,3))
model_fit = model.fit()
fc=model_fit.forecast()
predictions.append(fc)
history.append(test_data[t])
print('RMSE OF SARIMA Model:', np.sqrt(mean_squared_error(test_data, predictions)))  # for comparison, the ARIMA RMSE above was 0.9404
RMSE OF SARIMA Model: 1.2743650895214962
plt.figure(figsize=(18,8))
plt.grid(True)
plt.plot(range(len(test_data)), test_data,label='True Test Close Value',linewidth = 5)
plt.plot(range(len(predictions)), predictions, label = 'Predictions on test data', linewidth = 5)
plt.xticks(fontsize = 15)
plt.yticks(fontsize = 15)
plt.legend(fontsize = 20, shadow=True, facecolor='lightpink', edgecolor = 'k')
plt.show()