Jay Kapoor
STAT 4534
Dr. Marco A. R. Ferreira
17th Nov, 2022
Data Analysis Activity 3
Task 1
We can observe that the time series is not stationary as it changes along with time.
We can use log-log transformation for this data set.
There is a upward trend in the time series and it does have trends which need to de-trended
using log-log transformation.
Task 2
Plotting the sample ACF and sample PACF of the differenced time series
The possible preliminary orders p and q of an AR model is 1 as PACF cuts of at lag 1 and it seems
the time series is an AR(1) and MA(1)
Task 3
The BIC gives the best model to be (1,1,0) as suspected by us in previous task. This makes us
confident to use ARIMA(1,1,0).
AIC favors the model ARIMA(1,1,1) and also is very close to favoring model ARIMA(1,1,0).
BIC favors the model ARIMA(1,1,0).
The results for AIC and BIC do not accurately coincide but it does give information about
choosing a model.
Based on these results, I choose ARIMA(1,1,0) to have the bets fit to our model.
Task 4 - performing model diagnostics
The ACF of the residuals do not have much auto correlation, considering that the threshold is
only crossed one time in the ACF plot.
The assumption of normality of the residuals reasonable as most values follow the predicted
line, however, some values do not fit the line, causing a S-shape curve which can be reasoned
out as nature of the problem (industrial production)
We observe that the p-values on the Ljung-Box are above the line or just on the line, we don’t
see any point going way below the line and we can say that the residuals are barely
uncorrelated.
Task 5 - Forecasting the model
Plotting the forecast for industrial production time series 12 quarters ahead
Prediction for the next quarters
The 95% prediction interval for the fourth quarter of 2022.11.17
Appendix
#Jay Kapoor
#Stat 4534 Time Series Analysis
#DAA 3
library(readr)
library(astsa)
library(polynom)
#Task 1
le <- read_csv("D:/STAT 4534 time series/Analysis Activity 3/[Link]")
[Link] <- ts(le[,2],start=1980,frequency=4)
tsplot([Link])
[Link] <- log([Link])
tsplot([Link])
#Removing the trend
[Link] <- diff([Link])
tsplot([Link])
#Task2
acf2([Link],[Link]=40)
# Task 3
n = length([Link])
max.p = 5
max.d = 1
max.q = 5
max.P = 0
max.D = 0
max.Q = 0
[Link] =array(NA,dim=c(max.p+1,max.d+1,max.q+1,max.P+1,max.D+1,max.Q+1))
[Link] =array(NA,dim=c(max.p+1,max.d+1,max.q+1,max.P+1,max.D+1,max.Q+1))
[Link] <- 1e8
[Link] = [Link]
for (p in 0:max.p) for(d in 0:max.d) for(q in 0:max.q)
for (P in 0:max.P) for(D in 0:max.D) for(Q in 0:max.Q)
{
# This is a modification of a function originally from the book:
# Cowpertwait, P.S.P., Metcalfe, A.V. (2009), Introductory Time
# Series with R, Springer.
# Modified by M.A.R. Ferreira (2016, 2020).
cat("p=",p,", d=",d,", q=",q,", P=",P,", D=",D,", Q=",Q,"\n")
fit <- tryCatch(
{ arima([Link], order = c(p,d,q),
seas = list(order = c(P,D,Q),
frequency([Link])),method="CSS-ML")
},
error = function(cond){
message("Original error message:")
message(cond)
# Choose a return value in case of error
return(NA)
}
)
if(){
[Link] <- length(fit$coef) + 1
[Link][p+1,d+1,q+1,P+1,D+1,Q+1] = -2*fit$loglik + log(n)*[Link]
[Link][p+1,d+1,q+1,P+1,D+1,Q+1] = -2*fit$loglik + 2*[Link]
if ([Link][p+1,d+1,q+1,P+1,D+1,Q+1] < [Link])
{
[Link] <- [Link][p+1,d+1,q+1,P+1,D+1,Q+1]
[Link] <- fit
[Link] <- c(p,d,q,P,D,Q)
}
}
}
[Link]
[Link]
[Link]
#Task4
arima211 <- arima([Link], order = c(2,1,1), method="CSS-ML")
BIC.arima211 = -2*arima211$loglik + log(n)*(length(arima211$coef) + 1)
BIC.arima211
arima111 <- arima([Link], order = c(1,1,1), method="CSS-ML")
BIC.arima111 = -2*arima111$loglik + log(n)*(length(arima111$coef) + 1)
BIC.arima111
arima110 <- arima([Link], order = c(1,1,0), method="CSS-ML")
BIC.arima110 = -2*arima110$loglik + log(n)*(length(arima110$coef) + 1)
BIC.arima110
AIC(arima211)
AIC(arima111)
AIC(arima110)
# So, the best model according to BIC is the ARIMA(1,1,0)
fit.110 <- sarima([Link],1,1,0)
#Task 5
par(mfrow=c(1,1))
[Link] <- [Link]([Link],12,1,1,0)
[Link]$pred
[Link]$se
exp([Link]$pred-1.96* [Link]$se)
exp([Link]$pred+1.96* [Link]$se)