ECON1267 Quantitative Analysis
Week 8: ARIMA Models (Part 2)
RMIT Vietnam ECON1267 Quantitative Analysis 25 February, 2025 1 / 37
Topics 6 & 8: Learning Objectives (ARIMA Models)
What is an ARIMA model?
LO1: Stationarity and differencing (Week 6)
LO2: Non-seasonal ARIMA models (Week 6)
LO3: Estimation and order selection
LO4: ARIMA modelling in R
Table of Contents
1 F2f Tutorial
LO3: Estimation and order selection
Workshop Activity 1
Workshop Activity 2
2 Digital Session
LO4: ARIMA modelling in R
Digital Activity
Section 1
F2f Tutorial
Subsection 1
LO3: Estimation and order selection
Partial autocorrelations
It is often useful to examine the ACF and PACF in order to determine
what values of p (AR) and q (MA) are required.
Partial autocorrelations (PACF) measure the relationship between $y_t$ and
$y_{t-k}$ after the effects of the intermediate lags $1, 2, 3, \dots, k-1$ have been removed.
$\alpha_k$ = $k$th partial autocorrelation coefficient
= the estimate of $\phi_k$ in the regression
$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_k y_{t-k} + \varepsilon_t$
Varying the number of terms on the RHS gives $\alpha_k$ for different values of $k$.
$\alpha_1 = \rho_1$
Same critical values of $\pm 1.96/\sqrt{T}$ as for the ACF.
The last significant $\alpha_k$ indicates the order of an AR model.
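As a rough illustration of the regression definition, the sketch below estimates the first five partial autocorrelations by least squares on a simulated AR(2) series and compares them with R's built-in pacf(). The simulated series, the seed, and the helper name pacf_by_regression are assumptions made for this example, and it uses base R rather than the PACF() function from the slides. The two sets of estimates should be close but not identical, because pacf() works from the sample autocorrelations rather than from separate regressions.

```r
# Simulated AR(2) series (illustrative coefficients, not course data)
set.seed(123)
y <- as.numeric(arima.sim(model = list(ar = c(1.3, -0.7)), n = 500))

# Hypothetical helper: alpha_k is the coefficient on y_{t-k} in a
# regression of y_t on an intercept and y_{t-1}, ..., y_{t-k}
pacf_by_regression <- function(y, k) {
  X <- embed(y, k + 1)                       # column 1 is y_t, column j + 1 is y_{t-j}
  fit <- lm(X[, 1] ~ X[, -1, drop = FALSE])
  unname(coef(fit)[k + 1])                   # last coefficient = estimate of phi_k
}

alpha_reg     <- sapply(1:5, function(k) pacf_by_regression(y, k))
alpha_builtin <- as.numeric(pacf(y, lag.max = 5, plot = FALSE)$acf)
rho           <- as.numeric(acf(y, lag.max = 5, plot = FALSE)$acf)  # rho[1] is lag 0

crit <- 1.96 / sqrt(length(y))               # approximate 95% bounds

print(round(cbind(lag = 1:5, regression = alpha_reg, pacf = alpha_builtin), 3))
print(crit)
```

For an AR(2) series, the partial autocorrelations at lags 1 and 2 would be expected to fall outside the bounds, with later lags mostly inside them.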
Egyptian exports
library(fpp3)  # attaches tsibble, feasts, fable and tsibbledata (global_economy)
egypt <- global_economy |>
  filter(Code == "EGY")
egypt |> ACF(Exports) |>
  autoplot()
egypt |> PACF(Exports) |>
  autoplot()
[Figure: ACF and PACF plots of Egyptian exports; x-axis: lag [1Y]]
Egyptian exports
global_economy |> filter(Code == "EGY") |>
gg_tsdisplay(Exports, plot_type='partial')
[Figure: gg_tsdisplay() output for Egyptian exports: time plot of Exports against Year, with ACF and PACF panels below (x-axis: lag [1Y])]
ACF and PACF interpretation
If the data are from an ARIMA(p,d,0) or ARIMA(0,d,q) model, then the
ACF and PACF plots can be helpful in determining the value of p or q.
The data may follow an AR(p) model when:
  the ACF is exponentially decaying or sinusoidal;
  there is a significant spike at lag p in the PACF, but none beyond lag p.
The data may follow an MA(q) model when:
  the PACF is exponentially decaying or sinusoidal;
  there is a significant spike at lag q in the ACF, but none beyond lag q.
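To see these patterns without sampling noise, the following sketch computes theoretical ACF and PACF values with base R's ARMAacf(). The coefficients are illustrative assumptions; real sample plots would only approximate these shapes.

```r
# Theoretical (population) ACF and PACF from stats::ARMAacf
# AR(2) with illustrative coefficients phi_1 = 1.3, phi_2 = -0.7
ar2_acf  <- ARMAacf(ar = c(1.3, -0.7), lag.max = 8)               # lags 0 to 8
ar2_pacf <- ARMAacf(ar = c(1.3, -0.7), lag.max = 8, pacf = TRUE)  # lags 1 to 8

# MA(1) with illustrative coefficient theta_1 = 0.8
ma1_acf  <- ARMAacf(ma = 0.8, lag.max = 8)                        # lags 0 to 8
ma1_pacf <- ARMAacf(ma = 0.8, lag.max = 8, pacf = TRUE)           # lags 1 to 8

# AR(2): ACF is a damped sine wave, PACF is zero after lag 2
print(round(ar2_acf[-1], 3))
print(round(ar2_pacf, 3))

# MA(1): ACF is zero after lag 1, PACF decays with alternating sign
print(round(ma1_acf[-1], 3))
print(round(ma1_pacf, 3))
```

The cut-off appears in the PACF for the AR model and in the ACF for the MA model, which is the asymmetry the rules above rely on.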
AR(1) model ACF and PACF plots
$y_t = 18 - 0.8 y_{t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$
[Figure: simulated AR(1) series (value against idx), with its ACF and PACF below (x-axis: lag [1])]
AR(2) model ACF and PACF plots
$y_t = 8 + 1.3 y_{t-1} - 0.7 y_{t-2} + \varepsilon_t$, $\varepsilon_t \sim N(0, 1)$
[Figure: simulated AR(2) series (value against idx), with its ACF and PACF below (x-axis: lag [1])]
MA(1) model ACF and PACF plots
$y_t = 20 + \varepsilon_t + 0.8 \varepsilon_{t-1}$, $\varepsilon_t \sim N(0, 1)$
[Figure: simulated MA(1) series (value against idx), with its ACF and PACF below (x-axis: lag [1])]
MA(2) model ACF and PACF plots
$y_t = \varepsilon_t - \varepsilon_{t-1} + 0.8 \varepsilon_{t-2}$, $\varepsilon_t \sim N(0, 1)$
[Figure: simulated MA(2) series (value against idx), with its ACF and PACF below (x-axis: lag [1])]
ACF and PACF interpretation
If the data are from an ARIMA(p,d,0) or ARIMA(0,d,q) model,
then the ACF and PACF plots can be helpful in determining the
value of p or q.
When p and q are both larger than 0 (for example ARIMA(2,1,1)),
then the ACF and PACF plots do not help in finding suitable
values of p and q.
Real-world data are messy. Reading the ACF and PACF is not an exact
science: the plots only provide hints for choosing the orders p and
q.
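A small sketch of why mixed models are harder to read: for an ARMA(1,1) process, neither the theoretical ACF nor the theoretical PACF cuts off. The coefficients below are assumptions chosen for illustration, and ARMAacf() is from base R.

```r
# Theoretical ACF and PACF of an ARMA(1,1) with illustrative coefficients
arma_acf  <- ARMAacf(ar = 0.7, ma = 0.6, lag.max = 5)[-1]           # drop lag 0
arma_pacf <- ARMAacf(ar = 0.7, ma = 0.6, lag.max = 5, pacf = TRUE)  # lags 1 to 5

# Both rows decay gradually; there is no lag after which either is exactly zero
print(round(rbind(acf = arma_acf, pacf = arma_pacf), 3))
```

With no clean cut-off in either plot, the orders have to be chosen by comparing fitted models rather than by inspection.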
Maximum Likelihood Estimation
Once we have identified the model order (p, q), we need to estimate the
parameters $c, \phi_1, \phi_2, \dots, \phi_p, \theta_1, \dots, \theta_q$.
MLE and Least Squares
Under normal errors, maximizing the likelihood is equivalent to
minimizing the sum of squared residuals in a pure linear regression
and, conditional on the first p observations, in a pure AR(p) model.
However, for models with an MA component (ARMA, ARIMA), we
cannot simply solve for parameters by OLS. The unobserved lagged
errors complicate estimation and require more advanced methods.
Nonlinear Optimization
Because of the MA part, software typically uses iterative, nonlinear
optimization methods (e.g., Kalman filter–based algorithms or
backcasting) to find the MLE.
These methods handle the unobserved errors and often differ in
initialization or convergence criteria.
Software Differences
Different statistical packages may use slightly different estimation
routines or default assumptions, leading to slightly different
parameter estimates.
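The effect of the estimation routine can be sketched in base R, whose arima() function offers both exact maximum likelihood (computed through a state-space representation and a Kalman filter) and conditional sum of squares. The simulated ARMA(1,1) series and its coefficients are assumptions for this example; fable's ARIMA() is not used here.

```r
# Simulated ARMA(1,1) series with illustrative coefficients
set.seed(2025)
y <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 1000)

# Same model, two estimation methods
fit_ml  <- arima(y, order = c(1, 0, 1), method = "ML")   # exact Gaussian likelihood
fit_css <- arima(y, order = c(1, 0, 1), method = "CSS")  # conditional sum of squares

# The estimates are typically close but not identical
print(round(rbind(ML = coef(fit_ml), CSS = coef(fit_css)), 4))
```

Small differences of this kind are one reason the same model can report slightly different coefficients across packages.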
Criteria to choose ARIMA models
Why Not $R^2$?
$R^2$ was developed for standard regression settings and doesn’t
reliably assess time-series forecast accuracy or penalize added
complexity in ARIMA models.
ARIMA models often involve unobserved components (lagged errors),
so $R^2$ is not a meaningful measure of model quality.
Information Criteria
Akaike’s Information Criterion (AIC): Balances goodness of fit
(through likelihood) with model complexity (number of parameters).
Corrected AIC (AICc): A version of AIC that includes a
small-sample correction. Often preferred for ARIMA when sample
sizes are limited.
Bayesian Information Criterion (BIC): Similar to AIC but applies a
stronger penalty for extra parameters, often favoring more
parsimonious models.
Criteria to choose ARIMA models
Model Fit vs. Complexity
AIC, AICc, and BIC each strike a balance between “How well does
the model fit?” (log-likelihood) and “How many parameters are we
using?” (penalty term).
Lower AIC/AICc/BIC indicates a better trade-off between fit and
complexity.
Which to Choose?
AICc is typically recommended for ARIMA, especially with smaller
datasets.
BIC can be useful if you want a more conservative approach (heavier
penalty on additional parameters).
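A minimal sketch of how the three criteria relate, computed by hand from a fitted model. With log-likelihood L, K estimated parameters (here counting the error variance) and T observations, the usual definitions are AIC = -2 log L + 2K, AICc = AIC + 2K(K + 1)/(T - K - 1) and BIC = -2 log L + K log T. The simulated series is an assumption, and base R's arima() stands in for fable's ARIMA().

```r
# Simulated AR(1) series with an illustrative coefficient
set.seed(7)
y   <- arima.sim(model = list(ar = 0.6), n = 120)
fit <- arima(y, order = c(1, 0, 0), method = "ML")

loglik <- fit$loglik
k <- length(coef(fit)) + 1   # AR coefficient + constant + error variance
n <- fit$nobs                # number of observations used in the fit

aic  <- -2 * loglik + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
bic  <- -2 * loglik + log(n) * k              # heavier penalty once log(n) > 2

print(round(c(AIC = aic, AICc = aicc, BIC = bic), 2))
print(round(c(AIC = AIC(fit), BIC = BIC(fit)), 2))   # built-in values for comparison
```

The AICc correction shrinks as T grows, so AIC and AICc tend to agree on long series and diverge on short ones.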
Subsection 2
Workshop Activity 1
Consider aus_arrivals, the quarterly number of international visitors
to Australia from several countries for the period Q1 1981 to Q3 2012.
a. Describe the time plot for Japan.
b. What can you learn from the ACF graph?
c. What can you learn from the PACF graph?
d. Can a “normal” (non-seasonal) ARIMA model be used here?
Subsection 3
Workshop Activity 2
For the United States GDP series (from the global_economy data set):
a. Plot the data and, if necessary, find a suitable Box-Cox
transformation for the data. Do the data need to be differenced in
order to make them stationary?
b. Find a suitable ARIMA model for the transformed data using the
ACF and PACF.
c. Try some other models by experimenting with different orders for
AR and MA, and select the best model based on the lowest AIC
measure.
d. Observe the residuals of the best model and determine whether they
look like white noise.
Section 2
Digital Session
Subsection 1
LO4: ARIMA modelling in R
Modelling procedure with ARIMA()
1. Plot the data. Identify any unusual observations.
2. If necessary, transform the data (using a Box-Cox transformation) to stabilize the variance.
3. If the data are non-stationary, take first (or seasonal) differences until the data are stationary.
4. Examine the ACF/PACF: is an AR(p) or MA(q) model appropriate?
5. Try your chosen model(s) and use the AICc to search for a better model. The AICc helps choose between competing models (the lower, the better).
6. Check the residuals from your chosen model by plotting their ACF and running a portmanteau test (a formal test of whether the residuals are white noise). If they do not look like white noise, try a modified model.
7. Once the residuals look like white noise, calculate forecasts.
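The procedure can be sketched end to end in base R. This is only an outline under stated assumptions: the series is simulated (an ARIMA(1,1,0) with an illustrative coefficient), no transformation is applied, and arima(), Box.test() and predict() stand in for the fable functions used in the rest of these slides.

```r
# Simulated non-stationary series standing in for real data (assumption)
set.seed(42)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5), n = 200)

# Difference once, then inspect the ACF/PACF of the differenced series
dy <- diff(y)
print(round(as.numeric(acf(dy,  lag.max = 5, plot = FALSE)$acf)[-1], 2))
print(round(as.numeric(pacf(dy, lag.max = 5, plot = FALSE)$acf), 2))

# Fit the candidate suggested by the plots: one AR term, one difference
fit <- arima(y, order = c(1, 1, 0), method = "ML")

# Residual check: Ljung-Box test with lag = 10 and one estimated parameter
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb$p.value)   # a large p-value is consistent with white-noise residuals

# Forecast five steps ahead once the residuals look acceptable
fc <- predict(fit, n.ahead = 5)
print(fc$pred)
```

In practice several candidate orders would be fitted and compared by AICc before the residual check.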
Automatic modelling procedure with ARIMA()
1. Plot the data. Identify any unusual observations.
2. If necessary, transform the data (using a Box-Cox transformation) to stabilize the variance.
3. Use ARIMA() to automatically select a model.
4. Check the residuals from your chosen model by plotting their ACF and running a portmanteau test (a formal test of whether the residuals are white noise). If they do not look like white noise, try a modified model.
5. Once the residuals look like white noise, calculate forecasts.
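To give a feel for what automatic selection involves, the sketch below runs a small grid search over p and q and keeps the model with the lowest AICc. It is a simplified stand-in, not the algorithm that ARIMA() itself uses: d is fixed at 1 by assumption, the grid limits and the helper name aicc are hypothetical, and the data are simulated.

```r
# Hypothetical helper: AICc from a base R arima() fit
aicc <- function(fit) {
  k <- length(coef(fit)) + 1                 # coefficients + error variance
  n <- fit$nobs
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Simulated ARIMA(2,1,0) series with illustrative coefficients
set.seed(99)
y <- arima.sim(model = list(order = c(2, 1, 0), ar = c(0.5, -0.3)), n = 300)

# Fit every (p, 1, q) combination on a small grid; failed fits become NA
grid <- expand.grid(p = 0:3, q = 0:3)
grid$AICc <- mapply(function(p, q) {
  tryCatch(aicc(arima(y, order = c(p, 1, q), method = "ML")),
           error = function(e) NA_real_)
}, grid$p, grid$q)

# Model with the lowest AICc
best <- grid[which.min(grid$AICc), ]
print(best)
```

Even when a search like this picks the order, the residual checks in the steps above still apply to the selected model.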
Central African Republic exports
global_economy |>
filter(Code == "CAF") |> autoplot(Exports) +
labs(title="Central African Republic exports", y="% of GDP")
[Figure: time plot titled "Central African Republic exports"; y-axis: % of GDP, x-axis: Year [1Y]]
Central African Republic exports
global_economy |>
filter(Code == "CAF") |>
gg_tsdisplay(difference(Exports), plot_type='partial')
[Figure: gg_tsdisplay() output: time plot of difference(Exports) against Year, with ACF and PACF panels below (x-axis: lag [1Y])]
Central African Republic exports
caf_fit <- global_economy |>
  filter(Code == "CAF") |>
  model(
    arima210 = ARIMA(Exports ~ pdq(2,1,0)),       # AR(2) model with first differencing
    arima013 = ARIMA(Exports ~ pdq(0,1,3)),       # MA(3) model with first differencing
    stepwise = ARIMA(Exports),                    # automatic selection using the default stepwise search
    search   = ARIMA(Exports, stepwise = FALSE)   # automatic selection using a wider, non-stepwise search
  )
Central African Republic exports
This code reshapes the caf_fit object from a wide format (where each model is in a separate
column) to a long format (where models are listed under one column).
caf_fit |> pivot_longer(!Country,
names_to = "Model name", values_to = "Orders")
## # A mable: 4 x 3
## # Key: Country, Model name [4]
## Country `Model name` Orders
## <fct> <chr> <model>
## 1 Central African Republic arima210 <ARIMA(2,1,0)>
## 2 Central African Republic arima013 <ARIMA(0,1,3)>
## 3 Central African Republic stepwise <ARIMA(2,1,2)>
## 4 Central African Republic search <ARIMA(3,1,0)>
Why Use pivot_longer()?
Makes it easier to compare models in a tidy format.
Useful for visualizing results in ggplot2.
Helps in summarizing and reporting model performance.
Central African Republic exports
This code extracts, sorts, and selects key model evaluation metrics for the ARIMA models
stored in caf_fit. The best model (lowest AICc) appears first.
glance(caf_fit) |> arrange(AICc) |> select(.model:BIC)
## # A tibble: 4 x 6
## .model sigma2 log_lik AIC AICc BIC
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 search 6.52 -133. 274. 275. 282.
## 2 arima210 6.71 -134. 275. 275. 281.
## 3 arima013 6.54 -133. 274. 275. 282.
## 4 stepwise 6.42 -132. 274. 275. 284.
Central African Republic exports
caf_fit |>
select(search) |>
gg_tsresiduals()
[Figure: gg_tsresiduals() output for the search model: innovation residuals against Year, ACF of the residuals (x-axis: lag [1Y]), and histogram of .resid (y-axis: count)]
Portmanteau test
Portmanteau test: checks whether the residuals are white noise (the null hypothesis).
augment(caf_fit) |>                  # extracts the residuals for every model
  features(.innov, ljung_box,
           lag = 10, dof = 3)        # Ljung-Box test; dof = 3 matches the models with p + q = 3
## # A tibble: 4 x 4
## Country .model lb_stat lb_pvalue
## <fct> <chr> <dbl> <dbl>
## 1 Central African Republic arima013 5.64 0.582
## 2 Central African Republic arima210 10.7 0.152
## 3 Central African Republic search 5.75 0.569
## 4 Central African Republic stepwise 4.12 0.766
Portmanteau test
A test for a group of autocorrelations is called a portmanteau test.
If the residuals show no significant autocorrelation, they are consistent with white noise.
features(.innov, ljung_box, lag = ???, dof = ???)
How should lag and dof (the degrees of freedom) be chosen? It is suggested
that:
lag = 10 is commonly used for short to medium-length time series
(50–200 observations).
lag = 2m for seasonal data (where m is the length of the seasonal cycle,
e.g. 4 for quarterly data).
dof is the number of parameters in the estimated model (p + q).
For example: dof = 0 for white noise, dof = 3 for an ARIMA(3,0,0)
or ARIMA(2,0,1).
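A short sketch of what the Ljung-Box test computes, using base R rather than features(). With residual autocorrelations r_k, sample size T and maximum lag l, the statistic is Q* = T(T + 2) Σ r_k² / (T − k), summed over k = 1, ..., l, and it is compared with a chi-squared distribution with l − dof degrees of freedom. The simulated AR(3) series and its coefficients are assumptions for the example.

```r
# Simulated AR(3) series and a fitted ARIMA(3,0,0), so dof = p + q = 3
set.seed(11)
y   <- arima.sim(model = list(ar = c(0.5, 0.2, -0.3)), n = 150)
fit <- arima(y, order = c(3, 0, 0), method = "ML")
e   <- as.numeric(residuals(fit))

lag <- 10
dof <- 3
n   <- length(e)

# Residual autocorrelations at lags 1 to 10 (drop lag 0)
r <- as.numeric(acf(e, lag.max = lag, plot = FALSE)$acf)[-1]

# Ljung-Box statistic and p-value computed by hand
Q <- n * (n + 2) * sum(r^2 / (n - seq_len(lag)))
p_value <- pchisq(Q, df = lag - dof, lower.tail = FALSE)

# Built-in equivalent; fitdf plays the role of dof
bt <- Box.test(e, lag = lag, type = "Ljung-Box", fitdf = dof)

print(c(by_hand = Q, built_in = unname(bt$statistic)))
print(c(by_hand = p_value, built_in = bt$p.value))
```

Setting dof too low inflates the degrees of freedom and makes the test less likely to reject, which is why it should match the number of estimated AR and MA terms.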
Central African Republic exports
caf_fit |>
forecast(h=5) |> filter(.model=='search') |>
autoplot(global_economy)
[Figure: forecasts of Exports from the search model with 80% and 95% prediction intervals, plotted after the historical data; x-axis: Year]
Subsection 2
Digital Activity
Using Exports for Luxembourg (LUX) (from the global_economy data
set):
Plot the data and, if necessary, find a suitable Box-Cox
transformation for the data. Do the data need to be differenced in
order to make them stationary?
Fit a suitable ARIMA model to the transformed data using the
ARIMA() function. Does the suggested model indicate that the
data need to be differenced?
Produce forecasts of your fitted model. Do the forecasts look
reasonable?
Produce forecasts for a model without transformation and compare
to the one with transformation. Which model do you think is
better?