R packages re-doing of our nested model time series example.
# http://fable.tidyverts.org
library(fable)
library(tsibble)
library(tsibbledata)
library(lubridate)
library(dplyr)
library(ggplot2)
library(forecast)
library(jsonlite)We want to fit a time series with both durable and transient external regressors.
First let’s show the generating parameters we are attempting to recover.
model_specification <- fromJSON("generating_params.json")
model_specification## $beta_auto_intercept
## [1] 2.127637
##
## $effect_shift
## [1] -10.4
##
## $beta_auto
## [1] 1.972723 -1.000000
##
## $beta_durable
## [1] 9.2
##
## $beta_transient
## [1] 19.1
##
## $generating_lags
## [1] 1 2
##
## $error_scale
## [1] 3.2
##
## $n_step
## [1] 1000
What we hope is to find b_x_dur ~ b_z and b_x_imp ~ b_x.
The generating and modeling lags specify translate into ARIMA terms as
p = 2, i = 0. We take our own advice from A Time Series
Apologia
and pick q = p. So in ARIMAX terms we try fitting a pdq(2, 0, 2)
system.
We don’t have any way to specify the nature of external regressors, with
act as transient effects in the “regression with ARIMA residuals”
formulation favored by the fable and forecast packages. This will lead
to a degradation in fit quality and an inability to properly estimate
b_z (as we can’t specify the for of the effect we believe it has in
the data).
General Fable formulation:
(1 - φ1 B - … - φp Bp)(1 -
B)d yt = c + (1 - θ1 B - … -
θq Bq) εt
where B is the shift operator such that B
ut = ut-1, μ is the mean of
(1 - B)d yt and c = μ (1 -
φ1 - … - φp). We solve for
εt being small. We very much prefer calling the
εt “the external shocks”, and not calling them
“errors.”
In our pdq(2, 0, 2) case this specializes to:
(1 - φ1 B - φ2 B2)
yt = c + (1 - θ1 B - θ2 B2)
εt
And knowing this is “regression with ARIMA residuals”, when we add the external regressors this should be:
(1 - φ1 B - φ2 B2)
(yt - β0 - βx xt -
βz zt) = c + (1 - θ1 B - θ2
B2) εt
Removing the B shift notation and assuming
mean(yt - β0 - βx xt -
βz zt) is zero gives us.
(yt - β0 - βx xt - βz zt)
- φ1 (yt-1 - β0 - βx xt-1 - βz zt-1)
- φ2 (yt-2 - β0 - βx xt-2 - βz zt-2)
= εt - θ1 εt-1 - θ2 εt-2
This shows us the grace of the B shift notation. This
equation block is repeated for all t (with out of range
indices denoting zero values). For a T time step system we
get a system of T equalities in T + 7
equations (minimizing norm ε). This also shows us that
while the system is linear in the data, it is non-linear in the fit
coefficients.
All of the above is a long way of saying: “given the pdq(2, 0, 2)
specification and y, x, z, the modeling system then fits
for φ, β, θ, ε (with ε small norm).” I.e., we
don’t have to worry over the above transformations- that is the job of
the solver.
Note: I agree with the Prophet authors that the user has to be involved
in specifying pdq(2, 0, 2). Many auto-ARIMA systems seem to silently
fail in presence of external regressors. Here is an example of “hey
let’s use auto-arima” failing for tides and even its own package help
example:
https://github.com/WinVector/Examples/blob/main/Tides/TideR_ARIMA.md.
All mean mean to say is: we can give ourselves permission to use
non-ARIMA methods (more on this in “A Time Series
Apologia”)
in addition to trying ARIMA methods. The additional ARIMA driven
concerns and checks (such as looking for unit roots, worrying about
co-linear regressors) are more to do with the solution formulation
(recursive equations) than the nature of forecast problems.
Normally, we don’t have to care so much about model structure- as we are
protected from that by calling predict() or forecast(). However, in
this case we are very concerned if model structure will or will not
allow us to express what we think the external regressors actually do
(i.e. model structure).
This is what we meant about the chosen package specifying the modeling recurrence equations (i.e. taking that choice out of our hands). We can specify modeling the total, but not unobserved sub-populations of the system.
Frankly ARIMAX/SARIMAX can be a false path for business modelers. Time series research didn’t actually stop at or actually consolidate on this terminology. Instead, transfer function methods and other more further developed systems are studied. Roughly: the scientific community is well served by ARIMAX. The research community moved on from ARIMAX. And, the business community wishes ARIMAX was in fact the dominant method, as it is the dominant software offered.
An odd point in this direction is: ARIMA prediction of tides. In fact tides are formed by external regressors: the gravitational attraction of the sun and moon. However just giving the ARIMA the exact periodicities of these regressors is enough for it to model the entire system without those inputs.
# https://otexts.com/fpp3/regarima.html
d_train <- read.csv('d_train.csv', stringsAsFactors = FALSE)
d_test <- read.csv('d_test.csv', stringsAsFactors = FALSE)fable_model <- (
d_train %>%
tsibble(index=time_tick) %>%
model(
ARIMA(y ~
1 # turn off must specify constant warning
+ x_durable_0 # external regressor (can also use xreg(x_durable_0))
+ x_transient_0 # external regressor (can also use xreg(x_transient_0))
+ pdq(2, 0, 2) # AR=MA=2, I=0
+ PDQ(0, 0, 0) # turn off seasonality (help(ARIMA))
)
)
)
coef(fable_model)## # A tibble: 7 × 6
## .model term estimate std.error statistic p.value
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 "ARIMA(y ~ 1 + x_durable_0 + x_t… ar1 1.93 0.00985 196. 0
## 2 "ARIMA(y ~ 1 + x_durable_0 + x_t… ar2 -0.965 0.00972 -99.2 0
## 3 "ARIMA(y ~ 1 + x_durable_0 + x_t… ma1 -0.928 0.0323 -28.7 3.09e-132
## 4 "ARIMA(y ~ 1 + x_durable_0 + x_t… ma2 0.271 0.0310 8.72 1.15e- 17
## 5 "ARIMA(y ~ 1 + x_durable_0 + x_t… x_du… -0.487 0.842 -0.579 5.63e- 1
## 6 "ARIMA(y ~ 1 + x_durable_0 + x_t… x_tr… 17.0 0.346 49.2 3.70e-267
## 7 "ARIMA(y ~ 1 + x_durable_0 + x_t… inte… 78.3 2.72 28.8 7.12e-133
Notice we recovered good estimates of the autoregressive terms b_auto
(ar1, ar2), transient external effect coefficient b_x
(x_transient_0). We did not get a good estimate of the durable
external effect coefficient x_durable_0 (b_z), so we did not infer
how changes in this variable affect results.
We would be able to forecast, as the auto-regressive terms dominate. We
would not be able to plan, as we don’t have a good estimate of
x_durable_0 (b_z).
model_specification## $beta_auto_intercept
## [1] 2.127637
##
## $effect_shift
## [1] -10.4
##
## $beta_auto
## [1] 1.972723 -1.000000
##
## $beta_durable
## [1] 9.2
##
## $beta_transient
## [1] 19.1
##
## $generating_lags
## [1] 1 2
##
## $error_scale
## [1] 3.2
##
## $n_step
## [1] 1000
preds <- (
fable_model %>%
forecast(new_data=tsibble(d_test, index=time_tick))
)
d_test['fable ARIMAX prediction'] = preds['.mean']
# Rsquared
sse <- sum((d_test[['y']] - d_test[['fable ARIMAX prediction']])**2)
sey <- sum((d_test[['y']] - mean(d_test[['y']]))**2)
rsq <- 1 - sse/sey
# RMSE
rmse <- sqrt(mean((d_test[['y']] - d_test[['fable ARIMAX prediction']])**2))
(
ggplot(
data=d_test,
mapping=aes(x=time_tick)
)
+ geom_step(mapping=aes(y=`fable ARIMAX prediction`), direction='mid', color='blue')
+ geom_point(mapping=aes(y=y, shape=ext_regressors, color=ext_regressors), size=2)
+ ggtitle(paste0("fable package on held out data, rmse: ",
sprintf('%.2f', rmse), ', rsq: ', sprintf('%.2f', rsq)))
) Forecast formulation:
(1 - φ1 B - … - φp Bp)
(y′t - μ) = c + (1 - θ1 B - … -
θq Bq) εt
where y′t = (1 - B)d
yt, μ = mean(y′t).
# https://otexts.com/fpp3/regarima.html
d_train <- read.csv('d_train.csv', stringsAsFactors = FALSE)
d_test <- read.csv('d_test.csv', stringsAsFactors = FALSE)
forecast_model <- Arima(
ts(d_train[['y']], start=d_train[['time_tick']][1]),
order=c(2, 0, 2),
xreg=ts(d_train[, c('x_durable_0', 'x_transient_0')], start=d_train[['time_tick']][1])
)
forecast_model## Series: ts(d_train[["y"]], start = d_train[["time_tick"]][1])
## Regression with ARIMA(2,0,2) errors
##
## Coefficients:
## ar1 ar2 ma1 ma2 intercept x_durable_0 x_transient_0
## 1.9335 -0.9648 -0.9282 0.2705 78.3360 -0.4873 17.0466
## s.e. 0.0099 0.0097 0.0323 0.0310 2.7168 0.8420 0.3463
##
## sigma^2 = 60.54: log likelihood = -3400.48
## AIC=6816.97 AICc=6817.11 BIC=6856.07
Notice the recovered durable effect coefficient is way too low.
preds <- forecast(
forecast_model,
xreg=ts(d_test[, c('x_durable_0', 'x_transient_0')],
start=d_test[['time_tick']][1]))
d_test['forecast ARIMAX'] <- as.numeric(preds$mean)
# Rsquared
sse <- sum((d_test[['y']] - d_test[['forecast ARIMAX']])**2)
sey <- sum((d_test[['y']] - mean(d_test[['y']]))**2)
rsq <- 1 - sse/sey
# RMSE
rmse <- sqrt(mean((d_test[['y']] - d_test[['forecast ARIMAX']])**2))
(
ggplot(
data=d_test,
mapping=aes(x=time_tick)
)
+ geom_step(mapping=aes(y=`forecast ARIMAX`), direction='mid', color='blue')
+ geom_point(mapping=aes(y=y, shape=ext_regressors, color=ext_regressors), size=2)
+ ggtitle(paste0(
"forecast package on held out data, rmse: ",
sprintf('%.2f', rmse), ', rsq: ', sprintf('%.2f', rsq)))
) 
