CHAPTER 18
Advanced Time Series Topics
TEACHING NOTES
Several of the topics in this chapter, including testing for unit roots and cointegration, are now
staples of applied time series analysis. Instructors who like their course to be more time series
oriented might cover this chapter after Chapter 12, if time permits. Or, the chapter can be used
as a reference for ambitious students who wish to be versed in recent time series developments.
The discussion of infinite distributed lag models, and in particular geometric DL and rational DL
models, gives one particular interpretation of dynamic regression models. But one must
emphasize that only under fairly restrictive assumptions on the serial correlation in the error of
the infinite DL model does the dynamic regression consistently estimate the parameters in the lag
distribution. Computer Exercise C18.1 provides a good illustration of how the GDL model, and
a simple RDL model, can be too restrictive.
Example 18.5 tests for cointegration between the general fertility rate and the value of the
personal exemption. There is not much evidence of cointegration, which sheds further doubt on
the regressions in levels that were used in Chapter 10. The error correction model for holding
yields in Example 18.7 is likely to be of interest to students in finance. As a class project, or a
term project for a student, it would be interesting to update the data to see if the error correction
model is stable over time.
The forecasting section is heavily oriented towards regression methods and, in particular,
autoregressive models. These can be estimated using any econometrics package, and forecasts
and mean absolute errors or root mean squared errors are easy to obtain. The interest rate data
sets (for example, INTQRT) can be updated to do much more recent out-of-sample forecasting
exercises.
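For instructors who want to assign such an exercise, the following Python sketch (not from the text; the simulated series and the 16-period holdout are illustrative assumptions standing in for, say, an updated interest rate from INTQRT) shows one way to produce rolling one-step-ahead AR(1) forecasts and compare RMSE and MAE:

```python
# Rolling one-step-ahead forecasts from an AR(1), with RMSE and MAE.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def one_step_ahead_errors(y: pd.Series, n_holdout: int = 16):
    """Re-estimate y_t = a + b*y_{t-1} on the data available at each forecast
    date and collect the one-step-ahead errors for the last n_holdout periods."""
    errors = []
    for h in range(n_holdout, 0, -1):
        train = y.iloc[: len(y) - h]                   # data known at forecast time
        X = sm.add_constant(train.shift(1).dropna())   # lagged y as the regressor
        res = sm.OLS(train.iloc[1:].to_numpy(), X.to_numpy()).fit()
        a, b = res.params
        forecast = a + b * train.iloc[-1]              # forecast of the next value
        errors.append(y.iloc[len(y) - h] - forecast)
    e = np.asarray(errors)
    return float(np.sqrt(np.mean(e**2))), float(np.mean(np.abs(e)))  # RMSE, MAE

# Illustration with a simulated persistent series (hypothetical data).
rng = np.random.default_rng(0)
y = pd.Series(5.0 + np.cumsum(rng.normal(scale=0.2, size=200)))
rmse, mae = one_step_ahead_errors(y)
print(f"one-step-ahead RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```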
Several new time series data sets include OKUN (a small annual data set that can be used to
illustrate Okun’s Law), MINWAGE (U.S. monthly data, by sector, on wages, employment, and
the federal minimum wage), and FEDFUND (quarterly data on the federal funds rate and the real
GDP gap in the United States).
SOLUTIONS TO PROBLEMS
18.1 With zt1 and zt2 now in the model, we should use one lag each as instrumental variables,
zt-1,1 and zt-1,2. This gives one overidentifying restriction that can be tested.
18.2 (i) When we lag equation (18.68) once, multiply it by (1 – λ), and subtract it from (18.68),
we obtain
when we plug this into the first equation we obtain the desired result.
(iii) Because {vt} follows an MA(1) process, it is correlated with the lagged dependent
variable, yt-1. Therefore, the OLS estimators of the βj will be inconsistent (and biased, of
course). Nevertheless, we can use xt-2 as an IV for yt-1 because xt-2 is uncorrelated with vt
(because ut and ut-1 are both uncorrelated with xt-2) and xt-2 is partially correlated with
yt-1.
18.4 Following the hint, we show that yt-2 – βxt-2 can be written as a linear function of yt-1 –
βxt-1, ∆yt-1, and ∆xt-1. That is,
(yt-1 – βxt-1) – ∆yt-1 + β∆xt-1 = yt-1 – βxt-1 – (yt-1 – yt-2) + β(xt-1 – xt-2) = yt-2 – βxt-2,
or
18.6 (i) This is given by the estimated intercept, 1.54. Remember, this is the percentage growth
at an annualized rate. It is statistically different from zero since t = 1.54/.56 = 2.75.
(ii) 1.54 + .031(10) = 1.85. As an aside, you could obtain the standard error of this estimate
by running the regression.
(iii) Growth in the S&P 500 index has a statistically significant effect on industrial
production growth – in the Granger causality sense – because the t statistic on pcspt-1 is about
2.38. The economic effect is reasonably large.
18.7 If unemt follows a stable AR(1) process, then this is the null model used to test for Granger
causality: under the null that gMt does not Granger cause unemt, we can write
unemt = β0 + β1unemt-1 + ut
E(ut|unemt-1, gMt-1, unemt-2, gMt-2, …) = 0
and |β1| < 1. Now, it is up to us to choose how many lags of gM to add to this equation. The
simplest approach is to add gMt-1 and to do a t test. But we could add a second or third lag (and
probably not beyond this with annual data), and compute an F test for joint significance of all
lags of gMt.
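A minimal sketch of this testing strategy in Python, assuming annual series unem and gM are available as columns of a pandas DataFrame (hypothetical names; no specific data set is implied):

```python
# F test of H0: lagged values of gM are jointly insignificant in an AR(1) for unem.
import pandas as pd
import statsmodels.formula.api as smf

def granger_f_test(df: pd.DataFrame, lags: int = 1):
    d = df.copy()
    d['unem_1'] = d['unem'].shift(1)
    gm_names = []
    for j in range(1, lags + 1):
        name = f'gM_{j}'
        d[name] = d['gM'].shift(j)
        gm_names.append(name)
    d = d.dropna()                      # same estimation sample for both models
    restricted = smf.ols('unem ~ unem_1', data=d).fit()
    unrestricted = smf.ols('unem ~ unem_1 + ' + ' + '.join(gm_names), data=d).fit()
    f_stat, p_value, df_diff = unrestricted.compare_f_test(restricted)
    return f_stat, p_value

# With lags=1, the F test is equivalent to a two-sided t test on gM_1.
```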
18.8 (i) By assumption, E(et|It-1) = 0, and since yt-1, zt-1, and zt-2 are all in It-1, we have
We obtain the desired answer by adding one to the time index everywhere.
(ii) The forecasting equation for yn+1 is obtained by using part (i) with t = n, and then
plugging in the estimates:
(iii) From part (i), it follows that the model with one lag of z and AR(1) serial correlation in
the errors can be obtained from
with α0 = (1 − ρ)α, γ1 = δ1, and γ2 = −ρδ1 = −ργ1. The key is that γ2 is entirely determined (in a
nonlinear way) by ρ and γ1. So the model with a lag of z and AR(1) serial correlation is a special
case of a more general model. (Note that the general model depends on four parameters, while
the model from part (i) depends on only three.)
(iv) For forecasting, the AR(1) serial correlation model may be too restrictive. It may
impose restrictions on the parameters that are not met. On the other hand, if the AR(1) serial
correlation model holds, it captures the conditional mean E(yt|It-1) with one fewer parameter than
the general model; in other words, the AR(1) serial correlation model is more parsimonious.
[See Harvey (1990) for ways to test the restriction γ2 = −ργ1, which is called a common factor
restriction.]
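One way to examine the common factor restriction in practice is a delta-method Wald test based on the unrestricted dynamic regression. The sketch below is only illustrative; the column names y and z and the use of the nonrobust OLS covariance matrix are assumptions, and Harvey (1990) discusses alternative tests:

```python
# Delta-method Wald test of H0: gamma2 + rho*gamma1 = 0 in
#   y_t = alpha0 + rho*y_{t-1} + gamma1*z_{t-1} + gamma2*z_{t-2} + e_t.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

def common_factor_wald(df: pd.DataFrame):
    d = df.assign(y_1=df['y'].shift(1),
                  z_1=df['z'].shift(1),
                  z_2=df['z'].shift(2)).dropna()
    res = smf.ols('y ~ y_1 + z_1 + z_2', data=d).fit()
    b = res.params                       # order: Intercept, y_1, z_1, z_2
    V = res.cov_params().values
    g = b['z_2'] + b['y_1'] * b['z_1']   # the restriction, zero under H0
    grad = np.array([0.0, b['z_1'], b['y_1'], 1.0])   # gradient of g, same order
    wald = float(g**2 / (grad @ V @ grad))
    return wald, float(stats.chi2.sf(wald, df=1))     # statistic, asymptotic p-value
```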
18.9 Let ên+1 be the forecast error for forecasting yn+1, and let ân+1 be the forecast error for
forecasting ∆yn+1. By definition, ên+1 = yn+1 − f̂n = yn+1 – (ĝn + yn) = (yn+1 – yn) − ĝn = ∆yn+1 −
ĝn = ân+1, where the last equality follows by definition of the forecasting error for ∆yn+1.
SOLUTIONS TO COMPUTER EXERCISES

C18.1 (i) The estimated GDL model is

gpricet = .0013 + .081 gwaget + .640 gpricet-1
(.0003) (.031) (.045)
n = 284, R2 = .454.
The estimated impact propensity is .081 while the estimated LRP is .081/(1 – .640) = .225. The
estimated lag distribution is graphed below.
[Graph: estimated lag distribution for the GDL model; coefficient (vertical axis, 0 to .1) plotted against lag (0 through 12).]
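A rough sketch of how the GDL estimates and the implied geometric lag distribution reported above might be computed, assuming monthly growth rates gwage and gprice are available in a DataFrame (variable names follow the exercise; data loading is omitted):

```python
# Estimate gprice_t = a0 + gamma*gwage_t + lambda*gprice_{t-1} + v_t and
# report the implied geometric lag distribution gamma*lambda**h.
import pandas as pd
import statsmodels.formula.api as smf

def estimate_gdl(df: pd.DataFrame, max_lag: int = 12):
    d = df.assign(gprice_1=df['gprice'].shift(1)).dropna()
    res = smf.ols('gprice ~ gwage + gprice_1', data=d).fit()
    gamma = res.params['gwage']        # impact propensity (lag-zero coefficient)
    lam = res.params['gprice_1']       # geometric decay parameter
    lrp = gamma / (1.0 - lam)          # long-run propensity
    lag_coefs = [gamma * lam**h for h in range(max_lag + 1)]
    return gamma, lrp, lag_coefs
```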
(ii) The IP for the FDL model estimated in Problem 11.5 was .119, which is substantially
above the estimated IP for the GDL model. Further, the estimated LRP from the GDL model is
much lower than that for the FDL model, which we estimated as 1.172. Clearly we cannot think
of the GDL model as a good approximation to the FDL model. One reason these are so different
can be seen by comparing the estimated lag distributions (see the graph above for the GDL model). With
the FDL, the largest lag coefficient is at the ninth lag, which is impossible with the GDL model
(where the largest impact is always at lag zero). It could also be that {ut} in equation (18.8) does
not follow an AR(1) process with parameter ρ, which would cause the dynamic regression to
produce inconsistent estimators of the lag coefficients.
(iii) In the estimated RDL model, the coefficient on gwaget-1 is not especially significant, but we include it in obtaining the
estimated LRP. The estimated IP is .090 while the LRP is (.090 + .055)/(1 – .619) ≈ .381. These
are both slightly higher than what we obtained for the GDL, but the LRP is still well below what
we obtained for the FDL in Problem 11.5. While this RDL model is more flexible than the GDL
model, it imposes a maximum lag coefficient (in absolute value) at lag zero or one. For the
estimates given above, the maximum effect is at the first lag. (See the estimated lag distribution
below.) This is not consistent with the FDL estimates in Problem 11.5.
[Graph: estimated lag distribution for the RDL model; coefficient (vertical axis, 0 to .12) plotted against lag (0 through 12).]
C18.2 (i) The estimated equation is

ginvpct = –.786 – .956 log(invpct-1) + .0068 t

where ginvpct = log(invpct) – log(invpct-1). The t statistic for the augmented Dickey-Fuller unit
root test is –.956/.198 ≈ –4.82, which is well below –3.41, the 5% critical value obtained from
Table 18.3. Therefore, we strongly reject a unit root in log(invpct). (Incidentally, remember that
the t statistics on the intercept and time trend in this estimated equation do not have approximate t
distributions, although those on ginvpct-1 and ginvpct-2 do under the usual null hypothesis that the
parameter is zero.)
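A sketch of the augmented Dickey-Fuller test with an intercept and a linear time trend, as used in this exercise; the caller is assumed to pass the log of the series (for example, log(invpc) or log(price) from HSEINV), and the number of lagged changes is held fixed rather than chosen automatically:

```python
# ADF test with intercept and linear time trend ('ct'), fixed lag length.
from statsmodels.tsa.stattools import adfuller

def adf_with_trend(series, lags: int = 2):
    out = adfuller(series, maxlag=lags, regression='ct', autolag=None)
    stat, pvalue, usedlag, nobs, crit = out[:5]
    return stat, crit   # compare stat with the Table 18.3-style critical values in crit
```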
(ii) For log(pricet), the estimated equation is

gpricet = –.040 – .222 log(pricet-1) + .00097 t
(.019) (.092) (.00049)
+ .328 gpricet-1 + .130 gpricet-2
(.155) (.149)
n = 39, R2 = .200,

and now the Dickey-Fuller t statistic is about –2.41, which is above –3.12, the 10% critical value
from Table 18.3. [The estimated root is 1 – .222 = .778, which is much larger than for
log(invpct).] We cannot reject the unit root null at a sufficiently small significance level.
(iii) Given the very strong evidence that log(invpct) does not contain a unit root, while
log(pricet) may very well, it makes no sense to discuss cointegration between the two. If we take
any nontrivial linear combination of an I(0) process (which may have a trend) and an I(1)
process, the result will be an I(1) process (possibly with drift).
C18.3 (i) The estimated AR(3) model for pcipt is

pcipt = 1.80 + .349 pcipt-1 + .071 pcipt-2 + .067 pcipt-3
(0.55) (.043) (.045) (.043)
n = 554, R2 = .166, σ̂ = 12.15.
When pcipt-4 is added, its coefficient is .0043 with a t statistic of about .10.
(ii) When three lags of pcspt are added to the model from part (i), the null hypothesis is that pcsp does not Granger cause pcip. This is stated as H0: γ1 = γ2 = γ3 =
0, where the γj are the coefficients on the lags of pcspt. The F statistic for joint significance of the three lags of pcspt, with 3 and 547 df, is F = 5.37
and p-value = .0012. Therefore, we strongly reject H0 and conclude that pcsp does Granger
cause pcip.
(iii) When we add ∆i3t-1, ∆i3t-2, and ∆i3t-3 to the regression from part (ii), and now test the
joint significance of pcspt-1, pcspt-2, and pcspt-3, the F statistic is 5.08. With 3 and 544 df in the F
distribution, this gives p-value = .0018, and so pcsp Granger causes pcip even conditional on
past ∆i3.
[Instructor’s Note: The F test for joint significance of ∆i3t-1, ∆i3t-2, and ∆i3t-3 yields p-value =
.228, and so ∆i3 does not Granger cause pcip conditional on past pcsp.]
C18.4 We first run the regression of gfrt on pet, t, and t2, and obtain the residuals, ût. We then
apply the augmented Dickey-Fuller test, with one lag of ∆ût, by regressing ∆ût on ût−1 and
∆ût−1. There are 70 observations available for this last regression, and it yields −.165 as the
coefficient on ût−1 with t statistic = −2.76. This is well above –4.15, the 5% critical value
[obtained from Davidson and MacKinnon (1993, Table 20.2)]. Therefore, we cannot reject the
null hypothesis of no cointegration, so we conclude gfrt and pet are not cointegrated even if we
allow them to have different quadratic trends.
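The two-step procedure described above can be sketched as follows, assuming the FERTIL3 variables gfr and pe are available in a DataFrame (data loading is omitted). The code mirrors the description: a first-stage regression on pe, t, and t2, then a DF regression on the residuals with one lagged change. It is only a sketch, and the usual ADF critical values do not apply to the second stage:

```python
# Engle-Granger-type test allowing each series its own quadratic trend.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def eg_test_quadratic_trend(df: pd.DataFrame):
    t = np.arange(len(df), dtype=float)
    X1 = sm.add_constant(np.column_stack([df['pe'].to_numpy(), t, t**2]))
    uhat = pd.Series(sm.OLS(df['gfr'].to_numpy(), X1).fit().resid)  # first-stage residuals
    d = pd.DataFrame({'du': uhat.diff(),
                      'u_1': uhat.shift(1),
                      'du_1': uhat.diff().shift(1)}).dropna()
    # Second stage: regress the change in the residual on its lagged level and one
    # lagged change, with no deterministic terms (stage one already detrended).
    res = sm.OLS(d['du'], d[['u_1', 'du_1']]).fit()
    # Compare res.tvalues['u_1'] with cointegration critical values
    # (e.g., Davidson and MacKinnon 1993), not standard normal or t tables.
    return res.params['u_1'], res.tvalues['u_1']
```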
C18.5 (i) The t statistic for H0: β = 1 is (1.027 – 1)/.016 ≈ 1.69. We do not reject H0: β = 1 at the 5% level
against a two-sided alternative, although we would reject at the 10% level.
[Instructor’s Note: The standard errors on all slope coefficients can be used to construct t
statistics with approximate t distributions, provided there is no serial correlation in {et}.]
(ii) With the additional terms included, the estimated error correction model is

∆hy6t = .070 + 1.259 ∆hy3t-1 − .816 (hy6t-1 – hy3t-2) + …
Neither of the added terms is individually significant. The F test for their joint significance gives
F = 1.35, p-value = .264. Therefore, we would omit these terms and stick with the error
correction model estimated in (18.39).
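A sketch of how an error correction model in the spirit of (18.39), with the cointegrating parameter imposed to be one, might be estimated, assuming quarterly series hy6 and hy3 (as in the INTQRT data set) are available in a DataFrame:

```python
# Error correction model: regress the change in hy6 on the lagged change in hy3
# and the lagged equilibrium error hy6_{t-1} - hy3_{t-2}.
import pandas as pd
import statsmodels.formula.api as smf

def estimate_ecm(df: pd.DataFrame):
    d = pd.DataFrame({
        'dhy6':   df['hy6'].diff(),
        'dhy3_1': df['hy3'].diff().shift(1),
        'ec_1':   (df['hy6'] - df['hy3'].shift(1)).shift(1),  # hy6_{t-1} - hy3_{t-2}
    }).dropna()
    return smf.ols('dhy6 ~ dhy3_1 + ec_1', data=d).fit()
```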
C18.6 (i) Adding the data for 1997 and reestimating the two equations gives

unemt = 1.549 + .734 unemt-1
(0.572) (.096)
n = 49, R2 = .554, σ̂ = 1.041

and

unemt = 1.286 + .648 unemt-1 + .185 inft-1

The parameter estimates do not change by much. This is not very surprising, as we have added
only one year of data.
(ii) The forecast for unem1998 from the first equation is 1.549 + .734(4.9) ≈ 5.15; from the
second equation the forecast is 1.286 + .648(4.9) + .185(2.3) ≈ 4.89. The actual civilian
unemployment rate for 1998 was 4.5. Once again the model that includes lagged inflation
produces a better forecast.
(iii) There is no practical improvement in reestimating the parameters using data through
1997: 4.89 versus 4.90, which differs in a digit that is not even reported in the published
unemployment series: our predicted unemployment rate would be 4.9% in both cases.
(iv) To obtain the two-step-ahead forecast we need the 1996 unemployment rate, which is
5.4. From equation (18.55), the forecast of unem1998 made after we know unem1996 is (1 +
.732)(1.572) + (.732²)(5.4) ≈ 5.62. The one-step-ahead forecast is 1.572 + .732(4.9) ≈ 5.16, and
so it is better to use the one-step-ahead forecast, as it is much closer to 4.5.
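The one- and two-step-ahead calculations in parts (ii) and (iv) follow the general AR(1) forecasting formula; a small sketch, with the point estimates from equation (18.55) plugged in:

```python
def ar1_forecast(alpha: float, rho: float, y_n: float, h: int) -> float:
    """h-step-ahead point forecast from y_t = alpha + rho*y_{t-1} + e_t:
    alpha*(1 + rho + ... + rho**(h-1)) + rho**h * y_n."""
    return alpha * sum(rho**j for j in range(h)) + rho**h * y_n

# One step ahead from unem_1997 = 4.9, and two steps ahead from unem_1996 = 5.4:
print(round(ar1_forecast(1.572, 0.732, 4.9, 1), 2))   # about 5.16
print(round(ar1_forecast(1.572, 0.732, 5.4, 2), 2))   # about 5.62
```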
C18.7 (i) The estimated linear trend equation using the first 119 observations and excluding the
last 12 months is
chnimpt = 248.58 + 5.15 t
(53.20) (0.77)
n = 119, R2 = .277, σ̂ = 288.33.
(ii) For the estimated AR(1) model for chnimpt, the reported standard errors are (54.71) and (.084), with
n = 118, R2 = .174, σ̂ = 308.17.
Because σ̂ is lower for the linear trend model, it provides the better in-sample fit.
(iii) Using the last 12 observations for one-step-ahead out-of-sample forecasting gives an
RMSE and MAE for the linear trend equation of about 315.5 and 201.9, respectively. For the
AR(1) model, the RMSE and MAE are about 388.6 and 246.1, respectively. In this case, the
linear trend is the better forecasting model.
[Instructor’s Note: In a model with a linear time trend and autoregressive term, both are
statistically significant with σ̂ = 285.03 – a slightly better in-sample fit than the linear trend
model. But, using the last 12 months for one-step-ahead forecasting, RMSE = 316.15 and MAE
= 202.73. Therefore, one actually does a bit worse in using the more general model compared
with the simple linear trend.]
(iv) Using again the first 119 observations, the F statistic for joint significance of febt, mart,
…, dect when added to the linear trend model is about 1.15 with p-value ≈ .328. (The df are 11
and 106.) So, there is no evidence that seasonality needs to be accounted for in forecasting
chnimp.
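A sketch of the seasonality check in part (iv): add monthly dummies to the linear trend model and test their joint significance with an F test. The column names chnimp and month (a 1-to-12 indicator) are assumptions about how the data are organized:

```python
# F test for joint significance of 11 monthly dummies added to a linear trend model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def seasonal_f_test(df: pd.DataFrame):
    d = df.assign(t=np.arange(1, len(df) + 1))
    restricted = smf.ols('chnimp ~ t', data=d).fit()
    full = smf.ols('chnimp ~ t + C(month)', data=d).fit()   # feb, ..., dec dummies
    f_stat, p_value, df_diff = full.compare_f_test(restricted)
    return f_stat, p_value, df_diff   # df_diff should equal 11
```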
C18.8 (i) As can be seen from the following graph, gfr does not have a clear upward or
downward trend. Starting from 1913, there is a sharp downward trend in fertility until the mid-
1930s, when the fertility rate bottoms out. Fertility increased markedly until the end of the baby
boom in the early 1960s, after which it fell sharply and then leveled off.
[Graph: gfr (vertical axis, roughly 65 to 125) plotted against year, 1913 to 1984.]
(ii) The regression of gfrt on a cubic in t, using the data up through 1979, gives
If we use the usual t critical values, all terms are very statistically significant, and the R-squared
indicates that this curve-fitting exercise tracks gfrt pretty well, at least up through 1979.
(iv) The regression of ∆gfrt on just an intercept, using data up through 1979, gives

∆gfrt = –.871
(.543)
n = 66, σ̂ = 4.41.
(The R-squared is identically zero since there are no explanatory variables. But σ̂, which
estimates the standard deviation of the error, is comparable to that in part (ii), and we see that it
is much smaller here.) The t statistic for the intercept is about –1.60, which is not significant at
the 10% level against a two-sided alternative. Therefore, it is legitimate to treat gfrt as having no
drift, if it is indeed a random walk. (That is, if gfrt = α0 + gfrt-1 + et, where {et} is a zero-mean,
serially uncorrelated process, then we cannot reject H0: α0 = 0.)
(v) The prediction of gfrn+1 is simply gfrn, so the prediction error is simply ∆gfrn+1 = gfrn+1 –
gfrn. Obtaining the MAE for the five prediction errors for 1980 through 1984 gives MAE ≈
.840, which is much lower than the 34.17 obtained with the cubic trend model. The random
walk is clearly preferred for forecasting.
(vi) In the estimated AR(2) model for gfrt, the second lag is significant. (Recall that its t statistic is valid even though gfrt apparently
contains a unit root: the coefficients on the two lags sum to .961.) The standard error of the
regression is slightly below that of the random walk model.
(vii) The out-of-sample forecasting performance of the AR(2) model is worse than the
random walk without drift: the MAE for 1980 through 1984 is about .991 for the AR(2) model.
[Instructor’s Note: As a third possibility, you might have the students estimate an AR(1) model
for ∆gfrt − that is, impose the unit root in the AR(2) model. The resulting MAE is about .879, so
it is better to impose the unit root than to estimate the unrestricted AR(2). But it still does less
well than the simple random walk without drift.]
C18.9 (i) The estimated equation is

yt = 3,186.04 + 116.24 t + .630 yt-1,

where yt is real per capita disposable income. (Notice how high the R-squared is. However, it is meaningless as a goodness-of-fit measure
because {yt} has a trend, and possibly a unit root.)
(ii) The forecast for 1990 (t = 32) is 3,186.04 + 116.24(32) + .630(17,804.09) ≈ 18,122.30,
because y is $17,804.09 in 1989. The actual value for real per capita disposable income was
$17,944.64, and so the forecast error is –$177.66.
(iii) The MAE for the 1990s, using the model estimated in part (i), is about 371.76.
(iv) The MAE for the forecasts in the 1990s is about 718.26. This is much higher than for the model
with yt-1, so we should use the AR(1) model with a linear time trend.
C18.10 (i) The AR(1) model for ∆r6, estimated using all but the last 16 observations, is
The RMSE for forecasting one-step-ahead over the last 16 quarters is about .704.
(ii) When the error correction term is added, the RMSE is about .788, which is higher than the RMSE without the error correction term.
Therefore, while the EC term improves the in-sample fit (and is statistically significant), it
actually hampers out-of-sample forecasting.
(iii) To make the forecasting exercises comparable, we exclude the last 16 observations to
estimate the cointegrating parameters. The CI coefficient is about 1.028. The estimated error
correction model is
which shows that this fits worse than the EC model when the cointegrating parameter is assumed
to be one. The RMSE for the last 16 quarters is .782, so this works slightly better. But both
versions of the EC model are dominated by the AR(1) model for ∆r6t.
[Instructor’s Note: Because ∆r6t-1 is only marginally significant in the AR(1) model, its
coefficient is small, and the intercept is also very small and insignificant, you might have the
students use zero to predict ∆r6 for each of the last 16 quarters. In other words, the “model” for
r6 is simply ∆r6t = ut, where ut is an unpredictable sequence with zero mean. The resulting
RMSE is about .657, which means this works best of all. The lesson is that econometric methods
are not always called for, or even desirable.]
(iv) The conclusions would be identical because, as shown in Problem 18.9, the one-step-
ahead errors for forecasting r6n+1 are identical to those for forecasting ∆r6n+1.
C18.11 (i) For lsp500, the ADF statistic without a trend is t = −.79; with a trend, the t statistic is
−2.20. These are both well above their respective 10% critical values. In addition, the estimated
roots are quite close to one. For lip, the ADF statistic without a trend is −1.37 without a trend
and −2.52 with a trend. Again, these are not close to rejecting even at the 10% levels, and the
estimated roots are very close to one.
(ii) The regression of lsp500 on lip gives

lsp500 = −2.402 + 1.694 lip
(.095) (.024)
n = 558, R2 = .903.
The t statistic for lip is over 70, and the R-squared is over .90. These are hallmarks of spurious
regressions.
(iii) Using the residuals ût obtained in part (ii), the ADF statistic (with two lagged changes)
is −1.57, and the estimated root is over .99. There is no evidence of cointegration. (The 10%
critical value is −3.04.)
(iv) After adding a linear time trend to the regression from part (ii), the ADF statistic applied
to the residuals is −1.88, and the estimated root is again about .99. Even with a time trend there
is no evidence of cointegration.
(v) It appears that lsp500 and lip do not move together in the sense of cointegration, even if
we allow them to have unrestricted linear time trends. The analysis does not point to a long-run
equilibrium relationship.
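For this kind of exercise, the built-in Engle-Granger test in statsmodels can be used instead of running the residual regression by hand; a sketch, with the two series passed in directly and the lag length fixed at two as in part (iii):

```python
# Engle-Granger test of no cointegration; trend='ct' adds a linear time trend
# to the first-stage regression, as in part (iv).
from statsmodels.tsa.stattools import coint

def eg_cointegration(lsp500, lip, with_trend: bool = False, lags: int = 2):
    trend = 'ct' if with_trend else 'c'
    stat, pvalue, crit = coint(lsp500, lip, trend=trend, maxlag=lags, autolag=None)
    return stat, pvalue, crit   # crit holds the 1%, 5%, and 10% critical values
```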
C18.12 (i) The F statistic for the second and third lags, with 2 and 550 degrees of freedom, gives
F = 3.76 and p-value = .024.
(ii) When pcspt-1 is added to the AR(3) model in part (i), its coefficient is about .031 and its
t statistic is about 2.40. Therefore, we conclude that pcsp does Granger cause pcip.
(iii) The heteroskedasticity-robust t statistic is 2.47, so the conclusion from part (ii) does not
change.
C18.13 (i) The DF statistic is about −3.31, which is to the left of the 2.5% critical value (−3.12),
and so, using this test, we can reject a unit root at the 2.5% level. (The estimated root is about
.81.)
(ii) When two lagged changes are added to the regression in part (i), the t statistic becomes
−1.50, and the root is larger (about .915). Now, there is little evidence against a unit root.
(iii) If we add a time trend to the regression in part (ii), the ADF statistic becomes −3.67, and
the estimated root is about .57. The 2.5% critical value is −3.66, and so we are back to fairly
convincingly rejecting a unit root.
(iv) The best characterization seems to be an I(0) process about a linear trend. In fact, a
stable AR(3) about a linear trend is suggested by the regression in part (iii).
(v) For prcfatt, the ADF statistic without a trend is −4.74 (estimated root = .62) and with a
time trend the statistic is −5.29 (estimated root = .54). Here, the evidence is strongly in favor of
an I(0) process whether or not we include a trend.
C18.14 (i) Using the ADF regression with one lag and a time trend, the coefficient on
lwage232t-1 is only −.0056 with t = −1.39. The estimated root is so close to one (.9944) that the
test outcome is almost superfluous. Anyway, the 10% critical value is −3.12 (from Table 18.3),
and the t statistic is well above that.
For the employment series, the coefficient on lemp232t-1 in the ADF regression is actually
slightly positive, .0008. For practical purposes, the root is estimated to be one.
With so little evidence against a unit root – including estimated roots that are virtually one –
the only sensible approach to analyzing these series is to treat them as having unit roots.
(ii) Without a time trend, the regression of ∆ût on ût−1, ∆ût−1, and ∆ût−2 (where the ût are the
residuals from the cointegrating regression) gives a coefficient on ût−1 of .00008, and so the
estimated root is one. When time is added to the initial regression, the coefficient on ût−1
becomes .0022; again, there is no evidence for cointegration. (This is one of those somewhat
unusual cases where, even after obtaining the residuals from an OLS regression, the estimated
root is actually greater than one.)
(iii) If we use the real wage, as defined in the problem, and include a time trend, the
coefficient on ût−1 is −.044 with t = −3.09. Now, at least the estimated root is less than one
(about .956), but not much less. The 10% critical value for the E-G test with a time trend is
−3.50, and so we cannot reject the null that lemp232t and lrwage232t are not cointegrated.
There is more evidence of cointegration when we use the real wage, but such a large root means
that, if the two series are returning to an equilibrium (around a trend), it happens very slowly.
(iv) An important variable omitted from the cointegrating relationship is worker productivity. Economic theory
tells us the demand for labor depends on worker productivity in addition to real wages.
Productivity is often modeled as following a random walk (with positive drift), and just
including a linear time trend may not sufficiently capture changes in productivity. Factors such
as other sources of income – that is, nonwage income – can affect the supply of labor, and other
income (or its log) likely has a unit root.
C18.15 (i) The usual DF test, obtained by regressing curate on a lag of urate, gives a very
small coefficient, −.0063, and a t statistic, −.79, that is not close to being significant. Adding two
lags of curate changes little: The coefficient on urate_1 becomes −.0086 with t = −1.22. There is
very little evidence against the unit root hypothesis for urate. The coefficients on both lags of
curate are positive and statistically very significant, but the outcome of the augmented DF test is
essentially the same as the usual DF test.
(ii) For vrate the outcome is less clear cut. From the simple DF regression, the estimated ρ is
about .925. The DF t = −2.68 is below the 10% critical value, −2.57, but above the 5% critical
value, −2.86. When we use the ADF statistic, the evidence against a unit root is even weaker, with
t = −2.19. Based on the ADF, we cannot reject the null hypothesis of a unit root at even the 10%
level, and the two lags of cvrate are both very statistically significant, which justifies relying on
the ADF statistic. Because the estimated value of ρ is pretty high, we operate under the
assumption that vrate also has a unit root. (If we do not, the Beveridge Curve would make no
sense, as it would be relating an I(1) variable, urate, to an I(0) variable, vrate.)
(iii) To implement the Engle-Granger test for cointegration, we regress urate on vrate and get
the residuals, say, û. We then run û through a standard DF test. The coefficient on ût−1 is −.148
and the Engle-Granger t statistic is −3.23. This is below the 10% critical value, −3.04, but above
the 5% critical value, −3.34. Thus, we reject the null of no cointegration at the 10% level but not
the 5% level.
(iv) The 95% confidence interval runs from −4.56 to −3.40. If we use the usual OLS standard error, the 95% CI is narrower because the
standard error is smaller; it runs from −4.38 to −3.58, but we should not rely on this.
(v) When two lags are added to the EG regression, the t statistic falls dramatically in
magnitude: t = −1.05. Now there is no evidence of cointegration, which is particularly troubling
because the two lags have very statistically significant coefficients.
Given the evidence from the augmented EG statistic, it is clear that the claim that urate and
vrate are cointegrated is not supported by the data. It seems that, from a modern time series
perspective – at least during a recent period, and with monthly data – the Beveridge Curve may
not be well defined.