6. Regression with panel data
Key feature of this section:
Up to now, analysis of data on n distinct entities at a given
point of time
(cross sectional data)
Example:
Student-performance data set
Observations on different schooling characteristics in n =
420 districts (entities)
Now, data structure in which each entity is observed at two
or more points of time
Panel data
150
6.1. Structure of panel data sets
Definition 6.1: (Panel data)
Panel data consist of observations on the same n entities at two
or more time periods T . If the data set contains observations
on the independent variables X1, X2, . . . , Xk and the dependent
variable Y , then we denote the data by
(X1,it, X2,it, . . . , Xk,it, Yit),
i = 1, . . . , n and t = 1, . . . , T,
where the first subscript, i, refers to the entity being observed
and the second subscript, t, refers to the date at which it is
observed.
151
Selected observations on cigarette sales, prices, and taxes, by state and year
for U.S. states, 1985-1995
152
Terminology:
A balanced panel is a panel that has all its observations
(focus of this lecture)
An unbalanced panel is a panel that has some missing data
for at least one time period or for at least one entity
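The distinction between balanced and unbalanced panels can be checked mechanically. The following is a minimal sketch, assuming the panel is stored in "long" format with one row per entity and period; the column names, the helper `is_balanced`, and all data values are made up for illustration and are not taken from the lecture's data sets.

```python
import pandas as pd

# Hypothetical toy panel in long format: one row per (entity, period).
panel = pd.DataFrame({
    "state": ["AL", "AL", "AZ", "AZ", "AR"],
    "year": [1982, 1988, 1982, 1988, 1982],
    "fatality_rate": [2.1, 2.0, 2.5, 2.3, 1.9],  # made-up values
})

def is_balanced(df, entity="state", time="year"):
    """Return True if every entity is observed exactly once in every period."""
    n = df[entity].nunique()          # number of entities
    T = df[time].nunique()            # number of time periods
    no_duplicates = not df.duplicated([entity, time]).any()
    return no_duplicates and len(df) == n * T

balanced_before = is_balanced(panel)          # "AR" has no 1988 row
complete = panel[panel["state"] != "AR"]      # drop the incomplete entity
balanced_after = is_balanced(complete)
```

The check only counts rows; missing values inside an existing row would need a separate test.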
Description of example data set:
Traffic deaths and alcohol taxes
(State Traffic Fatality (STF) data set)
How effective are various government policies designed to
discourage drunk driving in reducing traffic deaths?
153
Description of example data set: [continued]
Annual data from 1982 to 1988 for 48 U.S. states
(excluding Alaska and Hawaii)
Important variables:
FATALITYRATE is the number of annual traffic deaths per
10000 people in the population in the state
BEERTAX is the real tax on a case of beer, expressed in 1988
U.S. dollars by adjusting for inflation
Various dummy variables indicating state-specific characteristics such as legal drinking age and punishment
154
Preliminary analysis:
In a first step we focus on the two years 1982 and 1988 and,
for each year, perform an OLS regression of FATALITYRATE on
BEERTAX
The estimated regression equations (neglecting subscripts)
for the 1982 and 1988 data along with the standard errors
(in brackets) are given by
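$$\widehat{FATALITYRATE} = \underset{(0.15)}{2.01} + \underset{(0.13)}{0.15} \, BEERTAX \quad \text{(1982 data)} \tag{6.1}$$
$$\widehat{FATALITYRATE} = \underset{(0.11)}{1.86} + \underset{(0.13)}{0.44} \, BEERTAX \quad \text{(1988 data)} \tag{6.2}$$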
155
The traffic fatality rate and the tax on beer
156
Preliminary analysis: [continued]
The OLS estimate $\hat{\beta}_1$ for the 1982 data is not significant at
the 10% level
(the t-statistic is 1.15 < 1.64)
The OLS estimate $\hat{\beta}_1$ for the 1988 data is significant at the
1% level
(the t-statistic is 3.43 > 2.58)
Both OLS estimates are positive, which, taken literally, implies
that higher real beer taxes are associated with more (not
fewer) traffic fatalities
Indication of substantial omitted variable bias
157
Preliminary analysis: [continued]
Some potentially neglected state-specific factors:
Quality of automobiles driven in the state
Quality of state highways
Rural versus urban driving
Density of cars on the road
Cultural acceptance of drinking and driving
158
Problem:
Some of these variables (such as the cultural acceptance of
drinking and driving) might be hard or even impossible to
measure
Possible resort:
If these factors remain constant over time in a given state,
then we make use of the panel data structure to effectively
hold these factors constant even though we cannot measure
them
OLS regression with fixed effects
159
6.2. Panel data with two time periods: before-and-after comparisons
Aim of this section:
Provision of intuition on how we can exploit the panel data
structure to mitigate the omitted-variable-bias problem
Approach:
We consider a panel with T = 2 time periods
We focus on changes in the dependent variable
This before-and-after comparison holds constant the unobserved factors that differ from one state to the next but do
not change over time within the state
160
More explicitly:
Consider the variable Zi with the following properties:
Zi determines the fatality rate in the ith state
Zi does not change over time
(no time-subscript t)
For example, Zi could represent the local cultural attitude
towards drinking and driving which changes slowly
(we consider it to be constant between 1982 and 1988)
Regression equation:
$$FATALITYRATE_{it} = \beta_0 + \beta_1 BEERTAX_{it} + \beta_2 Z_i + u_{it} \tag{6.3}$$
with i = 1, . . . , n and t = 1, 2
161
Now:
Zi does not change over time
Zi does not produce any change in FATALITYRATE between
1982 and 1988
We eliminate the impact of Zi by analyzing the change in
FATALITYRATE between the two periods
Derivation of the change:
Regression equations for each time period:
$$FATALITYRATE_{i1982} = \beta_0 + \beta_1 BEERTAX_{i1982} + \beta_2 Z_i + u_{i1982}$$
$$FATALITYRATE_{i1988} = \beta_0 + \beta_1 BEERTAX_{i1988} + \beta_2 Z_i + u_{i1988}$$
162
Derivation of the change: [continued]
Subtracting the 1982 equation from the 1988 equation:
$$FATALITYRATE_{i1988} - FATALITYRATE_{i1982} = \beta_1 (BEERTAX_{i1988} - BEERTAX_{i1982}) + u_{i1988} - u_{i1982} \tag{6.4}$$
Interpretation of Eq. (6.4):
Zi does not change between 1982 and 1988
Any changes in traffic fatalities over time must have arisen
from other sources
These changes are
changes in the tax on beer
changes in the error terms
(capturing changes in other factors that affect traffic deaths)
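The logic of the before-and-after comparison can be illustrated with a small simulation. This is a sketch on synthetic data only: the parameter values, the way the regressor is correlated with the unobserved factor, and the helper `ols_slope` are assumptions made for the illustration, not properties of the STF data set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 48                                   # number of entities (synthetic data)
beta0, beta1, beta2 = 2.0, -1.0, 1.0     # assumed "true" parameters

Z = rng.normal(size=n)                   # unobserved, constant over time
X1 = 0.8 * Z + rng.normal(size=n)        # regressor in period 1, correlated with Z
X2 = 0.8 * Z + rng.normal(size=n)        # regressor in period 2
Y1 = beta0 + beta1 * X1 + beta2 * Z + 0.1 * rng.normal(size=n)
Y2 = beta0 + beta1 * X2 + beta2 * Z + 0.1 * rng.normal(size=n)

def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1]

# Cross-sectional regression for one period: Z is omitted, so the slope is biased.
slope_cross_section = ols_slope(X1, Y1)
# Regression in changes: Z drops out of the differenced equation.
slope_differences = ols_slope(X2 - X1, Y2 - Y1)
```

In this setup the omitted factor pushes the cross-sectional slope upward, while the slope from the regression in changes stays close to the assumed value of -1.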
163
More precisely:
Specifying the regression in changes, as in Eq. (6.4), eliminates the
effect of the unobserved variables $Z_i$ that are constant over
time
Analyzing changes in Y and X has the effect of controlling
for variables that are constant over time, thereby eliminating
this source of omitted variable bias
Plot the change in the fatality rate between 1982 and
1988 against the change in the real beer tax between 1982
and 1988 for the 48 U.S. states
164
Changes in fatality rates and beer taxes, 1982-1988
165
Empirical results:
OLS estimation results:
$$\widehat{FATALITYRATE_{i1988} - FATALITYRATE_{i1982}} = \underset{(0.065)}{-0.072} - \underset{(0.36)}{1.04} \, (BEERTAX_{i1988} - BEERTAX_{i1982}) \tag{6.5}$$
Intercept in Eq. (6.5) allows for the possibility that the mean
change in the fatality rate, in the absence of a change in the
real beer tax, is nonzero
The negative intercept (-0.072) could reflect improvements
in auto safety from 1982 to 1988 that reduced the average
fatality rate
166
Empirical results: [continued]
Estimated effect of a change in the real beer tax is negative
(as predicted by economic theory)
OLS slope coefficient of -1.04 is significant at the 1% level
(the absolute value of the t-statistic is 2.89 > 2.58)
An increase in the real beer tax by one dollar per case reduces the
traffic fatality rate by 1.04 deaths per 10000 people
(substantial effect)
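As a worked check of the significance statement, using the slope estimate of -1.04 and the standard error 0.36 reported for Eq. (6.5): the 95% confidence interval below is an additional illustration based on the usual large-sample critical value 1.96 and is not stated on the slides.

```latex
t = \frac{-1.04}{0.36} \approx -2.89, \qquad |t| = 2.89 > 2.58
\quad \text{(significant at the 1\% level)}

\text{95\% confidence interval:} \quad
-1.04 \pm 1.96 \times 0.36 = [-1.75, \; -0.33]
```

The interval excludes zero, in line with the rejection at the 1% level.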
Remarks:
The regression Eq. (6.5) controls for fixed factors such as
cultural attitudes towards drinking and driving
There are other factors influencing traffic safety
167
Remarks: [continued]
If these factors change over time and are correlated with
the real beer tax, then their omission will produce omitted
variable bias
More careful analysis in Section 6.5
Transference of the ideas valid for T = 2 to more than 2
time periods (T > 2)
Method of fixed effects regression
168
6.3. Fixed effects regression
Now:
Method for controlling for omitted variables in panel data
when the omitted variables vary across entities but do not
change over time
The fixed effects regression model has n different intercepts,
one for each entity
These intercepts can be represented by a set of binary variables
These binary variables absorb the influences of all omitted
variables that differ from one entity to the next but are constant over time
169
More explicitly:
Consider the regression model (6.3) from Slide 161:
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}, \tag{6.6}$$
where Zi is an unobserved variable that varies from one state
to the next but does not change over time
(for example, Zi represents cultural attitudes toward drinking
and driving)
We aim at estimating $\beta_1$, the effect on Y of X holding constant the unobserved state characteristic Z
We can interpret Eq. (6.6) as having n intercepts, one for
each entity
170
More explicitly: [continued]
Specifically, define $\alpha_i = \beta_0 + \beta_2 Z_i$, so that Eq. (6.6) becomes
$$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it} \tag{6.7}$$
$\alpha_1, \dots, \alpha_n$ are treated as state-specific intercepts to be estimated
Population regression line for the ith state: $Y_{it} = \alpha_i + \beta_1 X_{it}$
The slope coefficient $\beta_1$ is the same for all states, but the
intercept varies from one state to the next
The intercept $\alpha_i$ can be thought of as the effect of being
in entity i
171
More explicitly: [continued]
The terms $\alpha_1, \dots, \alpha_n$ are known as entity fixed effects
The variation in the entity fixed effects comes from omitted
variables (like Zi in Eq. (6.6)) that vary across entities but
not over time
Eq. (6.7) is known as the fixed effects regression model
Representation with dummy variables:
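Consider the $n - 1$ dummy variables
$$D_{2,i} = \begin{cases} 1 & \text{when } i = 2 \\ 0 & \text{otherwise} \end{cases}, \quad \dots, \quad D_{n,i} = \begin{cases} 1 & \text{when } i = n \\ 0 & \text{otherwise} \end{cases}$$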
172
Representation with dummy variables: [continued]
Then, the fixed effects regression model (6.7) can be equivalently expressed as
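$$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D_{2,i} + \dots + \gamma_n D_{n,i} + u_{it}, \tag{6.8}$$
where $\beta_0, \beta_1, \gamma_2, \dots, \gamma_n$ are coefficients to be estimated
Relationships between parameters in Eqs. (6.7) and (6.8):
$$\alpha_1 = \beta_0, \quad \alpha_2 = \beta_0 + \gamma_2, \quad \dots, \quad \alpha_n = \beta_0 + \gamma_n$$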
The entity-specific intercepts in Eq. (6.7) and the binary
regressors in Eq. (6.8) have the same source, namely the
unobserved variable Zi that varies across entities but not
over time
173
Now:
Extension to multiple X-regressors
Definition 6.2: (Fixed effects regression model)
The fixed effects regression model is
$$Y_{it} = \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \alpha_i + u_{it}, \tag{6.9}$$
where $i = 1, \dots, n$ and $t = 1, \dots, T$ and $\alpha_1, \dots, \alpha_n$ are the entity-specific intercepts. Equivalently, the fixed effects regression
model can be written in terms of a common intercept, the X-regressors and the $n - 1$ dummy variables defined on Slide 172:
$$Y_{it} = \beta_0 + \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \gamma_2 D_{2,i} + \dots + \gamma_n D_{n,i} + u_{it}. \tag{6.10}$$
174
Estimation and inference:
In principle, the binary variable specification (6.10) can be
estimated via OLS
However, specification (6.10) requires estimation of k + n parameters, which becomes problematic if the number of entities
n is large
Use of special routines for OLS estimation of fixed effects
regressions
(Two-step) entity-demeaned OLS algorithm
Subtract the entity-specific averages from each variable
Perform OLS regression using the entity-demeaned variables
175
Estimation and inference: [continued]
Example:
Consider the (single-regressor) fixed effects model (6.7)
Taking (time) averages on both sides of (6.7) yields
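$$\bar{Y}_i = \beta_1 \bar{X}_i + \alpha_i + \bar{u}_i$$
with $\bar{Y}_i = (1/T) \sum_{t=1}^{T} Y_{it}$ and $\bar{X}_i$ and $\bar{u}_i$ similarly defined
It follows from Eq. (6.7) that
$$\begin{aligned}
\underbrace{Y_{it} - \bar{Y}_i}_{\equiv \tilde{Y}_{it}}
&= \beta_1 X_{it} + \alpha_i + u_{it} - \beta_1 \bar{X}_i - \alpha_i - \bar{u}_i \\
&= \beta_1 \underbrace{(X_{it} - \bar{X}_i)}_{\equiv \tilde{X}_{it}} + \underbrace{(u_{it} - \bar{u}_i)}_{\equiv \tilde{u}_{it}}
\end{aligned}$$
that is,
$$\tilde{Y}_{it} = \beta_1 \tilde{X}_{it} + \tilde{u}_{it} \tag{6.11}$$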
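The entity-demeaned OLS algorithm for the single-regressor case can be sketched in a few lines. The function name `within_slope` and the toy data are assumptions for illustration; the example is noiseless, so the estimated slope coincides with the slope used to generate the data.

```python
import numpy as np

def within_slope(x, y, entity):
    """Entity-demeaned OLS slope for one regressor.

    Step 1: subtract the entity-specific averages from x and y.
    Step 2: regress demeaned y on demeaned x (no intercept is needed).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    _, idx = np.unique(np.asarray(entity), return_inverse=True)
    counts = np.bincount(idx)                                  # observations per entity
    x_tilde = x - (np.bincount(idx, weights=x) / counts)[idx]  # X_it minus entity mean
    y_tilde = y - (np.bincount(idx, weights=y) / counts)[idx]  # Y_it minus entity mean
    return float(x_tilde @ y_tilde / (x_tilde @ x_tilde))

# Toy panel (made-up values): two entities with very different intercepts.
entity = np.array(["a", "a", "a", "b", "b", "b"])
x = np.array([1.0, 2.0, 4.0, 0.0, 3.0, 5.0])
intercepts = {"a": 10.0, "b": -3.0}
y = 2.0 * x + np.array([intercepts[e] for e in entity])

slope = within_slope(x, y, entity)
```

The entity intercepts never have to be estimated explicitly, which is what makes this approach attractive when n is large.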
176
Estimation and inference: [continued]
Example: [continued]
Estimation of $\beta_1$ in Eq. (6.11) via OLS
Under certain assumptions stated on Slide 187 (the so-called
fixed effects regression assumptions)
the sampling distribution of the OLS estimator is normal
in large samples
the variance and the standard error of the sampling distribution can be estimated from the data
Hypothesis testing (based on t- and F -statistics) and construction of confidence intervals in exactly the same way
as in multiple regressions with cross-sectional data
177
Application to traffic deaths:
OLS estimate of the fixed effects regression based on all
T = 7 years of data (observations) is
$$\widehat{FATALITYRATE} = \underset{(0.29)}{-0.66} \, BEERTAX + StateFixedEffects$$
The sign of $\hat{\beta}_1$ is negative and the coefficient is significant
at the 5% level
Including state fixed effects avoids omitted variable bias arising from omitted factors that vary across states but are constant over time
What about the effects of omitted factors that evolve over
time but are the same for all states?
(for example, overall automobile safety improvements)
Regression with time fixed effects
178
6.4. Regression with time fixed effects
Now:
We aim at controlling for variables that are constant across
entities but evolve over time
(such as overall safety improvements in new cars)
To this end, we augment our regression Eq. (6.6) from Slide
170 to take the form
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}, \tag{6.12}$$
where St is an unobserved variable (representing automobile
safety) that changes over time but is constant across states
Note that omitting St from the regression may lead to omitted variable bias
179
Time effects only:
Let us consider for the moment that the variables Zi are not
present, so that Eq. (6.12) becomes
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_3 S_t + u_{it} \tag{6.13}$$
Similar to the entity fixed effects model, it is possible to
eliminate St from Eq. (6.13)
Specifically, we set $\lambda_t = \beta_0 + \beta_3 S_t$ to obtain
$$Y_{it} = \beta_1 X_{it} + \lambda_t + u_{it} \tag{6.14}$$
This model has a different intercept, $\lambda_t$, for each time period,
which can be thought of as the effect on Y of time period t
$\lambda_1, \dots, \lambda_T$ are known as time fixed effects, whose variation
stems from omitted variables (like $S_t$) that vary over time
but not across entities
180
Time effects only: [continued]
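Considering the $T - 1$ binary variables
$$B_{2,t} = \begin{cases} 1 & \text{when } t = 2 \\ 0 & \text{otherwise} \end{cases}, \quad \dots, \quad B_{T,t} = \begin{cases} 1 & \text{when } t = T \\ 0 & \text{otherwise} \end{cases}$$
we can equivalently express model (6.14) as
$$Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_2 B_{2,t} + \dots + \delta_T B_{T,t} + u_{it}, \tag{6.15}$$
where $\beta_0, \beta_1, \delta_2, \dots, \delta_T$ are coefficients to be estimated
Relationships between parameters in Eqs. (6.14) and (6.15):
$$\lambda_1 = \beta_0, \quad \lambda_2 = \beta_0 + \delta_2, \quad \dots, \quad \lambda_T = \beta_0 + \delta_T$$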
(see Eqs. (6.7) and (6.8) on Slides 171, 173)
181
Now:
Combination of entity and time fixed effects
Definition 6.3: (Entity and time fixed effects regression model)
The entity and time fixed effects regression model is
$$Y_{it} = \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \alpha_i + \lambda_t + u_{it}, \tag{6.16}$$
where $\alpha_1, \dots, \alpha_n$ are the entity fixed effects and $\lambda_1, \dots, \lambda_T$ are the time fixed
effects. Equivalently, the entity and time fixed effects regression
model can be written in terms of a common intercept, the X-regressors and the $n - 1$ and $T - 1$ dummy variables defined on
Slides 172, 181:
$$Y_{it} = \beta_0 + \beta_1 X_{1,it} + \dots + \beta_k X_{k,it} + \gamma_2 D_{2,i} + \dots + \gamma_n D_{n,i} + \delta_2 B_{2,t} + \dots + \delta_T B_{T,t} + u_{it}. \tag{6.17}$$
182
Remark:
The combined entity and time fixed effects regression model
eliminates omitted variable bias arising both from unobserved variables that are constant over time and from unobserved variables that are constant across states
Parameter estimation:
The full model (6.17) can in principle be estimated by OLS
Most software packages implement a two-step algorithm using entity and time-period demeaned Y and X-variables
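For a balanced panel with a single regressor, demeaning by entity and by time period can be sketched as follows. This is an illustration under stated assumptions (synthetic noiseless data, balanced panel stored as an n x T array, hypothetical helper `two_way_demean`); it is not a description of how any particular software package implements the estimator.

```python
import numpy as np

n, T = 4, 5
rng = np.random.default_rng(1)
alpha = rng.normal(size=n)            # entity fixed effects (assumed values)
lam = rng.normal(size=T)              # time fixed effects (assumed values)
beta1 = -0.6                          # assumed slope
X = rng.normal(size=(n, T))           # rows: entities, columns: time periods
Y = beta1 * X + alpha[:, None] + lam[None, :]   # noiseless for exactness

def two_way_demean(A):
    """Subtract entity means and time means, then add back the overall mean.

    Valid as written for balanced panels only.
    """
    entity_means = A.mean(axis=1, keepdims=True)
    time_means = A.mean(axis=0, keepdims=True)
    return A - entity_means - time_means + A.mean()

X_tilde, Y_tilde = two_way_demean(X), two_way_demean(Y)
beta1_hat = (X_tilde * Y_tilde).sum() / (X_tilde ** 2).sum()
```

Both sets of fixed effects cancel in the transformed variables, so the regression of the demeaned Y on the demeaned X recovers the slope without estimating any of the n + T intercept terms.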
183
Application to traffic deaths:
OLS estimate of the entity and time fixed effects regression:
$$\widehat{FATALITYRATE} = \underset{(0.36)}{-0.64} \, BEERTAX + SF\ Effects + TF\ Effects$$
This specification includes
47 state binary variables (state fixed effects, not reported)
6 single-year binary variables (time fixed effects, not reported)
the variable BEERTAX
the intercept (not reported)
184
Application to traffic deaths: [continued]
Time fixed effects have little impact on beer tax coefficient
(cf. regression estimation on Slide 178)
Coefficient is significant at the 10% level
(but not at the 5% level; t-statistic is -0.64/0.36 = -1.78)
Estimation is immune to omitted variable bias from variables
that are constant either over time or across states
However, other relevant but omitted variables may vary both
across states and over time
Specification might still be subject to omitted variable bias
More careful analysis of the dataset
see class
185
6.5. The fixed effects regression assumptions and
standard errors for fixed effects regression
Aim of this section:
Formulation of OLS assumptions of the fixed effects regression model so that Theorem 2.4 on Slide 19 holds for the
involved OLS estimators
(especially the asymptotic normal distribution when n is large)
Some comments on the standard errors for fixed effects regressions
186
Definition 6.4: (Fixed effects regression assumptions)
We consider the fixed effects regression model
$$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}, \quad i = 1, \dots, n, \; t = 1, \dots, T.$$
The following are called the fixed effects regression assumptions:
1. $u_{it}$ has conditional mean zero:
$$E(u_{it} \mid X_{i1}, X_{i2}, \dots, X_{iT}, \alpha_i) = 0.$$
2. (Xi1, Xi2, . . . , XiT , ui1, ui2, . . . , uiT ), i = 1, . . . , n, are i.i.d. draws
from their joint distribution.
3. Large outliers are unlikely: Xit and uit have nonzero finite
fourth moments.
4. There is no perfect multicollinearity.
For multiple regressors, Xit should be replaced by the full list
X1,it, X2,it, . . . , Xk,it.
187
Remarks:
Definition 6.4 focuses on entity fixed effects regressions neglecting time effects
An extension for including time fixed effects is straightforward
Standard errors for fixed effects regression:
Autocorrelated errors are a pervasive phenomenon in data
with a time component
(see Section 3.1.2. on Slides 48, 49)
In the case of autocorrelated errors standard errors should
be computed using the HAC estimator of the variance
One type of HAC standard errors is clustered standard errors, which are used in the
analysis of the traffic-fatality data set
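A minimal sketch of clustered standard errors for the single-regressor entity fixed effects model is given below. It assumes a balanced panel, treats each entity as one cluster, and applies no finite-sample correction, so the numbers will generally differ slightly from those reported by statistical software. The function name and the synthetic data (including the AR(1) error process) are assumptions for illustration.

```python
import numpy as np

def fe_slope_clustered_se(X, Y):
    """Entity fixed effects slope and cluster-robust standard error.

    X, Y: arrays of shape (n, T) for a balanced panel with one regressor.
    Clusters are the entities; no finite-sample correction is applied.
    """
    X_t = X - X.mean(axis=1, keepdims=True)     # entity-demeaned X
    Y_t = Y - Y.mean(axis=1, keepdims=True)     # entity-demeaned Y
    sxx = (X_t ** 2).sum()
    beta1_hat = (X_t * Y_t).sum() / sxx
    resid = Y_t - beta1_hat * X_t
    # Sum X_t * resid within each entity before squaring, which allows
    # arbitrary correlation of the errors over time within an entity.
    scores = (X_t * resid).sum(axis=1)
    se = np.sqrt((scores ** 2).sum()) / sxx
    return beta1_hat, se

# Synthetic example (assumed values): autocorrelated errors within entities.
rng = np.random.default_rng(2)
n, T, beta1, rho = 200, 7, -0.5, 0.8
alpha = rng.normal(size=(n, 1))                 # entity fixed effects
X = alpha + rng.normal(size=(n, T))             # regressor correlated with alpha
u = np.zeros((n, T))
u[:, 0] = rng.normal(size=n)
for t in range(1, T):
    u[:, t] = rho * u[:, t - 1] + rng.normal(size=n)   # AR(1) errors
Y = beta1 * X + alpha + u

beta1_hat, se = fe_slope_clustered_se(X, Y)
```

Because the products of regressor and residual are summed within each entity first, the formula does not rely on the errors of one entity being uncorrelated across time periods.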
188