Lecture 4 - Panel Data Analysis

Lecture 4 covers panel data analysis, emphasizing its advantages over cross-sectional data, particularly in addressing omitted variable bias. It discusses key features of panel models, including individual-specific effects, strict exogeneity, and estimation techniques like the first difference estimator and pooled OLS. The lecture also highlights the importance of within variation and provides practical Stata commands for analysis.

Uploaded by

keyi.zhang
Lecture 4: Panel data analysis

Contents: Motivation; Advantage of Panel Data; Main features of panel models; The individual specific
effect; Strict exogeneity; Between and within variation; Classification; The first difference estimator.
Advantage of (pseudo) panel data over cross-section data
Aim: to motivate the use of panel-data models
Panel data: the same units (individuals, etc.) are followed over time.
Estimation of dynamic models (or transition models) is impossible with a time series of
cross sections, in which different individuals are observed in each period; it requires panel data.
The primary reason for using panel data is to solve the statistical problem of omitted variables.
Main features of panel models
Aim: to relate panel data to cross sections and time series, and to introduce the assumption of no
correlation between individual-specific effects and explanatory variables.
Panel data = cross sections + time series
The individual-specific effect
Aim: to formalize the use of the unobserved effect αi
Yit = αi + β1X1it + ... + βkXkit + uit (i = 1, 2, ..., N; t = 1, 2, ..., T)
This equation is the unobserved effects model, also called the fixed effects model.
αi is the individual-specific effect (a random variable), also called the fixed effect or unobserved effect.
uit is the idiosyncratic error term, i.i.d. (independently and identically distributed) with
expected value zero and constant variance.
N: cross-sectional dimension; T: time dimension
There are k different explanatory variables.
αi captures all individual-specific variables that are not observed by the researcher.
vit = αi + uit (composite error = individual-specific effect + idiosyncratic error term)
Econometric issue: what is strict exogeneity?
Aim: to formalize a linear panel-data model
Assumption TS.3 (strict exogeneity)
For each t, the expected value of ut, given ALL of the k explanatory variables for ALL T time
periods, is equal to zero: E(ut | X) = 0
Assumption TS.3' (contemporaneous exogeneity)
For each t, the expected value of ut, given ALL of the k explanatory variables in period t, is
equal to zero: E(ut | Xt1, ..., Xtk) = E(ut | Xt) = 0, which implies Corr(ut, Xtj) = 0 for j = 1, ..., k
Violation of the strict exogeneity assumption while the contemporaneous exogeneity assumption is
satisfied
Aim: to reconsider the issue of strict exogeneity for panel data models. Models with a lagged
dependent variable (yt-1) or with a feedback mechanism do NOT satisfy strict exogeneity.
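A short worked example may help show why a lagged dependent variable conflicts with strict exogeneity. It is only a sketch: the AR(1) coefficient ρ, the single-regressor setup, and the assumption that uit is uncorrelated with αi and with past values of y are all introduced here for illustration and are not taken from the lecture.

```latex
\begin{align*}
% Model whose only regressor is the lagged dependent variable
y_{it} &= \alpha_i + \rho\, y_{i,t-1} + u_{it}, \qquad x_{it} \equiv y_{i,t-1} \\
% Next period's regressor is this period's outcome, which contains u_{it}
x_{i,t+1} &= y_{it} = \alpha_i + \rho\, y_{i,t-1} + u_{it} \\
\operatorname{Cov}(x_{i,t+1}, u_{it}) &= \operatorname{Var}(u_{it}) \neq 0 \\
% Strict exogeneity fails, while contemporaneous exogeneity can still hold
E(u_{it} \mid x_{i1}, \dots, x_{iT}) &\neq 0, \qquad E(u_{it} \mid x_{it}) = 0
\end{align*}
```

The error in period t is correlated with the regressor in period t+1, so conditioning on the regressors of all periods no longer gives a zero mean, even though conditioning on the current regressor alone does.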
Between and within variation
Between variation: the cross-sectional variation (across individuals).
Within variation: the time-series variation for a given individual (within individuals).
Economists are usually more interested in the within variation than in the between variation.
To measure the effect of within variation in X on Y, we need to control for individual effects. Controlling
for them allows for correlation between the individual effect and the explanatory variable.
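The split into between and within variation can be written out as a decomposition of the total sum of squares. This is a sketch that assumes a balanced panel (every individual observed in all T periods); the bar notation for means is introduced here and does not appear in the lecture.

```latex
\begin{align*}
% Individual mean and overall mean
\bar{x}_i &= \frac{1}{T}\sum_{t=1}^{T} x_{it}, \qquad
\bar{x} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it} \\
% Each deviation from the overall mean splits into a within and a between part
x_{it} - \bar{x} &= \underbrace{(x_{it} - \bar{x}_i)}_{\text{within}}
                  + \underbrace{(\bar{x}_i - \bar{x})}_{\text{between}} \\
% The cross term sums to zero, so the sums of squares add up
\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x})^2
  &= \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)^2
   + T \sum_{i=1}^{N} (\bar{x}_i - \bar{x})^2
\end{align*}
```

A time-invariant variable has zero within variation, since it equals its individual mean in every period.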
Classification
Aim: to classify the different estimation techniques for static panel data models.
The lecture slides contain Table A (strictly exogenous variables), which classifies the estimators by the
correlation between the individual effect and the RHS variables;
Table B concerns estimating the effect of time-invariant variables Zi.
The first difference estimator
An estimator for a non-zero correlation between αi and the explanatory variables
Aim: to introduce an estimation technique for models with a correlation between αi and
explanatory variables.
Yit = αi + β1X1it + ... + βkXkit + uit
The explanatory variables Xit are allowed to be correlated with αi.
A dummy variable takes the value either 0 or 1.
There are three methods to estimate the parameter β consistently:
The first-differences estimator (covered in this lecture)
The least squares dummy variable estimator (covered in the next lecture)
The within estimator (covered in the next lecture)
First-differences estimator
Taking first differences removes the individual effect αi, because αi does not vary over time.
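The removal of αi can be seen by writing the model for two consecutive periods and subtracting. This derivation is a sketch that uses the lecture's model; the Δ notation for a first difference is introduced here.

```latex
\begin{align*}
% The model in period t and in period t-1
Y_{it}    &= \alpha_i + \beta_1 X_{1it} + \dots + \beta_k X_{kit} + u_{it} \\
Y_{i,t-1} &= \alpha_i + \beta_1 X_{1i,t-1} + \dots + \beta_k X_{ki,t-1} + u_{i,t-1} \\
% Subtracting: the time-constant alpha_i cancels
\Delta Y_{it} &= \beta_1 \Delta X_{1it} + \dots + \beta_k \Delta X_{kit} + \Delta u_{it},
\qquad t = 2, \dots, T
\end{align*}
```

One period per individual is lost by differencing, and any time-invariant explanatory variable drops out together with αi.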
Some useful Stata commands
tsset i t or xtset i t (i: individual; t: time)
xtsum y x (summary statistics that take both dimensions into account: individuals and
time)
First differences
reg d.y d.x
reg d.y d.x, robust
reg d.y d.x, cluster(number)
Steps:
We started with the first-difference estimator.
Next, we checked for autocorrelation using the Breusch-Godfrey test.
The parameter on the lagged residual was statistically different from zero.
We re-estimated the model with the first-difference estimator, using clustered standard errors.
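The steps above can be sketched as a small do-file. Everything about the data is hypothetical: the simulated panel, the seed, the sample sizes, and the true slope of 2 are assumptions made for illustration. The autocorrelation check is written as an auxiliary regression of the residual on its own lag, which is assumed here to be the form of the Breusch-Godfrey test used in the lecture.

```stata
* Hypothetical simulated panel: 200 individuals, 6 periods (all values are assumptions)
clear
set seed 12345
set obs 200
gen i = _n                           // individual identifier
gen alpha = rnormal()                // unobserved individual effect
expand 6                             // six rows per individual
bysort i: gen t = _n                 // time index 1..6
gen x = 0.5*alpha + rnormal()        // x is correlated with alpha
gen y = 1 + 2*x + alpha + rnormal()  // true slope on x is 2
xtset i t

* Step 1: first-difference estimator (alpha drops out of the differenced equation)
reg d.y d.x

* Step 2: auxiliary regression of the residual on its own lag and the regressor
predict ehat, residuals
reg ehat l.ehat d.x

* Step 3: if the lagged residual is significant, re-estimate with clustered standard errors
reg d.y d.x, cluster(i)
```

Clustering on i leaves the coefficient estimates of Step 1 unchanged and only changes the standard errors.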
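Supplement:
Robust standard errors: used for the residuals of the basic model.
Cluster-robust standard errors: used when the residuals are autocorrelated.
This method allows for heteroskedasticity across groups and clusters the observations within each group.
Note, however, that because it allows for heteroskedasticity across groups, it relies on an important
assumption: it is consistent only as the number of clusters goes to infinity.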
Pooled OLS
An estimator for a zero correlation between αi and the explanatory variables
Aim: to introduce the estimator that assumes there is no correlation between αi and the
explanatory variables
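We now assume that αi and the explanatory variables are uncorrelated.
Methods:
Random effects (method 4 from Table A; covered in the next lecture)
Pooled OLS (method 5 from Table A)
Conclusion: in pooled OLS there is always autocorrelation, because αi is part of the error term in every period.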
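The source of this autocorrelation can be made explicit with the composite error vit = αi + uit. The derivation below is a sketch under added assumptions: αi has mean zero and variance σα², uit has variance σu², and uit is uncorrelated with αi and across periods. The variance symbols are introduced here for illustration.

```latex
\begin{align*}
% Covariance of the composite error in two different periods t and s
\operatorname{Cov}(v_{it}, v_{is})
  &= E\big[(\alpha_i + u_{it})(\alpha_i + u_{is})\big]
   = \sigma_\alpha^2, \qquad t \neq s \\
% Implied correlation, constant for every pair of periods
\operatorname{Corr}(v_{it}, v_{is})
  &= \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_u^2}
\end{align*}
```

The correlation is zero only if αi has no variance, in which case there is no individual effect at all.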

The estimation procedure for pooled OLS: OLS on Yit = β1X1it + ... + βkXkit + vit, where vit = αi + uit,
with standard errors that are robust to both heteroskedasticity and autocorrelation. The lecture refers to
these as Newey-West robust standard errors; the Stata command below implements them as standard errors
clustered by individual.
Stata command: reg y x, cluster(i)
Steps:
We started with the pooled OLS estimator.
Next, we checked for autocorrelation using the Breusch-Godfrey test.
The parameter on the lagged residual was statistically different from zero.
We re-estimated the model with the pooled OLS estimator, using clustered standard errors.
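The pooled OLS steps can be sketched in the same way. The simulated data are again hypothetical (seed, sample sizes, and a true slope of 2 are assumptions), and here x is generated independently of the individual effect so that the pooled OLS assumption holds. The autocorrelation check is written as an auxiliary regression on the lagged residual, which may differ in detail from the test used in the lecture.

```stata
* Hypothetical simulated panel in which alpha is uncorrelated with x (all values are assumptions)
clear
set seed 2024
set obs 200
gen i = _n                           // individual identifier
gen alpha = rnormal()                // unobserved individual effect
expand 6                             // six rows per individual
bysort i: gen t = _n                 // time index 1..6
gen x = rnormal()                    // x is independent of alpha
gen y = 1 + 2*x + alpha + rnormal()  // true slope on x is 2
xtset i t

* Step 1: pooled OLS; alpha stays in the composite error v = alpha + u
reg y x

* Step 2: auxiliary regression of the residual on its own lag and the regressor
predict vhat, residuals
reg vhat l.vhat x

* Step 3: re-estimate with standard errors clustered by individual
reg y x, cluster(i)
```

Because alpha remains in the error term, the residuals of the same individual are correlated over time, which is what the clustered standard errors in Step 3 account for.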
