Count2019 DAY1b Crosssection Inference

The document discusses count data models, specifically focusing on Poisson quasi-MLE and various standard error estimation methods, including heteroskedastic-robust and cluster-robust standard errors. It outlines the process for bootstrap estimation of standard errors and provides an application of Poisson regression using doctor visit data. The presentation is part of a workshop held from May 13-16, 2019, by A. Colin Cameron at Queens University, Canada.


Monday Part 2

Counts: Cross-section Inference

A. Colin Cameron
Univ. of Calif. - Davis

Nonlinear Cross-section and Panel Regression Models for Count Data

Queens University, Canada
Department of Economics

May 13-16, 2019


1. Introduction

Count data models are for dependent variable y = 0, 1, 2, ...

These slides focus on inference for Poisson quasi-MLE
  - heteroskedastic-robust standard errors
  - cluster-robust standard errors
  - bootstrap.


Outline

1 Introduction
2 Standard Errors for OLS
3 Standard Errors for Poisson
4 Bootstrap
5 Bootstrap with Asymptotic Refinement


2. Standard Errors for OLS

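First consider OLS in the linear model: $\hat{\beta} = \left(\sum_i x_i x_i'\right)^{-1} \sum_i x_i y_i$.
Substituting $y_i = x_i'\beta + u_i$ and simplifying gives
$$\hat{\beta} - \beta = \left(\sum_i x_i x_i'\right)^{-1} \sum_i x_i u_i .$$
For simplicity assume that the $x_i$'s are fixed
  - if $E[u_i] = 0$ then $E[\hat{\beta}] = \beta$.
  - it then follows that
$$V[\hat{\beta}] = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] = \left(\sum_i x_i x_i'\right)^{-1} V\left[\sum_i x_i u_i\right] \left(\sum_i x_i x_i'\right)^{-1} .$$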


OLS Review (continued)


If observations are independent over i then $V[\sum_i x_i u_i] = \sum_i V[x_i u_i]$ so
$$V[\hat{\beta}] = \left(\sum_i x_i x_i'\right)^{-1} \left[\sum_i E[x_i x_i' u_i^2]\right] \left(\sum_i x_i x_i'\right)^{-1} .$$

Original approach: Assume a model for $E[u_i^2 | x_i]$ and fit this model
  - e.g. $E[u_i^2] = \exp(x_i'\alpha)$ and use $\hat{E}[x_i x_i' u_i^2] = x_i x_i' \exp(x_i'\hat{\alpha})$
1980's on: There is no need for such a model!
  - simply estimate $\sum_i E[x_i x_i' u_i^2]$ by $\sum_i x_i x_i' \hat{u}_i^2$ where $\hat{u}_i = y_i - x_i'\hat{\beta}$.
  - even though $\hat{u}_i^2$ is not a good estimate of $E[u_i^2]$
  - heteroskedastic-robust due to White (1982) and Huber (1967)
$$\hat{V}[\hat{\beta}] = \left(\sum_i x_i x_i'\right)^{-1} \left[\sum_i x_i x_i' \hat{u}_i^2\right] \left(\sum_i x_i x_i'\right)^{-1}$$

Can extend to
  - serially correlated errors (HAC robust)
  - clustered errors (cluster-robust).
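
To make the sandwich form concrete, here is a hedged special case that is not on the slide: a single regressor and no intercept, so that $x_i$ is a scalar. The robust estimate then reduces to a ratio of sums, and under the extra assumption of homoskedastic errors, $E[u_i^2] = \sigma^2$, the true variance collapses to the familiar OLS formula.

```latex
\hat{V}[\hat{\beta}] = \frac{\sum_i x_i^2 \hat{u}_i^2}{\left(\sum_i x_i^2\right)^2},
\qquad
V[\hat{\beta}] = \frac{\sum_i x_i^2 \, E[u_i^2]}{\left(\sum_i x_i^2\right)^2}
= \frac{\sigma^2}{\sum_i x_i^2}
\quad \text{if } E[u_i^2] = \sigma^2 .
```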

3. Standard Errors for Poisson quasi-MLE


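Recall Poisson MLE solves $\sum_i (y_i - \exp(x_i'\hat{\beta})) x_i = 0$.
Take a first-order Taylor series expansion of the left-hand side about $\beta$
  - so $g(\hat{\beta}) \simeq g(\beta) + g'(\beta)(\hat{\beta} - \beta)$

$$\sum_i (y_i - \exp(x_i'\hat{\beta})) x_i \simeq \sum_i (y_i - \exp(x_i'\beta)) x_i - \sum_i \exp(x_i'\beta) x_i x_i' (\hat{\beta} - \beta)$$

Set the r.h.s. to zero (note: we can ignore the remainder asymptotically). Writing $\mu_i = \exp(x_i'\beta)$,

$$\sum_i (y_i - \mu_i) x_i - \sum_i \mu_i x_i x_i' (\hat{\beta} - \beta) \simeq 0 .$$

Solve for $(\hat{\beta} - \beta)$
$$(\hat{\beta} - \beta) \simeq \left(\sum_i \mu_i x_i x_i'\right)^{-1} \sum_i (y_i - \mu_i) x_i .$$

This is like the earlier OLS result $(\hat{\beta} - \beta) = \left(\sum_i x_i x_i'\right)^{-1} \sum_i x_i u_i$
  - so proceed in the same way.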

Poisson heteroskedastic-robust standard errors


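We have
$$(\hat{\beta} - \beta) \simeq \left(\sum_i \mu_i x_i x_i'\right)^{-1} \sum_i (y_i - \mu_i) x_i .$$

Then
$$V[\hat{\beta}] \simeq \left(\sum_i \mu_i x_i x_i'\right)^{-1} V\left[\sum_i (y_i - \mu_i) x_i\right] \left(\sum_i \mu_i x_i x_i'\right)^{-1} .$$

Given independence over i, $V[\sum_i (y_i - \mu_i) x_i] = \sum_i V[(y_i - \mu_i) x_i]$ so
$$V[\hat{\beta}] \simeq \left(\sum_i \mu_i x_i x_i'\right)^{-1} \left[\sum_i E[(y_i - \mu_i)^2 x_i x_i']\right] \left(\sum_i \mu_i x_i x_i'\right)^{-1} .$$

And we use, with $\hat{\mu}_i = \exp(x_i'\hat{\beta})$,
$$\hat{V}_{HET}[\hat{\beta}] = \left(\sum_i \hat{\mu}_i x_i x_i'\right)^{-1} \left[\sum_i (y_i - \hat{\mu}_i)^2 x_i x_i'\right] \left(\sum_i \hat{\mu}_i x_i x_i'\right)^{-1} .$$

Whereas MLE variance uses $E[(y_i - \mu_i)^2 x_i x_i'] = \mu_i x_i x_i'$
and GLM variance uses $E[(y_i - \mu_i)^2 x_i x_i'] = \alpha \mu_i x_i x_i'$.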
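
The two special cases can be worked through by substituting the assumed second moment into the middle term of the sandwich; the following is a sketch of that algebra. The Pearson-type estimate of $\alpha$ in the last line is a common choice assumed here for illustration (the slide does not specify an estimator), and $K$ denotes the number of regressors.

```latex
\begin{aligned}
\text{MLE:}\quad
V_{MLE}[\hat{\beta}]
 &= \Big(\sum_i \mu_i x_i x_i'\Big)^{-1}
    \Big(\sum_i \mu_i x_i x_i'\Big)
    \Big(\sum_i \mu_i x_i x_i'\Big)^{-1}
  = \Big(\sum_i \mu_i x_i x_i'\Big)^{-1} \\
\text{GLM:}\quad
V_{GLM}[\hat{\beta}]
 &= \Big(\sum_i \mu_i x_i x_i'\Big)^{-1}
    \Big(\alpha \sum_i \mu_i x_i x_i'\Big)
    \Big(\sum_i \mu_i x_i x_i'\Big)^{-1}
  = \alpha \Big(\sum_i \mu_i x_i x_i'\Big)^{-1} \\
\text{(assumed)}\quad
\hat{\alpha} &= \frac{1}{N-K} \sum_i \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}
\end{aligned}
```

So the GLM standard errors are the MLE standard errors scaled by $\sqrt{\hat{\alpha}}$, while the heteroskedastic-robust version imposes no variance model at all.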

Poisson cluster-robust standard errors


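Sometimes errors are correlated within cluster or group and independent across clusters
  - e.g. correlated if in same village and independent if in different villages.
Let there be G such clusters, g = 1, ..., G. Then
$$\begin{aligned}
V\left[\sum_i (y_i - \mu_i) x_i\right] &= \sum_{g=1}^{G} V\left[\sum_{i \in g} (y_i - \mu_i) x_i\right] \quad \text{as indep. over } g \\
&= \sum_{g=1}^{G} \sum_{i \in g} \sum_{j \in g} E[(y_i - \mu_i) x_i (y_j - \mu_j) x_j'] .
\end{aligned}$$

Then we use (provided $G \to \infty$)
$$\hat{V}_{CLU}[\hat{\beta}] = \left(\sum_i \hat{\mu}_i x_i x_i'\right)^{-1} \left[\sum_{g=1}^{G} \sum_{i \in g} \sum_{j \in g} (y_i - \hat{\mu}_i)(y_j - \hat{\mu}_j) x_i x_j'\right] \left(\sum_i \hat{\mu}_i x_i x_i'\right)^{-1} .$$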


Here there is little difference, but in general cluster-robust standard errors can be much larger than heteroskedastic-robust.

. * Poisson with cluster-robust standard errors - illustration
. * Here cluster on age for illustration
. * In practice the grouping variable would be, for example, village
. poisson docvis $xlist, vce(cluster age) nolog   // Poisson cluster-robust SEs

Poisson regression                              Number of obs     =      3,677
                                                Wald chi2(7)      =     917.07
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -15019.64                Pseudo R2         =     0.1297

                                   (Std. Err. adjusted for 26 clusters in age)
------------------------------------------------------------------------------
             |               Robust
      docvis |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     private |   .1422324   .0357205     3.98   0.000     .0722215    .2122434
    medicaid |   .0970005   .0653316     1.48   0.138    -.0310471    .2250481
         age |   .2936722   .0471694     6.23   0.000     .2012219    .3861226
        age2 |  -.0019311   .0003162    -6.11   0.000    -.0025508   -.0013114
      educyr |   .0295562   .0054728     5.40   0.000     .0188296    .0402828
      actlim |   .1864213   .0374476     4.98   0.000     .1130254    .2598172
      totchr |   .2483898    .011554    21.50   0.000     .2257444    .2710353
       _cons |  -10.18221   1.749064    -5.82   0.000    -13.61031   -6.754105
------------------------------------------------------------------------------


4. Bootstrap estimate of standard error


Basic idea is to view $\{(y_1, x_1), ..., (y_N, x_N)\}$ as the population.
Then obtain B random samples from this population
  - Get B estimates $\hat{\theta}_1, ..., \hat{\theta}_B$.
  - Then estimate $\mathrm{Var}[\hat{\theta}]$ by the usual sample variance of the B estimates
$$\hat{V}[\hat{\theta}] = \frac{1}{B-1} \sum_{b=1}^{B} (\hat{\theta}_b - \bar{\hat{\theta}})^2, \quad \text{where } \bar{\hat{\theta}} = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}_b .$$
  - Square root of this is called a bootstrap standard error.
Nonparametric bootstrap gets B different samples of size N by resampling with replacement from $\{(y_1, x_1), ..., (y_N, x_N)\}$
  - In each bootstrap sample some original data points appear more than once while others do not appear at all.
IMPORTANT: Stata 14 changed to a different random number generator (mt64) than earlier versions (kiss32).
  - These slides use the old Stata 13 generator.
  - To get my slide results using Stata 14 or 15: set rng kiss32
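
The resampling steps above can be sketched by hand in Stata. This is only an illustration under stated assumptions: the data are simulated from a hypothetical Poisson process (not the doctor-visits data used later), the macro names sim and results are arbitrary, and the built-in vce(bootstrap) option is the more convenient route in practice.

```stata
* Minimal sketch: paired bootstrap standard error for a Poisson slope.
* Data are simulated here (hypothetical DGP), not the slides' dataset.
clear all
set seed 10101
set obs 50
generate byte chronic = runiform() < 0.28
generate int  docvis  = rpoisson(exp(1 + chronic))

tempname sim
tempfile results
postfile `sim' double b using "`results'"
forvalues r = 1/400 {
    preserve
    bsample                          // draw N observations with replacement
    quietly poisson docvis chronic   // re-estimate on the bootstrap sample
    post `sim' (_b[chronic])         // store this replication's estimate
    restore
}
postclose `sim'

use "`results'", clear
summarize b                          // Std. Dev. of b is the bootstrap se
```

The standard deviation reported by summarize uses the B - 1 divisor, matching the variance formula above.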

Poisson regression application

Data: Doctor visits (count) and chronic conditions. N = 50.

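Contains data from musbootdata.dta
  obs:            50
 vars:             3                          16 Apr 2010 10:32
 size:           750 (99.9% of memory free)

------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
docvis          int    %8.0g                  number of doctor visits
age             float  %9.0g                  Age in years / 10
chronic         byte   %8.0g                  = 1 if a chronic condition
------------------------------------------------------------------------------
Sorted by:

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      docvis |        50        4.12     7.82106          0         43
         age |        50       4.162    1.160382        2.6        6.2
     chronic |        50         .28    .4535574          0          1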


Bootstrap standard errors after Poisson regression


Use option vce(boot)
  - Set the seed!
  - Set the number of bootstrap repetitions!

. * Compute bootstrap standard errors using option vce(bootstrap) to
. poisson docvis chronic, vce(boot, reps(400) seed(10101) nodots)

Poisson regression                              Number of obs     =         50
                                                Replications      =        400
                                                Wald chi2(1)      =       3.50
                                                Prob > chi2       =     0.0612
Log likelihood = -238.75384                     Pseudo R2         =     0.0917

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
      docvis |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     chronic |   .9833014   .5253149     1.87   0.061    -.0462968      2.0129
       _cons |   1.031602   .3497212     2.95   0.003     .3461607    1.717042
------------------------------------------------------------------------------

Bootstrap se = 0.525 versus White heteroskedastic-robust se = 0.515.
Note that as $B \to \infty$ the bootstrap se is asymptotically equivalent to the White heteroskedastic-robust se!

Results vary with seed and number of reps

. * Bootstrap standard errors for different reps and seeds
. quietly poisson docvis chronic, vce(boot, reps(50) seed(10101))
. estimates store boot50
. quietly poisson docvis chronic, vce(boot, reps(50) seed(20202))
. estimates store boot50diff
. quietly poisson docvis chronic, vce(boot, reps(2000) seed(10101))
. estimates store boot2000
. quietly poisson docvis chronic, vce(robust)
. estimates store robust
. estimates table boot50 boot50diff boot2000 robust, b(%8.5f) se(%8.5f)

-------------------------------------------------------------
    Variable |   boot50    boot50~f    boot2000     robust
-------------+-----------------------------------------------
     chronic |  0.98330     0.98330     0.98330    0.98330
             |  0.47010     0.50673     0.53479    0.51549
       _cons |  1.03160     1.03160     1.03160    1.03160
             |  0.39545     0.32575     0.34885    0.34467
-------------------------------------------------------------
                                                 legend: b/se


Leading uses of bootstrap standard errors

Sequential two-step m-estimator
  - First step gives $\hat{\alpha}$ used to create a regressor $z(\hat{\alpha})$
  - Second step regresses y on x and $z(\hat{\alpha})$
  - Do a paired bootstrap resampling (x, y, z)
  - e.g. Heckman two-step estimator.
2SLS estimator with heteroskedastic errors (if no White option)
  - Paired bootstrap gives heteroskedastic-robust standard errors.
Functions of other estimates e.g. $\hat{\theta} = \hat{\alpha} \times \hat{\beta}$
  - replaces delta method
  - Clustered data with many small clusters, such as short panels.
      - Then resample the clusters.
      - But be careful if model includes cluster-specific fixed effects.

For these in Stata need to use prefix command bootstrap:
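
As a hedged illustration of the prefix form, the sketch below bootstraps a function of two coefficients and compares it with the delta-method result from nlcom. The data are simulated from a hypothetical process, and the product of the slope and intercept is chosen only to have a nonlinear function to work with; it is not an estimand from the slides.

```stata
* Sketch: bootstrap se for a function of estimates via the bootstrap prefix.
* Simulated data (hypothetical DGP); theta is an illustrative product.
clear all
set seed 10101
set obs 50
generate byte chronic = runiform() < 0.28
generate int  docvis  = rpoisson(exp(1 + chronic))

* Expression in parentheses is re-evaluated in every bootstrap sample
bootstrap theta = (_b[chronic]*_b[_cons]), reps(400) seed(10101) nodots: ///
    poisson docvis chronic

* Delta-method standard error for comparison
quietly poisson docvis chronic, vce(robust)
nlcom _b[chronic]*_b[_cons]
```

For clustered data the same prefix accepts a cluster() option, so that whole clusters are resampled rather than individual observations.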


The bootstrap: general algorithm

A general bootstrap algorithm is as follows:
  - 1. Given data $w_1, ..., w_N$
      - draw a bootstrap sample of size N (see below)
      - denote this new sample $w_1^*, ..., w_N^*$.
  - 2. Calculate an appropriate statistic using the bootstrap sample. Examples include:
      - (a) estimate $\hat{\theta}^*$ of $\theta$;
      - (b) standard error $s_{\hat{\theta}^*}$ of estimate $\hat{\theta}^*$
      - (c) t statistic $t^* = (\hat{\theta}^* - \hat{\theta}) / s_{\hat{\theta}^*}$ centered at $\hat{\theta}$.
  - 3. Repeat steps 1-2 B independent times.
      - Gives B bootstrap replications $\hat{\theta}_1^*, ..., \hat{\theta}_B^*$ or $t_1^*, ..., t_B^*$ or .....
  - 4. Use these B bootstrap replications to obtain a bootstrapped version of the statistic (see below).


Implementation
Number of bootstraps: B high is best but increases computer time.
  - CT (Cameron and Trivedi) use 400 for se's and 999 for tests and confidence intervals.
  - Defaults are often too low. And set the seed!
Various resampling methods
  - 1. Paired (or nonparametric or empirical dist. func.) is most common
      - $w_1^*, ..., w_N^*$ obtained by sampling with replacement from $w_1, ..., w_N$.
  - 2. Parametric bootstrap for fully parametric models.
      - Suppose $y | x \sim F(x, \theta_0)$ and generate $y_i^*$ by draws from $F(x_i, \hat{\theta})$
  - 3. Residual bootstrap for regression with additive errors
      - Resample fitted residuals $\hat{u}_1, ..., \hat{u}_N$ to get $(\hat{u}_1^*, ..., \hat{u}_N^*)$ and form new $(y_1^*, x_1), ..., (y_N^*, x_N)$.
Need to resample over i.i.d. observations
  - resample over clusters if data are clustered
      - But be careful if model includes cluster-specific fixed effects.
  - resample over moving blocks if data are serially correlated.

Bootstrap failure

The following are cases where standard bootstraps fail
  - so need to adjust standard bootstraps.
GMM (and empirical likelihood) in over-identified models
  - For over-identified models need to recenter or use empirical likelihood.
Nonparametric regression:
  - Nonparametric density and regression estimators converge at rate less than root-N and are asymptotically biased.
  - This complicates inference such as confidence intervals.
Non-smooth estimators: e.g. LAD.


Jackknife

The jackknife uses a leave-one-out resampling scheme.
The jackknife estimate of the variance of an estimator $\hat{\theta}$ is
$$\hat{V}[\hat{\theta}] = \frac{N-1}{N} \sum_{i=1}^{N} (\hat{\theta}_{(-i)} - \bar{\hat{\theta}})^2, \quad \text{where } \bar{\hat{\theta}} = N^{-1} \sum_i \hat{\theta}_{(-i)}$$
  - where $\hat{\theta}_{(-i)}$ is $\hat{\theta}$ obtained from the sample with observation i omitted.
The jackknife is a "rough and ready" method for bias reduction in many situations, but not the ideal method in any.
  - it can be viewed as a linear approximation of the bootstrap (Efron and Tibshirani (1993, p.146)).
  - it requires less computation than the bootstrap in small samples, as then N < B is likely
  - but is outperformed by the bootstrap as $B \to \infty$.
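
In Stata the delete-one jackknife is available as a vce() option, which gives a quick way to compare it with the sandwich estimate. The sketch below again uses simulated data from a hypothetical process rather than the slides' dataset.

```stata
* Sketch: jackknife versus heteroskedastic-robust standard errors.
* Simulated data (hypothetical DGP), not the slides' dataset.
clear all
set seed 10101
set obs 50
generate byte chronic = runiform() < 0.28
generate int  docvis  = rpoisson(exp(1 + chronic))

poisson docvis chronic, vce(robust)       // sandwich SEs for comparison
poisson docvis chronic, vce(jackknife)    // N leave-one-out re-estimations
```

With N = 50 this needs only 50 re-estimations, fewer than the 400 replications used for the bootstrap standard errors.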


Clustered data

The bootstrap relies on independence over the quantity being bootstrapped.
So for clustered data we resample over clusters rather than observations
  - Let the g-th cluster be $y_g = (y_{g1}, ..., y_{gN_g})$ and similarly define $X_g$.
  - Then view the G clusters $\{(y_1, X_1), ..., (y_G, X_G)\}$ as the population
  - And pick with replacement G clusters, etcetera.
  - e.g. poisson y x, vce(boot, cluster(id_cluster))
Similarly for the jackknife use a delete-one-cluster jackknife
$$\hat{V}[\hat{\theta}] = \frac{G-1}{G} \sum_{g=1}^{G} (\hat{\theta}_{(-g)} - \bar{\hat{\theta}})^2, \quad \text{where } \bar{\hat{\theta}} = G^{-1} \sum_g \hat{\theta}_{(-g)} .$$
G


Cluster bootstrap: cluster on age for illustrative purposes

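. poisson docvis chronic, vce(boot, cluster(age) reps(400) seed(10101) nodots)

Poisson regression                              Number of obs     =         50
                                                Replications      =        400
                                                Wald chi2(1)      =       4.12
                                                Prob > chi2       =     0.0423
Log likelihood = -238.75384                     Pseudo R2         =     0.0917

                                    (Replications based on 26 clusters in age)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
      docvis |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     chronic |   .9833014    .484145     2.03   0.042     .0343947    1.932208
       _cons |   1.031602    .303356     3.40   0.001     .4370348    1.626168
------------------------------------------------------------------------------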


5. Bootstrap with asymptotic refinement

The simplest bootstraps are no better than usual asymptotic theory
  - advantage is easy to implement, e.g. standard errors.
More complicated bootstraps provide asymptotic refinement
  - this may provide a better finite-sample approximation.
Conventional asymptotic tests (such as Wald test).
  - $\alpha$ = nominal size for a test, e.g. $\alpha = 0.05$.
  - Actual size $= \alpha + O(N^{-1/2})$.
Tests with asymptotic refinement
  - Actual size $= \alpha + O(N^{-1})$.
  - asymptotic bias of size $O(N^{-1}) < O(N^{-1/2})$ is smaller asymptotically.
  - But need simulation studies to confirm finite sample gains.
      - e.g. if N = 100 then $100/N = O(N^{-1}) > 5/\sqrt{N} = O(N^{-1/2})$.


Asymptotically pivotal statistic

Asymptotic refinement bootstraps an asymptotically pivotal statistic
  - this means the limit distribution does not depend on unknown parameters.
An estimator $\hat{\theta} - \theta_0 \overset{a}{\sim} N[0, \sigma^2_{\hat{\theta}}]$ is not asymptotically pivotal
  - since $\sigma^2_{\hat{\theta}}$ is an unknown parameter.
But the studentized t statistic is asymptotically pivotal
  - since $t = (\hat{\theta} - \theta_0)/s_{\hat{\theta}} \overset{a}{\sim} N[0, 1]$ has no unknown parameters.
So bootstrap the Wald test statistic to get tests and confidence intervals with asymptotic refinement.
For confidence intervals can also use BC (bias-corrected) and BCa methods.
Econometricians rarely use asymptotic refinement.

Confidence intervals

. * Bootstrap confidence intervals: normal-based, percentile, BC, and BCa
. quietly poisson docvis chronic, vce(boot, reps(999) seed(10101) bca)
. estat bootstrap, all

Poisson regression                              Number of obs     =         50
                                                Replications      =        999

------------------------------------------------------------------------------
             |    Observed               Bootstrap
      docvis |       Coef.       Bias    Std. Err.  [95% Conf. Interval]
-------------+----------------------------------------------------------------
     chronic |   .98330144  -.0244473   .54040762    -.075878   2.042481   (N)
             |                                      -.1316499   2.076792   (P)
             |                                      -.0820317   2.100361  (BC)
             |                                      -.0215526   2.181476 (BCa)
       _cons |   1.0316016  -.0503223   .35257252    .3405721   1.722631   (N)
             |                                       .2177235   1.598568   (P)
             |                                       .2578293   1.649789  (BC)
             |                                       .3794897   1.781907 (BCa)
------------------------------------------------------------------------------
(N)    normal confidence interval
(P)    percentile confidence interval
(BC)   bias-corrected confidence interval
(BCa)  bias-corrected and accelerated confidence interval

(N) is observed coefficient $\pm$ 1.96 $\times$ bootstrap s.e.
(P) is 2.5 to 97.5 percentile of the bootstrap estimates $\hat{\beta}_1^*, ..., \hat{\beta}_B^*$.
(BC) and (BCa) have asymptotic refinement.

Percentile-t bootstrap
. * Percentile-t for a single coefficient: Bootstrap the t statistic
. use bootdata.dta, clear
. quietly poisson docvis chronic, vce(robust)
. local theta = _b[chronic]
. local setheta = _se[chronic]
. bootstrap tstar=((_b[chronic]-`theta')/_se[chronic]), seed(10101) ///
>     reps(999) nodots saving(percentilet, replace): poisson docvis chronic, ///
>     vce(robust)

Bootstrap results                               Number of obs     =         50
                                                Replications      =        999

      command:  poisson docvis chronic, vce(robust)
        tstar:  (_b[chronic]-.9833014421442415)/_se[chronic]

------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       tstar |          0     1.3004     0.00   1.000    -2.548736    2.548736
------------------------------------------------------------------------------

Bootstrap $t = (\hat{\theta} - \theta_0)/s_{\hat{\theta}} \overset{a}{\sim} N[0, 1]$
The 999 tstar values give the bootstrap estimate of the density of the t-statistic

Percentile-t bootstrap (continued)


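Plot of the bootstrap estimated density of the t-statistic
tstar is $t_b^* = (\hat{\theta}_b^* - \hat{\theta})/s_{\hat{\theta}_b^*}$

[Figure: density plotted against tstar (horizontal axis from -5 to 5, vertical axis from 0 to .4), with two curves: Bootstrap density and Standard normal]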


Percentile-t bootstrap (continued)


Critical t-values are 2.5 and 97.5 percentiles

. use percentilet, clear
(bootstrap: poisson)

. summarize

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       tstar |       999   -.0874198      1.3004  -4.435354   4.611352

. centile tstar, c(2.5,50,97.5)

                                                      -- Binom. Interp. --
    Variable |       Obs  Percentile    Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
       tstar |       999        2.5   -2.756196       -3.030972   -2.567785
             |                   50   -.0569957       -.1464812    .0312477
             |                 97.5    2.568691          2.3092    2.917802

. kdensity tstar, generate(evalpoint densityest) xtitle("tstar from the bootstra
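
One hedged way to turn these critical values into a confidence interval is to invert the bootstrap t statistic. The sketch below assumes the inversion convention shown in the first line (some texts instead add the lower and upper percentiles to $\hat{\theta}$ directly, which gives a different interval when the bootstrap distribution is asymmetric). It takes $\hat{\theta} = 0.98330$ and $s_{\hat{\theta}} = 0.51549$ from the robust column of the earlier estimates table.

```latex
\begin{aligned}
\Pr\!\left[\, t^*_{.025} \le \frac{\hat{\theta} - \theta_0}{s_{\hat{\theta}}} \le t^*_{.975} \right] \simeq 0.95
\;&\Rightarrow\;
\theta_0 \in \left[\, \hat{\theta} - t^*_{.975}\, s_{\hat{\theta}},\;\;
                      \hat{\theta} - t^*_{.025}\, s_{\hat{\theta}} \,\right] \\
&= \left[\, 0.98330 - 2.568691 \times 0.51549,\;\;
            0.98330 + 2.756196 \times 0.51549 \,\right] \\
&\approx \left[\, -0.341,\;\; 2.404 \,\right]
\end{aligned}
```

The normal-based interval would use $\pm 1.96$ in place of the two bootstrap critical values, so under this convention the percentile-t interval is wider here.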


Wild bootstrap

In practice bootstraps with asymptotic refinement are not often used in econometrics.
Datasets are usually large, and if they are small then estimation is imprecise anyway.
But with clustered errors and few clusters it can be worthwhile to cluster bootstrap with asymptotic refinement
  - A. Colin Cameron and Douglas L. Miller, "A Practitioner's Guide to Cluster-Robust Inference", Journal of Human Resources, Spring 2015, Vol.50, No. 2, pp.317-373.
The best bootstrap appears to be a Wild cluster bootstrap.


Wild bootstrap for OLS


Let $\hat{\beta}$ and $\hat{u}_i$ be the original sample estimates.
With independent data the Wild bootstrap resamples as follows
  - in the b-th resample the i-th observation is $(y_i^{(b)}, x_i)$ where
      - $y_i^{(b)} = x_i'\hat{\beta} + \hat{u}_i^{(b)}$ and
      - $\hat{u}_i^{(b)} = \hat{u}_i$ with probability 0.5 and $\hat{u}_i^{(b)} = -\hat{u}_i$ with probability 0.5
      - then $\hat{\beta}^{(b)}$ is OLS from regress $y_i^{(b)}$ on $x_i$.
In the cluster case the b-th resample for the g-th cluster is $(y_g^{(b)}, X_g)$ where
  - $y_g^{(b)} = X_g\hat{\beta} + \hat{u}_g^{(b)}$ and
  - $\hat{u}_g^{(b)} = \hat{u}_g$ with probability 0.5 and $\hat{u}_g^{(b)} = -\hat{u}_g$ with probability 0.5.
A variation that works better is to let $\hat{\beta}$ and $\hat{u}_i$ (or $\hat{u}_g$) be original sample estimates that impose the null hypothesis restriction of interest
  - typically $H_0: \beta_j = 0$
  - then $\hat{\beta}$ is from estimation dropping the j-th regressor.
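
The independent-data version of these steps can be sketched by hand in Stata. This is an illustration under stated assumptions: the data are simulated from a hypothetical heteroskedastic linear model, the unrestricted residuals are used (the null is not imposed), and the spread of the replicated slopes is reported as a standard error rather than used for a refined test.

```stata
* Sketch: wild bootstrap for OLS with equal-probability sign flips.
* Simulated heteroskedastic data (hypothetical DGP); null not imposed.
clear all
set seed 10101
set obs 50
generate double x = rnormal()
generate double y = 1 + 0.5*x + rnormal()*exp(0.5*x)

regress y x, vce(robust)                 // White SE for comparison
predict double yhat, xb                  // x_i'b from the original fit
predict double uhat, residuals           // original residuals

tempname sim
tempfile results
postfile `sim' double b using "`results'"
forvalues r = 1/400 {
    quietly {
        * keep or flip each residual with probability 0.5; x stays fixed
        generate double ystar = yhat + uhat*cond(runiform() < 0.5, 1, -1)
        regress ystar x
        post `sim' (_b[x])
        drop ystar
    }
}
postclose `sim'

use "`results'", clear
summarize b                              // Std. Dev. = wild bootstrap se
```

For the clustered version the sign would be drawn once per cluster and applied to every residual in that cluster.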

Wild score bootstrap for Poisson


For Poisson there is no residual
  - so instead resample the score (Kline and Santos)
  - recall $\hat{\beta} \simeq \beta_0 + \hat{H}^{-1} \hat{g}$.
Then let
  - $\hat{H} = \sum_{i=1}^{n} \hat{\mu}_i x_i x_i'$, with $\hat{\mu}_i = \exp(x_i'\hat{\beta})$, be the (negative) Hessian from original estimation
  - $\hat{g} = \sum_{i=1}^{n} \hat{g}_i$ be the score or gradient vector
  - $\hat{g}_i = x_i (y_i - \exp(x_i'\hat{\beta}))$ be the contribution to the score
  - $\hat{g}_i^{(b)} = \hat{g}_i$ with probability 0.5 and $\hat{g}_i^{(b)} = -\hat{g}_i$ with probability 0.5
      - more generally $\hat{g}_i^{(b)} = \hat{g}_i w_i$ where i.i.d. $w_i$ have $E[w_i] = 0$ and $E[w_i^2] = 1$
  - $\hat{\beta}^{(b)} = \hat{\beta} + \hat{H}^{-1} \sum_{i=1}^{n} \hat{g}_i^{(b)}$.

In the cluster case repeat but at the cluster level.

This can be implemented by the Stata add-on boottest command.
