HSP 511: Economics Lab
Lecture 3
Indian Institute of Technology Delhi
Contents
The Bootstrap
Why Does the Bootstrap Work?
Let $X_1, \dots, X_n \sim F$ be random variables distributed according to $F$, and let
$$\theta = T(F)$$
be some object of interest that we want to estimate using the data. For example, if $\theta = \int x \, dF(x)$, then its estimator is
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} X_i,$$
with $\mathrm{Var}(\hat{\theta}) = \sigma^2/n$, where $\sigma^2 = \mathrm{Var}(X_i)$.
※ The Bootstrap
The plug-in estimator of $\theta$ is defined as
$$\hat{\theta} = T(\hat{F}_n) = g(X_1, \dots, X_n).$$
In the bootstrap world, we simulate $X_1^*, \dots, X_n^*$ from $\hat{F}_n$ and then compute the estimator over these values, $\hat{\theta}^* = g(X_1^*, \dots, X_n^*)$.

Real world: $F \implies X_1, \dots, X_n \implies \hat{F}_n \implies \hat{\theta} = g(X_1, \dots, X_n)$

Bootstrap world: $\hat{F}_n \implies X_1^*, \dots, X_n^* \implies \hat{F}^* \implies \hat{\theta}^* = g(X_1^*, \dots, X_n^*)$
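Operationally, drawing from $\hat{F}_n$ just means resampling the observed data with replacement. A minimal sketch in NumPy (the data here are made up for illustration):

import numpy as np

x = np.random.normal(size=8)        # the observed data X_1, ..., X_n
# Drawing X*_1, ..., X*_n from F-hat_n amounts to sampling the observed
# values with replacement, each with probability 1/n:
x_star = np.random.choice(x, size=x.size, replace=True)
print(x_star)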
Bootstrap Variance Estimation
1. Draw $X_1^*, \dots, X_n^* \sim \hat{F}_n$.
2. Compute $\hat{\theta}^* = g(X_1^*, \dots, X_n^*)$.
3. Repeat steps 1 and 2, $B$ times, to get $\hat{\theta}_1^*, \dots, \hat{\theta}_B^*$.
4. Let
$$\hat{v}_b = \sqrt{\frac{1}{B} \sum_{b=1}^{B} \left( \hat{\theta}_b^* - \frac{1}{B} \sum_{r=1}^{B} \hat{\theta}_r^* \right)^2}.$$
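As a sketch, the four steps translate directly into a loop; here $g$ is taken to be the sample mean purely for illustration (the vectorized code further below computes the same quantity):

import numpy as np

x = np.random.normal(size=100)      # observed data
B = 10_000
theta_star = np.empty(B)
for b in range(B):
    # Steps 1-2: resample from F-hat_n and compute theta*_b = g(X*)
    x_star = np.random.choice(x, size=x.size, replace=True)
    theta_star[b] = np.mean(x_star)
# Steps 3-4: v-hat_b is the standard deviation of the B replicates
v_hat = np.sqrt(np.mean((theta_star - np.mean(theta_star)) ** 2))
print(v_hat)                        # should be close to 1 / sqrt(100) = 0.1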
The next theorem states that $\hat{v}_b^2$ approximates $\mathrm{Var}(\hat{\theta})$. There are two sources of error in this approximation. The first is due to the fact that $n$ is finite, and the second is due to the fact that $B$ is finite. However, we can make $B$ as large as we like (something like $B = 10{,}000$ suffices in practice).
Theorem 1.1
Under appropriate regularity conditions, $\dfrac{\hat{v}_b^2}{\mathrm{Var}(\hat{\theta})} \xrightarrow{P} 1$ as $n \to \infty$.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

y = np.random.rand(100)             # observed data: n = 100 Uniform(0, 1) draws
tmean = np.mean(y)                  # plug-in estimate of the mean
# Draw B = 10,000 bootstrap samples (columns) and compute the mean of each
resample = np.mean(np.random.choice(y, size=(100, 10000), replace=True),
                   axis=0)
# Bootstrap variance estimate of the sample mean
print('The variance for tmean is: {}'.format(np.var(resample)))
## The variance for tmean is: 0.0007704722692846294
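As a sanity check (not in the original output): the data here are Uniform(0, 1), so the true variance of the sample mean is known in closed form and can be compared with the bootstrap estimate above:

# Var(mean of n Uniform(0,1) draws) = (1/12) / n
print('True variance: {}'.format((1 / 12) / 100))  # 0.000833..., close to the estimate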
Now we describe the confidence interval algorithm.
Bootstrap Confidence Interval
1. Draw a bootstrap sample $X_1^*, \dots, X_n^* \sim \hat{F}_n$. Compute $\hat{\theta}^* = g(X_1^*, \dots, X_n^*)$.
2. Repeat the previous step, $B$ times, yielding estimators $\hat{\theta}_1^*, \dots, \hat{\theta}_B^*$.
3. Let
$$\hat{G}(t) = \frac{1}{B} \sum_{j=1}^{B} \mathbf{1}\!\left( \frac{\hat{\theta}_j^* - \hat{\theta}}{\hat{v}_b} \le t \right).$$
4. Let
$$C_n = \left[ \hat{\theta} - \hat{v}_b \, t_{1-\alpha/2}, \; \hat{\theta} - \hat{v}_b \, t_{\alpha/2} \right],$$
where $t_{\alpha/2} = \hat{G}^{-1}(\alpha/2)$ and $t_{1-\alpha/2} = \hat{G}^{-1}(1 - \alpha/2)$.
5. Output $C_n$.
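A minimal NumPy sketch of steps 1-5, with $g$ taken to be the sample mean for illustration (the helper name bootstrap_t_ci is ours, not from the notes):

import numpy as np

def bootstrap_t_ci(x, alpha=0.05, B=10_000):
    """Steps 1-5 above, with g = sample mean (illustrative choice)."""
    theta_hat = np.mean(x)
    # Steps 1-2: B bootstrap samples (columns), one estimate per sample
    boot = np.mean(np.random.choice(x, size=(len(x), B), replace=True),
                   axis=0)
    v_hat = np.std(boot)            # bootstrap standard error v-hat_b
    # Step 3: quantiles of G-hat, the empirical CDF of (theta*_j - theta-hat)/v-hat_b
    t_lo, t_hi = np.quantile((boot - theta_hat) / v_hat,
                             [alpha / 2, 1 - alpha / 2])
    # Step 4: note the reversal -- the upper quantile gives the lower endpoint
    return theta_hat - v_hat * t_hi, theta_hat - v_hat * t_lo

y = np.random.rand(100)
print(bootstrap_t_ci(y))            # step 5: output C_n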
The next theorem states that our bootstrap confidence interval has the right coverage asymptotically.
Theorem 1.2
Under appropriate regularity conditions,
$$\Pr(\theta \in C_n) = 1 - \alpha + O\!\left( \frac{1}{\sqrt{n}} \right).$$
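A quick Monte Carlo check of the coverage claim, reusing the hypothetical bootstrap_t_ci sketch above (Uniform(0, 1) data, so $\theta = 0.5$):

# Assumes bootstrap_t_ci from the sketch above is in scope.
M, hits = 200, 0
for _ in range(M):
    lo, hi = bootstrap_t_ci(np.random.rand(100), alpha=0.05, B=1000)
    hits += (lo <= 0.5 <= hi)
print('Empirical coverage: {}'.format(hits / M))  # expect roughly 0.95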
# Compare the standardized bootstrap replicates with the N(0, 1) density
x_axis = np.arange(-3, 3, 0.01)
plt.hist((resample - tmean) / np.std(resample), bins=50, density=True)
plt.plot(x_axis, norm.pdf(x_axis, 0, 1))
plt.show()
[Figure: histogram of the standardized bootstrap replicates, with the standard normal density overlaid.]
※ Why Does the Bootstrap Work?
To explain why the bootstrap works, let us begin with a heuristic. Very often
$$\sqrt{n} \left( T(\hat{F}) - T(F) \right) \xrightarrow{d} L,$$
for some limiting distribution $L$. The main asymptotic justification of the bootstrap is that, conditional on $X_1, X_2, \dots$,
$$\sqrt{n} \left( T(\hat{F}^*) - T(\hat{F}) \right) \xrightarrow{d} L.$$
Example 2.1
Suppose that $\theta = \int x \, dF(x)$; then its estimator is
$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$
We have asymptotic normality under the condition that $\sigma^2 = \mathrm{Var}(X_i) < \infty$:
$$\sqrt{n} (\hat{\theta} - \theta) \xrightarrow{d} N(0, \sigma^2).$$
Let us also define the bootstrap estimator as
$$\hat{\theta}^* = \frac{1}{n} \sum_{i=1}^{n} X_i^*.$$
Then, for almost every sequence $X_1, X_2, \dots$,
$$\sqrt{n} (\hat{\theta}^* - \hat{\theta}) \xrightarrow{d} N(0, \sigma^2).$$
Along with asymptotic normality, we also have an Edgeworth expansion: for each $t$,
$$\Pr\!\left( \frac{\sqrt{n}(\hat{\theta} - \theta)}{\hat{\sigma}} \le t \right) - \Phi(t) = \frac{p_1(t)\varphi(t)}{\sqrt{n}} + \dots + \frac{p_j(t)\varphi(t)}{n^{j/2}} + \dots = O\!\left( \frac{1}{\sqrt{n}} \right),$$
where $\varphi$ and $\Phi$ are the pdf and cdf of the standard normal distribution and, with $\kappa_3$ the skewness and $\kappa_4$ the kurtosis,
$$p_1(t) = -\frac{1}{6} \kappa_3 (t^2 - 1),$$
$$p_2(t) = -t \left[ \frac{1}{24} \kappa_4 (t^2 - 3) + \frac{1}{72} \kappa_3^2 (t^4 - 10t^2 + 15) \right].$$
A similar expansion holds for the bootstrap version:
$$\Pr\!\left( \frac{\hat{\theta}_j^* - \hat{\theta}}{\hat{v}_b} \le t \right) - \Phi(t) = \frac{\hat{p}_1^*(t)\varphi(t)}{\sqrt{n}} + \dots + \frac{\hat{p}_j^*(t)\varphi(t)}{n^{j/2}} + \dots.$$
Since in the bootstrap we are approximating the distribution of $\frac{\sqrt{n}(\hat{\theta} - \theta)}{\hat{\sigma}}$ with that of $\frac{\sqrt{n}(\hat{\theta}^* - \hat{\theta})}{\hat{\sigma}^*}$, we have
$$\Pr\!\left( \frac{\hat{\theta}_j^* - \hat{\theta}}{\hat{v}_b} \le t \right) - \Pr\!\left( \frac{\sqrt{n}(\hat{\theta} - \theta)}{\hat{\sigma}} \le t \right) = \frac{[\hat{p}_1^*(t) - p_1(t)]\varphi(t)}{\sqrt{n}} + \dots.$$
One can show that $\hat{p}_1^*(t) - p_1(t) = O\!\left( \frac{1}{\sqrt{n}} \right)$. With this we have
$$\Pr\!\left( \frac{\hat{\theta}_j^* - \hat{\theta}}{\hat{v}_b} \le t \right) - \Pr\!\left( \frac{\sqrt{n}(\hat{\theta} - \theta)}{\hat{\sigma}} \le t \right) = O\!\left( \frac{1}{n} \right).$$
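To see this refinement numerically, one can compare the bootstrap CDF $\hat{G}(t)$ with the true CDF of the studentized statistic for a skewed population. A sketch under made-up settings (Exponential(1) data, so $\theta = 1$; all tuning constants are arbitrary):

import numpy as np
from scipy.stats import norm

np.random.seed(0)
n, B, M, t = 30, 2000, 5000, 1.0

# "True" CDF of sqrt(n)(theta-hat - theta)/sigma-hat at t, by simulation
samples = np.random.exponential(size=(M, n))
stats = (np.sqrt(n) * (np.mean(samples, axis=1) - 1.0)
         / np.std(samples, axis=1, ddof=1))
true_cdf = np.mean(stats <= t)

# Bootstrap approximation G-hat(t) from a single observed sample
x = np.random.exponential(size=n)
boot = np.mean(np.random.choice(x, size=(n, B), replace=True), axis=0)
boot_cdf = np.mean((boot - np.mean(x)) / np.std(boot) <= t)

# For a typical sample, G-hat(t) tends to track the true CDF more closely
# than the normal approximation Phi(t) does.
print(true_cdf, boot_cdf, norm.cdf(t))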