Week 8 in-class problems
PM522b Introduction to the Theory of Statistics Part 2
Exercise 1: Asymptotic Normality of the MLE – Bernoulli Model
Let $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathrm{Bernoulli}(\theta)$, where $\theta \in (0, 1)$.
In this exercise, you will derive the asymptotic distribution of the maximum likelihood estimator (MLE) $\hat\theta_n$ by following the general steps of the asymptotic normality proof. This is a chance to see how the theory works in a concrete, tractable example.
1. Find the log-likelihood and the score function.
(a) The likelihood is:
\[
L(\theta) = \prod_{i=1}^{n} \theta^{X_i} (1 - \theta)^{1 - X_i}.
\]
Take the logarithm to find the log-likelihood:
\[
\ell(\theta) = \sum_{i=1}^{n} \left[ X_i \log \theta + (1 - X_i) \log(1 - \theta) \right].
\]
(b) Differentiate to get the score function:
\[
\ell'(\theta) = \sum_{i=1}^{n} \left( \frac{X_i}{\theta} - \frac{1 - X_i}{1 - \theta} \right).
\]
(c) Solve $\ell'(\theta) = 0$ to find the MLE:
\[
\hat\theta_n = \bar X_n = \frac{1}{n} \sum_{i=1}^{n} X_i.
\]
2. Set up the Taylor expansion of the score function.
(a) Take a second-order Taylor expansion of $\ell'(\hat\theta_n)$ around $\theta_0$, ignoring the third (remainder) term, which under suitable regularity conditions can be shown to converge to zero in probability:
\[
\ell'(\hat\theta_n) \approx \ell'(\theta_0) + (\hat\theta_n - \theta_0)\,\ell''(\theta_0).
\]
(b) Since $\ell'(\hat\theta_n) = 0$ (the first-order condition defining the MLE), rearrange to solve for $\sqrt{n}\,(\hat\theta_n - \theta_0)$:
\[
\sqrt{n}\,(\hat\theta_n - \theta_0) \approx -\frac{\sqrt{n}\,\ell'(\theta_0)}{\ell''(\theta_0)}.
\]
3. Analyze the asymptotic behavior of the numerator.
(a) The score function at θ0 can be written as:
\[
\ell'(\theta_0) = \sum_{i=1}^{n} \frac{X_i - \theta_0}{\theta_0 (1 - \theta_0)}.
\]
(b) The terms are i.i.d. with mean zero and finite variance, so the CLT gives:
\[
\frac{1}{\sqrt{n}}\,\ell'(\theta_0) \xrightarrow{d} N\!\left(0,\; \frac{1}{\theta_0 (1 - \theta_0)}\right).
\]
4. Analyze the behavior of the denominator.
(a) Compute the second derivative of the log-likelihood (its negative is the observed information):
\[
\ell''(\theta) = -\sum_{i=1}^{n} \left( \frac{X_i}{\theta^2} + \frac{1 - X_i}{(1 - \theta)^2} \right).
\]
(b) Use the law of large numbers to argue:
\[
\frac{1}{n}\,\ell''(\theta_0) \xrightarrow{p} -E\!\left[ \frac{X}{\theta_0^2} + \frac{1 - X}{(1 - \theta_0)^2} \right] = -I(\theta_0).
\]
5. Conclude the asymptotic distribution.
(a) Use Slutsky’s theorem to combine the asymptotic distribution of the numerator and the convergence in probability of the denominator:
\[
\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N\!\left(0,\; \frac{1}{I(\theta_0)}\right).
\]
(b) Compute the Fisher information for the Bernoulli model:
\[
I(\theta_0) = E\!\left[ \left( \frac{\partial}{\partial \theta} \log f(X; \theta) \Big|_{\theta = \theta_0} \right)^{\!2} \right] = \frac{1}{\theta_0 (1 - \theta_0)}.
\]
(c) Therefore:
\[
\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N\big(0,\; \theta_0 (1 - \theta_0)\big),
\]
which matches the known asymptotic distribution of the sample mean of i.i.d. Bernoulli variables, since $\operatorname{Var}(X_i) = \theta_0(1 - \theta_0)$ and the CLT applies to $\bar X_n$ directly. A quick simulation check of this limit appears below.
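The limit in step 5(c) is easy to check by simulation. The sketch below is a minimal illustration, not part of the original problem: it assumes NumPy is available, and the values of $\theta_0$, $n$, and the replication count are arbitrary choices. It compares the empirical variance of $\sqrt{n}\,(\hat\theta_n - \theta_0)$ with $\theta_0(1 - \theta_0)$ and also checks the linearization from step 2(b).

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, n, reps = 0.3, 500, 20_000  # illustrative choices, not from the problem

# Each row is one Bernoulli(theta0) sample of size n; the MLE is the row mean.
X = rng.binomial(1, theta0, size=(reps, n))
theta_hat = X.mean(axis=1)

# Standardized estimator from step 5(c): its variance should approach
# theta0 * (1 - theta0).
Z = np.sqrt(n) * (theta_hat - theta0)
print("empirical variance:", Z.var())
print("theoretical limit :", theta0 * (1 - theta0))

# Linearization check from step 2(b): Z is approximately
# -sqrt(n) * l'(theta0) / l''(theta0), using the closed-form score and
# second derivative from steps 1(b) and 4(a).
S = X.sum(axis=1)
score = S / theta0 - (n - S) / (1 - theta0)            # l'(theta0)
second = -(S / theta0**2 + (n - S) / (1 - theta0)**2)  # l''(theta0)
print("max linearization error:", np.abs(Z + np.sqrt(n) * score / second).max())
```

With $n = 500$ the linearization error should already be small, which is the content of the remainder-term argument in step 2(a).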
Exercise 2: Asymptotic Normality of the MLE – Exponential Model
Let $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathrm{Exponential}(\theta)$ with density
\[
f(x; \theta) = \theta e^{-\theta x}, \qquad x \geq 0, \quad \theta > 0.
\]
In this exercise, you will derive the asymptotic distribution of the MLE $\hat\theta_n$ by following the general steps in the proof of asymptotic normality of MLEs.
1. Find the log-likelihood and score function.
(a) Write down the log-likelihood function $\ell(\theta)$ for the sample.
(b) Compute the score function $\ell'(\theta)$ and solve $\ell'(\theta) = 0$ to find the MLE $\hat\theta_n$.
2. Set up the Taylor expansion of the score function.
(a) Expand $\ell'(\hat\theta_n)$ around the true parameter $\theta_0$ using a first-order Taylor expansion.
(b) Rearrange the expression to isolate $\sqrt{n}\,(\hat\theta_n - \theta_0)$.
3. Analyze the asymptotic behavior of the numerator.
(a) Express ℓ′ (θ0 ) as a sum of i.i.d. random variables.
(b) Use the central limit theorem to derive the limiting distribution of $\frac{1}{\sqrt{n}}\,\ell'(\theta_0)$.
4. Analyze the behavior of the denominator.
(a) Compute the second derivative $\ell''(\theta)$.
(b) Show that $\ell''(\theta_0)/n \xrightarrow{p} -I(\theta_0)$, the negative Fisher information.
5. Conclude the asymptotic distribution.
(a) Combine the previous results to show that
\[
\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N\!\left(0,\; \frac{1}{I(\theta_0)}\right).
\]
(b) Compute $I(\theta_0)$ explicitly for the Exponential model.
(c) Alternatively, verify this result by applying the delta method to $\hat\theta_n = 1/\bar X_n$. A simulation check of the limit appears below.
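Once you have derived $I(\theta_0)$ in step 5(b), you can compare $1/I(\theta_0)$ against the empirical variance produced by a simulation. The sketch below is illustrative, not part of the original problem: it assumes NumPy, and $\theta_0$, $n$, and the replication count are arbitrary choices. Note that NumPy parameterizes the exponential distribution by its scale, which is $1/\theta$ under the density above.

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, n, reps = 2.0, 500, 20_000  # illustrative choices

# f(x; theta) = theta * exp(-theta * x) has mean 1/theta, so NumPy's
# `scale` parameter is 1/theta0.
X = rng.exponential(scale=1 / theta0, size=(reps, n))

# MLE from step 1(b): theta_hat = 1 / sample mean.
theta_hat = 1 / X.mean(axis=1)

# Compare this empirical variance with 1/I(theta0) from step 5(b).
Z = np.sqrt(n) * (theta_hat - theta0)
print("empirical variance of sqrt(n)(theta_hat - theta0):", Z.var())
```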
Exercise 3: Asymptotic Confidence Intervals for the Poisson Mean Using
Expected and Observed Information
Suppose that $X_1, X_2, \dots, X_n$ is a random sample from a $\mathrm{Poisson}(\lambda)$ distribution with probability mass function
\[
P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \qquad x = 0, 1, 2, \dots, \quad \lambda > 0.
\]
(a) Derivation of the MLE.
(i) Write the likelihood function L(λ) and the log-likelihood function ℓ(λ) based on
the full sample.
(ii) Show that the maximum likelihood estimator (MLE) of λ is
\[
\hat\lambda = \frac{1}{n} \sum_{i=1}^{n} X_i.
\]
(b) Asymptotic Normality of the MLE.
Show that under standard regularity conditions the MLE is asymptotically normal:
\[
\sqrt{n}\,\big(\hat\lambda - \lambda\big) \xrightarrow{d} N\!\left(0,\; \frac{1}{I(\lambda)}\right),
\]
where I(λ) is the Fisher information for one observation. (Hint: Derive the score function
and its second derivative.)
(c) Expected vs. Observed Fisher Information.
(i) Expected Information:
Show that for one observation the Fisher information is given by
\[
I(\lambda) = E\!\left[ \left( \frac{\partial}{\partial \lambda} \log f(X; \lambda) \right)^{\!2} \right] = \frac{1}{\lambda},
\]
so that for the full sample,
\[
I_n(\lambda) = \frac{n}{\lambda}.
\]
(ii) Observed Information:
The observed information is defined as
\[
J(\lambda) = -\frac{\partial^2}{\partial \lambda^2}\,\ell(\lambda).
\]
Compute J(λ) (in terms of the data) and show that
\[
J(\lambda) = \frac{\sum_{i=1}^{n} X_i}{\lambda^2}.
\]
Explain why, when evaluated at the MLE λ̂, the observed information is
\[
J(\hat\lambda) = \frac{n}{\hat\lambda}.
\]
Compare this with the expected information.
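As a quick numerical check of this comparison, the sketch below (illustrative, not part of the original problem; NumPy is assumed, and $\lambda_0$ and $n$ are arbitrary choices) evaluates both information quantities at the MLE on one simulated sample.

```python
import numpy as np

rng = np.random.default_rng(2)
lam0, n = 3.5, 200  # illustrative choices

X = rng.poisson(lam0, size=n)
lam_hat = X.mean()             # MLE from part (a)

J_hat = X.sum() / lam_hat**2   # observed information J(lam_hat)
I_n_hat = n / lam_hat          # expected information I_n evaluated at lam_hat
print(J_hat, I_n_hat)
```

The two printed values agree (up to floating-point rounding) because $\sum_i X_i = n\hat\lambda$, which is exactly the comparison asked for above.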
(d) Construction of Asymptotic Confidence Intervals.
Using the asymptotic normality in part (b), construct a $100(1 - \alpha)\%$ asymptotic confidence interval for $\lambda$. Write the interval in two forms:
(i) Using the expected Fisher information.
(ii) Using the observed Fisher information (i.e., by replacing $I(\lambda)$ with $J(\hat\lambda)/n$). A sketch comparing the two resulting intervals appears below.
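After writing down both forms, a sketch like the following can be used to compute them on simulated data. It is illustrative, not part of the original problem: NumPy and SciPy are assumed, and $\lambda_0$, $n$, and $\alpha$ are arbitrary choices. For the Poisson model the two intervals coincide at the MLE, since $J(\hat\lambda)/n = 1/\hat\lambda = I(\hat\lambda)$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
lam0, n, alpha = 3.5, 200, 0.05  # illustrative choices

X = rng.poisson(lam0, size=n)
lam_hat = X.mean()
z = norm.ppf(1 - alpha / 2)      # standard normal quantile

# (i) Expected information at the MLE: I_n(lam_hat) = n / lam_hat.
half_expected = z * np.sqrt(lam_hat / n)

# (ii) Observed information: J(lam_hat) = sum(X) / lam_hat**2 = n / lam_hat,
# so the half-width is the same.
half_observed = z / np.sqrt(X.sum() / lam_hat**2)

print("expected-information CI:", (lam_hat - half_expected, lam_hat + half_expected))
print("observed-information CI:", (lam_hat - half_observed, lam_hat + half_observed))
```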