1
M3/M4S3 STATISTICAL THEORY II
THE GLIVENKO-CANTELLI LEMMA
Definition : The Empirical Distribution Function
Let X1 , . . . , Xn be a collection of i.i.d. random variables with cdf FX . Then the empirical
distribution function will be denoted Fn (x), and defined for x R by
n
1X
Fn (x) = I[Xi ,) (x)
n
i=1
where IA () is the indicator function for set A.
If data x1 , . . . , xn are available, then the observed or estimated empirical distribution function is
denoted Fbn (x) and defined by
n
b 1X
Fn (x) = I[xi ,) (x).
n
i=1
Note that for any fixed x R, the Strong Law of Large Numbers ensures that
a.s.
Fn (x) FX (x) as n
as
E[I[Xi ,) (x)] = P [I[Xi ,) (x) = 1] = P [Xi x] = FX (x).
This result is strengthened by the following Theorem.
Theorem 1.9 The Glivenko-Cantelli Theorem
Let X1 , . . . , Xn be a collection of i.i.d. random variables with cdf FX , and let Fn (x) denote the
empirical distribution function. Then, as n ,
P sup |Fn (x) FX (x)| 0 = 1
xR
or equivalently
P lim sup |Fn (x) FX (x)| = 0 = 1.
n xR
that is, the convergence is uniform in x.
Proof. Let > 0. Then fix k > 1/, and then consider knot points 0 , . . . , k such that
= 0 < 1 2 . . . k1 < k =
that define a partition of R into k disjoint intervals such that
j
FX (
j ) FX (j ) j = 1, . . . , k 1
k
where, for each j,
FX (
j ) = P [Xj < j ] = FX (j ) P [X = j ].
Then, by construction, if j1 < j ,
j (j 1) 1
FX (
j ) FX (j1 ) = < .
k k k
2
Recall in the following that Fn (x) is a random quantity. Now, by the Strong Law, we have
pointwise convergence, so that, as n , for j = 1, . . . , k 1.
a.s. a.s.
Fn (j ) FX (j ) and Fn (
j ) FX (j ).
Then it immediately follows that, for each j,
a.s. a.s.
|Fn (
j ) FX (j )| 0 and |Fn (
j ) FX (j )| 0
as n , so looking at the maximum over all j,
n o
a.s.
4n = max |Fn (j ) FX (j )| , |Fn (
j ) FX (
j )| 0 as n .
j=1,...,k1
For any x, find the interval within which x lies, that is, identify j such that
j1 x < j .
Then we have
Fn (x) FX (x) Fn (
j ) FX (j1 ) Fn (j ) FX (j ) +
Fn (x) FX (x) Fn (j1 ) FX (
j ) Fn (j1 ) FX (j1 )
and thus for any x,
Fn (j1 ) FX (j1 ) Fn (x) FX (x) Fn (
j ) FX (j ) +
and thus
a.s.
|Fn (x) FX (x)| 4n + as n .
Hence, as this holds for arbitrary x, it follows that
a.s.
sup |Fn (x) FX (x)| as n .
xR
This holds for every > 0; that is, if A denotes the set of on which this convergence is
observed, then P (A ) = 1, and then by definition
\
A A lim A = P (A) = P lim A = lim P (A ) = 1
0 0 0
>0
and it follows that
P lim sup |Fn (x) FX (x)| = 0 = 1.
n xR