0% found this document useful (0 votes)
239 views2 pages

M3/M4S3 Statistical Theory Ii The Glivenko-Cantelli Lemma: N X N N, )

1) The empirical distribution function Fn(x) is defined as the proportion of observations less than or equal to x. 2) The Glivenko-Cantelli theorem states that the empirical distribution function Fn(x) converges uniformly to the true cumulative distribution function FX(x) as the sample size increases. 3) The proof involves dividing the real line into intervals and showing that Fn(x) converges pointwise to FX(x) at the interval endpoints, and is bounded within the intervals, ensuring uniform convergence.

Uploaded by

Pubava
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
239 views2 pages

M3/M4S3 Statistical Theory Ii The Glivenko-Cantelli Lemma: N X N N, )

1) The empirical distribution function Fn(x) is defined as the proportion of observations less than or equal to x. 2) The Glivenko-Cantelli theorem states that the empirical distribution function Fn(x) converges uniformly to the true cumulative distribution function FX(x) as the sample size increases. 3) The proof involves dividing the real line into intervals and showing that Fn(x) converges pointwise to FX(x) at the interval endpoints, and is bounded within the intervals, ensuring uniform convergence.

Uploaded by

Pubava
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

1

M3/M4S3 STATISTICAL THEORY II


THE GLIVENKO-CANTELLI LEMMA

Definition : The Empirical Distribution Function


Let X1 , . . . , Xn be a collection of i.i.d. random variables with cdf FX . Then the empirical
distribution function will be denoted Fn (x), and defined for x R by
n
1X
Fn (x) = I[Xi ,) (x)
n
i=1

where IA () is the indicator function for set A.


If data x1 , . . . , xn are available, then the observed or estimated empirical distribution function is
denoted Fbn (x) and defined by
n
b 1X
Fn (x) = I[xi ,) (x).
n
i=1

Note that for any fixed x R, the Strong Law of Large Numbers ensures that
a.s.
Fn (x) FX (x) as n

as
E[I[Xi ,) (x)] = P [I[Xi ,) (x) = 1] = P [Xi x] = FX (x).
This result is strengthened by the following Theorem.

Theorem 1.9 The Glivenko-Cantelli Theorem


Let X1 , . . . , Xn be a collection of i.i.d. random variables with cdf FX , and let Fn (x) denote the
empirical distribution function. Then, as n ,

P sup |Fn (x) FX (x)| 0 = 1
xR

or equivalently
P lim sup |Fn (x) FX (x)| = 0 = 1.
n xR

that is, the convergence is uniform in x.

Proof. Let > 0. Then fix k > 1/, and then consider knot points 0 , . . . , k such that

= 0 < 1 2 . . . k1 < k =

that define a partition of R into k disjoint intervals such that


j
FX (
j ) FX (j ) j = 1, . . . , k 1
k
where, for each j,
FX (
j ) = P [Xj < j ] = FX (j ) P [X = j ].

Then, by construction, if j1 < j ,

j (j 1) 1
FX (
j ) FX (j1 ) = < .
k k k
2

Recall in the following that Fn (x) is a random quantity. Now, by the Strong Law, we have
pointwise convergence, so that, as n , for j = 1, . . . , k 1.
a.s. a.s.
Fn (j ) FX (j ) and Fn (
j ) FX (j ).

Then it immediately follows that, for each j,


a.s. a.s.
|Fn (
j ) FX (j )| 0 and |Fn (
j ) FX (j )| 0

as n , so looking at the maximum over all j,


n o
a.s.
4n = max |Fn (j ) FX (j )| , |Fn (
j ) FX (
j )| 0 as n .
j=1,...,k1

For any x, find the interval within which x lies, that is, identify j such that

j1 x < j .

Then we have

Fn (x) FX (x) Fn (
j ) FX (j1 ) Fn (j ) FX (j ) +

Fn (x) FX (x) Fn (j1 ) FX (


j ) Fn (j1 ) FX (j1 )

and thus for any x,

Fn (j1 ) FX (j1 ) Fn (x) FX (x) Fn (


j ) FX (j ) +

and thus
a.s.
|Fn (x) FX (x)| 4n + as n .
Hence, as this holds for arbitrary x, it follows that
a.s.
sup |Fn (x) FX (x)| as n .
xR

This holds for every > 0; that is, if A denotes the set of on which this convergence is
observed, then P (A ) = 1, and then by definition
\
A A lim A = P (A) = P lim A = lim P (A ) = 1
0 0 0
>0

and it follows that


P lim sup |Fn (x) FX (x)| = 0 = 1.
n xR

You might also like