
LectureNote1 3

The document outlines the foundational concepts of probability measures, including axioms and properties that define probability spaces. It discusses discrete spaces, examples of probability distributions such as Bernoulli, binomial, waiting time, and Poisson distributions, and provides proofs for various properties of probability. Additionally, it explores the concept of infinite coin tossing and the probabilities associated with specific outcomes.

Uploaded by

鄭皓倫

Lecture 1.

Probability Spaces (3)

This is taken from Chapter 1 of Venkatesh’s book:


Santosh S. Venkatesh (2013), The Theory of Probability,
Cambridge University Press

1.6. Probability Measures


Ω is the sample space and F is a σ-algebra on Ω. A probability P defined on F is a
function P : F → [0, 1] satisfying the following properties:
Axiom 1 Positivity P (A) ≥ 0, A ∈ F.

Axiom 2 Normalization P (Ω) = 1.

Axiom 3 Additivity Assume A, B are disjoint events. Then

P (A ∪ B) = P (A) + P (B).

Axiom 4 Continuity Let $A_1, A_2, \cdots$ be a sequence of events such that $A_{n+1} \subset A_n$ for all $n$ and $\bigcap_{n=1}^{\infty} A_n = \emptyset$. Then

$$\lim_{n\to\infty} P(A_n) = 0.$$

Useful Properties

• We have the following consequences.


(P1) P (∅) = 0.
(P2) Let A be an event. Then $P(A^c) = 1 - P(A)$.
(P3) Let A, B, C be pairwise disjoint events. Then

P (A ∪ B ∪ C) = P (A) + P (B) + P (C).

(P4) Let $A_1, A_2, \cdots, A_n$ be pairwise disjoint events. Then

$$P\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{i=1}^{n} P(A_i).$$

Proof of (P1) and (P2)

By taking A = B = ∅ and Axiom 3, we have

P (∅) = P (∅) + P (∅) = 2P (∅).

This implies P (∅) = 0.

For (P2), we apply Axiom 3 to $A$ and $B = A^c$. We have

$$A \cup B = A \cup A^c = \Omega,$$

a disjoint union. Then, by Axiom 2,

$$1 = P(\Omega) = P(A) + P(A^c).$$

(P2) follows.

Proof of (P3), (P4)

For (P3), we apply Axiom 3 to the pair of disjoint events A and B ∪ C,

P (A ∪ B ∪ C) = P (A) + P (B ∪ C).

We then apply Axiom 3 to the pair of disjoint events B and C,

P (B ∪ C) = P (B) + P (C).

We conclude
P (A ∪ B ∪ C) = P (A) + P (B) + P (C).

By the same idea and induction on n, we can prove (P4).

• (continued) We give other consequences.

(P5) For two events A, B, we have

P (A ∪ B) = P (A) + P (B) − P (A ∩ B).

(P6) For two events A, B, we have P (A ∪ B) ≤ P (A) + P (B).


(P7) For $n$ events $A_1, A_2, \cdots, A_n$, where $n$ is a positive integer, we have

$$P\Big(\bigcup_{i=1}^{n} A_i\Big) \le \sum_{i=1}^{n} P(A_i).$$

Proof of (P5), (P6) and (P7)

For (P5), we consider

A1 = A ∩ B, A2 = A \ A1 , A3 = B \ A1 .

These three events are mutually exclusive. We have the relations

A = A1 ∪ A2 , B = A1 ∪ A3 ,

and, by Axiom 3,

P (A) = P (A1 ) + P (A2 ), P (B) = P (A1 ) + P (A3 ).

Similarly, we have
A ∪ B = A1 ∪ A2 ∪ A3 ,
so by (P3) we obtain

P (A ∪ B) = P (A1 ∪ A2 ∪ A3 ) = P (A1 ) + P (A2 ) + P (A3 ).

Combining these identities and recalling A1 = A ∩ B, we obtain

P (A ∪ B) = P (A) + P (B) − P (A ∩ B).

(P6) follows from (P5), since P (A ∩ B) ≥ 0 by Axiom 1.

(P7) follows from (P6) by induction on n.
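As a sanity check on (P2), (P5) and (P6), the following minimal sketch evaluates both sides on a small finite space. The sample space (one roll of a fair die), the events A and B, and the helper name `prob` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

# Hypothetical finite sample space: one roll of a fair die (uniform weights).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Uniform probability P(A) = |A| / |Omega| for A a subset of OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

A = frozenset({1, 2, 3, 4})   # illustrative event "at most four"
B = frozenset({2, 4, 6})      # illustrative event "even"

# (P5): P(A u B) = P(A) + P(B) - P(A n B).
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)

# The three mutually exclusive pieces used in the proof of (P5).
A1 = A & B
A2 = A - A1
A3 = B - A1
pieces = prob(A1) + prob(A2) + prob(A3)

print(lhs, rhs, pieces)
```

Exact rational arithmetic (`Fraction`) is used so that the identities can be compared with `==` rather than up to rounding.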

• (continued) (Monotone Convergence)

(P8) Assume $B_1, B_2, \cdots$ is a sequence of events such that $B_{n+1} \subset B_n$ for all $n$. Then

$$\lim_{n\to\infty} P(B_n) = P\Big(\bigcap_{n=1}^{\infty} B_n\Big).$$

(P9) Assume $A_1, A_2, \cdots$ is a sequence of events such that $A_n \subset A_{n+1}$ for all $n$. Then

$$\lim_{n\to\infty} P(A_n) = P\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

Proof of (P8) and (P9)

We first consider (P8).


We observe that Axiom 4 is the same as (P8) if $\bigcap_{n=1}^{\infty} B_n = \emptyset$.

In general, we define

$$B = \bigcap_{n=1}^{\infty} B_n$$

and

$$A_n = B_n \setminus B.$$

It is easy to see that $A_1, A_2, \cdots$ satisfy the conditions in Axiom 4. Therefore,

$$\lim_{n\to\infty} P(A_n) = 0.$$

Since $B_n = A_n \cup B$ is a disjoint union, Axiom 3 gives

$$P(A_n) = P(B_n) - P(B),$$

and we can conclude

$$\lim_{n\to\infty} P(B_n) = P(B).$$

(P9) follows easily from (P8) and (P2) by considering

$$B_n = A_n^c, \quad n = 1, 2, \cdots.$$

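Theorem 1.6.1 Assume $C_1, C_2, \cdots$ is a sequence of disjoint events. Then

$$P\Big(\bigcup_{i=1}^{\infty} C_i\Big) = \sum_{i=1}^{\infty} P(C_i).$$

Proof: Define

$$A_n = C_1 \cup C_2 \cup \cdots \cup C_n.$$

Then

$$A_1 \subset A_2 \subset \cdots$$

and

$$\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} C_n.$$

Therefore, by (P9), we have

$$P\Big(\bigcup_{n=1}^{\infty} C_n\Big) = P\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \lim_{n\to\infty} P(A_n).$$

On the other hand, by (P4) we have

$$P(A_n) = \sum_{i=1}^{n} P(C_i).$$

Therefore,

$$P\Big(\bigcup_{n=1}^{\infty} C_n\Big) = \lim_{n\to\infty} \sum_{i=1}^{n} P(C_i) = \sum_{i=1}^{\infty} P(C_i).$$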

The proof is complete.

1.7. Probabilities in Simple Cases

Discrete Spaces

• Ω is discrete if
Ω = {x1 , x2 , · · · , xn }
for some positive integer n or

Ω = {x1 , x2 , · · ·},

a set with countably infinitely many elements.

In the following, we consider

Ω = {x1 , x2 , · · ·}.

The argument also applies to the case where Ω has finitely many elements.

Let $\{p_1, p_2, \cdots\}$ be a sequence of nonnegative numbers, $p_k \ge 0$ for all $k$, with

$$\sum_{k=1}^{\infty} p_k = 1.$$

We take F to be the power set of Ω, so that A ∈ F if and only if A ⊂ Ω. We define

$$P(A) = \sum_{x_k \in A} p_k.$$

We want to prove that P satisfies the axioms of probability, namely (Axiom 1), (Axiom
2), (Axiom 3) and (Axiom 4).

The proofs of (Axiom 1), (Axiom 2) and (Axiom 3) are easy. In the following, we only
prove (Axiom 4).

Proof of (Axiom 4).

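We take $A_1, A_2, \cdots$ in F such that $A_{n+1} \subset A_n$ for every $n = 1, 2, \cdots$ and $\bigcap_{n=1}^{\infty} A_n = \emptyset$. We want to prove

$$\lim_{n\to\infty} P(A_n) = 0.$$

That is, for any $\varepsilon > 0$, we want to find $n_0$ such that $P(A_{n_0}) < \varepsilon$. Since the sets are decreasing, this implies

$$P(A_n) < \varepsilon, \quad n \ge n_0.$$

To find $n_0$, we take $k_0$ such that

$$\sum_{k=k_0}^{\infty} p_k < \varepsilon.$$

Since $\bigcap_{n=1}^{\infty} A_n = \emptyset$, each $x_k$ with $k < k_0$ lies outside some $A_{m_k}$. Taking $n_0$ to be the largest of these finitely many indices $m_k$ and using that the sets are decreasing, we get $A_{n_0} \cap \{x_1, x_2, \cdots, x_{k_0-1}\} = \emptyset$. This implies

$$A_{n_0} \subset \{x_{k_0}, x_{k_0+1}, \cdots\}.$$

Then

$$P(A_{n_0}) \le P(\{x_{k_0}, x_{k_0+1}, \cdots\}) = \sum_{k=k_0}^{\infty} p_k < \varepsilon.$$

This implies

$$\lim_{n\to\infty} P(A_n) = 0.$$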
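The argument above can be illustrated on a concrete discrete space. The sketch below assumes the weights p_k = 2^(-k) and the decreasing tails A_n = {x_n, x_(n+1), ...}, whose intersection is empty; these choices and all function names are illustrative assumptions. P(A_n) is computed through the complement, as in (P2).

```python
from fractions import Fraction

def p(k):
    """Hypothetical weights p_k = 2**(-k), k = 1, 2, ...; they sum to 1."""
    return Fraction(1, 2 ** k)

def prob_finite(indices):
    """P(A) = sum of p_k over x_k in A, for a finite set A given by its indices."""
    return sum((p(k) for k in set(indices)), Fraction(0))

def prob_tail(n):
    """P(A_n) for the tail A_n = {x_n, x_(n+1), ...}, via the complement:
    P(A_n) = 1 - P({x_1, ..., x_(n-1)})."""
    return 1 - prob_finite(range(1, n))

def first_index_below(eps):
    """Smallest n0 with P(A_n0) < eps, found by direct search."""
    n = 1
    while prob_tail(n) >= eps:
        n += 1
    return n

print([prob_tail(n) for n in range(1, 6)])
print(first_index_below(Fraction(1, 100)))
```

For these weights P(A_n) = 2^(-(n-1)), so the search terminates for every eps > 0, which is the content of Axiom 4 in this example.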

Examples

• Take a positive integer $n$. Let $\Omega = \{0, 1, 2, \cdots, n-1\}$ and

$$p_0 = p_1 = \cdots = p_{n-1} = \frac{1}{n}.$$

Then for any A ⊂ Ω, we have

$$P(A) = \frac{1}{n}\,\mathrm{card}(A),$$

where card(A) is the cardinality of A.

• Bernoulli trials Take $0 \le p \le 1$. Let $\Omega = \Omega_1 = \{0, 1\}$ and $p_0 = 1 - p$, $p_1 = p$.

A 1 in a trial means a success and a 0 means a failure; $p$ is the probability of a success.

We can also consider coin tossing, with probability $p$ of getting Head. We use 1
to represent Head and 0 to represent Tail.

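• (n Bernoulli trials) Take $0 < p < 1$ and a positive integer $n$.

$\Omega = \Omega_n = \{0, 1\}^n$. Each $\omega = (\omega_1, \omega_2, \cdots, \omega_n) \in \Omega$ is a sequence of 0s and 1s, and we denote by $k = k(\omega)$ the number of 1s in $\omega$. Define

$$p_\omega = p^k (1-p)^{n-k}.$$

It is not difficult to see that we can also write

$$p_\omega = p_{\omega_1} p_{\omega_2} \cdots p_{\omega_n},$$

where $p_{\omega_i} = p^{\omega_i} (1-p)^{1-\omega_i}$. That is,

$$p_1 = p, \quad p_0 = 1 - p.$$

Using this representation, we can easily show

$$\sum_{\omega \in \Omega} p_\omega = \sum_{\omega_1 = 0,1} \sum_{\omega_2 = 0,1} \cdots \sum_{\omega_n = 0,1} p_{\omega_1} p_{\omega_2} \cdots p_{\omega_n} = (p_0 + p_1)^n = 1.$$

We take $F = F_n = \mathcal{P}(\Omega_n)$, the power set of $\Omega_n$, and

$$P_n(E) = \sum_{\omega \in E} p_\omega.$$

This gives the probability space $(\Omega_n, F_n, P_n)$ for $n$ Bernoulli trials.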
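The construction of the space for n Bernoulli trials can be checked by brute-force enumeration for small n. The following is a minimal sketch; the values n = 4 and p = 1/3 and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def weight(omega, p):
    """p_omega = p**k * (1 - p)**(n - k), where k is the number of 1s in omega."""
    k = sum(omega)
    return p ** k * (1 - p) ** (len(omega) - k)

def bernoulli_space(n, p):
    """Map every omega in {0, 1}**n to its weight p_omega."""
    return {omega: weight(omega, p) for omega in product((0, 1), repeat=n)}

def prob(event, space):
    """P_n(E) = sum of p_omega over omega in E."""
    return sum((space[omega] for omega in event), Fraction(0))

# Illustrative values (not from the notes): n = 4 trials, p = 1/3.
space = bernoulli_space(4, Fraction(1, 3))
total = sum(space.values())
exactly_two = [omega for omega in space if sum(omega) == 2]

print(total, prob(exactly_two, space))
```

The event "exactly two successes" collects the 6 sequences with k = 2, so its probability is 6 p^2 (1 - p)^2, which is the binomial weight for k = 2.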

• Binomial distributions Take $0 \le p \le 1$ and a positive integer $n$. Let $\Omega = \{0, 1, 2, \cdots, n\}$ and

$$p_0 = (1-p)^n, \quad p_1 = np(1-p)^{n-1}, \quad p_2 = \frac{n(n-1)}{2} p^2 (1-p)^{n-2}, \cdots,$$

$$p_k = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}, \cdots, \quad p_n = p^n.$$

$p_k$ represents the probability that $k$ Heads appear in a sequence of $n$ coin tosses.

We recall the binomial theorem: for $x, y \in \mathbb{R}$, we have

$$(x + y)^n = \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} x^k y^{n-k}.$$

Applying the binomial theorem with $x = p$, $y = 1 - p$, we have

$$\sum_{k=0}^{n} p_k = 1.$$

(Ω, F, P ) is the probability space for Binomial distribution with parameter (n, p).
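The binomial weights can be computed directly and their sum checked exactly. This is a minimal sketch; the parameters n = 5 and p = 1/4 and the name `binomial_pmf` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return [p_0, ..., p_n] with p_k = C(n, k) * p**k * (1 - p)**(n - k)."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

# Illustrative parameters (an assumption, not from the notes).
pmf = binomial_pmf(5, Fraction(1, 4))

print(pmf)
print(sum(pmf))
```

Here `math.comb(n, k)` returns n! / (k! (n - k)!), so the list agrees term by term with the formula for p_k above.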

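• Waiting time distribution Take $0 < p \le 1$. Let $\Omega = \{1, 2, \cdots\}$. For each $k \in \Omega$, we define

$$p_k = p(1-p)^{k-1}, \quad k = 1, 2, \cdots.$$

Then, by the geometric series,

$$\sum_{k=1}^{\infty} p_k = p \cdot \frac{1}{1 - (1-p)} = 1.$$

Ω is a discrete sample space, and $p_1, p_2, \cdots$ can be used to define a probability space $(\Omega, F, P)$ with

$$F = \mathcal{P}(\Omega), \quad P(E) = \sum_{k \in E} p_k \ \text{ for } E \subset \Omega.$$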

There is a connection between this example and infinitely many Bernoulli trials.

We can show that $p_k$ represents the probability that the first success occurs at
time $k$ in the sequence of trials. To explain this connection, we need to wait until
we introduce the probability space for infinitely many Bernoulli trials.

There is another interesting sequence, denoted by $p_0^{(r)}, p_1^{(r)}, \cdots$, where $r \ge 1$ is a given integer. Define

$$p_k^{(r)} = \frac{(k+r-1)!}{k!(r-1)!} (1-p)^k p^r, \quad k = 0, 1, \cdots.$$

$p_k^{(r)}$ represents the probability of $k$ failures before the $r$-th success.
This interpretation suggests

$$\sum_{k=0}^{\infty} p_k^{(r)} = 1.$$

However, a proof requires a nontrivial argument.
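Although the proof is deferred, the two normalizations can at least be checked numerically through partial sums. The sketch below assumes p = 1/2 and r = 3 for illustration; the function names are hypothetical. It gives numerical evidence only, not a proof.

```python
from fractions import Fraction
from math import comb

def waiting_time(k, p):
    """p_k = p * (1 - p)**(k - 1), k = 1, 2, ..."""
    return p * (1 - p) ** (k - 1)

def failures_before_r(k, r, p):
    """p_k^(r) = (k + r - 1)! / (k! (r - 1)!) * (1 - p)**k * p**r, k = 0, 1, ..."""
    return comb(k + r - 1, k) * (1 - p) ** k * p ** r

p = Fraction(1, 2)   # illustrative value

# Partial sums: the first 20 waiting-time weights, the first 60 weights for r = 3.
geometric_partial = sum(waiting_time(k, p) for k in range(1, 21))
negbin_partial = sum(failures_before_r(k, 3, p) for k in range(60))

print(1 - geometric_partial)          # remaining tail of the geometric series
print(float(1 - negbin_partial))      # remaining tail for r = 3
```

For r = 1 the second sequence reduces to the waiting time weights shifted by one index, since k failures before the first success means the first success occurs at time k + 1.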

• Poisson distribution Take $\lambda > 0$. Define

$$p_k = \frac{\lambda^k}{k!} e^{-\lambda}, \quad k = 0, 1, \cdots.$$

By the series expansion of $e^{\lambda}$,

$$e^{\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!},$$

we have $\sum_{k=0}^{\infty} p_k = 1$.
$p_k$ represents the probability of $k$ arrivals in a unit time interval when the times
between arrivals follow an exponential distribution. The exponential distribution is a
continuous probability measure to be discussed later.
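The normalization of the Poisson weights can be observed numerically from the partial sums of the series. This is a minimal floating-point sketch; the rate 2.5 and the cutoff of 40 terms are illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """p_k = lam**k / k! * exp(-lam), k = 0, 1, ..."""
    return lam ** k / math.factorial(k) * math.exp(-lam)

lam = 2.5   # illustrative rate

# Running partial sums p_0 + ... + p_k for k = 0, ..., 39.
partial_sums = []
running = 0.0
for k in range(40):
    running += poisson_pmf(k, lam)
    partial_sums.append(running)

print(partial_sums[:5])
print(partial_sums[-1])
```

The partial sums increase toward 1 up to rounding error, in line with the series expansion of e^λ.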

Coin tossing infinite times

• We consider tossing a coin infinitely many times, with probability 1/2 for each of Head and Tail.

We can take $\Omega = \{0, 1\}^{\infty}$. Each $\omega \in \Omega$ is given by $(a_1, a_2, \cdots)$, with $a_i = 0$ or $a_i = 1$ for each $i$.

We consider the event $A_4$ that the first 4 tosses are all Head. Then the probability
of $A_4$ is $1/16 = 2^{-4}$. That is, $P(A_4) = \frac{1}{16}$.

Similarly, the event $A_5$ that the first 5 tosses are all Head has probability $1/32 = 2^{-5}$.

In general, for a positive integer $n$, the event $A_n$ that the first $n$ tosses are all Head
has probability $2^{-n}$. That is, $P(A_n) = 2^{-n}$.

Denote by $A_\infty$ the event that all tosses are Head. Then we observe the relation

$$A_\infty = \bigcap_{n=1}^{\infty} A_n$$

and $A_{n+1} \subset A_n$ for any $n$. Since $A_\infty$ is not empty, we use (P8), the general form of Axiom 4, to derive

$$P(A_\infty) = \lim_{n\to\infty} P(A_n) = 0.$$
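The value P(A_n) = 2^(-n) can be recovered by counting. The sketch below assumes that an event depending only on the first `horizon` tosses is assigned the fraction of the 2^horizon equally likely prefixes that belong to it; the parameter name `horizon` and the function name are illustrative.

```python
from fractions import Fraction
from itertools import product

def prob_first_n_heads(n, horizon):
    """Fraction of the 2**horizon equally likely prefixes of length `horizon`
    whose first n entries are all 1 (Head)."""
    assert n <= horizon
    prefixes = list(product((0, 1), repeat=horizon))
    favourable = [w for w in prefixes if all(a == 1 for a in w[:n])]
    return Fraction(len(favourable), len(prefixes))

print([prob_first_n_heads(n, 8) for n in range(1, 9)])
```

The answer does not depend on the chosen horizon as long as it is at least n, and the values decrease to 0 as n grows.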

• Given $a'_k \in \{0, 1\}$ for $k = 1, 2, \cdots$, let $\omega' = (a'_1, a'_2, \cdots) \in \Omega$. By a similar argument,
we have $P(\{\omega'\}) = 0$. That is, $P(\{\omega\}) = 0$ for any $\omega \in \Omega$. We conclude

$$P(A_4) = \frac{1}{16} \ne 0 = \sum_{\omega \in A_4} P(\{\omega\}).$$

In general, we have

$$P(A) \ne \sum_{\omega \in A} P(\{\omega\}).$$

Therefore, this is not a discrete space.

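• (continued) We now show that Ω can be mapped onto [0, 1] by

$$\omega = (a_1, a_2, \cdots) \mapsto \sum_{i=1}^{\infty} \frac{a_i}{2^i} \in [0, 1].$$

This mapping is one to one except at the dyadic rationals in (0, 1), which have two binary expansions (for example, $1/2 = 0.1000\cdots = 0.0111\cdots$ in binary).

On the other hand, for each $t \in [0, 1)$, we have the binary expansion

$$t = \frac{a_1}{2} + \frac{a_2}{2^2} + \frac{a_3}{2^3} + \cdots.$$

Here $a_1, a_2, \cdots \in \{0, 1\}$ are defined as follows: $a_1$ is the integer part of $2t$. Define
$t_1 = 2t - a_1 \in [0, 1)$. We define $a_2$ to be the integer part of $2t_1$. We can continue this
process to define $a_3, a_4, \cdots$ and, in general, $a_n$ for $n = 2, 3, \cdots$. This defines a mapping

$$T : [0, 1) \to \Omega$$

by

$$T(t) = (a_1, a_2, \cdots).$$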

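Under this mapping, the event $A_4$ corresponds to $\bar{A}_4 \subset [0, 1]$,

$$\bar{A}_4 = \Big\{t;\ \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} \le t < \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \frac{1}{2^5} + \cdots\Big\} = \Big[\frac{15}{16}, 1\Big).$$

The length of $\bar{A}_4$ is $\frac{1}{16}$. We denote by $\bar{P}$ the probability measure on [0, 1]. We conclude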

P (A4 ) = P̄ (Ā4 ).

We can also check P (An ) = P̄ (Ān ). For any event A ⊂ Ω, we define Ā ⊂ [0, 1).
P̄ (Ā) is the Borel measure of Ā.

• (continued) $\bar{P}(B)$ can be defined for any Borel subset $B$ in [0, 1]. We define

$$F = \{A;\ T^{-1}(A) \in \mathcal{B}([0, 1])\}.$$

Here

$$T^{-1}(A) = \{t \in [0, 1);\ T(t) \in A\}.$$

We can show F is a σ-algebra on Ω. Define

$$P(A) = \bar{P}(T^{-1}(A)).$$

We can show this defines a probability measure.
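The digit-by-digit definition of T can be written out as a short procedure. The sketch below computes only the first n digits of T(t) and tests membership in the preimage of the event that the first four tosses are all Head; the function names are illustrative, and exact rationals are used so that no rounding occurs.

```python
from fractions import Fraction

def binary_digits(t, n):
    """First n digits a_1, ..., a_n of T(t) for t in [0, 1):
    a_i is the integer part of 2 * t_(i-1), and t_i = 2 * t_(i-1) - a_i."""
    digits = []
    for _ in range(n):
        t = 2 * t
        a = int(t)      # integer part, 0 or 1 because 0 <= t < 2 here
        digits.append(a)
        t -= a
    return digits

def partial_value(digits):
    """Partial sum a_1/2 + a_2/2**2 + ... + a_n/2**n."""
    return sum(Fraction(a, 2 ** i) for i, a in enumerate(digits, start=1))

def in_A4(t):
    """True when the first four tosses encoded by t are all Head (digit 1)."""
    return binary_digits(t, 4) == [1, 1, 1, 1]

print(binary_digits(Fraction(5, 8), 4))
print(in_A4(Fraction(15, 16)), in_A4(Fraction(29, 32)))
```

Points t with `in_A4(t)` true are exactly those with 15/16 ≤ t < 1, an interval of length 1/16, which matches P(A4).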

Continuous Spaces

• Examples of continuous sample spaces are Ω = R, $R^n$ or subsets of them. In a discrete
space it is natural to consider probabilities of sample points. In the continuous space R,
we consider probabilities of intervals and probability densities. Sums (in discrete
spaces) are replaced by integrals (in R).

• A probability density (on R) is a nonnegative integrable function $f$ satisfying

$$\int_{\mathbb{R}} f(x)\,dx = 1.$$

At this moment, we can assume $f$ is continuous, so that we can consider the integral
in the Riemann sense.

For an interval $A$ given by one of $[a, b]$, $[a, b)$, $(a, b]$, $(a, b)$, $P(A)$ is
defined by the Riemann integral

$$P(A) = \int_a^b f(x)\,dx.$$

For a general Borel set $A$,

$$P(A) = \int_A f(x)\,dx$$

is defined by the Lebesgue integral. This will be discussed later in the course.
For a sequence of disjoint intervals $\{I_n, n = 1, 2, \cdots\}$, to pass from the relation

$$\int_{\cup_{i=1}^{n} I_i} f(x)\,dx = \sum_{i=1}^{n} \int_{I_i} f(x)\,dx$$

to the relation

$$\int_{\cup_{i=1}^{\infty} I_i} f(x)\,dx = \sum_{i=1}^{\infty} \int_{I_i} f(x)\,dx$$

we need to prove the relation

$$\int_{\cup_{i=1}^{n} I_i} f(x)\,dx \to \int_{\cup_{i=1}^{\infty} I_i} f(x)\,dx, \quad n \to \infty.$$

This is a standard result in the theory of the Lebesgue integral. However, it is not an easy
result to obtain using the Riemann integral. Such limit theorems make the Lebesgue
integral very useful and widely used. The Lebesgue integral can
also be defined on a general probability space. This is the reason that we need to
discuss the Lebesgue integral in probability theory.

Examples

• We mention several important densities.

• The uniform density For $a < x < b$, define $f(x) = \frac{1}{b-a}$, and $f(x) = 0$ for other $x$.

This gives the uniform density on the interval $[a, b]$.


• An exponential density is given by

$$f(x) = \alpha e^{-\alpha x}, \quad x \ge 0,$$

and $f(x) = 0$ for $x < 0$. Here $\alpha > 0$ is the parameter of the exponential density.


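• The normal density is given by

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad x \in \mathbb{R}.$$

Here $\mu \in \mathbb{R}$ and $\sigma \in \mathbb{R}$, $\sigma \ne 0$, are parameters of the normal density.

We say standard normal distribution when $\mu = 0$, $\sigma = 1$ and the density is given by

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}, \quad x \in \mathbb{R}.$$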
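That each of these densities integrates to 1 can be checked numerically with a Riemann sum. The sketch below uses a midpoint rule on a truncated interval; the parameter values, the truncation bounds, and the function names are illustrative assumptions, and the normal density is taken in its standard form with variance σ².

```python
import math

def uniform_density(x, a, b):
    """f(x) = 1 / (b - a) for a < x < b, and 0 otherwise."""
    return 1.0 / (b - a) if a < x < b else 0.0

def exponential_density(x, alpha):
    """f(x) = alpha * exp(-alpha * x) for x >= 0, and 0 otherwise."""
    return alpha * math.exp(-alpha * x) if x >= 0 else 0.0

def normal_density(x, mu, sigma):
    """f(x) = exp(-(x - mu)**2 / (2 sigma**2)) / sqrt(2 pi sigma**2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def midpoint_integral(f, lo, hi, steps=100_000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Illustrative parameters; infinite ranges are truncated where the tails are negligible.
uniform_total = midpoint_integral(lambda x: uniform_density(x, 2.0, 5.0), 2.0, 5.0)
exponential_total = midpoint_integral(lambda x: exponential_density(x, 1.5), 0.0, 50.0)
normal_total = midpoint_integral(lambda x: normal_density(x, 1.0, 2.0), -19.0, 21.0)

# P([0, 1]) under the exponential density, compared with the closed form 1 - exp(-alpha).
interval_prob = midpoint_integral(lambda x: exponential_density(x, 1.5), 0.0, 1.0)

print(uniform_total, exponential_total, normal_total, interval_prob)
```

The last value illustrates the definition of P(A) for an interval A as the integral of the density over A.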

Homework 1-3

“Venkatesh” refers to the book:


Santosh S. Venkatesh (2013), The Theory of Probability: Explorations and Applications, Cambridge University Press

Due Date: In two weeks after we finish the discussion of this lecture note.

1. Reading assignment: P18-P24. Do the following:


(a) Take notes on the important concepts and results while reading the material.
(b) List the concepts or results about which you have questions. Look for answers to your
questions in other sources (for example, books or the internet).

2. Assume (Axiom 1), (Axiom 2), and the following property:

(Axiom 3)’ Let $C_1, C_2, \cdots$ be pairwise disjoint events. Then

$$P\Big(\bigcup_{n=1}^{\infty} C_n\Big) = \sum_{n=1}^{\infty} P(C_n).$$

Prove (Axiom 3) and (Axiom 4).

3. Let A, B, C be events. Prove

P (A∪B∪C) = (P (A)+P (B)+P (C))−(P (A∩B)+P (A∩C)+P (B∩C))+P (A∩B∩C).

4. Problem 32 in Section 10 (Problems) of Venkatesh’s book.
