
SMA 240: Probability and Statistics I

Lecture Notes, February 2023


Dr. Davis Bundi - [email protected]
Department of Mathematics, University of Nairobi
Comments: in case of any typos, please inform the author

Reference Books
1. An Introduction to Probability and Statistics. Rohatgi, V.K. and Saleh, A.K. Second
Edition, Wiley Eastern Limited, 2011.

2. Introduction to the Theory of Statistics. Mood, A., Graybill, F., and Boes, D.C.
Third Edition, London.

Contents

1 Review of Random Variables
  1.1 Random Variable
  1.2 Distribution Function
  1.3 Expectation
  1.4 Variance
  1.5 Chapter Problems

2 Moments and Moment Generating Functions
  2.1 Moments
  2.2 Moment Generating Functions
  2.3 Property of Moment Generating Function
  2.4 Markov and Chebyshev's Inequality
  2.5 Chapter Problems

3 Bivariate Probability Distribution
  3.1 Joint distributions
  3.2 Bivariate Functions
  3.3 Marginal distributions
  3.4 Conditional Distributions
  3.5 Independent Random Variables
  3.6 Bivariate Expectations
  3.7 Covariance and Correlation
  3.8 Chapter Problems

4 Linear Regression and Correlation Analysis
  4.1 Correlation
  4.2 Regression
  4.3 Chapter Problems

5 Distribution of Functions of Random Variables
  5.1 Cumulative distribution function technique
  5.2 Change of Variable Technique
  5.3 Chapter Problems

6 Derived Distributions
  6.1 Gamma Function
  6.2 Gamma Distribution

1 Review of Random Variables


1.1 Random Variable
A RV X is said to be discrete if, with probability one, it can take only a finite or countably
infinite number of possible values, that is,

Σ_{k=1}^{∞} P(X = x_k) = 1

X is a continuous RV if there exists a function f_X : ℜ → [0, ∞) such that

P(X ≤ x) = ∫_{−∞}^{x} f_X(s) ds

The probability density function (pdf) must satisfy

(a) f_X(x) ≥ 0 for all x ∈ ℜ

(b) ∫_{−∞}^{∞} f_X(x) dx = 1

(c) P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx for any a ≤ b

Exercise 1

1. Let X be a continuous RV with pdf

   f(x) = e^{−x},  x > 0
   f(x) = 0,       x ≤ 0

   Find P(X > a) for a > 0.

2. X has a probability mass function (pmf) given by

   f(x) = 1/4,  x = 0
   f(x) = 1/2,  x = 1
   f(x) = 1/4,  x = 2
   f(x) = 0     elsewhere

   Find (a) Σ_x f(x) and (b) show that P(X = 1) = 1/2.
1.2 Distribution Function
Let X be a RV defined on a sample space S. Consider the event E that X satisfies
−∞ < X ≤ x, where x is any real number. Then

P(E) = P[−∞ < X ≤ x] = P[X ≤ x] = F(x)

The function F(x) is called the Distribution Function or the Cumulative Distribution
Function (cdf) of the RV X. For a continuous RV X with pdf f(x),

F(x) = ∫_{−∞}^{x} f(t) dt

For a discrete RV X with pmf f(x),

F(x) = Σ_{t ≤ x} f(t)

1.2.1 Properties of cumulative distribution function

(a) 0 ≤ F(x) ≤ 1, since 0 ≤ P(X ≤ x) ≤ 1

(b) If a and b are any real numbers such that a ≤ b, then P(a < X ≤ b) = F(b) − F(a)

(c) F(x) is a non-decreasing function of x

(d) lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1

1.3 Expectation
Let X be a discrete RV which takes the values x_1, x_2, x_3, . . . and whose pmf is defined by
f(x_i) = P(X = x_i), i = 1, 2, 3, . . .
The expected value of X, denoted by E(X), is defined by

E(X) = Σ_{i=1}^{∞} x_i f(x_i)

If X is a continuous RV with pdf f(x), the expected value of X, E(X), is defined as

E(X) = ∫_{−∞}^{∞} x f(x) dx

More generally, if g(x) is a function of x, then

E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

1.3.1 Properties of Expectation

1. If g(X) = aX + b, where a, b ∈ ℜ, then
   E[g(X)] = E[aX + b] = aE(X) + b

2. Let g(X) and h(X) be any real-valued functions of X. Then
   E[a g(X) + b h(X)] = a E[g(X)] + b E[h(X)], where a, b ∈ ℜ
1.4 Variance
The variance of a RV X is expressed in terms of expected values as

Var(X) = E(X²) − [E(X)]²

where E(X²) = Σ_{i=1}^{∞} x_i² f(x_i) for a discrete RV, and

E(X²) = ∫_{−∞}^{∞} x² f(x) dx for a continuous RV.

Example 1 Given the pdf, find the expected value and the variance

f(x) = x/2,  0 ≤ x ≤ 2
f(x) = 0     elsewhere

Solution 1

(a) E(X) = ∫_0^2 x · (x/2) dx = ∫_0^2 (x²/2) dx = 4/3

(b) E(X²) = ∫_0^2 x² · (x/2) dx = ∫_0^2 (x³/2) dx = 2

(c) Var(X) = E(X²) − [E(X)]² = 2 − 16/9 = 2/9
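
As a quick numerical check of Example 1 (an illustrative sketch, not part of the original notes, assuming SciPy is available), the moments of f(x) = x/2 on [0, 2] can be approximated by numerical integration:

    from scipy.integrate import quad

    # pdf of Example 1: f(x) = x/2 on [0, 2], 0 elsewhere
    pdf = lambda x: x / 2

    ex,  _ = quad(lambda x: x * pdf(x), 0, 2)       # E(X)   -> 4/3
    ex2, _ = quad(lambda x: x**2 * pdf(x), 0, 2)    # E(X^2) -> 2
    var = ex2 - ex**2                               # Var(X) -> 2/9

    print(ex, ex2, var)   # approximately 1.3333, 2.0, 0.2222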

1.5 Chapter Problems


1. Find the value of the constant c such that the function is a density function

   f(x) = cx²,  0 ≤ x ≤ 3
   f(x) = 0     otherwise

   Then compute the following

   (a) P(1 < X < 2)
   (b) F(x)
   (c) E(X) and Var(X)

2. A continuous RV X has a pdf f(x) given by

   f(x) = k/4,  2 ≤ x ≤ 4
   f(x) = 0     elsewhere

   Find

   (a) The cdf of X, that is F(x)
   (b) If Y = X², find the cdf of Y, that is F(y)
   (c) P[X ≥ 1]

3. Suppose that X has a moment generating function

   M_X(t) = (1 − 2t)^{−1/2},  for t < 1/2

   Compute the standard deviation of the random variable X.

4. The cdf of a RV X is

   F(x) = 0,          x < −1
   F(x) = (x + 1)/2,  −1 ≤ x ≤ 1
   F(x) = 1,          x ≥ 1

   (a) Find P[X ≥ 1/2] and P[−1/2 < X ≤ 3/4]
   (b) What is the value of a such that P(X ≤ a) = 0.65?

5. A RV X has a pdf

   f(x) = cx,  1 ≤ x ≤ 3
   f(x) = 0    elsewhere

   Find the constant c and P[0 ≤ X ≤ 1].

6. The random variable X has a pmf given by

   f(x) = k(x + 3),  x = 0, 1, 2, 3
   f(x) = 0          elsewhere

   (a) Find the value of the constant k such that the distribution is a mass function
   (b) P(X = 3, 4)
   (c) Find the cdf, that is F(x)

2 Moments and Moment Generating Functions


The parameters µ and σ are important parameters that describe the center and the
spread of a random variable X. They do not, however, provide a unique characterization
of the distribution of X.

2.1 Moments

Definition 1 The k-th moment of a random variable X taken about the origin is defined
to be

E[X^k]

and is denoted by µ'_k.

From the above definition we can easily verify that the first moment about the origin is

E(X) = µ'_1 = µ,

the second moment about the origin is

E[X²] = µ'_2

and so on. In addition to taking moments about the origin, a moment of a random
variable can also be taken about the mean µ.

Definition 2 The k-th moment of a random variable X taken about its mean, or the k-th
central moment of X, is defined to be

E[(X − µ)^k]

and is denoted by µ_k.

The major use of moments is to approximate the probability distribution of a random
variable (usually an estimator or a decision maker). This further means that the
moments µ'_k, where k = 0, 1, 2, . . ., are primarily of theoretical value for k > 3.

2.2 Moment Generating Functions

A moment generating function is an interesting type of expectation of a random variable
which, figuratively speaking, packages all the moments of a random variable into one
simple expression.

Definition 3 The moment generating function M_X(t) of a random variable X is defined
to be

M_X(t) = E[e^{tX}]

We consider the moment generating functions of some probability distributions:

2.2.1 Bernoulli Distribution

Consider a Bernoulli distributed random variable X, with probability mass function

f(x) = p^x (1 − p)^{1−x};  x = 0, 1

The moment generating function is given by

M_X(t) = E[e^{tX}] = Σ_{x=0}^{1} e^{tx} p^x (1 − p)^{1−x}
       = (1 − p) + pe^t = pe^t + 1 − p = pe^t + q,  where q = 1 − p

2.2.2 Binomial Distribution

Consider a Binomially distributed random variable X, that is X ∼ Bin(n, p), with
probability mass function

f(x) = (n choose x) p^x (1 − p)^{n−x};  x = 0, 1, 2, ..., n

From the definition of the moment generating function, M_X(t) = E[e^{tX}], the moment
generating function of the Binomial random variable X is given by

M_X(t) = E[e^{tX}] = Σ_{x=0}^{n} e^{tx} f(x),  where f(x) is the probability mass function of X
       = Σ_{x=0}^{n} (n choose x) e^{tx} p^x (1 − p)^{n−x}
       = Σ_{x=0}^{n} (n choose x) (pe^t)^x (1 − p)^{n−x}
       = [pe^t + (1 − p)]^n = [pe^t + q]^n,  where q = 1 − p

2.2.3 Poisson Distribution

Consider a Poisson distributed random variable X with mean λ, that is, X ∼ Pois(λ), with
probability mass function

f(x) = e^{−λ} λ^x / x!;  x = 0, 1, 2, ...

The moment generating function M_X(t) of this random variable is given by

M_X(t) = E[e^{tX}] = Σ_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x!
       = e^{−λ} Σ_{x=0}^{∞} (λe^t)^x / x!
       = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}

2.2.4 Exponential Distribution

Consider an Exponentially distributed random variable X with probability density function

f(x) = λe^{−λx};  x > 0

The moment generating function is given by

M_X(t) = E[e^{tX}] = ∫_0^∞ e^{tx} λe^{−λx} dx = λ ∫_0^∞ e^{−(λ−t)x} dx
       = [λ/(t − λ)] e^{−(λ−t)x} |_0^∞ = [λ/(t − λ)](0 − 1) = λ/(λ − t),  for t < λ

2.2.5 Gamma Distribution

Consider a Gamma distributed random variable X, with probability density function

f(x) = [1 / (Γ(α) β^α)] x^{α−1} e^{−x/β};  0 < x < ∞, α > 0, β > 0

If λ = 1/β, then

f(x) = [λ^α / Γ(α)] x^{α−1} e^{−λx};  0 < x < ∞, α > 0, λ > 0

We use the two listed properties of the Gamma function. For any positive real number α:

(i) Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx

(ii) ∫_0^∞ x^{α−1} e^{−λx} dx = Γ(α)/λ^α, for λ > 0

The moment generating function of the Gamma distribution can be expressed as:

M_X(t) = E[e^{tX}] = ∫_0^∞ e^{tx} [λ^α / Γ(α)] x^{α−1} e^{−λx} dx
       = [λ^α / Γ(α)] ∫_0^∞ x^{α−1} e^{−x(λ−t)} dx

Let y = (λ − t)x, so that dx = dy/(λ − t) and x = y/(λ − t). We substitute to get:

M_X(t) = [λ^α / Γ(α)] ∫_0^∞ [y^{α−1}/(λ − t)^{α−1}] e^{−y} dy/(λ − t)
       = [λ^α / Γ(α)] [1/(λ − t)^α] ∫_0^∞ y^{α−1} e^{−y} dy
       = [λ^α / Γ(α)] [Γ(α)/(λ − t)^α]      (since ∫_0^∞ y^{α−1} e^{−y} dy = Γ(α))

M_X(t) = λ^α / (λ − t)^α

2.2.6 Normal Distribution

Consider a Normally distributed random variable X, that is X ∼ N(µ, σ²), with
probability density function

f(x) = [1/√(2πσ²)] exp{−(x − µ)²/(2σ²)};  −∞ < x < ∞

The moment generating function is calculated as follows

M_X(t) = E[e^{tX}] = ∫_{−∞}^{∞} e^{tx} [1/(√(2π)σ)] e^{−(x−µ)²/(2σ²)} dx

Using the completing-the-square identity given below,

M_X(t) = e^{µt + σ²t²/2} ∫_{−∞}^{∞} [1/(√(2π)σ)] e^{−[x − (µ + σ²t)]²/(2σ²)} dx

M_X(t) = e^{µt + σ²t²/2}

We note the following:

• ∫_{−∞}^{∞} [1/(√(2π)σ)] e^{−[x − (µ + σ²t)]²/(2σ²)} dx = 1, since the integrand is the N(µ + σ²t, σ²) density

• Completing the square in the exponent:

  tx − (1/(2σ²))(x² + µ² − 2xµ) = −(1/(2σ²))(x² + µ² − 2xµ − 2txσ²)
  = −(1/(2σ²))[x² + µ² + 2µtσ² + t²σ⁴ − 2xµ − 2xtσ²] + [tµ + t²σ²/2]
  = [tµ + t²σ²/2] − (1/(2σ²))[x − (µ + tσ²)]²
2.3 Property of Moment Generating Function

One of the most important properties of the moment generating function is that, if we can
find E[e^{tX}], we can find any of the moments of X.

Theorem 1 If M_X(t) exists, then for any positive integer k,

d^k M_X(t)/dt^k |_{t=0} = M_X^{(k)}(0) = µ'_k

That is, if you find the k-th derivative of M_X(t) with respect to t and then set t = 0, the
result will be µ'_k.

In the section below, we show how to calculate the mean and variance using moment
generating functions.
• Discrete Case: In the discrete case the moment generating function is in general
  given by

  M_X(t) = E[e^{tX}] = Σ_x e^{tx} f(x)

  Taking the first derivative we have

  dM_X(t)/dt = Σ_x x e^{tx} f(x)

  The derivative at t = 0 is

  dM_X(t)/dt |_{t=0} = Σ_x x f(x) = E(X)

  Taking the second derivative we have

  d²M_X(t)/dt² = Σ_x x² e^{tx} f(x)

  The derivative at t = 0 is

  d²M_X(t)/dt² |_{t=0} = Σ_x x² f(x) = E(X²)

• Continuous Case: In the continuous case the moment generating function is in
  general given by

  M_X(t) = E[e^{tX}] = ∫ e^{tx} f(x) dx

  Taking the first derivative we have

  dM_X(t)/dt = ∫ x e^{tx} f(x) dx

  The derivative at t = 0 is

  dM_X(t)/dt |_{t=0} = ∫ x f(x) dx = E(X)

  Taking the second derivative we have

  d²M_X(t)/dt² = ∫ x² e^{tx} f(x) dx

  The derivative at t = 0 is

  d²M_X(t)/dt² |_{t=0} = ∫ x² f(x) dx = E(X²)

Hence

M'(0) = E(X)  and  M''(0) = E(X²)

and

Var(X) = E(X²) − [E(X)]² = M''(0) − [M'(0)]²

Therefore

E(X) = M'(0)
Var(X) = M''(0) − [M'(0)]²

The examples below show how to calculate the mean and variance for three distributions.
Note that calculations for the other distributions are left as exercises.
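
The differentiation in Theorem 1 can also be carried out symbolically. The short sketch below (an illustration using SymPy, not part of the original notes) recovers the mean and variance of the exponential distribution from the mgf derived in Section 2.2.4:

    import sympy as sp

    t = sp.symbols('t')
    lam = sp.symbols('lambda', positive=True)

    # mgf of the exponential distribution (Section 2.2.4): lambda / (lambda - t)
    M = lam / (lam - t)

    mean = sp.diff(M, t).subs(t, 0)           # M'(0)  = 1/lambda
    second = sp.diff(M, t, 2).subs(t, 0)      # M''(0) = 2/lambda^2
    variance = sp.simplify(second - mean**2)  # 1/lambda^2

    print(mean, variance)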

Example 2 Consider a Bernoulli distribution with mass function

f(x) = p^x (1 − p)^{1−x},  x = 0, 1

Use the moment generating function to find the mean and variance.

Solution 2

M_X(t) = (1 − p) + pe^t

E(X) = dM_X(t)/dt |_{t=0} = M'(0) = pe^t |_{t=0} = p

E(X²) = d²M_X(t)/dt² |_{t=0} = M''(0) = pe^t |_{t=0} = p

Var(X) = E(X²) − [E(X)]² = p − p² = p(1 − p)

Example 3 Consider a Geometric distribution with mass function

f(x) = q^x p,  x = 0, 1, 2, ...

Find the moment generating function and the mean and variance.

Solution 3

M_X(t) = E[e^{tX}] = Σ_{x=0}^{∞} e^{tx} q^x p = p Σ_{x=0}^{∞} (qe^t)^x = p / (1 − qe^t)

E(X) = dM_X(t)/dt |_{t=0} = pqe^t / (1 − qe^t)² |_{t=0} = pq / (1 − q)² = pq/p² = q/p

E(X²) = d²M_X(t)/dt² |_{t=0} = [pqe^t(1 − qe^t)² + 2pq²e^{2t}(1 − qe^t)] / (1 − qe^t)^4 |_{t=0} = (pq + pq²)/p³

Var(X) = M''(0) − [M'(0)]² = (pq + pq²)/p³ − q²/p² = (q + q² − q²)/p² = q/p²

Example 4 Let X and Y be independent random variables. Show that E(XY) = E(X)E(Y).

Solution 4

E(XY) = ∫_y ∫_x xy f(x, y) dx dy = ∫_y ∫_x xy f(x) f(y) dx dy
      = ∫_x x [∫_y y f(y) dy] f(x) dx = ∫_x x f(x) dx ∫_y y f(y) dy = E(X)E(Y)

Example 5 Let X and Y be two independent random variables with mgf's M_X(t) and
M_Y(t) respectively. Obtain the mgf of Z = X + Y, that is, show that M_Z(t) = M_X(t)M_Y(t).

Solution 5

M_Z(t) = E[e^{tZ}] = E[e^{t(X+Y)}] = E[e^{tX + tY}] = E[e^{tX} e^{tY}]

Due to independence,

M_Z(t) = E[e^{tX}] E[e^{tY}] = M_X(t) M_Y(t)

2.4 Markov and Chebyshev's Inequality

First, we consider the Markov inequality and then use it to derive Chebyshev's inequality.

Markov Inequality
Let X be any non-negative continuous random variable. We can write:

E(X) = ∫_{−∞}^{∞} x f(x) dx
     = ∫_0^∞ x f(x) dx          (since X is non-negative)
     ≥ ∫_a^∞ x f(x) dx          (for any a > 0)
     ≥ ∫_a^∞ a f(x) dx          (since x ≥ a in the region of integration)
     = a ∫_a^∞ f(x) dx
     = a P(X ≥ a)

We write the Markov inequality as:

P(X ≥ a) ≤ E(X)/a,  for any a > 0

Chebyshev's Inequality
Let X be any random variable. If we define Y = [X − E(X)]², then Y is a non-negative
random variable. We apply Markov's inequality to Y, for any positive real number b:

P[Y ≥ b²] ≤ E(Y)/b²

But E(Y) = E[X − E(X)]² = Var(X), and

P[Y ≥ b²] = P([X − E(X)]² ≥ b²) = P(|X − E(X)| ≥ b)

So we can write Chebyshev's inequality as

P(|X − E(X)| ≥ b) ≤ Var(X)/b²,  for any b > 0

• Remark: If the variance is small, then X is unlikely to be too far from the mean.
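
As an informal illustration (a simulation sketch, not part of the original notes, assuming NumPy is available), one can compare the empirical tail probability P(|X − E(X)| ≥ b) with the Chebyshev bound Var(X)/b² for an exponential sample:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=2.0, size=100_000)     # E(X) = 2, Var(X) = 4

    b = 3.0
    empirical = np.mean(np.abs(x - x.mean()) >= b)   # empirical tail probability
    bound = x.var() / b**2                           # Chebyshev upper bound

    print(empirical, bound)   # the empirical value should not exceed the bound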

2.5 Chapter Problems


1. The moment generating function for the Gaussian distribution is given by M_X(t) =
   exp(µt + σ²t²/2). Find the expectation and variance of this distribution.

2. Each of 3 boxes contains 4 bolts, 3 square in shape and 1 circular in shape. A bolt
is chosen at random from each box. Find the probability that 3 circular bolts are
chosen.

3. If f(x) = λe^{−λx} for x ≥ 0 and zero elsewhere, find the moment generating function
   of f(x) and E(X).

4. If X = 1, 2, 3, . . . has the geometric distribution f(x) = pq^{x−1}, where q = 1 − p, show
   that the moment generating function is

   M(t) = pe^t / (1 − qe^t)

   Find E(X).

5. Find the moment generating function of f(x) = 1, where 0 < x < 1, and thereby
   confirm that E(X) = 1/2 and Var(X) = 1/12.

6. Find the moment generating function of the point binomial

   f(x) = p^x (1 − p)^{1−x}

   where x = 0, 1. What is the relationship between this and the moment generating
   function of the binomial distribution?

7. Calculate the E(x) and V ar(x) for the Gamma distribution in section (2.2.5) and
the Normal distribution in section (2.2.6)

3 Bivariate Probability Distribution


In real life, we are often interested in two (or more) random variables at the same time.
For example, we might measure the height and weight of people, or the income and food
expenditure of a group of workers, or the frequency of exercise and the rate of heart
disease in adults, or the level of air pollution and the rate of respiratory illness in cities. In
such situations the random variables have a joint distribution that allows us to compute
probabilities of events involving both variables and to understand the relationship between
the variables.
In this chapter, we focus on bivariate analysis, where exactly two measurements are
made on each observation. The two measurements will be called X and Y . Since X and
Y are obtained for each observation, the data for one observation is the pair (X, Y ). Let
X and Y be two random variables defined on the same sample space. Then, the ordered
pair (X, Y ) is called a two dimensional random variable.

Definition 4 Suppose that X and Y are random variables. The joint distribution, or bi-
variate distribution of X and Y is the collection of all probabilities of the form P [(X, Y ) ∈
C] for all sets C ⊂ ℜ2 such that {(X, Y ) ∈ C} is an event.

3.1 Joint distributions


3.1.1 Continuous Case

If (X, Y) is a continuous random vector, it can take any value in a rectangle

{(x, y) : a < x < b, c < y < d}

in the plane. The joint probability density function f(x, y) is a non-negative real valued
function defined on ℜ² such that

(i) P[(X, Y) ∈ ℜ²] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1

(ii) P[a < X < b, c < Y < d] = ∫_c^d ∫_a^b f(x, y) dx dy

(iii) P[(X, Y) ∈ R] = ∫∫_R f(x, y) dx dy, for any region R ⊂ ℜ²
Example 6 Consider the joint probability density function defined by

f(x, y) = k(6 − x − y),  0 < x < 2, 2 < y < 4
f(x, y) = 0              elsewhere

Find

(i) the value of the constant k
(ii) P(X < 1, Y < 3)
(iii) P(X + Y < 3)
Solution 6

(i) ∫_2^4 ∫_0^2 k(6 − x − y) dx dy = 1

    k ∫_2^4 [6x − x²/2 − xy]_0^2 dy = k ∫_2^4 (12 − 2 − 2y) dy
    = k[10y − y²]_2^4 = k[40 − 16 − 20 + 4] = 8k = 1

    k = 1/8

    f(x, y) = (1/8)(6 − x − y),  0 < x < 2, 2 < y < 4
    f(x, y) = 0                  elsewhere

(ii) P[X < 1, Y < 3]

    (1/8) ∫_2^3 ∫_0^1 (6 − x − y) dx dy = (1/8) ∫_2^3 [6x − x²/2 − xy]_0^1 dy
    = (1/8) ∫_2^3 (11/2 − y) dy = (1/8)[11y/2 − y²/2]_2^3
    = (1/8)[33/2 − 9/2 − 11 + 2] = (1/16)[33 − 9 − 22 + 4] = 3/8

(iii) P[X + Y < 3]

    = (1/8) ∫_2^3 ∫_0^{3−y} (6 − x − y) dx dy = (1/8) ∫_2^3 [6x − x²/2 − xy]_0^{3−y} dy
    = (1/8) ∫_2^3 (18 − 6y − 9/2 + 3y − y²/2 − 3y + y²) dy
    = (1/8) ∫_2^3 (27/2 − 6y + y²/2) dy = (1/8)[27y/2 − 3y² + y³/6]_2^3 = 5/24
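
As a numerical sanity check (an illustrative sketch, not part of the original notes, assuming SciPy is available), the total probability and part (iii) of Example 6 can be approximated with dblquad:

    from scipy.integrate import dblquad

    # joint density of Example 6 (k = 1/8 from the solution above)
    f = lambda x, y: (6 - x - y) / 8

    # total probability over 0 < x < 2, 2 < y < 4 -> should be 1
    # dblquad integrates f(x, y) with x as the inner variable here
    total, _ = dblquad(f, 2, 4, lambda y: 0, lambda y: 2)

    # P(X + Y < 3): for y in (2, 3) the inner x ranges over (0, 3 - y) -> 5/24
    p, _ = dblquad(f, 2, 3, lambda y: 0, lambda y: 3 - y)

    print(total, p)   # approximately 1.0 and 0.2083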
3.1.2 Discrete Case
The random vector (X, Y) is discrete if each of its components X and Y is discrete.
The joint probability mass function of X and Y is defined as:

f(x, y) = P(X = x and Y = y), satisfying

(i) f(x, y) ≥ 0 for all (x, y) ∈ ℜ²

(ii) Σ_x Σ_y f(x, y) = 1

Suppose that X can assume any one of m values x_1, x_2, . . . , x_m and Y can assume any
one of n values y_1, y_2, . . . , y_n. Then, the probability of the event that X = x_j and Y = y_k
is given by

P(X = x_j, Y = y_k) = f(x_j, y_k)

A joint probability function for X and Y can be represented by a joint probability table
as in Table 3.1.2. The probability that X = x_j is obtained by adding all entries in the
row corresponding to x_j and is given by

P(X = x_j) = f_1(x_j) = Σ_{k=1}^{n} f(x_j, y_k)

X \ Y     y_1           y_2           ...   y_n           Totals
x_1       f(x_1, y_1)   f(x_1, y_2)   ...   f(x_1, y_n)   f_1(x_1)
x_2       f(x_2, y_1)   f(x_2, y_2)   ...   f(x_2, y_n)   f_1(x_2)
...       ...           ...           ...   ...           ...
x_m       f(x_m, y_1)   f(x_m, y_2)   ...   f(x_m, y_n)   f_1(x_m)
Totals    f_2(y_1)      f_2(y_2)      ...   f_2(y_n)      1

For j = 1, 2, . . . , m, these are indicated by the entry totals in the extreme right-hand
column or margin of Table 3.1.2. Similarly, the probability that Y = y_k is obtained by
adding all entries in the column corresponding to y_k and is given by

P(Y = y_k) = f_2(y_k) = Σ_{j=1}^{m} f(x_j, y_k)

For k = 1, 2, . . . , n, these are indicated by the entry totals in the bottom row or margin
of Table 3.1.2. The probabilities f_1(x_j) and f_2(y_k) are the marginal probability functions
of X and Y, respectively.
It should also be noted that

Σ_{j=1}^{m} f_1(x_j) = 1,   Σ_{k=1}^{n} f_2(y_k) = 1

which can be written as

Σ_{j=1}^{m} Σ_{k=1}^{n} f(x_j, y_k) = 1

The joint distribution function of X and Y is defined by

F(x, y) = P(X ≤ x, Y ≤ y) = Σ_{u ≤ x} Σ_{v ≤ y} f(u, v)

In Table 3.1.2, F(x, y) is the sum of all entries for which x_j ≤ x and y_k ≤ y.

Example 7 The table shows the promotional status of police officers during the past two
years.
Promoted Not Promoted Total
Men 288 672 960
Women 36 204 240
Total 324 876 1200

• Let M be the event that an officer is a man

• Let W be the event that an officer is a woman

• Let A be the event that an officer is promoted

• Let A^c be the event that an officer is not promoted

Required

(a) Probability that an officer is a man and is promoted

(b) Probability that an officer is a woman and is not promoted

(c) Probability that an officer is promoted

Solution 7

(a) P(M ∩ A) = 288/1200 = 0.24

(b) P(W ∩ A^c) = 204/1200 = 0.17

(c) P(A) = 324/1200 = 0.27
3.2 Bivariate Functions

The joint distribution function (cdf), F(x, y), of two random variables X and Y defined
on the same sample space is given by

F(x, y) = P(X ≤ x and Y ≤ y),  −∞ < x < ∞, −∞ < y < ∞

If X and Y are continuous, then

F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(u, v) dv du

If F(x, y) is differentiable, then the joint probability density function of X and Y is

f(x, y) = ∂²F(x, y) / ∂y∂x

Example 8 If F(x, y) = x + y + 2xy, find f(x, y).

Solution 8

∂F(x, y)/∂x = 1 + 2y

and

∂²F(x, y)/∂y∂x = 2 = f(x, y)

3.3 Marginal distributions


Let X and Y be two continuous random variables with joint probability density function
f(x, y). Then

P[a < X < b] = ∫_a^b ∫_{−∞}^{∞} f(x, y) dy dx

The integral

f_1(x) = ∫_{−∞}^{∞} f(x, y) dy

is the marginal probability density function of X, while the integral

f_2(y) = ∫_{−∞}^{∞} f(x, y) dx

is the marginal probability density function of Y.

If X and Y are discrete, then

f_1(x) = Σ_y f(x, y)

The marginal distribution function of X is given by

F_1(x) = P(X ≤ x) = ∫_{−∞}^{x} f_1(t) dt, if X is continuous

and

F_1(x) = Σ_{t ≤ x} f_1(t), if X is discrete
Example 9 Let the joint probability density function of X and Y be given by

f(x, y) = 2,  0 < x < y < 1
f(x, y) = 0,  elsewhere

Find the marginal probability density functions of X and Y, that is f_1(x) and f_2(y).

Solution 9

(a) f_1(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_x^1 2 dy = 2(1 − x)

    f_1(x) = 2(1 − x),  0 ≤ x ≤ 1
    f_1(x) = 0,         elsewhere

(b) f_2(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^y 2 dx = 2y

    f_2(y) = 2y,  0 ≤ y ≤ 1
    f_2(y) = 0,   elsewhere
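
The marginals in Example 9 can be checked symbolically (an illustrative sketch using SymPy, not part of the original notes):

    import sympy as sp

    x, y = sp.symbols('x y')
    f = sp.Integer(2)                       # joint pdf of Example 9 on 0 < x < y < 1

    f1 = sp.integrate(f, (y, x, 1))         # marginal of X: 2*(1 - x)
    f2 = sp.integrate(f, (x, 0, y))         # marginal of Y: 2*y

    print(f1, f2, sp.integrate(f1, (x, 0, 1)))   # 2 - 2*x, 2*y, 1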

3.4 Conditional Distributions


Definition 5 Let X and Y be discrete random variables with joint probability mass func-
tion f(x, y). The probability that X takes the value x given that Y = y has been observed is
denoted by f(x/y) and is defined by

P[X = x / Y = y] = P[X = x and Y = y] / P(Y = y) = f(x, y) / f_2(y)

where f_2(y) is the marginal probability mass function of Y. The function

f(x/y) = f(x, y)/f_2(y),  if f_2(y) > 0
f(x/y) = 0,               if f_2(y) = 0

is called the conditional probability function of X given Y.

Example 10 Suppose X and Y are two discrete random variables with joint probability
mass function

f(x, y) = (1/54)(x + y),  x = 1, 2, 3; y = 1, 2, 3, 4
f(x, y) = 0               elsewhere

Calculate the following

(a) f(y/x), that is the conditional distribution of Y given X = x

(b) P(Y = 1 / X = 1)

(c) P(Y = 4 / X = 3)

(d) E[Y | X = x]
Solution 10

(a) We express the conditional distribution of Y given X = x as

    f(y/x) = f(x, y) / f_1(x)

    But f_1(x) = Σ_{y=1}^{4} (1/54)(x + y) = (1/54)(4x + 10) = (1/27)(2x + 5)

    f(y/x) = [(1/54)(x + y)] / [(1/54)(4x + 10)] = (x + y)/(4x + 10),  x = 1, 2, 3; y = 1, 2, 3, 4

(b) P(Y = 1 / X = 1) = f(1/1) = (1 + 1)/(4 + 10) = 2/14 = 1/7

(c) P(Y = 4 / X = 3) = f(4/3) = 7/22

(d) E[Y|X = x] = Σ_y y(x + y)/(4x + 10) = (x + 1 + 2x + 4 + 3x + 9 + 4x + 16)/(4x + 10) = (5x + 15)/(2x + 5)

3.5 Independent Random Variables

Two random variables X and Y are said to be statistically independent if for any two
sets A and B of real numbers

P[X ∈ A and Y ∈ B] = P[X ∈ A] P[Y ∈ B]

It follows that if X and Y are independent random variables with joint probability density
function f(x, y), then

f(x, y) = f_1(x) f_2(y)

Example 11 Let X and Y have the joint probability density function

f(x, y) = 2e^{−(x+2y)},  x ≥ 0, y ≥ 0
f(x, y) = 0              elsewhere

Show that X and Y are independent.

Solution 11 The two random variables are independent if

f(x, y) = f_1(x) f_2(y) = 2e^{−(x+2y)}

The marginal density function of X is given by

f_1(x) = ∫_0^∞ 2e^{−(x+2y)} dy = 2 [e^{−(x+2y)}/(−2)]_0^∞ = −[e^{−∞} − e^{−x}] = e^{−x}

The marginal density function of Y is given by

f_2(y) = ∫_0^∞ 2e^{−(x+2y)} dx = 2 [−e^{−(x+2y)}]_0^∞ = −2[e^{−∞} − e^{−2y}] = 2e^{−2y}

Therefore the random variables X and Y are independent, since

f(x, y) = f_1(x) f_2(y) = e^{−x} · 2e^{−2y} = 2e^{−(x+2y)}
3.6 Bivariate Expectations

Let X and Y have joint probability density function f(x, y) and let U(X, Y) be a function
of X and Y. The expected value of U(X, Y) is expressed as

E[U(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} U(x, y) f(x, y) dx dy

If X and Y are discrete, then

E[U(X, Y)] = Σ_y Σ_x U(x, y) f(x, y)

The moments of X about X = 0 are

E(X^r) = ∫_{−∞}^{∞} x^r f_1(x) dx

where f_1(x) is the marginal probability density function of X. The central moments of
X are

E[(X − µ_X)^r] = ∫_{−∞}^{∞} (x − µ_X)^r f_1(x) dx

The joint product moments of X and Y about the point (0, 0) are

E[X^r Y^s] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x^r y^s f(x, y) dx dy

3.6.1 Conditional Expectation

Definition 6 If X and Y have joint density function f(x, y), the conditional density
function of Y given X is f(y|x) = f(x, y)/f_1(x), where f_1(x) is the marginal density
function of X. We define the conditional expectation, or conditional mean, of Y
given X by

E(Y | X = x) = ∫_{−∞}^{∞} y f(y|x) dy

and

E(X | Y = y) = ∫_{−∞}^{∞} x f(x|y) dx

We note the following properties

(a) E(Y | X = x) = E(Y) when X and Y are independent

(b) E(Y) = ∫_{−∞}^{∞} E(Y | X = x) f_1(x) dx  (the law of total expectation)

Example 12 A miner is trapped in a mine containing 3 doors. The first door leads to a
tunnel which takes him to safety after 2 hours of travel. The second door leads to a tunnel
which returns him to the mine after 3 hours of travel. The third door leads to a tunnel
which returns him to the mine after 5 hours. Assuming he is at all times equally likely
to choose any of the doors, what is the expected length of time until the miner reaches
safety?

Solution 12 Let X be the time to reach safety (hours) and Y be the door (1, 2 or 3)
initially chosen. Then,

E(X) = E(X|Y = 1)P(Y = 1) + E(X|Y = 2)P(Y = 2) + E(X|Y = 3)P(Y = 3)
     = (1/3){E(X|Y = 1) + E(X|Y = 2) + E(X|Y = 3)}

Now

E(X|Y = 1) = 2
E(X|Y = 2) = 3 + E(X)
E(X|Y = 3) = 5 + E(X)

So

E(X) = (1/3){2 + 3 + E(X) + 5 + E(X)}, which gives E(X) = 10

3.7 Covariance and Correlation


Let X and Y be two jointly distributed random variables with means µ_X and µ_Y. Then
the covariance of (X, Y), written Cov(X, Y), is expressed as

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = σ_{XY}
          = E[XY] − µ_X E(Y) − µ_Y E(X) + µ_X µ_Y
          = E(XY) − µ_X µ_Y = E(XY) − E(X)E(Y)

Covariance measures the degree of association between X and Y. Suppose that X and Y
are jointly distributed random variables with finite variances. The correlation coefficient
between X and Y is denoted by

ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y) = Cov(X, Y) / [std(X) std(Y)]

where −1 ≤ ρ(X, Y) ≤ 1.
If the random variables X and Y are independent, then

Cov(X, Y) = E(XY) − E(X)E(Y) = E(X)E(Y) − E(X)E(Y) = 0

and this implies that the correlation coefficient is also zero, since

ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y) = 0 / (σ_X σ_Y) = 0

That is, independent random variables are uncorrelated.
3.7.1 Properties of Covariance
Let X and Y be two random variables, then

(a) Cov(X, Y ) = Cov(Y, X)

(b) Cov(X, X) = E(X 2 ) − E(X)E(X) = Var(X)

(c) Cov(aX + b, Y ) = aCov(X, Y )

Where a and b are arbitrary constants.

Example 13 Suppose X and Y are jointly distributed with joint probability density function

f(x, y) = (1/8)(x + y),  0 < x < 2, 0 < y < 2
f(x, y) = 0              otherwise

Find the correlation coefficient between X and Y.

Solution 13 First, find E(XY), E(X), E(Y), Var(X) and Var(Y).

E(X) = ∫_0^2 x f_1(x) dx = 7/6

and

E(Y) = ∫_0^2 y f_2(y) dy = 7/6

By symmetry, we also find that

E(X²) = E(Y²) = (1/8) ∫_0^2 ∫_0^2 x²(x + y) dy dx = 5/3

and so

Var(X) = Var(Y) = 5/3 − 49/36 = 11/36

The expected value of XY is given by

E(XY) = (1/8) ∫_0^2 ∫_0^2 xy(x + y) dx dy = 4/3

while the covariance between X and Y is

Cov(X, Y) = 4/3 − 49/36 = −1/36

For the correlation coefficient between X and Y,

ρ(X, Y) = (−1/36) / [√(11/36) √(11/36)] = −1/11

Therefore, X and Y are negatively correlated.
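
The moments in Example 13 can also be obtained symbolically (an illustrative sketch using SymPy, not part of the original notes):

    import sympy as sp

    x, y = sp.symbols('x y')
    f = (x + y) / 8                       # joint pdf of Example 13 on (0,2) x (0,2)

    def E(g):
        """Exact expectation of g(X, Y) under f."""
        return sp.integrate(g * f, (x, 0, 2), (y, 0, 2))

    ex, ey, exy = E(x), E(y), E(x * y)
    var_x, var_y = E(x**2) - ex**2, E(y**2) - ey**2
    rho = (exy - ex * ey) / sp.sqrt(var_x * var_y)

    print(rho)                            # -1/11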
3.8 Chapter Problems
1. Show that if X and Y are independent, then, E[X/Y = y] = E[X] for all y.

2. Suppose that X and Y are jointly distributed with joint probability density function

   f(x, y) = k(x + y),  0 < x < 2, 0 < y < 2
   f(x, y) = 0          elsewhere

   (a) Find the value of the constant k
   (b) Find the marginal density functions of X and Y
   (c) Find f(x/y) and f(y/x)
   (d) Are X and Y independent?
   (e) Calculate P[X < 1, Y > 3]
   (f) Calculate E(XY), E(X) and E(Y)

3. Suppose that X and Y are random variables whose jpdf has a moment generating
   function

   M(t_1, t_2) = (1/4)e^{t_1} + (3/8)e^{t_2} + 3/8

   for all real t_1 and t_2. Find Cov(X, Y).

4. The joint density function of two continuous random variables X and Y is

   f(x, y) = cxy,  0 < x < 4, 1 < y < 5
   f(x, y) = 0     otherwise

   Find

   (a) The value of the constant c
   (b) P(X ≥ 3, Y ≤ 2) and P(1 < X < 2, 2 < Y < 3)
   (c) f_1(x) and f_2(y)
   (d) E(XY), F(x, y) and P(X < 2 / Y = 2)

5. The cumulative distribution function for the joint distribution of the continuous
   random variables X and Y is

   F(x, y) = (1/5)(3x³y + 2x²y²),  0 < x < 1, 0 < y < 1

   (a) Find f(x, y) and f(0, 0.5)
   (b) Show that f(x, y) is a complete probability density function
   (c) Find the marginal probability density functions of X and Y
   (d) Find P(0 < X < 1) and P(0 < Y < 1)
6. Let X and Y be two random variables with joint probability mass function

   f(x, y) = k(x + 2y),  x = 1, 2; y = 1, 2, 3
   f(x, y) = 0           otherwise

   Find
   (a) The value of the constant k
   (b) E(XY²) and Var(X)
   (d) P(X = 1, Y = 1, 2) and P(X = 1 / Y = 1, 2)
   (f) f(y/x), E(Y) and E(Y/x)

7. Let X and Y be jointly distributed random variables with joint probability density
   function

   f(x, y) = p^{x+y}(1 − p)^{2−x−y},  x = 0, 1; y = 0, 1
   f(x, y) = 0                        otherwise

   Find the covariance and correlation coefficient between X and Y.

8. A die and a coin are each tossed once. Write the possible outcomes in a joint
   probability table.

   (a) What is the probability of a head and a number greater than 3 from the die?
   (b) What is the probability of a tail and an even number from the die?

9. Suppose X and Y are two independent random variables having probability density
   functions of the form

   f(x) = 2(1 − x),  0 ≤ x ≤ 1
   f(x) = 0          otherwise

   and

   f(y) = 2(1 − y),  0 ≤ y ≤ 1
   f(y) = 0          otherwise

   (a) Find P[X + Y ≤ 1] and P[X ≤ 1/2, Y ≤ 1]
   (b) What is the relationship between E(XY), E(X) and E(Y)?

4 Linear Regression and Correlation Analysis


Correlation is a statistical method used to determine whether a relationship between
variables exists. Regression is a statistical method used to describe the nature of the
relationship between variables. Some of the questions answered by correlation and
regression are:

• Is there a relationship between the number of hours a student studies and the
  student's score on a particular exam?

• Is there a relationship between a person's age and his or her blood pressure?

• Is caffeine related to heart damage?

4.1 Correlation

To test for correlation, we use the correlation coefficient (r) to determine the strength
of the linear relationship between two variables. We use the Pearson product moment
correlation coefficient.

• The range of the correlation coefficient is from −1 to +1.

• If there is a strong positive linear relationship between the variables, the value of r
  will be close to +1.

• If there is a strong negative linear relationship between the variables, the value of r
  will be close to −1.

• When there is no linear relationship between the variables or only a weak relationship,
  the value of r will be close to 0.

Formula for the Correlation Coefficient, r

r = [n(Σxy) − (Σx)(Σy)] / √{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}

where n is the number of data pairs. We refer to y as the dependent variable and x as
the independent variable. A scatter plot can be used to visualize the relationship even
before the calculation of the correlation coefficient.

Example
The data shows the number of cars rental companies have and their respective annual
income. Use the data to calculate the correlation coefficient and interpret its meaning.

Company   Cars (in ten thousands)   Revenue (in billions $)
A         63.0                      7.0
B         29.0                      3.9
C         20.8                      2.1
D         19.1                      2.8
E         13.4                      1.4
F         8.5                       1.5

Find the values of xy, x² and y². Then find the sum of each column. Since Revenue
depends on the number of cars the company has, Revenue is the dependent variable
(y) and Cars is the independent variable (x).

Company   Cars (x)   Revenue (y)   xy        x²        y²
A         63.0       7.0           441.00    3969.00   49.00
B         29.0       3.9           113.10    841.00    15.21
C         20.8       2.1           43.68     432.64    4.41
D         19.1       2.8           53.48     364.81    7.84
E         13.4       1.4           18.76     179.56    1.96
F         8.5        1.5           12.75     72.25     2.25
Totals    Σx = 153.8  Σy = 18.7    Σxy = 682.77  Σx² = 5859.26  Σy² = 80.67

We substitute the values in the table into the formula and solve for r:

r = [n(Σxy) − (Σx)(Σy)] / √{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}
  = [6(682.77) − (153.8)(18.7)] / √{[6(5859.26) − (153.8)²][6(80.67) − (18.7)²]}
  = 0.982

• The correlation coefficient suggests a strong positive relationship between the number of
  cars a rental company has and its annual income.

• Therefore, as the number of cars increases (decreases), the annual income increases
  (decreases).

• Remark

  – For a negative correlation coefficient, an increase in x leads to a decrease in y
  – For a positive correlation coefficient, an increase in x leads to an increase in y
  – For a zero correlation coefficient, an increase/decrease in x has no influence on y

4.2 Regression

If there is a negative or positive correlation coefficient, the next step is to determine the
equation of the regression line, which is the line of best fit. The purpose of the regression
line is to enable the researcher to see the trend and make predictions based on the data.

Regression Line Equation

The regression line is given by y = a + bx, where a is the y intercept and b is the slope of
the line. To calculate the values of a and b:

a = [(Σy)(Σx²) − (Σx)(Σxy)] / [n(Σx²) − (Σx)²]

b = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]
Example
We revisit the example of the number of cars the rental company has and the annual
income made by the company. To estimate the regression line, we compute the values of
a and b.
The values needed for the equations are n = 6, Σx = 153.8, Σy = 18.7, Σxy = 682.77 and
Σx² = 5859.26. We substitute the values into the formulas to get:

a = [(Σy)(Σx²) − (Σx)(Σxy)] / [n(Σx²) − (Σx)²]
  = [(18.7)(5859.26) − (153.8)(682.77)] / [(6)(5859.26) − (153.8)²] = 0.396

b = [n(Σxy) − (Σx)(Σy)] / [n(Σx²) − (Σx)²]
  = [6(682.77) − (153.8)(18.7)] / [(6)(5859.26) − (153.8)²] = 0.106

The equation of the regression line y = a + bx is

y = 0.396 + 0.106x

We can use the regression line to predict the values of y given the values of x, that is,
predict the annual income given the number of cars. For example, let x = 40 (cars, in ten
thousands):

y = 0.396 + 0.106(40) = 4.636

Thus, with 400,000 cars, the company makes 4.636 billion dollars per year, written as
(40, 4.636).
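
These hand calculations can be cross-checked with a least-squares fit (an illustrative sketch, not part of the original notes, assuming NumPy is available); numpy.polyfit returns the same slope and intercept to rounding:

    import numpy as np

    cars = np.array([63.0, 29.0, 20.8, 19.1, 13.4, 8.5])      # x, in ten thousands
    revenue = np.array([7.0, 3.9, 2.1, 2.8, 1.4, 1.5])        # y, in billions of dollars

    b, a = np.polyfit(cars, revenue, 1)        # slope first, then intercept
    r = np.corrcoef(cars, revenue)[0, 1]       # Pearson correlation coefficient

    print(round(a, 3), round(b, 3), round(r, 3))   # 0.396, 0.106, 0.982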

4.3 Chapter Problems


1. Calculate the value of the correlation coefficient for the number of hours a person
exercises and the amount of milk a person consumes per week.

Subject Hours of Exercise (X) Amount of Milk Consumed (Y)


A 3 48
B 0 8
C 2 32
D 5 64
E 8 10
F 5 32
G 10 56
H 2 72
I 3 48

2. The number of calories and the number of milligrams of cholesterol for a random
sample of fast-food chicken sandwiches from seven restaurants are shown here. Is
there a relationship between the variables?

Calories x 390 535 720 300 430 500 440


Cholesterol y 43 45 80 50 55 52 60

3. The number of forest fires and the number of acres burned are as follows:

Fires x 72 69 58 47 84 62 57 45
Acres y 62 41 19 26 51 15 30 15

Find y when x = 60

4. In Question 2 above, find y when x = 600 calories.

5 Distribution of Functions of Random Variables


In this section, we will learn how to find the probability distribution of functions of random
variables. For example, we might know the probability density function of X, but want to
know the probability density function of u(X) = X 2 . The two techniques to be discussed
are:

(a) Cumulative Distribution function (cdf) technique

(b) Change of variable (Jacobian) technique

Suppose that a random variable X has a discrete distribution and U = Φ(X) is another
random variable which is a function of X. Then, for any value u, we have

g(u) = P(U = u) = P(Φ(X) = u) = Σ_{x : Φ(x) = u} f(x)

If X has a continuous distribution and U = Φ(X) is another random variable which
is a function of X, then, for any value u, we have

G(u) = P(U ≤ u) = P(Φ(X) ≤ u) = ∫_{x : Φ(x) ≤ u} f(x) dx

5.1 Cumulative distribution function technique

The cumulative distribution function (cdf) technique is applied to a univariate distribution.
If X is continuous and G(u) is the cdf of U = Φ(X), then

G(u) = P(U ≤ u) = P(Φ(X) ≤ u) = ∫_{x : Φ(x) ≤ u} f(x) dx

If G(u) is differentiable, then the pdf of U is

g(u) = (d/du) G(u)

This method of getting the pdf of U is called the cdf technique. If Φ(X) is a continuous
and strictly increasing or decreasing function of X over the interval (a, b), then U will
vary over some interval (α, β) as X varies over the interval (a, b), and the inverse function
W will be strictly increasing or decreasing over the interval (α, β). The pdf of U is given by

g(u) = f(W(u)) |(d/du) W(u)|

where U = Φ(X) if and only if X = W(U).

Example 14 Let X be a random variable with pdf

f(x) = (1/2)x,  0 < x < 2
f(x) = 0        elsewhere

Find the pdf of U = 1 − X².

Solution 14 We consider the five steps in using the cdf method:

• Step 1: Find the interval for the random variable U
  We transform the interval of x, 0 < x < 2, to obtain the interval of u.
  u is continuous and strictly decreasing over the interval (0, 2). That is,

  x = 0 ⟹ u = 1  and  x = 2 ⟹ u = −3

  As x varies over (0, 2) = (a, b), u varies over (−3, 1) = (α, β).

• Step 2: Find the inverse function
  The inverse function is

  x = w(u) = (1 − u)^{1/2},  −3 < u < 1

• Step 3: Find the derivative of w(u)
  Differentiating with respect to u,

  (d/du) w(u) = −(1/2)(1 − u)^{−1/2},  so |(d/du) w(u)| = 1 / [2(1 − u)^{1/2}]

• Step 4: Find f(w(u))

  f(w(u)) = (1/2)(1 − u)^{1/2}

• Step 5: Find g(u) = f(w(u)) |(d/du) w(u)|
  The pdf of u is expressed as

  g(u) = (1/2)(1 − u)^{1/2} · 1/[2(1 − u)^{1/2}] = 1/4

  g(u) = 1/4,  −3 < u < 1
  g(u) = 0     elsewhere
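
A quick Monte Carlo check of Example 14 (an illustrative sketch, not part of the original notes, assuming NumPy is available): sampling X from f(x) = x/2 by the inverse-transform method and transforming to U = 1 − X² should give a flat histogram of height 1/4 on (−3, 1).

    import numpy as np

    rng = np.random.default_rng(1)

    # inverse-transform sampling: F(x) = x^2/4 on (0, 2), so X = 2*sqrt(V) with V ~ U(0, 1)
    v = rng.uniform(size=200_000)
    x = 2 * np.sqrt(v)

    u = 1 - x**2                                    # transformed variable of Example 14
    hist, edges = np.histogram(u, bins=20, range=(-3, 1), density=True)

    print(hist.round(3))   # each entry should be close to 0.25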
5.2 Change of Variable Technique
This is another method of obtaining distribution functions of functions of random variables.
Let X be a continuous random variable with pdf f(x).
Let Y = µ(X) be some function of X, for X ∈ A.
This implies that X = µ^{−1}(Y) = w(Y).
By the change of variable technique, the pdf of Y is given as:

g(y) = f(w(y)) |J|,  y ∈ B
g(y) = 0,            elsewhere

where J is the Jacobian of the transformation from A to B, J = dx/dy.
A is the domain of the random variable X and B is the domain of the random variable Y.

Example 15 Let X be a random variable with pdf

f(x) = k(x + 1),  0 < x < 2
f(x) = 0          elsewhere

where k is a constant. Find the distribution of Y = X².

Solution 15 We first need to obtain k.

We solve the integral ∫_0^2 f(x) dx = 1:

k[x²/2 + x]_0^2 = 1  ⟹  k(2 + 2) = 1  ⟹  k = 1/4

The pdf of X is expressed as:

f(x) = (1/4)(x + 1),  0 < x < 2
f(x) = 0,             otherwise

There are five steps to follow while using the change of variable technique.

• Step 1: Find the interval for y
  We know the interval for x, which is 0 < x < 2, and we can find the interval for y.
  Given that Y = X², then

  X = 0 ⟹ Y = 0  and  X = 2 ⟹ Y = 4

  Now

  Y = X² = µ(X)  ⟹  B = {y : 0 < y < 4}

• Step 2: Find the inverse function
  The inverse transformation is given by

  X = µ^{−1}(Y) = √Y

  We write w(y) := µ^{−1}(y), that is, w(y) = √y.

• Step 3: Find the Jacobian
  The Jacobian of the transformation is obtained as

  J = dx/dy = (d/dy)√y = (1/2)y^{−1/2} = 1/(2√y)  ⟹  |J| = 1/(2√y)

• Step 4: Find f(w(y))

  f(w(y)) = (1/4)[w(y) + 1] = (1/4)(√y + 1)

• Step 5: The distribution of Y is

  g(y) = f(w(y)) |J|,  y ∈ B

  g(y) = (1/4)(√y + 1) · 1/(2√y),  0 < y < 4

  g(y) = (1/8)(1 + 1/√y),  0 < y < 4
  g(y) = 0,                elsewhere

Example 16 Let X be a discrete random variable with probability mass function

f(x) = 2(1/3)^x,  x = 1, 2, 3, . . .
f(x) = 0          elsewhere

Find the distribution of Y = X³ + 2.

Solution 16 We consider the five steps in the case of a pmf:

• Step 1: Find the values of the random variable Y

  A = {x : x = 1, 2, 3, ...}  and  B = {y : y = 3, 10, 29, ...}

• Step 2: Find the inverse function

  w(y) = x = (y − 2)^{1/3}

• Step 3: Find the Jacobian
  Since X is discrete, we do not have to compute the Jacobian of the transformation.

• Step 4: Find f(w(y))

  f(w(y)) = 2(1/3)^{w(y)}

• Step 5: Find the distribution of Y
  The pmf of Y is given by

  g(y) = f(w(y)) = 2(1/3)^{w(y)},  y = 3, 10, 29, ...;  y ∈ B

  Thus, the pmf of Y is given as

  g(y) = 2(1/3)^{(y−2)^{1/3}},  y = 3, 10, 29, . . .
  g(y) = 0                      elsewhere
5.2.1 Bivariate Distribution Cases

Suppose X and Y are jointly distributed random variables with a joint pdf f(x, y). We define
two new random variables with the transformation

U = Φ(X, Y)  and  V = Φ_1(X, Y)

which is one-to-one and maps the set S of (X, Y) onto B, where B is the set of (U, V). This
transformation is invertible and hence

X = W_1(U, V)  and  Y = W_2(U, V)

S is a subset of the (X, Y) plane, while B is a subset of the (U, V) plane. The Jacobian of x, y
with respect to u and v is shorthand for the 2 × 2 determinant

J = det [ ∂x/∂u  ∂x/∂v ]
        [ ∂y/∂u  ∂y/∂v ]

and is called the Jacobian of the transformation. The joint pdf g(u, v) is given by

g(u, v) = f[W_1(u, v), W_2(u, v)] |J|,  (u, v) ∈ B
g(u, v) = 0                             elsewhere

The Jacobian determinant is used when making a change of variables and evaluating
a multiple integral of a function over a region within its domain. To accommodate the
change of coordinates, the magnitude of the Jacobian determinant appears as a multiplicative
factor within the integral.
Suppose that X_1, X_2, ..., X_n are jointly distributed random variables with a joint pdf

f(x_1, x_2, ..., x_n);  (X_1, X_2, ..., X_n) ∈ A.

Let Y_1, Y_2, ..., Y_n define a one-to-one transformation

y_1 = µ_1(x_1, x_2, ..., x_n)
y_2 = µ_2(x_1, x_2, ..., x_n)
...
y_n = µ_n(x_1, x_2, ..., x_n)

We find the joint pdf of Y_1, Y_2, ..., Y_n as follows:

g(y_1, y_2, ..., y_n) = f(w_1(y_1, ..., y_n), ..., w_n(y_1, ..., y_n)) |J|,  (Y_1, ..., Y_n) ∈ B
g(y_1, y_2, ..., y_n) = 0,                                                  otherwise

where |J| is the absolute value of the n × n determinant

J = det [ ∂x_1/∂y_1  ∂x_1/∂y_2  ...  ∂x_1/∂y_n ]
        [ ∂x_2/∂y_1  ∂x_2/∂y_2  ...  ∂x_2/∂y_n ]
        [    ...         ...    ...     ...    ]
        [ ∂x_n/∂y_1  ∂x_n/∂y_2  ...  ∂x_n/∂y_n ]

Note: If X_1, X_2, ..., X_n were discrete random variables, we would omit |J| in the joint
pmf g(·) of Y_i, i = 1, 2, ..., n.

Example 17 Let X and Y have a joint pdf given by

f(x, y) = 1,  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
f(x, y) = 0   elsewhere

Find the distribution of U = X + Y and V = Y − X.

Solution 17 We consider the five steps in the case of a joint pdf:

• Step 1: Find the inverse functions
  U = X + Y and V = Y − X. We solve the two to get X = (U − V)/2 and Y = (U + V)/2.

• Step 2: Find the intervals
  When

  X = 0 ⟹ U − V = 0  and  X = 1 ⟹ U − V = 2

  while

  Y = 0 ⟹ U + V = 0  and  Y = 1 ⟹ U + V = 2

  Thus, 0 < u − v < 2 and 0 < u + v < 2.

• Step 3: Find the Jacobian
  The Jacobian of the transformation is given by

  J = det [ ∂x/∂u  ∂x/∂v ] = det [ 1/2  −1/2 ] = (1/2)(1/2) − (−1/2)(1/2) = 1/2
          [ ∂y/∂u  ∂y/∂v ]       [ 1/2   1/2 ]

  This means that |J| = 1/2.

• Step 4: Find f(w_1(u, v), w_2(u, v))

  f(w_1(u, v), w_2(u, v)) = 1

• Step 5: Find g(u, v)

  g(u, v) = f(w_1(u, v), w_2(u, v)) |J| = 1/2

  The joint pdf of U and V is given by

  g(u, v) = 1/2,  0 < u − v < 2, 0 < u + v < 2
  g(u, v) = 0     elsewhere
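
The Jacobian in Step 3 can also be computed symbolically (an illustrative sketch using SymPy, not part of the original notes):

    import sympy as sp

    u, v = sp.symbols('u v')
    x = (u - v) / 2            # inverse transformation from Step 1
    y = (u + v) / 2

    J = sp.Matrix([[sp.diff(x, u), sp.diff(x, v)],
                   [sp.diff(y, u), sp.diff(y, v)]]).det()

    print(abs(J))              # 1/2, so g(u, v) = f(x, y) * |J| = 1/2 on the image region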

5.3 Chapter Problems


1. The probability density function of the random variable X is given by

   f(x) = x²/81,  −3 < x < 6
   f(x) = 0       elsewhere

   Find the probability density function of the random variable U = (1/3)(12 − X).

2. If the random variables X and Y have joint density function

   f(x, y) = xy/96,  0 < x < 4, 1 < y < 5
   f(x, y) = 0       elsewhere

   (a) Find the joint density function of U = X + Y and V = Y − X

3. The probability function of a random variable X is

   f(x) = 2^{−x},  x = 1, 2, 3, . . .
   f(x) = 0        elsewhere

   Find the probability function of the random variable U = X⁴ + 1.

4. Let X and Y have joint density function

   f(x, y) = e^{−(x+y)},  x ≥ 0, y ≥ 0
   f(x, y) = 0            elsewhere

   If U = X/Y and V = X + Y, find the joint density function of U and V.

5. Let X have the density function

   f(x) = e^{−x},  x > 0
   f(x) = 0,       x ≤ 0

   Find the density function of Y = X².

6. Let f(x, y) be the joint density function of X and Y,

   f(x, y) = 1,  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
   f(x, y) = 0   otherwise

   Find the density function of Z = XY.

7. Let the joint probability density function of X and Y be given by

   f(x, y) = 2e^{−x−2y},  x > 0, y > 0
   f(x, y) = 0            otherwise

   Find the probability density of Z = X + Y, given that Z is a linear function of X
   and Y.

6 Derived Distributions

6.1 Gamma Function

The Gamma function Γ(x) is an extension of the factorial function to real (and complex)
numbers. If n ∈ {1, 2, 3, . . .}, then

Γ(n) = (n − 1)!

More generally, for any positive real number α, Γ(α) is defined as

Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx,  for α > 0

Note that for α = 1, we can write

Γ(1) = ∫_0^∞ e^{−x} dx = 1

Using integration by parts, it can be shown that

Γ(α + 1) = αΓ(α),  for α > 0

Using the change of variable x = λy, we can show the following equation, which is useful
when working with the gamma distribution:

Γ(α) = λ^α ∫_0^∞ y^{α−1} e^{−λy} dy,  for α, λ > 0

6.1.1 Properties of the Gamma Function

For any positive real number α:

1. Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx

2. ∫_0^∞ x^{α−1} e^{−λx} dx = Γ(α)/λ^α, for λ > 0

3. Γ(α + 1) = αΓ(α)

4. Γ(n) = (n − 1)!, for n = 1, 2, 3, . . .

5. Γ(1/2) = √π

Example 18 Find the value of

1. Γ(7/2) = (5/2)Γ(5/2) = (5/2)(3/2)Γ(3/2) = (5/2)(3/2)(1/2)Γ(1/2) = (15/8)√π

2. I = ∫_0^∞ x⁶ e^{−5x} dx
   Since α = 7 and λ = 5, we obtain I = Γ(7)/5⁷ = 6!/5⁷ = 0.0092
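
Both values in Example 18 can be verified numerically (an illustrative sketch, not part of the original notes, assuming SciPy is available):

    import math
    from scipy.integrate import quad
    from scipy.special import gamma

    print(gamma(3.5), 15 * math.sqrt(math.pi) / 8)        # both about 3.3234

    integral, _ = quad(lambda x: x**6 * math.exp(-5 * x), 0, math.inf)
    print(integral, math.factorial(6) / 5**7)             # both about 0.009216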

6.2 Gamma Distribution

A continuous random variable X is said to have a gamma distribution with parameters
α > 0 and λ > 0, written X ∼ Gamma(α, λ), if its probability density function is given by

f(x) = [λ^α x^{α−1} e^{−λx}] / Γ(α),  x > 0
f(x) = 0                              otherwise

Exercise

• Find E(X) and Var(X) for the gamma distribution.

• Prove that the Gamma distribution is a probability density function, that is:

  (1/Γ(α)) ∫_0^∞ λ^α x^{α−1} e^{−λx} dx = 1
