SMA 240 Probability and Statistics 1 Lecture Notes
Reference Books
1. Rohatgi, V. K. and Saleh, A. K. (2011). An Introduction to Probability and Statistics, Second Edition. Wiley Eastern Limited.
2. Mood, A., Graybill, F., and Boes, D. C. Introduction to the Theory of Statistics, Third Edition. McGraw-Hill, London.
Contents
1 Review of Random Variables
  1.1 Random Variable
  1.2 Distribution Function
  1.3 Expectation
  1.4 Variance
  1.5 Chapter Problems
4 Linear Regression and Correlation Analysis
  4.1 Correlation
  4.2 Regression
  4.3 Chapter Problems
6 Derived Distributions
  6.1 Gamma Function
  6.2 Gamma Distribution
\sum_{k=1}^{\infty} P(X = x_k) = 1
1.2 Distribution Function
Let X be a RV defined on a sample space S. Consider the event E that X satisfies
−∞ < X ≤ x, where x is any real number. Then
P (E) = P [−∞ < X ≤ x] = P [X ≤ x] = F (x)
The function F (x) is called the Distribution Function or the Cumulative Distribution
Function (cdf) of the RV X. For a continuous RV X with pdf f (x),
F (x) = \int_{-\infty}^{x} f(t) \, dt
For a discrete RV X with pmf f (x),
F (x) = \sum_{x_i \le x} f(x_i)
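To make the definition concrete, here is a minimal Python sketch (an illustration, not part of the original notes) that recovers F (x) by numerically integrating the density f (x) = x/2 on [0, 2] used in Example 1 below:

```python
# Numerical illustration of the cdf F(x) = integral of f from -inf to x,
# using the density f(x) = x/2 on [0, 2] from Example 1 below.
from scipy.integrate import quad

def f(x):
    """pdf: f(x) = x/2 for 0 <= x <= 2, and 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def F(x):
    """cdf: integrate the pdf from the bottom of the support up to x."""
    val, _ = quad(f, 0, x)   # support starts at 0, so this equals the integral from -inf
    return val

print(F(1.0))  # exact value is 1^2/4 = 0.25
print(F(2.0))  # a valid cdf reaches 1 at the top of the support
```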
1.3 Expectation
Let X be a discrete RV which takes the values x1 , x2 , x3 , . . . and whose pmf is defined by
f (xi ) = P (X = xi ), i = 1, 2, 3, . . .
The expected value of X, denoted by E(X), is defined by
E(X) = \sum_{i=1}^{\infty} x_i f(x_i)
2. Let g(X) and h(X) be any real-valued functions of X. Then expectation is linear:
E[a g(X) + b h(X)] = aE[g(X)] + bE[h(X)], where a, b ∈ ℜ
1.4 Variance
The variance of a RV X is expressed in terms of expected values as
V ar(X) = E[(X − µ)^2] = E(X^2) − [E(X)]^2
where E(X^2) = \sum_{i=1}^{\infty} x_i^2 f(x_i) for a discrete RV
and E(X^2) = \int_{-\infty}^{\infty} x^2 f(x) \, dx for a continuous RV.
Example 1 Given the pdf, find the expected value and the variance
f (x) = \begin{cases} x/2, & 0 \le x \le 2 \\ 0, & \text{elsewhere} \end{cases}
Solution 1 (a) E(X) = \int_0^2 x \cdot \frac{x}{2} \, dx = \int_0^2 \frac{x^2}{2} \, dx = \frac{4}{3}
(b) E(X^2) = \int_0^2 x^2 \cdot \frac{x}{2} \, dx = \int_0^2 \frac{x^3}{2} \, dx = 2
(c) V ar(X) = E(X^2) − [E(X)]^2 = 2 − \frac{16}{9} = \frac{2}{9}
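The same integrals are easy to check numerically; a short scipy sketch of Example 1:

```python
# Numerical check of Example 1: E(X) = 4/3 and Var(X) = 2/9 for f(x) = x/2 on [0, 2].
from scipy.integrate import quad

f = lambda x: x / 2                         # pdf on its support [0, 2]
EX, _  = quad(lambda x: x * f(x), 0, 2)     # E(X)   = integral of x f(x)
EX2, _ = quad(lambda x: x**2 * f(x), 0, 2)  # E(X^2) = integral of x^2 f(x)
print(EX, EX2 - EX**2)                      # ~1.3333 (= 4/3) and ~0.2222 (= 2/9)
```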
1.5 Chapter Problems
Find
4. The cdf of a RV X is
F (x) = \begin{cases} 0, & x < −1 \\ (x + 1)/2, & −1 \le x \le 1 \\ 1, & x \ge 1 \end{cases}
5. A RV X has a pdf
f (x) = \begin{cases} cx, & 1 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases}
Find the constant c and P [0 \le X \le 1]
(a) Find the value of the constant k such that the distribution is a mass function
(b) P (X = 3, 4)
(c) Find the cdf, that is F (x)
2.1 Moments
Definition 1 The k th moment of a random variable X taken about the origin is defined
to be
µ′_k = E[X^k]
From the above definition we can easily verify that the first moment about the origin
is
E(X) = µ′_1 = µ,
the second moment about the origin is
E(X^2) = µ′_2
and so on. In addition to taking moments about the origin, a moment of a random
variable can also be taken about the mean µ.
Definition 2 The k th moment of a random variable X taken about its mean, or the k th
central moment of X, is defined to be
µ_k = E[(X − µ)^k]
Definition 3 The moment generating function MX (t) for a random variable X is defined
to be
MX (t) = E[e^{tX}]
the moment generating function of the Binomial distributed random variable X is given
as
MX (t) = E[e^{tX}] = \sum_{x=0}^{n} e^{tx} f(x) , where f (x) is the probability mass function of X
= \sum_{x=0}^{n} e^{tx} \binom{n}{x} p^x (1 − p)^{n−x} = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x (1 − p)^{n−x}
= (pe^t + (1 − p))^n = (pe^t + q)^n , where q = 1 − p
For a Poisson distributed random variable X, the pmf is
f (x) = \frac{e^{−λ} λ^x}{x!} ; where x = 0, 1, 2, . . .
The moment generating function MX (t) of this random variable is given as
MX (t) = E[e^{tX}] = \sum_{x=0}^{\infty} e^{tx} \frac{e^{−λ} λ^x}{x!} = e^{−λ} \sum_{x=0}^{\infty} \frac{(λe^t)^x}{x!} = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}
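Both mgf formulas can be spot-checked numerically by comparing the defining sum E(e^{tX}) with the closed forms; the parameter values below are arbitrary test choices:

```python
# Numerical spot-check of the binomial and Poisson mgf formulas derived above,
# comparing the defining sum E(e^{tX}) with the closed forms at one value of t.
from math import comb, exp, factorial

n, p, lam, t = 10, 0.3, 2.5, 0.4
q = 1 - p

binom_sum = sum(exp(t*x) * comb(n, x) * p**x * q**(n - x) for x in range(n + 1))
print(binom_sum, (p*exp(t) + q)**n)       # the two numbers agree

pois_sum = sum(exp(t*x) * exp(-lam) * lam**x / factorial(x) for x in range(200))
print(pois_sum, exp(lam*(exp(t) - 1)))    # truncating the sum at 200 terms is ample
```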
If λ = 1/β, then
f (x) = \frac{λ^α}{Γ(α)} x^{α−1} e^{−λx} ; where 0 < x < ∞, α > 0, λ > 0
We will use the following two properties of the Gamma function. For any positive real number α:
Γ(α) = \int_0^{\infty} x^{α−1} e^{−x} \, dx
\int_0^{\infty} x^{α−1} e^{−λx} \, dx = \frac{Γ(α)}{λ^α} , for λ > 0
The moment generating function of the Gamma distribution can be expressed as:
MX (t) = E[e^{tX}] = \int_0^{\infty} e^{tx} \frac{λ^α}{Γ(α)} x^{α−1} e^{−λx} \, dx
= \frac{λ^α}{Γ(α)} \int_0^{\infty} x^{α−1} e^{−x(λ−t)} \, dx
Let y = (λ − t)x =⇒ dx = \frac{dy}{λ − t} and x = \frac{y}{λ − t}. For t < λ we substitute to get:
MX (t) = \frac{λ^α}{Γ(α)} \int_0^{\infty} \frac{y^{α−1}}{(λ − t)^{α−1}} e^{−y} \frac{dy}{λ − t}
= \frac{λ^α}{Γ(α)} \cdot \frac{1}{(λ − t)^α} \int_0^{\infty} y^{α−1} e^{−y} \, dy
= \frac{λ^α}{Γ(α)} \cdot \frac{Γ(α)}{(λ − t)^α} \quad \left[ \text{since } \int_0^{\infty} y^{α−1} e^{−y} \, dy = Γ(α) \right]
MX (t) = \frac{λ^α}{(λ − t)^α}
The moment generating function owes its name to the following property: if you find the
k th derivative of MX (t) with respect to t and then set t = 0, the result is µ′_k.
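As a numerical sanity check of the gamma mgf (with arbitrary test values α = 3, λ = 2 and t = 0.5, chosen so that t < λ):

```python
# Numerical check of the gamma mgf: E(e^{tX}) should equal (lam/(lam - t))^alpha for t < lam.
from math import gamma as Gamma, exp
from scipy.integrate import quad

alpha, lam, t = 3.0, 2.0, 0.5            # arbitrary test values with t < lam

def integrand(x):
    # e^{tx} times the gamma pdf
    return exp(t*x) * lam**alpha * x**(alpha - 1) * exp(-lam*x) / Gamma(alpha)

mgf_num, _ = quad(integrand, 0, float("inf"))
print(mgf_num, (lam / (lam - t))**alpha)  # both ~2.3704 (= (4/3)^3)
```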
In the section below, we show how to calculate the mean and variance using moment
generating functions.
Discrete Case: In the discrete case the moment generating function is in general
given by
MX (t) = E[e^{tX}] = \sum_{x} e^{tx} f(x)
Continuous Case: In the continuous case, MX (t) = \int_x e^{tx} f(x) \, dx, and the
derivative at t = 0 is
\frac{dMX (t)}{dt} \Big|_{t=0} = \int_x x f(x) \, dx = E(X)
Differentiating again,
\frac{d^2 MX (t)}{dt^2} = \int_x x^2 e^{tx} f(x) \, dx
\frac{d^2 MX (t)}{dt^2} \Big|_{t=0} = \int_x x^2 f(x) \, dx = E(X^2)
Hence
M ′ (0) = E(X) and M ′′ (0) = E(X^2)
and
V ar(X) = E(X^2) − [E(X)]^2 = M ′′ (0) − [M ′ (0)]^2
Therefore
E(X) = M ′ (0)
V ar(X) = M ′′ (0) − [M ′ (0)]^2
The examples below show how to calculate the mean and variance for 3 distributions.
Note that calculations for the other distributions are left as exercises.
Example 2 For the Bernoulli distribution, use the moment generating function to find the mean and variance.
Solution 2
MX (t) = (1 − p) + pe^t
E(X) = \frac{dMX (t)}{dt} \Big|_{t=0} = M ′ (0) = pe^t \Big|_{t=0} = p
E(X^2) = \frac{d^2 MX (t)}{dt^2} \Big|_{t=0} = M ′′ (0) = pe^t \Big|_{t=0} = p
V ar(X) = E(X^2) − [E(X)]^2 = p − p^2 = p(1 − p)
Example 3 For the geometric distribution with pmf f (x) = q^x p, x = 0, 1, 2, . . ., find the moment generating function and the mean and variance.
Solution 3
MX (t) = E[e^{tX}] = \sum_{x=0}^{\infty} e^{tx} q^x p = p \sum_{x=0}^{\infty} (qe^t)^x = \frac{p}{1 − qe^t}
E(X) = \frac{dMX (t)}{dt} \Big|_{t=0} = \frac{pqe^t}{(1 − qe^t)^2} \Big|_{t=0} = \frac{pq}{(1 − q)^2} = \frac{pq}{p^2} = \frac{q}{p}
E(X^2) = \frac{d^2 MX (t)}{dt^2} \Big|_{t=0} = \frac{pqe^t (1 − qe^t)^2 + 2pq^2 e^{2t} (1 − qe^t)}{(1 − qe^t)^4} \Big|_{t=0} = \frac{pq + pq^2}{p^3}
V ar(X) = M ′′ (0) − [M ′ (0)]^2 = \frac{pq + pq^2}{p^3} − \frac{q^2}{p^2} = \frac{q + q^2 − q^2}{p^2} = \frac{q}{p^2}
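The differentiation above can be reproduced symbolically; a minimal sympy sketch for the geometric mgf:

```python
# Symbolic check of Solution 3: differentiating M(t) = p/(1 - q e^t) at t = 0
# reproduces E(X) = q/p and Var(X) = q/p^2 for the geometric distribution.
import sympy as sp

t, p = sp.symbols("t p", positive=True)
q = 1 - p
M = p / (1 - q*sp.exp(t))

m1 = sp.diff(M, t).subs(t, 0)       # M'(0)  = E(X)
m2 = sp.diff(M, t, 2).subs(t, 0)    # M''(0) = E(X^2)
print(sp.simplify(m1))              # q/p, printed as (1 - p)/p
print(sp.simplify(m2 - m1**2))      # q/p^2, printed as (1 - p)/p^2
```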
Example 4 Show that if X and Y are independent, then E(XY ) = E(X)E(Y ).
Solution 4
E(XY ) = \int_y \int_x xy f(x, y) \, dx \, dy = \int_y \int_x xy f(x)f(y) \, dx \, dy
= \int_x x \left[ \int_y y f(y) \, dy \right] f(x) \, dx = \int_x x f(x) \, dx \int_y y f(y) \, dy = E(X)E(Y )
Example 5 Let X and Y be two independent random variables with mgf ’s MX (t) and
MY (t) respectively. Obtain the mgf of Z = X + Y , or rather, show that MZ (t) =
MX (t)MY (t).
Solution 5
MZ (t) = E[e^{tZ}] = E[e^{t(X+Y )}] = E[e^{tX} e^{tY}]. Due to independence,
MZ (t) = E[e^{tX}] · E[e^{tY}] = MX (t)MY (t)
Markov Inequality
Let X be any positive continuous random variable. We can write:
E(X) = \int_{-\infty}^{\infty} x f(x) \, dx
= \int_0^{\infty} x f(x) \, dx \quad \text{(since X is positive)}
\ge \int_a^{\infty} x f(x) \, dx \quad \text{(for any a > 0)}
\ge \int_a^{\infty} a f(x) \, dx \quad \text{(since x \ge a in the region of integration)}
= a \int_a^{\infty} f(x) \, dx
= a P (X \ge a)
Hence
P (X \ge a) \le \frac{E(X)}{a} , for any a > 0
Chebyshev’s Inequality
Let X be any random variable. If we define Y = [X − E(X)]^2 , then Y is a non-negative
random variable. We apply Markov’s inequality to Y , for any positive real number b:
P [Y \ge b^2] \le \frac{E(Y )}{b^2}
But E(Y ) = E[X − E(X)]^2 = V ar(X) and P [Y \ge b^2] = P [|X − E(X)| \ge b], which implies
P (|X − E(X)| \ge b) \le \frac{V ar(X)}{b^2} , for any b > 0
Remark: If the variance is small, then X is unlikely to be too far from the mean.
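Both inequalities are easy to illustrate by simulation; a short numpy sketch, using an exponential distribution as an assumed test case:

```python
# Simulation sanity check of the Markov and Chebyshev inequalities
# for an exponential random variable (an assumed test case, not from the notes).
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)   # E(X) = 2, Var(X) = 4
a = b = 5.0

print(np.mean(x >= a), x.mean() / a)                       # P(X >= a) <= E(X)/a
print(np.mean(np.abs(x - x.mean()) >= b), x.var() / b**2)  # Chebyshev bound
```

In each printed pair the empirical probability on the left stays below the bound on the right.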
2. Each of 3 boxes contains 4 bolts, 3 square in shape and 1 circular in shape. A bolt
is chosen at random from each box. Find the probability that 3 circular bolts are
chosen.
3. If f (x) = λe^{−λx} for x \ge 0 and zero elsewhere, find the moment generating function
of f (x) and E(X)
4. A RV X has moment generating function
M (t) = \frac{pe^t}{1 − qe^t}
Find E(X)
5. Find the moment generating function of f (x) = 1, where 0 < x < 1, and thereby
confirm that E(X) = \frac{1}{2} and V ar(X) = \frac{1}{12}
6. Find the moment generating function of
f (x) = p^x (1 − p)^{1−x}
where x = 0, 1. What is the relationship between this and the moment generating
function of the binomial distribution?
7. Calculate the E(X) and V ar(X) for the Gamma distribution in section (2.2.5) and
the Normal distribution in section (2.2.6)
Definition 4 Suppose that X and Y are random variables. The joint distribution, or bi-
variate distribution of X and Y is the collection of all probabilities of the form P [(X, Y ) ∈
C] for all sets C ⊂ ℜ2 such that {(X, Y ) ∈ C} is an event.
For example, C may be a rectangle of the form
{(x, y) : a < x < b, c < y < d}.
In the plane, the joint probability density function f (x, y) is a non-negative real valued
function defined on ℜ^2 such that
(i) P [(X, Y ) ∈ ℜ^2 ] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y) \, dx \, dy = 1
(ii) P [a < X < b, c < Y < d] = \int_c^d \int_a^b f(x, y) \, dx \, dy
(iii) P [(X, Y ) ∈ R] = \iint_R f(x, y) \, dx \, dy for any region R ⊂ ℜ^2
Example 6 Consider the joint probability density function defined by
f (x, y) = \begin{cases} k(6 − x − y), & 0 < x < 2, \ 2 < y < 4 \\ 0, & \text{elsewhere} \end{cases}
Find
(i) the value of the constant k
(ii) P (X < 1, Y < 3)
(iii) P (X + Y < 3)
Solution 6 (i) \int_2^4 \int_0^2 k(6 − x − y) \, dx \, dy = 1
k \int_2^4 \left( 6x − \frac{x^2}{2} − xy \right) \Big|_0^2 \, dy = k \int_2^4 (12 − 2 − 2y) \, dy
= k [10y − y^2]_2^4 = k[40 − 16 − 20 + 4] = 8k = 1
k = \frac{1}{8}
f (x, y) = \begin{cases} \frac{1}{8}(6 − x − y), & 0 < x < 2, \ 2 < y < 4 \\ 0, & \text{elsewhere} \end{cases}
(ii) P [X < 1, Y < 3]
\frac{1}{8} \int_2^3 \int_0^1 (6 − x − y) \, dx \, dy = \frac{1}{8} \int_2^3 \left( 6x − \frac{x^2}{2} − xy \right) \Big|_0^1 \, dy
= \frac{1}{8} \int_2^3 \left( \frac{11}{2} − y \right) dy = \frac{1}{8} \left[ \frac{11}{2} y − \frac{y^2}{2} \right]_2^3
= \frac{1}{16} [33 − 9 − 22 + 4] = \frac{3}{8}
(iii) P [X + Y < 3]
= \frac{1}{8} \int_2^3 \int_0^{3−y} (6 − x − y) \, dx \, dy = \frac{1}{8} \int_2^3 \left( 6x − \frac{x^2}{2} − xy \right) \Big|_0^{3−y} \, dy
= \frac{1}{8} \int_2^3 \left( 18 − 6y − \frac{9}{2} + 3y − \frac{y^2}{2} − 3y + y^2 \right) dy
= \frac{1}{8} \int_2^3 \left( \frac{27}{2} − 6y + \frac{y^2}{2} \right) dy = \frac{1}{8} \left[ \frac{27}{2} y − 3y^2 + \frac{y^3}{6} \right]_2^3 = \frac{5}{24}
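All three answers can be verified numerically with scipy's dblquad (note that dblquad integrates its first argument on the inside; here x is inner and y is outer):

```python
# Numerical check of Example 6 with scipy's dblquad.
from scipy.integrate import dblquad

f = lambda x, y: (6 - x - y) / 8          # joint pdf on 0 < x < 2, 2 < y < 4

total, _ = dblquad(f, 2, 4, 0, 2)                # whole support: should be 1
p_ii, _  = dblquad(f, 2, 3, 0, 1)                # P(X < 1, Y < 3) = 3/8
p_iii, _ = dblquad(f, 2, 3, 0, lambda y: 3 - y)  # P(X + Y < 3) = 5/24
print(total, p_ii, p_iii)
```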
3.1.2 Discrete Case
The random vector (X, Y ) is discrete if each of its components X and Y is discrete.
The joint probability function of X and Y is defined as follows.
Suppose that X can assume any one of m values x1 , x2 , . . . , xm and Y can assume any
one of n values y1 , y2 , . . . , yn . Then, the probability of the event that X = xj and Y = yk
is given by
P (X = xj , Y = yk ) = f (xj , yk )
A joint probability function for X and Y can be represented by a joint probability table
as in Table 3.1.2. The probability that X = xj is obtained by adding all entries in the
row corresponding to xj and is given by
P (X = xj ) = f1 (xj ) = \sum_{k=1}^{n} f(x_j, y_k)
X \ Y    y1            y2            ...   yn            Totals
x1       f (x1 , y1 )  f (x1 , y2 )  ...   f (x1 , yn )  f1 (x1 )
x2       f (x2 , y1 )  f (x2 , y2 )  ...   f (x2 , yn )  f1 (x2 )
...      ...           ...           ...   ...           ...
xm       f (xm , y1 )  f (xm , y2 )  ...   f (xm , yn )  f1 (xm )
Totals   f2 (y1 )      f2 (y2 )      ...   f2 (yn )      1
For j = 1, 2, . . . , m, these are indicated by the entry totals in the extreme right-hand
column or margin of Table 3.1.2. Similarly, the probability that Y = yk is obtained by
adding all entries in the column corresponding to yk and is given by
P (Y = yk ) = f2 (yk ) = \sum_{j=1}^{m} f(x_j, y_k)
For k = 1, 2, . . . , n, these are indicated by the entry totals in the bottom row or margin
of Table 3.1.2. The probabilities f1 (xj ) and f2 (yk ) are the marginal probability functions
of X and Y , respectively.
It should also be noted that
\sum_{j=1}^{m} f_1(x_j) = 1 , \qquad \sum_{k=1}^{n} f_2(y_k) = 1
which can be written as
\sum_{j=1}^{m} \sum_{k=1}^{n} f(x_j, y_k) = 1
In Table 3.1.2, F (x, y) is the sum of all entries for which xj ≤ x and yk ≤ y.
Example 7 The table shows the promotional status of police officers during the past two
years.
          Promoted   Not Promoted   Total
Men       288        672            960
Women     36         204            240
Total     324        876            1200
Required
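As an illustration, the counts can be converted into a joint probability table and its marginals with numpy (the row/column layout below mirrors the table above):

```python
# Turning the promotion counts of Example 7 into a joint probability table and marginals.
import numpy as np

counts = np.array([[288, 672],     # rows: Men, Women
                   [ 36, 204]])    # cols: Promoted, Not promoted
joint = counts / counts.sum()      # divide by 1200 to get f(x, y)

print(joint)                       # e.g. P(Man and Promoted) = 288/1200 = 0.24
print(joint.sum(axis=1))           # row marginals:    P(Man), P(Woman)
print(joint.sum(axis=0))           # column marginals: P(Promoted), P(Not promoted)
```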
3.2 Bivariate Functions
The joint distribution function (cdf) F (x, y) of two random variables X and Y defined
on the same sample space is given by
F (x, y) = P (X ≤ x and Y ≤ y), −∞ < x < ∞, −∞ < y < ∞
If X and Y are continuous, then
F (x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v) \, dv \, du
and
F_1(x) = \sum_{u \le x} f_1(u) , if X is discrete
Example 9 Let the joint probability density function of X and Y be given by,
f (x, y) = \begin{cases} 2, & 0 < x < y < 1 \\ 0, & \text{elsewhere} \end{cases}
Find the marginal probability density function of X and Y , that is f1 (x) and f2 (y).
Solution 9 (a) f1 (x) = \int_{-\infty}^{\infty} f(x, y) \, dy = \int_x^1 2 \, dy = 2(1 − x)
f1 (x) = \begin{cases} 2(1 − x), & 0 ≤ x ≤ 1 \\ 0, & \text{elsewhere} \end{cases}
(b) f2 (y) = \int_{-\infty}^{\infty} f(x, y) \, dx = \int_0^y 2 \, dx = 2y
f2 (y) = \begin{cases} 2y, & 0 ≤ y ≤ 1 \\ 0, & \text{elsewhere} \end{cases}
The conditional distribution of X given Y = y is
P [X = x/Y = y] = \frac{P [X = x \text{ and } Y = y]}{P (Y = y)} = \frac{f(x, y)}{f_2(y)}
Example 10 Suppose X and Y are two discrete random variables with joint probability
density function
f (x, y) = \begin{cases} \frac{1}{54}(x + y), & x = 1, 2, 3; \ y = 1, 2, 3, 4 \\ 0, & \text{elsewhere} \end{cases}
Find
(a) the conditional distribution f (y/x)
(b) P (y = 1/x = 1)
(c) P (y = 4/x = 2, 3)
(d) E[y|x]
Solution 10 (a) We express the conditional distribution of Y given X = x as
f (y/x) = \frac{f(x, y)}{f_1(x)}
But f1 (x) = \sum_{y=1}^{4} \frac{1}{54}(x + y) = \frac{1}{54}(4x + 10) = \frac{1}{27}(2x + 5)
f (y/x) = \frac{\frac{1}{54}(x + y)}{\frac{1}{54}(4x + 10)} = \frac{x + y}{4x + 10} , \quad x = 1, 2, 3; \ y = 1, 2, 3, 4
(b) P (y = 1/x = 1) = f (1/1) = \frac{1 + 1}{4 + 10} = \frac{2}{14} = \frac{1}{7}
(c) P (y = 4/x = 3) = f (4/3) = \frac{7}{22}
(d) E[y|x] = \sum_y y \, \frac{x + y}{4x + 10} = \frac{(x + 1) + (2x + 4) + (3x + 9) + (4x + 16)}{4x + 10} = \frac{10x + 30}{4x + 10} = \frac{5x + 15}{2x + 5}
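A brute-force check of part (d), summing y f (y/x) exactly with Python's fractions module:

```python
# Exact check of Solution 10(d): E[Y | X = x] = (5x + 15)/(2x + 5).
from fractions import Fraction

for x in (1, 2, 3):
    f_cond = {y: Fraction(x + y, 4*x + 10) for y in (1, 2, 3, 4)}  # f(y/x)
    ey = sum(y * p for y, p in f_cond.items())
    print(x, ey, Fraction(5*x + 15, 2*x + 5))   # the last two columns agree
```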
The marginal density function of X is given by,
f1 (x) = \int_0^{\infty} 2e^{−(x+2y)} \, dy = 2 \left[ \frac{e^{−(x+2y)}}{−2} \right]_0^{\infty} = −[e^{−\infty} − e^{−x}] = e^{−x}
The marginal density function of Y is given by,
f2 (y) = \int_0^{\infty} 2e^{−(x+2y)} \, dx = −2 \left[ e^{−(x+2y)} \right]_0^{\infty} = −2[e^{−\infty} − e^{−2y}] = 2e^{−2y}
Therefore the random variables X and Y are independent since,
f (x, y) = f1 (x)f2 (y) = 2e^{−(x+2y)}
3.6 Bivariate Expectations
Let X and Y have joint probability density function f (x, y), and let U (x, y) be a function
of X and Y . The expected value of U (X, Y ) is expressed as,
E[U (x, y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} U(x, y) f(x, y) \, dx \, dy
The central moments of X are
E[(X − µ_X)^r] = \int_{-\infty}^{\infty} (x − µ_X)^r f_1(x) \, dx
where f1 (x) is the marginal probability density function of X. The joint product moments
of X and Y about the point (0, 0) are
E[X^r Y^s] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^r y^s f(x, y) \, dx \, dy
and the conditional expectation of X given Y = y is
E(X|Y = y) = \int_{-\infty}^{\infty} x f(x|y) \, dx
Example 12 A miner is trapped in a mine containing 3 doors. The first door leads to a
tunnel which takes him to safety after 2 hours of travel. The second door leads to a tunnel
which returns him to the mine after 3 hours of travel. The third door leads to a tunnel
which returns him to the mine after 5 hours. Assuming he is at all times equally likely
to choose any of the doors, what is the expected length of time until the miner reaches
safety?
Solution 12 Let X be the time to reach safety (hours) and Y be the door (1, 2 or 3)
initially chosen. Then,
E(X|Y = 1) = 2, \quad E(X|Y = 2) = 3 + E(X), \quad E(X|Y = 3) = 5 + E(X)
So
E(X) = \frac{1}{3} \{2 + [3 + E(X)] + [5 + E(X)]\} , which gives E(X) = 10 hours.
3.7 Covariance and Correlation
Covariance measures the degree of association between X and Y . Suppose that X and Y
are jointly distributed random variables with finite variances. The covariance of X and Y is
Cov(X, Y ) = E[(X − µ_X)(Y − µ_Y)] = E(XY ) − E(X)E(Y )
and the correlation coefficient between X and Y is denoted by,
ρ(XY ) = \frac{Cov(X, Y )}{σ_X σ_Y} = \frac{Cov(X, Y )}{std(X) \, std(Y )}
where −1 ≤ ρ(XY ) ≤ 1.
If the random variables X and Y are independent, then
ρ(XY ) = \frac{Cov(X, Y )}{σ_X σ_Y} = \frac{0}{σ_X σ_Y} = 0
3.7.1 Properties of Covariance
Let X and Y be two random variables, then
Example 13 Suppose X and Y are jointly distributed with joint probability density func-
tion
f (x, y) = \begin{cases} \frac{1}{8}(x + y), & 0 < x < 2, \ 0 < y < 2 \\ 0, & \text{otherwise} \end{cases}
Find the correlation coefficient between X and Y .
E(XY ) = \frac{1}{8} \int_0^2 \int_0^2 xy(x + y) \, dx \, dy = \frac{4}{3}
With E(X) = E(Y ) = \frac{7}{6}, the covariance between X and Y is,
Cov(X, Y ) = \frac{4}{3} − \frac{7}{6} \cdot \frac{7}{6} = −\frac{1}{36}
For the correlation coefficient between X and Y , with V ar(X) = V ar(Y ) = \frac{11}{36},
ρ(XY ) = \frac{−1/36}{\sqrt{\frac{11}{36} \cdot \frac{11}{36}}} = −\frac{1}{11}
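The moments in Example 13 can be verified numerically with scipy:

```python
# Numerical check of Example 13: rho(X, Y) = -1/11 for f(x, y) = (x + y)/8 on (0, 2)^2.
from scipy.integrate import dblquad

f = lambda x, y: (x + y) / 8
E = lambda g: dblquad(lambda x, y: g(x, y) * f(x, y), 0, 2, 0, 2)[0]

ex, ey = E(lambda x, y: x), E(lambda x, y: y)      # both 7/6
exy    = E(lambda x, y: x * y)                     # 4/3
vx = E(lambda x, y: x * x) - ex**2                 # 11/36
vy = E(lambda x, y: y * y) - ey**2                 # 11/36
rho = (exy - ex*ey) / (vx**0.5 * vy**0.5)
print(rho)                                         # ~ -0.0909 = -1/11
```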
3.8 Chapter Problems
1. Show that if X and Y are independent, then E[X/Y = y] = E[X] for all y.
2. Suppose that X and Y are jointly distributed with a joint probability density func-
tion
f (x, y) = \begin{cases} k(x + y), & 0 < x < 2, \ 0 < y < 2 \\ 0, & \text{elsewhere} \end{cases}
3. Suppose that X and Y are random variables whose jpdf has a moment generating
function
M (t1 , t2 ) = \frac{1}{4} e^{t_1} + \frac{3}{8} e^{t_2} + \frac{3}{8}
for all real t1 and t2 . Find Cov(X, Y )
Find
5. The cumulative distribution function for the joint distribution of the continuous
random variables X and Y is
F (x, y) = \frac{1}{5} (3x^3 y + 2x^2 y^2 ), \quad 0 < x < 1, \ 0 < y < 1
(a) Find f (x, y) and f (0, 0.5)
(b) Show that f (x, y) is a valid probability density function
(c) Find the marginal probability density function of X and Y
(d) Find P (0 < x < 1) and P (0 < y < 1)
6. Let X and Y be two random variables with joint probability mass function
f (x, y) = \begin{cases} k(x + 2y), & x = 1, 2; \ y = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}
Find
(a) The value of the constant k
(b) E(XY^2) and V ar(X)
(d) P (x = 1, y = 1, 2) and P (x = 1/y = 1, 2)
(f) f (y/x), E(Y ) and E(y/x)
7. Let X and Y be jointly distributed random variables with joint probability den-
sity function
f (x, y) = \begin{cases} p^{x+y} (1 − p)^{2−x−y}, & x = 0, 1; \ y = 0, 1 \\ 0, & \text{otherwise} \end{cases}
Find the covariance and correlation coefficient between X and Y
8. A die and a coin are each tossed once. Write the possible outcomes in a joint
probability table.
(a) What is the probability of a head and a number greater than 3 from the die?
(b) What is the probability of a tail and an even number from the die?
9. Suppose X and Y are two independent random variables having the respective
probability density functions of the form
f (x) = \begin{cases} 2(1 − x), & 0 ≤ x ≤ 1 \\ 0, & \text{otherwise} \end{cases}
and
f (y) = \begin{cases} 2(1 − y), & 0 ≤ y ≤ 1 \\ 0, & \text{otherwise} \end{cases}
Find
(a) P [X + Y ≤ 1] and P [X ≤ \frac{1}{2}, Y ≤ 1]
(b) What is the relationship between E(XY ), E(X) and E(Y )?
4.1 Correlation
To test for correlation, we use the correlation coefficient (r) to determine the strength
of the linear relationship between two variables. We use the Pearson product moment
correlation coefficient.
If there is a strong positive linear relationship between the variables, the value of r
will be close to +1.
If there is a strong negative linear relationship between the variables, the value of r
will be close to -1.
When there is no linear relationship between the variables or only a weak relation-
ship, the value of r will be close to 0.
Example
The data shows the number of cars rental companies have and their respective annual
income. Use the data to calculate the correlation coefficient and interpret its meaning.
Find the values of xy, x2 and y 2 . Then find the sum of each column. Since Revenue
depends on the number of cars the company has, the Revenue is the dependent variable
(y) and Cars is the independent variable (x)
Company   Cars (x)   Revenue (y)   xy       x^2       y^2
A         63.0       7.0           441.00   3969.00   49.00
B         29.0       3.9           113.10   841.00    15.21
C         20.8       2.1           43.68    432.64    4.41
D         19.1       2.8           53.48    364.81    7.84
E         13.4       1.4           18.76    179.56    1.96
F         8.5        1.5           12.75    72.25     2.25
Totals: \sum x = 153.8, \sum y = 18.7, \sum xy = 682.77, \sum x^2 = 5859.26, \sum y^2 = 80.67
We substitute the values in the table into the formula and solve for r:
r = \frac{n(\sum xy) − (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) − (\sum x)^2][n(\sum y^2) − (\sum y)^2]}}
= \frac{6(682.77) − (153.8)(18.7)}{\sqrt{[6(5859.26) − (153.8)^2][6(80.67) − (18.7)^2]}}
= 0.982
The correlation coefficient suggests a strong positive linear relationship between the number
of cars a rental company has and its annual income.
Therefore, as the number of cars increases (decreases), the annual income increases
(decreases).
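The hand computation can be confirmed with numpy:

```python
# Checking the Pearson correlation coefficient of the car-rental data with numpy.
import numpy as np

cars    = np.array([63.0, 29.0, 20.8, 19.1, 13.4, 8.5])
revenue = np.array([7.0, 3.9, 2.1, 2.8, 1.4, 1.5])
print(np.corrcoef(cars, revenue)[0, 1])   # ~0.982
```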
Remark
4.2 Regression
If there is a negative or positive correlation coefficient, the next step is to determine the
equation of the regression line, which is the line of best fit. The purpose of the regression
line is to enable the researcher to see the trend and make predictions based on the data.
Example
We revisit the example of the number of cars the rental company has, and the annual
income made by the company. To estimate the regression line, we compute the values of
a and b.
The values needed for the equations are n = 6, \sum x = 153.8, \sum y = 18.7, \sum xy = 682.77
and \sum x^2 = 5859.26. We substitute the values in the formulas to get:
a = \frac{(\sum y)(\sum x^2) − (\sum x)(\sum xy)}{n(\sum x^2) − (\sum x)^2} = \frac{(18.7)(5859.26) − (153.8)(682.77)}{(6)(5859.26) − (153.8)^2} = 0.396
b = \frac{n(\sum xy) − (\sum x)(\sum y)}{n(\sum x^2) − (\sum x)^2} = \frac{6(682.77) − (153.8)(18.7)}{(6)(5859.26) − (153.8)^2} = 0.106
The equation of the regression line y = a + bx is
y = 0.396 + 0.106x
We can use the regression line to predict the values of y given the values of x, that is,
predict the annual income given the number of cars. For example, let x = 40:
y = 0.396 + 0.106(40) = 4.636
Thus, with the car counts in ten thousands and revenue in billions, a company with
400,000 cars makes about 4.636 billion dollars per year, and we write (40, 4.636).
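A least-squares fit with numpy reproduces the coefficients (up to rounding):

```python
# Checking the regression coefficients with numpy's least-squares polynomial fit.
import numpy as np

cars    = np.array([63.0, 29.0, 20.8, 19.1, 13.4, 8.5])
revenue = np.array([7.0, 3.9, 2.1, 2.8, 1.4, 1.5])

b, a = np.polyfit(cars, revenue, 1)   # slope first, intercept second
print(a, b)                           # ~0.396 and ~0.106
print(a + b * 40)                     # ~4.64; the text's 4.636 uses the rounded coefficients
```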
2. The number of calories and the number of milligrams of cholesterol for a random
sample of fast-food chicken sandwiches from seven restaurants are shown here. Is
there a relationship between the variables?
3. The number of forest fires and the number of acres burned are as follows:
Fires x 72 69 58 47 84 62 57 45
Acres y 62 41 19 26 51 15 30 15
Find y when x = 60
Suppose that a random variable X has a discrete distribution and U = Φ(X) is another
random variable which is a function of X. Then, for any possible value u of U , we have
g(u) = P (U = u) = P (Φ(X) = u) = \sum_{x: Φ(x) = u} f(x)
In the continuous case, if G(u) = P (U ≤ u) is the cdf of U , then the pdf of U is
g(u) = \frac{d}{du} G(u)
This method of getting the pdf of U is called the cdf technique. If Φ(X) is a continuous
and strictly increasing or decreasing function of X over the interval (a, b), then U will
vary over some interval (α, β) as X varies over the interval (a, b), and the inverse function
W will be strictly increasing or decreasing over the interval (α, β). The pdf of U is given
by
g(u) = f (W (u)) \left| \frac{d}{du} W(u) \right|
where U = Φ(X) if and only if X = W (U ).
x = 0 =⇒ u = 1 and x = 2 =⇒ u = −3
As x varies over (0, 2) = (a, b), U varies over (−3, 1) = (α, β).
g(u) = \begin{cases} \frac{1}{4}, & −3 < u < 1 \\ 0, & \text{elsewhere} \end{cases}
5.2 Change of Variable Technique
The change of variable technique is another method of obtaining the distribution of a
function of a random variable. Let X be a continuous random variable with pdf f (x),
and let Y = µ(X) be some function of X, where X ∈ A.
This implies that X = µ^{−1}(Y ) = w(Y ).
By the change of variable technique, the pdf of Y is given as:
g(y) = \begin{cases} f (w(y)) |J|, & y ∈ B \\ 0, & \text{elsewhere} \end{cases}
k \left( \frac{x^2}{2} + x \right) \Big|_0^2 = 1 =⇒ k(2 + 2) = 1 =⇒ k = \frac{1}{4}
The pdf of X is expressed as:
f (x) = \begin{cases} \frac{1}{4}(x + 1), & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases}
X = 0 =⇒ Y = 0 and X = 2 =⇒ Y = 4
Now
Y = X^2 = µ(X) =⇒ B = \{y : 0 < y < 4\}
Step: Find the Jacobian
The Jacobian of the transformation is obtained as
J = \frac{dx}{dy} = \frac{d\sqrt{y}}{dy} = \frac{1}{2} y^{−1/2} = \frac{1}{2\sqrt{y}} =⇒ |J| = \frac{1}{2\sqrt{y}}
A = \{x : x = 1, 2, 3, . . .\}
and
B = \{y : y = 3, 10, 29, . . .\}
w(y) = x = \sqrt[3]{y − 2} = (y − 2)^{1/3}
Step 5: Find the distribution of Y
The pmf of Y is given by
g(y) = f (w(y)) = 2 \left( \frac{1}{3} \right)^{w(y)} , \quad y = 3, 10, 29, . . . ; \ y ∈ B
Thus, the pmf of Y is given as
g(y) = \begin{cases} 2 \left( \frac{1}{3} \right)^{(y−2)^{1/3}}, & y = 3, 10, 29, . . . \\ 0, & \text{otherwise} \end{cases}
Consider the transformation
U = Φ(X, Y ) and V = Φ_1(X, Y )
which is one-to-one and maps the set S of (X, Y ) values onto B, the set of (U, V ) values.
This transformation is invertible. More generally, consider the transformation
y_1 = µ_1(x_1, x_2, . . . , x_n)
y_2 = µ_2(x_1, x_2, . . . , x_n)
\vdots
y_n = µ_n(x_1, x_2, . . . , x_n)
We find the joint pdf of Y_1, Y_2, . . . , Y_n as follows:
g(y_1, y_2, . . . , y_n) = \begin{cases} f (w_1(y_1, . . . , y_n), . . . , w_n(y_1, . . . , y_n)) |J|, & (y_1, . . . , y_n) ∈ B \\ 0, & \text{otherwise} \end{cases}
where
|J| = \begin{vmatrix} \frac{∂x_1}{∂y_1} & \frac{∂x_1}{∂y_2} & \cdots & \frac{∂x_1}{∂y_n} \\ \frac{∂x_2}{∂y_1} & \frac{∂x_2}{∂y_2} & \cdots & \frac{∂x_2}{∂y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{∂x_n}{∂y_1} & \frac{∂x_n}{∂y_2} & \cdots & \frac{∂x_n}{∂y_n} \end{vmatrix}
Note: If X_1, X_2, . . . , X_n were discrete random variables, we would omit |J| in the joint
pmf g(· · ·) of Y_i : i = 1, 2, . . . , n.
This means that |J| = \frac{1}{2}
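The |J| = 1/2 above can be checked symbolically; a minimal sympy sketch, assuming the transformation is u = x + y, v = x − y (a hypothetical choice consistent with the Jacobian shown):

```python
# Symbolic Jacobian for the (assumed) transformation u = x + y, v = x - y,
# whose inverse is x = (u + v)/2, y = (u - v)/2; this gives |J| = 1/2.
import sympy as sp

u, v = sp.symbols("u v")
x = (u + v) / 2
y = (u - v) / 2

J = sp.Matrix([[sp.diff(x, u), sp.diff(x, v)],
               [sp.diff(y, u), sp.diff(y, v)]]).det()
print(sp.Abs(J))   # 1/2
```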
Step 4: Find f (w1 (u, v), w2 (u, v))
f (x) = \begin{cases} e^{−x}, & x > 0 \\ 0, & x ≤ 0 \end{cases}
6. Let f (x, y) be the joint density function of X and Y .
f (x, y) = \begin{cases} 1, & 0 ≤ x ≤ 1, \ 0 ≤ y ≤ 1 \\ 0, & \text{otherwise} \end{cases}
6 Derived Distributions
6.1 Gamma Function
The Gamma Function Γ(x) is an extension of the factorial function to real (and com-
plex) numbers. If n ∈ {1, 2, 3, . . .}, then
Γ(n) = (n − 1)!
Using the change of variable x = λy, we can show the following identity, which is useful
when working with the gamma distribution:
Γ(α) = λ^α \int_0^{\infty} y^{α−1} e^{−λy} \, dy , \quad \text{for } α, λ > 0
3. Γ(α + 1) = αΓ(α)
1. Γ\left(\frac{7}{2}\right) = \frac{5}{2} Γ\left(\frac{5}{2}\right) = \frac{5}{2} \cdot \frac{3}{2} Γ\left(\frac{3}{2}\right) = \frac{5}{2} \cdot \frac{3}{2} \cdot \frac{1}{2} Γ\left(\frac{1}{2}\right) = \frac{15}{8} \sqrt{π}
2. I = \int_0^{\infty} x^6 e^{−5x} \, dx
Since α = 7 and λ = 5, we obtain I = \frac{Γ(7)}{5^7} = \frac{6!}{5^7} = 0.0092
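Both worked examples can be confirmed with scipy:

```python
# Checking the two worked gamma-function examples numerically.
from math import pi, sqrt, exp, factorial
from scipy.special import gamma
from scipy.integrate import quad

print(gamma(7/2), 15/8 * sqrt(pi))                   # both ~3.3234

I, _ = quad(lambda x: x**6 * exp(-5*x), 0, float("inf"))
print(I, factorial(6) / 5**7)                        # both ~0.0092
```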
Exercise
Find the E(X) and V ar(X) of the gamma distribution.
Prove that the Gamma distribution is a probability density function, that is:
\int_0^{\infty} \frac{λ^α}{Γ(α)} x^{α−1} e^{−λx} \, dx = 1