Chapter 5
Functions of Random Variables
Contents
5-1. Introduction
5-2. Transformation of Variables
5-3. Moment Generating Functions
5-4. Characteristic Functions
5-5. Limit and Convergence
5-6. Important Theorems
5-1 Introduction
Many statistical methods involve functions of one or more
random variables. Statistical inference requires the
distributions of these functions, which are usually
obtained through transformations of variables.
Moment generating functions will be introduced for
determining the moments of distributions and for establishing
the distributions of functions of random variables.
Moment generating functions are then generalized to
characteristic functions, which enable us to prove both
the Weak Law of Large Numbers and the Central Limit
Theorem.
5-2 Transformations of Variables
Given $f_X(x)$ and $Y = g(X)$, find $f_Y(y)$.
Start by determining the cumulative distribution function of Y:
$$F_Y(y) = P(Y \le y) = P(\{X : g(X) \le y\}) = P(\{X : X \in g^{-1}((-\infty, y])\}) = \int_{\{x :\, g(x) \le y\}} f_X(x)\, dx.$$
Then
$$f_Y(y) = \frac{dF_Y(y)}{dy}.$$
1-1 Mapping
If $Y = g(X)$ is a 1-1 mapping, then
$$f_Y(y) = f_X(x) \left| \frac{dx}{dy} \right|, \quad \text{where } x = g^{-1}(y).$$
If $Y = g(X)$ is a 1-1 mapping and X is discrete, then
$$f_Y(y) = f_X(g^{-1}(y)).$$
[Example] Linear Function $Y = aX + b$:
$$f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y-b}{a}\right).$$
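As a quick numerical check of the linear-transformation formula (a sketch; the choices X ~ Exp(1), a = 2, b = 1 are illustrative assumptions, not from the slides):

```python
# Monte Carlo check of f_Y(y) = (1/|a|) f_X((y-b)/a) for Y = aX + b,
# using X ~ Exp(1) with density f_X(x) = e^{-x}, x >= 0 (illustrative choices).
import math
import random

random.seed(0)
a, b = 2.0, 1.0
n = 200_000

# Simulate X ~ Exp(1) by inverse transform, then apply Y = aX + b.
ys = [a * (-math.log(1.0 - random.random())) + b for _ in range(n)]

# Estimate f_Y near y0 from the fraction of samples in a small bin.
y0, h = 2.0, 0.1
est = sum(1 for y in ys if y0 <= y < y0 + h) / (n * h)

# Analytic density from the transformation formula.
x = (y0 - b) / a
exact = (1.0 / abs(a)) * math.exp(-x)

print(est, exact)  # the two values should agree to roughly 1e-2
```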
Example (Power-Law Function)
Consider $Y = X^2$:
$$P(Y \le y) = P(X^2 \le y) = P(-\sqrt{y} \le X \le \sqrt{y}) = F_X(\sqrt{y}) - F_X(-\sqrt{y}).$$
By differentiation,
$$f_Y(y) = \frac{1}{2\sqrt{y}} \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right].$$
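The Y = X² result can be checked against simulation; here X ~ N(0,1) (an illustrative choice), so Y is chi-squared with one degree of freedom:

```python
# Check F_Y(y) = F_X(sqrt(y)) - F_X(-sqrt(y)) for Y = X^2 with X ~ N(0,1).
import math
import random

random.seed(1)
n = 200_000
ys = [random.gauss(0.0, 1.0) ** 2 for _ in range(n)]

def norm_cdf(x):  # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

y0 = 1.0
empirical = sum(1 for y in ys if y <= y0) / n
exact = norm_cdf(math.sqrt(y0)) - norm_cdf(-math.sqrt(y0))  # = 0.6827 at y0 = 1

print(empirical, exact)
```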
Non-1-1 Mapping
If Y = g(X)
dx1 dx2
pY ( y ) p X ( x1 ) p X ( x2 ) , where xi g 1 ( y ).
dy dy
Simulating A Random Variable
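A standard way to simulate a random variable with CDF F is inverse-transform sampling: draw U ~ Uniform(0,1) and set X = F⁻¹(U). A minimal sketch for the exponential distribution (the rate value is an illustrative assumption):

```python
# Inverse-transform sampling: if U ~ Uniform(0,1) and F is a CDF, then
# X = F^{-1}(U) has distribution F. For Exp(rate), F^{-1}(u) = -ln(1-u)/rate.
import math
import random

random.seed(2)
rate = 0.5
n = 100_000
samples = [-math.log(1.0 - random.random()) / rate for _ in range(n)]

mean = sum(samples) / n
print(mean)  # should be close to 1/rate = 2.0
```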
5-3 Moment Generating Functions
Moment generating functions provide a systematic way to
determine the moments of a distribution and to establish
the distributions of functions of random variables.
The expected value of $g(X) = X^r$, for r = 0, 1, 2, …,
is called the r-th moment (about the origin) of X.
Moment Generating Function
Although the moments of X can be determined directly
from the definition, there exists an alternative way
utilizing the moment generating function
$$M_X(t) = E[e^{tX}].$$
If the moment generating function of X does exist, it can
be used to generate all the moments of X.
Theorem
$$E[X^r] = \left. \frac{d^r M_X(t)}{dt^r} \right|_{t=0}, \quad r = 1, 2, \ldots$$
[Example] Find the moment generating function of the
binomial R.V. X and verify that $\mu = np$ and $\sigma^2 = np(1-p)$.
(Solution) $M_X(t) = \sum_{x=0}^{n} \binom{n}{x} (pe^t)^x (1-p)^{n-x} = (pe^t + 1 - p)^n$. Differentiating at t = 0 gives $\mu = M_X'(0) = np$ and $E[X^2] = M_X''(0) = np(1-p) + (np)^2$, so $\sigma^2 = np(1-p)$.
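The binomial mean and variance can also be recovered numerically by differentiating the MGF at t = 0 (a sketch; n and p are illustrative choices):

```python
# Numerically differentiate the binomial MGF M(t) = (p e^t + 1 - p)^n at t = 0
# to recover the mean np and the variance np(1-p).
import math

n, p = 10, 0.3

def M(t):
    return (p * math.exp(t) + 1.0 - p) ** n

h = 1e-5
m1 = (M(h) - M(-h)) / (2 * h)            # first moment  E[X]
m2 = (M(h) - 2 * M(0.0) + M(-h)) / h**2  # second moment E[X^2]
var = m2 - m1 ** 2

print(m1, var)  # ~ np = 3.0 and ~ np(1-p) = 2.1
```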
Moment Generating Functions
The moment generating function of X having a normal
distribution with mean $\mu$ and variance $\sigma^2$ is given by
$$M_X(t) = \exp(\mu t + \tfrac{1}{2} \sigma^2 t^2).$$
The moment generating function of X having a chi-squared
distribution with $\nu$ degrees of freedom is given by
$$M_X(t) = (1 - 2t)^{-\nu/2}.$$
$$M_{X+a}(t) = e^{at} M_X(t), \qquad M_{aX}(t) = M_X(at).$$
If $Y = X_1 + X_2 + \cdots + X_n$, where the $X_i$ are independent R.V.s, then
$$M_Y(t) = M_{X_1}(t)\, M_{X_2}(t) \cdots M_{X_n}(t).$$
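The product rule for the MGF of a sum of independent R.V.s can be illustrated by Monte Carlo (a sketch; the choice of three Exp(1) summands and of t is an illustrative assumption):

```python
# Monte Carlo check that M_Y(t) = M_X1(t) M_X2(t) M_X3(t) for the sum of
# three independent Exp(1) variables, whose common MGF is 1/(1-t) for t < 1.
import math
import random

random.seed(3)
n = 200_000
t = 0.3

def exp1():  # one Exp(1) draw by inverse transform
    return -math.log(1.0 - random.random())

ys = [exp1() + exp1() + exp1() for _ in range(n)]
mc = sum(math.exp(t * y) for y in ys) / n   # Monte Carlo estimate of E[e^{tY}]
exact = (1.0 / (1.0 - t)) ** 3              # product of the three MGFs

print(mc, exact)  # exact = (1/0.7)^3, about 2.915
```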
5-4 Characteristic Functions
The characteristic function of X is defined as
$$\varphi_X(t) = E[e^{itX}] = E[\cos(tX)] + i\,E[\sin(tX)] = M_X(it) = \int_{-\infty}^{\infty} e^{itx} f_X(x)\, dx.$$
Given a characteristic function $\varphi_X(t)$, we can recover the
pdf $f_X(x)$ through the (inverse) Fourier transform:
$$f_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \varphi_X(t)\, dt.$$
Moments from the Characteristic Function
$$E[X^k] = \frac{1}{i^k} \left. \frac{d^k \varphi_X(t)}{dt^k} \right|_{t=0}.$$
It can be shown that the characteristic function of X having
a normal distribution with mean $\mu$ and variance $\sigma^2$ is given by
$$\varphi_X(t) = e^{i\mu t} e^{-\sigma^2 t^2 / 2},$$
which can be derived from the moment generating function
$M_X(it)$ of a normal X.
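The inversion formula can be checked numerically for the normal characteristic function (a sketch; μ, σ, the truncation T, and the grid are illustrative choices):

```python
# Numerically invert phi(t) = exp(i*mu*t - sigma^2 t^2 / 2), the normal
# characteristic function, and compare with the normal pdf at one point.
import cmath
import math

mu, sigma = 1.0, 2.0
x = 0.5

def phi(t):
    return cmath.exp(1j * mu * t - 0.5 * sigma * sigma * t * t)

# f_X(x) = (1/2*pi) * integral of e^{-itx} phi(t) dt, truncated to [-T, T]
# and discretized with the midpoint rule.
T, steps = 10.0, 4000
dt = 2 * T / steps
integral = sum(cmath.exp(-1j * t * x) * phi(t) * dt
               for t in (-T + (k + 0.5) * dt for k in range(steps)))
f_numeric = (integral / (2 * math.pi)).real

f_exact = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
print(f_numeric, f_exact)
```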
5-5 Limit and Convergence
In probability theory, there exist several different notions of
convergence of random variables. The convergence of a
sequence of random variables to some limit random variable
is an important concept in probability theory and its applications.
The concepts formalize the idea that a sequence of random
events can sometimes be expected to settle down into a
behavior that is essentially unchanging when items far
enough into the sequence are studied.
The different notions of convergence relate to how such a
behavior can be characterized; two readily understood
behaviors are that the sequence eventually takes a constant
value, and that values in the sequence can be described by
an unchanging probability distribution.
Modes of Convergence
$\lim_{n \to \infty} X_n = X$ defines the limit of a sequence of R.V.s.
There exist, however, several different ways (modes) to
interpret the convergence to a limit.
Convergence with probability 1
$$P\left( \lim_{n \to \infty} X_n = X \right) = P\left( \left\{ \omega : \lim_{n \to \infty} X_n(\omega) = X(\omega) \right\} \right) = 1.$$
This mode of convergence is also known as almost sure
(a.s.) convergence: the sequence converges everywhere
except possibly on a set of outcomes with probability zero.
Convergence in probability
$$\lim_{n \to \infty} P\left( |X_n - X| > \epsilon \right) = 0, \quad \forall \epsilon > 0.$$
Convergence in mean square
$$\lim_{n \to \infty} E\left[ (X_n - X)^2 \right] = 0.$$
Convergence in distribution
$$\lim_{n \to \infty} F_{X_n}(x) = F_X(x),$$
i.e., the sequence of CDFs $F_{X_n}(x)$ converges pointwise to $F_X(x)$
(at every continuity point of $F_X$).
Examples
Convergence in distribution of a binomial R.V.
to a Gaussian R.V.
Convergence of the Poisson distribution to the
Gaussian distribution.
If $X_n = \dfrac{nZ}{n+1}$, then $X_n$ converges to Z with
probability 1 (as long as $P(|Z| < \infty) = 1$), and
converges in mean square to Z if $E(Z^2)$ is finite.
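The last example can be checked directly: Xₙ − Z = −Z/(n+1), so the mean-square error decays like 1/(n+1)² (a sketch; Z ~ N(0,1) is an illustrative choice, giving E(Z²) = 1):

```python
# For X_n = n Z / (n + 1), X_n - Z = -Z/(n+1), so the mean-square error
# E[(X_n - Z)^2] = E[Z^2]/(n+1)^2 tends to 0; checked by simulation.
import random

random.seed(4)
zs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

def mse(n):
    return sum((n * z / (n + 1) - z) ** 2 for z in zs) / len(zs)

errs = [mse(n) for n in (1, 10, 100)]
print(errs)  # roughly E[Z^2] times [1/4, 1/121, 1/10201]
```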
Convergence Relationship
The statement convergence with probability 1 is
stronger than (implies) the statement
convergence in probability.
Convergence in mean square also implies
convergence in probability.
The weakest mode of convergence is
convergence in distribution, which is implied by
convergence in probability.
5-6 Important Theorems
If two random variables have the same characteristic
function, they have the same distribution function.
Continuity Theorem: The convergence of characteristic
functions (of a sequence of R.V.s) implies convergence of
the corresponding distribution functions.
Weak Law of Large Numbers:
Let $X_1, X_2, \ldots$ be independent, identically distributed R.V.s
having finite mean $\mu$, and set $S_n = X_1 + X_2 + \cdots + X_n$. Then for any
$\epsilon > 0$,
$$\lim_{n \to \infty} P\left( \left| \frac{S_n}{n} - \mu \right| \ge \epsilon \right) = 0. \quad \text{(convergence in probability)}$$
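The weak law can be illustrated by simulation (a sketch; the Bernoulli(0.5) summands, ε, and the sample sizes are illustrative choices):

```python
# Empirical illustration of the weak law of large numbers: for
# X_i ~ Bernoulli(0.5), P(|S_n/n - 0.5| >= eps) shrinks as n grows.
import random

random.seed(5)
eps, reps = 0.05, 2000

def tail_prob(n):
    bad = 0
    for _ in range(reps):
        mean = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(mean - 0.5) >= eps:
            bad += 1
    return bad / reps

probs = [tail_prob(n) for n in (10, 100, 1000)]
print(probs)  # should decrease toward 0
```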
Central Limit Theorem
Let $X_1, X_2, \ldots$ be independent, identically distributed
R.V.s having mean $\mu$ and nonzero variance $\sigma^2$. Then
$$\lim_{n \to \infty} P\left( \frac{S_n - n\mu}{\sigma \sqrt{n}} \le x \right) = \Phi(x), \quad \text{(convergence in distribution)}$$
where $\Phi(x)$ is the standard normal distribution function.
If we are sampling from a population with an unknown distribution,
the sampling distribution of $\bar{X}$ (the sample mean) will still be
approximately normal provided that the sample size is large
($n \ge 30$).
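The central limit theorem can be illustrated by standardizing sums of Uniform(0,1) variables (a sketch; n = 30 and the checkpoints are illustrative choices):

```python
# CLT sketch: standardized sums of Uniform(0,1) variables (mu = 1/2,
# sigma^2 = 1/12) should be approximately standard normal.
import math
import random

random.seed(6)
n, reps = 30, 50_000
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)

zs = []
for _ in range(reps):
    s = sum(random.random() for _ in range(n))
    zs.append((s - n * mu) / (sigma * math.sqrt(n)))

def norm_cdf(x):  # standard normal CDF
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-1.0, 0.0, 1.0):
    emp = sum(z <= x for z in zs) / reps
    print(x, emp, norm_cdf(x))  # empirical CDF vs Phi(x)
```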