Random Variables & Independence

STA 211: The Mathematics of Regression

Niccolo Anceschi Ph.D.

Slides adapted from lectures by Prof. Jerry Reiter
Introduction
▶ Probability Theory: the math of randomness
▶ A random variable is an abstract object that can take any one of a set of
possible values with certain probabilities, rather than a single
deterministic value
▶ Useful to model experimental outcomes

▶ Statistical models are probability distributions with unknown parameters.
▶ The problems considered by probability and statistics are
inverse to each other.
▶ In probability theory we consider some underlying process that
has some randomness or uncertainty modeled by random
variables, and we figure out what happens.
▶ In statistics we observe something that has happened, and try
to figure out what underlying process would explain those
observations.
Marginal Distribution and Expectation of RVs
A random variable X ∈ R can be discrete or continuous.
▶ Discrete: The marginal distribution is given by the probability mass function (PMF) pX : Z → [0, 1] defined as

pX(x) = Pr(X = x)   ∀x ∈ Z

▶ Continuous: The marginal distribution is given by a probability density function (PDF) fX : R → [0, ∞) defined as

Pr(a ≤ X < b) = ∫_a^b fX(x) dx

In this case, we have Pr(X = x) = 0 for every x ∈ R


▶ For a real-valued function g, we can calculate expectations:

E(g(X)) = Σ_{x∈Z} g(x) pX(x)            if X is discrete
E(g(X)) = ∫_{−∞}^{∞} g(x) fX(x) dx      if X is continuous
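For instance, here is a minimal Python sketch (not from the original slides) of the discrete case, using a fair six-sided die so that pX(x) = 1/6 for x ∈ {1, . . . , 6}:

# Hypothetical illustration: E(g(X)) for a discrete RV via the PMF sum.
pmf = {x: 1/6 for x in range(1, 7)}   # fair die: p_X(x) = 1/6

def expectation(g, pmf):
    # E(g(X)) = sum over x of g(x) * p_X(x)
    return sum(g(x) * p for x, p in pmf.items())

print(expectation(lambda x: x, pmf))      # E(X)   = 3.5
print(expectation(lambda x: x**2, pmf))   # E(X^2) = 91/6 ≈ 15.17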
Marginal distributions do not tell us everything

Suppose I tell you:


▶ X is a toss of a fair die,
▶ Y is a toss of a fair die.
Do you have all the information about X and Y now?

No, because we can still have many different scenarios:


1. I toss one die: X is the result, and Y = X is the result of the same toss.
2. I toss two dice: X is the first toss and Y is the second.
3. I toss one die: X is the result, and Y = 7 − X is the number at the bottom
(not the top) of the tossed die.
In all cases, X and Y have the marginal distribution of a fair die toss.
Joint Distribution for discrete RVs

All the information (called the joint distribution) about two discrete RVs X, Y is captured by a joint PMF:

pX,Y(x, y) = Pr(X = x, Y = y),

where the comma is interpreted as an “and” (or intersection).

For example, in the previous scenario:


1. Same toss, Y = X: pX,Y(x, y) = 1/6 if x = y ∈ {1, . . . , 6},
and pX,Y(x, y) = 0 otherwise.
2. Two tosses X, Y: pX,Y(x, y) = 1/6² = 1/36 for any x, y ∈ {1, . . . , 6}.
3. Opposite sides, Y = 7 − X: pX,Y(x, y) = 1/6 if x ∈ {1, . . . , 6}
and y = 7 − x, and pX,Y(x, y) = 0 otherwise.
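As a sketch (Python, not part of the original slides), the three joint PMFs can be written out explicitly as dictionaries, and the claim that all three share the fair-die marginal can be checked by summing over y (anticipating the marginal-recovery formula on the next slide):

# Hypothetical illustration: the three joint PMFs as Python dictionaries.
sides = range(1, 7)

p_same = {(x, y): 1/6  for x in sides for y in sides if y == x}       # scenario 1
p_two  = {(x, y): 1/36 for x in sides for y in sides}                 # scenario 2
p_opp  = {(x, y): 1/6  for x in sides for y in sides if y == 7 - x}   # scenario 3

def marginal_x(joint):
    # p_X(x) = sum over y of p_{X,Y}(x, y); missing pairs have probability 0
    return {x: sum(p for (a, _), p in joint.items() if a == x) for x in sides}

for joint in (p_same, p_two, p_opp):
    print(marginal_x(joint))   # each prints p_X(x) = 1/6 for x = 1,...,6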
Basic properties of the joint PMF

Law of total probability:

Σ_{x∈Z} Σ_{y∈Z} pX,Y(x, y) = 1.

Recovery of marginal distributions:

pX(x) = Σ_{y∈Z} pX,Y(x, y)   and   pY(y) = Σ_{x∈Z} pX,Y(x, y)

Finally, we can calculate the expectation of a function g of X, Y using the formula:

E(g(X, Y)) = Σ_{x∈Z} Σ_{y∈Z} g(x, y) pX,Y(x, y)
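A short Python sketch of this double sum (the joint PMF below is a made-up example on {0, 1} × {0, 1}, not one of the dice scenarios):

# Hypothetical illustration: E(g(X, Y)) as a double sum over the joint PMF.
# Made-up joint PMF on {0,1} x {0,1}; the four probabilities sum to 1.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

def joint_expectation(g, joint):
    # E(g(X, Y)) = sum over x and y of g(x, y) * p_{X,Y}(x, y)
    return sum(g(x, y) * p for (x, y), p in joint.items())

print(joint_expectation(lambda x, y: x * y, joint))   # E(XY)  = 0.3
print(joint_expectation(lambda x, y: x + y, joint))   # E(X+Y) = 0.9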
Suggested exercise #1: For X and Y in all previous scenarios 1-3:
(i) Show X and Y have the marginal PMF pX (t) = pY (t) = 1/6
if t ∈ {1, . . . , 6}.
(ii) Compute E(X · Y ).
(iii) Compute E(X + Y ).

For the third part, you can use the fact that for any constants
a, b ∈ R, it holds:

E(aX + bY ) = aE(X ) + bE(Y ).

This is called the linearity of expectations.
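Linearity holds whether or not X and Y are independent. A quick simulation sketch in Python (assuming NumPy; the constants a, b and the dependent pair below are arbitrary illustrations):

# Hypothetical illustration: E(aX + bY) = a E(X) + b E(Y) even for dependent RVs.
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(1, 7, size=1_000_000)   # tosses of a fair die
y = 7 - x                                # scenario 3: Y is fully determined by X

a, b = 2.0, -3.0
print(np.mean(a * x + b * y))            # ~ 2*3.5 - 3*3.5 = -3.5
print(a * np.mean(x) + b * np.mean(y))   # same value, by linearity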


Joint PDF for continuous RVs
Given two continuous RVs X, Y, their joint distribution is captured by a joint PDF fX,Y : R × R → [0, ∞) such that:

Pr(a1 ≤ X < b1, a2 ≤ Y < b2) = ∫_{a2}^{b2} ∫_{a1}^{b1} fX,Y(x, y) dx dy,

where the comma is interpreted as an “and” (or intersection).


Using the above condition, it is possible to show:
▶ Law of total probability: ∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1
▶ Recovery of marginal PDFs: fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy
and fY(y) = ∫_{−∞}^{∞} fX,Y(x, y) dx

Finally, if X and Y are continuous RVs:

E(g(X, Y)) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fX,Y(x, y) dx dy
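As a numerical sketch (Python, assuming SciPy is available; the density below is a made-up example with two independent Exponential(1) variables, not something from the slides):

# Hypothetical illustration: checking joint-PDF properties by numerical integration.
import numpy as np
from scipy.integrate import dblquad

# Made-up joint PDF: f(x, y) = e^{-x} e^{-y} on [0, inf) x [0, inf).
f = lambda x, y: np.exp(-x - y)

# Law of total probability: the joint PDF integrates to 1.
total, _ = dblquad(lambda y, x: f(x, y), 0, np.inf, lambda x: 0, lambda x: np.inf)

# E(XY), using g(x, y) = x * y in the double-integral formula above.
e_xy, _ = dblquad(lambda y, x: x * y * f(x, y), 0, np.inf, lambda x: 0, lambda x: np.inf)

print(total)  # ~ 1.0
print(e_xy)   # ~ 1.0 (= E(X) E(Y), since here X and Y are independent)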
Independence

Two discrete RVs X, Y are called independent if

pX,Y(x, y) = pX(x) pY(y)   ∀x, y ∈ Z.

Similarly, two continuous RVs are called independent if

fX,Y(x, y) = fX(x) fY(y)   ∀x, y ∈ R.

Thus X and Y are independent if their joint PMF (or PDF) is the product of their marginals.
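A small Python sketch (the helper name is hypothetical, not from the slides) of how this definition can be checked numerically for a discrete joint PMF; it may be handy for the exercise below:

# Hypothetical helper: test whether a discrete joint PMF factorizes into its marginals.
import math

def is_independent(joint, tol=1e-12):
    xs = {x for (x, _) in joint}
    ys = {y for (_, y) in joint}
    p_x = {x: sum(joint.get((x, y), 0.0) for y in ys) for x in xs}
    p_y = {y: sum(joint.get((x, y), 0.0) for x in xs) for y in ys}
    # Independence: p_{X,Y}(x, y) = p_X(x) p_Y(y) for every pair (x, y).
    return all(math.isclose(joint.get((x, y), 0.0), p_x[x] * p_y[y], abs_tol=tol)
               for x in xs for y in ys)

# Example with a made-up joint PMF that clearly factorizes:
product_pmf = {(x, y): 0.5 * 0.5 for x in (0, 1) for y in (0, 1)}
print(is_independent(product_pmf))   # True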

Suggested exercise #2: Consider X and Y in the previous scenarios 1-3. Show that X and Y are independent in scenario 2 and dependent in scenarios 1 and 3.
Measuring dependence using covariance
Given two RVs X , Y ∈ R, one way to measure the dependence
between them is through covariance:

Cov(X, Y) = E[(X − E(X))(Y − E(Y))].

Correlation is the normalized form defined as:


Corr(X, Y) = Cov(X, Y) / (√Var(X) · √Var(Y)) ∈ [−1, 1]
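For instance, a Python sketch (assuming NumPy; the simulated pair below is made up for illustration) that estimates covariance and correlation from random draws:

# Hypothetical illustration: estimating Cov(X, Y) and Corr(X, Y) by simulation.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)   # made-up dependent pair

cov = np.mean((x - x.mean()) * (y - y.mean()))   # sample version of E[(X-EX)(Y-EY)]
corr = cov / (np.std(x) * np.std(y))             # Cov / (sqrt(Var X) sqrt(Var Y))

print(cov)    # ~ 2.0, since Cov(X, 2X + Z) = 2 Var(X) = 2
print(corr)   # ~ 2 / sqrt(5) ≈ 0.894, always within [-1, 1]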

Suggested exercise #3:


(i) Show that, if X and Y are independent, then

Cov(g(X), h(Y)) = 0

for all real-valued functions g, h.


The converse statement is also true: if Cov(g(X), h(Y)) = 0 for all real-valued functions g and h, then X and Y are independent.
Covariance identities

Suggested exercise #4:


(i) Compute Cov(X, Y) in the previous scenarios 1-3.
You may use the covariance identity:

Cov(X, Y) = E[XY] − E(X)E(Y).

The special case X = Y yields Var(X) = E(X²) − E(X)².


Covariance depends on scale but not on translation:

Cov(aX + b, cY + d) = ac Cov(X, Y).

Finally, covariance appears in the formula for the variance of a sum:

Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
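These identities are easy to sanity-check by simulation. A Python sketch (assuming NumPy; the distributions and constants are arbitrary choices for illustration):

# Hypothetical illustration: numerical check of the covariance identities above.
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(size=500_000)
y = x + rng.normal(size=500_000)         # made-up dependent pair

cov = lambda u, v: np.mean((u - u.mean()) * (v - v.mean()))

# Cov(aX + b, cY + d) = a c Cov(X, Y): translation drops out, scale factors multiply.
a, b, c, d = 2.0, 5.0, -1.0, 3.0
print(cov(a * x + b, c * y + d), a * c * cov(x, y))

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov(x, y))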
