Some basic formulae for use in ST205
This note is meant to be helpful in understanding the derivations given in the
lecture notes. Everything in this note should be well understood for the exam.
Below, y1, ..., yn are arbitrary random variables. Remember that the yi’s can
basically be anything: in the lecture notes, yi usually represents the observed Y
value for the ith sampling unit. But we may also consider the mean of the ith
sample, usually denoted in this course as ȳi. The rules below also hold for such
sample means, or for any other statistics (usually estimators in this course).
The expectation
Firstly, the expectation of a sum is the sum of the expectations:
E(y1 + y2) = Ey1 + Ey2
Note that by repeatedly applying this formula we can deduce
E(y1 + ... + yn) = Ey1 + ... + Eyn
Secondly, the expectation of a constant times a random variable equals the
constant times the expectation:
E(cy1) = cEy1
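For readers who like to check such rules numerically, here is a minimal Python sketch (the distributions, the dependence between y1 and y2, and the constant c = 5 are arbitrary illustrative choices) that compares both sides of the two expectation rules by simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sim = 1_000_000  # number of simulated draws

# Two random variables; y2 is deliberately made dependent on y1,
# since dependence does not matter for expectations.
y1 = rng.normal(loc=2.0, scale=1.0, size=n_sim)
y2 = y1 + rng.exponential(scale=3.0, size=n_sim)

# Left- and right-hand side of E(y1 + y2) = Ey1 + Ey2
print(np.mean(y1 + y2), np.mean(y1) + np.mean(y2))

# Left- and right-hand side of E(c y1) = c Ey1
c = 5.0
print(np.mean(c * y1), c * np.mean(y1))
```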
Thus, expectations are relatively easy to work with; the rules for variances are
somewhat more involved.
The variance
Firstly, the variance of a sum is generally not equal to the sum of the variances:
var(y1 + y2) = var(y1) + var(y2) + 2cov(y1,y2)
The variance of a difference is as follows:
var(y1 − y2) = var(y1) + var(y2) − 2cov(y1,y2)
A relation between the variance and the covariance is:
var(y1) = cov(y1,y1)
That is, the variance of a variable is the covariance of the variable with itself.
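A small simulation can make the role of the covariance visible. The following Python sketch (the construction of the correlated pair y1, y2 is an arbitrary illustrative choice) compares both sides of the formulas for var(y1 + y2) and var(y1 − y2):

```python
import numpy as np

rng = np.random.default_rng(1)
n_sim = 1_000_000

# Construct two correlated variables via a shared component z.
z = rng.normal(size=n_sim)
y1 = z + rng.normal(size=n_sim)
y2 = 2.0 * z + rng.normal(size=n_sim)   # shares z with y1, so cov(y1, y2) > 0

C = np.cov(y1, y2)                       # 2x2 sample covariance matrix
v1, v2, c12 = C[0, 0], C[1, 1], C[0, 1]

# var(y1 + y2) = var(y1) + var(y2) + 2 cov(y1, y2)
print(np.var(y1 + y2, ddof=1), v1 + v2 + 2 * c12)

# var(y1 - y2) = var(y1) + var(y2) - 2 cov(y1, y2)
print(np.var(y1 - y2, ddof=1), v1 + v2 - 2 * c12)
```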
For the covariance, we have, in a way, simpler calculation rules than for the
variance:
cov(y1,y2 + y3) = cov(y1,y2) + cov(y1,y3)
and of course
cov(y2 + y3,y1) = cov(y1,y2) + cov(y1,y3)
This formula explains the above rule for the variance of a sum, since
var(y1 + y2) = cov(y1 + y2,y1 + y2)
= cov(y1 + y2,y1) + cov(y1 + y2,y2)
= cov(y1,y1) + cov(y1,y2) + cov(y2,y1) + cov(y2,y2)
= var(y1) + cov(y1,y2) + cov(y2,y1) + var(y2)
= var(y1) + var(y2) + 2cov(y1,y2)
So, studying this derivation, we see that the formula for the variance of a sum
follows a sound logic.
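The bilinearity of the covariance can also be checked numerically. Here is a minimal Python sketch (again with an arbitrary choice of mutually dependent simulated variables) comparing cov(y1, y2 + y3) with cov(y1, y2) + cov(y1, y3), and var(y1 + y2) with cov(y1 + y2, y1 + y2):

```python
import numpy as np

rng = np.random.default_rng(2)
n_sim = 1_000_000

# Three arbitrary, mutually dependent variables built from a shared component z.
z = rng.normal(size=n_sim)
y1 = z + rng.normal(size=n_sim)
y2 = z + rng.normal(size=n_sim)
y3 = -z + rng.normal(size=n_sim)

def cov(a, b):
    """Sample covariance of two equally long vectors."""
    return np.cov(a, b)[0, 1]

# cov(y1, y2 + y3) = cov(y1, y2) + cov(y1, y3)
print(cov(y1, y2 + y3), cov(y1, y2) + cov(y1, y3))

# var(y1 + y2) = cov(y1 + y2, y1 + y2)
print(np.var(y1 + y2, ddof=1), cov(y1 + y2, y1 + y2))
```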
By repeatedly applying the above formulae for the variance and covariance
we can deduce
var(y1 + ... + yn) = Σi var(yi) + 2 Σi<j cov(yi,yj)
where the first sum runs over i = 1, ..., n and the second over all pairs i < j.
This formula can also be written, in a form that is perhaps slightly less elegant
but useful for some purposes, as
var(y1 + ... + yn) = Σi Σj cov(yi,yj)
where both sums run over 1, ..., n (the terms with i = j are the variances).
Verify what happens if n = 2!
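In words, the second form says: sum all entries of the covariance matrix of y1, ..., yn, where the diagonal entries are the variances and the off-diagonal entries the covariances. A minimal Python sketch, with n = 4 arbitrarily chosen correlated variables, illustrates this:

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_sim = 4, 1_000_000

# n correlated rows: each yi shares a common component z (arbitrary construction).
z = rng.normal(size=n_sim)
Y = z + rng.normal(size=(n, n_sim))      # row i holds the simulated values of yi

C = np.cov(Y)                            # n x n covariance matrix of y1, ..., yn

# var(y1 + ... + yn) = sum over all i, j of cov(yi, yj)
print(np.var(Y.sum(axis=0), ddof=1), C.sum())

# ... which equals: sum of variances + 2 * (sum of covariances over pairs i < j)
print(np.trace(C) + 2 * np.triu(C, k=1).sum())
```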
Note that the variance of a sum equals the sum of the variances only if the
covariances between all pairs of different variables equal zero. This occurs, for
example, in stratified sampling, for which
var(T̂1 + ... + T̂K) = var(T̂1) + ... + var(T̂K)
where T̂k is the estimated total in stratum k and K is the number of strata. The
reason is that T̂k is independent of T̂l for all k ≠ l (so their covariance equals
zero).
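The following Python sketch mimics this situation with independently simulated stratum totals (the number of strata and their distributions are made-up illustrative values); because the strata are independent, the variance of the overall total agrees, up to simulation error, with the sum of the stratum variances:

```python
import numpy as np

rng = np.random.default_rng(4)
n_sim = 1_000_000
K = 3  # number of strata (illustrative)

# Independently simulated estimated stratum totals T̂_1, ..., T̂_K.
T_hat = np.stack([
    rng.normal(loc=100.0, scale=5.0, size=n_sim),   # stratum 1
    rng.normal(loc=250.0, scale=12.0, size=n_sim),  # stratum 2
    rng.normal(loc=60.0, scale=2.0, size=n_sim),    # stratum 3
])

# The strata are simulated independently, so their covariances are approximately zero
# and var(T̂_1 + ... + T̂_K) is approximately var(T̂_1) + ... + var(T̂_K).
print(np.var(T_hat.sum(axis=0), ddof=1), np.var(T_hat, axis=1, ddof=1).sum())
```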
Finally, we see what happens to the variance if we multiply a random
variable by a constant c:
var(c y1) = c² var(y1)
For the covariance, with constants a and b, we have
cov(a y1, b y2) = a b cov(y1, y2)
Verify that the formula for var(c y1) follows from this!
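Both scaling rules are also easy to check by simulation. A minimal Python sketch, with arbitrary illustrative constants a, b, c and an arbitrary correlated pair y1, y2:

```python
import numpy as np

rng = np.random.default_rng(5)
n_sim = 1_000_000

z = rng.normal(size=n_sim)
y1 = z + rng.normal(size=n_sim)
y2 = z + rng.normal(size=n_sim)   # correlated with y1

a, b, c = 2.0, -3.0, 4.0          # arbitrary constants

# var(c y1) = c^2 var(y1)
print(np.var(c * y1, ddof=1), c**2 * np.var(y1, ddof=1))

# cov(a y1, b y2) = a b cov(y1, y2)
print(np.cov(a * y1, b * y2)[0, 1], a * b * np.cov(y1, y2)[0, 1])
```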