Multivariate Data Analysis
Lecture 4
School of Mathematics
Univariate Normal Distribution
• For $x \sim N(\mu, \sigma^2)$ the density function is
\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2} \left( \frac{x-\mu}{\sigma} \right)^2 \right\} \]
• Write $\left( \frac{x-\mu}{\sigma} \right)^2 = (x-\mu)(\sigma^2)^{-1}(x-\mu)$
• Generalised to a vector $\mathbf{x}$, this is $(\mathbf{x}-\boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})$
Mahalanobis distance
• The squared Mahalanobis distance between $\mathbf{x}$ and $\boldsymbol{\mu}$ is
\[ \Delta^2 = (\mathbf{x}-\boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \]
• Also written as $\Delta^2 = d_M^2(\mathbf{x}, \boldsymbol{\mu})$
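The quadratic form above can be evaluated numerically. The following is a minimal sketch, assuming NumPy; the function name `mahalanobis_sq` and the numerical values are illustrative and not part of the lecture.

```python
import numpy as np

def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance (x - mu)' Sigma^{-1} (x - mu)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # Solve Sigma w = diff rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(sigma, diff))

# Illustrative values (assumed, not from the lecture).
mu = np.array([0.0, 0.0])
sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])
x = np.array([2.0, 1.0])

d2 = mahalanobis_sq(x, mu, sigma)                 # 2^2 / 4 + 1^2 / 1
d2_uni = mahalanobis_sq([3.0], [1.0], [[4.0]])    # ((3 - 1) / 2)^2
```

With a diagonal covariance matrix the distance is a sum of squared standardised coordinates, and for p = 1 it reduces to the univariate term ((x - µ)/σ)².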
Multivariate Normal Distribution
Definition 6.1 A random vector $\mathbf{x} = (x_1, \ldots, x_p)'$ follows a multivariate normal distribution (MVN; or a p-dimensional normal distribution) with parameters $\boldsymbol{\mu}$ and $\Sigma$, written $\mathbf{x} \sim N_p(\boldsymbol{\mu}, \Sigma)$, if its density function is
\[ f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right\} \]
• $E[\mathbf{x}] = \boldsymbol{\mu}$ and $\mathrm{cov}(\mathbf{x}) = \Sigma$, with $\boldsymbol{\mu} \in \mathbb{R}^p$ and $\Sigma$ a symmetric positive definite $p \times p$ matrix.
• Here the Mahalanobis distance $\Delta$ (from the quadratic form in the exponent) increases with $p$
• $|\Sigma|^{1/2}$ is the analogue of $\sqrt{\sigma^2} = \sigma$ in the univariate case
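Definition 6.1 can be checked numerically. The sketch below assumes NumPy; `mvn_pdf` and `normal_pdf` are hypothetical helper names introduced for illustration. It evaluates the density directly from the formula and confirms that p = 1 recovers the univariate density.

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N_p(mu, sigma) at x, following Definition 6.1."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    sigma = np.atleast_2d(np.asarray(sigma, dtype=float))
    p = mu.size
    diff = x - mu
    quad = diff @ np.linalg.solve(sigma, diff)   # (x - mu)' Sigma^{-1} (x - mu)
    norm_const = (2.0 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm_const)

def normal_pdf(x, mu, var):
    """Univariate N(mu, var) density, used for comparison."""
    return float(np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var))

# p = 1 recovers the univariate density (values are illustrative).
uni_ok = np.isclose(mvn_pdf(1.5, 1.0, 4.0), normal_pdf(1.5, 1.0, 4.0))

# Standard bivariate normal at the origin: 1 / (2 * pi).
peak = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

Solving the linear system instead of inverting Σ is a design choice for numerical stability; the result is the same quadratic form.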
Generalised Population Variance
• $|\Sigma|$ is the generalised population variance
• $|\Sigma|^{1/2}$ is the analogue of $\sqrt{\sigma^2} = \sigma$ in the univariate case
• Multicollinearity means the variables are highly intercorrelated. In that case:
  • some eigenvalues of $\Sigma$ are close to 0
  • $|\Sigma|$, the product of the eigenvalues, will be small
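To see how multicollinearity shrinks the generalised variance, here is a small sketch assuming NumPy and a bivariate covariance matrix with unit variances. The helper `bivariate_cov` and the chosen correlations are illustrative assumptions.

```python
import numpy as np

def bivariate_cov(rho, sx=1.0, sy=1.0):
    """2 x 2 covariance matrix with standard deviations sx, sy and correlation rho."""
    return np.array([[sx ** 2, rho * sx * sy],
                     [rho * sx * sy, sy ** 2]])

for rho in (0.0, 0.9, 0.999):
    sigma = bivariate_cov(rho)
    eigvals = np.linalg.eigvalsh(sigma)   # eigenvalues are 1 - rho and 1 + rho here
    print(rho, eigvals, np.linalg.det(sigma))

det_weak = np.linalg.det(bivariate_cov(0.0))
det_strong = np.linalg.det(bivariate_cov(0.999))
```

With unit variances the determinant equals 1 - ρ², so it approaches 0 as the two variables become nearly collinear, and the smallest eigenvalue 1 - |ρ| does the same.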
Example:
Consider the bivariate normal distribution:
[Figure: two bivariate normal densities. Left: large $|\Sigma|$. Right: small $|\Sigma|$.]
Example: Contour plots
[Figure: contour plots. Left: large $|\Sigma|$. Right: small $|\Sigma|$.]
Example: Finding contours
https://datasciencegenie.com/3d-contour-plots-of-bivariate-normal-distribution/
MVN Properties:
Normality of linear combinations:
1. If $\mathbf{a}$ is a $p \times 1$ vector of constants with $\mathrm{rank}(\mathbf{a}) = 1$ (that is, $\mathbf{a} \neq \mathbf{0}$), the linear combination $y = a_1 x_1 + \ldots + a_p x_p$ of the components of $\mathbf{x} \sim N_p(\boldsymbol{\mu}, \Sigma)$ follows a univariate normal distribution, i.e.
   if $\mathbf{x}$ is $N_p(\boldsymbol{\mu}, \Sigma)$, then $\mathbf{a}'\mathbf{x}$ is $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\Sigma\mathbf{a})$.
2. If $\mathbf{y}_{q \times 1} = A_{q \times p}\, \mathbf{x}_{p \times 1}$, with $\mathbf{x} \sim N_p(\boldsymbol{\mu}, \Sigma)$ and $\mathrm{rank}(A) = q$, $q \le p$, then
\[ \mathbf{y} \sim N_q(A\boldsymbol{\mu}, A\Sigma A'). \]
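The parameters of a linear combination follow directly from these two results. The sketch below assumes NumPy; the values of `mu`, `sigma`, `a` and `A` are made up for illustration.

```python
import numpy as np

# Illustrative parameters of x ~ N_3(mu, sigma) (assumed values).
mu = np.array([1.0, 2.0, 3.0])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Property 1: y = a'x is N(a'mu, a' Sigma a).
a = np.array([1.0, -1.0, 2.0])
mean_y = a @ mu
var_y = a @ sigma @ a

# Property 2: y = A x is N_q(A mu, A Sigma A'), with rank(A) = q = 2.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
mean_vec = A @ mu
cov_y = A @ sigma @ A.T
```

The second row of `A` sums the last two components, so the corresponding variance in `cov_y` is the sum of their variances plus twice their covariance.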
MVN Properties (cont.):
Standardized variables:
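1. Any p-dimensional normal random vector $\mathbf{x} \sim N_p(\boldsymbol{\mu}, \Sigma)$ can be transformed into a standard normal random vector $\mathbf{z}$ with mean vector $\mathbf{0}_p$ and covariance matrix $I_p$ by the following transformation:
\[ \mathbf{z} = \Sigma^{-1/2} (\mathbf{x}-\boldsymbol{\mu}) \sim N_p(\mathbf{0}, I) \]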
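One way to compute $\Sigma^{-1/2}$ is through the eigendecomposition of $\Sigma$. This is a sketch assuming NumPy and the symmetric square root; other square roots, such as a Cholesky factor, would also standardise $\mathbf{x}$. The helper name `inv_sqrt` and the numbers are illustrative.

```python
import numpy as np

def inv_sqrt(sigma):
    """Symmetric inverse square root Sigma^{-1/2} via the eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

# Illustrative parameters (assumed values).
mu = np.array([1.0, -1.0])
sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

s_inv_half = inv_sqrt(sigma)

# cov(z) = Sigma^{-1/2} Sigma Sigma^{-1/2} should be the identity.
cov_z = s_inv_half @ sigma @ s_inv_half

# Standardise one observation.
x = np.array([2.0, 0.5])
z = s_inv_half @ (x - mu)
```

By construction, `z @ z` equals the squared Mahalanobis distance of `x` from `mu`.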
MVN Properties (cont.):
Chi-square distribution:
• Recall that a chi-square random variable with $p$ degrees of freedom, $\chi^2_p$, is defined as the sum of squares of $p$ independent standard normal random variables.
• If $\mathbf{z} \sim N_p(\mathbf{0}, I)$ then
\[ \sum_{j=1}^{p} z_j^2 = \mathbf{z}'\mathbf{z} \sim \chi^2_p \]
and, with $\mathbf{z} = \Sigma^{-1/2}(\mathbf{x}-\boldsymbol{\mu})$, it holds that
\[ \mathbf{z}'\mathbf{z} = (\mathbf{x}-\boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}) \sim \chi^2_p. \]
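A quick simulation can illustrate the chi-square result. This sketch assumes NumPy; the seed, sample size and parameters are arbitrary choices. It compares the sample mean and variance of the quadratic form with the $\chi^2_p$ values $p$ and $2p$.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible

# Illustrative parameters (assumed values), p = 3.
p = 3
mu = np.array([1.0, 0.0, -1.0])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

n = 200_000
x = rng.multivariate_normal(mu, sigma, size=n)

# Row-wise quadratic form (x_i - mu)' Sigma^{-1} (x_i - mu).
diff = x - mu
d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)

mean_d2 = d2.mean()   # should be close to p
var_d2 = d2.var()     # should be close to 2p
```

Matching two moments is only a sanity check, not a proof; a fuller check would compare the empirical distribution of `d2` with the chi-square distribution function.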
MVN Properties (cont.):
Normality of MVN distributions
1. All marginal and conditional distributions are also normal distributions.
2. Any subvector of $\mathbf{x}$ of dimension $k \le p$ follows a k-dimensional normal distribution. Partition
\[ \mathbf{x} = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}, \quad \boldsymbol{\mu} = \begin{pmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \]
Here $\mathbf{x}_1$ and $\boldsymbol{\mu}_1$ are $r \times 1$ and $\Sigma_{11}$ is $r \times r$, where $r < p$. Then
\[ \mathbf{x}_1 \sim N_r(\boldsymbol{\mu}_1, \Sigma_{11}) \]
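In practice, the marginal distribution of a subvector is read off by selecting the matching entries of $\boldsymbol{\mu}$ and the matching rows and columns of $\Sigma$. A minimal sketch, assuming NumPy; the helper `marginal` and the numbers are illustrative.

```python
import numpy as np

def marginal(mu, sigma, idx):
    """Parameters of the marginal distribution of the components listed in idx."""
    idx = np.asarray(idx)
    # np.ix_ selects the square sub-block with rows and columns in idx.
    return mu[idx], sigma[np.ix_(idx, idx)]

# Illustrative parameters of x ~ N_3(mu, sigma) (assumed values).
mu = np.array([1.0, 2.0, 3.0])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Marginal of the first and third components.
mu_1, sigma_11 = marginal(mu, sigma, [0, 2])
```

No integration is needed: the remaining components are simply dropped from the mean vector and covariance matrix.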
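MVN Properties (cont.):
Partition the observations into $\mathbf{x}$ ($p \times 1$) and $\mathbf{y}$ ($q \times 1$), then
\[ E\begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix} = \begin{pmatrix} \boldsymbol{\mu}_x \\ \boldsymbol{\mu}_y \end{pmatrix}, \quad \mathrm{cov}\begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix} = \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix} \]
For the next three slides we assume
\[ \begin{pmatrix} \mathbf{x} \\ \mathbf{y} \end{pmatrix} \text{ is } N_{p+q}\left[ \begin{pmatrix} \boldsymbol{\mu}_x \\ \boldsymbol{\mu}_y \end{pmatrix}, \begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix} \right] \]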
MVN Properties (cont.):
Independence:
1. $\mathbf{x}$ and $\mathbf{y}$ are independent if $\Sigma_{xy} = 0$
2. Two components $x_i$ and $x_j$ are independent if $\sigma_{ij} = 0$
3. If the components $x_i$ of $\mathbf{x}$ are all uncorrelated with each other (correlation matrix $\boldsymbol{\rho} = I_p$), then they are independent. That is, the joint density function can be obtained as the product of the marginals:
\[ x_i \text{ uncorrelated normal variables} \iff f(x_1, \ldots, x_p) = \prod_{i=1}^{p} f(x_i). \]
MVN Properties (cont.):
Conditional distributions:
1. If $\mathbf{x}$ and $\mathbf{y}$ are not independent, so that $\Sigma_{xy} \neq 0$, the conditional distribution $f(\mathbf{x} \mid \mathbf{y})$ is multivariate normal with
\[ E(\mathbf{x} \mid \mathbf{y}) = \boldsymbol{\mu}_x + \Sigma_{xy} \Sigma_{yy}^{-1} (\mathbf{y} - \boldsymbol{\mu}_y) \]
\[ \mathrm{cov}(\mathbf{x} \mid \mathbf{y}) = \Sigma_{xx} - \Sigma_{xy} \Sigma_{yy}^{-1} \Sigma_{yx} \]
Distribution of the sum of two subvectors:
1. If $\mathbf{x}$ and $\mathbf{y}$ are the same size (both $p \times 1$) and independent, then
\[ \mathbf{x} + \mathbf{y} \text{ is } N_p(\boldsymbol{\mu}_x + \boldsymbol{\mu}_y, \Sigma_{xx} + \Sigma_{yy}) \]
\[ \mathbf{x} - \mathbf{y} \text{ is } N_p(\boldsymbol{\mu}_x - \boldsymbol{\mu}_y, \Sigma_{xx} + \Sigma_{yy}) \]
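The conditional mean and covariance formulas translate directly into a few lines of linear algebra. The sketch below assumes NumPy; the function name `conditional_mvn` and the example blocks (a scalar x and a two-dimensional y) are illustrative.

```python
import numpy as np

def conditional_mvn(mu_x, mu_y, s_xx, s_xy, s_yy, y_obs):
    """Mean and covariance of x | y = y_obs for jointly normal (x, y)."""
    # gain = Sigma_xy Sigma_yy^{-1}; uses the symmetry of Sigma_yy.
    gain = np.linalg.solve(s_yy, s_xy.T).T
    cond_mean = mu_x + gain @ (y_obs - mu_y)
    cond_cov = s_xx - gain @ s_xy.T          # Sigma_yx = Sigma_xy'
    return cond_mean, cond_cov

# Illustrative blocks (assumed values): x is 1 x 1, y is 2 x 1.
mu_x = np.array([0.0])
mu_y = np.array([1.0, 2.0])
s_xx = np.array([[2.0]])
s_xy = np.array([[0.5, 0.4]])
s_yy = np.array([[1.0, 0.0],
                 [0.0, 2.0]])

cond_mean, cond_cov = conditional_mvn(mu_x, mu_y, s_xx, s_xy, s_yy,
                                      y_obs=np.array([2.0, 4.0]))
```

Note that the conditional covariance does not depend on the observed value of y, and it is never larger than $\Sigma_{xx}$: conditioning on y can only reduce the variance of x.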
Bivariate Normal Distribution
Given a random vector
\[ (x, y)' \sim N\left( \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{yx} & \sigma_y^2 \end{pmatrix} \right) \]
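Joint density function
1. Joint density function:
\[ f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1-\rho^2}} \exp\left( -\frac{1}{2} Q(x, y) \right), \]
where $\rho = \sigma_{xy} / (\sigma_x \sigma_y)$ is the correlation between $x$ and $y$, and
\[ Q(x, y) = \frac{1}{1-\rho^2} \left[ \left( \frac{x-\mu_x}{\sigma_x} \right)^2 + \left( \frac{y-\mu_y}{\sigma_y} \right)^2 - 2\rho\, \frac{(x-\mu_x)(y-\mu_y)}{\sigma_x \sigma_y} \right] \]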
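As a consistency check, the bivariate formula should agree with the general matrix form of the MVN density when $\sigma_{xy} = \rho \sigma_x \sigma_y$. A sketch assuming NumPy; function names and parameter values are illustrative.

```python
import numpy as np

def bivariate_pdf(x, y, mu_x, mu_y, sx, sy, rho):
    """Bivariate normal density written with sigma_x, sigma_y and rho."""
    zx = (x - mu_x) / sx
    zy = (y - mu_y) / sy
    q = (zx ** 2 + zy ** 2 - 2.0 * rho * zx * zy) / (1.0 - rho ** 2)
    return np.exp(-0.5 * q) / (2.0 * np.pi * sx * sy * np.sqrt(1.0 - rho ** 2))

def matrix_pdf(v, mu, sigma):
    """General MVN density for p = 2, written with Sigma."""
    diff = v - mu
    quad = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(sigma)))

# Illustrative parameters (assumed values).
mu_x, mu_y, sx, sy, rho = 1.0, -1.0, 2.0, 0.5, 0.6
sigma = np.array([[sx ** 2, rho * sx * sy],
                  [rho * sx * sy, sy ** 2]])

f_scalar = bivariate_pdf(1.5, -0.8, mu_x, mu_y, sx, sy, rho)
f_matrix = matrix_pdf(np.array([1.5, -0.8]), np.array([mu_x, mu_y]), sigma)
```

The two values agree because $|\Sigma| = \sigma_x^2 \sigma_y^2 (1-\rho^2)$ and $Q(x, y)$ is the quadratic form $(\mathbf{x}-\boldsymbol{\mu})' \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})$ written out for p = 2.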
Marginal density functions
2. Marginal density functions:
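\[ f_x(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left\{ -\frac{1}{2} \left( \frac{x-\mu_x}{\sigma_x} \right)^2 \right\} \]
and
\[ f_y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\left\{ -\frac{1}{2} \left( \frac{y-\mu_y}{\sigma_y} \right)^2 \right\}. \]
(i.e. univariate normal distributions)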
Independence
3. If $\rho = 0$, that is, if $x$ and $y$ are independent, then
\[ Q(x, y) = \left( \frac{x-\mu_x}{\sigma_x} \right)^2 + \left( \frac{y-\mu_y}{\sigma_y} \right)^2, \]
and, consequently, $f(x, y) = f_x(x) f_y(y)$.
Conditional Distributions
4. If $\rho \neq 0$, the conditional distributions can be obtained as:
\[ f_{y|x}(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma_y \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2} \left( \frac{y - \mu_y - \rho \frac{\sigma_y}{\sigma_x}(x-\mu_x)}{\sigma_y \sqrt{1-\rho^2}} \right)^2 \right\} \]
and
\[ f_{x|y}(x \mid y) = \frac{1}{\sqrt{2\pi}\,\sigma_x \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2} \left( \frac{x - \mu_x - \rho \frac{\sigma_x}{\sigma_y}(y-\mu_y)}{\sigma_x \sqrt{1-\rho^2}} \right)^2 \right\}. \]
Thus
\[ y \mid x \sim N\left( \mu_y + \rho \frac{\sigma_y}{\sigma_x}(x-\mu_x),\ \sigma_y^2 (1-\rho^2) \right) \]
and
\[ x \mid y \sim N\left( \mu_x + \rho \frac{\sigma_x}{\sigma_y}(y-\mu_y),\ \sigma_x^2 (1-\rho^2) \right) \]
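The conditional parameters can be verified against the definition $f_{y|x}(y \mid x) = f(x, y) / f_x(x)$. The sketch below assumes NumPy; all function names and numerical values are illustrative.

```python
import numpy as np

def normal_pdf(t, mean, var):
    """Univariate N(mean, var) density."""
    return np.exp(-0.5 * (t - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def joint_pdf(x, y, mu_x, mu_y, sx, sy, rho):
    """Bivariate normal joint density."""
    zx = (x - mu_x) / sx
    zy = (y - mu_y) / sy
    q = (zx ** 2 + zy ** 2 - 2.0 * rho * zx * zy) / (1.0 - rho ** 2)
    return np.exp(-0.5 * q) / (2.0 * np.pi * sx * sy * np.sqrt(1.0 - rho ** 2))

def conditional_y_given_x(x, mu_x, mu_y, sx, sy, rho):
    """Mean and variance of y | x for a bivariate normal."""
    mean = mu_y + rho * (sy / sx) * (x - mu_x)
    var = sy ** 2 * (1.0 - rho ** 2)
    return mean, var

# Illustrative parameters (assumed values).
mu_x, mu_y, sx, sy, rho = 0.0, 1.0, 2.0, 1.0, 0.5
x0, y0 = 2.0, 0.3

mean, var = conditional_y_given_x(x0, mu_x, mu_y, sx, sy, rho)

# Conditional density two ways: joint / marginal, and N(mean, var) directly.
ratio = joint_pdf(x0, y0, mu_x, mu_y, sx, sy, rho) / normal_pdf(x0, mu_x, sx ** 2)
direct = normal_pdf(y0, mean, var)
```

The conditional mean is linear in the observed x with slope $\rho \sigma_y / \sigma_x$, while the conditional variance $\sigma_y^2 (1-\rho^2)$ does not depend on x.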