Problems:
1. Find the correlation coefficient between annual advertising
expenditures and annual sales revenue for the following data:
Year (i)                              1    2    3    4    5    6    7    8    9   10
Annual advertising expenditure (X)   10   12   14   16   18   20   22   24   26   28
Annual sales (Y)                     20   30   37   50   56   78   89  100  120  110
Solution: Now, $\bar{X} = \frac{\sum X_i}{n} = \frac{190}{10} = 19$, $\bar{Y} = \frac{\sum Y_i}{n} = \frac{690}{10} = 69$
   i     X     Y   X−X̄   Y−Ȳ   (X−X̄)²   (Y−Ȳ)²   (X−X̄)(Y−Ȳ)
   1    10    20    -9    -49       81     2401          441
   2    12    30    -7    -39       49     1521          273
   3    14    37    -5    -32       25     1024          160
   4    16    50    -3    -19        9      361           57
   5    18    56    -1    -13        1      169           13
   6    20    78     1      9        1       81            9
   7    22    89     3     20        9      400           60
   8    24   100     5     31       25      961          155
   9    26   120     7     51       49     2601          357
  10    28   110     9     41       81     1681          369
Total  190   690     0      0      330    11200         1894
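Correlation coefficient is
$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}} = \frac{1894}{\sqrt{330}\,\sqrt{11200}} = 0.985$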
The correlation coefficient between annual expenditure and annual
sales revenue is 0.985.
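
As a numerical cross-check of the table above, the following minimal sketch recomputes r from the raw data using only the Python standard library. The function name `pearson_r` is illustrative and does not come from the notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations and the two sums of squared deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

# Data from Problem 1.
X = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
Y = [20, 30, 37, 50, 56, 78, 89, 100, 120, 110]

r = pearson_r(X, Y)
print(round(r, 3))  # 0.985
```

Working from deviations mirrors the columns of the table, so each intermediate sum (330, 11200, 1894) can be compared with the hand calculation.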
2. Let X, Y and Z be uncorrelated random variables with zero
means and standard deviations 5, 12 and 9 respectively. If U = X
+ Y and V = Y + Z, find the correlation coefficient between U
and V.
Solution:
Given that all the three random variables have zero mean.
Hence, E(X) = E(Y) = E(Z) = 0.
Now, $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$
⇒ $E(X^2) = \mathrm{Var}(X)$ {since E(X) = 0}
$= 5^2 = 25$
Similarly, $E(Y^2) = 12^2 = 144$ and $E(Z^2) = 9^2 = 81$
Since X and Y are uncorrelated we have Cov(X,Y) = 0
⇒ E(XY) = E(X).E(Y) = 0
Similarly, E(YZ) = 0 and E(ZX) = 0.
To find 𝜌(𝑈, 𝑉):
Now, $\rho(U, V) = \frac{E(UV) - E(U) \cdot E(V)}{\sigma_U \cdot \sigma_V}$
E(U) = E [X + Y] = E[X] + E[Y] = 0
E(V) = E [Y + Z] = E[Y] + E[Z] = 0
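$E(U^2) = E[(X + Y)^2] = E[X^2] + E[Y^2] + 2E[XY]$
= 25 + 144 + 0
= 169
Similarly, $E(V^2) = E[Y^2] + E[Z^2] + 2E[YZ] = 144 + 81 + 0 = 225$
Now, $\mathrm{Var}(U) = E(U^2) - [E(U)]^2 = 169$
⇒ $\sigma_U = \sqrt{169} = 13$
Similarly, $\mathrm{Var}(V) = E(V^2) - [E(V)]^2 = 225$
⇒ $\sigma_V = \sqrt{225} = 15$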
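E(UV) = E[(X+Y) (Y+Z)]
$= E(XY) + E(Y^2) + E(XZ) + E(YZ)$
= 0 + 144 + 0 + 0
= 144
Therefore, $\rho(U, V) = \frac{E(UV) - E(U) \cdot E(V)}{\sigma_U \cdot \sigma_V} = \frac{144 - 0}{13 \cdot 15} = \frac{48}{65} \approx 0.7385$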
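
The same result can be checked mechanically. The sketch below is an illustration rather than part of the original solution: it treats U and V as linear combinations of X, Y, Z and uses the fact that all cross-covariances vanish. The dictionary layout and the helper name `cov` are assumptions made for this example.

```python
from fractions import Fraction
from math import isqrt, sqrt

# Variances of the uncorrelated, zero-mean variables X, Y, Z (from the problem).
var = {"X": 5**2, "Y": 12**2, "Z": 9**2}

# U = X + Y and V = Y + Z, written as coefficients over (X, Y, Z).
u = {"X": 1, "Y": 1, "Z": 0}
v = {"X": 0, "Y": 1, "Z": 1}

def cov(a, b):
    """Covariance of two linear combinations when all cross-covariances are zero."""
    return sum(a[k] * b[k] * var[k] for k in var)

var_u, var_v, cov_uv = cov(u, u), cov(v, v), cov(u, v)

# Floating-point correlation coefficient.
rho = cov_uv / (sqrt(var_u) * sqrt(var_v))

# Both variances are perfect squares here, so an exact fraction is available.
rho_exact = Fraction(cov_uv, isqrt(var_u) * isqrt(var_v))

print(var_u, var_v, cov_uv, rho_exact)
```

Only the shared term Y contributes to Cov(U, V), which is why the numerator equals Var(Y) = 144.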
3. If the joint pdf of (X,Y) is given by $f(x, y) = x + y$, $0 \le x, y \le 1$,
find $\rho_{XY}$.
Solution:
We know that, $\rho(X, Y) = \frac{E(XY) - E(X) \cdot E(Y)}{\sigma_X \cdot \sigma_Y}$
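Now, $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy$
$= \int_0^1 \int_0^1 xy(x + y)\,dx\,dy$
$= \int_0^1 \left[\frac{x^3 y}{3} + \frac{x^2 y^2}{2}\right]_{x=0}^{x=1} dy$
$= \int_0^1 \left(\frac{y}{3} + \frac{y^2}{2}\right) dy$
$= \frac{1}{6} + \frac{1}{6}$
$= \frac{1}{3}$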
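The marginal pdfs of X and Y are given by
$f(x) = \int_0^1 f(x, y)\,dy = \int_0^1 (x + y)\,dy = \left[xy + \frac{y^2}{2}\right]_0^1 = x + \frac{1}{2}$
$f(y) = \int_0^1 f(x, y)\,dx = \int_0^1 (x + y)\,dx = \left[\frac{x^2}{2} + xy\right]_0^1 = y + \frac{1}{2}$
$E(X) = \int_0^1 x f(x)\,dx = \int_0^1 x\left(x + \frac{1}{2}\right) dx = \left[\frac{x^3}{3} + \frac{x^2}{4}\right]_0^1 = \frac{1}{3} + \frac{1}{4} = \frac{7}{12}$
$E(Y) = \int_0^1 y f(y)\,dy = \int_0^1 y\left(y + \frac{1}{2}\right) dy = \left[\frac{y^3}{3} + \frac{y^2}{4}\right]_0^1 = \frac{1}{3} + \frac{1}{4} = \frac{7}{12}$
$E(X^2) = \int_0^1 x^2 f(x)\,dx = \int_0^1 x^2\left(x + \frac{1}{2}\right) dx = \left[\frac{x^4}{4} + \frac{x^3}{6}\right]_0^1 = \frac{1}{4} + \frac{1}{6} = \frac{5}{12}$
$E(Y^2) = \int_0^1 y^2 f(y)\,dy = \int_0^1 y^2\left(y + \frac{1}{2}\right) dy = \left[\frac{y^4}{4} + \frac{y^3}{6}\right]_0^1 = \frac{1}{4} + \frac{1}{6} = \frac{5}{12}$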
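$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{5}{12} - \left(\frac{7}{12}\right)^2 = \frac{11}{144}$
⇒ $\sigma_X = \frac{\sqrt{11}}{12}$
$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = \frac{5}{12} - \left(\frac{7}{12}\right)^2 = \frac{11}{144}$
⇒ $\sigma_Y = \frac{\sqrt{11}}{12}$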
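Therefore, $\rho(X, Y) = \frac{E(XY) - E(X) \cdot E(Y)}{\sigma_X \cdot \sigma_Y} = \frac{\frac{1}{3} - \frac{7}{12} \cdot \frac{7}{12}}{\frac{\sqrt{11}}{12} \cdot \frac{\sqrt{11}}{12}} = \frac{-1/144}{11/144} = -\frac{1}{11}$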
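
The arithmetic of this problem can be verified with exact fractions. The sketch below assumes the closed form E[X^m Y^n] = 1/((m+2)(n+1)) + 1/((m+1)(n+2)), which follows from integrating x^m y^n (x + y) term by term over the unit square. The helper name `moment` is hypothetical.

```python
from fractions import Fraction

def moment(m, n):
    """E[X^m Y^n] for f(x, y) = x + y on the unit square."""
    # Integral of x^(m+1) y^n plus integral of x^m y^(n+1) over [0, 1]^2.
    return Fraction(1, (m + 2) * (n + 1)) + Fraction(1, (m + 1) * (n + 2))

e_x, e_y, e_xy = moment(1, 0), moment(0, 1), moment(1, 1)
var_x = moment(2, 0) - e_x**2
var_y = moment(0, 2) - e_y**2
cov_xy = e_xy - e_x * e_y

# sigma_X * sigma_Y = sqrt(var_x * var_y); the two variances are equal here,
# so the product of standard deviations is simply var_x.
rho = cov_xy / var_x

print(e_x, e_xy, var_x, rho)
```

Using `Fraction` avoids rounding, so the output can be compared directly with the hand-derived values 7/12, 1/3, 11/144 and -1/11.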
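4. The independent random variables X and Y have the pdfs given by
$f_X(x) = \begin{cases} 4ax, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$, $f_Y(y) = \begin{cases} 4by, & 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$
Find the correlation coefficient between X and Y.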
Solution:
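$E(X) = \int_0^1 x f(x)\,dx = \int_0^1 x \cdot 4ax\,dx = 4a \int_0^1 x^2\,dx = 4a \left[\frac{x^3}{3}\right]_0^1 = \frac{4a}{3}$
$E(Y) = \int_0^1 y f(y)\,dy = \int_0^1 y \cdot 4by\,dy = 4b \int_0^1 y^2\,dy = 4b \left[\frac{y^3}{3}\right]_0^1 = \frac{4b}{3}$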
Since X and Y are independent, the joint pdf of X and Y is
given by 𝑓(𝑥, 𝑦) = 𝑓(𝑥). 𝑓(𝑦)
= (4𝑎𝑥)(4𝑏𝑦)
= 16𝑎𝑏𝑥𝑦, 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 1
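Now, $E(XY) = \int_0^1 \int_0^1 xy\,f(x, y)\,dx\,dy$
$= \int_0^1 \int_0^1 xy(16abxy)\,dx\,dy = 16ab \cdot \frac{1}{3} \cdot \frac{1}{3} = \frac{16ab}{9}$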
Therefore we get, Cov(X,Y) = E(XY) – E(X)E(Y)
$= \frac{16ab}{9} - \frac{4a}{3} \cdot \frac{4b}{3} = 0$
which implies that $\rho(X, Y) = 0$.
That is, the independent variables X and Y are uncorrelated: there is no
linear relationship between them.
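
A short exact-arithmetic check of the covariance is sketched below. The notes keep a and b symbolic; the value a = b = 1/2 used here is an assumption, chosen because it makes each density integrate to 1 on [0, 1].

```python
from fractions import Fraction

# Assumed values (not given in the notes): a = b = 1/2 normalises 4ax and 4by.
a = b = Fraction(1, 2)

third = Fraction(1, 3)            # integral of t^2 over [0, 1]

e_x = 4 * a * third               # E(X) = 4a/3
e_y = 4 * b * third               # E(Y) = 4b/3
e_xy = 16 * a * b * third * third # E(XY) = 16ab/9 from the joint pdf 16abxy

cov_xy = e_xy - e_x * e_y
print(e_x, e_y, e_xy, cov_xy)
```

Because 16ab/9 and (4a/3)(4b/3) are the same expression, the covariance is zero for any choice of a and b, not only the assumed one.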
SPEARMAN'S RANK CORRELATION COEFFICIENT
The rank correlation coefficient is useful for measuring the correlation between
two qualitative characteristics, such as beauty, honesty, or intelligence, which
cannot be measured quantitatively. The individuals possessing the two
characteristics can still be arranged serially in order of merit or proficiency.
Suppose we assign ranks to the individuals or items in two series based on
order of merit. Spearman's rank correlation coefficient r is then given by
$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$
Where,
$\sum d^2$ = Sum of squares of the differences in rank between paired items in
the two series
n = Number of paired items
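
The formula can be sketched in a few lines of Python. The function name `spearman_from_ranks` and the two rank lists are hypothetical, invented for illustration only. The formula as written assumes no tied ranks.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's r = 1 - 6*sum(d^2) / (n*(n^2 - 1)), assuming no tied ranks."""
    n = len(rank_x)
    # d is the difference between the two ranks of each paired item.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Hypothetical ranks given by two judges to five items (not from the notes).
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

r = spearman_from_ranks(judge_a, judge_b)
print(r)
```

For these ranks the differences are -1, 1, -1, 1, 0, so the sum of squared differences is 4 and r = 1 - 24/120, which is close to 0.8. Identical rankings give r = 1 and exactly reversed rankings give r = -1.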
Remarks
Spearman's rank correlation coefficient can also be used to find the correlation
between two quantitative characteristics or variables. In this case, we assign
ranks to the observations based on their magnitudes, for the X and Y series
separately. Let RX and Ry be the ranks of observations on two variables X and