Correlation and Regression Statistical Method

The document discusses Karl Pearson's correlation coefficient and Spearman's rank correlation, providing formulas and examples for calculating these statistical measures. It also covers regression methods, including the derivation of regression lines and the relationship between independent variables. Additionally, exercises are included to apply the concepts of correlation and regression in practical scenarios.

Statistics and Queueing Theory

2.2 Karl Pearson's Correlation Coefficient

Correlation is the study of the relationship between two variables.


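Karl Pearson's correlation coefficient is

$$r = r(x, y) = r_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$$

where

$$\operatorname{cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\,\bar{y}$$

$$\sigma_x = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2}, \qquad \sigma_y = \sqrt{\frac{\sum y^2}{n} - (\bar{y})^2}$$

$$\bar{x} = \frac{\sum x}{n}, \qquad \bar{y} = \frac{\sum y}{n}$$

and n is the number of data pairs.

Note: The correlation coefficient lies between −1 and 1, i.e., $-1 \le r \le 1$.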
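The covariance formula above is the computational form of the mean product of deviations from the means. A short expansion, using only the definitions of $\bar{x}$ and $\bar{y}$ given above, shows why the two forms agree:

```latex
\begin{aligned}
\frac{1}{n}\sum (x - \bar{x})(y - \bar{y})
  &= \frac{\sum xy}{n} - \bar{y}\,\frac{\sum x}{n} - \bar{x}\,\frac{\sum y}{n} + \bar{x}\,\bar{y} \\
  &= \frac{\sum xy}{n} - \bar{x}\,\bar{y} - \bar{x}\,\bar{y} + \bar{x}\,\bar{y} \\
  &= \frac{\sum xy}{n} - \bar{x}\,\bar{y} = \operatorname{cov}(x, y).
\end{aligned}
```

The same expansion with y replaced by x gives $\sigma_x^2 = \frac{\sum x^2}{n} - (\bar{x})^2$.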

Problem 1 Calculate Karl Pearson's coefficient of correlation for the following data.

x   65  66  67  67  68  69  70  72
y   67  68  65  68  72  72  69  71

Solution:

X      Y      X²       Y²       XY
65     67     4225     4489     4355
66     68     4356     4624     4488
67     65     4489     4225     4355
67     68     4489     4624     4556
68     72     4624     5184     4896
69     72     4761     5184     4968
70     69     4900     4761     4830
72     71     5184     5041     5112
---------------------------------------
544    552    37028    38132    37560

Dr. P.Sambath, Asst.Professor(SG), SRM IST, KTR


Regression Methods 2.7

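n = 8

$$\bar{x} = \frac{\sum x}{n} = \frac{544}{8} = 68, \qquad \bar{y} = \frac{\sum y}{n} = \frac{552}{8} = 69$$

$$\sigma_x = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2} = \sqrt{\frac{37028}{8} - (68)^2} = 2.12$$

$$\sigma_y = \sqrt{\frac{\sum y^2}{n} - (\bar{y})^2} = \sqrt{\frac{38132}{8} - (69)^2} = 2.34$$

$$\operatorname{cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\,\bar{y} = \frac{37560}{8} - (68 \times 69) = 3$$

$$\therefore r_{xy} = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \frac{3}{2.12 \times 2.34} = 0.6047$$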
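The calculation above can be checked with a minimal Python sketch that applies the same raw-sum formulas. The function name `pearson_r` is an illustrative choice, not something from the notes. Because the code keeps full precision for $\sigma_x$ and $\sigma_y$ instead of the two-decimal values 2.12 and 2.34, it gives roughly 0.603 rather than 0.6047; the gap comes only from rounding the intermediate values.

```python
import math

x = [65, 66, 67, 67, 68, 69, 70, 72]
y = [67, 68, 65, 68, 72, 72, 69, 71]


def pearson_r(x, y):
    """Pearson's r computed as cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # cov(x, y) = (sum of xy) / n - mean_x * mean_y
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    # sigma = sqrt((sum of squares) / n - mean^2)
    sigma_x = math.sqrt(sum(a * a for a in x) / n - mean_x ** 2)
    sigma_y = math.sqrt(sum(b * b for b in y) / n - mean_y ** 2)
    return cov / (sigma_x * sigma_y)


r = pearson_r(x, y)
print(round(r, 4))
```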

2.3 Rank correlation

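Spearman's rank correlation coefficient is

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where $d_i = x_i - y_i$ is the difference between the ranks of the i-th pair.

Note: If ranks are repeated, then

$$\rho = 1 - \frac{6\left[\sum d_i^2 + C.F_1 + C.F_2 + \cdots\right]}{n(n^2 - 1)}$$

where $d_i = x_i - y_i$ as before. The C.F's are correction factors, one for each group of repeated values, and each is calculated by

$$C.F = \frac{m(m^2 - 1)}{12}$$

Here m is the number of times the value is repeated.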

Problem 1 Calculate Spearman's rank correlation for the following data.

x   68  64  75  50  64  80  75  40  55  64
y   62  58  68  45  81  60  68  48  50  70

Solution:


X     Y     Rank of X   Rank of Y   d_i = x_i − y_i   d_i²
68    62    4           5           −1                1
64    58    6           7           −1                1
75    68    2.5         3.5         −1                1
50    45    9           10          −1                1
64    81    6           1           5                 25
80    60    1           6           −5                25
75    68    2.5         3.5         −1                1
40    48    10          9           1                 1
55    50    8           8           0                 0
64    70    6           2           4                 16
-----------------------------------------------------------
                                    Σ d_i² =          72

In the values of X:

75 is repeated 2 times and occupies ranks 2 and 3. ∴ the rank of 75 = (2 + 3)/2 = 2.5 and

$$C.F_1 = \frac{m(m^2 - 1)}{12} = \frac{2(2^2 - 1)}{12} = 0.5$$

64 is repeated 3 times and occupies ranks 5, 6 and 7. ∴ the rank of 64 = (5 + 6 + 7)/3 = 6 and

$$C.F_2 = \frac{m(m^2 - 1)}{12} = \frac{3(3^2 - 1)}{12} = 2$$

In the values of Y:

68 is repeated 2 times and occupies ranks 3 and 4. ∴ the rank of 68 = (3 + 4)/2 = 3.5 and

$$C.F_3 = \frac{m(m^2 - 1)}{12} = \frac{2(2^2 - 1)}{12} = 0.5$$

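$$\begin{aligned}
\therefore \rho &= 1 - \frac{6\left[\sum d_i^2 + C.F_1 + C.F_2 + C.F_3\right]}{n(n^2 - 1)} \\
&= 1 - \frac{6\,[72 + 0.5 + 2 + 0.5]}{10(10^2 - 1)} \\
&= 1 - 0.4545 \\
&= 0.5455
\end{aligned}$$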
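The ranking and tie-correction steps of this solution can be sketched in Python as follows. This is a minimal sketch of the formula used in these notes (correction factors added to $\sum d_i^2$), with values ranked in descending order as in the table. The helper names `average_ranks`, `correction_factors` and `spearman_rho` are illustrative assumptions.

```python
from collections import Counter

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]


def average_ranks(values):
    """Rank in descending order (largest value = rank 1); ties share the mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def correction_factors(values):
    """Sum of m(m^2 - 1)/12 over every value that occurs m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_rho(x, y):
    """Rank correlation with the tie correction added to the sum of d_i^2."""
    n = len(x)
    rank_x = average_ranks(x)
    rank_y = average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    cf = correction_factors(x) + correction_factors(y)
    return 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))


rho = spearman_rho(x, y)
print(round(rho, 4))
```

For data without repeated values every correction factor is zero, so the same function reduces to the basic formula $\rho = 1 - 6\sum d_i^2 / (n(n^2 - 1))$.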


Exercise

Problem 1 Ten competitors in a musical contest were ranked by three judges x, y and z. Find which pair of judges has the most similar taste in music.

x    1   2   3   4   5   6   7   8   9  10
y   10   6   7   9   5   4   3   2   1   8
z    8  10   9   7   6   5   4   3   2   1

2.4 Regression

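Regression is the mathematical study of the average relationship between two variables x and y.

Line of regression of x on y:

$$(x - \bar{x}) = b_{xy}\,(y - \bar{y})$$

Line of regression of y on x:

$$(y - \bar{y}) = b_{yx}\,(x - \bar{x})$$

where $b_{xy}$ and $b_{yx}$ are the regression coefficients, given by

$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} \quad \text{and} \quad b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

Note:

$$r = \pm\sqrt{b_{xy}\, b_{yx}}$$

where r takes the common sign of $b_{xy}$ and $b_{yx}$ (positive if both are positive, negative if both are negative).

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y}, \qquad b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$$

The lines of regression of y on x and of x on y intersect at the point $(\bar{x}, \bar{y})$, the mean values of x and y.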
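The relation between r and the regression coefficients follows from multiplying the two expressions $b_{xy} = r\,\sigma_x/\sigma_y$ and $b_{yx} = r\,\sigma_y/\sigma_x$. A short derivation, which also shows that r carries the same sign as the two coefficients, so the negative root applies when both are negative:

```latex
\begin{aligned}
b_{xy}\, b_{yx}
  &= \left(r\,\frac{\sigma_x}{\sigma_y}\right)\left(r\,\frac{\sigma_y}{\sigma_x}\right) = r^2
  \quad\Longrightarrow\quad |r| = \sqrt{b_{xy}\, b_{yx}}, \\
\operatorname{sign}(r) &= \operatorname{sign}(b_{xy}) = \operatorname{sign}(b_{yx})
  \quad \text{since } \sigma_x > 0 \text{ and } \sigma_y > 0 .
\end{aligned}
```

Because $r^2 \le 1$, the product $b_{xy}\, b_{yx}$ can never exceed 1, which is the test used to decide which of two given lines is the regression of y on x.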

Problem 1 From the following data, find

1. The two lines of regression

2. The coefficient of correlation between the marks in Economics and Statistics

3. The most likely marks in Statistics when the marks in Economics are 30.

Marks in Economics    25  28  35  32  31  36  29  38  34  32
Marks in Statistics   43  46  49  41  36  32  31  30  33  39


Solution: Let x be the marks in Economics and y be the marks in Statistics.

$$\bar{x} = \frac{\sum x}{n} = \frac{320}{10} = 32 \quad \text{and} \quad \bar{y} = \frac{\sum y}{n} = \frac{380}{10} = 38$$

x      y      (x − x̄)   (y − ȳ)   (x − x̄)²   (y − ȳ)²   (x − x̄)(y − ȳ)
25     43     −7        5         49         25         −35
28     46     −4        8         16         64         −32
35     49     3         11        9          121        33
32     41     0         3         0          9          0
31     36     −1        −2        1          4          2
36     32     4         −6        16         36         −24
29     31     −3        −7        9          49         21
38     30     6         −8        36         64         −48
34     33     2         −5        4          25         −10
32     39     0         1         0          1          0
--------------------------------------------------------------------
320    380    0         0         140        398        −93

$$b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{-93}{398} = -0.2336$$

and

$$b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{-93}{140} = -0.6642$$

Since both regression coefficients are negative, the correlation coefficient takes the negative root:

$$r = -\sqrt{b_{xy}\, b_{yx}} = -\sqrt{(-0.2336) \times (-0.6642)} = -0.394$$

Line of regression of x on y is $(x - \bar{x}) = b_{xy}\,(y - \bar{y})$:

(x − 32) = −0.2336(y − 38)
x − 32 = −0.2336y + 8.8768
x = −0.2336y + 8.8768 + 32
x = −0.2336y + 40.8768    (1)

Line of regression of y on x is $(y - \bar{y}) = b_{yx}\,(x - \bar{x})$:

(y − 38) = −0.6642(x − 32)
y − 38 = −0.6642x + 21.2544
y = −0.6642x + 21.2544 + 38
y = −0.6642x + 59.2544    (2)


Now, to find y when x = 30:

Equation (2) ⇒ y = −0.6642(30) + 59.2544 = 39.3284

∴ The most likely marks in Statistics ≈ 39.33
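The whole solution can be cross-checked with a minimal Python sketch built from the deviation sums in the table. The variable names and the helper `predict_y` are illustrative assumptions. The code keeps full precision, so the coefficients come out as about −0.2337 and −0.6643 where the text truncates to −0.2336 and −0.6642; the prediction at x = 30 still agrees to two decimals.

```python
import math

x = [25, 28, 35, 32, 31, 36, 29, 38, 34, 32]  # marks in Economics
y = [43, 46, 49, 41, 36, 32, 31, 30, 33, 39]  # marks in Statistics

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of products and squares of deviations from the means
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

b_xy = s_xy / s_yy  # regression coefficient of x on y
b_yx = s_xy / s_xx  # regression coefficient of y on x

# r has magnitude sqrt(b_xy * b_yx) and the sign of the regression coefficients
r = math.copysign(math.sqrt(b_xy * b_yx), b_yx)


def predict_y(x_value):
    """Estimate y from the line of regression of y on x."""
    return mean_y + b_yx * (x_value - mean_x)


print(round(b_xy, 4), round(b_yx, 4), round(r, 4), round(predict_y(30), 2))
```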

Problem 2 Two variables x and y have the regression lines 3x + 2y − 26 = 0 and 6x + y − 31 = 0. Find

1. the mean values of x and y

2. the correlation coefficient between x and y

3. the variance of y when the variance of x is 25

Solution:

Given 3x + 2y − 26 = 0 (1)

6x + y − 31 = 0 (2)

1. mean value of x and y

The two regression lines intersect at the point $(\bar{x}, \bar{y})$.
Solving (1) and (2), we get x = 4 and y = 7
∴ $\bar{x} = 4$ and $\bar{y} = 7$

2. correlation coefficient between x and y

Let 3x + 2y − 26 = 0 be the line of regression of x on y. Then

$$3x + 2y - 26 = 0 \Rightarrow 3x = -2y + 26 \Rightarrow x = -\frac{2}{3}y + \frac{26}{3}$$

$$\therefore b_{xy} = -\frac{2}{3}$$

Let 6x + y − 31 = 0 be the line of regression of y on x. Then

$$6x + y - 31 = 0 \Rightarrow y = -6x + 31$$

$$\therefore b_{yx} = -6$$

$$|r| = \sqrt{b_{xy}\, b_{yx}} = \sqrt{\left(-\frac{2}{3}\right) \times (-6)} = 2 > 1$$

Since the magnitude of the correlation coefficient cannot exceed 1, 3x + 2y − 26 = 0 cannot be the line of regression of x on y and 6x + y − 31 = 0 cannot be the line of regression of y on x. ∴ we have to consider 3x + 2y − 26 = 0 as the line of regression of y on x:

$$3x + 2y - 26 = 0 \Rightarrow 2y = -3x + 26 \Rightarrow y = -\frac{3}{2}x + 13$$

$$\therefore b_{yx} = -\frac{3}{2}$$

and consider 6x + y − 31 = 0 as the line of regression of x on y:

$$6x + y - 31 = 0 \Rightarrow 6x = -y + 31 \Rightarrow x = -\frac{1}{6}y + \frac{31}{6}$$

$$\therefore b_{xy} = -\frac{1}{6}$$

$$|r| = \sqrt{b_{xy}\, b_{yx}} = \sqrt{\left(-\frac{3}{2}\right) \times \left(-\frac{1}{6}\right)} = 0.5 < 1$$

Since both regression coefficients are negative, r = −0.5.

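3. the variance of y when the variance of x is 25 ($\sigma_x^2 = 25$)

i.e., $\sigma_x = 5$; we have to find $\sigma_y$. Here r = −0.5, because r takes the negative sign of the regression coefficients $b_{xy} = -\frac{1}{6}$ and $b_{yx} = -\frac{3}{2}$.

$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} \Rightarrow \sigma_y = r\,\frac{\sigma_x}{b_{xy}} = (-0.5) \times \frac{5}{-\frac{1}{6}} = 15$$

$$\sigma_y^2 = 225$$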
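The three parts of this problem can be reproduced with a minimal Python sketch. It stores each line a·x + b·y + c = 0 as a tuple (a, b, c), tries both ways of assigning the lines, and keeps the assignment whose coefficient product does not exceed 1. The helper names `intersection`, `slope_y_on_x`, `slope_x_on_y` and `assign_lines` are illustrative assumptions; exact fractions are used so that the results are not blurred by rounding.

```python
import math
from fractions import Fraction

# Each regression line a*x + b*y + c = 0 is stored as (a, b, c)
line1 = (Fraction(3), Fraction(2), Fraction(-26))
line2 = (Fraction(6), Fraction(1), Fraction(-31))


def intersection(l1, l2):
    """Solve the two linear equations by Cramer's rule; returns (x, y)."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    return (b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det


def slope_y_on_x(line):
    a, b, _ = line
    return -a / b  # from y = -(a/b) x - c/b


def slope_x_on_y(line):
    a, b, _ = line
    return -b / a  # from x = -(b/a) y - c/a


def assign_lines(l1, l2):
    """Return (b_xy, b_yx) for the assignment with 0 <= b_xy * b_yx <= 1."""
    for x_on_y, y_on_x in ((l1, l2), (l2, l1)):
        b_xy, b_yx = slope_x_on_y(x_on_y), slope_y_on_x(y_on_x)
        if 0 <= b_xy * b_yx <= 1:
            return b_xy, b_yx
    raise ValueError("neither assignment gives a valid correlation coefficient")


mean_x, mean_y = intersection(line1, line2)
b_xy, b_yx = assign_lines(line1, line2)

# r has magnitude sqrt(b_xy * b_yx) and the sign of the regression coefficients
r = math.copysign(math.sqrt(b_xy * b_yx), b_yx)

sigma_x = 5                               # given: variance of x is 25
sigma_y = Fraction(r) * sigma_x / b_xy    # from b_xy = r * sigma_x / sigma_y
print(mean_x, mean_y, b_xy, b_yx, r, sigma_y, sigma_y ** 2)
```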
