CORRELATION & REGRESSION
Prepared By
Nitin Varshney
Assistant Professor
Agricultural Statistics
CoA, NAU, Waghai.
The study of the characteristics of only one variable,
such as height, weight, age, marks, wages, etc., is known as
univariate analysis.
The study of the relationship between two variables,
such as height and weight, is known as bivariate analysis.
CORRELATION
When we study two or more variables simultaneously, we
observe that movements in one variable are accompanied by
movements in the other variable.
Example:
Husband’s age and wife’s age move together.
Scores on IQ tests move with scores in university
examinations.
Relation between income and household expenditure.
Relation between price and demand of a commodity.
Meaning of Correlation
In a bivariate distribution (study of two variables), we are
interested in finding out whether there is any correlation or
covariation between the two variables.
If a change in one variable is accompanied by a change in the
other variable, the variables are said to be correlated.
Types of Correlation
Positive and negative
Linear and Non-linear
Multiple and Partial
Positive and negative correlation
If the two variables deviate in the same direction, i.e. if an
increase (or decrease) in one results in a corresponding
increase (or decrease) in the other, the correlation is said to be
direct or positive.
Example: Correlation between
Heights & weights of a group of persons
Income & expenditure
If the two variables deviate in opposite directions, i.e. if an
increase (or decrease) in one results in a corresponding
decrease (or increase) in the other, the correlation is said to be
inverse or negative.
Example: Correlation between
Price & demand of a commodity
Volume & pressure of a perfect gas
Linear and non-linear correlation
If the ratio of change between the two variables is constant,
there is linear correlation between them. Consider the
following example:
X 2 4 6 8 10 12 14 16
Y 3 6 9 12 15 18 21 24
Here the ratio of change between the two variables is the same.
If we plot these points on a graph, we get a straight line.
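As a quick check in Python (using the X and Y values from the table above), every Y/X ratio is the same constant, so the points lie on the straight line Y = 1.5X:

```python
# X and Y values from the table above
X = [2, 4, 6, 8, 10, 12, 14, 16]
Y = [3, 6, 9, 12, 15, 18, 21, 24]

# the ratio Y/X is the same (1.5) for every pair, so the
# points (X, Y) lie on the straight line Y = 1.5 * X
ratios = [y / x for x, y in zip(X, Y)]
print(ratios)  # every entry is 1.5
```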
If the amount of change in one variable does not bear a
constant ratio to the change in the other variable, there is
curvilinear or non-linear correlation between them.
X 2 4 6 8 10 12 14 16 18 20 22 24
Y 2 6 8 12 18 24 36 44 54 67 75 89
Multiple and Partial Correlation
When there are interrelationships between many variables
and the value of one variable is influenced by many other
variables, e.g. the yield of a crop per acre (X1) may depend
upon the quality of seed (X2), fertility of soil (X3), fertilizer
used (X4), irrigation facilities (X5), weather conditions (X6),
and so on.
Whenever we are interested in studying the joint effect of
a group of variables upon a variable, the correlation
is known as multiple correlation.
The correlation between only two variables X1 and X2, while
eliminating the linear effect of the other variables, is known as
partial correlation.
SCATTER DIAGRAM
It is the simplest way of representing bivariate data
diagrammatically.
For a bivariate distribution (xi, yi); i=1, 2, …, n, if the values
of the variables X and Y are plotted along the x-axis and y-
axis respectively in the x-y plane, the diagram of dots so
obtained is known as scatter diagram.
From the scatter diagram, we can form a fairly good idea
of whether the variables are correlated or not, e.g.
If the points are very dense (very close to each other):
There is good correlation between variables
If the points are widely scattered: There is poor correlation
between variables.
Karl Pearson’s coefficient of correlation
Karl Pearson developed a formula called correlation
coefficient as a measure of intensity or degree of linear
relationship between two variables.
The correlation coefficient between two random variables X
and Y, usually denoted by r(X, Y) or rXY, is a numerical
measure of the linear relationship between the two variables.
It is defined as:
$$ r(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y} $$
It provides a measure of the linear relationship between X and
Y.
If (xi, yi); i=1, 2, …, n is the bivariate distribution, then
$$ \operatorname{Cov}(X, Y) = \sigma_{XY} = E[\{X - E(X)\}\{Y - E(Y)\}] = \frac{1}{n}\sum_i (x_i - \bar{x})(y_i - \bar{y}) $$
$$ \text{Variance } \sigma_X^2 = E\{X - E(X)\}^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2 $$
$$ \text{Variance } \sigma_Y^2 = E\{Y - E(Y)\}^2 = \frac{1}{n}\sum_i (y_i - \bar{y})^2 $$
Expanding the squares and products gives the computational forms:
$$ \sigma_X^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2 = \frac{1}{n}\sum_i \left(x_i^2 - 2x_i\bar{x} + \bar{x}^2\right) = \frac{1}{n}\sum_i x_i^2 - \bar{x}^2 $$
$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum_i (x_i y_i - x_i\bar{y} - \bar{x}y_i + \bar{x}\bar{y}) = \frac{1}{n}\sum_i x_i y_i - \bar{x}\bar{y} $$
$$ \sigma_Y^2 = \frac{1}{n}\sum_i (y_i - \bar{y})^2 = \frac{1}{n}\sum_i y_i^2 - \bar{y}^2 $$
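These computational forms can be verified with a short Python sketch; the dataset below is made up purely for illustration and is not from the text:

```python
from math import sqrt

# illustrative data (not from the text)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Cov(X, Y) = (1/n) * sum(x_i * y_i) - mean_x * mean_y
cov_xy = sum(xi * yi for xi, yi in zip(x, y)) / n - mean_x * mean_y
# sigma_X^2 = (1/n) * sum(x_i^2) - mean_x^2, similarly for Y
var_x = sum(xi ** 2 for xi in x) / n - mean_x ** 2
var_y = sum(yi ** 2 for yi in y) / n - mean_y ** 2

# r = Cov(X, Y) / (sigma_X * sigma_Y)
r = cov_xy / (sqrt(var_x) * sqrt(var_y))
print(round(r, 4))  # 0.7746
```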
PROPERTIES OF CORRELATION COEFFICIENT
Its range is -1 to +1.
r is independent of change of origin and scale.
Two independent variables are uncorrelated (r = 0); the
converse need not hold, since r = 0 does not imply independence.
Interpretation of correlation coefficient
when r=1, there is perfect positive correlation b/w variables.
when r=-1, there is perfect negative correlation b/w variables.
when r=0, there is no linear relation between the variables.
when the value of r lies between -1 and +1, it signifies that
there is some correlation between the variables.
when the value of r is close to +1 or -1, it signifies high
positive or high negative correlation between the variables.
when the value of r is close to 0, it signifies very weak
correlation between the variables.
RANK CORRELATION
This method is useful for studying qualitative characteristics
(attributes) like honesty, intelligence, color, beauty, morality,
etc.
This method is based on the ranks of the characters under study.
A group of individuals is arranged in order of merit or
proficiency with respect to two characters A and B.
Example: Suppose we want to find the relation between
intelligence (character A) and beauty (character B).
Let the ranks of the n individuals be xi for A and yi for B,
i=1, 2, 3, …, n.
The Pearsonian coefficient of correlation between the ranks xi
and yi is called the rank correlation coefficient between A and B
for that group of individuals.
SPEARMAN'S RANK CORRELATION COEFFICIENT
This method was developed by Charles Edward Spearman.
Spearman’s formula for the rank correlation coefficient is
given by
$$ \rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} $$
where di = xi - yi is the difference between the ranks of the
i-th individual.
The range of the rank correlation coefficient is also -1 to +1.
Q.2. In a marketing survey, the prices of tea and coffee in a town
were recorded based on quality. The data are given as follows; find
the relation between the price of tea and the price of coffee.
Price of Tea 88 90 95 70 60 75 50
Price of Coffee 120 134 150 115 110 140 100
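A worked sketch of Q.2 in Python. The document does not specify the ranking direction; ascending ranks are assumed here (the smallest price gets rank 1), which does not change ρ as long as both series use the same convention, and there are no ties in these data:

```python
def ranks(values):
    """Rank in ascending order: smallest value gets rank 1 (no ties here)."""
    sorted_vals = sorted(values)
    return [sorted_vals.index(v) + 1 for v in values]

tea    = [88, 90, 95, 70, 60, 75, 50]
coffee = [120, 134, 150, 115, 110, 140, 100]

rx, ry = ranks(tea), ranks(coffee)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
n = len(tea)

# Spearman's rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))
print(d2, round(rho, 4))  # 6 0.8929
```

The high positive value (ρ ≈ 0.89) indicates that higher tea prices tend to go with higher coffee prices.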
REGRESSION
The term “regression” literally means “stepping back towards
the average”.
It was introduced by Sir Francis Galton.
Galton found that the offspring of abnormally tall or short
parents tend to regress or step back towards the average
population height.
Regression analysis is a mathematical measure of the
average relationship between two or more variables in
terms of the original units of the data.
In regression analysis there are two types of variables.
Dependent Variable (also called the regressed, explained, or
predicted variable): the variable whose value is influenced or
is to be predicted.
Independent Variable (also called the regressor, explanatory,
or predictor variable): the variable which influences the values
or is used for prediction.
LINEAR REGRESSION
If the variables in a bivariate distribution are related
(means variables are correlated), we will find that the
points in the scatter diagram will cluster round some
curve called the “curve of regression”.
If the curve is a straight line, it is called the “line of
regression”, and there is said to be linear regression between
the variables; otherwise the regression is curvilinear.
Linear Regression Equation
Let us suppose that in the bivariate distribution (xi, yi);
i=1, 2, 3, …, n; Y is dependent variable and X is
independent variable. Let the line of regression of Y on X
be
Y=a+bX (a, b are constants)
There are two regression lines.
If Y is the dependent variable and X is the independent variable,
it is called the line of regression of Y on X:
Y = a + byx X
where byx is the regression coefficient (slope) of the regression
line of Y on X.
If X is the dependent variable and Y is the independent variable,
it is called the line of regression of X on Y:
X = a + bxy Y
where bxy is the regression coefficient (slope) of the regression
line of X on Y.
The line of regression is the line which gives the best estimate
to the value of one variable for any specific value of the other
variable.
Thus the line of regression is the line of best fit.
It is obtained by the principle of least squares.
PRINCIPLE OF LEAST SQUARES
Let the line of regression of Y on X be
Y= a+ byx X
ei = yi - (a + byx xi) is called the error of estimate or residual
for yi.
According to the principle of least squares, we determine a and
byx so that
$$ E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - a - b_{yx} x_i)^2 $$
is minimum.
Setting the partial derivatives of E with respect to a and byx
equal to zero gives the two normal equations for estimating a
and byx:
$$ \sum_{i=1}^{n} y_i = na + b_{yx} \sum_{i=1}^{n} x_i \qquad \text{(i)} $$
$$ \sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i + b_{yx} \sum_{i=1}^{n} x_i^2 \qquad \text{(ii)} $$
If we divide eqn. (i) by n, we get
$$ \bar{y} = a + b_{yx}\,\bar{x} $$
Thus the line of regression of Y on X passes through the point
$(\bar{x}, \bar{y})$.
So the regression coefficient (slope) of the line of regression of Y
on X is given by
$$ b_{yx} = \frac{\operatorname{Cov}(x, y)}{V(x)} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}, \qquad a = \bar{y} - b_{yx}\,\bar{x} $$
Similarly, the regression coefficient (slope) of the line of
regression of X on Y is given by
$$ b_{xy} = \frac{\operatorname{Cov}(x, y)}{V(y)} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum y^2 - \frac{(\sum y)^2}{n}}, \qquad a = \bar{x} - b_{xy}\,\bar{y} $$
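As a sketch, the computational formulas for byx and a can be applied to a small dataset; the numbers below are made up for demonstration and are not from the text:

```python
# illustrative data (not from the text)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)

# b_yx = [sum(xy) - (sum x)(sum y)/n] / [sum(x^2) - (sum x)^2 / n]
b_yx = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
a = sy / n - b_yx * (sx / n)  # a = y_bar - b_yx * x_bar
print(round(b_yx, 4), round(a, 4))  # 0.6 2.2
```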
Since byx is the slope of the line of regression of Y on X, and
since the line passes through the point $(\bar{x}, \bar{y})$, its
equation is
$$ Y - \bar{y} = b_{yx}(X - \bar{x}) = \frac{\operatorname{Cov}(X, Y)}{V(X)}(X - \bar{x}) = r\,\frac{\sigma_Y}{\sigma_X}(X - \bar{x}) $$
Similarly, for the line of X on Y:
$$ X - \bar{x} = b_{xy}(Y - \bar{y}) = \frac{\operatorname{Cov}(X, Y)}{V(Y)}(Y - \bar{y}) = r\,\frac{\sigma_X}{\sigma_Y}(Y - \bar{y}) $$
These follow because
$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} \;\Rightarrow\; \operatorname{Cov}(X, Y) = r\,\sigma_X \sigma_Y, \qquad b_{YX} = \frac{\operatorname{Cov}(X, Y)}{V(X)} \;\Rightarrow\; \operatorname{Cov}(X, Y) = b_{YX}\,\sigma_X^2 $$
$$ r\,\sigma_X \sigma_Y = b_{YX}\,\sigma_X^2 \;\Rightarrow\; b_{YX} = r\,\frac{\sigma_Y}{\sigma_X}, \qquad \text{similarly } b_{XY} = r\,\frac{\sigma_X}{\sigma_Y} $$
PROPERTIES OF REGRESSION COEFFICIENT
1. Fundamental Property: Correlation coefficient is the geometric
mean between the regression coefficients.
$$ b_{XY}\, b_{YX} = \left(r\,\frac{\sigma_X}{\sigma_Y}\right)\left(r\,\frac{\sigma_Y}{\sigma_X}\right) = r^2 \quad\Rightarrow\quad r = \pm\sqrt{b_{XY}\, b_{YX}} $$
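As a numerical check of the fundamental and signature properties, with a small made-up dataset (illustrative only, not from the text):

```python
from math import sqrt, copysign

# illustrative data (not from the text)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
vx = sum((a - mx) ** 2 for a in x) / n
vy = sum((b - my) ** 2 for b in y) / n

b_yx = cov / vx        # slope of Y on X
b_xy = cov / vy        # slope of X on Y
r = cov / sqrt(vx * vy)

# r is the geometric mean of the two regression coefficients,
# taken with their common sign
gm = copysign(sqrt(b_yx * b_xy), b_yx)
print(round(r, 4), round(gm, 4))  # both 0.7746
```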
2. Signature Property: Sign of correlation coefficient is the same as
that of regression coefficients. Thus if the regression coefficients
are positive then correlation coefficient will be positive and vice-
versa.
3. Magnitude Property: If one of the regression coefficients is
greater than unity, the other must be less than unity.
$$ \text{If } |b_{YX}| > 1 \text{ then } |b_{XY}| < 1 $$
4. Mean Property: The modulus value of the arithmetic mean of the
regression coefficients is not less than the modulus value of
correlation coefficient r.
$$ \frac{1}{2}\,|b_{XY} + b_{YX}| \ge |r| $$
5. Regression coefficients are independent of the change of
origin but not of scale.
6. Angle between two lines of regression: If θ is the acute
angle between the two lines of regression, then
$$ \tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_X \sigma_Y}{\sigma_X^2 + \sigma_Y^2} $$
If r=0, tan θ = ∞, so θ=90°. Thus if the two variables are uncorrelated,
the lines of regression are perpendicular to each other.
If r=±1, tan θ = 0, so θ=0° or 180°. Thus if the two variables are
perfectly correlated, the two lines of regression coincide.
Q.3. From a paddy field, 15 plants were selected randomly. The length of
panicle (cm) and number of grains per panicle were recorded. Fit the
regression line for the given dataset and compute the estimated number of
grains per panicle if the panicle length is 25.2 cm.
Length of Panicle (cm): 22.4 23.3 24.1 24.3 23.5 23.1 21.0 20.6 26.4 25.4 23.4 21.4 23.6 24.5 22.5
No. of grains per panicle: 95 109 133 132 136 116 94 85 143 138 129 88 127 142 110
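A worked sketch of Q.3 in Python: fit the line of regression of grain count (Y) on panicle length (X) using the least-squares slope and intercept from the formulas above, then predict at X = 25.2 cm:

```python
length = [22.4, 23.3, 24.1, 24.3, 23.5, 23.1, 21.0, 20.6,
          26.4, 25.4, 23.4, 21.4, 23.6, 24.5, 22.5]
grains = [95, 109, 133, 132, 136, 116, 94, 85,
          143, 138, 129, 88, 127, 142, 110]
n = len(length)

mx = sum(length) / n              # mean panicle length (23.3 cm)
my = sum(grains) / n              # mean grain count (about 118.47)

sxy = sum((x - mx) * (y - my) for x, y in zip(length, grains))
sxx = sum((x - mx) ** 2 for x in length)

b_yx = sxy / sxx                  # slope of the line of Y on X
a = my - b_yx * mx                # intercept: a = y_bar - b_yx * x_bar

y_hat = a + b_yx * 25.2           # estimated grains at 25.2 cm
print(round(b_yx, 2), round(a, 2), round(y_hat, 1))  # 11.76 -155.44 140.8
```

So the fitted line is roughly Y = -155.44 + 11.76 X, giving about 141 grains for a 25.2 cm panicle.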