Chameli Devi Institute of Management and Research
Presentation on Correlation
Guided by: Vandana Ma'am
Presented by: Lokesh Jain, Pankaj Kukreja
Initially developed by Sir Francis Galton (1888) and Karl Pearson (1896)
Sir Francis Galton, 1822-1911
Introduction
Specific example
Need for studying correlation
Positive & negative correlation
Linear correlation
Non-linear correlation
Methods for the study of correlation
“Correlation is the relationship that exists
between two or more variables. If variables are
related in such a way that a change in one
creates a corresponding change in the other,
then the variables are said to be correlated.”
- W. I. KING
“When the relationship is of a quantitative
nature, the appropriate statistical tool for
discovering and measuring the relationship
and expressing it in a brief formula is known
as correlation.” - CROXTON & COWDEN
1.) Relationship between height and weight.
2.) Relationship between the price of a commodity and
the demand for that commodity.
3.) Relationship between the age of individuals and
their blood pressure.
4.) Relationship between the quantum of rainfall and
the yield of wheat.
5.) Relationship between advertising and sales.
Correlation
A measure of association between
two numerical variables.
Example (positive correlation)
Typically, in the summer as the
temperature increases people are
thirstier.
For seven random summer days, a person recorded the temperature and
their water consumption during a three-hour period spent outside.

Temperature (F)    Water Consumption (ounces)
75                 16
83                 20
85                 25
85                 27
92                 32
97                 48
99                 48
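As a rough check on this example, the strength of the association can be computed from the seven recorded pairs. The sketch below is a minimal illustration in Python; the function name pearson_r and the variable names are assumptions made for this example and do not come from the slides.

```python
import math

# Data from the slide: temperature (F) and water consumption (ounces).
temperature_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length series (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r_example = pearson_r(temperature_f, water_oz)
print(round(r_example, 2))
```

The value comes out at about 0.96, which agrees with the reading of this example as a strong positive correlation.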
Correlation helps in the study of economic theory and business
studies. It helps in establishing relationships between variables such as
price and quantity demanded, or advertising and sales promotion
measures.
Correlation analysis helps in deriving precisely the degree and
direction of such relationships.
The effect of correlation is to reduce the range of uncertainty of
our predictions. Predictions based on correlation analysis will
be more reliable and nearer to reality.
The coefficient of correlation is a relative measure of
change.
Positive correlation: if both variables vary
in the same direction, the correlation is said to be
positive. If one variable increases, the other also
increases, or if one variable decreases, the other
also decreases.
Negative correlation: if the variables vary
in opposite directions, the correlation is said to be
negative. In other words, if one variable increases
while the other decreases, or if one variable
decreases while the other increases, then the
correlation between the two variables is said to be
negative.
Linear correlation: if the amount of change in one
variable bears a constant ratio to the amount of
change in the other variable, then the correlation is
said to be linear. If such variables are plotted on graph
paper, all the plotted points fall on a straight
line.
Non-linear correlation: if the amount of change in one
variable does not bear a constant ratio to the amount of
change in the other variable, then the correlation is
said to be non-linear. If such variables are plotted on a
graph, the points fall on a curve and not on a
straight line.
A scatter diagram is a diagrammatic
representation of bivariate data used to
ascertain the correlation between
two variables.
[Figure: scatter diagrams showing perfect negative correlation, high degree negative correlation and low degree negative correlation; axes: X-values, Y-values]
Value of r                      Interpretation
(a) If r = +1                   Perfect positive correlation
(b) If r = -1                   Perfect negative correlation
(c) If r = 0                    No correlation
(d) If +0.75 ≤ r < +1           High positive correlation
(e) If -0.75 ≥ r > -1           High negative correlation
(f) If +0.50 ≤ r < +0.75        Moderate positive correlation
(g) If -0.50 ≥ r > -0.75        Moderate negative correlation
(h) If 0 < r < +0.50            Low positive correlation
(i) If 0 > r > -0.50            Low negative correlation
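The interpretation table can be expressed as a small lookup function. This is a minimal sketch in Python; the function name interpret_r is hypothetical, and the low bands are assumed to stop at zero so that r = 0 is reported as no correlation.

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal label used in the table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "No correlation"
    sign = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        degree = "Perfect"
    elif size >= 0.75:
        degree = "High"
    elif size >= 0.50:
        degree = "Moderate"
    else:
        degree = "Low"
    return f"{degree} {sign} correlation"

print(interpret_r(0.8))
print(interpret_r(-0.6))
```

The thresholds 0.50 and 0.75 are the ones given in the table; other textbooks may draw the bands differently.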
Karl Pearson’s
Correlation Coefficient, r
Measures the direction and the
strength of the linear association
between two numerical paired variables.
1. Linear relationship between
variables
2. Cause & effect relationship
3. Normality
1.) The correlation coefficient gives the direction
as well as the degree of relationship
between the variables.
2.) The correlation coefficient, along with
other information, helps in estimating
the value of the dependent variable
from the known value of the independent
variable.
1.) Assumption of linear relationship: The assumption of a
linear relationship between the variables may not
always hold true.
2.) Time consuming: Its computation is time
consuming compared to other methods.
3.) Affected by extreme values: It is affected by the
values of extreme items.
4.) Requires careful interpretation: The investigator
should reach a conclusion based on logical reasoning
and investigation of significantly related matters.
1.) ACTUAL MEAN METHOD
$r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}}$ or $r = \frac{\sum xy}{N \sigma_x \sigma_y}$
Where
x is the deviation from the actual mean of the X
series, i.e. $x = X - \bar{X}$
y is the deviation from the actual mean of the Y
series, i.e. $y = Y - \bar{Y}$
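The actual mean method can be sketched in Python as follows. The function name actual_mean_r and the sample series are assumptions made for illustration; the standard deviations are taken as population values (divisor N), which is what makes the two forms of the formula agree.

```python
import math

def actual_mean_r(xs, ys):
    """Return r from both forms of the actual mean formula (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]  # deviations from the actual mean of X
    dev_y = [y - mean_y for y in ys]  # deviations from the actual mean of Y
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    # First form: sum(xy) / sqrt(sum(x^2) * sum(y^2))
    r_first = sum_xy / math.sqrt(sum_x2 * sum_y2)
    # Second form: sum(xy) / (N * sigma_x * sigma_y), population sigmas
    sigma_x = math.sqrt(sum_x2 / n)
    sigma_y = math.sqrt(sum_y2 / n)
    r_second = sum_xy / (n * sigma_x * sigma_y)
    return r_first, r_second

# Made-up sample series, used only to exercise the function.
r1, r2 = actual_mean_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r1, 4), round(r2, 4))
```

Both forms give the same value because N multiplied by the two population standard deviations equals the square root of the product of the two sums of squared deviations.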
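2.) ASSUMED MEAN METHOD (when deviations are taken from an assumed mean)
Formula: $r = \frac{\sum d_x d_y - \frac{(\sum d_x)(\sum d_y)}{N}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{N}} \; \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{N}}}$
Where: dx is the deviation from the
assumed mean of X, i.e. dx = X - A (A being the assumed mean of the X series)
dy is the deviation from the assumed
mean of Y, i.e. dy = Y - A (A being the assumed mean of the Y series)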
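A minimal Python sketch of the assumed mean method is given below. The function name assumed_mean_r and the particular assumed means are hypothetical choices for illustration; the data are the temperature and water consumption figures from the earlier example.

```python
import math

def assumed_mean_r(xs, ys, assumed_x, assumed_y):
    """Correlation coefficient from deviations about assumed means (illustrative)."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]  # dx = X - A for the X series
    dy = [y - assumed_y for y in ys]  # dy = Y - A for the Y series
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = (math.sqrt(sum_dx2 - sum_dx ** 2 / n)
                   * math.sqrt(sum_dy2 - sum_dy ** 2 / n))
    return numerator / denominator

temperature_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]

r_a = assumed_mean_r(temperature_f, water_oz, 85, 30)
r_b = assumed_mean_r(temperature_f, water_oz, 90, 25)
print(round(r_a, 2), round(r_b, 2))
```

The result does not depend on which assumed means are chosen, since the correction terms in the formula remove their effect; a convenient round number simply keeps hand calculation short.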
$r = \frac{\sum XY - \frac{\sum X \cdot \sum Y}{N}}{\sqrt{\sum X^2 - \frac{(\sum X)^2}{N}} \; \sqrt{\sum Y^2 - \frac{(\sum Y)^2}{N}}}$
Where: X is the variable of the X series;
Y is the variable of the Y series
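The formula on raw values can be sketched in Python as below. The function name direct_method_r is an assumption for illustration, and the data are again the temperature and water consumption figures from the earlier example.

```python
import math

def direct_method_r(xs, ys):
    """Correlation coefficient computed from raw sums, with no deviations taken."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = (math.sqrt(sum_x2 - sum_x ** 2 / n)
                   * math.sqrt(sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

temperature_f = [75, 83, 85, 85, 92, 97, 99]
water_oz = [16, 20, 25, 27, 32, 48, 48]

r_direct = direct_method_r(temperature_f, water_oz)
print(round(r_direct, 2))
```

For these data the sums are ∑X = 616, ∑Y = 216 and ∑XY = 19632, so the numerator is 19632 - 19008 = 624.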
Standard Error: $SE = \frac{1 - r^2}{\sqrt{N}}$
Where r = coefficient of correlation,
N = number of pairs of observations
Probable Error: $PE = 0.6745 \times SE = 0.6745 \times \frac{1 - r^2}{\sqrt{N}}$
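The two error measures are simple to compute once r and N are known. The sketch below is illustrative only; the function names and the sample values r = 0.6 and N = 16 are assumptions, not figures from the slides.

```python
import math

def standard_error(r, n):
    """Standard error of r: (1 - r^2) / sqrt(N)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

# Hypothetical values chosen so the arithmetic is easy to follow by hand.
se = standard_error(0.6, 16)   # (1 - 0.36) / 4
pe = probable_error(0.6, 16)   # 0.6745 * 0.16
print(round(se, 4), round(pe, 5))
```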
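$R = 1 - \frac{6 \sum D^2}{N^3 - N}$
Where: R = rank correlation coefficient
D = difference of the ranks between
paired items in the two series
N = number of pairs of ranks
In case of tied ranks:
$R = 1 - \frac{6\left(\sum D^2 + \frac{m^3 - m}{12} + \dots\right)}{N^3 - N}$
Here m is the number of items sharing a tied rank, with one such term added for each group of ties.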
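The rank formula, including the tie correction, can be sketched in Python as follows. The function name rank_correlation and the tie_sizes parameter are hypothetical; the function expects ranks that have already been assigned, and the sample ranks are made up for illustration.

```python
def rank_correlation(ranks_x, ranks_y, tie_sizes=()):
    """Rank correlation from two lists of ranks.

    tie_sizes lists m for every group of tied ranks (in either series);
    each contributes (m^3 - m) / 12 to the sum of squared differences.
    """
    n = len(ranks_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    correction = sum((m ** 3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

# Made-up ranks with no ties: D = -1, 1, -1, 1, 0, so the sum of D^2 is 4.
r_rank = rank_correlation([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(r_rank, 2))  # 1 - 24/120 = 0.8
```

When there are no ties, tie_sizes is left empty and the expression reduces to the first formula.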
Merits                                    Limitations
Simple to understand & easy to apply      Unsuitable for grouped data
Suitable for qualitative data             Tedious calculation
Only method for ranks                     Approximation, since actual data are not taken into account
Formula:
$r_c = \pm \sqrt{\pm \frac{2C - n}{n}}$
Where: rc = coefficient of concurrent deviation
C = number of positive signs after multiplying the directions of
change of the X series and Y series
n = number of pairs of observations compared
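A minimal Python sketch of the concurrent deviation method follows. The function names and sample series are hypothetical. Two assumptions are made: n is taken as the number of consecutive comparisons (one less than the number of observations), and the sign of (2C - n)/n is applied both inside and outside the square root so that the radicand is never negative.

```python
import math

def direction(prev, curr):
    """+1 for a rise, -1 for a fall, 0 for no change."""
    return (curr > prev) - (curr < prev)

def concurrent_deviation_r(xs, ys):
    """Coefficient of concurrent deviation for two equal-length series."""
    signs_x = [direction(a, b) for a, b in zip(xs, xs[1:])]
    signs_y = [direction(a, b) for a, b in zip(ys, ys[1:])]
    n = len(signs_x)  # number of pairs of deviations compared
    # C counts the positions where both series moved in the same direction.
    c = sum(1 for sx, sy in zip(signs_x, signs_y) if sx * sy > 0)
    ratio = (2 * c - n) / n
    return math.copysign(math.sqrt(abs(ratio)), ratio)

# Made-up series: X rises four times; Y rises, falls, rises, rises, so C = 3, n = 4.
r_c = concurrent_deviation_r([1, 2, 3, 4, 5], [2, 4, 3, 6, 8])
print(round(r_c, 4))
```

Only the direction of each change enters the calculation, so a small rise and a large rise count the same.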
Merits                                    Limitations
Simple to understand and easy to apply    Does not differentiate between small and big variations
Suitable for large N                      Result obtained from this method is only an approximate
                                          indicator of the presence or absence of correlation
Queries are invited.
Thank you