
Correlation Analysis

The document discusses correlation analysis, which examines the relationship between two or more variables, highlighting the importance of measuring and interpreting these relationships in business contexts. It outlines the steps in correlation analysis, types of correlation, and methods for studying correlation, emphasizing that correlation does not imply causation. Additionally, it warns against misinterpreting correlation and discusses various forms of correlation, including positive, negative, simple, partial, and multiple correlation.

Uploaded by

shadowgaming9355
Copyright
© © All Rights Reserved

Correlation Analysis

7
INTRODUCTION

So far we have studied problems relating to one variable only. In business we come across a large number of problems involving the use of two or more variables. If two quantities vary in such a way that movements in one are accompanied by movements in the other, these quantities are said to be correlated. For example, there exists some relationship between family income and expenditure on luxury items, price of a commodity and amount demanded, increase in rainfall up to a point and production of rice, an increase in the number of television licences and number of cinema admissions, etc. The statistical tool with the help of which these relationships between two or more variables are studied is called correlation*. The measure of correlation, called the coefficient of correlation (denoted by the symbol r), summarizes in one figure the direction and degree of correlation. Thus correlation analysis refers to the techniques used in measuring the closeness of the relationship between the variables. A very simple definition of correlation is that given by A.M. Tuttle. He defines correlation as: An analysis of the covariation of two or more variables is usually called correlation.
The problem of analysing the relation between different series should be broken down into three steps:
(1) Determining whether a relation exists and, if it does, measuring it;
(2) Testing whether it is significant; and
(3) Establishing the cause-and-effect relations, if any.
In this chapter only the first aspect will be discussed. For the second aspect a reference may be made to the chapter on Tests of Hypothesis. The third aspect in the analysis, that of establishing the cause-effect relation, is beyond the scope of this text. An extremely high and significant correlation between the increase in smoking and increase in lung cancer would not prove that smoking causes lung cancer.
It should be noted that the detection and analysis of correlation (i.e., covariation) between two statistical variables requires a relationship of some sort which associates the observations in pairs, one of each pair being a value of each of the two variables. In general, the pairing relationship may be of almost any nature, such as observations at the same time or place or over a period of time or different places.
Significance of the Study of Correlation
The study of correlation is of immense use in practical life because of the following reasons:
1. Most variables show some kind of relationship, for example price and supply, income and expenditure, etc. With the help of correlation analysis we can measure in one figure the degree of relationship existing between the variables.
*"When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation." - Croxton and Cowden: Applied General Statistics.
200 Business Statistics

2. Once we know that two variables are closely related, we can estimate the value of one variable given the value of another. This is done with the help of regression analysis, which is discussed in the next chapter.
3. Correlation analysis contributes to the understanding of economic behaviour, aids in locating the critically important variables on which others depend, may reveal to the economist the connection by which disturbances spread and suggest to him the paths through which stabilising forces become effective.
In business, correlation analysis enables the executive to estimate costs, sales, price and other variables on the basis of some other series with which these costs, sales, or prices may be functionally related. Some of the guesswork can be removed from decisions when the relationship between a variable to be estimated and the one or more other variables on which it depends is close and reasonably invariant.

4. Progressive development in the methods of science and philosophy has been characterised by increase in the knowledge of relationships or correlations. Nature has been found to be a multiplicity of inter-related forces.
It should, however, be noted that the coefficient of correlation is one of the most widely used and also one of the most widely abused statistical measures. It is sometimes abused in the sense that one overlooks the fact that correlation measures nothing but the strength of linear relationships and that it does not necessarily imply a cause-and-effect relationship.
Correlation and Causation
Correlation analysis helps us in determining the degree of relationship between two or more variables; it does not tell us anything about cause-effect relationship. Even a high degree of correlation does not necessarily mean that a relationship of cause and effect exists between the variables or, simply stated, correlation does not necessarily imply causation or functional relationship, though the existence of causation always implies correlation. By itself it establishes only covariation. The explanation of a significant degree of correlation may be any one, or a combination, of the following factors:
1. The correlation may be due to pure chance, especially in a small sample. We may get a high degree of correlation between two variables in the sample but in the universe there may not be any relationship between the variables at all. This is especially so in case of small samples. Such a correlation may arise either because of pure random sampling variation or because of the bias of the investigator in selecting the sample. The following example shall illustrate the point:
Advertisement expenditure    Sales
(Rs. lakhs)                  (Rs. crores)
25 120
35 140
45 160
55 180
65 200
The above data show a perfect positive relationship between advertisement expenditure and sales, i.e., as the advertisement expenditure is increasing, the sales are also increasing and the ratio of change between the two variables is the same. However, such a situation is rare in practice.
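As a quick check of the table above, the sketch below computes the coefficient of correlation for the five pairs. The helper name `pearson_r` is our own and not from the text; it assumes the usual product-moment definition based on deviations from the means.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

advertisement = [25, 35, 45, 55, 65]    # Rs. lakhs, from the table
sales = [120, 140, 160, 180, 200]       # Rs. crores, from the table

r_adv = pearson_r(advertisement, sales)
print(round(r_adv, 6))
```

Every step of 10 in advertisement expenditure is matched by a step of 20 in sales, so the points lie exactly on a straight line and the coefficient comes out as +1.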
2. Both the correlated variables may be influenced by one or more other variables. It is just possible that a high degree of correlation between the variables may be due to the same causes affecting each variable or different causes affecting each with the same effect. For example, a high degree of correlation between the yield per acre of rice and tea may be due to the fact that both are related to the amount of rainfall. But neither of the two variables is the cause of the other.
3. Both the variables may be mutually influencing each other so that neither can be designated as the cause and the other the effect. There may be a high degree of correlation between the variables but it may be difficult to pinpoint which is the cause and which is the effect. This is especially likely to be so in case of economic variables. For example, such variables as demand and supply, price and production, etc., mutually interact. To take a specific case, it is a well-known principle of economics that as the price of a commodity increases, its demand goes down and so price is the cause and demand the effect. But it is also possible that increased demand for a commodity due to growth of population or other reasons may force its price up. Now the cause is the increased demand, the effect the price. Thus at times it may become difficult to explain from the two correlated variables which is the cause and which is the effect because both may be reacting on each other.
The above points clearly bring out the fact that correlation does not manifest causation or functional relationship. By itself, it establishes only covariation. Correlation observed between variables that could not conceivably be causally related is called spurious or nonsense correlation. More appropriately, we should remember that it is the interpretation of the degree of correlation that is spurious, not the degree of correlation itself. The high degree of correlation indicates only the mathematical result. We should reach a conclusion based on logical reasoning and intelligent investigation on significantly related matters.
A last word of warning: Errors in correlation analysis include not only reading causation into spurious correlation but also interpreting spuriously a perfectly valid association.
Types of Correlation
Correlation is described or classified in several different ways. Three of the most important are :
(i) Positive and negative ;
(ii) Simple, partial and multiple; and
(iii) Linear and non-linear.
(i) Positive and Negative Correlation. Whether correlation is positive (direct) or negative (inverse) would depend upon the direction of change of the variables. If both the variables are varying in the same direction, i.e., if one variable is increasing the other on an average is also increasing or, if one variable is decreasing the other on an average is also decreasing, correlation is said to be positive. If, on the other hand, the variables are varying in opposite directions, i.e., as one variable is increasing the other is decreasing or vice versa, correlation is said to be negative. The following examples would illustrate positive and negative correlation:
POSITIVE CORRELATION
X:   10   12   11   18   20
Y:   15   20   22   25   37

POSITIVE CORRELATION
X:   80   70   60   40   30
Y:   50   45   30   20   10

NEGATIVE CORRELATION
X:   20   30   40   60   80
Y:   40   30   22   15   16

NEGATIVE CORRELATION
X:  100   90   60   40   30
Y:   10   20   30   40   50
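The direction of correlation in the four small tables above can be checked numerically. The sketch below is an illustration of ours rather than a procedure from the text: it sums the products of deviations from the means (the covariation), whose sign gives the direction. The function name and the table labels (a) and (b) are hypothetical and simply follow the order in which the tables are listed.

```python
def covariation(xs, ys):
    """Sum of products of deviations of X and Y from their means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

tables = {
    "positive (a)": ([10, 12, 11, 18, 20], [15, 20, 22, 25, 37]),
    "positive (b)": ([80, 70, 60, 40, 30], [50, 45, 30, 20, 10]),
    "negative (a)": ([20, 30, 40, 60, 80], [40, 30, 22, 15, 16]),
    "negative (b)": ([100, 90, 60, 40, 30], [10, 20, 30, 40, 50]),
}

directions = {}
for name, (xs, ys) in tables.items():
    # A positive sum means the variables move together on average.
    directions[name] = "positive" if covariation(xs, ys) > 0 else "negative"
    print(name, "->", directions[name])
```

Note that the third table is classed as negative even though Y rises slightly from 15 to 16 at the end; direction is judged on an average, as the text says.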

(ii) Simple, Partial and Multiple Correlation. The distinction between simple, partial and multiple correlation is based upon the number of variables studied. When only two variables are studied it is a problem of simple correlation. When three or more variables are studied it is a problem of either multiple or partial correlation. In multiple correlation three or more variables are studied simultaneously. For example, when we study the relationship between the yield of rice per acre and both the amount of rainfall and the amount of fertilisers used, it is a problem of multiple correlation. Similarly, the relationship of plastic hardness, temperature and pressure is multivariate. In partial correlation we recognise more than two variables, but consider only two variables to be influencing each other, the effect of other influencing variables being kept constant. For example, in the rice problem taken above, if we limit our correlation analysis of yield and rainfall to periods when a certain average daily temperature existed, it becomes a problem of partial correlation. In this chapter, we shall study problems relating to simple correlation only.
(iii) Linear and Non-linear (Curvilinear) Correlation. The distinction between linear and non-linear correlation is based upon the constancy of the ratio of change between the variables. If the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable, then the correlation is said to be linear. For example, observe the following two variables X and Y:
X:   10    20    30    40    50
Y:   70   140   210   280   350
It is clear that the ratio of change between the two variables is the same. If such variables are plotted on a graph paper, all the plotted points would fall on a straight line.
Correlation would be called non-linear or curvilinear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable. For example, if we double the amount of rainfall, the production of rice or wheat, etc., would not necessarily be doubled. It may be pointed out that in most practical cases we find a non-linear relationship between the variables. However, since techniques of analysis for measuring non-linear correlation are far more complicated than those for linear correlation, we generally make an assumption that the relationship between the variables is of the linear type.
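The constant-ratio test for linearity can be sketched in a few lines. The X and Y values are the ones given in the text; the second series `curved_y` is a hypothetical example of ours (Y equal to the square of X divided by 10), added only to show a case where the ratio is not constant.

```python
def change_ratios(xs, ys):
    """Ratio of the change in Y to the change in X between consecutive pairs."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]

def is_linear(xs, ys):
    """True when every consecutive change ratio is the same."""
    ratios = change_ratios(xs, ys)
    return all(ratio == ratios[0] for ratio in ratios)

X = [10, 20, 30, 40, 50]
Y = [70, 140, 210, 280, 350]          # from the text
curved_y = [10, 40, 90, 160, 250]     # hypothetical: X squared, divided by 10

linear_ratios = change_ratios(X, Y)
print(linear_ratios, is_linear(X, Y))
print(change_ratios(X, curved_y), is_linear(X, curved_y))
```

For the text's data every ratio is 7, which is why all the plotted points fall on one straight line. The exact equality test is adequate for tidy integer data like this; real data would need a tolerance.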
The following two diagrams will illustrate the difference between linear and curvilinear correlation:
[Diagrams: Positive Linear Correlation; Curvilinear Correlation. Each plots Y against X.]
METHODS OF STUDYING CORRELATION


The following are the important methods of ascertaining whether two variables are correlated or not:
I. Scatter Diagram Method;
II. Karl Pearson's Coefficient of Correlation;
III. Spearman's Rank Correlation Coefficient; and
IV. Method of Least Squares.*
Of these, the first one is based on the knowledge of graphs whereas the others are mathematical methods. Each of these methods shall be discussed in detail in the following pages.
I. SCATTER DIAGRAM METHOD
The simplest device for studying correlation in two variables is a special type of dot chart called dotogram or scatter diagram. When this method is used, the given data are plotted on a graph paper in the form of dots, i.e., for each pair of X and Y values we put a dot and thus obtain as many points as the number of observations. By looking at the scatter of the various points, we can form an idea as to whether the variables are related or not. The more the plotted points "scatter" over a chart, the lesser is the degree of relationship between the two variables. The more nearly the points come to a line, the higher the degree of relationship. If all the points lie on a straight line rising from the lower left-hand corner to the upper right-hand corner, correlation is said to be perfectly positive (i.e., r = +1) (diagram I). On the other hand, if all the points are lying on a straight line falling from the upper left-hand corner to the lower right-hand corner of the diagram, correlation is said to be perfectly negative (i.e., r = -1) (diagram II). If the plotted points fall in a narrow band, there would be a high degree of correlation between the variables: correlation shall be positive if the points show a rising tendency from the lower left-hand corner to the upper right-hand corner (diagram III) and negative if the points show a declining tendency from the upper left-hand corner to the lower right-hand corner of the diagram (diagram IV). On the other hand, if the points are widely scattered over the diagram it indicates a very low degree of relationship between the variables: correlation shall be positive if the points are rising from the lower left-hand corner to the upper right-hand corner (diagram V) and negative if the points are running from the upper left-hand side to the lower right-hand side of the diagram (diagram VI). If the plotted points lie on a straight line parallel to the X-axis, or in a haphazard manner, it shows the absence of any relationship between the variables (i.e., r = 0) as shown by diagram VII.
[Diagrams I-IV: Perfect Positive Correlation (I); Perfect Negative Correlation (II); High Degree of Positive Correlation (III); High Degree of Negative Correlation (IV). Each plots Y against X.]
*This method is discussed in detail in the chapter on 'Regression Analysis'.
[Diagrams V-VII: Low Degree of Positive Correlation (V); Low Degree of Negative Correlation (VI); No Correlation, r = 0 (VII). Each plots Y against X.]
Illustration 1. Given the following pairs of values:
Capital employed (Rs. Crore): 2 3 4 5 7 8 9 11 12
Profits (Rs. Lakhs): 3 5 4 8 10 12 14
(a) Make a scatter diagram.
(b) Do you think that there is any correlation between profits and capital employed? Is it positive? Is it high or low?
Solution. By looking at the scatter diagram we can say that the variables profits and capital employed are correlated. Further, correlation is positive because the trend of the points is upward, rising from the lower left-hand corner to the upper right-hand corner of the diagram. The diagram also indicates that the degree of relationship is high because the plotted points are in a narrow band, which shows that it is a case of high degree of positive correlation.
[Scatter diagram: Profits (Rs. Lakhs) on the Y-axis against Capital Employed (Rs. Crore) on the X-axis.]

Merits and Limitations of the Method


Merits: 1. It is a simple and non-mathematical method of studying correlation between the variables. As such it can be easily understood and a rough idea can very quickly be formed as to whether or not the variables are related.
2. It is not influenced by the size of extreme values whereas most of the mathematical methods of finding correlation are influenced by extreme values.
3. Making a scatter diagram usually is the first step in investigating the relationship between the variables.
Limitations. By applying this method we can get an idea about the direction of correlation and also whether it is high or low. But we cannot establish the exact degree of correlation between the variables as is possible by applying the mathematical methods.
II. KARL PEARSON'S COEFFICIENT OF CORRELATION
Of the several mathematical methods of measuring correlation, Karl Pearson's method, popularly known as the Pearsonian coefficient of correlation, is most widely used in practice. The coefficient of correlation is denoted by the symbol r. It is one of the very few symbols that are used universally for describing the degree and direction of relationship between two variables. If the two variables under study are X and Y, the following formula suggested by Karl Pearson can be used for measuring the degree of relationship.
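\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \; \sum (Y - \bar{Y})^2}} \qquad \ldots(i) \]

where \bar{X} and \bar{Y} are the respective means of the X and Y variables.

The above formula can be written as:

\[ r = \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}} \qquad \ldots(ii) \]

where x = (X - \bar{X}) and y = (Y - \bar{Y}).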



This formula is to be used only where the deviations are taken from actual means and not from assumed means.
The coefficient of correlation can also be calculated from the original set of observations (i.e., without taking deviations from the mean) by applying the following formula:
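\[ r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{\sqrt{\sum X^2 - \dfrac{(\sum X)^2}{N}} \; \sqrt{\sum Y^2 - \dfrac{(\sum Y)^2}{N}}} \]

or

\[ r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{N \sum X^2 - (\sum X)^2} \; \sqrt{N \sum Y^2 - (\sum Y)^2}} \qquad \ldots(iii) \]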
The value of the coefficient of correlation as obtained by the above formula shall always lie between +1 and -1. When r = +1, it means there is perfect positive correlation between the variables. When r = -1, it means there is perfect negative correlation between the variables. When r = 0, it means there is no linear relationship between the two variables. However, in practice, such values of r as +1, -1, and 0 are rare. We normally get values which lie between +1 and -1 such as 0.8, -0.4, etc. The coefficient of correlation describes not only the magnitude of correlation but also its direction. Thus, +0.8 would mean that correlation is positive because the sign of r is positive and the magnitude of correlation is 0.8.
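The deviation form and the raw-observation form of the formula should give the same value. The sketch below compares the two on the first positive-correlation table from earlier in the chapter; the function names are ours, and the raw form follows the version with N multiplied through (formula (iii)).

```python
from math import sqrt

def r_from_deviations(xs, ys):
    """Deviation form: sum(xy) / sqrt(sum(x^2) * sum(y^2)), deviations from actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

def r_from_raw_sums(xs, ys):
    """Raw form: works directly on the original observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

X = [10, 12, 11, 18, 20]
Y = [15, 20, 22, 25, 37]

r_dev = r_from_deviations(X, Y)
r_raw = r_from_raw_sums(X, Y)
print(round(r_dev, 4), round(r_raw, 4))
```

The raw form avoids computing the means first, which is convenient for hand calculation when the means are fractional; both forms are algebraically identical.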
The following illustration will clarify the procedure of computing the coefficient of correlation :
Illustration 2. Find the correlation coefficient between the sales and expenses from the data given below:
Firm:                     1   2   3   4   5   6   7   8   9  10
Sales (Rs. Lakhs):       50  50  55  60  65  65  65  60  60  50
Expenses (Rs. Lakhs):    11  13  14  16  16  15  15  14  13  13
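For reference, the arithmetic for this illustration can be laid out with formula (ii), taking deviations from the actual means. The figures below are computed by us from the ten pairs of sales (X) and expenses (Y) given above; they are not copied from the source's own solution, so treat them as a check rather than the book's worked answer.

```latex
\[
\begin{aligned}
\sum X &= 580, \quad \bar{X} = 58; \qquad \sum Y = 140, \quad \bar{Y} = 14 \\
\sum xy &= 70, \qquad \sum x^2 = 360, \qquad \sum y^2 = 22 \\
r &= \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}
   = \frac{70}{\sqrt{360 \times 22}}
   = \frac{70}{\sqrt{7920}} \approx 0.787
\end{aligned}
\]
```

On these figures there is a fairly high degree of positive correlation between sales and expenses.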
1
Correlation Analysis
Introduction

Every day, managers make decisions that are based on predictions of future events. To make these forecasts, they depend on the relationship between what is already known and what is to be estimated. If they can determine how the known is related to the future event, they can help the decision-making process considerably. Correlation analysis shows us how to determine the nature of a relationship between two variables.
MEANING AND DEFINITION

Meaning
The term correlation implies a relationship between two variables. Statistically, when a change in the direction of one variable is accompanied by a corresponding change in the direction of the other, the two variables are said to be correlated. For example, there may be a change in the salary of a person (X) with an increase in the years of experience (Y). The degree of change is expressed by a coefficient which ranges between +1 and -1. An absence of correlation is indicated by zero. Correlation has nothing to do with the units in which the variables are expressed.
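The claim that correlation is independent of units can be illustrated with a small sketch. The salary and experience figures below are hypothetical, invented only for this example, and `pearson_r` is our own helper using the usual product-moment definition.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

# Hypothetical data: experience in years, salary in thousands of rupees.
experience_years = [1, 3, 5, 7, 9]
salary_thousands = [20, 26, 35, 41, 50]

# The same observations expressed in different units.
experience_months = [12 * e for e in experience_years]
salary_rupees = [1000 * s for s in salary_thousands]

r_original = pearson_r(experience_years, salary_thousands)
r_rescaled = pearson_r(experience_months, salary_rupees)
print(round(r_original, 6), round(r_rescaled, 6))
```

Multiplying a variable by a positive constant scales the numerator and the denominator of the coefficient by the same factor, so the two results agree.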
Definitions
Some of the important definitions of correlation are:
According to Croxton and Cowden, "When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation."
According to Ya Lun Chou, "Correlation analysis attempts to determine the 'degree of relationship' between variables."
According to W.I. King, "Correlation means that between two series or groups of data there exists some causal connection."
According to Wessel and Willet, "When a group of items are recorded with respect to the values of two distinct variables and it is found that pairs of values tend to be associated, the two variables are said to be correlated."
According to L.R. Connor, "If two or more quantities vary in sympathy so that movements in the one tend to be accompanied by corresponding movements in the other(s) then they are said to be correlated."
USES OF CORRELATION
OR
IMPORTANCE OR SIGNIFICANCE OF CORRELATION

Various uses of correlation in statistical analysis are:
(1) Useful in deriving the degree and direction of relationships - Economic theory and business studies show relationships between variables like price and quantity demanded, advertising expenditure and sales, rainfall and crop yield, etc. Correlation analysis helps in deriving precisely the degree and direction of such relationships.
(2) Useful in reliable forecasting - Correlation analysis between two variables enables us to reduce the range of uncertainty associated with decision-making and leads to more reliable forecasting. For example, if the yield of wheat has increased, other factors remaining constant, we may expect a fall in its price.
(3) Useful in estimating the value of one variable dependent on another variable - The concepts of regression and ratio of variation are based on correlation. With the help of these concepts, we can estimate the value of one variable given the value of another variable. For example, if the import and export of a country are interrelated, then the probable export at any time period can easily be estimated on the basis of import by using regression equations.
(4) Useful in research and investigation - The technique of correlation analysis is useful in making analysis, drawing conclusions and developing hypotheses and theories in the area of research and investigation.
(5) Useful in measuring the degree of relationship between two variables in one figure only - In the statistical technique of correlation, the average of relationships in a series can be summed up in a single value called the coefficient of correlation. Any relation derived otherwise can be verified and tested for significance with reference to the measure of correlation.
CORRELATION AND CAUSE AND EFFECT RELATIONSHIP
It is generally assumed that, in correlation, there is a cause and effect relationship between two series, but it is not necessary in every case. There is a possibility that statistically two variables are found correlated but practically they are not related at all. For example, a significant positive correlation found between increase in crime and vaccination in a country has no relevance. A correlation analysis between these variables is of no use and is characterised as 'spurious' or 'nonsense' correlation.
Thus, it is necessary to ensure that the variables for correlation analysis are properly selected to make the analysis purposeful. The following situations may be considered to interpret properly the nature and degree of relationship:
(1) Correlation may be due to chance coincidence - It may happen that there is no cause and effect relationship between two series but a high degree of correlation is obtained by applying the formula, for example, between production of steel and sale of aluminium. Such a relation is known as nonsense or spurious correlation.
But it should also be examined if steel and aluminium are used as substitutes, in which case the relationship may not be spurious and may serve some analytical purpose.
(2) Mutual dependence - There may be two related variables in which it becomes difficult to determine the cause and effect. For example, if the price and quantity demanded of a product are increasing, we may draw a conclusion that increase in demand is the cause and increase in price is the effect.
But it is also possible that increase in price is due to anticipation of shortage of supply in future, in which case anticipation of price increase is the cause and increase in demand (due to anticipation of future shortage) is the effect.
(3) Both being influenced by a third variable - It is likely that a high degree of positive or negative correlation between two variables is due to the influence of a third variable not included in the analysis. For example, change in sale of T.V. and Refrigerator may not be directly correlated. Both may be dependent upon a third variable, namely change in the general level of income of people.
TYPES OF CORRELATION
Types of correlation can be classified under four broad types:
(1) Positive and Negative Correlation,
(2) Simple and Multiple Correlation,
(3) Partial and Total Correlation,
(4) Linear and Non-linear (Curvi-linear) Correlation.
(1) Positive and Negative Correlation - The correlation is said to be positive when the values of two variables move in the same direction, i.e. an increase in the value of one variable is accompanied by an increase in the value of the other variable, or a decrease in the value of one variable is accompanied by a decrease in the value of the other variable. For example, increase in education leads to increase in employment, decrease in rainfall leads to decrease in agricultural production.
The correlation is said to be negative if an increase (or decrease) in the values of one variable is followed by a decrease (or increase) in the value of the other, i.e. the changes in the values of two variables move in opposite directions. For example, increase in employment leads to decrease in crime, decrease in price leads to increase in sales, etc. This type of correlation is also called inverse correlation.
(2) Simple and Multiple Correlation - Correlation is said to be simple when only two variables are studied, for example, yield of agricultural production and the use of fertilizers, or height and weight of persons. If more than two variables are studied simultaneously, the correlation is said to be multiple. For example, the relationship of yield of agricultural production may be examined with reference to fertilizers, pesticides, etc., or the relationship of price, demand and supply of a product.
(3) Partial and Total Correlation - There are two types of multiple correlation analysis.
The correlation is partial when we study the correlation between two variables excluding the influence of some other variables on both the variables. For example, correlation between yield of agricultural production and fertilizers excluding the effect of pesticides.
The correlation is total when it is based on all the relevant variables, which, however, is normally not feasible.
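The idea of excluding the effect of a third variable can be made concrete with the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). This formula is not given in the text, and the data below are hypothetical, constructed so that fertilizer use and crop yield both follow pesticide use; the sketch is meant only to show how a high simple correlation can vanish once the third variable is held constant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )

def partial_r(xs, ys, zs):
    """First-order partial correlation of X and Y with Z held constant."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical index numbers for eight plots of land.
pesticides = [1, 2, 3, 4, 5, 6, 7, 8]
fertilizer = [3, 3, 5, 9, 11, 11, 13, 17]
crop_yield = [4, 7, 8, 11, 14, 17, 22, 25]

simple = pearson_r(fertilizer, crop_yield)
partial = partial_r(fertilizer, crop_yield, pesticides)
print(round(simple, 3), round(partial, 3))
```

In this constructed example the simple correlation between fertilizer and yield is above 0.9, yet the partial correlation is essentially zero, because the whole of the association runs through pesticide use.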
(4) Linear and Non-linear Correlation - Correlation is said to be linear if the amount of change in one variable tends to bear a constant ratio to the amount of change in the other variable. For example, consider the figures related to the following two variables:

      Linear Correlation            Non-linear Correlation
Variable X    Variable Y        Variable X    Variable Y
50                              50            2
60                              60            12
70                              70            26
80            12                80            40

If the corresponding values of two variables with linear correlation are plotted on a graph, a straight line is obtained. Mathematically, this relationship may be expressed as
Y = a + bX
The correlation is said to be non-linear (curvi-linear) if the amount of change in one variable does not bear a constant ratio to the amount of change in the other related variable. For example, if we double the use of fertilizers, the agricultural production would not necessarily be doubled.
Correlation, generally speaking, refers to a linear relationship.
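A short derivation shows why an exact straight-line relationship gives a perfect coefficient. It assumes the product-moment formula with deviations x = X - \bar{X} and y = Y - \bar{Y} taken from the actual means, and a slope b that is not zero.

```latex
\[
\begin{aligned}
Y = a + bX \;&\Rightarrow\; \bar{Y} = a + b\bar{X} \;\Rightarrow\; y = Y - \bar{Y} = b\,(X - \bar{X}) = b\,x \\
\sum xy &= b \sum x^2, \qquad \sum y^2 = b^2 \sum x^2 \\
r &= \frac{\sum xy}{\sqrt{\sum x^2 \times \sum y^2}}
   = \frac{b \sum x^2}{\sqrt{\sum x^2 \times b^2 \sum x^2}}
   = \frac{b}{|b|}
\end{aligned}
\]
```

So r = +1 when b is positive and r = -1 when b is negative, whatever the values of a and the size of b.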

DEGREE OF CORRELATION
The degree or the intensity of relationship between two variables can be determined by computing the value of the coefficient of correlation. On the basis of the coefficient of correlation, the degree of correlation may be of three types:
(1) Perfect correlation - If the relationship between two variables is such that with an increase in the value of one variable, the value of the other variable increases or decreases in a fixed proportion, correlation between them is said to be perfect. It is of two types -
(a) Perfect positive correlation - If both the series move in the same direction and in the same proportion, there would be perfect positive correlation between them. The coefficient of correlation in this case would be +1.
(b) Perfect negative correlation - If both the series move in reverse direction and in the same proportion, there would be perfect negative correlation between them. The coefficient of correlation in this case would be -1.
Perfect correlation is obtained when there is complete mutual dependence between the two series.
(2) Limited degree of correlation - Limited degree of correlation is common in economic, business and social activities and can be very high, high, moderate, low or very low. It is of two types -
(a) Limited positive correlation - If there are unequal changes in two variables in the same direction, correlation is said to be limited positive. The coefficient of correlation in this case would be between 0 and 1.
(b) Limited negative correlation - If there are unequal changes in two variables in the opposite direction, the correlation is said to be limited negative. The coefficient of correlation in this case would be between 0 and -1.
(3) Absence of correlation - If no relationship is observed between the two variables, it is known as absence of correlation. The coefficient of correlation in this case would be 0 (zero).
The following chart shows degrees of correlation according to Karl Pearson's formula:
DEGREE OF CORRELATION

Degree                         Positive                  Negative
1. Perfect correlation
   (a) Perfect positive        +1
   (b) Perfect negative                                  -1
2. Limited degree
   (a) Very high               +0.9 to +0.99             -0.9 to -0.99
   (b) Fairly high             +0.75 to +0.9             -0.75 to -0.9
   (c) Moderate                +0.50 to +0.75            -0.50 to -0.75
   (d) Low                     +0.25 to +0.50            -0.25 to -0.50
   (e) Very low                above 0, below +0.25      below 0, above -0.25
3. Absence of correlation      0
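The chart can be turned into a small lookup, sketched below. The function name is ours. Two points are assumptions rather than anything the chart states: a value falling exactly on a shared boundary (such as 0.9 or 0.75) is placed in the higher band, and values between 0.99 and 1 are treated as very high.

```python
def degree_of_correlation(r):
    """Describe a coefficient of correlation using the bands in the chart."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("coefficient of correlation must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "absence of correlation"
    direction = "positive" if r > 0 else "negative"
    if size == 1:
        return f"perfect {direction}"
    # Assumption: boundary values go to the higher band.
    if size >= 0.9:
        label = "very high"
    elif size >= 0.75:
        label = "fairly high"
    elif size >= 0.50:
        label = "moderate"
    elif size >= 0.25:
        label = "low"
    else:
        label = "very low"
    return f"{label} {direction}"

for value in (1, 0.95, 0.787, -0.4, 0.1, 0, -1):
    print(value, "->", degree_of_correlation(value))
```

Such labels are a convention for describing results; they say nothing about whether a coefficient is statistically significant.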
METHODS OF DETERMINING SIMPLE CORRELATION
Initially, we confine ourselves to simple linear correlation. The different methods of finding simple linear correlation are as follows -
(I) Graphic Methods
(a) Correlation Graph
(b) Scatter Diagram (or Dotogram)
(II) Algebraic (or Mathematical) Methods
(a) Karl Pearson's coefficient of correlation (or Covariance Method)
(b) Spearman's Rank Co-efficient of correlation (or Rank Difference Method)
(c) Concurrent Deviation Method
(d) Least Squares Method (or Method of least squares)
(I) CORRELATION GRAPH
Under this method, two curves are drawn on the graph, one for each of the variables under study. The values of the two related variables are represented on the y-axis and the values of a common reference, viz. time, place etc., are represented on the x-axis, i.e. the base line. Such graphs can be drawn either on a natural scale or on a semi-logarithmic or ratio scale, depending upon the size of the magnitude of the data. Further, if the minimum values of the variables are much above zero, a false base line is drawn in order to avoid the unnecessary empty spaces in the graph.
Interpretation of Correlation Graph
After viewing the graph, the inference about the nature and degree of correlation is drawn roughly by observing the direction and closeness of the two curves :
(1) Perfect positive correlation - If both the curves drawn on the graph are moving in the same direction (either upward or downward), the correlation is perfect positive.
(2) Perfect negative correlation - If both the curves move in different directions, i.e. one moves upward and the other downward or vice-versa, it would indicate a perfect negative correlation between the variables.
(3) Absence of Correlation - If the curves move criss-cross and show erratic movements, it would indicate that either there is no correlation or there is a very low degree of correlation between the two variables under study.
Illustration 1.
From the following data, study the correlation between the two variables of income and expenditure using the graphic method : (₹ Crores)
Year 2008 2009 2010 2011 2012 2013 2014 2015
7 10 6 5 7
Income 4 5
5 4 3 2 8
Expenditure 6

Solution :
[Figure: Correlation Graph showing the Income and Expenditure curves (in Rs. crore) plotted against Years.]
Conclusion - It is clear from the above correlation graph that when the income curve is rising, the expenditure curve is falling and vice-versa. Thus, there is a negative correlation between income and expenditure variables.
Illustration 2.
Prepare a correlation graph on the basis of following data and comment about the correlation between age and blood-pressure :
S. No.          :   1    2    3    4    5    6    7    8    9
Age (Years)     :  55   40   70   35   60   45   55   50   40
Blood-pressure  : 145  125  160  120  150  130  150  145  140
Solution :
[Figure: Correlation Graph showing the Age (years) and Blood Pressure curves plotted against S. No.]
Conclusion - Both curves move in the same direction and are also very close to each other. Hence, there is a high degree of positive correlation between age and blood-pressure.
Advantages
1. It is the simplest to use.
2. It can be used for simple as well as multiple correlation.
Disadvantage
We can conclude only a rough estimate of the nature of correlation, and the exact degree of correlation may not be known.
(II) SCATTER DIAGRAM (OR DOTOGRAM OR SCATTERGRAM)
Under this method, a diagram is prepared on the basis of corresponding values of two variables. The values of one of the variables are represented on the x-axis and those of the other variable on the y-axis through natural scale.
For each pair of X and Y values, we mark a dot, and we get as many points on the graph as the number of observations. The diagram of dots so obtained is called a scatter diagram.
By examining the shape of the plotted dots, the degree of correlation between the variables can be estimated.
Interpretation of Scatter Diagram
After viewing the scatter diagram, the inference about the nature and degree of correlation is drawn as follows :
(1) Positive or Negative Correlation - If the trend of the points is upward, rising from the lower left to upper right, the correlation is positive since it shows that the values of the two variables move in the same direction. On the other hand, if the trend of points is reverse, from upper left to the bottom right, the correlation is negative.
[Figure: Scatter diagrams showing Positive correlation and Negative correlation.]
(2) Perfect positive or Perfect negative correlation - If all the points form a straight line from lower left to upper right, there is perfect positive correlation between the variables. On the other hand, if the line is reversed from upper left to the lower right, the correlation is perfect negative.
[Figure: Scatter diagrams showing Perfect Positive Correlation and Perfect Negative Correlation.]
(3) High degree or Low degree of correlation - If the points on a scatter diagram show a very little spread, a fairly high degree of correlation can be expected between the variables. On the other hand, if the points are spread widely, a low degree of correlation is expected.
[Figure: Scatter diagrams showing High Degree of Positive Correlation and Low Degree of Positive Correlation.]
(4) Absence of correlation - If the plotted points do not show any trend, the two variables have no correlation.
[Figure: Scatter diagrams showing No Correlation.]
Illustration 3.
From the following pair of data, study the correlation through a scatter diagram
X 10 20 30 40 50
6 10
4
Y
Solution:
[Figure: Scatter diagram with Values of X on the x-axis and Values of Y on the y-axis.]
Conclusion - It is clear from the above scatter diagram that there is a perfect positive correlation between the values of two variables X and Y.
Illustration 4.
Following are the marks in Statistics and Accountancy of students in a class. Draw a scatter diagram and comment on the degree of correlation.
6
Marks in Statistics
Marks in Accountancy 6
8
6 10
Solution:

[Figure: Scatter diagram with Marks (in Statistics) on the x-axis and Marks (in Accountancy) on the y-axis.]
Conclusion - It is clear from the above graph that there is a very high degree of positive correlation between marks of Statistics and Accountancy.
Advantages
(1) Simple - It is a simple and attractive method for determining the correlation between variables.
(2) Easy to draw and interpret - It is very easy to draw a scatter diagram, and a common man can understand it easily and make interpretations.
(3) Useful in estimating the value of missing dependent variable - When the value of the independent variable is given.
(4) Useful in detecting abnormal variations or questionable data.
(5) No effect of extreme values - Values of extreme items do not affect the result. Such points remain isolated in the diagram.
Disadvantages
(1) It does not provide the precise degree of correlation-It shows only a visual
picture of relationship between two variables which indicates the direction of relationship
i.e. positive or negative etc.
(2) Not suitable for further mathematical treatment-Since it does not
provide
an exact measure of the extent of relationship between the variables.
(III) KARL PEARSON'S COEFFICIENT OF CORRELATION
Karl Pearson's method, popularly known as Pearson's coefficient of correlation, is most widely used in practice. It is denoted by the symbol 'r' and is based on the covariance of the concerned variables.
The formula for computing Pearsonian 'r' can take various alternative forms depending upon the choice of the user.
Karl Pearson's Co-variance
Co-variance is the method of finding out joint variations between two variables. It is given by the formula
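For reference, the standard textbook definition of covariance, and the form of r built on it, can be written as below. This is an assumed reconstruction in the notation of the worked illustrations ($d_x = X - \bar{X}$, $d_y = Y - \bar{Y}$, $N$ pairs of observations), not a transcription of the original display.

\[
\operatorname{Cov}(X, Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N} = \frac{\sum d_x d_y}{N}
\]

\[
r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \, \sigma_y} = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}}
\]

The second line follows because $\sigma_x = \sqrt{\sum d_x^2 / N}$ and $\sigma_y = \sqrt{\sum d_y^2 / N}$, so the factor $N$ cancels.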
Illustration 7.
Calculate the co-efficient of correlation from the following data through Karl Pearson's Method:
X : 12   9   8  10  11  13   7
Y : 14   8   6   9  11  12   3
[CCS Univ. BBA, June 2012]
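Solution :
CALCULATION OF COEFFICIENT OF CORRELATION

X      dx = X - X̄    d²x     Y      dy = Y - Ȳ    d²y     dxdy
12          2          4     14          5         25      10
 9         -1          1      8         -1          1       1
 8         -2          4      6         -3          9       6
10          0          0      9          0          0       0
11          1          1     11          2          4       2
13          3          9     12          3          9       9
 7         -3          9      3         -6         36      18
ΣX = 70            Σd²x = 28  ΣY = 63           Σd²y = 84  Σdxdy = 46

\[ \bar{X} = \frac{\sum X}{N} = \frac{70}{7} = 10, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{63}{7} = 9 \]

By using the formula

\[ r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{46}{\sqrt{28 \times 84}} = \frac{46}{48.497} = 0.95 \]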
Illustration 8.
Calculate Karl Pearson's correlation coefficient for the following data
X :  6   2  10   4   8
Y :  9  11   5   8   7
[CCS Univ. BCA, Dec. 2012]

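Solution :
CALCULATION OF COEFFICIENT OF CORRELATION

X      dx = X - X̄    d²x     Y      dy = Y - Ȳ    d²y     dxdy
 6          0          0      9          1          1        0
 2         -4         16     11          3          9      -12
10          4         16      5         -3          9      -12
 4         -2          4      8          0          0        0
 8          2          4      7         -1          1       -2
ΣX = 30            Σd²x = 40  ΣY = 40           Σd²y = 20  Σdxdy = -26

\[ \bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8 \]

The coefficient of correlation between X and Y is

\[ r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{28.284} = -0.92 \]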
(2) Short-cut method - This method may be used when the Arithmetic Means of both the series are not in whole numbers. Under this method, the deviations are taken from an assumed mean. Steps involved are :
(a) Select any number as assumed mean (from or outside the series), say (A) for series X and (B) for series Y.
(b) Compute deviations from A in the first series, i.e. (X - A), and denote it by 'dx'. Similarly, compute deviations from B in the second series, i.e. (Y - B), and denote it by 'dy'. Summate them to obtain Σdx and Σdy.
(c) Multiply the deviations of both the series and sum the product to obtain Σdxdy.
(d) Square the deviations of both the series and obtain the sum of their respective squares of deviations, i.e. Σd²x and Σd²y.
(e) Finally, use the following formula to get the value of coefficient of correlation

\[ r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{N}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{N}} \; \sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{N}}} \]

Note : 1. \(\frac{\sum d_x \sum d_y}{N}\) in the numerator and \(\frac{(\sum d_x)^2}{N}\) and \(\frac{(\sum d_y)^2}{N}\) in the denominator are corrections introduced due to assumed means.
2. The above formula, although it looks large and complex, saves a lot of computational work.
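Steps (a) to (e) above translate almost line for line into code. The following is a minimal sketch under stated assumptions: the function name `pearson_r_shortcut` and its parameter names are hypothetical, and the sample figures are the exports and imports series of Illustration 9 with assumed means 65 and 60.

```python
from math import sqrt


def pearson_r_shortcut(xs, ys, a, b):
    """Pearson's r from deviations about assumed means a (for X) and b (for Y)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both series must be non-empty and of equal length")
    n = len(xs)

    # Step (b): deviations from the assumed means and their totals.
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    sum_dx = sum(dx)
    sum_dy = sum(dy)

    # Step (c): sum of products of deviations.
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))

    # Step (d): sums of squared deviations.
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)

    # Step (e): apply the corrections for the assumed means.
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = sqrt(sum_dx2 - sum_dx ** 2 / n) * sqrt(sum_dy2 - sum_dy ** 2 / n)
    return numerator / denominator


exports_x = [42, 44, 58, 55, 89, 98, 66]
imports_y = [56, 49, 53, 58, 65, 76, 58]
r = pearson_r_shortcut(exports_x, imports_y, a=65, b=60)
print(round(r, 4))
```

Dividing every term by N, as here, or multiplying every term by N, as in the working of Illustration 9, gives the same value of r because the factor cancels between numerator and denominator.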
Illustration 9.
Calculate Karl-Pearson's Coefficient of Correlation between exports and imports:
Exports: 42 44 58 55 89 98 66

Imports: 56 49 53 58 65 76 58

[B.Com., Meerut, 1997; Garhwal, 2012]
Solution :
CALCULATION OF KARL PEARSON'S COEFFICIENT OF CORRELATION

X      dx from 65     d²x      Y      dy from 60     d²y     dxdy
42        -23         529     56         -4           16       92
44        -21         441     49        -11          121      231
58         -7          49     53         -7           49       49
55        -10         100     58         -2            4       20
89        +24         576     65         +5           25      120
98        +33       1,089     76        +16          256      528
66         +1           1     58         -2            4       -2
Total      -3       2,785                -5          475    1,038

\[ r = \frac{\sum d_x d_y \times N - (\sum d_x \times \sum d_y)}{\sqrt{\left[\sum d_x^2 \times N - (\sum d_x)^2\right]\left[\sum d_y^2 \times N - (\sum d_y)^2\right]}} \]

\[ = \frac{1{,}038 \times 7 - (-3 \times -5)}{\sqrt{\left[2{,}785 \times 7 - (-3)^2\right]\left[475 \times 7 - (-5)^2\right]}} \]

\[ = \frac{7{,}266 - 15}{\sqrt{[19{,}486]\,[3{,}300]}} = \frac{7{,}251}{8{,}018.965} = +0.9042 \]
Illustration 10.
From the following information relating to the Stock Exchange quotations for two shares A and B, ascertain by using Pearson's Coefficient of correlation, whether shares A and B are correlated in their price :
Price of share A (₹) : 160  164  172  182  166  170  178
Price of share B (₹) : 292  280  260  234  266  254  230
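Solution :
CALCULATION OF COEFFICIENT OF CORRELATION

        Share A                          Share B
X       Deviation      d²x      Y        Deviation      d²y      dxdy
        dx = X - A                       dy = Y - B
160        -10         100     292          32         1,024     -320
164         -6          36     280          20           400     -120
172          2           4     260 = B       0             0        0
182         12         144     234         -26           676     -312
166         -4          16     266           6            36      -24
170 = A      0           0     254          -6            36        0
178          8          64     230         -30           900     -240
N = 7    Σdx = 2   Σd²x = 364  N = 7    Σdy = -4   Σd²y = 3,072  Σdxdy = -1,016

\[ r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{N}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{N}} \; \sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{N}}} \]

\[ = \frac{-1{,}016 - \dfrac{(2)(-4)}{7}}{\sqrt{364 - \dfrac{(2)^2}{7}} \; \sqrt{3{,}072 - \dfrac{(-4)^2}{7}}} = \frac{-1{,}016 + 1.143}{\sqrt{363.43} \; \sqrt{3{,}069.71}} \]

\[ = \frac{-1{,}014.86}{1{,}056.24} = -0.961 \]

Conclusion - There is a very high degree of negative correlation between prices of shares A and B.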
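The corrections for assumed means make r independent of which values are chosen for A and B. The sketch below illustrates this with the share prices of Illustration 10; the helper name `r_from_assumed_means` and the extra pairs of assumed means are hypothetical choices made only for the demonstration.

```python
from math import sqrt


def r_from_assumed_means(xs, ys, a, b):
    """Short-cut formula: deviations are taken about assumed means a and b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    numerator = sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy) / n
    spread_x = sum(p * p for p in dx) - sum(dx) ** 2 / n
    spread_y = sum(q * q for q in dy) - sum(dy) ** 2 / n
    return numerator / (sqrt(spread_x) * sqrt(spread_y))


share_a = [160, 164, 172, 182, 166, 170, 178]
share_b = [292, 280, 260, 234, 266, 254, 230]

# The first pair is the one used in the solution; the others are arbitrary.
assumed_means = [(170, 260), (172, 254), (165.5, 250), (0, 0)]
results = [r_from_assumed_means(share_a, share_b, a, b) for a, b in assumed_means]

for (a, b), value in zip(assumed_means, results):
    print(a, b, round(value, 3))
```

Every pair of assumed means leads to the same coefficient, so A and B can be picked purely for arithmetic convenience.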
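Product Moment Correlation
Karl Pearson's Coefficient of Correlation is given by

\[ r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{N}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{N}} \; \sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{N}}} \]

\[ = \frac{-3{,}050 - \dfrac{(55)(-21)}{10}}{\sqrt{9{,}155 - \dfrac{(55)^2}{10}} \; \sqrt{4{,}999 - \dfrac{(-21)^2}{10}}} \]

\[ = \frac{-2{,}935}{\sqrt{8{,}852.5 \times 4{,}954.9}} = \frac{-2{,}935}{6{,}622.93} = -0.443 \]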
(5) Preparation of X-series, Y-series or both on the basis of given data - In such cases, the values of series X or series Y or both are not given clearly. Hence, we identify the characteristics between which the coefficient of correlation is to be calculated and prepare the series.
Illustration 19.
Find the coefficient of correlation between age and playing habit of the following students
Age              :  15   16   17   18   19   20
No. of students  : 250  200  150  120  100   80
No. of players   : 200  150   90   48   30   12
[B.Com., Agra, 2003, 2008; Kanpur, 2002, 2009; Garhwal, 2003; CCS Univ., 2006, 2011]

Solution :
Here, age (X-series) will be taken as it is, while playing habit (Y-series) will be calculated by :

\[ \frac{\text{No. of players}}{\text{No. of students}} \times 100 \]

CALCULATION OF COEFFICIENT OF CORRELATION

        Age group                        Playing habits
X        dx = X - A     d²x      Y        dy = Y - B     d²y      dxdy
15          -3           9       80           40        1,600     -120
16          -2           4       75           35        1,225      -70
17          -1           1       60           20          400      -20
18 = A       0           0       40 = B        0            0        0
19           1           1       30          -10          100      -10
20           2           4       15          -25          625      -50
N = 6    Σdx = -3    Σd²x = 19   N = 6    Σdy = 60   Σd²y = 3,950  Σdxdy = -270

Coefficient of correlation is given by

\[ r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{N}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{N}} \; \sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{N}}} \]
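The preparation of the Y-series and the final coefficient for Illustration 19 can be verified with a short script. This is a sketch only: the variable and function names are hypothetical, and it takes deviations about the actual means instead of the assumed means 18 and 40, which leaves r unchanged.

```python
from math import sqrt

ages = [15, 16, 17, 18, 19, 20]
students = [250, 200, 150, 120, 100, 80]
players = [200, 150, 90, 48, 30, 12]

# Y-series: players as a percentage of students in each age group.
playing_habit = [p * 100 / s for p, s in zip(players, students)]


def pearson_r(xs, ys):
    """Pearson's r using deviations from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


r = pearson_r(ages, playing_habit)
print(playing_habit)
print(round(r, 3))
```

The derived Y-series matches the Y column of the table (80, 75, 60, 40, 30, 15), and the coefficient works out to roughly -0.99, i.e. playing habit falls sharply as age rises in this data.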
