
Correlation Notes

The document outlines various statistical concepts, focusing on sampling, correlation, and statistical inference. It details different types of correlation, including linear, nonlinear, and specific coefficients such as Pearson's, Spearman's, Biserial, Point Biserial, Tetrachoric, and Phi coefficients, along with their assumptions and applications. Additionally, it discusses the importance of understanding relationships between variables and the methods for computing these correlations.

Uploaded by

Rana Talukder

Unit 1:
Sampling. Normal probability curve: Properties and applications. Standard
scores. (15 Hours)

Unit 2:
Introduction to correlation: Meaning of bivariate distribution; product moment,
rank difference, Biserial, point biserial, tetrachoric, phi coefficient, contingency
coefficient – Computation and use. (15 Hours)

Unit 3:
Statistical inference – concepts and steps involved in drawing a statistical
inference.
Concept of parametric and non-parametric statistics.
Experimental hypothesis – null hypothesis and its testing.
Concept of standard error. Computation and use of t-test and chi square test. (15
Hours)

UNIT 2 CORRELATION

Correlation can be used to study relationships between two or more variables.
It is a measure of association between two or more variables, and this
relationship is described not only in terms of its direction (negative or
positive) but also in terms of its magnitude (high or low).

Sir Francis Galton’s contribution to the development of correlation is
noteworthy. He carried out studies on individual differences and on
the influence of heredity. He studied the association between the height of
parents and that of their children with the help of a bivariate distribution (which
describes the relationship between two variables) and found that parents who are
tall tend to have children who are also tall (Veeraraghavan and Shetgovekar, 2016).
Further, in 1896, Karl Pearson put forth mathematical procedures for
correlation. Correlation can be categorised into linear and nonlinear
correlation.

Linear Correlation:

Linear correlation is denoted by a single straight line in a graph that denotes a
linear relationship between two variables. Such a graph indicates whether an
increase in one variable goes with an increase in the other variable (and a
decrease with a decrease), or whether an increase in one variable goes with a
decrease in the other. For example, as the scores on emotional intelligence
increase or decrease, the scores on self esteem also increase or decrease.

Nonlinear correlation:
As opposed to linear relationships, in nonlinear relationships the relationship
between two given variables is not denoted by a straight line. Thus, the
relationship is curvilinear, as denoted in the figure.

Direction and Magnitude of Correlation

●​ Positive correlation:
Positive correlation denotes that increase in one variable leads to increase in
another variable and decrease in one variable leads to decrease in another
variable. For example, if the scores on emotional intelligence obtained by
adolescents increase, then the scores obtained by them on achievement
motivation will also increase or if the scores on emotional intelligence obtained
by adolescents decrease then the scores obtained by them on achievement
motivation will also decrease.

Positive correlation indicates that both the variables are moving in the same
direction. A scatter diagram can denote a positive relationship between two
variables, A and B. A scatter diagram can be effectively used to present a
bivariate distribution that denotes the relationship between the two variables.

●​ Negative Correlation:
Negative correlation denotes that increase in one variable leads to decrease in
another variable or decrease in one variable leads to increase in another
variable.

For example, if the scores on occupational stress obtained by employees
increase, then the scores obtained by them on work motivation will decrease, or
if the scores on occupational stress obtained by employees decrease, then the
scores obtained by them on work motivation will increase.

●​ No Correlation or Zero Correlation:

It may so happen that there is no relationship between the two variables. In such
a case the correlation will be zero (this will become clearer as we discuss the
magnitude of correlation).

Thus, in this case the relationship is neither positive nor negative. There are
such variables where there might be no relationship, for example, there may be
no relationship between height of persons and years of their work experience or
there may exist no relationship between weight of persons and attitude towards
environment.
—---------------------------------------------------------------------------------------------

Pearson’s product moment correlation is one of the methods used to
compute coefficients of correlation. It is mainly used when the assumptions
of parametric statistics are met. The method is named after Karl Pearson, who
developed it. It is denoted by ‘r’.

Assumptions of Pearson’s Product Moment Correlation


The assumptions of Pearson’s product moment correlation are as follows:

1) The variables used to compute ‘r’ are continuous in nature and the scales of
measurement are interval or ratio.

2) The distribution of the variables in this method is unimodal and close to
symmetrical. The distribution need not be normal.

3) The pairs of scores involved are independent in nature and are in no way
connected with each other.

4) There is a linear relationship between the two variables. A scattergram
drawn with the help of scores on the two variables will thus denote a straight line.

5) ‘r’ is mainly used to ascertain the sign and size of the correlation, which can be
positive, negative or zero, and will range from -1 to +1.

Uses of Pearson’s Product Moment Correlation

1) It helps in determining the relationship between two variables quantitatively.
With quantification, it is possible for us to compare the strength of different
relationships.

2) Based on ‘r’, the regression equation can be computed. Thus, after computing
‘r’, it is possible to compute regression and determine whether one variable can
be predicted based on another variable.

3) ‘r’ can be used in computation of reliability and validity of psychological
tests.

4) It also assists in the computation of factor analysis.

COMPUTATION - CLASS NOTEBOOK.
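
As a hedged illustration of the computation (the function name and the scores below are made up for this sketch and are not taken from the class notebook), ‘r’ can be obtained from deviation scores as r = Σxy / √(Σx² × Σy²), where x and y are the deviations of each score from the mean of its own variable. A minimal Python sketch, assuming the paired scores are held in two equal-length lists:

```python
import math


def pearson_r(xs, ys):
    """Pearson's product moment correlation computed from deviation scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviation of every score from the mean of its own variable.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    if sum_x2 == 0 or sum_y2 == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Hypothetical scores for five adolescents (illustration only).
emotional_intelligence = [10, 12, 14, 16, 18]
self_esteem = [20, 25, 24, 30, 31]
r = pearson_r(emotional_intelligence, self_esteem)
```

The sign of the returned value gives the direction of the relationship and its absolute size gives the magnitude, in line with assumption 5 above.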


—---------------------------------------------------------------------------------------
Spearman’s rank order correlation

This method is used when the assumptions of parametric statistics are not met.
The method is named after Charles Spearman, who is known for his significant
work on factor analysis and the theory of intelligence besides Spearman’s rank
order correlation.

Assumptions for Spearman’s Rank Order Correlation



The assumptions of Spearman’s rank order correlation are as follows:

1) The variables are measured in terms of ordinal scale.

2) The relationship between the two variables is monotonic in nature (consistently
increasing or consistently decreasing); it need not be strictly linear.

3) The observations are independent in nature, thus denoting that the sample
needs to be randomly selected.

4) The pairs of scores are independent in nature and are in no way connected
with other pairs.

USES-

1) It is used when the data are measured with the help of an ordinal scale.

2) It is especially useful when the sample size is small, that is, less than 25-30
(Mohanty and Misra, 2016).

3) Many times it is not possible to measure traits directly. Thus, they are
measured in terms of ranks. Spearman’s rank order correlation involves
separately ranking the scores in the two sets of data, followed by computation of the
correlation between them.

4) It can be used to study the degree of relationship between two variables that
are monotonic. A relationship is termed monotonic when the variables
display a consistent relationship in one direction only.
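
The ranking procedure described in point 3 can be sketched as follows. This is only an illustration under stated assumptions: it uses the common rank-difference formula rho = 1 − 6ΣD² / (N(N² − 1)), where D is the difference between the two ranks of each case, and it assumes there are no tied scores. The function names and the marks are hypothetical.

```python
def ranks(scores):
    """Rank scores from 1 (highest) to N; tied scores are not handled here."""
    if len(set(scores)) != len(scores):
        raise ValueError("tied scores need averaged ranks, not covered in this sketch")
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman's rank order correlation from rank differences (no ties)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    rank_x, rank_y = ranks(xs), ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical marks given to five students by two judges (illustration only).
judge_a = [86, 72, 65, 91, 58]
judge_b = [80, 75, 60, 85, 70]
rho = spearman_rho(judge_a, judge_b)

# A monotonic but nonlinear relationship still gives a perfect rank correlation.
rho_monotonic = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```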

Biserial r coefficient

Biserial r coefficient is used when one variable is continuous (interval or ratio)
and the other variable is artificially dichotomous (artificially split from an
underlying continuous variable).

1. One variable is a continuous measurement variable (continuous metric
score), while the other is an apparently dichotomous or artificially dichotomous
variable.

2. The continuous measurement variable involved has a normal or near normal
distribution in the population without much skewness.

3. The continuous metric data underlying the dichotomous variable have a
unimodal and normal distribution in the population. In case of doubt about this
assumption, point biserial can be computed.

4. Linear relationship between the variables.

5. In each class of the dichotomized variable, every score of the continuous variable
occurs at random and independent of other scores.

6. Proportion in each dichotomized class should not be extreme.
The proportion of cases in each category (e.g., 0 and 1) of the dichotomized
variable should be close to 0.5. If proportions are highly skewed (e.g., 90% vs
10%), results may be distorted.
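
A minimal sketch of how biserial r might be computed, assuming the commonly used textbook formula r_bis = ((Mp − Mq) / σt) × (pq / y). Here Mp and Mq are the means of the continuous variable in the two classes, σt is the standard deviation of all scores, p and q are the proportions in the two classes, and y is the height of the standard normal curve at the point dividing p and q. The formula, function name and scores are assumptions for illustration and do not come from these notes.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(scores, groups):
    """Biserial r; groups holds 1 (upper class) or 0 (lower class) per score."""
    upper = [s for s, g in zip(scores, groups) if g == 1]
    lower = [s for s, g in zip(scores, groups) if g == 0]
    if not upper or not lower:
        raise ValueError("both classes of the dichotomy must contain cases")
    p = len(upper) / (len(upper) + len(lower))
    q = 1 - p
    sd_total = pstdev(upper + lower)  # SD of all scores taken together
    if sd_total == 0:
        raise ValueError("the continuous variable has no variance")
    std_normal = NormalDist()
    # Ordinate of the standard normal curve at the cut separating p and q.
    y = std_normal.pdf(std_normal.inv_cdf(p))
    return (fmean(upper) - fmean(lower)) / sd_total * (p * q / y)


# Hypothetical test scores; 1 = "pass", 0 = "fail" on an artificially split criterion.
scores = [12, 15, 17, 20, 9, 11, 14, 16]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
r_bis = biserial_r(scores, groups)
```

Because the formula leans on the normality assumption in point 3, the result can drift outside the -1 to +1 range when that assumption is badly violated.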

Point Biserial r coefficient

1. One variable is a continuous measurement variable (continuous metric
score), while the other is a naturally dichotomous nominal variable, which
cannot yield a continuous or normal distribution even on further exploration.

2. The continuous measurement variable involved has a normal or near normal
distribution in the population without much skewness.

3. In each class of the dichotomous variable, every score of the continuous variable
occurs at random and independent of other scores.

4. Linear relationship between the variables.
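
A minimal sketch of the point biserial computation, assuming the usual formula r_pb = ((Mp − Mq) / σt) × √(pq), with Mp and Mq the means of the continuous variable in the two classes, σt the standard deviation of all scores, and p and q the proportions in the two classes. The function name and data are hypothetical.

```python
import math
from statistics import fmean, pstdev


def point_biserial_r(scores, groups):
    """Point biserial r; groups holds 1 or 0 for the two natural classes."""
    class_1 = [s for s, g in zip(scores, groups) if g == 1]
    class_0 = [s for s, g in zip(scores, groups) if g == 0]
    if not class_1 or not class_0:
        raise ValueError("both classes of the dichotomy must contain cases")
    p = len(class_1) / (len(class_1) + len(class_0))
    q = 1 - p
    sd_total = pstdev(class_1 + class_0)  # SD of all scores taken together
    if sd_total == 0:
        raise ValueError("the continuous variable has no variance")
    return (fmean(class_1) - fmean(class_0)) / sd_total * math.sqrt(p * q)


# Hypothetical scores for two naturally distinct groups coded 1 and 0.
scores = [12, 15, 17, 20, 9, 11, 14, 16]
groups = [1, 1, 1, 1, 0, 0, 0, 0]
r_pb = point_biserial_r(scores, groups)
```

With this formula the result is the same value that Pearson's ‘r’ would give if the two classes were simply scored 1 and 0.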



—------------------------------------------------------------------------------------

Tetrachoric r-

Tetrachoric correlation is a correlation between two dichotomous variables that
have underlying continuous distributions. If the two variables are measured in a
more refined way, then a continuous distribution will result.
For example, attitude to females and attitude towards liberalisation are two
variables to be correlated. Now, we simply measure them as having positive or
negative attitudes.

So we have 0 (negative attitude) and 1 (positive attitude) scores available on
both the variables. Then the correlation between these two variables can be
computed using Tetrachoric correlation (r tet).

Assumption-

1. The variables either have continuous metric data which have been artificially
dichotomized, or are only apparently dichotomous and may yield continuous
scores on further exploration.

2. The continuous metric data, underlying the dichotomous variable, have a
unimodal and normal distribution in the population.

3. Linear relationship between the variables.

4. Proportion in each dichotomized class should not be extreme.
The proportion of cases in each category (e.g., 0 and 1) of the dichotomized
variable should be close to 0.5. If proportions are highly skewed (e.g., 90% vs
10%), results may be distorted.
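
The exact tetrachoric coefficient needs an iterative solution, so the sketch below uses the well-known cosine approximation instead: r tet ≈ cos(π / (1 + √(ad / bc))). Here a and d are the frequencies of the two "agreeing" cells of the 2 × 2 table (both positive, both negative) and b and c are the "disagreeing" cells. The approximation, the function name and the frequencies are assumptions for illustration; the approximation works best when the splits are near 0.5, as assumption 4 requires.

```python
import math


def tetrachoric_approx(a, b, c, d):
    """Cosine approximation to the tetrachoric correlation of a 2 x 2 table.

    a = both positive, d = both negative, b and c = the two mixed cells.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("cell frequencies cannot be negative")
    if b * c == 0:
        if a * d == 0:
            raise ValueError("coefficient is undefined for this table")
        return 1.0  # no disagreeing cases at all
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))


# Hypothetical frequencies: attitude to females x attitude towards liberalisation.
r_tet = tetrachoric_approx(a=40, b=10, c=10, d=40)

# Equal frequencies in every cell indicate no relationship.
r_tet_zero = tetrachoric_approx(25, 25, 25, 25)
```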

—---------------------------------------------------------------------

Phi coefficient -

A non-parametric statistic of correlation between two genuinely dichotomous
nominal variables, each having two classes with a genuine intervening gap.

Assumption-

1. Both variables are genuinely dichotomous with no reasonable expectation
to yield any continuous series of data on more extensive exploration.

2. Each of the variables has a binomial distribution in the population with a
genuine gap between its two classes.
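
A minimal sketch of the phi coefficient for a 2 × 2 frequency table, assuming the standard formula phi = (ad − bc) / √((a + b)(c + d)(a + c)(b + d)), where a and b are the cells of the first row and c and d are the cells of the second row. The function name and frequencies are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts for two genuinely dichotomous variables.
phi = phi_coefficient(a=30, b=20, c=10, d=40)
```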

—-----------------------------------------------------------------------------------

SUMMARY -

Pearson r (product moment correlation)   Both variables at continuous level (interval and ratio).
Spearman rank or rho                     Both variables at ordinal level (rank).
Biserial r                               One variable is continuous and another is artificially discrete.
Point biserial r                         One variable is continuous and another is naturally discrete.
Tetrachoric                              Two artificially discrete variables.
Phi coefficient                          Two naturally discrete variables.
Contingency                              Both variables are categorical but not necessarily dichotomous: two nominal variables that represent categories without a natural order.
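
The contingency coefficient is listed in the summary but not worked out in these notes. A minimal sketch, assuming the usual formula C = √(χ² / (χ² + N)), where χ² is the chi square computed from the frequency table and N is the total number of cases. The function names and the table are hypothetical.

```python
import math


def chi_square(table):
    """Chi square and total N for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one case")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under no association between the variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(table):
    """Contingency coefficient C from a frequency table of any size."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical 2 x 3 table: two groups by three unordered categories.
table = [[20, 15, 5],
         [10, 15, 35]]
c = contingency_coefficient(table)
```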

