B: Lind
Chapter Thirteen
Correlation and Linear Regression
Correlation Analysis is a group of statistical techniques to
measure the association between two variables.
Correlation Analysis
A Scatter Diagram is a chart that portrays the relationship between
two variables.
[Figure: Advertising Minutes and $ Sales. Scatter diagram of Sales ($ thousands) against Advertising Minutes.]
The Dependent Variable is the variable being predicted or
estimated.
The Independent Variable provides the basis for estimation. It
is the predictor variable.
The Coefficient of Correlation (r) is a measure of the
strength of the relationship between two variables.
Also called Pearson’s r and Pearson’s product moment
correlation coefficient.
It requires interval or ratio-scaled data.
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect and strong
correlation.
Values close to 0.0 indicate weak correlation.
Negative values indicate an inverse relationship and
positive values indicate a direct relationship.
[Figure: Pearson's r shown on a scale from -1 through 0 to 1.]
The Coefficient of Correlation, r
[Figure: Perfect Negative Correlation. Scatter plot of Y against X, both axes 0 to 10.]
[Figure: Perfect Positive Correlation. Scatter plot of Y against X, both axes 0 to 10.]
[Figure: Zero Correlation. Scatter plot of Y against X, both axes 0 to 10.]
[Figure: Strong Positive Correlation. Scatter plot of Y against X, both axes 0 to 10.]
We calculate the coefficient of correlation from the
following formula.
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n-1)\, s_x s_y} $$
Formula for r
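The formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `pearson_r` and the five-point data sets are assumptions chosen so that the result should come out as a perfect positive and a perfect negative correlation.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Sum of (X - Xbar)(Y - Ybar), divided by (n - 1) * s_x * s_y."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    return cross / ((n - 1) * stdev(x) * stdev(y))

# Illustrative data (hypothetical, not from the text)
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # Y rises exactly with X
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # Y falls exactly with X
print(round(r_pos, 6), round(r_neg, 6))
```

Note that `stdev` uses the sample (n - 1) form, which is what the (n - 1) term in the formula assumes.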
Dan Ireland, the student body
president at Toledo State
University, is concerned about
the cost to students of
textbooks. He believes there is
a relationship between the
number of pages in the text and
the selling price of the book.
To provide insight into the
problem he selects a sample of
eight textbooks currently on
sale in the bookstore. Draw a
scatter diagram. Compute the
correlation coefficient.
Example 1
Book                          Pages   Price ($)
Introduction to History         500      84
Basic Algebra                   700      75
Introduction to Psychology      800      99
Introduction to Sociology       600      72
Business Management             400      69
Introduction to Biology         600      81
Fundamentals of Jazz            600      63
Principles of Nursing           800      93
Example 1
[Figure: Scatter Diagram of Pages and Selling Price of Text. Price ($), 0 to 120, against Pages, 0 to 1000.]
Example 1
Computing $\sum (X - \bar{X})(Y - \bar{Y})$, with Mean(Page) = 625 and Mean(Price) = 79.5:

(a)     (b)      (c)                  (d)                    (c)*(d)
Page    Price    Page - Mean(Page)    Price - Mean(Price)    Product
500     84       -125                   4.5                    -562.5
700     75         75                  -4.5                    -337.5
800     99        175                  19.5                   3,412.5
600     72        -25                  -7.5                     187.5
400     69       -225                 -10.5                   2,362.5
600     81        -25                   1.5                     -37.5
600     63        -25                 -16.5                     412.5
800     93        175                  13.5                   2,362.5
Total                                                         7,800.0
Example 1
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(n-1)\, s_x s_y} = \frac{7800}{7(138.87)(12.21)} = .657 $$
The correlation between the number of pages and the
selling price of the book is 0.657. This indicates a
moderate association between the variables.
Example 1
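The Example 1 arithmetic can be checked step by step. The sketch below uses the eight textbooks from the table; the variable names are illustrative and the standard deviations come from Python's `statistics.stdev` rather than from the slides.

```python
from statistics import mean, stdev

pages = [500, 700, 800, 600, 400, 600, 600, 800]
price = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
mean_pages, mean_price = mean(pages), mean(price)

# Column (c)*(d): product of the two deviations for each book
cross_products = [(x - mean_pages) * (y - mean_price)
                  for x, y in zip(pages, price)]
total = sum(cross_products)

# Sample standard deviations of pages and price
s_x, s_y = stdev(pages), stdev(price)

r = total / ((n - 1) * s_x * s_y)
print(total, round(s_x, 2), round(s_y, 2), round(r, 3))
```

Run as written, this should reproduce the figures used in the example: a cross-product total of 7,800, standard deviations of 138.87 and 12.21, and r of .657.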
In Regression Analysis we use the independent
variable (X) to estimate the dependent variable (Y).
The relationship between the variables is linear.
Both variables must be at least interval scale.
The least squares criterion is used to determine the
equation. That is, the term $\sum (Y - Y')^2$ is minimized.
Regression Analysis
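To make the least squares criterion concrete, the sketch below evaluates the sum of squared differences between Y and Y′ for three candidate lines. The four data points and the helper name `sum_squared_errors` are hypothetical, chosen so that the least squares line works out to Y′ = 0 + 1.9X; the other two lines are arbitrary alternatives for comparison.

```python
def sum_squared_errors(a, b, x, y):
    """Sum of (Y - Y')^2 for the candidate line Y' = a + b*X."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Illustrative data (hypothetical, not from the text)
x = [1, 2, 3, 4]
y = [2, 4, 5, 8]

least_squares = sum_squared_errors(0.0, 1.9, x, y)  # the least squares line
steeper = sum_squared_errors(0.0, 2.0, x, y)        # slightly steeper slope
shifted = sum_squared_errors(1.0, 1.5, x, y)        # different intercept and slope
print(least_squares, steeper, shifted)
```

Any other choice of a and b for this data gives a larger sum than the least squares line, which is the sense in which that line is the "best" fit.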
The regression equation is Y′= a + bX
where
Y′ is the average predicted value of Y for any X.
a is the Y-intercept.
It is the estimated Y value when X=0
b is the slope of the line, or the average change
in Y′ for each change of one unit in X
The least squares principle is used to obtain a
and b.
Regression Analysis
The least squares principle is used to obtain a and
b. The equations to determine a and b are:
$$ b = r \frac{s_y}{s_x} \qquad a = \bar{Y} - b\bar{X} $$
Regression Analysis
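Develop a regression equation for the information given in
Example 1 that can be used to estimate the selling price based on
the number of pages.
$$ b = r \frac{s_y}{s_x} = (.657) \frac{12.21}{138.87} = .0578 $$
$$ a = \bar{Y} - b\bar{X} = 79.5 - .0578 \times 625 = 43.39 $$
The intercept of 43.39 reflects the unrounded slope; with b rounded
to .0578 the arithmetic gives 43.375.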
Example 1 revisited
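The slope and intercept can be recomputed from the raw Example 1 data with b = r(s_y / s_x) and a equal to the mean of Y minus b times the mean of X. This is a sketch with illustrative variable names; it also shows how rounding b before computing a shifts the intercept slightly.

```python
from statistics import mean, stdev

pages = [500, 700, 800, 600, 400, 600, 600, 800]
price = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
x_bar, y_bar = mean(pages), mean(price)
s_x, s_y = stdev(pages), stdev(price)

# Coefficient of correlation from the formula for r
r = sum((x - x_bar) * (y - y_bar)
        for x, y in zip(pages, price)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x          # slope, kept at full precision
a = y_bar - b * x_bar      # intercept from the unrounded slope

# Intercept obtained if the slope is rounded to four decimals first
a_from_rounded_b = y_bar - round(b, 4) * x_bar
print(round(b, 4), round(a, 2), a_from_rounded_b)
```

With full precision the slope rounds to .0578 and the intercept to 43.39; rounding the slope first moves the intercept to roughly 43.375.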
The regression equation is:
Y′ = 43.39 + .0578X
The sign of the b value and the sign of r will always be
the same.
The slope of the line is .0578. Each additional page costs
about a nickel.
The equation crosses the Y-axis at $43.39. A book with no
pages would cost $43.39!
Example 1 revisited
We can use the
regression equation
to estimate values
of Y.
The estimated selling price of an
800 page book is $89.61, found by
Price = $43.39 + .0578(Number of Pages)
= $43.39 + .0578(800)
= $89.61
The $89.61 figure uses unrounded coefficients; the rounded values
shown give $89.63.
Example 1 revisited
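A short sketch of the estimate for an 800 page book, comparing the rounded coefficients with unrounded ones. The function name `predict_price` is illustrative. The unrounded slope is computed as the cross-product total (7,800) divided by the sum of squared page deviations, which is algebraically the same as r(s_y / s_x); the page deviations are taken from column (c) of the Example 1 table.

```python
def predict_price(pages, a, b):
    """Estimated price Y' = a + b*X for a book with the given page count."""
    return a + b * pages

# Rounded coefficients as printed in the regression equation
rounded = predict_price(800, a=43.39, b=0.0578)

# Unrounded coefficients rebuilt from the Example 1 table
page_dev = [-125, 75, 175, -25, -225, -25, -25, 175]   # Page - Mean(Page)
slope = 7800 / sum(d * d for d in page_dev)            # equals r * s_y / s_x
intercept = 79.5 - slope * 625                         # Ybar - b * Xbar
unrounded = predict_price(800, a=intercept, b=slope)

print(round(rounded, 2), round(unrounded, 2))
```

The two estimates differ by about two cents, which is purely a rounding effect and not a change in the model.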