Business Research Methods: Bivariate Analysis: Measures of Associations
Research Methods
William G. Zikmund
Chapter 23
Bivariate Analysis: Measures of Associations
Measures of Association
• A general term that refers to a number of bivariate statistical techniques used to measure the strength of a relationship between two variables.
Relationships Among Variables
• Correlation analysis
• Bivariate regression analysis
Type of Measurement    Measure of Association
Nominal                Chi-Square
                       Phi Coefficient
                       Contingency Coefficient
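For nominal-level data like this, the chi-square statistic and the measures derived from it can be computed directly. A minimal sketch follows, assuming a hypothetical 2 x 2 cross-tabulation; the counts and the use of scipy are illustrative, not from the text:

```python
# A minimal sketch (not from the text) of the nominal-level measures listed above,
# computed for a hypothetical 2x2 cross-tabulation.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[60, 40],    # hypothetical counts, e.g., men: aware / not aware
                     [30, 70]])   # hypothetical counts, e.g., women: aware / not aware

chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)
n = observed.sum()

phi = np.sqrt(chi2 / n)                      # phi coefficient (for 2x2 tables)
contingency_c = np.sqrt(chi2 / (chi2 + n))   # contingency coefficient C

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
print(f"phi = {phi:.3f}, contingency coefficient = {contingency_c:.3f}")
```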
Correlation Coefficient
• A statistical measure of the covariation, or association, between two variables.
• Are dollar sales associated with advertising dollar expenditures?
• The correlation coefficient for two variables, X and Y, is $r_{xy}$.
Correlation Coefficient
• r
• r ranges from +1 to -1
• r = +1 a perfect positive linear relationship
• r = -1 a perfect negative linear relationship
• r = 0 indicates no correlation
Simple Correlation Coefficient

$$r_{xy} = r_{yx} = \frac{\sum\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum\left(X_i - \bar{X}\right)^2 \sum\left(Y_i - \bar{Y}\right)^2}}$$
Simple Correlation Coefficient

$$r_{xy} = r_{yx} = \frac{\sigma_{xy}}{\sqrt{\sigma_x^2 \, \sigma_y^2}}$$
Simple Correlation Coefficient
Alternative Method

$\sigma_x^2$ = variance of X
$\sigma_y^2$ = variance of Y
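The two forms are equivalent. A minimal sketch with hypothetical advertising and sales figures (not from the text), showing that the deviation form and the covariance/variance form give the same r:

```python
# A minimal sketch (data are hypothetical) showing that the deviation form and the
# covariance/variance form of the simple correlation coefficient agree.
import numpy as np

x = np.array([10, 12, 15, 17, 20, 23, 27], dtype=float)  # e.g., advertising dollars
y = np.array([40, 46, 50, 57, 61, 68, 75], dtype=float)  # e.g., dollar sales

dx, dy = x - x.mean(), y - y.mean()

# Deviation form: sum of cross-products over the root of the product of sums of squares
r_dev = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Covariance form: covariance of X and Y over the root of the product of the variances
r_cov = np.cov(x, y)[0, 1] / np.sqrt(x.var(ddof=1) * y.var(ddof=1))

print(round(r_dev, 4), round(r_cov, 4), round(np.corrcoef(x, y)[0, 1], 4))  # all equal
```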
Correlation Patterns

[Scatter diagrams of Y versus X illustrating different correlation patterns; one panel shows PERFECT NEGATIVE CORRELATION, r = -1.0]
Calculation of r

$$r = \frac{6.3389}{\sqrt{(17.837)(5.589)}} = \frac{6.3389}{\sqrt{99.712}} = .635$$

(p. 629)
Coefficient of Determination

$$r^2 = \frac{\text{Explained variance}}{\text{Total variance}}$$
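For the calculation above, where r = .635, squaring the coefficient gives the share of variance explained:

$$r^2 = (.635)^2 \approx .40$$

so roughly 40 percent of the variation in one variable is accounted for by the other.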
Correlation Does Not Mean Causation
• A high correlation by itself does not establish cause and effect
• Rooster’s crow and the rising of the sun
  – The rooster does not cause the sun to rise.
• Teachers’ salaries and the consumption of liquor
  – They covary because they are both influenced by a third variable.
Correlation Matrix
Regression
Dictionary definition: going or moving backward
Regression Line and Slope

[Scatter diagram of Y versus X with the fitted regression line]

$$\hat{Y} = \hat{a} + \hat{\beta}X$$
Least-Squares Regression Line

[Scatter diagram of Y versus X showing the fitted regression line and the actual Y values for Dealer 7 and Dealer 3]
Scatter Diagram of Explained and Unexplained Variation

[For a single observation, the total deviation of Y from its mean is split into the deviation explained by the regression and the deviation not explained (residual)]
The Least-Squares Method
• Uses the criterion of attempting to make the least amount of total error in prediction of Y from X. More technically, the procedure used in the least-squares method generates a straight line that minimizes the sum of the squared deviations of the actual values from this predicted regression line.
The Least-Squares Method
• A relatively simple mathematical technique that ensures that the straight line will most closely represent the relationship between X and Y (see the sketch below).
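As a minimal illustration of this criterion (the data points are hypothetical, not the text's dealer data), the least-squares line produced by numpy's polyfit has a smaller sum of squared residuals than any alternative line, such as one with a perturbed slope and intercept:

```python
# A minimal sketch (data are hypothetical) of the least-squares criterion: the fitted
# line's sum of squared residuals is smaller than that of a slightly different line.
import numpy as np

x = np.array([89, 95, 119, 125, 165], dtype=float)
y = np.array([80, 82, 97, 100, 129], dtype=float)

slope, intercept = np.polyfit(x, y, deg=1)   # least-squares fit of Y on X

def sse(a, b):
    """Sum of squared residuals for the line Y-hat = a + b*X."""
    residuals = y - (a + b * x)
    return (residuals ** 2).sum()

print("SSE of the least-squares line:", round(sse(intercept, slope), 2))
print("SSE of a perturbed line:      ", round(sse(intercept + 2, slope * 0.9), 2))  # larger
```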
Regression - Least-Squares Method

$$\sum_{i=1}^{n} e_i^2 \quad \text{is a minimum}$$
ei = Yi - Ŷi (The “residual”)
Yi = actual value of the dependent variable
Ŷi = estimated value of the dependent variable (Y hat)
n = number of observations
i = number of the observation
The Logic behind the Least-Squares Technique
• No straight line can completely represent every dot in the scatter diagram
• There will be a discrepancy between most of the actual scores (each dot) and the predicted score
• Uses the criterion of attempting to make the least amount of total error in prediction of Y from X
Bivariate Regression

$$\hat{a} = \bar{Y} - \hat{\beta}\bar{X}$$

$$\hat{\beta} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$

$\hat{\beta}$ = estimated slope of the line (the “regression coefficient”)
$\hat{a}$ = estimated intercept of the Y axis
$Y$ = dependent variable
$\bar{Y}$ = mean of the dependent variable
$X$ = independent variable
$\bar{X}$ = mean of the independent variable
$n$ = number of observations
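A minimal sketch, using hypothetical X and Y values, that implements the slope and intercept formulas above and cross-checks them against numpy's least-squares fit:

```python
# A minimal sketch (data are hypothetical) implementing the formulas above:
# beta-hat from the raw sums, a-hat from the means.
import numpy as np

x = np.array([89, 95, 110, 119, 125, 140, 165], dtype=float)  # independent variable
y = np.array([80, 83, 91, 97, 100, 110, 129], dtype=float)    # dependent variable
n = len(x)

beta_hat = (n * (x * y).sum() - x.sum() * y.sum()) / (n * (x ** 2).sum() - x.sum() ** 2)
a_hat = y.mean() - beta_hat * x.mean()

print(f"estimated slope     (beta-hat) = {beta_hat:.4f}")
print(f"estimated intercept (a-hat)    = {a_hat:.4f}")

# Cross-check against numpy's least-squares polynomial fit
slope_check, intercept_check = np.polyfit(x, y, deg=1)
assert np.isclose(beta_hat, slope_check) and np.isclose(a_hat, intercept_check)
```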
$$\hat{\beta} = \frac{15(193{,}345) - 2{,}806{,}875}{15(245{,}759) - 3{,}515{,}625} = \frac{2{,}900{,}175 - 2{,}806{,}875}{3{,}686{,}385 - 3{,}515{,}625} = \frac{93{,}300}{170{,}760} = .54638$$

$$\hat{a} = 99.8 - .54638(125) = 99.8 - 68.3 = 31.5$$

The estimated regression equation, and its prediction for X = 89:

$$\hat{Y} = 31.5 + .546X = 31.5 + .546(89) = 31.5 + 48.6 = 80.1$$
Dealer 7 (actual Y value = 129):
$$\hat{Y}_7 = 31.5 + .546(165) = 121.6$$

Dealer 3 (actual Y value = 80):
$$\hat{Y}_3 = 31.5 + .546(95) = 83.4$$

Dealer 9 (actual Y value = 97):
$$\hat{Y}_9 = 31.5 + .546(119) = 96.5$$
$$e_9 = Y_9 - \hat{Y}_9 = 97 - 96.5 = .5$$
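These predictions can be verified directly from the fitted equation $\hat{Y} = 31.5 + .546X$; the short check below uses only the dealer X values and actual Y values quoted above:

```python
# Quick arithmetic check of the worked predictions above, using the fitted equation
# Y-hat = 31.5 + .546X and the dealer values quoted in the example.
dealers = {           # dealer: (X value, actual Y value), as given above
    7: (165, 129),
    3: (95, 80),
    9: (119, 97),
}

for dealer, (x_val, y_actual) in dealers.items():
    y_hat = 31.5 + 0.546 * x_val
    residual = y_actual - y_hat
    print(f"Dealer {dealer}: Y-hat = {y_hat:.1f}, residual e = {residual:.1f}")
```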
F-Test (Regression)

$$\left(Y_i - \bar{Y}\right) = \left(\hat{Y}_i - \bar{Y}\right) + \left(Y_i - \hat{Y}_i\right)$$

Total deviation = deviation explained by the regression + deviation unexplained by the regression (residual error)

$\bar{Y}$ = mean of the total group
$\hat{Y}_i$ = value predicted with the regression equation
$Y_i$ = actual value
$$\sum\left(Y_i - \bar{Y}\right)^2 = \sum\left(\hat{Y}_i - \bar{Y}\right)^2 + \sum\left(Y_i - \hat{Y}_i\right)^2$$

Total variation = explained variation + unexplained variation (residual)
Sum of Squares

$$r^2 = \frac{SS_r}{SS_t} = 1 - \frac{SS_e}{SS_t}$$
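A minimal sketch, with hypothetical data, verifying the decomposition above: the total sum of squares equals the explained plus the unexplained sums of squares, and both expressions give the same $r^2$:

```python
# A minimal sketch (data are hypothetical) of the variation decomposition:
# SSt = SSr + SSe, and r-squared = SSr/SSt = 1 - SSe/SSt.
import numpy as np

x = np.array([89, 95, 110, 119, 125, 140, 165], dtype=float)
y = np.array([80, 83, 91, 97, 100, 110, 129], dtype=float)

slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

ss_total = ((y - y.mean()) ** 2).sum()            # SSt: total variation
ss_regression = ((y_hat - y.mean()) ** 2).sum()   # SSr: explained variation
ss_error = ((y - y_hat) ** 2).sum()               # SSe: unexplained (residual) variation

print(round(ss_total, 2), round(ss_regression + ss_error, 2))                  # equal
print(round(ss_regression / ss_total, 4), round(1 - ss_error / ss_total, 4))   # same r-squared
```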
Source of Variation
• Explained by regression
  – Degrees of freedom: k - 1, where k = number of estimated constants (variables)
  – Sum of squares: SSr
  – Mean square: SSr/(k - 1)
• Unexplained by regression (residual)
  – Degrees of freedom: n - k, where n = number of observations
  – Sum of squares: SSe
  – Mean square: SSe/(n - k)
These two mean squares form the F ratio for the regression test (see the sketch below).
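A minimal sketch of that F ratio, using SSr = 3,398.49 and SSt = 3,882.4 from the example on the next slide (so SSe = SSt - SSr = 483.91) together with n = 15 from the regression calculation and k = 2 estimated constants; the p-value call assumes scipy is available:

```python
# A minimal sketch of the F ratio built from the mean squares above:
# F = [SSr/(k-1)] / [SSe/(n-k)].
from scipy.stats import f as f_dist

ss_regression = 3398.49   # SSr, from the r-squared example
ss_error = 483.91         # SSe = SSt - SSr = 3882.4 - 3398.49
n, k = 15, 2              # n observations; k estimated constants (intercept and slope)

ms_regression = ss_regression / (k - 1)
ms_error = ss_error / (n - k)
f_ratio = ms_regression / ms_error
p_value = f_dist.sf(f_ratio, k - 1, n - k)

print(f"F({k - 1}, {n - k}) = {f_ratio:.2f}, p = {p_value:.4g}")
```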
r² in the Example

$$r^2 = \frac{3{,}398.49}{3{,}882.4} = .875$$
Multiple Regression
• An extension of bivariate regression
• Multidimensional when three or more variables are involved
• Simultaneously investigates the effect of two or more independent variables on a single dependent variable
• Discussed in Chapter 24 (a brief preview sketch follows)
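As a brief, purely illustrative preview (hypothetical data; Chapter 24 covers the method properly), a regression with two independent variables can be fit with numpy's least-squares solver:

```python
# A minimal preview sketch (data are hypothetical); Chapter 24 treats multiple
# regression in detail. Two independent variables predict one dependent variable.
import numpy as np

x1 = np.array([10, 12, 15, 17, 20, 23, 27], dtype=float)   # e.g., advertising
x2 = np.array([5, 6, 6, 8, 9, 11, 12], dtype=float)        # e.g., sales force size
y = np.array([40, 47, 52, 60, 64, 72, 80], dtype=float)    # e.g., sales

design = np.column_stack([np.ones_like(x1), x1, x2])        # intercept, X1, X2
coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)   # least-squares solution

intercept, b1, b2 = coefficients
print(f"Y-hat = {intercept:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")
```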
Correlation: Player Salary and Ticket Price

[Line chart, 1995-2001, plotting the change in ticket price against the change in player salary; correlation coefficient r = .75]