Linear Regression
Aims
Understand linear regression with one predictor
Understand how we assess the fit of a regression
model
Total Sum of Squares
Model Sum of Squares
Residual Sum of Squares
F
$R^2$
Know how to do Regression on PASW/SPSS
Interpret a regression model
What is Regression?
A way of predicting the value of one variable
from another.
It is a hypothetical model of the relationship
between two variables.
The model used is a linear one.
Therefore, we describe the relationship using the
equation of a straight line.
Describing a Straight Line
$Y_i = b_0 + b_1 X_i + \varepsilon_i$
$b_1$
Regression coefficient for the predictor
Gradient (slope) of the regression line
Direction/Strength of Relationship
$b_0$
Intercept (value of Y when X = 0)
Point at which the regression line crosses the Y-axis (ordinate)
Intercepts and Gradients
The Method of Least Squares
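As a rough illustration of the method of least squares with one predictor, the sketch below computes the intercept and gradient that minimise the sum of squared residuals. The function name `fit_line` and the toy data are assumptions made for this example; they are not taken from the lecture.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Gradient: sum of cross-products of deviations divided by
    # the sum of squared deviations of the predictor.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the least-squares line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(b0, b1)
```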
How Good is the Model?
The regression line is only a model
based on the data.
This model might not reflect reality.
We need some way of testing how well
the model fits the observed data.
How?
Sums of Squares
Summary
SST
Total variability (variability between scores and the mean).
$SS_T = \sum (y_i - \bar{y})^2$
SSR
Residual/Error variability (variability between the
regression model and the actual data).
$SS_R = \sum (y_i - \hat{y}_i)^2$
SSM
Model variability (difference in variability between the
model and the mean).
$SS_M = \sum (\hat{y}_i - \bar{y})^2$
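The three sums of squares can be sketched in a few lines. The data here are made up for illustration, and the coefficients `b0` and `b1` are assumed to be the least-squares estimates for that toy data, not values from the lecture.

```python
# Made-up toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6  # least-squares estimates for this toy data

mean_y = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # values predicted by the model

# Total: observed scores versus the mean.
ss_t = sum((yi - mean_y) ** 2 for yi in y)
# Residual: observed scores versus the model's predictions.
ss_r = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# Model: the model's predictions versus the mean.
ss_m = sum((yh - mean_y) ** 2 for yh in y_hat)

print(ss_t, ss_r, ss_m)
```

For a least-squares line with an intercept, the total splits into the model and residual parts, so `ss_t` should equal `ss_m + ss_r` up to rounding.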
Testing the Model: ANOVA
SST: Total Variance In The Data
SSM: Improvement Due to the Model
SSR: Error in Model
If the model results in better prediction
than using the mean, then we expect SSM to
be much greater than SSR
Testing the Model: ANOVA
Mean Squared Error
Sums of Squares are total values.
They can be expressed as averages by dividing each
by its degrees of freedom.
These are called Mean Squares, MS.
$F = \dfrac{MS_M}{MS_R}$
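A minimal sketch of the F-ratio calculation follows. The slides do not state the degrees of freedom, so the standard values are assumed here: the number of predictors `k` for the model, and `n - k - 1` for the residual. The function name `f_ratio` and the input numbers are hypothetical.

```python
def f_ratio(ss_m, ss_r, n, k=1):
    """F = MS_M / MS_R for a regression with k predictors and n cases."""
    df_m = k          # assumed model degrees of freedom
    df_r = n - k - 1  # assumed residual degrees of freedom
    ms_m = ss_m / df_m  # mean square for the model
    ms_r = ss_r / df_r  # mean square for the residual (error)
    return ms_m / ms_r


# Illustrative values only: SS_M = 3.6 and SS_R = 2.4 from 5 cases.
f = f_ratio(3.6, 2.4, n=5)
print(f)
```

A large F suggests the model improves prediction by much more than the error that remains in it.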
Testing the Model: $R^2$
$R^2$
The proportion of variance accounted for by the
regression model.
The Pearson Correlation Coefficient Squared
$R^2 = \dfrac{SS_M}{SS_T}$
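With one predictor, the ratio of the model sum of squares to the total sum of squares should match the squared Pearson correlation. The sketch below checks this on made-up toy data; the variable names are assumptions made for this example.

```python
import math

# Made-up toy data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # this is also SS_T

# Pearson correlation coefficient between x and y.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line, then the model sum of squares.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
ss_m = sum((b0 + b1 * xi - mean_y) ** 2 for xi in x)

r_squared = ss_m / syy
print(r_squared, r ** 2)
```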
Regression: An Example
A record company boss was interested in
predicting record sales from advertising.
Data
200 different album releases
Outcome variable:
Sales (CDs and Downloads) in the week after release
Predictor variable:
The amount (in £s) spent promoting the record before
release.
Step One: Graph the Data
Regression Using PASW/SPSS
Output: Model Summary
Output: ANOVA
(SPSS ANOVA table: the Regression row gives SSM and MSM, the Residual row
gives SSR and MSR, and the Total row gives SST.)
SPSS Output: Model Parameters
Using The Model
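$$
\begin{aligned}
\text{Record Sales}_i &= b_0 + b_1\,\text{Advertising Budget}_i \\
&= 134.14 + (0.09612 \times \text{Advertising Budget}_i)
\end{aligned}
$$
$$
\begin{aligned}
\text{Record Sales}_i &= 134.14 + (0.09612 \times \text{Advertising Budget}_i) \\
&= 134.14 + (0.09612 \times 100) \\
&= 143.75
\end{aligned}
$$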
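The same prediction can be sketched in code. The coefficients are the ones reported in the example above; the function name `predict_sales` is an assumption made for this illustration.

```python
B0 = 134.14   # intercept from the model parameters above
B1 = 0.09612  # gradient for advertising budget


def predict_sales(advertising_budget):
    """Predicted record sales for a given advertising budget."""
    return B0 + B1 * advertising_budget


predicted = predict_sales(100)
print(round(predicted, 2))  # 143.75
```

Any other budget can be passed in the same way, though predictions far outside the range of the observed data should be treated with caution.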