The Population Model

Multiple regression model with k independent variables:

    Y = β0 + β1X1 + β2X2 + … + βkXk + ε,    ε ~ N(0, σ²)

β0 is the population intercept, β1, …, βk are the population (partial) slopes, and ε is the error term.
The Sample Model: Least Squares Method

The coefficients of the population model are estimated using sample data. The estimates b0, …, bk are the values that minimize the sum of squared errors (SSE):

    SSE = Σ (Yi − Ŷi)² = Σ (Yi − (b0 + b1X1i + … + bkXki))²

Sample regression equation (b0 = sample intercept; b1, …, bk = sample partial slopes; Ŷ = predicted Y):

    Ŷ = b0 + b1X1 + b2X2 + … + bkXk

Normal equations:

    Σ ei = 0
    Σ X1i ei = 0
    ⋮
    Σ Xki ei = 0
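A minimal numpy sketch (simulated data, illustrative only) showing that the least squares solution of the normal equations makes the residuals orthogonal to every regressor, exactly as the equations above state:

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 15, 2
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept + k regressors
    Y = rng.normal(size=n)

    # Least squares: solve the normal equations (X'X) b = X'Y
    b = np.linalg.solve(X.T @ X, X.T @ Y)

    e = Y - X @ b                  # residuals
    print(X.T @ e)                 # ≈ 0: Σei = 0, ΣX1i·ei = 0, ΣX2i·ei = 0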
ANOVA for Multiple Regression

    SST              =   SSR              +   SSE
    (Total Sum of        (Regression Sum      (Error Sum of
    Squares)             of Squares)          Squares)
    Σ (Yi − Ȳ)²      =   Σ (Ŷi − Ȳ)²      +   Σ (Yi − Ŷi)²

Degrees of freedom:  dfSST = n − 1,  dfSSR = k,  dfSSE = n − k − 1

Example: Two Independent Variables

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.
    Y:   pie sales (units/week)
    X's: price (in $), advertising (in $100s)
Data are collected for 15 weeks.
Geometrical Representation

Each sample observation Yi lies above or below the fitted plane Ŷ = b0 + b1X1 + b2X2 at the point (x1i, x2i); the residual is ei = Yi − Ŷi. The best plane is found by minimizing the sum of squared errors, Σe².

Pie Sales Example

Multiple regression equation:  Sales = b0 + b1(Price) + b2(Advertising)

    Week   Pie Sales   Price ($)   Advertising ($100s)
      1       350        5.50            3.3
      2       460        7.50            3.3
      3       350        8.00            3.0
      4       430        8.00            4.5
      5       350        6.80            3.0
      6       380        7.50            4.0
      7       430        4.50            3.0
      8       470        6.40            3.7
      9       450        7.00            3.5
     10       490        5.00            4.0
     11       340        7.20            3.5
     12       300        7.90            3.2
     13       440        5.90            4.0
     14       450        5.00            3.5
     15       300        7.00            2.7
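As a cross-check, a short numpy sketch that fits this model to the table above and computes the ANOVA sums of squares (the printed values should match the SPSS output quoted later in these slides):

    import numpy as np

    price = np.array([5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
                      7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00])
    adv   = np.array([3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
                      3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7])
    sales = np.array([350, 460, 350, 430, 350, 380, 430, 470,
                      450, 490, 340, 300, 440, 450, 300])

    X = np.column_stack([np.ones(15), price, adv])
    b, *_ = np.linalg.lstsq(X, sales, rcond=None)
    print(b)                                    # ≈ [306.526, -24.975, 74.131]

    yhat = X @ b
    sst = np.sum((sales - sales.mean()) ** 2)   # ≈ 56493.33
    ssr = np.sum((yhat - sales.mean()) ** 2)    # ≈ 29460.03
    sse = np.sum((sales - yhat) ** 2)           # ≈ 27033.31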
Regression Using SPSS

[SPSS regression dialog and output screenshots]

Coefficient of Multiple Determination

Reports the proportion of total variation in Y explained by all X variables taken together:

    R² = SSR / SST
Adjusted R²

Shows the proportion of variation in Y explained by all regressors, adjusted for the number of regressors used:

    Ra² = 1 − [SSE / (n − k − 1)] / [SST / (n − 1)] = 1 − (1 − R²) · (n − 1) / (n − k − 1)

(n = sample size, k = number of regressors)
Always smaller than R²; can be negative.

Is the Model Useful?

Does the model explain a significant proportion of the variation in the data? Is there a linear relationship between all of the X variables considered together and Y?

Hypotheses:
    H0: β1 = β2 = … = βk = 0  (no linear relationship)
    H1: at least one βj ≠ 0   (at least one independent variable affects Y)
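Continuing the pie sales sketch above, both R² and the adjusted R² follow directly from the sums of squares already computed there:

    r2 = ssr / sst                                   # ≈ 0.521
    adj_r2 = 1 - (1 - r2) * (15 - 1) / (15 - 2 - 1)  # ≈ 0.442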
F Test for Overall Significance

Test statistic:

    F = MSR / MSE = [SSR / k] / [SSE / (n − k − 1)]  ~  F(k, n − k − 1)
F Test for Overall Significance (continued): SPSS Output

Multiple regression model:

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

H0: β1 = β2 = 0
H1: not both zero
α = .05;  df1 = 2, df2 = 12

Test statistic:
    F = MSR / MSE = 6.5386

Critical value:
    F(2, 12, 0.05) = 3.885

Decision: since the F test statistic falls in the rejection region (6.5386 > 3.885), reject H0.
Conclusion: there is evidence that at least one independent variable affects Y.
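A short sketch reproducing this test from the pie sales SSR and SSE (scipy is assumed available):

    from scipy import stats

    ssr, sse, n, k = 29460.03, 27033.31, 15, 2
    F = (ssr / k) / (sse / (n - k - 1))          # ≈ 6.5386
    crit = stats.f.ppf(0.95, k, n - k - 1)       # ≈ 3.885
    p_value = stats.f.sf(F, k, n - k - 1)        # ≈ 0.012
    print(F > crit)                              # True: reject H0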
The Multiple Regression Equation

    Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

where Sales is in number of pies per week, Price is in $, and Advertising is in $100s.

b1 = −24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, when advertising is fixed.
b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, when price is fixed.

Using the Equation to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

    Sales = 306.526 − 24.975(5.50) + 74.131(3.5) = 428.62

Note that Advertising is in $100s, so $350 means X2 = 3.5. Predicted sales is 428.62 pies.
Are Individual Variables Significant?

Hypotheses:
    H0: βj = 0  (Xj is useless in the presence of the other variables)
    H1: βj ≠ 0  (Xj is useful)

Test statistic:

    t = (bj − 0) / se(bj)    (df = n − k − 1)
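A sketch of these t tests with statsmodels (assumed installed), reusing the price, adv, and sales arrays from the pie sales sketch above:

    import statsmodels.api as sm

    design = sm.add_constant(np.column_stack([price, adv]))
    fit = sm.OLS(sales, design).fit()
    print(fit.tvalues)   # t statistics for b0, b1, b2 (df = 12)
    print(fit.pvalues)   # two-sided p-values
    # price: t ≈ -2.306, i.e. -√5.316, consistent with the partial F test below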
Are Individual Variables Significant? (continued)

[SPSS coefficients table: p-value for each slope]

Partial F Tests

Used to assess the effect of adding or removing predictors from the model. Partial F tests form the basis of model building.
Adding One Predictor: Example

Test at the α = .05 level to determine whether price (X1) improves the model given that advertising (X2) is already included.

    H0: adding X1 does not improve the model when X2 is already included (β1 = 0)
    H1: it does (β1 ≠ 0)

ANOVA (X2 only):
                  df      SS
    Regression     1   17484.22
    Residual      13   39009.11
    Total         14   56493.33

ANOVA (X1 and X2):
                  df      SS         MS
    Regression     2   29460.03   14730.01
    Residual      12   27033.31    2252.78
    Total         14   56493.34
Adding One Predictor: Example (continued)

    F = { [SSR(X1, X2) − SSR(X2)] / 1 } / MSE(X1, X2)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316
      > F(1, 12, 0.05) = 4.75

(The numerator degrees of freedom is 1, the number of predictors added.)

Conclusion: reject H0; adding X1 does improve the model.

Removing One Predictor: Example

Test at the α = .05 level to determine whether price (X1) can be removed from the model with price (X1) and advertising (X2) as predictors.

    H0: removing X1 does not reduce the power of the model when X2 is also included (β1 = 0)
    H1: it does (β1 ≠ 0)

The ANOVA tables for the full model (X1 and X2) and the reduced model (X2 only) are the same as above, so the partial F statistic is identical:

    F = { [SSR(X1, X2) − SSR(X2)] / 1 } / MSE(X1, X2)
      = (29460.03 − 17484.22) / 2252.78
      = 5.316
      > F(1, 12, 0.05) = 4.75

Conclusion: reject H0; dropping X1 does reduce the power of the model.
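A minimal scipy check of this partial F test, using the SSR and MSE values from the two ANOVA tables:

    from scipy import stats

    ssr_full, ssr_reduced = 29460.03, 17484.22   # SSR(X1,X2), SSR(X2)
    mse_full = 2252.78                           # MSE(X1,X2)

    F = (ssr_full - ssr_reduced) / 1 / mse_full  # ≈ 5.316
    crit = stats.f.ppf(0.95, 1, 12)              # ≈ 4.75
    print(F > crit)                              # True: reject H0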
What is Collinearity?

Collinearity (or multicollinearity) exists if two or more independent variables have a perfect linear relationship.

Examples:
    X3 = 1 + 2X2 + 3X1
    X2 = 4X3

Hald Cement Data

    Y:  heat evolved
    X1: tricalcium aluminate
    X2: tricalcium silicate
    X3: tetracalcium alumino ferrite
    X4: dicalcium silicate
    n = 13

      Y      X1   X2   X3   X4
     78.5     7   26    6   60
     74.3     1   29   15   52
    104.3    11   56    8   20
     87.6    11   31    8   47
     95.9     7   52    6   33
    109.2    11   55    9   22
    102.7     3   71   17    6
     72.5     1   31   22   44
     93.1     2   54   18   22
    115.9    21   47    4   26
     83.8     1   40   23   34
    113.3    11   66    9   12
    109.4    10   68    8   12
Collinearity in Hald Cement Data

[Matrix scatter plots of X1–X4]

There are significant correlations between the independent variables. Pearson correlations (N = 13), with two-tailed p-values in parentheses:

           X1              X2              X3              X4
    X1     1               .229 (.453)    -.824** (.001)  -.245 (.419)
    X2      .229 (.453)   1               -.139 (.650)    -.973** (.000)
    X3     -.824** (.001)  -.139 (.650)   1                .030 (.924)
    X4     -.245 (.419)    -.973** (.000)  .030 (.924)    1

    **. Correlation is significant at the 0.01 level (2-tailed).
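These correlations can be reproduced with a short numpy sketch (the arrays below transcribe the Hald table above):

    import numpy as np

    # Hald cement data: columns X1..X4
    X = np.array([
        [ 7, 26,  6, 60], [ 1, 29, 15, 52], [11, 56,  8, 20],
        [11, 31,  8, 47], [ 7, 52,  6, 33], [11, 55,  9, 22],
        [ 3, 71, 17,  6], [ 1, 31, 22, 44], [ 2, 54, 18, 22],
        [21, 47,  4, 26], [ 1, 40, 23, 34], [11, 66,  9, 12],
        [10, 68,  8, 12],
    ])
    y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
                  72.5, 93.1, 115.9, 83.8, 113.3, 109.4])

    print(np.corrcoef(X, rowvar=False).round(3))   # note r(X2, X4) ≈ -.973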
Model Building

Given a list of potential predictors, models can be built using:
    Backward elimination
    Forward selection

Backward Elimination

Start with the 'full' model. If all predictors are significant, we get the final answer. If some of them are not, the one with the largest p-value is removed. A new model is then fitted using the remaining predictors. This step is repeated until all remaining predictors are significant.
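A minimal sketch of backward elimination driven by statsmodels p-values; the helper name and the α = 0.05 cutoff are illustrative, not from the slides. It can be run on the Hald X and y defined in the correlation sketch above:

    import numpy as np
    import statsmodels.api as sm

    def backward_eliminate(y, X, names, alpha=0.05):
        # Repeatedly drop the least significant predictor, then refit.
        cols = list(range(X.shape[1]))
        while cols:
            fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            pvals = fit.pvalues[1:]          # skip the intercept's p-value
            worst = int(np.argmax(pvals))
            if pvals[worst] <= alpha:        # every remaining predictor significant
                break
            del cols[worst]                  # remove the worst one and refit
        return [names[i] for i in cols]

    # e.g. backward_eliminate(y, X, ["X1", "X2", "X3", "X4"])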
Hald Cement Data: Backward Elimination

[SPSS output: predictors removed step by step]

Forward Selection

Start with the 'null' model. The predictor that gives the largest and most significant increase in SSR is included. If the largest increase is non-significant, the final answer is the 'null' model. Each predictor not already in the model is then tested, and the most significant of these is added. Continue adding predictors until none of the remaining ones is significant.
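A matching sketch of forward selection under the same assumptions (illustrative helper, α = 0.05; np and sm as imported in the backward elimination sketch):

    def forward_select(y, X, names, alpha=0.05):
        # Repeatedly add the most significant remaining predictor.
        remaining, chosen = list(range(X.shape[1])), []
        while remaining:
            # p-value of each candidate when added to the current model
            pv = [sm.OLS(y, sm.add_constant(X[:, chosen + [j]])).fit().pvalues[-1]
                  for j in remaining]
            best = int(np.argmin(pv))
            if pv[best] > alpha:             # no significant improvement left
                break
            chosen.append(remaining.pop(best))
        return [names[i] for i in chosen]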
Hald Cement Data: Forward Selection

[SPSS output: predictors added step by step]

Criteria of Model Selection

In some cases, backward elimination and forward selection give different answers. The final choice has to be based on performance criteria such as adjusted R² and MSE:

    Model    Adj R²    MSE
    X1,X2    0.974     5.790
    X1,X4    0.967     7.476

For the Hald cement data, the final model is (X1, X2).
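These criteria can be checked with statsmodels, continuing from the Hald sketches above (rsquared_adj and mse_resid are standard attributes of a fitted OLS model):

    for cols, label in [([0, 1], "X1,X2"), ([0, 3], "X1,X4")]:
        fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
        print(label, round(fit.rsquared_adj, 3), round(fit.mse_resid, 3))
    # Expected output: X1,X2 ≈ 0.974, 5.79  and  X1,X4 ≈ 0.967, 7.476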