• Multiple linear regression is a statistical technique used to model the
relationship between one dependent variable (also known as the
response or outcome variable) and two or more independent variables
(also called predictor or explanatory variables). It extends simple linear
regression, which deals with one predictor variable, by considering the
combined effects of several predictors on the outcome.
• Multiple linear regression helps to understand the relationship between variables and can be used for prediction, hypothesis testing, and assessing the strength of the predictors.
1. Dependent Variable (Outcome Variable)
• Continuous Data: The dependent variable must be continuous (e.g., test
scores, GPA, height, weight). This is the variable you are trying to predict.
Examples include:
• Final Math Grade (e.g., 0-100 scale)
• Salary (e.g., annual income in dollars)
2. Independent Variables (Predictors)
• Continuous Data (numerical): These are variables that can take any value within a
range. Examples include: Number of Hours Studied (e.g., hours per week), Attendance
Rate (e.g., percentage of classes attended), Previous Math Grades (e.g., previous
year's grade)
• Categorical Data: These variables are categorical but need to be transformed into
dummy variables (binary variables) before they can be used in the model. Examples
include: Gender (coded as 0 for male, 1 for female), Education Level (e.g., High
School, College, which can be dummy-coded)
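As a rough illustration, here is a minimal Python sketch of dummy-coding with pandas; the column names and values are hypothetical, not taken from any dataset in this lesson.

# Minimal sketch: turning categorical predictors into dummy variables.
# Column names and values are hypothetical examples.
import pandas as pd

df = pd.DataFrame({
    "gender": ["male", "female", "female", "male"],
    "education": ["High School", "College", "College", "High School"],
})

# drop_first=True keeps k-1 dummies per category, avoiding the dummy-variable trap
dummies = pd.get_dummies(df, columns=["gender", "education"], drop_first=True)
print(dummies)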
Multiple linear regression relies on several key assumptions to ensure the
validity and reliability of the results. These assumptions are:
1. Linearity
2. Homoscedasticity
3. Multivariate Normality
4. Independence of Errors
5. No Multicollinearity
1. Linearity
The relationship between the independent variables (predictors) and the dependent variable is linear. This means the change in the dependent variable is proportional to the change in the predictors.
2. Homoscedasticity
The residuals should have constant variance across all levels of the independent variables. In other words, the spread of the residuals should remain the same no matter the value of the predictors.
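One common visual check is a residuals-versus-predicted plot. Below is a minimal sketch using statsmodels and matplotlib; the data are synthetic, generated only so the example runs.

# Sketch: residuals vs. predicted values as a homoscedasticity check.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=100)
model = sm.OLS(y, sm.add_constant(X)).fit()

plt.scatter(model.fittedvalues, model.resid, alpha=0.7)
plt.axhline(0, color="red", linestyle="--")
plt.xlabel("Predicted values")
plt.ylabel("Residuals")
plt.title("Residuals vs. predicted values")
plt.show()  # a roughly even band around zero suggests constant variance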
3. Multivariate Normality
• The residuals of the regression model should be normally distributed, especially when making inferences or constructing confidence intervals. This assumption is more critical when the sample size is small.
• Conducting a Shapiro-Wilk test on the residuals (the differences between the observed and predicted values) is essential for checking this assumption. If the residuals are not normally distributed, it can affect the validity of hypothesis tests (e.g., t-tests, F-tests) and confidence intervals.
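A minimal sketch of the Shapiro-Wilk test on residuals, assuming a model fitted with statsmodels on synthetic data (the numbers are illustrative only):

# Sketch: Shapiro-Wilk test for normality of regression residuals.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=100)
model = sm.OLS(y, sm.add_constant(X)).fit()

w, p = stats.shapiro(model.resid)
print(f"Shapiro-Wilk W = {w:.3f}, p = {p:.3f}")
# p > .05: no evidence that the residuals depart from normality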
4. Independence of Errors
• The residuals (errors) of the model should be independent of each other, meaning they should not be correlated with one another. This is particularly important when dealing with time series data, where autocorrelation can be a concern; the Durbin-Watson test is used to detect it.
In simple terms, if the errors in your predictions are random and not related to each other, your model is likely good. If the errors show a pattern, your model might need adjustments.
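statsmodels provides the Durbin-Watson statistic directly; the sketch below applies it to residuals from a model fitted on synthetic data (illustrative only):

# Sketch: Durbin-Watson test for autocorrelation in the residuals.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=100)
model = sm.OLS(y, sm.add_constant(X)).fit()

dw = durbin_watson(model.resid)
print(f"Durbin-Watson = {dw:.3f}")
# values near 2 suggest no autocorrelation; toward 0 positive, toward 4 negative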
5. No Multicollinearity
•The independent variables should not be highly correlated with one
another. High multicollinearity can make it difficult to determine the
effect of each predictor variable on the dependent variable.
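A common diagnostic is the variance inflation factor (VIF); below is a minimal statsmodels sketch on synthetic predictors. As a rule of thumb, a VIF above about 5 (or 10) is often read as problematic.

# Sketch: variance inflation factors as a multicollinearity check.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 3)))  # constant + 3 predictors

for i in range(1, X.shape[1]):  # skip the constant column
    print(f"VIF for predictor {i}: {variance_inflation_factor(X, i):.2f}")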
Ms. Bini, a Grade 10 math teacher, wants to predict her students' final math grades
based on factors such as their previous math grades, attendance rates, gender, and
study habits. She believes that understanding the influence of these factors will help her
identify students who may be at risk of underperforming and allow her to tailor her
teaching strategies to better support their academic success in mathematics. However,
she faces the challenge of determining how these variables interact and contribute to her
students' final outcomes, making it difficult to provide targeted interventions.
1. How do previous math grades, attendance rates, gender, and study habits predict the
final math grades of Grade 10 students?
2. Which factor among previous math grades, attendance rates, gender, and study habits
has the most significant influence on predicting the final math grades of Grade 10
students?
3. How accurately can a regression model predict the final math grades of Grade 10
students based on their study habits, attendance rates, gender, and previous math
performance?
Ŷ = 60.4 + 0.377X₁ + 14.971X₂ + 0.097X₃ − 0.496X₄
where Ŷ is the predicted final math grade, X₁ = number of hours studied per week, X₂ = attendance, X₃ = percentage of assignments completed, and X₄ = gender.
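A sketch of how such a model might be fitted with statsmodels; the file name and all column names are hypothetical stand-ins for Ms. Bini's data, not a real dataset.

# Sketch: fitting the four-predictor model with the statsmodels formula API.
# "grade10_math.csv" and all column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("grade10_math.csv")
model = smf.ols(
    "final_grade ~ hours_studied + attendance + assignments_pct + gender",
    data=df,
).fit()
print(model.summary())  # coefficients, p-values, R Square, Durbin-Watson

# Predicting one student's final grade from the fitted equation
new = pd.DataFrame({"hours_studied": [5], "attendance": [0.90],
                    "assignments_pct": [80], "gender": [0]})
print(model.predict(new))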
Interpretation: The number of hours studied per week significantly
predicts final math grades (p < 0.05). For every additional hour studied,
the final math grade increases by 0.377 points. This variable has a
positive and significant impact on students' final math grades.
Interpretation: The percentage of assignments completed significantly
predicts final math grades (p < 0.05). For every 1% increase in
assignments completed, the final math grade increases by 0.097
points. This variable has a strong and positive impact on the final math
grades of students.
Interpretation: Gender does not significantly predict final math grades
(p > 0.05). The negative coefficient suggests that females may score
slightly higher than males, but the effect is not statistically significant.
Significant Predictors: The number of hours studied per week and the percentage of
assignments completed significantly predict students' final math grades. Both have a
positive relationship with final grades.
Non-Significant Predictors: Attendance percentage and gender do not significantly
predict final math grades in this model.
Thus, the model indicates that focusing on study habits (specifically hours studied and
assignment completion) may have the most substantial impact on improving students'
math grades.
The Percentage of Assignments Completed has the largest Beta value (0.550),
indicating that it has the most significant influence on predicting final math
grades.
The second most influential factor is Number of Hours Studied in a Week with a
Beta of 0.369.
Attendance Percentage (Beta = 0.054) and Gender (Beta = -0.073) have much
weaker effects and are not significant predictors based on their p-values.
Thus, the Percentage of Assignments Completed is the strongest predictor of
final math grades among the factors considered.
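Standardized Beta coefficients like these can be reproduced by z-scoring every variable before fitting, as in this sketch on synthetic data (the variable names are illustrative only):

# Sketch: standardized (Beta) coefficients via z-scored variables.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "y"])

z = (df - df.mean()) / df.std()          # z-score every column
betas = sm.OLS(z["y"], z[["x1", "x2"]]).fit().params  # intercept is 0 after centering
print(betas)  # Betas are unit-free, so they can be compared across predictors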
R = .826: This is the multiple correlation coefficient. It measures the strength and direction of the linear relationship between the dependent variable and the independent variables. A value of .826 indicates a strong positive relationship.
R Square = .683: This is the coefficient of determination. It represents the proportion of the variance in the dependent variable that is predictable from the independent variables. In this case, 68.3% of the variation in the dependent variable can be explained by the model.
Adjusted R Square = .648: This adjusts the R Square value for the number of predictors in the model. It accounts for the model's complexity and is a more accurate measure of how well the model generalizes to other data. Here, after adjusting for the number of predictors, about 64.8% of the variance in the dependent variable is explained by the model.
Standard Error of the Estimate = 2.0258: This is the standard deviation of the residuals (errors). It provides a measure of how much the observed values deviate from the predicted values. A smaller standard error indicates that the model's predictions are closer to the actual data points.
Overall, the model explains a significant amount of the variance in the
dependent variable (R² = .683), and the predictors collectively have a
strong positive relationship with the outcome variable. The adjusted R
Square (.648) suggests the model is moderately good at predicting new
data, while the standard error of 2.0258 indicates the level of
prediction error.
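These same quantities can be read off a fitted statsmodels result; the sketch below uses a synthetic model only to show where each value lives.

# Sketch: extracting the Model Summary quantities from a fitted OLS result.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=100)
model = sm.OLS(y, sm.add_constant(X)).fit()

print(f"R                 = {np.sqrt(model.rsquared):.3f}")
print(f"R Square          = {model.rsquared:.3f}")
print(f"Adjusted R Square = {model.rsquared_adj:.3f}")
print(f"Std. Error        = {np.sqrt(model.mse_resid):.4f}")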
A regression model is considered a good fit when it meets several key criteria indicating that it
effectively explains the variability in the dependent variable and aligns with the assumptions of the
regression analysis. Here are the main factors to consider:
1. Goodness-of-Fit Measures:
a. R-squared (R²):
Definition: The proportion of variance in the dependent variable that is
explained by the independent variables.
Interpretation: A higher R² value (closer to 1) indicates a better fit. For
example, an R² of 0.80 means that 80% of the variability in the dependent
variable is explained by the model.
Context: While a high R² suggests a good fit, it is essential to consider whether
it is high enough given the context and field of study.
b. Adjusted R-squared (R²adj):
Definition: R² adjusted for the number of predictors in the model. It is more
reliable than R² when comparing models with different numbers of predictors.
Interpretation: Higher values indicate a better fit, but it also accounts for the
number of predictors. An increase in R²adj when adding a predictor means the
new predictor improves the model.
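For reference, the standard formula, with n observations and p predictors, is:

R²adj = 1 − (1 − R²) × (n − 1) / (n − p − 1)

so adding a predictor raises R²adj only if the gain in R² outweighs the loss of a degree of freedom.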
2. Assumptions of Linear Regression:
a. Linearity:
Definition: The relationship between the dependent and independent variables should be linear.
Check: Use scatterplots to ensure a linear relationship between the predictors and the outcome.
b. Independence of Errors:
Definition: Residuals should be independent of each other.
Check: For time series data, check for autocorrelation using tests like the Durbin-Watson statistic.
c. Homoscedasticity:
Definition: The variance of residuals should be constant across all levels of the independent
variables.
Check: Plot residuals versus predicted values. The spread of residuals should be consistent across
the range of predicted values.
d. Normality of Residuals:
Definition: Residuals should be approximately normally distributed.
Check: Use Q-Q plots or histograms to assess the distribution of residuals.
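The visual checks above take only a few lines; this sketch fits a model on synthetic data and draws the two standard diagnostic plots.

# Sketch: residuals-vs-fitted plot (linearity/homoscedasticity) and a Q-Q
# plot (normality) for a model fitted on synthetic data.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.5, size=100)
model = sm.OLS(y, sm.add_constant(X)).fit()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(model.fittedvalues, model.resid)
axes[0].axhline(0, color="red", linestyle="--")
axes[0].set(xlabel="Fitted values", ylabel="Residuals",
            title="Residuals vs. fitted")
sm.qqplot(model.resid, line="45", fit=True, ax=axes[1])  # normality check
axes[1].set_title("Normal Q-Q plot of residuals")
plt.tight_layout()
plt.show()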