Regression Analysis
Regression Analysis is used to:
1) understand the relation between two variables
2) predict the value of one variable based on another variable.
A regression model consists of a dependent (response) variable and an independent (predictor) variable.
[Diagram: Independent Variable(s) → prediction relationship → Dependent Variable]
Regression Analysis
Linear regression estimates the coefficients of the linear equation,
involving one or more independent variables, that best predict the value
of the dependent variable.
If you believe that none of your predictor variables is correlated with
the errors in your dependent variable, you can use the linear regression
procedure.
Simple Linear Regression
The Scatter Diagram – used to graphically investigate the relationship
between the dependent and independent variables
[Scatter plot of all (Xi, Yi) pairs; X (0–60) on the horizontal axis,
Y (0–100) on the vertical axis]
Types of Regression Models
Positive Linear Relationship Relationship NOT Linear
Negative Linear Relationship No Relationship
Simple Linear Regression Model
Regression models are used to test whether a relationship exists between
variables; that is, to use one variable to predict another. However, there
is a random error that cannot be predicted.
Yi = β0 + β1·Xi + εi

where Yi is the dependent (response) variable, Xi is the independent
(predictor/explanatory) variable, β0 is the Y-intercept, β1 is the slope,
and εi is the random error.
Population Linear Regression Model
Observed value: Yi = β0 + β1·Xi + εi, where εi is the random error

Mean response: μY|X = β0 + β1·Xi

[Diagram: an observed value Yi deviates from the population regression
line μY|X = β0 + β1·Xi by the random error εi]
Sample Linear Regression Model
ŷi = b0 + b1·xi

ŷi = predicted value of Y for observation i
xi = value of X for observation i
b0 = sample Y-intercept, used as an estimate of the population β0
b1 = sample slope, used as an estimate of the population β1
Sample Linear Regression Model
Sample data are used to estimate the true
values for the intercept and slope.
ŷi = b0 + b1·xi
The difference between the actual value of Y
and the predicted value (using sample data) is
known as the error.
Error = actual value – predicted value, i.e. ei = Yi – ŷi
Sample Linear Regression Model
ŷi = b0 + b1·xi

b1 = [n·Σ xiyi – (Σ xi)(Σ yi)] / [n·Σ xi² – (Σ xi)²]   (sums over i = 1, …, n)

b0 = ȳ – b1·x̄
Table 3.1. Intelligence Test Scores and Freshmen Chemistry Grades
Student   Test Score (x)   Chemistry Grade (y)
1         65               85
2         50               74
3         55               76
4         65               90
5         55               85
6         70               87
7         65               94
8         70               98
9         55               81
10        70               91
11        50               76
12        55               74
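As a minimal sketch in plain Python (data taken from Table 3.1 above; variable names are illustrative), the raw-sum least-squares formulas for b1 and b0 can be applied directly:

```python
# Data from Table 3.1: intelligence test scores (x) and chemistry grades (y).
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# b1 = [n*Sum(xy) - Sum(x)*Sum(y)] / [n*Sum(x^2) - (Sum(x))^2]
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# b0 = ybar - b1*xbar
b0 = sum_y / n - b1 * sum_x / n

print(round(b1, 3), round(b0, 3))  # 0.897 30.043
```

Note that carrying full precision for b1 gives b0 ≈ 30.043 (matching the SPSS output later in these slides); rounding b1 to 0.897 before computing b0 gives the 30.056 shown on the next slides.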
Figure 3.1. Scatter Diagram with regression line
[Scatter plot of Chemistry Grade (70–100) vs. Intelligence Test Score
(40–75) with the fitted line ŷi = b0 + b1·xi; the point estimates of b0
and b1 are determined using the method of least squares]
Measures of Variation: The Sum of Squares

SST = Σ (Yi – Ȳ)²    (total sum of squares)
SSR = Σ (Ŷi – Ȳ)²    (regression sum of squares)
SSE = Σ (Yi – Ŷi)²   (error sum of squares)

[Diagram: for a point (Xi, Yi), the deviations Yi – Ȳ, Ŷi – Ȳ, and
Yi – Ŷi about the fitted line Ŷ = b0 + b1·X]
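The three sums of squares can be sketched in plain Python (data assumed from Table 3.1; names illustrative), which also verifies the decomposition SST = SSR + SSE:

```python
# Data from Table 3.1.
x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Least-squares fit in deviation form.
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total variation
ssr = sum((yh - ybar) ** 2 for yh in yhat)            # explained by regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # unexplained error

print(round(sst, 2), round(ssr, 3), round(sse, 3))  # 728.25 541.693 186.557
```

These totals match the ANOVA table from SPSS later in the slides.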
Method of Least Squares
SSE = Σ ei² = Σ (yi – b0 – b1·xi)²   (sums over i = 1, …, n)

Differentiating Σ ei² with respect to b0 and b1 and equating the
derivatives to zero gives:

b1 = [n·Σ xiyi – (Σ xi)(Σ yi)] / [n·Σ xi² – (Σ xi)²]

b0 = ȳ – b1·x̄
Method of Least Squares
b1 = [n·Σ xiyi – (Σ xi)(Σ yi)] / [n·Σ xi² – (Σ xi)²]

or, equivalently, in deviation form:

b1 = Σ (xi – x̄)(yi – ȳ) / Σ (xi – x̄)²
Applying the least-squares formulas to the data of Table 3.1 gives

b1 = 0.897
b0 = 30.056

so the fitted regression line is

ŷi = 30.056 + 0.897·xi
Figure 3.1. Scatter Diagram with regression line
[Scatter plot of Chemistry Grade (70–100) vs. Intelligence Test Score
(40–75) with the fitted line ŷi = 30.056 + 0.897·xi]

The slope of 0.897 means that for each increase of one unit in
Intelligence Test Score (X), the Chemistry Grade (Y) is estimated to
increase by 0.897 units.
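As a quick illustration (plain Python; the helper `predict` is hypothetical), the fitted line gives point predictions, and the slope is exactly the estimated change in ŷ per one-unit change in x:

```python
def predict(x):
    # Fitted line from the slides: yhat = 30.056 + 0.897 * x
    # (hypothetical helper for illustration).
    return 30.056 + 0.897 * x

# Predicted chemistry grade for a test score of 60.
print(round(predict(60), 3))  # 83.876

# Raising the test score by one unit raises the prediction by the slope.
print(round(predict(61) - predict(60), 3))  # 0.897
```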
Using SPSS

Graphs → Scatter → Simple produces the scatter plot. To add the regression
line, use the SPSS Chart Editor: Chart → Options → Fit Line → Regression.

[Scatter plot of Chemistry Grade (70–100) vs. Test Score (40–80) with the
regression (prediction) line; Rsq = 0.7438]
Using SPSS

Analyze → Regression → Linear

Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta         t        Sig.
1  (Constant)    30.043     10.137                          2.964    .014
   Test Score    .897       .167               .862         5.389    .000
a. Dependent Variable: Chemistry Grade

ŷi = 30.043 + 0.897·xi
Using SPSS

Analyze → Regression → Linear

Model Summary(b)

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .862(a)  .744       .718                4.319
a. Predictors: (Constant), Test Score
b. Dependent Variable: Chemistry Grade

R is the coefficient of correlation and R Square the coefficient of
determination (measures of variation); the Std. Error of the Estimate is
the standard deviation around the regression line.
ANOVA(b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1  Regression   541.693          1    541.693       29.036   .000(a)
   Residual     186.557          10   18.656
   Total        728.250          11
a. Predictors: (Constant), Test Score
b. Dependent Variable: Chemistry Grade
Testing the Significance of b1

Similar to a test on r in the one-predictor case:

t = (0.8972136 – 0)/0.1665043 = 5.39, so H0: β1 = 0 is rejected, i.e. the
regression line has a nonzero slope.
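The t statistic can be reproduced in plain Python (a sketch; data assumed from Table 3.1), using the standard error of the slope se(b1) = sqrt(MSE / Sxx):

```python
import math

x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)            # s^2, variance around the regression line
se_b1 = math.sqrt(mse / sxx)   # standard error of the slope

t = (b1 - 0) / se_b1           # test of H0: beta1 = 0
print(round(se_b1, 4), round(t, 2))  # 0.1665 5.39
```

The values agree with the SPSS coefficients table (Std. Error .167, t 5.389).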
Variance Explained – r²

r² tells us the proportion of variance in Y which is explained by X:

r² = SS regression / SS total = SSŶ / SSY = Σ (Ŷ – Ȳ)² / Σ (Y – Ȳ)²

• a ratio reflecting the proportion of variance captured by our model
relative to the overall variance in our data
• highly interpretable: r² = .50 means 50% of the variance in Y is
explained by X
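A minimal sketch in plain Python (data assumed from Table 3.1) confirms that r² is both the SSR/SST ratio and the square of the correlation coefficient:

```python
import math

x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# r^2 as SS(regression) / SS(total) ...
r2 = (sxy ** 2 / sxx) / syy
# ... equals the squared correlation coefficient r.
r = sxy / math.sqrt(sxx * syy)

print(round(r2, 4), round(r, 3))  # 0.7438 0.862
```

This matches the Rsq = 0.7438 and R = .862 reported by SPSS.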
Linear Regression Assumptions

For linear models:
1. Normality – Y values are normally distributed for each X; the
probability distribution of the error is normal
2. Homoscedasticity (constant variance)
3. Independence of errors
Variation of Errors Around the Regression Line

[Diagram: normal error distributions f(e) around the regression line at
two values X1 and X2]

y values are normally distributed around the regression line. For each x
value, the “spread” or variance around the regression line is the same.
Residual Analysis
Purposes
• Examine linearity
• Evaluate violations of assumptions

Graphical Analysis of Residuals
• Plot residuals vs. Xi values (the residual is the difference between
the actual Yi and the predicted Ŷi)
• Studentized residuals allow consideration of the magnitude of the
residuals
Residual Analysis for Linearity
[Residual plots e vs. X: a curved pattern indicates the relationship is
not linear; a random scatter about zero indicates linearity]
Residual Analysis for Homoscedasticity
[Plots of studentized residuals (SR) vs. X: a fan-shaped spread indicates
heteroscedasticity; a constant spread indicates homoscedasticity]
Using Standardized Residuals

After fitting the model (in Stata, via the predict postestimation
command):
• predict the Chemistry Grade (fitted values)
• predict the residuals
• predict the studentized residuals
• predict the standardized residuals
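These quantities (mirroring what Stata's predict produces) can be sketched in plain Python; the leverage-based formula below is the internally studentized residual, and the names are illustrative:

```python
import math

x = [65, 50, 55, 65, 55, 70, 65, 70, 55, 70, 50, 55]
y = [85, 74, 76, 90, 85, 87, 94, 98, 81, 91, 76, 74]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * xi for xi in x]                  # fitted chemistry grades
e = [yi - yh for yi, yh in zip(y, yhat)]           # raw residuals
s = math.sqrt(sum(ei ** 2 for ei in e) / (n - 2))  # std. error of estimate

std_res = [ei / s for ei in e]                     # standardized residuals
h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]   # leverages
stud_res = [ei / (s * math.sqrt(1 - hi))           # studentized residuals
            for ei, hi in zip(e, h)]

print(round(s, 3))  # 4.319
```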
Residual Analysis for Normality

kdensity r, normal        swilk r

[Kernel density estimate of the residuals (kernel = epanechnikov,
bandwidth = 2.25) overlaid with the normal density over the range –10 to
10; the residuals appear approximately normal]
Residual Analysis for Linearity

scatter r X, yline(0)

[Plot of residuals vs. Test Score (50–70), scattered randomly about zero:
linear]
Residual Analysis for Homoscedasticity

scatter r1 X, yline(0)        scatter sr X, yline(0)

[Plots of standardized residuals (left) and studentized residuals (right)
vs. Test Score (50–70), each scattered evenly about zero within ±2:
homoscedasticity]
Residual Analysis for Homoscedasticity

hettest (Stata's Breusch–Pagan/Cook–Weisberg test for heteroskedasticity)

Conclusion: homoscedasticity
Residual Analysis for Independence

scatter r obs, yline(0)

[Plot of residuals vs. observation order (0–15), scattered randomly about
zero: independent]
Residual Analysis for Independence

Durbin-Watson Statistic. The D-W statistic is defined as:

d = Σ (et – et–1)² / Σ et²   (numerator over t = 2, …, n; denominator over t = 1, …, n)

Values of d near 2 indicate independent errors.
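As a sketch of this definition (plain Python; the example residual sequences are made up), values of d near 0 suggest positive autocorrelation, near 4 suggest negative autocorrelation, and near 2 suggest independence:

```python
def durbin_watson(e):
    # d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den

# Hypothetical residual sequences:
print(durbin_watson([1, 1, 1, -1, -1, -1]))  # runs of like signs -> d near 0
print(durbin_watson([1, -1, 1, -1, 1, -1]))  # alternating signs  -> d near 4
```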