Linear Regression
Linear regression is a supervised machine learning algorithm.
It tries to find the best linear relationship that describes the data you have:
it assumes that a linear relationship exists between a dependent variable and
one or more independent variables.
The value of the dependent variable of a linear regression model is continuous,
i.e. a real number.
We want to find the best line (a linear function y = f(X)) to
explain the data.
[Figure: scatter plot of the data with a fitted line; y on the vertical axis, X on the horizontal axis.]
Simple Linear Regression
Simple Linear Regression Equation
The equation describes how individual y values relate to x.
The predicted value of y is given by: ŷ = b0 + b1X, where:
y is the dependent variable.
ŷ is the predicted value of y.
X is the independent variable.
b0 and b1 are the regression coefficients:
b0 is the intercept (or bias) that fixes the offset of the line.
b1 is the slope (or weight) that specifies the factor by which X
has an impact on y.
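The prediction equation above can be sketched directly in code. The coefficient values b0 and b1 below are made-up assumptions for illustration, not fitted values:

```python
# Minimal sketch of the simple linear regression prediction y_hat = b0 + b1 * X.
# The coefficient values here are illustrative assumptions.
b0 = 2.0   # intercept: offset of the line
b1 = 0.5   # slope: change in the prediction per unit change in X

def predict(x):
    """Predicted value of y for a given x."""
    return b0 + b1 * x

print(predict(10))  # 2.0 + 0.5 * 10 = 7.0
```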
Error for the Simple Linear Regression Model
Y = β0 + β1X + ε is called the regression model.
ε reflects how individual y values deviate from others with the same
value of x.
For example, at X = 82 the fitted value is Ŷ₈₂ = b0 + b1(82), and the
residual is e₈₂ = Y₈₂ − Ŷ₈₂.
Estimated Simple Linear Regression Equation
Recall: the estimated simple linear regression equation is:
ŷ = b0 + b1X
b0 is the estimate for β0.
b1 is the estimate for β1.
ŷ is the estimated (predicted) value of Y for a given x value.
Least Squares method
• Least Squares Criterion: choose the “best” β0 and β1 to minimize
• S = Σᵢ (Yᵢ − (β0 + β1Xᵢ))²
• Use calculus: take the derivative with respect to β0 and with respect to
β1, set the two resulting equations equal to zero, and solve for β0
and β1.
• Of all possible lines, pick the one that minimizes the sum of the
squared vertical distances of the points from that line.
Least Squares Solution
Slope: b1 = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)²
Intercept: b0 = Ȳ − b1X̄
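The closed-form least-squares solution can be computed directly from the two formulas above. The sample data below is made up for illustration (roughly y = 2x plus noise):

```python
# Least-squares solution for simple linear regression:
# b1 = sum((Xi - X_bar)(Yi - Y_bar)) / sum((Xi - X_bar)^2),  b0 = Y_bar - b1 * X_bar.
# The sample data is an illustrative assumption.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 4.0, 6.2, 7.9, 10.1]

x_bar = sum(X) / len(X)
y_bar = sum(Y) / len(Y)

# Slope: covariance-like numerator over the sum of squared x-deviations.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
     / sum((x - x_bar) ** 2 for x in X)
# Intercept: forces the fitted line through the point (x_bar, y_bar).
b0 = y_bar - b1 * x_bar

print(b1, b0)  # slope close to 2, intercept close to 0
```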
Estimating the Variance σ²
• An estimate of σ²: the mean square error (MSE) provides the estimate
of σ², and the notation s² is also used:
s² = MSE = SSE/(n − 2)
where SSE is the Sum of Squared Errors.
If points are close to the regression line, SSE will be small.
If points are far from the regression line, SSE will be large.
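The estimate s² = MSE = SSE/(n − 2) can be sketched as follows; the data and coefficients below are illustrative assumptions (the coefficients are the least-squares fit for this toy sample):

```python
# Sketch of s^2 = MSE = SSE / (n - 2) for a fitted line.
# Data and coefficients are illustrative assumptions.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 4.0, 6.2, 7.9, 10.1]
b0, b1 = 0.09, 1.99  # least-squares fit for this sample

# Residual for each point: observed y minus predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)   # Sum of Squared Errors
mse = sse / (len(X) - 2)               # n - 2 degrees of freedom
print(sse, mse)
```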
Bias, variance tradeoff
Bias
Regression predictions should be unbiased. That is, the
"average of predictions" should ≈ the "average of observations".
Bias measures how far the mean of the predictions is from the mean of the actual
values:
Bias = mean of predictions − mean of actual values (ground-truth labels)
A high-bias model cannot fit the data (it is usually too simplistic).
Increase the complexity of the model to minimize bias.
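The bias definition above is a direct difference of means. The toy numbers below are illustrative assumptions:

```python
# Sketch of the bias definition: gap between the mean prediction and
# the mean of the ground-truth labels. Toy numbers, for illustration.
predictions = [3.0, 5.0, 7.0, 9.0]
actuals     = [4.0, 6.0, 8.0, 10.0]

bias = sum(predictions) / len(predictions) - sum(actuals) / len(actuals)
print(bias)  # -1.0: this model under-predicts by 1 on average
```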
Variance
Variance indicates how much the estimate of the target function would change if different
training data were used.
It describes how much a random variable differs from its expected value.
It measures the inconsistency of predictions across models trained on different
training sets: different samples of training data yield different model fits.
Increase the size of the training data set to minimize variance.
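The idea that different training samples yield different model fits can be demonstrated by refitting the slope on many random samples and measuring the spread of the estimates. The data-generating process below (true slope 2, Gaussian noise) is an illustrative assumption:

```python
# Sketch: fitting the same model on different training samples yields
# different coefficients; the spread of those estimates reflects the
# variance of the learner. The synthetic data is an assumption.
import random

random.seed(0)

def fit_slope(X, Y):
    """Closed-form least-squares slope (intercept omitted for brevity)."""
    x_bar, y_bar = sum(X) / len(X), sum(Y) / len(Y)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) \
           / sum((x - x_bar) ** 2 for x in X)

slopes = []
for _ in range(200):
    X = [random.uniform(0, 10) for _ in range(20)]
    Y = [2 * x + random.gauss(0, 1) for x in X]  # true slope 2, plus noise
    slopes.append(fit_slope(X, Y))

mean_slope = sum(slopes) / len(slopes)
variance = sum((b - mean_slope) ** 2 for b in slopes) / len(slopes)
print(mean_slope, variance)  # mean near 2; more data per sample shrinks variance
```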
Overfitting vs. Model Complexity
• Models with high bias will have low variance.
• Models with high variance will have low bias.
Over-fitting
Overfitting is an undesirable behavior in which a learning model gives
accurate predictions for training data but not for new data.
The model fits all of the data points in the training set, or more data
points than required.
The model starts capturing noise.
How to minimize Overfitting
Reduce model complexity
Train with more data
Remove features
Stop training early (early stopping)
Regularization
Regularization and Over-fitting
Adding a regularizer:
[Figure: model error vs. number of iterations, with and without a regularizer.]
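One common regularizer is an L2 (ridge) penalty, which shrinks the coefficients toward zero and can curb overfitting. The slides do not specify a particular regularizer, so this is a sketch of ridge regression for the slope only, with an illustrative toy sample:

```python
# Sketch of L2 (ridge) regularization for simple linear regression.
# With centered data, the penalized slope is b1 = Sxy / (Sxx + lambda):
# a larger lambda shrinks the slope toward zero. Toy data, for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 4.0, 6.2, 7.9, 10.1]

x_bar, y_bar = sum(X) / len(X), sum(Y) / len(Y)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sxx = sum((x - x_bar) ** 2 for x in X)

slopes = []
for lam in (0.0, 1.0, 10.0):
    b1 = sxy / (sxx + lam)   # ridge estimate: larger lambda, smaller slope
    slopes.append(b1)
    print(lam, b1)
```

With lam = 0 this reduces to the ordinary least-squares slope; increasing lam progressively shrinks the estimate.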
Underfitting
The model is not able to capture the underlying trend of the
data.
How to minimize underfitting
• Increase the training time of the model.
• Increase the number of features.
• Increase model complexity.
2. Multiple Linear Regression
In multiple linear regression, the dependent variable depends on more than one
independent variable.
The predicted value of y is given by:
ŷ = β0 + β1X1 + β2X2 + β3X3 + … + βnXn
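The multiple-regression coefficients can be estimated with the normal equations, β = (XᵀX)⁻¹Xᵀy, a standard extension of the least-squares method above. The two-feature toy data below is an illustrative assumption (y is generated noise-free from known coefficients so the fit recovers them exactly):

```python
# Sketch of multiple linear regression via the normal equations,
# beta = (X^T X)^{-1} X^T y. Toy data with two features, for illustration.
import numpy as np

# Design matrix with a leading column of ones for the intercept beta_0.
X = np.array([[1, 1.0, 2.0],
              [1, 2.0, 1.0],
              [1, 3.0, 4.0],
              [1, 4.0, 3.0],
              [1, 5.0, 6.0]])
y = X @ np.array([1.0, 2.0, 0.5])  # known true coefficients, noise-free

# Solve (X^T X) beta = X^T y; more stable than forming the inverse explicitly.
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # recovers [1.0, 2.0, 0.5] since y has no noise
```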