Linear Regression
Unit 2.1
Disclaimer
The content is curated from online/offline resources and is used for educational purposes only.
Demo: Let’s Try Linear Regression via GUI
Learning Objectives
• About Linear Regression
• Types of Regression
• Goal
• Regression Modelling
• Performance Metrics
• Bias Variance Tradeoff
• Regularized Regression
Introduction
• In the 1880s, Francis Galton was studying the relationship between parents and their children.
• He investigated the relationship between the heights of fathers and their sons.
• He discovered that a son tends to be roughly as tall as his father; however, a tall father's son tends to be somewhat closer to the overall average height of all sons.
• Galton called this phenomenon "regression", because a father's son's height tends to regress (or drift) towards the mean (average) height of everyone else.
Linear Regression
• Regression is used to study the relationship between two variables.
• We can use simple regression if both the dependent variable (DV) and the independent variable (IV) are numerical.
• If the DV is numerical but the IV is categorical, linear regression can still be used by encoding the categories as numeric (dummy) variables.
Example
The following are situations where we can use regression:
1. Testing if IQ affects income (IQ is the IV and income is the DV).
2. Testing if hours of work affect hours of sleep (DV is hours of sleep and hours of work is the IV).
3. Testing if the number of cigarettes smoked affects blood pressure (number of cigarettes smoked is the
IV and blood pressure is the DV).
4. Testing if high body fat increases the chance of heart failure (body fat is the IV and the chance of heart failure is the DV).
Displaying the Data
• Displaying data for testing whether IQ affects income (IQ is the IV and income is the DV).
• When both the DV and IV are numerical, we can
represent data in the form of a scatterplot.
[Scatterplot: Income vs. IQ]
Displaying the Data
• Displaying data for the chance of heart failure due to high body fat.
• It is important to draw a scatterplot because it helps us see whether the relationship is linear.
Regression Case
Dataset related to CO2 emissions from different cars:
ENGINESIZE CYLINDERS FUELCONSUMPTION_COMB CO2EMISSION
0 2.0 4 8.5 196
1 2.4 4 9.6 221
2 1.5 4 5.9 136
3 3.5 6 11.1 255
4 3.5 6 10.6 244
5 3.5 6 10.0 230
6 3.5 6 10.1 232
7 3.7 6 11.1 255
8 3.7 6 11.6 267
9 2.4 4 9.2 ?
Regression Case
• Looking at the existing data for different cars, can we estimate the approximate CO2 emission of a car that has not yet been manufactured, such as the one in row 9?
• We can use regression methods to predict a continuous value, such as CO2 emission, using some other variables.
• In regression there are two types of variables:
• a dependent variable (DV, which we want to predict), and
• one or more independent variables (IV, existing features); see the sketch below.
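A minimal sketch, assuming pandas, of the sample rows above as a DataFrame, separating the IVs from the DV:

```python
import pandas as pd

# Sample rows from the CO2 emissions dataset shown above
df = pd.DataFrame({
    "ENGINESIZE":           [2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7],
    "CYLINDERS":            [4, 4, 4, 6, 6, 6, 6, 6, 6],
    "FUELCONSUMPTION_COMB": [8.5, 9.6, 5.9, 11.1, 10.6, 10.0, 10.1, 11.1, 11.6],
    "CO2EMISSION":          [196, 221, 136, 255, 244, 230, 232, 255, 267],
})

X = df[["ENGINESIZE", "CYLINDERS", "FUELCONSUMPTION_COMB"]]  # independent variables (IVs)
y = df["CO2EMISSION"]                                        # dependent variable (DV)
```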
Regression Essentials
• The key point in regression is that our dependent variable must be continuous; it cannot be a discrete value.
• However, the independent variable(s) can be either categorical or continuous.
• We use regression to build an estimation model that can predict the expected CO2 emission of a new or unknown car.
Types of Regression Model
1. Simple regression is when one independent variable is used to estimate a dependent variable.
It can be linear or non-linear.
Ex: predicting CO2 emission using the variable Engine Size.
2. When more than one independent variable is present, the process is called multiple linear regression.
Ex: predicting CO2 emission using Engine Size and the number of Cylinders in any given car.
The linearity of a regression depends on the relation between the dependent and independent variables; it can be either linear or non-linear regression.
Regression Application Areas
• Essentially, we use regression when we want to estimate a continuous value.
• You can try to predict a salesperson's total yearly sales (sales forecast) from independent
variables such as age, education, and years of experience.
• We can use regression analysis to predict the price of a house in an area, based on its size, number
of bedrooms, and so on.
• We can even use it to predict employment income from independent variables such as hours of work, education, occupation, sex, age, years of experience, and so on.
Simple Linear Regression
• How do we calculate a regression with only 2 data points?
• In linear regression, we calculate the regression line by drawing a line connecting the points.
• For classic linear regression, or the "Least Squares Method", we only measure the closeness in the "up and down" (vertical) direction.
• Here we have a perfectly fitted line, because we have only 2 points.
Regression with More Data Points
• Now wouldn't it be great if we could apply this same
concept to a graph with more than just two data points?
• By doing this, we could take multiple men and their sons' heights and do things like tell a man how tall we expect his son to be... before he even has a son!
• This is the idea behind supervised learning!
Regression Goal
• The goal is to determine the best line by minimizing the vertical distance between all the data points and our line.
• There are many different ways to minimize this distance (sum of squared errors, sum of absolute errors, etc.).
• All these methods share the general goal of minimizing the distance between the line and the rest of the data points.
Case Study
• This dataset is related to the CO2 emission of different cars.
• The question is: given this dataset, can we predict the CO2 emission of a car using another field, such as Engine Size? Yes!
• Here CO2 emission is Y, the dependent variable, and Engine Size is X, the independent variable; both are continuous values.
Scatter Plot
• To understand linear regression, we can plot our variables here.
• Engine size is the independent variable; Emission is the dependent/target value that we would like to predict.
• A scatterplot clearly shows the relation between the variables, where changes in one variable "explain" or possibly "cause" changes in the other variable.
• It also indicates that these variables are linearly related (see the sketch below).
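A minimal matplotlib sketch of such a scatterplot, using the sample rows shown earlier:

```python
import matplotlib.pyplot as plt

# Sample rows from the CO2 emissions dataset
engine_size = [2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
co2 = [196, 221, 136, 255, 244, 230, 232, 255, 267]

plt.scatter(engine_size, co2, color="blue")
plt.xlabel("Engine size")
plt.ylabel("CO2 emission")
plt.title("CO2 Emission vs. Engine Size")
plt.show()
```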
Inference from Scatter Plot
• As the Engine Size increases, so do the emissions.
• How do we use this line for prediction now?
• Let us assume, for a moment, that the line is a good fit of the data.
• We can use it to predict the emission of an unknown car.
Regression Modeling - Fitting Line
• The fitted line helps us predict the target value, Y, using the independent variable 'Engine Size', represented on the X axis.
• The fit line is traditionally shown as a polynomial.
• In a simple regression problem (a single x), the form of the model is
$\hat{y} = \theta_0 + \theta_1 x_1$
where $\theta_0$ is the intercept and $\theta_1$ is the slope of the line.
• $\hat{y}$ is the dependent variable, or the predicted value, and $x_1$ is the independent variable.
• $\theta_0$ and $\theta_1$ are the coefficients of the linear equation.
Model Error
• If we have, for instance, a car with engine size x1 = 5.4 and actual CO2 = 250,
• its CO2 should be predicted very close to the actual value, y = 250, based on the historical data.
• But if we use the fit line, it will return ŷ = 340.
• Comparing the actual value with what we predicted using our model, we find that we have a 90-unit error.
• The prediction line is not perfectly accurate. This error is also called the residual error: Error = ŷ − y = 340 − 250 = 90.
Mean Absolute Error
For every data point we take the residual, using only its absolute value so that negative and positive residuals do not cancel out, and then take the average of all these residuals:
$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$
Mean Squared Error
• The mean squared error (MSE) is just like the MAE, but it squares each difference before averaging, instead of using the absolute value. We can see this difference in the equation below:
$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
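A small NumPy sketch of both metrics as defined above (the function names are illustrative):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute residual."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean Squared Error: average squared residual."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)
```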
R2 Score
• Statistical measure that represents the proportion of the variance for a dependent variable that's
explained by an independent variable or variables in a regression model.
• If the R2 of a model is 0.50, then approximately half of the observed variation can be explained by
the model's inputs.
• Formula for R-squared:
$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$
Adjusted R2 Score
• R-squared will always increase when a new predictor variable is added to the regression model.
• So a regression model with a large number of predictor variables can have a high R-squared value even if the model doesn't fit the data well.
• Formula for adjusted R-squared:
$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$
where:
R2: the R2 of the model
n: the number of observations
k: the number of predictor variables
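scikit-learn provides r2_score but no built-in adjusted R²; a minimal helper following the formula above might look like this (the function name is illustrative):

```python
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, k):
    """Adjusted R^2 for a model with k predictor variables."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```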
Parameter Estimation
• The objective of linear regression is to minimize the MSE, and to minimize it we must find the best parameters, θ0 and θ1.
• How do we find θ0 and θ1 in such a way that the error is minimized?
We have two options here:
Option 1 - We can use a mathematical (closed-form) approach, or
Option 2 - We can use an optimization approach (e.g., gradient descent).
Mathematical Approach
• θ0 and θ1 (the intercept and slope of the line, also termed the beta parameters) are the coefficients of the fit line.
• We need to calculate the means of the independent and dependent (target) columns from the dataset.
• Notice: all of the data must be available.
• It can be shown that the intercept and slope can be calculated from these quantities.
• We can start off by estimating the value of θ1.
Parameter Estimation
$\hat{y} = \theta_0 + \theta_1 x_1$

$\theta_1 = \frac{\sum_{i=1}^{s} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{s} (x_i - \bar{x})^2}$

With $\bar{x} = 3.34$ and $\bar{y} = 256$:

$\theta_1 = \frac{(2.0 - 3.34)(196 - 256) + (2.4 - 3.34)(221 - 256) + \ldots}{(2.0 - 3.34)^2 + (2.4 - 3.34)^2 + \ldots} = 39$

$\theta_0 = \bar{y} - \theta_1 \bar{x} = 125.74$
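A NumPy sketch of this closed-form estimate. It uses only the nine sample rows shown earlier, while the slide's values (θ1 = 39, θ0 = 125.74) come from the full dataset, so the numbers will differ:

```python
import numpy as np

# ENGINESIZE (x) and CO2EMISSION (y) from the sample rows shown earlier
x = np.array([2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7])
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])

# Closed-form least-squares estimates for simple linear regression
x_bar, y_bar = x.mean(), y.mean()
theta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
theta0 = y_bar - theta1 * x_bar

print(f"theta0 = {theta0:.2f}, theta1 = {theta1:.2f}")
```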
Making Predictions
• We can write down the polynomial of the line:
$\hat{y} = 125.74 + 39 x_1$
• Making predictions is as simple as solving the equation for a specific set of inputs.
• Imagine we are predicting the CO2 emission (y) from the engine size (x) of the automobile in record number 9. Looking at the dataset, x1 = 2.4.
• Substituting x1 into the equation above, we can predict the CO2 emission of this specific car (row 9) with engine size 2.4:
$\hat{y} = 218.6$
Lab 1: Linear Regression on Car Emission data
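A minimal sketch of what this lab might look like with scikit-learn, reusing the sample rows from earlier (a real lab would load the full dataset with pandas):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

X = np.array([[2.0], [2.4], [1.5], [3.5], [3.5], [3.5], [3.5], [3.7], [3.7]])  # ENGINESIZE
y = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267])                    # CO2EMISSION

model = LinearRegression().fit(X, y)
print("intercept:", model.intercept_, "slope:", model.coef_[0])

y_hat = model.predict(X)
print("MAE:", mean_absolute_error(y, y_hat))
print("MSE:", mean_squared_error(y, y_hat))
print("R2 :", r2_score(y, y_hat))

# Predict the unknown car in row 9 (engine size 2.4)
print("row 9 prediction:", model.predict([[2.4]])[0])
```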
Bias-Variance Tradeoff
• Variance: error arising from model complexity and sensitivity to the training data,
e.g., highly flexible non-linear models.
• Bias: error arising from model imperfection (overly simple assumptions),
e.g., a linear model for complex cases.
Lasso & Ridge Regression
• Used to handle the bias-variance tradeoff.
• Ridge regression seeks to minimize the following:
$\text{MSE} + \lambda \sum_j \beta_j^2$
• Lasso regression seeks to minimize the following:
$\text{MSE} + \lambda \sum_j |\beta_j|$
• The second term is known as a shrinkage penalty.
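A short scikit-learn sketch on synthetic data; note that scikit-learn calls the λ parameter alpha (up to scaling conventions):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Toy data: 5 features, but only the first two actually matter
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)  # penalty on sum(beta_j^2)
lasso = Lasso(alpha=0.1).fit(X, y)  # penalty on sum(|beta_j|)

print("ridge coefs:", ridge.coef_.round(2))  # all shrunk towards zero
print("lasso coefs:", lasso.coef_.round(2))  # irrelevant coefs driven to exactly zero
```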
Regularization: An Overview
• The idea of regularization revolves around modifying the loss function L; in particular, we add a regularization term R that penalizes some specified properties of the model parameters:
$L_{reg} = L + \lambda R$
• where λ is a scalar that gives the weight (or importance) of the regularization term.
• Fitting the model using the modified loss function $L_{reg}$ would result in model parameters with desirable properties (specified by R).
LASSO Regression
• We wish to discourage extreme values in the model parameters.
• So we choose a regularization term that penalizes parameter magnitudes.
• For our loss function, we will again use the MSE.
• Together, our regularized loss function is:
$L_{reg} = \text{MSE} + \lambda \sum_j |\beta_j|$
• Note that $\sum_j |\beta_j| = \|\beta\|_1$ is the $\ell_1$ norm of the vector β.
Ridge Regression
• Alternatively, we can choose a regularization term that penalizes the squares of the parameter magnitudes. Then our regularized loss function is:
$L_{reg} = \text{MSE} + \lambda \sum_j \beta_j^2$
• Note that $\sum_j \beta_j^2 = \|\beta\|_2^2$ is the square of the $\ell_2$ norm of the vector β.
Choosing Lambda?
In both ridge and LASSO regression, the larger the regularization parameter λ, the more heavily we penalize large values in β:
• If λ is close to zero, we recover the MSE, i.e. ridge and LASSO regression reduce to ordinary regression.
• If λ is sufficiently large, the MSE term will be insignificant, and the regularization term will force $\beta_{ridge}$ and $\beta_{LASSO}$ to be close to zero.
To avoid ad-hoc choices, we should select λ using cross-validation, as in the sketch below.
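A sketch of cross-validated selection of λ with scikit-learn's RidgeCV and LassoCV (again, λ is called alpha), on the same kind of synthetic data as before:

```python
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

# Try a grid of candidate lambdas and keep the one with the best CV score
alphas = np.logspace(-3, 3, 13)
ridge = RidgeCV(alphas=alphas, cv=5).fit(X, y)
lasso = LassoCV(alphas=alphas, cv=5).fit(X, y)

print("best ridge alpha:", ridge.alpha_)
print("best lasso alpha:", lasso.alpha_)
```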
Lab 2: Lasso & Ridge Regression for House Price Estimation
Summary
In this session we have learned:
• What is meant by the term regression.
• The roles of the dependent and independent variables.
• A framework to formulate a linear regression model.
• An approach to obtain a solution using a mathematical (closed-form) method.
• How model complexity adds cost to the model.
• How to avoid overfitting using regularization.
• Realization of Lasso & Ridge models.
Quiz
Q1. What is the primary objective of linear regression analysis?
a) To classify data points into different categories
b) To predict a continuous outcome variable based on one or more predictor variables
c) To find the median value of a dataset
d) To calculate the mode of a dataset
Correct Answer: b) To predict a continuous outcome variable based on one or more predictor
variables
Quiz
Q2. In simple linear regression, how many predictor variables are used to predict the
outcome variable?
a) One
b) Two or more
c) None
d) It varies depending on the dataset
Correct Answer: a) One
Quiz
Q3. What is the goal of minimizing the sum of squared errors (SSE) in linear regression?
a) To maximize the R-squared value
b) To minimize multicollinearity among predictor variables
c) To find the best-fitting line that minimizes the difference between predicted and observed values
d) To maximize the p-value of the regression coefficients
Correct Answer: c) To find the best-fitting line that minimizes the difference between predicted and
observed values
Quiz
Q4. Ridge and Lasso Regression are techniques primarily used for:
a) Feature extraction
b) Dimensionality reduction
c) Regularization in linear regression
d) Classification
Correct Answer: c) Regularization in linear regression
References
• https://en.wikipedia.org/wiki/Linear_regression
• https://web.stanford.edu/~hastie/ElemStatLearn//printings/ESLII_print12.pdf
• https://www.coursera.org/learn/machine-learning
• https://scikit-learn.org/stable/modules/linear_model.html#ordinary-least-squares
• https://towardsdatascience.com/linear-regression-in-python-9a1f5f000606
• https://www.r-bloggers.com/2020/09/linear-regression-in-r/
• "Introduction to Linear Regression Analysis" by Douglas C. Montgomery, Elizabeth A. Peck,
and G. Geoffrey Vining
• "Applied Linear Regression" by Sanford Weisberg
Thank you...!