
Linear Regression

Objectives

- What is machine learning
- Types of data and terminology
- Types of machine learning
- Supervised learning
- Linear regression
- Least squares and gradient descent
- Hands-on: implementing linear regression from scratch


Machine Learning

- Machine Learning is the science of making computers learn from data, without
  explicitly programming them, and improve their learning over time in an
  autonomous fashion.

- This learning comes from feeding them data in the form of observations and
  real-world interactions.

- Machine Learning can also be defined as a tool to predict future events or
  values using past data.
Types of Data

- Based on values
  - Continuous data (e.g., age: 0-100)
  - Categorical data (e.g., gender: male/female)

- Based on pattern
  - Structured data (e.g., databases)
  - Unstructured data (e.g., audio, video, text)
Types of Data - continued

- Labelled data consists of input-output pairs: for every set of input
  features, the output/response/label is present in the dataset
  (e.g., images labelled as cat or dog photos):

  $\{(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)\}$

- Unlabelled data has no output/response/label for the input features
  (e.g., news articles, tweets, audio):

  $\{x_1, x_2, x_3, \ldots, x_n\}$
Types of Data - continued

- Training data: sample data points used to train the machine learning model.
- Test data: sample data points used to test the performance of the machine
  learning model.

Note: for modelling, the original dataset is typically partitioned into
training data and test data in a ratio of 70:30 or 75:25.
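The 70:30 partition described in the note can be sketched in plain Python. The function and parameter names here (`train_test_split`, `train_ratio`, `seed`) are illustrative choices, not from the slides:

```python
import random

def train_test_split(data, train_ratio=0.7, seed=42):
    """Shuffle indices and partition a dataset into training and test subsets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(len(data)))
    rng.shuffle(indices)
    cut = int(len(data) * train_ratio)  # e.g. 70% of the points go to training
    train = [data[i] for i in indices[:cut]]
    test = [data[i] for i in indices[cut:]]
    return train, test

points = list(range(10))
train, test = train_test_split(points)
print(len(train), len(test))  # 7 3
```

Shuffling before cutting matters: if the rows are sorted (for example by the target value), a straight head/tail split would give training and test sets with very different distributions.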
Types of Machine Learning

Machine Learning
- Supervised Learning: Regression, Classification
- Unsupervised Learning: Clustering, PCA
- Reinforcement Learning


Supervised Learning

- A class of machine learning that works on externally supplied instances in
  the form of predictor attributes and associated target values.

- The model learns from the training data using these target variables as
  reference variables.

- Example: a model to predict the resale value of a car based on its mileage,
  age, colour, etc.

- The target values are the 'correct answers' for the predictor model, which
  can be either a regression model or a classification model.
Motivation for Learning

- It is assumed that there exists a relationship/association between the input
  features and the target variable.

- The relationship can be observed by plotting a scatter plot of the two
  variables.

- The strength of the relationship can be quantified by calculating the
  correlation between the two variables:

$$\mathrm{corr}(x, y) = \frac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \,\sum_i (y_i - \bar{y})^2}}$$
Linear Regression

- Linear regression is a way to identify a relationship between two or more
  variables, and to use that relationship to predict the value of one variable
  for given value(s) of the other variable(s).

- Linear regression assumes the relationship between the variables can be
  modelled by a linear equation, i.e. the equation of a line:

  $y = w_0 + w_1 X$

  where $y$ is the dependent (regressed) variable, $X$ is the independent
  (regressor) variable, $w_0$ is the intercept, and $w_1$ is the slope.
Multiple Regression

- The last slide showed the linear regression model with one independent and
  one dependent variable.

- In the real world a data point has various important attributes, and these
  need to be catered for while developing a regression model (many independent
  variables, one dependent variable):

  $y = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \ldots + w_d x_d$


Regression - Problem Formulation

Suppose you are given the data:

  Age in Years (X) | Blood Pressure (Y)
  -----------------|-------------------
        56         |        147
        49         |        145
        72         |        160
        38         |        115
        63         |        130
        47         |        128

[Scatter plot: Blood Pressure (Y) vs Age in Years (X)]
Linear Regression

- For the given example, linear regression is modelled as:

  $\mathrm{BloodPressure}(y) = w_0 + w_1 \cdot \mathrm{AgeInYears}(X)$

  or

  $y = w_0 + w_1 X$ (equation of a line)

  with $w_0$ the intercept on the Y-axis and $w_1$ the slope of the line.

- Blood pressure: dependent variable
- Age in years: independent variable

Linear Regression - Best Fit Line

- Regression uses a line to show the trend of the distribution.

- There can be many lines that try to fit the data points in the scatter
  diagram.

- The aim is to find the best fit line.

[Scatter plot: Blood Pressure (Y) vs Age in Years (X), with candidate fit lines]
What is the Best Fit Line?

- The best fit line tries to explain the variance in the given data
  (it minimizes the total residual/error).
Linear Regression - Methods to Get the Best Fit Line

- Least squares
- Gradient descent
Linear Regression - Least Squares

Model: $Y = w_0 + w_1 X$

Task: estimate the values of $w_0$ and $w_1$.

According to the principle of least squares, the normal equations to solve for
$w_0$ and $w_1$ are:

$$\sum_{i=1}^{n} Y_i = n w_0 + w_1 \sum_{i=1}^{n} X_i \quad \ldots (1)$$

$$\sum_{i=1}^{n} X_i Y_i = w_0 \sum_{i=1}^{n} X_i + w_1 \sum_{i=1}^{n} X_i^2 \quad \ldots (2)$$
Linear Regression - Least Squares

Dividing equation (1) by n (the number of sample points) we get:

$$\frac{1}{n} \sum_{i=1}^{n} Y_i = w_0 + w_1 \frac{1}{n} \sum_{i=1}^{n} X_i$$

or

$$\bar{y} = w_0 + w_1 \bar{x} \quad \ldots (3)$$

So the line of regression always passes through the point $(\bar{x}, \bar{y})$.
Linear Regression - Least Squares

Now we know:

$$\mathrm{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\bar{y} \;\;\Rightarrow\;\; \frac{1}{n} \sum_{i=1}^{n} x_i y_i = \mathrm{cov}(x, y) + \bar{x}\bar{y} \quad \ldots (4)$$

and

$$\mathrm{var}(x) = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2 \quad \text{and} \quad \mathrm{var}(y) = \frac{1}{n} \sum_{i=1}^{n} y_i^2 - \bar{y}^2 \quad \ldots (5)$$

Dividing equation (2) by n and using equations (4) and (5) we get:

$$\mathrm{cov}(x, y) + \bar{x}\bar{y} = w_0 \bar{x} + w_1 \big(\mathrm{var}(x) + \bar{x}^2\big) \quad \ldots (6)$$
Linear Regression - Least Squares

Now, using the two equations

$$\bar{y} = w_0 + w_1 \bar{x}$$

and

$$\mathrm{cov}(x, y) + \bar{x}\bar{y} = w_0 \bar{x} + w_1 \big(\mathrm{var}(x) + \bar{x}^2\big)$$

we get:

$$w_1 = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} \quad \text{and} \quad w_0 = \bar{y} - w_1 \bar{x}$$
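The closed-form solution just derived, $w_1 = \mathrm{cov}(x,y)/\mathrm{var}(x)$ and $w_0 = \bar{y} - w_1\bar{x}$, can be implemented from scratch in Python and applied to the age/blood-pressure data from the earlier slide (the function name `fit_least_squares` is illustrative):

```python
def fit_least_squares(x, y):
    """Fit y = w0 + w1*x by the least-squares closed form."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # cov(x, y) = (1/n) * sum(x_i * y_i) - xbar * ybar   (equation 4)
    cov_xy = sum(xi * yi for xi, yi in zip(x, y)) / n - xbar * ybar
    # var(x) = (1/n) * sum(x_i^2) - xbar^2               (equation 5)
    var_x = sum(xi ** 2 for xi in x) / n - xbar ** 2
    w1 = cov_xy / var_x
    w0 = ybar - w1 * xbar
    return w0, w1

age = [56, 49, 72, 38, 63, 47]
bp = [147, 145, 160, 115, 130, 128]
w0, w1 = fit_least_squares(age, bp)
print(w0, w1)
```

A quick sanity check: by equation (3), the fitted line must pass through $(\bar{x}, \bar{y})$, i.e. `w0 + w1 * mean(age)` should equal `mean(bp)` up to rounding.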
Performance Metric for Least Squares Regression

$$R^2 = 1 - \frac{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$R^2_{\mathrm{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

where $\hat{y}_i$ is the predicted value, $n$ the number of samples, and $k$
the number of predictors.
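The two metrics above can be sketched directly from their formulas (function names are illustrative; the 1/n factors cancel in the $R^2$ ratio, so plain sums are used):

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot."""
    ybar = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Penalize R^2 for the number of predictors k given n samples."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

Perfect predictions give $R^2 = 1$; always predicting the mean $\bar{y}$ gives $R^2 = 0$. Adjusted $R^2$ is useful for multiple regression, where adding predictors can only increase the plain $R^2$.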
Linear Regression - Gradient Descent

Model: $Y = w_0 + w_1 X$

Task: estimate the values of $w_0$ and $w_1$.

Define the cost function:

$$\mathrm{cost}(w_0, w_1) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Objective of gradient descent:

$$\min_{w_0, w_1} \; \mathrm{cost}(w_0, w_1) = \min_{w_0, w_1} \; \frac{1}{n} \sum_{i=1}^{n} \big(y_i - (w_0 + w_1 x_i)\big)^2$$
Linear Regression - Gradient Descent

[Figure: convex plot of $\mathrm{cost}(w_0, w_1)$ against $w_0$, illustrating
the minimization objective]
Linear Regression - Gradient Descent

Gradient descent works in the following steps:

1. Initialize the parameters to some random values.

2. Calculate the gradient of the cost function w.r.t. the parameters.

3. Update the parameters using the gradient, in the opposite direction.

4. Repeat steps 2 and 3 for a fixed number of iterations, or until the cost
   reaches its minimum value.
Linear Regression - Gradient Descent

$$\mathrm{cost}(w_0, w_1) = \frac{1}{n} \sum_{i=1}^{n} \big(y_i - (w_0 + w_1 x_i)\big)^2$$

Calculating the gradients of the cost function:

$$\mathrm{grad}_{w_0} = \frac{\partial\, \mathrm{cost}(w_0, w_1)}{\partial w_0} = \frac{2}{n} \sum_{i=1}^{n} \big(y_i - (w_0 + w_1 x_i)\big)(-1)$$

$$\mathrm{grad}_{w_1} = \frac{\partial\, \mathrm{cost}(w_0, w_1)}{\partial w_1} = \frac{2}{n} \sum_{i=1}^{n} \big(y_i - (w_0 + w_1 x_i)\big)(-x_i)$$

Parameter update:

$$w_0 = w_0 - \mathrm{learning\_rate} \cdot \mathrm{grad}_{w_0}$$

$$w_1 = w_1 - \mathrm{learning\_rate} \cdot \mathrm{grad}_{w_1}$$
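The four steps and the gradients above can be combined into a from-scratch sketch. One deviation from step 1: the parameters are initialized to zero rather than random values so the run is reproducible; the default `learning_rate` and `steps` values are illustrative, not from the slides:

```python
def gradient_descent(x, y, learning_rate=0.05, steps=5000):
    n = len(x)
    w0, w1 = 0.0, 0.0  # step 1: initialize the parameters (zeros, for reproducibility)
    for _ in range(steps):  # step 4: repeat for a fixed number of iterations
        # step 2: gradients of cost(w0, w1) w.r.t. each parameter
        grad_w0 = (2 / n) * sum((yi - (w0 + w1 * xi)) * (-1) for xi, yi in zip(x, y))
        grad_w1 = (2 / n) * sum((yi - (w0 + w1 * xi)) * (-xi) for xi, yi in zip(x, y))
        # step 3: move in the direction opposite to the gradient
        w0 -= learning_rate * grad_w0
        w1 -= learning_rate * grad_w1
    return w0, w1

# Synthetic data on the line y = 1 + 2x; the estimates should approach w0 = 1, w1 = 2.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
w0, w1 = gradient_descent(x, y)
print(w0, w1)
```

If the learning rate is too large the updates overshoot and the cost diverges; too small and convergence is needlessly slow, which is why the rate is usually tuned per problem.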
Performance Metric for Gradient-Based Regression

Root mean square error (RMSE) is the standard deviation of the prediction
errors:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$$

Mean absolute error (MAE) is a measure of the difference between two
variables:

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{n}$$
Thank you!

Let's see the hands-on…
