Simple Linear Regression
Univariate Regression Technique
Deterministic Relationships
In a Deterministic Relationship, the equation exactly describes the relationship between the two variables.
Deterministic Relationship Example:
(Relationship between Fahrenheit & Celsius)
The relationship is defined as:
Fahrenheit = (9/5) × Celsius + 32
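As a small illustration of what "deterministic" means, the conversion can be written as a function. This is only a sketch; the function name is chosen here for illustration. The same input always gives exactly the same output, with no scatter.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Apply the deterministic rule Fahrenheit = (9/5) * Celsius + 32."""
    return celsius * 9 / 5 + 32


# Every Celsius value maps to exactly one Fahrenheit value.
for c in (0, 37, 100):
    print(c, "degrees C ->", celsius_to_fahrenheit(c), "degrees F")
```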
Some other examples of Deterministic Relationships:
• Circumference = π × diameter
• Hooke's Law: Y = α + βX
where Y = amount of stretch in a spring, and X = applied weight.
• Ohm's Law: I = V/r
where V = voltage applied, r = resistance, and I = current.
• Boyle's Law: For a constant temperature, P = α/V
where P = pressure, α = constant for each gas, and V = volume of gas.
Statistical Relationships
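In a Statistical Relationship, the relationship between two continuous numerical variables is not perfect: the equation describes the overall trend, and individual observations scatter around it.
Some examples of Statistical Relationships: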
• Height and weight — as height increases, you'd expect weight to increase, but not perfectly.
• Alcohol consumed and blood alcohol content — as alcohol consumption increases, you'd expect one's blood
alcohol content to increase, but not perfectly.
• Vital lung capacity and pack-years of smoking — as amount of smoking increases (as quantified by the
number of pack-years of smoking), you'd expect lung function (as quantified by vital lung capacity) to
decrease, but not perfectly.
• Driving speed and gas mileage — as driving speed increases, you'd expect gas mileage to decrease, but not
perfectly.
Linear Regression
In simple words, linear regression is predicting the value of a variable y (the dependent variable) based on
some variable X (the independent variable), provided there is a linear relationship between X and y.
Simple linear regression is a statistical method that allows us to summarize and study relationships
between two continuous (quantitative) variables.
This linear relationship between the two variables can be represented by a straight line (called the
regression line).
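A minimal sketch of fitting and using a regression line, assuming the standard least-squares formulas (slope = S_xy / S_xx, intercept = mean(y) − slope × mean(x)). The function name and the data points are made up for illustration and are not from the slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up illustrative data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
predicted = intercept + slope * 6.0  # predict y for a new value x = 6
print(f"y = {intercept:.3f} + {slope:.3f} * x, prediction at x = 6: {predicted:.3f}")
```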
[Figure: scatter plot in which there seems to be a linear relationship between the independent and the dependent variable]
[Figure: scatter plot with no linear relationship between the independent and the dependent variable]
Best Fitting Line
Sum of Squares of Residuals or Ordinary Least Squares
SS_residuals = Σ_{i=1}^{n} (y_i − ŷ_i)²
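The sum of squared residuals can be computed directly from its definition. The sketch below compares two candidate lines on a small made-up dataset. The helper name and the data are illustrative assumptions. Line B is the least-squares line for these four points, so its sum of squares is the smaller of the two.

```python
def ss_residuals(xs, ys, intercept, slope):
    """Sum over all points of (observed y - predicted y) squared."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 10.0]

# Line A: y-hat = 1 + 2x (hits the first three points exactly, misses the last).
ss_a = ss_residuals(xs, ys, intercept=1.0, slope=2.0)
# Line B: y-hat = 0.5 + 2.3x (the least-squares line for this data).
ss_b = ss_residuals(xs, ys, intercept=0.5, slope=2.3)

print("Line A:", ss_a)
print("Line B:", ss_b)
print("Better fit:", "B" if ss_b < ss_a else "A")
```

The best fitting line is the one with the smallest sum of squared residuals, even if another line passes exactly through more of the points.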
Residuals
Cost Function
Gradient Descent
Gradient Descent Function
Gradient Descent (Simplified)
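A rough sketch of gradient descent for simple linear regression, under the assumption that the cost function is the mean squared error, J = (1/n) Σ (ŷ_i − y_i)². The slides do not show the exact cost function, so this choice, the learning rate, the step count, and the data are all illustrative assumptions.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the mean squared error of y-hat = intercept + slope * x."""
    n = len(xs)
    intercept, slope = 0.0, 0.0  # arbitrary starting point
    for _ in range(steps):
        # Prediction error for every point under the current line.
        errors = [(intercept + slope * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the assumed MSE cost.
        grad_intercept = (2 / n) * sum(errors)
        grad_slope = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        # Step against the gradient to reduce the cost.
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 10.0]

gd_intercept, gd_slope = gradient_descent(xs, ys)
print(f"y = {gd_intercept:.4f} + {gd_slope:.4f} * x")
```

For this data the iterations settle close to the least-squares line y = 0.5 + 2.3x. A learning rate that is too large would make the updates diverge instead of converge, so the value used here is specific to this small example.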
Remember
Remove outliers from your dataset, since linear regression models are
sensitive to outliers: squaring the residuals gives large errors a disproportionate influence on the fitted line.
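To illustrate the effect, the sketch below fits a least-squares line to made-up data twice: once as is, and once with a single point replaced by an outlier. The function name and data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_mean - slope * x_mean, slope


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_clean = [2.0, 4.0, 6.0, 8.0, 10.0]    # exactly y = 2x
ys_outlier = [2.0, 4.0, 6.0, 8.0, 30.0]  # last point replaced by an outlier

_, slope_clean = fit_line(xs, ys_clean)
_, slope_outlier = fit_line(xs, ys_outlier)

print("Slope without outlier:", slope_clean)
print("Slope with outlier:   ", slope_outlier)
```

One extreme point out of five is enough to triple the fitted slope in this example, from 2 to 6.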