Multiple Linear Regression

Multiple linear regression (MLR) is a supervised machine learning algorithm that predicts the outcome of a dependent variable based on multiple independent variables. Key assumptions of MLR include linearity, independence, homoscedasticity, normality of errors, no multicollinearity, no autocorrelation, and fixed independent variables. The implementation of MLR in Python can be done using the Scikit-Learn library, requiring data preparation and loading of datasets.

Uploaded by

abidgub

Multiple linear regression

• Multiple linear regression in machine learning is a supervised algorithm
that models the relationship between a dependent variable and multiple
independent variables. This relationship is used to predict the outcome of
the dependent variable.
• Multiple linear regression is a type of linear regression in machine
learning. There are mainly two types of linear regression algorithms −
• Simple linear regression − it deals with two variables (one dependent
variable and one independent variable).
• Multiple linear regression − it deals with more than two variables (one
dependent variable and more than one independent variable).
What is Multiple Linear Regression?
• In machine learning, multiple linear regression (MLR) is a statistical
technique that is used to predict the outcome of a dependent variable
based on the values of multiple independent variables. The multiple linear
regression algorithm is trained on data to learn the relationship that best
fits the data: a regression line in the simple case, and a plane or
hyperplane when there is more than one independent variable.
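• Consider a dataset with n observations, p features (the independent
variables) and one response y (the dependent variable). The regression
line for p features can be written as follows −
y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + ... + b_p x_{ip} + e_i
where b_0 is the intercept, b_1, ..., b_p are the regression coefficients
and e_i is the residual error of the i-th observation.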
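The relationship described above, an intercept plus one coefficient per feature, can be estimated by ordinary least squares. The following is a minimal sketch using NumPy on synthetic data; the data, the "true" coefficients and the helper name predict are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

# Synthetic data (illustrative only): n = 200 observations, p = 3 features.
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
true_intercept = 2.0
true_coefs = np.array([1.5, -0.7, 3.0])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.1, size=n)

# Prepend a column of ones so the intercept is estimated with the slopes.
X_design = np.column_stack([np.ones(n), X])

# Ordinary least squares: minimise ||y - X_design @ b||^2.
b, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, coefs = b[0], b[1:]


def predict(x_row):
    """Evaluate intercept + sum of coefficient * feature for one observation."""
    return intercept + np.dot(coefs, x_row)


print("intercept:", round(float(intercept), 2))
print("coefficients:", np.round(coefs, 2))
print("prediction for first row:", round(float(predict(X[0])), 2))
```

With little noise and enough observations, the estimated values should land close to the ones used to generate the data.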
Assumptions of Multiple Linear Regression

The following are some assumptions about the dataset that are made by the
multiple linear regression model −
1. Linearity
The relationship between the dependent variable (target) and independent
(predictor) variables is linear.
2. Independence
Each observation is independent of others. The value of the dependent
variable for one observation is independent of the value of another.
3. Homoscedasticity
The variance of the residual errors is constant across all values of the
independent variables.
4. Normality of Errors
The residuals (errors) are normally distributed. The residuals are
differences between the actual and predicted values.
5. No Multicollinearity
The independent variables are not highly correlated with each other. Linear
regression models assume that there is very little or no multi-collinearity in
the data.
6. No Autocorrelation
There is no correlation between residuals. This ensures that the residuals
(errors) are independent of each other.
7. Fixed Independent Variables
The values of independent variables are fixed in all repeated samples.
Violations of these assumptions can lead to biased or inefficient estimates,
so it is essential to validate them before relying on the model's results.
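Some of these assumptions can be checked numerically. The sketch below, written with NumPy only, computes a variance inflation factor (VIF) per predictor as a multicollinearity check and the Durbin-Watson statistic as an autocorrelation check. The function names, the synthetic data and the thresholds mentioned in the comments are assumptions for illustration (common rules of thumb, not values from the slides).

```python
import numpy as np


def ols_residuals(X, y):
    """Fit OLS with an intercept and return the residuals y - y_hat."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta


def vif(X):
    """VIF per column: 1 / (1 - R^2), where R^2 comes from regressing
    that column on all the other columns."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        resid = ols_residuals(others, target)
        r2 = 1.0 - resid.var() / target.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


def durbin_watson(resid):
    """Values near 2 suggest little first-order autocorrelation."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)


# Synthetic data (illustrative only): x3 is nearly a copy of x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(scale=0.1, size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 - x2 + rng.normal(size=n)

vifs = vif(X)
dw = durbin_watson(ols_residuals(X, y))

# Rule of thumb (assumption): a VIF above roughly 5 to 10 flags collinearity.
print("VIF per predictor:", np.round(vifs, 1))
print("Durbin-Watson:", round(float(dw), 2))
```

Here the first and third predictors should show a large VIF because one is almost a copy of the other, while the second stays close to 1.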
Implementing Multiple Linear Regression in Python
• To implement multiple linear regression in Python using Scikit-Learn, we
can use the same LinearRegression class as in simple linear regression,
but this time we need to provide multiple independent variables as input.
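As a rough sketch of what this looks like, the example below fits scikit-learn's LinearRegression on a feature matrix with three columns. The data is synthetic and the coefficient values are assumptions chosen for illustration; it is not the data.csv dataset used in the steps that follow.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative only): 100 observations, 3 predictors.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 3))
y = 5.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.2, size=100)

# Same class as for simple linear regression; X simply has several columns.
model = LinearRegression()
model.fit(X, y)

print("intercept:", model.intercept_)
print("coefficients:", model.coef_)  # one coefficient per predictor
print("R^2 on training data:", model.score(X, y))

# Predict for one new observation (shape must be 2D: rows x predictors).
new_obs = np.array([[1.0, 2.0, 3.0]])
prediction = model.predict(new_obs)
print("prediction:", prediction)
```

The only difference from the simple case is the shape of X: one column per independent variable.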
Step 1: Data Preparation
• We use a dataset named data.csv with 50 examples. It contains four
predictor (independent) variables and a target (dependent) variable. The
following table represents the data in the data.csv file.
data.csv
• You can create a CSV file and store the above data points in it.
• Our dataset is stored in the data.csv file. We will use it to understand
the implementation of multiple linear regression in Python.
• We need to import libraries before loading the dataset.
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Load the dataset
• We load our dataset as a pandas DataFrame named dataset. Now let's
create a list of independent variables (predictors) and put them in a
variable called X.
• The independent variables are 'R&D Spend', 'Administration' and
'Marketing Spend'. We are not using the independent variable 'State' for
the sake of simplicity.
• We put the dependent variable values in a variable y.
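The loading step might look like the sketch below. To keep it runnable without the file, it reads a small in-memory stand-in whose rows are hypothetical placeholders, not the real contents of data.csv; the column order and the choice to keep y as a one-column DataFrame are also assumptions.

```python
import io

import pandas as pd

# Stand-in for data.csv. These rows are hypothetical placeholders, NOT the
# real dataset, so the values differ from those in the actual file.
sample_csv = io.StringIO(
    "R&D Spend,Administration,Marketing Spend,State,Profit\n"
    "1000.0,500.0,2000.0,StateA,3000.0\n"
    "1100.0,450.0,2100.0,StateB,3200.0\n"
    "900.0,520.0,1900.0,StateA,2800.0\n"
)

# With the real file this would be: dataset = pd.read_csv("data.csv")
dataset = pd.read_csv(sample_csv)

# Predictors: the three numeric columns; 'State' is left out for simplicity.
feature_cols = ["R&D Spend", "Administration", "Marketing Spend"]
X = dataset[feature_cols]

# Target: kept as a one-column DataFrame (an assumption).
y = dataset[["Profit"]]

print(X.shape, y.shape)
```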
y.head()
Output
Profit
0 192261.83
1 191792.06
2 191050.39
3 182901.99
4 166187.94
