MULTIVARIATE LINEAR REGRESSION
USING PYTHON
THE DATA SET (with missing data)
area   bedrooms    age   prices
1500   3           10    50000
2000   4           25    45000
1500   5           1     45000
1700   (missing)   6     35000
1995   5           5     45000
2500   6           2     75000
3200   5           20    60000
3350   4           20    48000
4100   6           3     90000
4600   5           6     100000
5000   7           15    85000
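Price = a + b1*area + b2*bedrooms + b3*age
General form: Y = a + b1*X1 + b2*X2 + b3*X3
a: intercept
b1, b2, b3: coefficients
area, bedrooms, age (X1, X2, X3): independent variables, called features in machine learning
Price (Y): dependent variable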
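The formula can be sketched as a small plain-Python function. The function name `predict_price` and the numbers passed to it are hypothetical and chosen only to make the arithmetic easy to follow; they are not values fitted from the data set.

```python
def predict_price(intercept, coefficients, features):
    """Evaluate Y = a + b1*X1 + b2*X2 + b3*X3 for one observation."""
    # Multiply each coefficient by its feature value and add the intercept.
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Made-up intercept and coefficients, used only to show the calculation:
# 10 + 2*4 + 3*5 + (-1)*6 = 27
print(predict_price(10.0, [2.0, 3.0, -1.0], [4, 5, 6]))  # 27.0
```

The order of `coefficients` must match the order of `features`, just as b1, b2 and b3 pair with area, bedrooms and age.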
AGENDA
1. Missing Data Handling
2. Linear Multivariate Regression
Find answers for
1. 4000 sq. feet area, 3 bedrooms, 5 years old
2. 7000 sq. feet area, 5 bedrooms, 10 years old
#Importing the necessary libraries and the dataset
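A minimal sketch of this step, assuming pandas and NumPy (the slides do not name the libraries). The data file name is not given, so the table is typed in directly. It is also an assumption that the blank cell in the 1700 sq. feet row is `bedrooms`, which would make 6 that house's age.

```python
import numpy as np
import pandas as pd

# Table from the slides, entered by hand because the file name is unknown.
# Assumption: the empty cell in the 1700 row is `bedrooms`, stored as NaN.
df = pd.DataFrame({
    "area":     [1500, 2000, 1500, 1700, 1995, 2500, 3200, 3350, 4100, 4600, 5000],
    "bedrooms": [3, 4, 5, np.nan, 5, 6, 5, 4, 6, 5, 7],
    "age":      [10, 25, 1, 6, 5, 2, 20, 20, 3, 6, 15],
    "prices":   [50000, 45000, 45000, 35000, 45000, 75000,
                 60000, 48000, 90000, 100000, 85000],
})

print(df)
print(df.isna().sum())  # number of missing values in each column
```

If the data lives in a CSV file instead, `pd.read_csv` with that file's path would replace the hand-typed table.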
Missing value
Filling it with the column median is a reasonable default, because the median is not pulled by extreme values
#Taking the median value of bedrooms
#The math library is imported to convert the median to an integer
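One way this step might look, assuming pandas for the median and `math.floor` for the integer conversion (the slides only say that the math library is imported). The bedroom values are copied from the table, with the blank cell assumed to be the bedrooms entry.

```python
import math

import numpy as np
import pandas as pd

# Bedrooms column from the table; NaN marks the assumed missing entry.
bedrooms = pd.Series([3, 4, 5, np.nan, 5, 6, 5, 4, 6, 5, 7], name="bedrooms")

# Series.median() ignores NaN by default; floor turns the result into an int.
median_bedrooms = math.floor(bedrooms.median())
print(median_bedrooms)  # 5
```

Rounding down matters only when the median falls between two whole numbers; here the two middle values are both 5.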
#Filling the missing value in the dataset with the median
#Viewing the completed dataset
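A sketch of the fill step, assuming pandas `fillna`. Only two columns are rebuilt here to keep the example short, and the position of the missing cell is the same assumption as before (the bedrooms entry of the 1700 sq. feet row).

```python
import math

import numpy as np
import pandas as pd

# Two columns are enough to show the fill; NaN is the assumed missing cell.
df = pd.DataFrame({
    "area":     [1500, 2000, 1500, 1700, 1995, 2500, 3200, 3350, 4100, 4600, 5000],
    "bedrooms": [3, 4, 5, np.nan, 5, 6, 5, 4, 6, 5, 7],
})

median_bedrooms = math.floor(df["bedrooms"].median())

# Replace NaN with the median, then store the column as whole numbers.
df["bedrooms"] = df["bedrooms"].fillna(median_bedrooms).astype(int)
print(df)
```

Assigning the result back to the column avoids relying on in-place modification.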
Linear Regression
#fit() trains the model on the existing dataset
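A minimal training sketch, assuming scikit-learn's `LinearRegression` (the slides do not name the library). The table is retyped with the missing bedrooms cell already filled with 5, which rests on the assumption that the blank cell was the bedrooms value.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Completed table; the 4th bedrooms value (5) is the assumed median fill.
df = pd.DataFrame({
    "area":     [1500, 2000, 1500, 1700, 1995, 2500, 3200, 3350, 4100, 4600, 5000],
    "bedrooms": [3, 4, 5, 5, 5, 6, 5, 4, 6, 5, 7],
    "age":      [10, 25, 1, 6, 5, 2, 20, 20, 3, 6, 15],
    "prices":   [50000, 45000, 45000, 35000, 45000, 75000,
                 60000, 48000, 90000, 100000, 85000],
})

features = ["area", "bedrooms", "age"]
reg = LinearRegression()
reg.fit(df[features], df["prices"])  # learn the intercept and coefficients

# R^2 on the training data, as a quick sanity check of the fit.
r_squared = reg.score(df[features], df["prices"])
print(r_squared)
```

The score is computed on the same rows used for training, so it says nothing about how well the model generalizes to new houses.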
#Checking the coefficient value
#Checking the intercept value
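The fitted values could be inspected as follows, again assuming scikit-learn, where `coef_` holds b1, b2, b3 and `intercept_` holds a. The data is the completed table with the assumed median fill of 5 bedrooms.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Completed table; the 4th bedrooms value (5) is the assumed median fill.
df = pd.DataFrame({
    "area":     [1500, 2000, 1500, 1700, 1995, 2500, 3200, 3350, 4100, 4600, 5000],
    "bedrooms": [3, 4, 5, 5, 5, 6, 5, 4, 6, 5, 7],
    "age":      [10, 25, 1, 6, 5, 2, 20, 20, 3, 6, 15],
    "prices":   [50000, 45000, 45000, 35000, 45000, 75000,
                 60000, 48000, 90000, 100000, 85000],
})

features = ["area", "bedrooms", "age"]
reg = LinearRegression().fit(df[features], df["prices"])

# coef_ follows the column order used in fit(): area, bedrooms, age.
coefficients = dict(zip(features, reg.coef_))
intercept = float(reg.intercept_)

print(coefficients)  # b1, b2, b3 keyed by feature name
print(intercept)     # a
```

Pairing each coefficient with its column name guards against mixing up b1, b2 and b3 when they are plugged into the formula.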
FINDING ANSWERS
#Answer based on the input values
#Verifying the output by inserting it in the formula
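Both steps can be sketched together: the model's prediction for the two houses in question, and the same numbers recomputed from the formula Price = a + b1*area + b2*bedrooms + b3*age. This assumes scikit-learn and the completed table with the assumed median fill, so the printed prices depend on that assumption.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Completed table; the 4th bedrooms value (5) is the assumed median fill.
df = pd.DataFrame({
    "area":     [1500, 2000, 1500, 1700, 1995, 2500, 3200, 3350, 4100, 4600, 5000],
    "bedrooms": [3, 4, 5, 5, 5, 6, 5, 4, 6, 5, 7],
    "age":      [10, 25, 1, 6, 5, 2, 20, 20, 3, 6, 15],
    "prices":   [50000, 45000, 45000, 35000, 45000, 75000,
                 60000, 48000, 90000, 100000, 85000],
})

features = ["area", "bedrooms", "age"]
reg = LinearRegression().fit(df[features], df["prices"])

# The two houses from the "Find answers for" slide.
queries = pd.DataFrame([[4000, 3, 5], [7000, 5, 10]], columns=features)
predicted = reg.predict(queries)

# Manual check: a + b1*area + b2*bedrooms + b3*age for each row.
manual = reg.intercept_ + queries.to_numpy() @ reg.coef_

print(predicted)
print(manual)
print(np.allclose(predicted, manual))
```

The second house (7000 sq. feet) lies outside the range of areas in the training data, so its prediction is an extrapolation and should be read with caution.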
EXERCISE
For this exercise, use the file at [Link]. It contains hiring statistics for a firm:
each candidate's experience, written test score and personal interview score.
Based on these 3 factors, HR decides the salary. Given this data, build a
machine learning model for the HR department that can help them decide
salaries for future candidates. Use the model to predict salaries for the
following candidates:
▪ 2 years of experience, 9 test score, 6 interview score
▪ 12 years of experience, 10 test score, 10 interview score
ANSWERS
53713.86 and 93747.79