ML Assignment1 Linear Regression

The document analyzes a dataset of employees' years of experience and salary. It loads and inspects the data, draws a scatter plot of experience against salary, fits a linear regression model that predicts salary from experience, and plots the fitted line against the training and test sets. The key steps are splitting the data into train and test sets, fitting the model on the training set, and comparing the fitted line with both sets of points.

Uploaded by

Dishant kumar yadav mhakhariya


DISHANT KUMAR YADAV 2021BCS0136

#DISHANT KUMAR YADAV

import numpy as np
import pandas as pd

df = pd.read_csv('/content/sample_data/Salary_Data.csv')
df
#DISHANT KUMAR YADAV

    YearsExperience    Salary
0               1.1   39343.0
1               1.3   46205.0
2               1.5   37731.0
3               2.0   43525.0
4               2.2   39891.0
5               2.9   56642.0
6               3.0   60150.0
7               3.2   54445.0
8               3.2   64445.0
9               3.7   57189.0
10              3.9   63218.0
11              4.0   55794.0
12              4.0   56957.0
13              4.1   57081.0
14              4.5   61111.0
15              4.9   67938.0
16              5.1   66029.0
17              5.3   83088.0
18              5.9   81363.0
19              6.0   93940.0
20              6.8   91738.0
21              7.1   98273.0
22              7.9  101302.0
23              8.2  113812.0
24              8.7  109431.0
25              9.0  105582.0
26              9.5  116969.0
27              9.6  112635.0
28             10.3  122391.0
29             10.5  121872.0

#DISHANT KUMAR YADAV

import matplotlib.pyplot as plt

exp = df['YearsExperience']
sal = df['Salary']

plt.scatter(exp,sal)
plt.xlabel('Experience')
plt.ylabel('Salary')
#DISHANT KUMAR YADAV
Text(0, 0.5, 'Salary')

#DISHANT KUMAR YADAV

exp_np = exp.to_numpy()
sal_np = sal.to_numpy()

exp_np.shape, sal_np.shape
#DISHANT KUMAR YADAV

((30,), (30,))

#DISHANT KUMAR YADAV

from sklearn.linear_model import LinearRegression

# scikit-learn expects a 2-D (n_samples, n_features) array; -1 lets NumPy infer the row count
exp_2d = exp_np.reshape(-1, 1)
sklearn_model = LinearRegression().fit(exp_2d, sal_np)

sklearn_sal_predictions = sklearn_model.predict(exp_2d)
sklearn_sal_predictions.shape
#DISHANT KUMAR YADAV

(30,)
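For a single feature, the line that LinearRegression fits is the ordinary least-squares line, which has a closed form. The sketch below recomputes it in plain Python from the 30 rows listed earlier. The helper name `fit_line` is hypothetical and is not part of the original notebook; it is only meant to show what the fit does under the hood.

```python
# The 30 (YearsExperience, Salary) rows from the data listing above.
years = [1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
         3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
         6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5]
salary = [39343.0, 46205.0, 37731.0, 43525.0, 39891.0, 56642.0,
          60150.0, 54445.0, 64445.0, 57189.0, 63218.0, 55794.0,
          56957.0, 57081.0, 61111.0, 67938.0, 66029.0, 83088.0,
          81363.0, 93940.0, 91738.0, 98273.0, 101302.0, 113812.0,
          109431.0, 105582.0, 116969.0, 112635.0, 122391.0, 121872.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # the least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, salary)
first_prediction = intercept + slope * years[0]

print(f"slope:     {slope:.2f} salary units per year of experience")
print(f"intercept: {intercept:.2f}")
print(f"prediction for {years[0]} years: {first_prediction:.6f}")
```

In the notebook itself, the same two numbers should be available as `sklearn_model.coef_[0]` and `sklearn_model.intercept_` once the model has been fitted.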

#DISHANT KUMAR YADAV

exp = df['YearsExperience']
sal = df['Salary']

plt.scatter(exp,sal)
plt.xlabel('Experience')
plt.ylabel('Salary')

plt.scatter(exp,sklearn_sal_predictions )
#DISHANT KUMAR YADAV

<matplotlib.collections.PathCollection at 0x7c7e2822d360>

#DISHANT KUMAR YADAV
predictions_df = pd.DataFrame({'YearsExperience': exp, 'Salary':sal, 'Sklearn salary prediction':sklearn_sal_predictions})

predictions_df
#DISHANT KUMAR YADAV

    YearsExperience    Salary  Sklearn salary prediction
0               1.1   39343.0               36187.158752
1               1.3   46205.0               38077.151217
2               1.5   37731.0               39967.143681
3               2.0   43525.0               44692.124842
4               2.2   39891.0               46582.117306
5               2.9   56642.0               53197.090931
6               3.0   60150.0               54142.087163
7               3.2   54445.0               56032.079627
8               3.2   64445.0               56032.079627
9               3.7   57189.0               60757.060788
10              3.9   63218.0               62647.053252
11              4.0   55794.0               63592.049484
12              4.0   56957.0               63592.049484
13              4.1   57081.0               64537.045717
14              4.5   61111.0               68317.030645
15              4.9   67938.0               72097.015574
16              5.1   66029.0               73987.008038
17              5.3   83088.0               75877.000502
18              5.9   81363.0               81546.977895
19              6.0   93940.0               82491.974127
20              6.8   91738.0               90051.943985
21              7.1   98273.0               92886.932681
22              7.9  101302.0              100446.902538
23              8.2  113812.0              103281.891235
24              8.7  109431.0              108006.872395
25              9.0  105582.0              110841.861092
26              9.5  116969.0              115566.842252
27              9.6  112635.0              116511.838485
28             10.3  122391.0              123126.812110
29             10.5  121872.0              125016.804574
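The table lists actual and predicted salaries side by side, but it does not say how close they are overall. One possible way to summarize that is sketched below, using the two columns copied from the table. The function `regression_metrics` is a hypothetical helper, not something the original notebook defines.

```python
import math

# Actual and predicted salaries, copied from the table above.
salary = [39343.0, 46205.0, 37731.0, 43525.0, 39891.0, 56642.0,
          60150.0, 54445.0, 64445.0, 57189.0, 63218.0, 55794.0,
          56957.0, 57081.0, 61111.0, 67938.0, 66029.0, 83088.0,
          81363.0, 93940.0, 91738.0, 98273.0, 101302.0, 113812.0,
          109431.0, 105582.0, 116969.0, 112635.0, 122391.0, 121872.0]
predicted = [36187.158752, 38077.151217, 39967.143681, 44692.124842,
             46582.117306, 53197.090931, 54142.087163, 56032.079627,
             56032.079627, 60757.060788, 62647.053252, 63592.049484,
             63592.049484, 64537.045717, 68317.030645, 72097.015574,
             73987.008038, 75877.000502, 81546.977895, 82491.974127,
             90051.943985, 92886.932681, 100446.902538, 103281.891235,
             108006.872395, 110841.861092, 115566.842252, 116511.838485,
             123126.812110, 125016.804574]


def regression_metrics(actual, fitted):
    """Return (MAE, RMSE, R^2) for paired actual and fitted values."""
    n = len(actual)
    residuals = [a - f for a, f in zip(actual, fitted)]
    ss_res = sum(r * r for r in residuals)                 # unexplained variation
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total variation
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


mae, rmse, r2 = regression_metrics(salary, predicted)
print(f"MAE:  {mae:,.2f}")
print(f"RMSE: {rmse:,.2f}")
print(f"R^2:  {r2:.4f}")
```

Because the model was fitted on these same 30 rows, these are in-sample figures and are likely to look better than the error on unseen data.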

DISHANT KUMAR YADAV 2021BCS0136
# Step 1: Import the required python packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Step 2: Load the dataset

df = pd.read_csv('/content/sample_data/Salary_Data.csv')

# Step 3: Data analysis - a scatter plot shows how salary varies with experience.
exp = df['YearsExperience']
sal = df['Salary']

plt.scatter(exp, sal)
plt.xlabel('Experience')
plt.ylabel('Salary')
plt.title('Distribution of Experience vs. Salary')
plt.show()


# Step 4: Split the dataset into dependent/independent variables

X = df[['YearsExperience']]
y = df['Salary']

# Step 5: Split data into Train/Test sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 6: Train the regression model

regression_model = LinearRegression()
regression_model.fit(X_train, y_train)

LinearRegression()

# Step 7: Plot the training results
plt.scatter(X_train, y_train, color='blue')
plt.plot(X_train, regression_model.predict(X_train), color='red')
plt.xlabel('Experience')
plt.ylabel('Salary')
plt.title('Training Results: Experience vs. Salary')
plt.show()

# Step 8: Plot the test results

plt.scatter(X_test, y_test, color='blue')
plt.plot(X_train, regression_model.predict(X_train), color='red') # Same line as training for comparison
plt.xlabel('Experience')
plt.ylabel('Salary')
plt.title('Test Results: Experience vs. Salary')
plt.show()
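The plots show the fit visually, but the notebook stops before measuring it on the held-out rows. A minimal sketch of that extra step follows, assuming the same split settings as Step 5 (`test_size=0.2`, `random_state=42`). To stay self-contained it rebuilds the data from the 30 listed rows as NumPy arrays instead of reading the CSV, and the choice of R^2 and mean absolute error as metrics is an assumption, not something the original specifies.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# The 30 (YearsExperience, Salary) rows from the data listing above.
years = np.array([1.1, 1.3, 1.5, 2.0, 2.2, 2.9, 3.0, 3.2, 3.2, 3.7,
                  3.9, 4.0, 4.0, 4.1, 4.5, 4.9, 5.1, 5.3, 5.9, 6.0,
                  6.8, 7.1, 7.9, 8.2, 8.7, 9.0, 9.5, 9.6, 10.3, 10.5])
salary = np.array([39343.0, 46205.0, 37731.0, 43525.0, 39891.0, 56642.0,
                   60150.0, 54445.0, 64445.0, 57189.0, 63218.0, 55794.0,
                   56957.0, 57081.0, 61111.0, 67938.0, 66029.0, 83088.0,
                   81363.0, 93940.0, 91738.0, 98273.0, 101302.0, 113812.0,
                   109431.0, 105582.0, 116969.0, 112635.0, 122391.0, 121872.0])

X = years.reshape(-1, 1)  # 2-D feature matrix, one column
X_train, X_test, y_train, y_test = train_test_split(
    X, salary, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)

# Predict only on rows the model has not seen during fitting.
y_pred = model.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_mae = mean_absolute_error(y_test, y_pred)

print(f"slope:     {model.coef_[0]:.2f}")
print(f"intercept: {model.intercept_:.2f}")
print(f"test R^2:  {test_r2:.4f}")
print(f"test MAE:  {test_mae:,.2f}")
```

With only six test rows, these scores can shift noticeably under a different `random_state`, so they are best read as a rough check and not as a precise estimate.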
