8/24/22, 5:37 PM Untitled2.ipynb - Colaboratory
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
house=pd.read_csv('https://github.com/ybifoundation/Dataset/raw/main/Boston.csv')
house
CRIM ZN INDUS CHAS NX RM AGE DIS RAD TAX PTRATIO B
0 0.00632 18.0 2.31 0 0.538 6.575 65.2 4.0900 1 296.0 15.3 396.90
1 0.02731 0.0 7.07 0 0.469 6.421 78.9 4.9671 2 242.0 17.8 396.90
2 0.02729 0.0 7.07 0 0.469 7.185 61.1 4.9671 2 242.0 17.8 392.83
3 0.03237 0.0 2.18 0 0.458 6.998 45.8 6.0622 3 222.0 18.7 394.63
4 0.06905 0.0 2.18 0 0.458 7.147 54.2 6.0622 3 222.0 18.7 396.90
... ... ... ... ... ... ... ... ... ... ... ... ...
501 0.06263 0.0 11.93 0 0.573 6.593 69.1 2.4786 1 273.0 21.0 391.99
502 0.04527 0.0 11.93 0 0.573 6.120 76.7 2.2875 1 273.0 21.0 396.90
503 0.06076 0.0 11.93 0 0.573 6.976 91.0 2.1675 1 273.0 21.0 396.90
504 0.10959 0.0 11.93 0 0.573 6.794 89.3 2.3889 1 273.0 21.0 393.45
505 0.04741 0.0 11.93 0 0.573 6.030 80.8 2.5050 1 273.0 21.0 396.90
506 rows × 14 columns
house.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 506 entries, 0 to 505
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CRIM 506 non-null float64
1 ZN 506 non-null float64
2 INDUS 506 non-null float64
3 CHAS 506 non-null int64
4 NX 506 non-null float64
5 RM 506 non-null float64
6 AGE 506 non-null float64
7 DIS 506 non-null float64
8 RAD 506 non-null int64
9 TAX 506 non-null float64
10 PTRATIO 506 non-null float64
11 B 506 non-null float64
12 LSTAT 506 non-null float64
https://colab.research.google.com/drive/1_jbyw64Ie3tl-KwGi2sMHYm9vgPAofOY#scrollTo=PTA4-n2lvLHK&printMode=true 1/5
13 MEDV 506 non-null float64
dtypes: float64(12), int64(2)
memory usage: 55.5 KB
house.describe()
CRIM ZN INDUS CHAS NX RM AGE
count 506.000000 506.000000 506.000000 506.000000 506.000000 506.000000 506.000000
mean 3.613524 11.363636 11.136779 0.069170 0.554695 6.284634 68.574901
std 8.601545 23.322453 6.860353 0.253994 0.115878 0.702617 28.148861
min 0.006320 0.000000 0.460000 0.000000 0.385000 3.561000 2.900000
25% 0.082045 0.000000 5.190000 0.000000 0.449000 5.885500 45.025000
50% 0.256510 0.000000 9.690000 0.000000 0.538000 6.208500 77.500000
75% 3.677083 12.500000 18.100000 0.000000 0.624000 6.623500 94.075000
max 88.976200 100.000000 27.740000 1.000000 0.871000 8.780000 100.000000
house.isna().sum()
CRIM 0
ZN 0
INDUS 0
CHAS 0
NX 0
RM 0
AGE 0
DIS 0
RAD 0
TAX 0
PTRATIO 0
B 0
LSTAT 0
MEDV 0
dtype: int64
house.nunique()
CRIM 504
ZN 26
INDUS 76
CHAS 2
NX 81
RM 446
AGE 356
DIS 412
RAD 9
TAX 66
PTRATIO 46
B 357
LSTAT 455
MEDV 229
dtype: int64
sns.pairplot(house)
<seaborn.axisgrid.PairGrid at 0x7f47814b2a90>
house.columns
Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
'PTRATIO', 'B', 'LSTAT', 'MEDV'],
dtype='object')
y=house['MEDV']
y.shape
(506,)
X=house[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
'PTRATIO', 'B', 'LSTAT']]
X.shape
(506, 13)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=.30, random_state=2529)
X_train.shape, X_test.shape, y_train.shape, y_test.shape
((354, 13), (152, 13), (354,), (152,))
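The shapes above follow from how `train_test_split` handles a fractional `test_size`: 0.30 × 506 = 151.8, which is rounded up to 152 test rows, leaving 354 for training. A minimal sketch on a synthetic stand-in frame reproduces the same counts. The random data and the names `demo_X`, `demo_y`, `dX_train`, and so on are assumptions for illustration only, not the Boston data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as X and y above
# (506 rows, 13 feature columns); the values are random, not Boston data.
rng = np.random.default_rng(0)
demo_X = pd.DataFrame(rng.normal(size=(506, 13)))
demo_y = pd.Series(rng.normal(size=506))

# Same split arguments as in the notebook cell above.
dX_train, dX_test, dy_train, dy_test = train_test_split(
    demo_X, demo_y, test_size=0.30, random_state=2529)

# 0.30 * 506 = 151.8, which is rounded up to 152 test rows.
print(dX_train.shape, dX_test.shape, dy_train.shape, dy_test.shape)
```

Fixing `random_state` makes the split repeatable, so metrics computed later can be compared across runs.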
from sklearn.linear_model import LinearRegression
model=LinearRegression()
model.fit(X_train,y_train)
LinearRegression()
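After fitting, a `LinearRegression` object exposes the learned parameters as `intercept_` and `coef_`. The sketch below shows one way to read them alongside the column names. It uses a toy frame with a known relationship, so the names `toy_X`, `toy_y`, `toy_model`, and `coefs` and the data itself are assumptions for illustration, not results from the housing model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data with a known, noise-free relationship: target = 3*a - 2*b + 5.
rng = np.random.default_rng(0)
toy_X = pd.DataFrame({"a": rng.normal(size=100), "b": rng.normal(size=100)})
toy_y = 3 * toy_X["a"] - 2 * toy_X["b"] + 5

toy_model = LinearRegression().fit(toy_X, toy_y)

# Pair each coefficient with its column name so the output is readable.
coefs = pd.Series(toy_model.coef_, index=toy_X.columns)
print("intercept:", toy_model.intercept_)
print(coefs)
```

The same pattern should apply to the fitted `model` above, for example `pd.Series(model.coef_, index=X.columns)`.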
y_pred=model.predict(X_test)
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error
mean_absolute_error(y_test,y_pred)
2.9235694431044683
mean_absolute_percentage_error(y_test,y_pred)
0.1483112271197023
mean_squared_error(y_test,y_pred)
18.155788350662583
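To make the three scores above concrete, the sketch below computes each metric by hand with NumPy and checks it against the scikit-learn function. The arrays `y_true` and `y_hat` are made-up values chosen for easy arithmetic, not notebook output.

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error,
                             mean_absolute_percentage_error,
                             mean_squared_error)

# Made-up targets and predictions, for illustration only.
y_true = np.array([20.0, 25.0, 10.0, 40.0])
y_hat = np.array([22.0, 24.0, 13.0, 36.0])

errors = y_true - y_hat

mae = np.mean(np.abs(errors))                    # mean absolute error
mape = np.mean(np.abs(errors) / np.abs(y_true))  # a fraction, not a percent
mse = np.mean(errors ** 2)                       # mean squared error
rmse = np.sqrt(mse)                              # same units as the target

# The hand-computed values agree with the library functions.
print(mae, mean_absolute_error(y_true, y_hat))
print(mape, mean_absolute_percentage_error(y_true, y_hat))
print(mse, mean_squared_error(y_true, y_hat))
print(rmse)
```

Read this way, the notebook's MAPE of about 0.148 is a fraction (roughly 14.8%), and the square root of its MSE is roughly 4.26, in the same units as `MEDV`.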
sample = house.sample()
sample
CRIM ZN INDUS CHAS NX RM AGE DIS RAD TAX PTRATIO B L
144 2.77974 0.0 19.58 0 0.871 4.903 97.8 1.3459 5 403.0 14.7 396.9
X_new = sample.loc[:,X.columns]
X_new
CRIM ZN INDUS CHAS NX RM AGE DIS RAD TAX PTRATIO B L
144 2.77974 0.0 19.58 0 0.871 4.903 97.8 1.3459 5 403.0 14.7 396.9
model.predict(X_new)
array([7.64840244])
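A single prediction is easier to judge next to the actual target value of the sampled row. The sketch below shows that comparison on a toy frame. The frame `toy`, its columns `x1`, `x2`, and `target`, and the other names are hypothetical stand-ins for `house`, `X.columns`, and `MEDV`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy frame: target depends linearly on two features plus small noise.
rng = np.random.default_rng(1)
toy = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
toy["target"] = 2 * toy["x1"] + toy["x2"] + rng.normal(scale=0.1, size=50)

features = ["x1", "x2"]
toy_model = LinearRegression().fit(toy[features], toy["target"])

# Draw one row, select the feature columns in training order, and predict.
row = toy.sample(random_state=0)
predicted = toy_model.predict(row.loc[:, features])[0]

# Compare against the known target for that row.
actual = row["target"].iloc[0]
residual = actual - predicted
print(f"predicted={predicted:.3f} actual={actual:.3f} residual={residual:.3f}")
```

Because `house.sample()` draws from the full dataset, the sampled row may have been part of the training split, so this comparison is a sanity check rather than an out-of-sample evaluation.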