
KNN Algorithm Evaluation with UCI Dataset

The document implements the K-Nearest Neighbors algorithm on a car dataset from the UCI repository, using Pandas for data loading. It performs standard preprocessing steps: feature selection, label encoding, and train-test splitting. It then trains a KNN classifier, makes predictions on the test set, and evaluates the model's performance using accuracy, a confusion matrix, and a classification report. The effect of the K value on error is also analyzed by plotting the error rate for different K.

Uploaded by

Penchal Doggala

Machine Learning Lab

Assessment-3
CSE4020
Reg. No.: 17BCE0918
Name:[Link] REDDY
Exercise 3 : Implement K – Nearest Neighbour Algorithm and evaluate the
performance of your algorithm using any data set from UCI repository.

Note: You can use the Pandas API to load the data set. No other API can be
used in the implementation.
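For intuition before the full script, the decision rule behind a KNN classifier can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not part of the submitted code; `knn_predict` is a hypothetical helper:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k):
    # Euclidean distance from the query point x to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # indices of the k nearest neighbours
    nearest = np.argsort(dists)[:k]
    # majority vote among their labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# tiny synthetic example: two well-separated clusters
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([0.5, 0.5]), k=3))  # -> 0
print(knn_predict(X, y, np.array([5.5, 5.5]), k=3))  # -> 1
```

The script below relies on scikit-learn's KNeighborsClassifier, which implements the same rule with efficient neighbor search.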

Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
car_data = pd.read_csv("[Link]", encoding = "ISO-8859-1")


car_data.info()

car_data.head()

car_data.describe()

y = car_data['class']
y = y.to_frame()
y.head()

X = car_data[['buying','maint','doors','persons','lug_boot','safety']]
X.head()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=32)

combine = [y_train, y_test]
classmapping = {'unacc':0, 'acc':1, 'good':2, 'vgood':3}
for dt in combine:
    dt['class'] = car_data['class'].map(classmapping)
y_train.head()
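A side note on why the mapping above works even though it maps car_data['class'] (the full frame) rather than the split frames themselves: pandas aligns on the index during assignment, so each split picks up only the rows it contains. A small illustration (synthetic data, not from the original script):

```python
import pandas as pd

df = pd.DataFrame({'class': ['unacc', 'acc', 'vgood']})
subset = df.iloc[[0, 2]].copy()          # keep rows 0 and 2 only
mapping = {'unacc': 0, 'acc': 1, 'good': 2, 'vgood': 3}
# assigning a full-length Series: pandas aligns on the index,
# so subset only receives the values for rows 0 and 2
subset['class'] = df['class'].map(mapping)
print(subset['class'].tolist())  # -> [0, 3]
```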

combine = [X_train, X_test]
classmapping = {'high':0, 'low':1, 'med':2, 'vhigh':3}
for dt in combine:
    dt['buying'] = car_data['buying'].map(classmapping)
X_train.head()

combine = [X_train, X_test]
classmapping = {'high':0, 'low':1, 'med':2, 'vhigh':3}
for dt in combine:
    dt['maint'] = car_data['maint'].map(classmapping)
X_train.head()

combine = [X_train, X_test]
classmapping = {'big':0, 'small':1, 'med':2}
for dt in combine:
    dt['lug_boot'] = car_data['lug_boot'].map(classmapping)
X_train.head()

combine = [X_train, X_test]
classmapping = {'high':0, 'low':1, 'med':2, 'vhigh':3}
for dt in combine:
    dt['safety'] = car_data['safety'].map(classmapping)
X_train.head()

# 'doors' and 'persons' still hold strings ('2', '5more', 'more'),
# so they must be encoded as well before KNeighborsClassifier can fit
for dt in combine:
    dt['doors'] = car_data['doors'].map({'2':0, '3':1, '4':2, '5more':3})
    dt['persons'] = car_data['persons'].map({'2':0, '4':1, 'more':2})
X_train.head()

clf = KNeighborsClassifier(n_neighbors=37)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_pred

y_test.head()

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

accuracy_test_knn = round(clf.score(X_test, y_test)*100, 2)
accuracy_train_knn = round(clf.score(X_train, y_train)*100, 2)
accuracy_knn = round(accuracy_score(y_test, y_pred)*100, 2)
print('Training accuracy of KNN', accuracy_train_knn)
print('Testing accuracy of KNN', accuracy_test_knn)
print('Accuracy of KNN', accuracy_knn)
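The accuracy reported by accuracy_score relates directly to the confusion matrix: correct predictions sit on the diagonal, so accuracy is the trace divided by the total count. A quick sketch with made-up counts (illustrative only, not the actual output of this run):

```python
import numpy as np

# hypothetical 4-class confusion matrix (rows = true, columns = predicted)
cm = np.array([[350, 10,  0,  0],
               [ 12, 95,  5,  0],
               [  0,  4, 15,  2],
               [  0,  0,  3, 23]])
# accuracy is the share of predictions on the diagonal
accuracy = np.trace(cm) / cm.sum()
print(round(accuracy * 100, 2))  # -> 93.06
```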

cm = confusion_matrix(y_test, y_pred)
cm_df = pd.DataFrame(cm,
                     index = ['unacc','acc','good','vgood'],
                     columns = ['unacc','acc','good','vgood'])
plt.figure(figsize=(5.5,4))
sns.heatmap(cm_df, annot=True)
plt.title('KNN Accuracy:{0:.3f}'.format(accuracy_test_knn))
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.show()

error = []
for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    # y_test is a one-column DataFrame; flatten it so the element-wise
    # comparison with the 1-D predictions does not broadcast to a matrix
    error.append(np.mean(pred_i != y_test['class'].values))

plt.figure(figsize=(12,6))
plt.plot(range(1, 40), error, color='red', linestyle='dashed', marker='o',
         markerfacecolor='blue', markersize=10)
plt.title('Error Rate vs K Value')
plt.xlabel('K Value')
plt.ylabel('Mean Error')
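The elbow plot is usually read by eye, but the same error list can also yield the best K programmatically (illustrative values below, not the actual error list from this run):

```python
import numpy as np

# hypothetical error rates for K = 1..5
error = [0.12, 0.09, 0.07, 0.08, 0.10]
best_k = int(np.argmin(error)) + 1   # +1 because K starts at 1
print(best_k)  # -> 3
```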
Output:
