
Machine Learning Lab Manual (BCSL606)

R. L. JALAPPA INSTITUTE OF TECHNOLOGY


Department of CS&E (DATA SCIENCE)

MACHINE LEARNING LAB MANUAL (BCSL606)

(2022 SCHEME)

SEMESTER : VI
FACULTY INCHARGE : Dr. KARTHIK B U, ASSISTANT PROFESSOR, RLJIT
PROGRAMMER INCHARGE : Ms. NEHA C. SHEKAR, RLJIT

NAME
USN
Batch
Sem
Branch

Dr. Karthik B U, Department of CSE(Data Science) 1



DEPARTMENT OF CSE (DATA SCIENCE)


Vision:
To progress as a centre of excellence in Data Analytics and to develop competent professionals as data analysts and researchers.
Mission:
M1: To equip students with novel and intellectual skills for capability in the field of Data Science.
M2: To be dedicated to building professionals with a socially virtuous attitude by implementing innovative Teaching and Learning methods.

PROGRAMME EDUCATIONAL OBJECTIVES: (PEO)

PEO1: Graduates will have Prospective careers in the field of Data Science.
PEO2: Graduates will have good Leadership Qualities, Self Learning abilities and Zeal
for higher Studies and Research.
PEO3: Graduates will follow Ethical Practices and exhibit a high level of Professionalism by participating in and addressing Technical, Business and Environmental challenges.

PROGRAMME SPECIFIC OUTCOMES: (PSO)

PSO 1: Students will be able to solve real-life problems faced in society, industry and other areas by applying the skills of Data Science.
PSO 2: Students will have knowledge of Software, Hardware, Algorithms, Modelling, Networking and Application Development.
PSO 3: Students will have the ability to develop computational knowledge and project development skills using Data Science techniques.

List of Experiments
1. Develop a program to create histograms for all numerical features and analyze the distribution of each feature. Generate box plots for all numerical features and identify any outliers. Use the California Housing dataset.
2. Develop a program to compute the correlation matrix to understand the relationships between pairs of features. Visualize the correlation matrix using a heatmap to see which variables have strong positive/negative correlations. Create a pair plot to visualize pairwise relationships between features. Use the California Housing dataset.
3. Develop a program to implement Principal Component Analysis (PCA) for reducing the dimensionality of the Iris dataset from 4 features to 2.
4. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Find-S algorithm to output a description of the set of all hypotheses consistent with the training examples.
5. Develop a program to implement the k-Nearest Neighbour algorithm to classify 100 randomly generated values of x in the range [0,1]. Perform the following based on the dataset generated:
i. Label the first 50 points {x1, …, x50} as follows: if (xi ≤ 0.5), then xi ∊ Class1, else xi ∊ Class2.
ii. Classify the remaining points, x51, …, x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
6. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.
7. Develop a program to demonstrate the working of Linear Regression and Polynomial Regression. Use the Boston Housing dataset for Linear Regression and the Auto MPG dataset (for vehicle fuel efficiency prediction) for Polynomial Regression.
8. Develop a program to demonstrate the working of the decision tree algorithm. Use the Breast Cancer dataset for building the decision tree and apply this knowledge to classify a new sample.
9. Develop a program to implement the Naive Bayesian classifier considering the Olivetti Face dataset for training. Compute the accuracy of the classifier, considering a few test data sets.
10. Develop a program to implement k-means clustering using the Wisconsin Breast Cancer dataset and visualize the clustering result.


1. Develop a program to create histograms for all numerical features and analyze
the distribution of each feature. Generate box plots for all numerical features and
identify any outliers. Use California Housing dataset.

Program:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing

# Step 1: Load the California Housing dataset
data = fetch_california_housing(as_frame=True)
housing_df = data.frame

# Step 2: Create histograms for numerical features
numerical_features = housing_df.select_dtypes(include=[np.number]).columns

# Plot histograms
plt.figure(figsize=(15, 10))
for i, feature in enumerate(numerical_features):
    plt.subplot(3, 3, i + 1)
    sns.histplot(housing_df[feature], kde=True, bins=30, color='blue')
    plt.title(f'Distribution of {feature}')
plt.tight_layout()
plt.show()

# Step 3: Generate box plots for numerical features
plt.figure(figsize=(15, 10))
for i, feature in enumerate(numerical_features):
    plt.subplot(3, 3, i + 1)
    sns.boxplot(x=housing_df[feature], color='orange')
    plt.title(f'Box Plot of {feature}')
plt.tight_layout()
plt.show()

# Step 4: Identify outliers using the IQR method
print("Outliers Detection:")
outliers_summary = {}
for feature in numerical_features:
    Q1 = housing_df[feature].quantile(0.25)
    Q3 = housing_df[feature].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    outliers = housing_df[(housing_df[feature] < lower_bound) | (housing_df[feature] > upper_bound)]
    outliers_summary[feature] = len(outliers)
    print(f"{feature}: {len(outliers)} outliers")

# Optional: Print a summary of the dataset
print("\nDataset Summary:\n")
print(housing_df.describe())

Expected output: histograms (with KDE curves) and box plots for each numerical feature, followed by the per-feature outlier counts and the dataset summary.

2. Develop a program to compute the correlation matrix to understand the relationships between pairs of features. Visualize the correlation matrix using a heatmap to see which variables have strong positive/negative correlations. Create a pair plot to visualize pairwise relationships between features. Use the California Housing dataset.
Program:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing

# Step 1: Load the California Housing Dataset


california_data = fetch_california_housing(as_frame=True)
data = california_data.frame

# Step 2: Compute the correlation matrix


correlation_matrix = data.corr()

# Step 3: Visualize the correlation matrix using a heatmap


plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Matrix of California Housing Features')
plt.show()

# Step 4: Create a pair plot to visualize pairwise relationships


sns.pairplot(data, diag_kind='kde', plot_kws={'alpha': 0.5})
plt.suptitle('Pair Plot of California Housing Features', y=1.02)
plt.show()

Expected output: a correlation heatmap of the California Housing features and a pair plot of their pairwise relationships.

3. Develop a program to implement Principal Component Analysis (PCA) for reducing the dimensionality of the Iris dataset from 4 features to 2.

Program:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Load the Iris dataset
iris = load_iris()
data = iris.data
labels = iris.target
label_names = iris.target_names

# Convert to a DataFrame for better visualization
iris_df = pd.DataFrame(data, columns=iris.feature_names)

# Perform PCA to reduce dimensionality to 2
pca = PCA(n_components=2)
data_reduced = pca.fit_transform(data)

# Create a DataFrame for the reduced data
reduced_df = pd.DataFrame(data_reduced, columns=['Principal Component 1', 'Principal Component 2'])
reduced_df['Label'] = labels

# Plot the reduced data
plt.figure(figsize=(8, 6))
colors = ['r', 'g', 'b']
for i, label in enumerate(np.unique(labels)):
    plt.scatter(
        reduced_df[reduced_df['Label'] == label]['Principal Component 1'],
        reduced_df[reduced_df['Label'] == label]['Principal Component 2'],
        label=label_names[label],
        color=colors[i]
    )

plt.title('PCA on Iris Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend()
plt.grid()
plt.show()
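Not part of the lab statement, but a quick sanity check of how much information the 2-D projection keeps is to inspect the explained variance ratio after fitting (this sketch refits PCA so it runs on its own):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
pca = PCA(n_components=2)
data_reduced = pca.fit_transform(iris.data)

# Fraction of total variance captured by each principal component
ratio = pca.explained_variance_ratio_
print("Explained variance ratio:", ratio)
print("Total variance retained:", ratio.sum())
```

For the Iris dataset the first two components retain well over 90% of the total variance, which is what justifies plotting the data in only two dimensions.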

Expected output: a 2-D scatter plot of the Iris samples along the first two principal components, coloured by species.

4. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Find-S algorithm to output a description of the set of all
hypotheses consistent with the training examples.

Program:
import pandas as pd

def find_s_algorithm(file_path):
    data = pd.read_csv(file_path)
    print("Training data:")
    print(data)
    attributes = data.columns[:-1]
    class_label = data.columns[-1]
    hypothesis = None  # most specific hypothesis; taken from the first positive example
    for _, row in data.iterrows():
        if row[class_label] == 'Yes':
            if hypothesis is None:
                hypothesis = list(row[attributes])
            else:
                # Generalize any attribute that disagrees with this positive example.
                # Once an attribute is generalized to '?', it must stay '?'.
                for i, value in enumerate(row[attributes]):
                    if hypothesis[i] != value:
                        hypothesis[i] = '?'
    return hypothesis

file_path = 'training_data.csv'
hypothesis = find_s_algorithm(file_path)
print("\nThe final hypothesis is:", hypothesis)
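The program above expects a file named training_data.csv with the class label ('Yes'/'No') in the last column. The manual does not include this file; a minimal example with hypothetical EnjoySport-style attributes can be created using only the standard library:

```python
import csv

# Hypothetical training data: attribute columns first, class label last
rows = [
    ["Outlook", "Temperature", "Humidity", "Wind", "EnjoySport"],
    ["Sunny", "Warm", "Normal", "Strong", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "No"],
    ["Sunny", "Warm", "High", "Strong", "Yes"],
]

with open("training_data.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

For this data, the Find-S hypothesis is ['Sunny', 'Warm', '?', 'Strong']: the first positive example is taken as-is, the second positive example generalizes Humidity to '?', and the negative example is ignored.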

Expected output: the training data table followed by the final most-specific hypothesis consistent with the positive examples.

5. Develop a program to implement the k-Nearest Neighbour algorithm to classify 100 randomly generated values of x in the range [0,1]. Perform the following based on the dataset generated:
i. Label the first 50 points {x1, …, x50} as follows: if (xi ≤ 0.5), then xi ∊ Class1, else xi ∊ Class2.
ii. Classify the remaining points, x51, …, x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.

Program:
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter

data = np.random.rand(100)
labels = ["Class1" if x <= 0.5 else "Class2" for x in data[:50]]

def euclidean_distance(x1, x2):
    return abs(x1 - x2)

def knn_classifier(train_data, train_labels, test_point, k):
    distances = [(euclidean_distance(test_point, train_data[i]), train_labels[i]) for i in range(len(train_data))]
    distances.sort(key=lambda x: x[0])
    k_nearest_neighbors = distances[:k]
    k_nearest_labels = [label for _, label in k_nearest_neighbors]
    return Counter(k_nearest_labels).most_common(1)[0][0]

train_data = data[:50]
train_labels = labels
test_data = data[50:]
k_values = [1, 2, 3, 4, 5, 20, 30]

print("--- k-Nearest Neighbors Classification ---")
print("Training dataset: First 50 points labeled based on the rule (x <= 0.5 -> Class1, x > 0.5 -> Class2)")
print("Testing dataset: Remaining 50 points to be classified\n")

results = {}
for k in k_values:
    print(f"Results for k = {k}:")
    classified_labels = [knn_classifier(train_data, train_labels, test_point, k) for test_point in test_data]
    results[k] = classified_labels
    for i, label in enumerate(classified_labels, start=51):
        print(f"Point x{i} (value: {test_data[i - 51]:.4f}) is classified as {label}")
    print("\n")
print("Classification complete.\n")

for k in k_values:
    classified_labels = results[k]
    class1_points = [test_data[i] for i in range(len(test_data)) if classified_labels[i] == "Class1"]
    class2_points = [test_data[i] for i in range(len(test_data)) if classified_labels[i] == "Class2"]

    plt.figure(figsize=(10, 6))
    plt.scatter(train_data, [0] * len(train_data),
                c=["blue" if label == "Class1" else "red" for label in train_labels],
                label="Training Data", marker="o")
    plt.scatter(class1_points, [1] * len(class1_points), c="blue", label="Class1 (Test)", marker="x")
    plt.scatter(class2_points, [1] * len(class2_points), c="red", label="Class2 (Test)", marker="x")

    plt.title(f"k-NN Classification Results for k = {k}")
    plt.xlabel("Data Points")
    plt.ylabel("Classification Level")
    plt.legend()
    plt.grid(True)
    plt.show()
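Because the true label of every test point also follows the x ≤ 0.5 rule, the accuracy for each k can be checked directly. This sketch (not part of the lab statement) re-implements 1-D k-NN using only the standard library, with a fixed seed so the run is repeatable:

```python
import random
from collections import Counter

random.seed(0)  # fixed seed for a repeatable run
points = [random.random() for _ in range(100)]
train, test = points[:50], points[50:]
train_labels = ["Class1" if x <= 0.5 else "Class2" for x in train]

def knn_1d(x, k):
    # indices of the k nearest training points by absolute distance, majority vote
    nearest = sorted(range(len(train)), key=lambda i: abs(train[i] - x))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

true_labels = ["Class1" if x <= 0.5 else "Class2" for x in test]
for k in [1, 3, 5, 20, 30]:
    preds = [knn_1d(x, k) for x in test]
    acc = sum(p == t for p, t in zip(preds, true_labels)) / len(test)
    print(f"k={k}: accuracy={acc:.2f}")
```

Small k tracks the 0.5 threshold closely, so accuracy stays high; very large k (20, 30) blurs the decision boundary because distant points on the other side of the threshold get a vote.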


Expected output:

Results for k = 1:
Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class2
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1

Point x69 (value: 0.4996) is classified as Class2
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2


Results for k = 2:
Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class2
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class2
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1

Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Results for k = 3:
Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class2
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2

Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class1
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1

Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Results for k = 4:
Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class2
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class2
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2

Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Results for k = 5:
Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class1
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1

Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class1
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2

Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Results for k = 20:


Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class1
Point x56 (value: 0.8748) is classified as Class2
Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class1
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2

Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class1
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2
Point x91 (value: 0.7586) is classified as Class2
Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Results for k = 30:


Point x51 (value: 0.4153) is classified as Class1
Point x52 (value: 0.4013) is classified as Class1
Point x53 (value: 0.7557) is classified as Class2
Point x54 (value: 0.7940) is classified as Class2
Point x55 (value: 0.5026) is classified as Class2
Point x56 (value: 0.8748) is classified as Class2

Point x57 (value: 0.3481) is classified as Class1
Point x58 (value: 0.9093) is classified as Class2
Point x59 (value: 0.1492) is classified as Class1
Point x60 (value: 0.3134) is classified as Class1
Point x61 (value: 0.7004) is classified as Class2
Point x62 (value: 0.7487) is classified as Class2
Point x63 (value: 0.9910) is classified as Class2
Point x64 (value: 0.6144) is classified as Class2
Point x65 (value: 0.5635) is classified as Class2
Point x66 (value: 0.7532) is classified as Class2
Point x67 (value: 0.1218) is classified as Class1
Point x68 (value: 0.4368) is classified as Class1
Point x69 (value: 0.4996) is classified as Class2
Point x70 (value: 0.3800) is classified as Class1
Point x71 (value: 0.2020) is classified as Class1
Point x72 (value: 0.6080) is classified as Class2
Point x73 (value: 0.4145) is classified as Class1
Point x74 (value: 0.5218) is classified as Class2
Point x75 (value: 0.9779) is classified as Class2
Point x76 (value: 0.5877) is classified as Class2
Point x77 (value: 0.5822) is classified as Class2
Point x78 (value: 0.1519) is classified as Class1
Point x79 (value: 0.9352) is classified as Class2
Point x80 (value: 0.7384) is classified as Class2
Point x81 (value: 0.6791) is classified as Class2
Point x82 (value: 0.6254) is classified as Class2
Point x83 (value: 0.1475) is classified as Class1
Point x84 (value: 0.4246) is classified as Class1
Point x85 (value: 0.5727) is classified as Class2
Point x86 (value: 0.8374) is classified as Class2
Point x87 (value: 0.4870) is classified as Class2
Point x88 (value: 0.2064) is classified as Class1
Point x89 (value: 0.8695) is classified as Class2
Point x90 (value: 0.7228) is classified as Class2


Point x91 (value: 0.7586) is classified as Class2


Point x92 (value: 0.6018) is classified as Class2
Point x93 (value: 0.8754) is classified as Class2
Point x94 (value: 0.9174) is classified as Class2
Point x95 (value: 0.2164) is classified as Class1
Point x96 (value: 0.4388) is classified as Class1
Point x97 (value: 0.2288) is classified as Class1
Point x98 (value: 0.3473) is classified as Class1
Point x99 (value: 0.0073) is classified as Class1
Point x100 (value: 0.9073) is classified as Class2

Classification complete.
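The listings above come from a k-NN classifier over 100 uniform values in [0, 1], where the first 50 points are labelled Class1 if x <= 0.5 and Class2 otherwise. A minimal sketch of such a 1-D classifier (function and variable names here are illustrative, not taken from the manual's program):

```python
import numpy as np

def knn_classify_1d(train_x, train_labels, query, k):
    """Majority vote among the k nearest neighbours of a scalar query point."""
    distances = np.abs(train_x - query)
    nearest = np.argsort(distances)[:k]
    votes = [train_labels[i] for i in nearest]
    # Ties are rare here; max over the vote counts picks the majority label
    return max(set(votes), key=votes.count)

# The lab's setup: 50 labelled points in [0, 1], Class1 iff x <= 0.5
rng = np.random.default_rng(0)
train_x = rng.random(50)
train_labels = ["Class1" if x <= 0.5 else "Class2" for x in train_x]

print(knn_classify_1d(train_x, train_labels, 0.1, k=5))
print(knn_classify_1d(train_x, train_labels, 0.9, k=5))
```

Larger k (20, 30) smooths the decision near the 0.5 boundary, which is why borderline points such as x55 (0.5026) can flip class as k changes.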


6. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.

Program:
import numpy as np
import matplotlib.pyplot as plt

def gaussian_kernel(x, xi, tau):
    # Gaussian weighting: training points near the query x get weights close to 1
    return np.exp(-np.sum((x - xi) ** 2) / (2 * tau ** 2))

def locally_weighted_regression(x, X, y, tau):
    m = X.shape[0]
    weights = np.array([gaussian_kernel(x, X[i], tau) for i in range(m)])
    W = np.diag(weights)
    X_transpose_W = X.T @ W
    # Weighted normal equation, solved afresh for each query point
    theta = np.linalg.inv(X_transpose_W @ X) @ X_transpose_W @ y
    return x @ theta

np.random.seed(42)
X = np.linspace(0, 2 * np.pi, 100)
y = np.sin(X) + 0.1 * np.random.randn(100)
X_bias = np.c_[np.ones(X.shape), X]

x_test = np.linspace(0, 2 * np.pi, 200)
x_test_bias = np.c_[np.ones(x_test.shape), x_test]
tau = 0.5
y_pred = np.array([locally_weighted_regression(xi, X_bias, y, tau) for xi in x_test_bias])

plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='red', label='Training Data', alpha=0.7)
plt.plot(x_test, y_pred, color='blue', label=f'LWR Fit (tau={tau})', linewidth=2)
plt.xlabel('X', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.title('Locally Weighted Regression', fontsize=14)
plt.legend(fontsize=10)
plt.grid(alpha=0.3)
plt.show()
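The closed form the program solves inside locally_weighted_regression is the weighted least-squares normal equation, recomputed for every query point x:

```latex
\theta(x) = \left(X^{\top} W(x)\, X\right)^{-1} X^{\top} W(x)\, y,
\qquad
W(x)_{ii} = \exp\!\left(-\frac{(x - x_i)^2}{2\tau^2}\right)
```

Smaller tau narrows the kernel so the fit follows local structure closely; as tau grows the weights flatten and the result approaches ordinary linear regression.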


Expected output: a scatter plot of the noisy sine training data (red) with the smooth LWR fit curve (blue).


7. Develop a program to demonstrate the working of Linear Regression and Polynomial Regression. Use the Boston Housing Dataset for Linear Regression and the Auto MPG Dataset (for vehicle fuel-efficiency prediction) for Polynomial Regression.

Program:
import pandas as pd
# Load Boston Housing dataset
url = "https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv"
boston_df = pd.read_csv(url)

# Print column names


print("Available columns in the dataset:")
print(boston_df.columns.tolist())

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.metrics import mean_squared_error, r2_score
import warnings
warnings.filterwarnings('ignore')

# Part 1: Linear Regression with Boston Housing Dataset


print("Part 1: Linear Regression - Boston Housing Dataset")
print("-" * 50)

# Load Boston Housing dataset


url = "https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv"
boston_df = pd.read_csv(url)

# Features and target (using correct column names)


X_boston = boston_df.drop('medv', axis=1)


y_boston = boston_df['medv']

# Print dataset info


print("\nDataset Information:")
print(f"Number of samples: {len(X_boston)}")
print(f"Number of features: {len(X_boston.columns)}")
print("\nFeatures:")
for name in X_boston.columns:
    print(f"- {name}")

# Split the data
X_train_boston, X_test_boston, y_train_boston, y_test_boston = train_test_split(
    X_boston, y_boston, test_size=0.2, random_state=42)

# Scale the features


scaler = StandardScaler()
X_train_boston_scaled = scaler.fit_transform(X_train_boston)
X_test_boston_scaled = scaler.transform(X_test_boston)

# Train Linear Regression model


lr_model = LinearRegression()
lr_model.fit(X_train_boston_scaled, y_train_boston)

# Make predictions
y_pred_boston = lr_model.predict(X_test_boston_scaled)

# Calculate metrics
mse_boston = mean_squared_error(y_test_boston, y_pred_boston)
rmse_boston = np.sqrt(mse_boston)
r2_boston = r2_score(y_test_boston, y_pred_boston)

print("\nLinear Regression Results:")


print(f"Mean Squared Error: {mse_boston:.2f}")


print(f"Root Mean Squared Error: {rmse_boston:.2f}")


print(f"R² Score: {r2_boston:.2f}")

# Feature importance analysis
feature_importance = pd.DataFrame({'Feature': X_boston.columns,
                                   'Coefficient': lr_model.coef_})
feature_importance['Abs_Coefficient'] = abs(feature_importance['Coefficient'])
feature_importance = feature_importance.sort_values('Abs_Coefficient', ascending=False)

print("\nFeature Importance:")
print(feature_importance[['Feature', 'Coefficient']].to_string(index=False))

# Visualize feature importance


plt.figure(figsize=(12, 6))
plt.bar(feature_importance['Feature'], feature_importance['Coefficient'])
plt.xticks(rotation=45)
plt.title('Feature Importance in Boston Housing Price Prediction')
plt.xlabel('Features')
plt.ylabel('Coefficient Value')
plt.tight_layout()
plt.show()
# Plot actual vs predicted values
plt.figure(figsize=(10, 6))
plt.scatter(y_test_boston, y_pred_boston, alpha=0.5)
plt.plot([y_test_boston.min(), y_test_boston.max()],
         [y_test_boston.min(), y_test_boston.max()], 'r--', lw=2)
plt.xlabel('Actual Prices ($1000s)')
plt.ylabel('Predicted Prices ($1000s)')
plt.title('Actual vs Predicted Housing Prices')
plt.tight_layout()
plt.show()


# Part 2: Polynomial Regression with Auto MPG Dataset


print("\nPart 2: Polynomial Regression - Auto MPG Dataset")
print("-" * 50)
# Load Auto MPG dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
'Acceleration', 'Model Year', 'Origin', 'Car Name']
df = pd.read_csv(url, names=column_names, sep=r'\s+')  # delim_whitespace is deprecated in newer pandas
# Clean the data
df = df.replace('?', np.nan)
df = df.dropna()
df['Horsepower'] = df['Horsepower'].astype(float)
# Select features for polynomial regression
X_mpg = df[['Horsepower']].values
y_mpg = df['MPG'].values
# Scale features for polynomial regression
scaler_mpg = StandardScaler()
X_mpg_scaled = scaler_mpg.fit_transform(X_mpg)
# Split the data
X_train_mpg, X_test_mpg, y_train_mpg, y_test_mpg = train_test_split(
    X_mpg_scaled, y_mpg, test_size=0.2, random_state=42)

# Create and train models with different polynomial degrees
degrees = [1, 2, 3]
plt.figure(figsize=(15, 5))
for i, degree in enumerate(degrees, 1):
    poly_features = PolynomialFeatures(degree=degree)
    X_train_poly = poly_features.fit_transform(X_train_mpg)
    X_test_poly = poly_features.transform(X_test_mpg)
    poly_model = LinearRegression()
    poly_model.fit(X_train_poly, y_train_mpg)
    y_pred_poly = poly_model.predict(X_test_poly)
    mse_poly = mean_squared_error(y_test_mpg, y_pred_poly)
    rmse_poly = np.sqrt(mse_poly)
    r2_poly = r2_score(y_test_mpg, y_pred_poly)

    print(f"\nPolynomial Regression (degree {degree}) Results:")
    print(f"Mean Squared Error: {mse_poly:.2f}")
    print(f"Root Mean Squared Error: {rmse_poly:.2f}")
    print(f"R² Score: {r2_poly:.2f}")

    plt.subplot(1, 3, i)
    plt.scatter(X_test_mpg, y_test_mpg, color='blue', alpha=0.5, label='Actual')
    # Sort test points so the fitted curve plots as a smooth line
    X_sort = np.sort(X_test_mpg, axis=0)
    X_sort_poly = poly_features.transform(X_sort)
    y_sort_pred = poly_model.predict(X_sort_poly)
    plt.plot(X_sort, y_sort_pred, color='red', label='Predicted')
    plt.xlabel('Horsepower (scaled)')
    plt.ylabel('MPG')
    plt.title(f'Polynomial Regression (degree {degree})')
    plt.legend()

plt.tight_layout()
plt.show()

Expected output:
Available columns in the dataset:
['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad', 'tax', 'ptratio', 'b', 'lstat', 'medv']
Part 1: Linear Regression - Boston Housing Dataset
--------------------------------------------------
Dataset Information:
Number of samples: 506
Number of features: 13
Features:
- crim
- zn
- indus
- chas
- nox
- rm
- age
- dis


- rad
- tax
- ptratio
- b
- lstat

Linear Regression Results:


Mean Squared Error: 24.29
Root Mean Squared Error: 4.93
R² Score: 0.67

Feature Importance:
Feature Coefficient
lstat -3.611658
rm 3.145240
dis -3.081908
rad 2.251407
ptratio -2.037752
nox -2.022319
tax -1.767014
b 1.129568
crim -1.002135
chas 0.718738
zn 0.696269
indus 0.278065
age -0.176048


Part 2: Polynomial Regression - Auto MPG Dataset


--------------------------------------------------

Polynomial Regression (degree 1) Results:


Mean Squared Error: 22.15
Root Mean Squared Error: 4.71
R² Score: 0.57

Polynomial Regression (degree 2) Results:


Mean Squared Error: 18.42
Root Mean Squared Error: 4.29


R² Score: 0.64

Polynomial Regression (degree 3) Results:


Mean Squared Error: 18.46
Root Mean Squared Error: 4.30
R² Score: 0.64
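The degree-2 model outperforming degree 1 above reflects what PolynomialFeatures does: it appends powers of the input column before the ordinary linear fit. A quick check of the expansion on toy values:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Degree-2 expansion of a single feature column: output columns are [1, x, x^2]
X = np.array([[1.0], [2.0], [3.0]])
X_poly = PolynomialFeatures(degree=2).fit_transform(X)
print(X_poly)
```

The linear model then fits one coefficient per expanded column, which is why a "polynomial regression" is still solved with plain LinearRegression.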


8. Develop a program to demonstrate the working of the decision tree algorithm. Use the Breast Cancer Data set for building the decision tree and apply this knowledge to classify a new sample.

Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree

data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")
new_sample = np.array([X_test[0]])
prediction = clf.predict(new_sample)
prediction_class = "Benign" if prediction[0] == 1 else "Malignant"  # index into the 1-element prediction array
print(f"Predicted Class for the new sample: {prediction_class}")
plt.figure(figsize=(12,8))
tree.plot_tree(clf, filled=True, feature_names=data.feature_names,
class_names=data.target_names)
plt.title("Decision Tree - Breast Cancer Dataset")
plt.show()


Expected output:
Model Accuracy: 94.74%
Predicted Class for the new sample: Benign
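Beyond accuracy, the fitted tree can be inspected directly. A small follow-up sketch using standard scikit-learn attributes (the top-3 cut-off here is an arbitrary choice for illustration):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Refit the same tree as the lab program
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

print("Tree depth:", clf.get_depth())
print("Number of leaves:", clf.get_n_leaves())
# The three features the tree relied on most
top3 = np.argsort(clf.feature_importances_)[::-1][:3]
for i in top3:
    print(f"{data.feature_names[i]}: {clf.feature_importances_[i]:.3f}")
```

This makes the plotted tree easier to read: the features with the highest importance appear near the root.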


9. Develop a program to implement the Naive Bayesian classifier considering the Olivetti Face Data set for training. Compute the accuracy of the classifier, considering a few test data sets.

Program:
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt

data = fetch_olivetti_faces(shuffle=True, random_state=42)


X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)


print(f'Accuracy: {accuracy * 100:.2f}%')

print("\nClassification Report:")
print(classification_report(y_test, y_pred, zero_division=1))

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))

cross_val_accuracy = cross_val_score(gnb, X, y, cv=5, scoring='accuracy')


print(f'\nCross-validation accuracy: {cross_val_accuracy.mean() * 100:.2f}%')

fig, axes = plt.subplots(3, 5, figsize=(12, 8))
for ax, image, label, prediction in zip(axes.ravel(), X_test, y_test, y_pred):
    ax.imshow(image.reshape(64, 64), cmap=plt.cm.gray)
    ax.set_title(f"True: {label}, Pred: {prediction}")
    ax.axis('off')
plt.show()

Expected output:
downloading Olivetti faces from https://ndownloader.figshare.com/files/5976027 to
C:\Users\Dell\scikit_learn_data
Accuracy: 80.83%

Classification Report:
precision recall f1-score support

0 0.67 1.00 0.80 2


1 1.00 1.00 1.00 2
2 0.33 0.67 0.44 3
3 1.00 0.00 0.00 5
4 1.00 0.50 0.67 4
5 1.00 1.00 1.00 2
7 1.00 0.75 0.86 4
8 1.00 0.67 0.80 3
9 1.00 0.75 0.86 4
10 1.00 1.00 1.00 3
11 1.00 1.00 1.00 1
12 0.40 1.00 0.57 4
13 1.00 0.80 0.89 5
14 1.00 0.40 0.57 5
15 0.67 1.00 0.80 2
16 1.00 0.67 0.80 3
17 1.00 1.00 1.00 3
18 1.00 1.00 1.00 3


19 0.67 1.00 0.80 2


20 1.00 1.00 1.00 3
21 1.00 0.67 0.80 3
22 1.00 0.60 0.75 5
23 1.00 0.75 0.86 4
24 1.00 1.00 1.00 3
25 1.00 0.75 0.86 4
26 1.00 1.00 1.00 2
27 1.00 1.00 1.00 5
28 0.50 1.00 0.67 2
29 1.00 1.00 1.00 2
30 1.00 1.00 1.00 2
31 1.00 0.75 0.86 4
32 1.00 1.00 1.00 2
34 0.25 1.00 0.40 1
35 1.00 1.00 1.00 5
36 1.00 1.00 1.00 3
37 1.00 1.00 1.00 1
38 1.00 0.75 0.86 4
39 0.50 1.00 0.67 5

accuracy 0.81 120


macro avg 0.89 0.85 0.83 120
weighted avg 0.91 0.81 0.81 120

Confusion Matrix:
[[2 0 0 ... 0 0 0]
[0 2 0 ... 0 0 0]
[0 0 2 ... 0 0 1]
...
[0 0 0 ... 1 0 0]
[0 0 0 ... 0 3 0]
[0 0 0 ... 0 0 5]]


Cross-validation accuracy: 87.25%
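GaussianNB reaches this accuracy by fitting one Gaussian (mean and variance) per class per pixel and picking the class with the highest posterior. The idea in one dimension, on toy data rather than the face set:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Two well-separated 1-D classes; GaussianNB learns a mean/variance per class
X = np.array([[1.0], [1.2], [0.9], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])
gnb = GaussianNB().fit(X, y)

print("Per-class feature means:", gnb.theta_.ravel())
print("Predictions:", gnb.predict([[1.1], [5.1]]))
```

On the Olivetti faces the same computation runs independently over all 4096 pixels, which is the "naive" conditional-independence assumption.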


10. Develop a program to implement k-means clustering using the Wisconsin Breast Cancer data set and visualize the clustering result.

Program:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix, classification_report
data = load_breast_cancer()
X = data.data
y = data.target
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10)  # explicit n_init avoids a FutureWarning in newer scikit-learn
y_kmeans = kmeans.fit_predict(X_scaled)
print("Confusion Matrix:")
print(confusion_matrix(y, y_kmeans))
print("\nClassification Report:")
print(classification_report(y, y_kmeans))
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
df = pd.DataFrame(X_pca, columns=['PC1', 'PC2'])
df['Cluster'] = y_kmeans
df['True Label'] = y
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='Cluster', palette='Set1', s=100,
edgecolor='black', alpha=0.7)
plt.title('K-Means Clustering of Breast Cancer Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')


plt.legend(title="Cluster")
plt.show()
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='True Label', palette='coolwarm', s=100,
edgecolor='black', alpha=0.7)
plt.title('True Labels of Breast Cancer Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(title="True Label")
plt.show()
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='Cluster', palette='Set1', s=100,
edgecolor='black', alpha=0.7)
centers = pca.transform(kmeans.cluster_centers_)
plt.scatter(centers[:, 0], centers[:, 1], s=200, c='red', marker='X', label='Centroids')
plt.title('K-Means Clustering with Centroids')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(title="Cluster")
plt.show()
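K-Means cluster ids are arbitrary (cluster 0 need not mean "malignant"), so the confusion matrix and classification report above can look inverted from one run to another. One way to score the clustering fairly against the true labels is to remap each cluster to its majority class first; a sketch of that remapping (the alignment strategy is an addition for illustration, not part of the lab exercise):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
y = data.target
labels = KMeans(n_clusters=2, random_state=42, n_init=10).fit_predict(X)

# Relabel each cluster with the majority true class inside it
mapped = np.zeros_like(labels)
for c in np.unique(labels):
    mask = labels == c
    mapped[mask] = np.bincount(y[mask]).argmax()

print(f"Accuracy after label alignment: {accuracy_score(y, mapped):.3f}")
```

After alignment, accuracy becomes a meaningful summary of how well the two clusters match the benign/malignant split.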

Expected output: the confusion matrix and classification report, followed by three PCA scatter plots — K-Means clusters, true labels, and clusters with centroids marked.
