K. S.
School of Engineering and Management
Vision
To impart quality education in engineering and management to meet technological, business, and societal needs
through holistic education and research.
Mission
K. S. School of Engineering and Management shall,
Establish state-of-the-art infrastructure to facilitate effective dissemination of technical and managerial
knowledge.
Provide comprehensive educational experience through a combination of curricular and experiential
learning, strengthened by industry-institute interaction.
Pursue socially relevant research and disseminate knowledge.
Inculcate leadership skills and foster entrepreneurial spirit among students.
Department of Computer Science and Business Systems
Vision
To provide a competent learning ecosystem that develops an understanding of technology and business, producing
innovative, principled, and insightful leaders who meet societal demands.
Mission
To deliver high-quality education in the fields of technology and business through effective teaching-learning
practices and a conducive learning environment.
To create a center of excellence through collaborations with industries and various entities, addressing the
evolving demands of society.
To foster an environment that promotes innovation, multidisciplinary research, skill enhancement, and
entrepreneurship.
To uphold and advocate for elevated standards of professional ethics and transparency.
Program Specific Outcomes (PSOs)
PSO 1: Comprehend fundamental and advanced concepts within the core domains of Computer
Science to analyze, design, and implement optimal solutions for real-world challenges.
PSO 2: Grasp business principles and employ the latest technologies to address business
challenges effectively.
Program Educational Objectives (PEOs)
PEO 1: Pursue a prosperous professional journey in Information Technology and its related
industries.
PEO 2: Create innovative engineering solutions through continual learning, research, and
advanced problem-solving skills.
Machine Learning Lab Semester 6
Course Code BCSL606 CIE Marks 50
Teaching Hours/Week (L:T:P: S) 0:0:2:0 SEE Marks 50
Credits 01 Exam Hours 03
Examination type (SEE) Practical
Course objectives:
• To become familiar with data and visualize univariate, bivariate, and multivariate data using
statistical techniques and dimensionality reduction.
• To understand various machine learning algorithms such as similarity-based learning, regression, decision
trees, and clustering.
• To become familiar with learning theories and probability-based models, and to develop the skills required for
decision-making in dynamic environments.
Sl. No. Experiments
1 Develop a program to create histograms for all numerical features and analyze the distribution of each feature.
Generate box plots for all numerical features and identify any outliers. Use California Housing dataset.
Book 1: Chapter 2
2 Develop a program to compute the correlation matrix to understand the relationships between pairs of
features. Visualize the correlation matrix using a heatmap to know which variables have strong
positive/negative correlations. Create a pair plot to visualize pairwise relationships between features. Use
California Housing dataset.
Book 1: Chapter 2
3 Develop a program to implement Principal Component Analysis (PCA) for reducing the dimensionality of the
Iris dataset from 4 features to 2.
Book 1: Chapter 2
4 For a given set of training data examples stored in a .CSV file, implement and demonstrate the Find-S algorithm
to output a description of the set of all hypotheses consistent with the training examples.
Book 1: Chapter 3
5 Develop a program to implement the k-Nearest Neighbour algorithm to classify 100 randomly generated values
of x in the range [0,1]. Perform the following on the generated dataset:
Label the first 50 points {x1, …, x50} as follows: if (xi ≤ 0.5), then xi ∈ Class1, else xi ∈ Class2.
Classify the remaining points, x51, …, x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
Book 2: Chapter – 2
6 Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select
appropriate data set for your experiment and draw graphs
Book 1: Chapter – 4
7 Develop a program to demonstrate the working of Linear Regression and Polynomial Regression. Use Boston
Housing Dataset for Linear Regression and Auto MPG Dataset (for vehicle fuel efficiency prediction) for
Polynomial Regression.
Book 1: Chapter – 5
8 Develop a program to demonstrate the working of the decision tree algorithm. Use Breast Cancer Data set for
building the decision tree and apply this knowledge to classify a new sample.
Book 2: Chapter – 3
9 Develop a program to implement the Naive Bayesian classifier considering Olivetti Face Data set for
training. Compute the accuracy of the classifier, considering a few test data sets.
Book 2: Chapter – 4
10 Develop a program to implement k-means clustering using Wisconsin Breast Cancer data set and visualize the
clustering result.
Book 2: Chapter – 4
Course outcomes (Course Skill Set):
At the end of the course the student will be able to:
● Illustrate the principles of multivariate data and apply dimensionality reduction techniques.
● Demonstrate similarity-based learning methods and perform regression analysis.
● Develop decision trees for classification and regression problems, and Bayesian models for
probabilistic learning.
● Implement clustering algorithms to group unlabelled data.
Laboratory outcomes: The students should be able to
1. Implement and demonstrate ML algorithms.
2. Evaluate different algorithms.
Conduction of Practical Examination:
Experiment distribution
o For laboratories having only one part: Students are allowed to pick one experiment
from the lot with equal opportunity.
o For laboratories having PART A and PART B: Students are allowed to pick one
experiment from PART A and one experiment from PART B, with equal opportunity.
Change of experiment is allowed only once, and the marks allotted for the procedure part of
the changed experiment shall be made zero.
Marks Distribution (subject to change in accordance with university regulations)
a) For laboratories having only one part – Procedure + Execution + Viva-Voce:
15+70+15 = 100 Marks
b) For laboratories having PART A and PART B
i. Part A – Procedure + Execution + Viva = 6 + 28 + 6 = 40 Marks
ii. Part B – Procedure + Execution + Viva = 9 + 42 + 9 = 60 Marks
CONTENT LIST
SL.NO. EXPERIMENT NAME
1. Program 1: Develop a program to create histograms for all numerical features and analyze the distribution of each feature. Generate box plots for all numerical features and identify any outliers. Use California Housing dataset.
2. Program 2: Develop a program to compute the correlation matrix to understand the relationships between pairs of features. Visualize the correlation matrix using a heatmap to know which variables have strong positive/negative correlations. Create a pair plot to visualize pairwise relationships between features. Use California Housing dataset.
3. Program 3: Develop a program to implement Principal Component Analysis (PCA) for reducing the dimensionality of the Iris dataset from 4 features to 2.
4. Program 4: For a given set of training data examples stored in a .CSV file, implement and demonstrate the Find-S algorithm to output a description of the set of all hypotheses consistent with the training examples.
5. Program 5: Develop a program to implement the k-Nearest Neighbour algorithm to classify 100 randomly generated values of x in the range [0,1]: label the first 50 points {x1, …, x50} as Class1 if (xi ≤ 0.5) and Class2 otherwise, then classify the remaining points, x51, …, x100, using KNN for k = 1, 2, 3, 4, 5, 20, 30.
6. Program 6: Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.
7. Program 7: Develop a program to demonstrate the working of Linear Regression and Polynomial Regression. Use Boston Housing Dataset for Linear Regression and Auto MPG Dataset (for vehicle fuel efficiency prediction) for Polynomial Regression.
8. Program 8: Develop a program to demonstrate the working of the decision tree algorithm. Use Breast Cancer Data set for building the decision tree and apply this knowledge to classify a new sample.
9. Program 9: Develop a program to implement the Naive Bayesian classifier considering Olivetti Face Data set for training. Compute the accuracy of the classifier, considering a few test data sets.
10. Program 10: Develop a program to implement k-means clustering using Wisconsin Breast Cancer data set and visualize the clustering result.
11. Viva Questions
Experiment 1
1. Develop a program to create histograms for all numerical features and
analyze the distribution of each feature. Generate box plots for all numerical
features and identify any outliers. Use California Housing dataset.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
# Step 1: Load the California Housing dataset
data = fetch_california_housing(as_frame=True)
housing_df = data.frame
# Step 2: Create histograms for numerical features
numerical_features = housing_df.select_dtypes(include=[np.number]).columns
# Plot histograms
plt.figure(figsize=(15, 10))
for i, feature in enumerate(numerical_features):
    plt.subplot(3, 3, i + 1)
    sns.histplot(housing_df[feature], kde=True, bins=30, color='blue')
    plt.title(f'Distribution of {feature}')
plt.tight_layout()
plt.show()
# Step 3: Generate box plots for numerical features
plt.figure(figsize=(15, 10))
for i, feature in enumerate(numerical_features):
    plt.subplot(3, 3, i + 1)
    sns.boxplot(x=housing_df[feature], color='orange')
    plt.title(f'Box Plot of {feature}')
plt.tight_layout()
plt.show()
# Step 4: Identify outliers using the IQR method
print("Outliers Detection:")
outliers_summary = {}
for feature in numerical_features:
    Q1 = housing_df[feature].quantile(0.25)
    Q3 = housing_df[feature].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    outliers = housing_df[(housing_df[feature] < lower_bound) | (housing_df[feature] > upper_bound)]
    outliers_summary[feature] = len(outliers)
    print(f"{feature}: {len(outliers)} outliers")
# Optional: Print a summary of the dataset
print("\nDataset Summary:")
print(housing_df.describe())
output:
(Plots: histograms and box plots of all numerical features in the California Housing dataset.)
Outliers Detection:
MedHouseVal: 1071 outliers
Dataset Summary:
MedInc HouseAge AveRooms AveBedrms Population \
count 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000
mean 3.870671 28.639486 5.429000 1.096675 1425.476744
std 1.899822 12.585558 2.474173 0.473911 1132.462122
min 0.499900 1.000000 0.846154 0.333333 3.000000
25% 2.563400 18.000000 4.440716 1.006079 787.000000
50% 3.534800 29.000000 5.229129 1.048780 1166.000000
75% 4.743250 37.000000 6.052381 1.099526 1725.000000
max 15.000100 52.000000 141.909091 34.066667 35682.000000
AveOccup Latitude Longitude MedHouseVal
count 20640.000000 20640.000000 20640.000000 20640.000000
mean 3.070655 35.631861 -119.569704 2.068558
std 10.386050 2.135952 2.003532 1.153956
min 0.692308 32.540000 -124.350000 0.149990
25% 2.429741 33.930000 -121.800000 1.196000
50% 2.818116 34.260000 -118.490000 1.797000
75% 3.282261 37.710000 -118.010000 2.647250
max 1243.333333 41.950000 -114.310000 5.000010
2. Develop a program to compute the correlation matrix to understand the
relationships between pairs of features. Visualize the correlation matrix using a
heatmap to know which variables have strong positive/negative correlations.
Create a pair plot to visualize pairwise relationships between features. Use
California Housing dataset.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
# Step 1: Load the California Housing Dataset
california_data = fetch_california_housing(as_frame=True)
data = california_data.frame
# Step 2: Compute the correlation matrix
correlation_matrix = data.corr()
# Step 3: Visualize the correlation matrix using a heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f',
linewidths=0.5)
plt.title('Correlation Matrix of California Housing Features')
plt.show()
# Step 4: Create a pair plot to visualize pairwise relationships
sns.pairplot(data, diag_kind='kde', plot_kws={'alpha': 0.5})
plt.suptitle('Pair Plot of California Housing Features', y=1.02)
plt.show()
output:
(Plots: correlation heatmap and pair plot of the California Housing features.)
3. Develop a program to implement Principal Component Analysis (PCA) for reducing
the dimensionality of the Iris dataset from 4 features to 2.
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
# Load the Iris dataset
iris = load_iris()
data = iris.data
labels = iris.target
label_names = iris.target_names
# Convert to a DataFrame for better visualization
iris_df = pd.DataFrame(data, columns=iris.feature_names)
# Perform PCA to reduce dimensionality to 2
pca = PCA(n_components=2)
data_reduced = pca.fit_transform(data)
# Create a DataFrame for the reduced data
reduced_df = pd.DataFrame(data_reduced, columns=['Principal Component 1',
'Principal Component 2'])
reduced_df['Label'] = labels
# Plot the reduced data
plt.figure(figsize=(8, 6))
colors = ['r', 'g', 'b']
for i, label in enumerate(np.unique(labels)):
    plt.scatter(
        reduced_df[reduced_df['Label'] == label]['Principal Component 1'],
        reduced_df[reduced_df['Label'] == label]['Principal Component 2'],
        label=label_names[label],
        color=colors[i]
    )
plt.title('PCA on Iris Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend()
plt.grid()
plt.show()
output:
(Plot: Iris samples projected onto the first two principal components, colored by class.)
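As a quick check (not part of the prescribed program), the fraction of variance retained by the two components can be printed; the snippet below assumes the fitted pca object from the program above.

# How much of the original 4-feature variance survives the 2-D projection?
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Total variance retained:", pca.explained_variance_ratio_.sum())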
4. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Find-S algorithm to output a description of the set of all
hypotheses consistent with the training examples.
import pandas as pd
# Function to implement the Find-S Algorithm
def find_s_algorithm(file_path):
    # Read the dataset
    data = pd.read_csv(file_path)
    print("Training data:")
    print(data)

    # Separate attributes and class label
    attributes = data.columns[:-1]  # All columns except the last one
    class_label = data.columns[-1]  # Last column is the class label

    # No hypothesis yet; it is initialized from the first positive example
    hypothesis = None

    # Iterate through the dataset, generalizing over positive examples only
    for _, row in data.iterrows():
        # Only consider positive examples (class label == 'Yes')
        if row[class_label] == 'Yes':
            if hypothesis is None:
                # Initialize with the first positive example (most specific hypothesis)
                hypothesis = list(row[attributes])
            else:
                for i, value in enumerate(row[attributes]):
                    # Generalize any attribute that disagrees with this positive example
                    if hypothesis[i] != value:
                        hypothesis[i] = '?'
    return hypothesis
# Define file path
file_path = r"C:\Users\atme\Desktop\example ml\training_data.csv"
# Call the Find-S algorithm
hypothesis = find_s_algorithm(file_path)
# Print the final hypothesis
print("\nThe final hypothesis is:", hypothesis)
output:
Training data:
Outlook Temperature Humidity Windy PlayTennis
0 Sunny Hot High False No
1 Sunny Hot High True No
2 Overcast Hot High False Yes
3 Rain Cold High False Yes
4 Rain Cold High True No
5 Overcast Hot High True Yes
6 Sunny Hot High False No
The final hypothesis is: ['?', '?', 'High', '?']
5. Develop a program to implement the k-Nearest Neighbour algorithm to classify 100
randomly generated values of x in the range [0,1]. Perform the following on the
generated dataset: a. Label the first 50 points {x1, …, x50} as follows:
if (xi ≤ 0.5), then xi ∈ Class1, else xi ∈ Class2. b. Classify the remaining points,
x51, …, x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
import numpy as np
import matplotlib.pyplot as plt
from collections import Counter
# Generate random data
data = np.random.rand(100)
labels = ["Class1" if x <= 0.5 else "Class2" for x in data[:50]]
# Function to compute Euclidean distance
def euclidean_distance(x1, x2):
    return abs(x1 - x2)
# k-NN classifier function
def knn_classifier(train_data, train_labels, test_point, k):
    distances = [(euclidean_distance(test_point, train_data[i]), train_labels[i])
                 for i in range(len(train_data))]
    distances.sort(key=lambda x: x[0])  # Sort by distance
    k_nearest_neighbors = distances[:k]  # Get k nearest neighbors
    k_nearest_labels = [label for _, label in k_nearest_neighbors]  # Extract the labels
    return Counter(k_nearest_labels).most_common(1)[0][0]  # Return the most common label
# Prepare the training and test data
train_data = data[:50] # First 50 points for training
train_labels = labels # Corresponding labels for training
test_data = data[50:] # Remaining 50 points for testing
# Values of k to test
k_values = [1, 2, 3, 4, 5, 20, 30]
print("--- k-Nearest Neighbors Classification ---")
print("Training dataset: First 50 points labeled based on the rule (x <= 0.5 -> Class1,
x > 0.5 -> Class2)")
print("Testing dataset: Remaining 50 points to be classified\n")
# Store the results for each k value
results = {}
# Loop over different values of k
for k in k_values:
    print(f"Results for k = {k}:")
    classified_labels = [knn_classifier(train_data, train_labels, test_point, k)
                         for test_point in test_data]
    results[k] = classified_labels
    # Output the classification results
    for i, label in enumerate(classified_labels, start=51):  # Start index at 51 for test points
        print(f"Point x{i} (value: {test_data[i - 51]:.4f}) is classified as {label}")
    print("\n")
print("Classification complete.\n")
# Visualize the classification results for each k value
for k in k_values:
    classified_labels = results[k]
    class1_points = [test_data[i] for i in range(len(test_data)) if classified_labels[i] == "Class1"]
    class2_points = [test_data[i] for i in range(len(test_data)) if classified_labels[i] == "Class2"]
    plt.figure(figsize=(10, 6))
    plt.scatter(train_data, [0] * len(train_data),
                c=["blue" if label == "Class1" else "red" for label in train_labels],
                label="Training Data", marker="o")
    plt.scatter(class1_points, [1] * len(class1_points), c="blue", label="Class1 (Test)", marker="x")
    plt.scatter(class2_points, [1] * len(class2_points), c="red", label="Class2 (Test)", marker="x")
    plt.title(f"k-NN Classification Results for k = {k}")
    plt.xlabel("Data Points")
    plt.ylabel("Classification Level")
    plt.legend()
    plt.grid(True)
    plt.show()
output:
--- k-Nearest Neighbors Classification ---
Training dataset: First 50 points labeled based on the rule (x <= 0.5 -> Class1, x > 0.5 -> Class2)
Testing dataset: Remaining 50 points to be classified
Results for k = 1:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 2:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 3:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class1
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 4:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 5:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 20:
Point x51 (value: 0.4701) is classified as Class1
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class1
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class1
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Results for k = 30:
Point x51 (value: 0.4701) is classified as Class2
Point x52 (value: 0.2775) is classified as Class1
Point x53 (value: 0.2544) is classified as Class1
Point x54 (value: 0.4693) is classified as Class2
Point x55 (value: 0.6401) is classified as Class2
Point x56 (value: 0.8233) is classified as Class2
Point x57 (value: 0.6015) is classified as Class2
Point x58 (value: 0.7991) is classified as Class2
Point x59 (value: 0.0437) is classified as Class1
Point x60 (value: 0.2407) is classified as Class1
Point x61 (value: 0.5801) is classified as Class2
Point x62 (value: 0.4369) is classified as Class1
Point x63 (value: 0.3841) is classified as Class1
Point x64 (value: 0.3875) is classified as Class1
Point x65 (value: 0.3774) is classified as Class1
Point x66 (value: 0.7355) is classified as Class2
Point x67 (value: 0.9212) is classified as Class2
Point x68 (value: 0.9475) is classified as Class2
Point x69 (value: 0.1168) is classified as Class1
Point x70 (value: 0.1124) is classified as Class1
Point x71 (value: 0.0971) is classified as Class1
Point x72 (value: 0.4307) is classified as Class1
Point x73 (value: 0.9223) is classified as Class2
Point x74 (value: 0.7348) is classified as Class2
Point x75 (value: 0.3533) is classified as Class1
Point x76 (value: 0.8451) is classified as Class2
Point x77 (value: 0.5237) is classified as Class2
Point x78 (value: 0.7778) is classified as Class2
Point x79 (value: 0.4949) is classified as Class2
Point x80 (value: 0.1261) is classified as Class1
Point x81 (value: 0.5396) is classified as Class2
Point x82 (value: 0.2353) is classified as Class1
Point x83 (value: 0.5715) is classified as Class2
Point x84 (value: 0.5105) is classified as Class2
Point x85 (value: 0.5449) is classified as Class2
Point x86 (value: 0.8197) is classified as Class2
Point x87 (value: 0.2319) is classified as Class1
Point x88 (value: 0.5876) is classified as Class2
Point x89 (value: 0.8649) is classified as Class2
Point x90 (value: 0.3587) is classified as Class1
Point x91 (value: 0.0785) is classified as Class1
Point x92 (value: 0.8560) is classified as Class2
Point x93 (value: 0.8341) is classified as Class2
Point x94 (value: 0.0014) is classified as Class1
Point x95 (value: 0.0512) is classified as Class1
Point x96 (value: 0.4411) is classified as Class1
Point x97 (value: 0.7493) is classified as Class2
Point x98 (value: 0.6286) is classified as Class2
Point x99 (value: 0.0223) is classified as Class1
Point x100 (value: 0.3796) is classified as Class1
Classification complete.
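As an optional cross-check, not part of the prescribed program, scikit-learn's KNeighborsClassifier can be run on the same split. The sketch below reuses train_data, train_labels, test_data, and results from the program above, and sticks to odd k so tie-breaking differences do not matter.

from sklearn.neighbors import KNeighborsClassifier

# Compare the hand-written k-NN with scikit-learn (1-D features must be reshaped to 2-D).
for k in [1, 3, 5]:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(train_data.reshape(-1, 1), train_labels)
    sk_pred = knn.predict(test_data.reshape(-1, 1))
    agreement = np.mean(sk_pred == np.array(results[k]))
    print(f"k={k}: agreement with the manual k-NN = {agreement:.2%}")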
(Plots: k-NN classification results for each value of k.)
6. Implement the non-parametric Locally Weighted Regression algorithm in order
to fit data points. Select appropriate data set for your experiment and draw
graphs
import numpy as np
import matplotlib.pyplot as plt
def gaussian_kernel(x, xi, tau):
    return np.exp(-np.sum((x - xi) ** 2) / (2 * tau ** 2))

def locally_weighted_regression(x, X, y, tau):
    m = X.shape[0]
    weights = np.array([gaussian_kernel(x, X[i], tau) for i in range(m)])
    W = np.diag(weights)
    X_transpose_W = X.T @ W
    theta = np.linalg.inv(X_transpose_W @ X) @ X_transpose_W @ y
    return x @ theta
np.random.seed(42)
X = np.linspace(0, 2 * np.pi, 100)
y = np.sin(X) + 0.1 * np.random.randn(100)
X_bias = np.c_[np.ones(X.shape), X]
x_test = np.linspace(0, 2 * np.pi, 200)
x_test_bias = np.c_[np.ones(x_test.shape), x_test]
tau = 0.5
y_pred = np.array([locally_weighted_regression(xi, X_bias, y, tau) for xi in x_test_bias])
plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='red', label='Training Data', alpha=0.7)
plt.plot(x_test, y_pred, color='blue', label=f'LWR Fit (tau={tau})', linewidth=2)
plt.xlabel('X', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.title('Locally Weighted Regression', fontsize=14)
plt.legend(fontsize=10)
plt.grid(alpha=0.3)
plt.show()
output:
(Plot: Locally Weighted Regression fit over noisy sine-wave training data.)
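The bandwidth parameter tau controls how local the fit is: small tau tracks the data closely, while large tau approaches ordinary linear regression. As an optional extension (not part of the prescribed program), the sketch below overlays fits for a few tau values, reusing X, y, X_bias, x_test, and x_test_bias from the program above.

# Optional: visualize how the bandwidth tau changes the LWR fit.
plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='gray', alpha=0.5, label='Training Data')
for tau in [0.1, 0.5, 2.0]:
    preds = np.array([locally_weighted_regression(xi, X_bias, y, tau) for xi in x_test_bias])
    plt.plot(x_test, preds, linewidth=2, label=f'tau = {tau}')
plt.title('Effect of bandwidth tau on the LWR fit')
plt.legend()
plt.show()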
7. Develop a program to demonstrate the working of Linear Regression and
Polynomial Regression. Use Boston Housing Dataset for Linear Regression and
Auto MPG Dataset (for vehicle fuel efficiency prediction) for Polynomial
Regression.
Note: the Boston Housing dataset has been removed from recent scikit-learn releases, so the program below substitutes the California Housing dataset for the linear regression part.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error, r2_score
# Linear Regression for California Housing Dataset
def linear_regression_california():
    housing = fetch_california_housing(as_frame=True)
    X = housing.data[["AveRooms"]]  # Using only AveRooms as feature
    y = housing.target  # Median value of homes as target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Linear Regression model
    model = LinearRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # Plotting results
    plt.scatter(X_test, y_test, color="blue", label="Actual")
    plt.plot(X_test, y_pred, color="red", label="Predicted")
    plt.xlabel("Average number of rooms (AveRooms)")
    plt.ylabel("Median value of homes ($100,000)")
    plt.title("Linear Regression - California Housing Dataset")
    plt.legend()
    plt.show()

    # Print performance metrics
    print("Linear Regression - California Housing Dataset")
    print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
    print("R^2 Score:", r2_score(y_test, y_pred))
# Polynomial Regression for Auto MPG Dataset
def polynomial_regression_auto_mpg():
    # Load the Auto MPG dataset
    url = "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data"
    column_names = ["mpg", "cylinders", "displacement", "horsepower", "weight",
                    "acceleration", "model_year", "origin"]
    data = pd.read_csv(url, sep='\s+', names=column_names, na_values="?")
    data = data.dropna()  # Drop rows with missing values
    X = data["displacement"].values.reshape(-1, 1)  # Feature: displacement
    y = data["mpg"].values  # Target: mpg

    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Polynomial Regression model (degree 2)
    poly_model = make_pipeline(PolynomialFeatures(degree=2), StandardScaler(), LinearRegression())
    poly_model.fit(X_train, y_train)
    y_pred = poly_model.predict(X_test)
    # Plotting results
    plt.scatter(X_test, y_test, color="blue", label="Actual")
    plt.scatter(X_test, y_pred, color="red", label="Predicted")
    plt.xlabel("Displacement")
    plt.ylabel("Miles per gallon (mpg)")
    plt.title("Polynomial Regression - Auto MPG Dataset")
    plt.legend()
    plt.show()

    # Print performance metrics
    print("Polynomial Regression - Auto MPG Dataset")
    print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
    print("R^2 Score:", r2_score(y_test, y_pred))
if __name__ == "__main__":
    print("Demonstrating Linear Regression and Polynomial Regression\n")
    linear_regression_california()  # Call the linear regression function
    polynomial_regression_auto_mpg()  # Call the polynomial regression function
output:
Linear Regression - California Housing Dataset
Mean Squared Error: 1.2923314440807299
R^2 Score: 0.013795337532284901
Polynomial Regression - Auto MPG Dataset
Mean Squared Error: 0.7431490557205862
R^2 Score: 0.7505650609469626
(Plots: actual vs. predicted values for the linear and polynomial regression models.)
8. Develop a program to demonstrate the working of the decision tree algorithm. Use Breast
Cancer Data set for building the decision tree and apply this knowledge to classify a new
sample.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree
data = load_breast_cancer()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")
new_sample = np.array([X_test[0]])
prediction = clf.predict(new_sample)
prediction_class = "Benign" if prediction == 1 else "Malignant"
print(f"Predicted Class for the new sample:
{prediction_class}") plt.figure(figsize=(12,8))
tree.plot_tree(clf, filled=True, feature_names=data.feature_names,
class_names=data.target_names)
plt.title("Decision Tree - Breast Cancer Dataset")
plt.show()
output:
Model Accuracy: 94.74%
Predicted Class for the new sample: Benign
(Plot: visualization of the fitted decision tree.)
9. Develop a program to implement the Naive Bayesian classifier considering
Olivetti Face Data set for training. Compute the accuracy of the classifier,
considering a few test data sets.
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt
data = fetch_olivetti_faces(shuffle=True, random_state=42)
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
random_state=42)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')
print("\nClassification Report:")
print(classification_report(y_test, y_pred, zero_division=1))
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
cross_val_accuracy = cross_val_score(gnb, X, y, cv=5, scoring='accuracy')
print(f'\nCross-validation accuracy: {cross_val_accuracy.mean() * 100:.2f}%')

fig, axes = plt.subplots(3, 5, figsize=(12, 8))
for ax, image, label, prediction in zip(axes.ravel(), X_test, y_test, y_pred):
    ax.imshow(image.reshape(64, 64), cmap=plt.cm.gray)
    ax.set_title(f"True: {label}, Pred: {prediction}")
    ax.axis('off')
plt.show()
output:
Accuracy: 80.83%
Classification Report:
precision recall f1-score support
0 0.67 1.00 0.80 2
1 1.00 1.00 1.00 2
2 0.33 0.67 0.44 3
3 1.00 0.00 0.00 5
4 1.00 0.50 0.67 4
5 1.00 1.00 1.00 2
7 1.00 0.75 0.86 4
8 1.00 0.67 0.80 3
9 1.00 0.75 0.86 4
10 1.00 1.00 1.00 3
11 1.00 1.00 1.00 1
12 0.40 1.00 0.57 4
13 1.00 0.80 0.89 5
14 1.00 0.40 0.57 5
15 0.67 1.00 0.80 2
16 1.00 0.67 0.80 3
17 1.00 1.00 1.00 3
18 1.00 1.00 1.00 3
19 0.67 1.00 0.80 2
20 1.00 1.00 1.00 3
...
[0 0 0 ... 0 3 0]
[0 0 0 ... 0 0 5]]
Cross-validation accuracy: 87.25%
(Figure: grid of test face images annotated with true and predicted labels.)
MACHINE LEARNING LABORATORY BCSL606
10. Develop a program to implement k-means clustering using Wisconsin Breast
Cancer data set and visualize the clustering result.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix, classification_report
data = load_breast_cancer()
X = data.data
y = data.target
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
kmeans = KMeans(n_clusters=2, random_state=42)
y_kmeans = kmeans.fit_predict(X_scaled)
print("Confusion Matrix:")
print(confusion_matrix(y, y_kmeans))
print("\nClassification Report:")
print(classification_report(y, y_kmeans))
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
df = pd.DataFrame(X_pca, columns=['PC1', 'PC2'])
df['Cluster'] = y_kmeans
df['True Label'] = y
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='Cluster', palette='Set1', s=100,
edgecolor='black', alpha=0.7)
plt.title('K-Means Clustering of Breast Cancer Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(title="Cluster")
plt.show()
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='True Label', palette='coolwarm',
s=100, edgecolor='black', alpha=0.7)
plt.title('True Labels of Breast Cancer Dataset')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(title="True Label")
plt.show()
plt.figure(figsize=(8, 6))
sns.scatterplot(data=df, x='PC1', y='PC2', hue='Cluster', palette='Set1', s=100,
edgecolor='black', alpha=0.7)
centers = pca.transform(kmeans.cluster_centers_)
plt.scatter(centers[:, 0], centers[:, 1], s=200, c='red', marker='X', label='Centroids')
plt.title('K-Means Clustering with Centroids')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.legend(title="Cluster")
plt.show()
output:
Confusion Matrix:
[[175 37]
[ 13 344]]
Classification Report:
precision recall f1-score support
0 0.93 0.83 0.88 212
1 0.90 0.96 0.93 357
accuracy 0.91 569
macro avg 0.92 0.89 0.90 569
weighted avg 0.91 0.91 0.91 569
(Plots: K-Means clusters, true labels, and clusters with centroids in PCA space.)
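A caution on reading the confusion matrix above: K-Means assigns arbitrary cluster indices, so comparing y_kmeans directly with the true labels only works when the indices happen to line up. A minimal sketch (not part of the prescribed program) that remaps each cluster to its majority true label before scoring:

# Remap each cluster index to the majority true label within that cluster,
# making the confusion matrix invariant to the arbitrary cluster numbering.
mapped = np.zeros_like(y_kmeans)
for cluster in np.unique(y_kmeans):
    mask = y_kmeans == cluster
    mapped[mask] = np.bincount(y[mask]).argmax()
print(confusion_matrix(y, mapped))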
VIVA QUESTIONS
1. What is the difference between supervised and unsupervised machine learning?
Supervised learning requires labeled training data, whereas unsupervised learning does not
require the data to be labeled.
2. How is KNN different from K-means clustering?
KNN stands for K-Nearest Neighbours; it is a supervised classification algorithm.
K-means is an unsupervised clustering algorithm.
4. How do you handle missing or corrupted data in a dataset?
Missing or corrupted values can be handled either by dropping the affected rows or
columns, or by replacing them with another value. In Pandas, two very useful methods are
isnull(), which identifies missing data, and dropna(), which removes it; fillna() can be used
to replace missing values.
5. What is the difference between an array and a linked list?
An array is an ordered collection of objects stored contiguously and accessed by index. A
linked list is a series of nodes, each holding a reference to the next, so its elements are
processed in sequential order.
6. Explain why Naive Bayes is so naive.
It is based on the assumption that all of the features in the data set are equally important
and independent of one another, which rarely holds in practice.
7. State a few popular Machine Learning algorithms.
Nearest Neighbour
Neural Networks
Decision Trees
Support Vector Machines
8. What are the different types of algorithm techniques available in machine learning?
Some of them are:
Supervised learning
Unsupervised learning
Semi-supervised learning
Transduction
Learning to learn
9. What are the three stages to build the model in machine learning?
1. Model building
2. Model testing
3. Applying the model
10. Name a few libraries in Python used for Data Analysis and Scientific computations
NumPy, SciPy, Pandas, SciKit, Matplotlib, Seaborn
11. Which is the standard data missing marker used in Pandas?
NaN
12. Write the code to sort an array in NumPy by the nth column.
This can be achieved using the argsort() function. If there is an array x and you would
like to sort its rows by the nth column, the code is x[x[:, n].argsort()].
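A short runnable illustration (column indices are 0-based):
import numpy as np
x = np.array([[3, 9], [1, 5], [2, 7]])
n = 0  # sort rows by column 0
print(x[x[:, n].argsort()])  # [[1 5] [2 7] [3 9]]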
13. What is pylab?
A package that combines NumPy, SciPy and Matplotlib into a single namespace.
14. Is all the memory freed when Python exits?
No, it is not, because objects that are referenced from the global namespaces of Python
modules are not always de-allocated when Python exits.
15. How can you randomize the items of a list in place in Python?
random.shuffle(lst), from the standard random module, randomizes the items of a list in place.
16. Which tools in Python can be used to find bugs?
Pylint and Pychecker. Pylint verifies whether a module satisfies the coding standards, while
Pychecker is a static analysis tool that helps find bugs in the source code.
17. What are the supported data types in Python?
Python has five standard data types −
Numbers
String
List
Tuple
Dictionary
19. What are the supported sequence types in Python?
Python 2 supports seven sequence types: str, unicode, list, tuple, bytearray, xrange, and
buffer. In Python 3, unicode, xrange, and buffer were removed (str became Unicode and
range replaced xrange).
21. Which Python library is widely used for Machine Learning?
SciKit-Learn
23. Is Python a case-sensitive programming language?
Yes, it is a case-sensitive language.
24. What is the difference between a tuple and a list?
The basic difference between a tuple and a list is that the former is immutable and the
latter is mutable.
DEPT OF CS&BS, KSSEM Page 45
ATMECE
MACHINE LEARNING LABORATORY BCSL606
25. What is the difference between xrange() and range()?
In Python 2, range() returns a list, while xrange() returns an xrange object, which behaves
like an iterator and generates the numbers on demand.
26. Optimize the below Python code:
word = 'word'
print(word.__len__())
Optimized version: print(len(word)). The built-in len() is the idiomatic (and faster) way to
obtain the length.
27. What is PEP 8?
PEP 8 is a coding convention that lets us write more readable code. In other words, it
is a set of recommendations.
28. What are Python decorators?
A Python decorator is a callable that wraps a function (or class) to extend its behaviour
without modifying it; it is applied with the @decorator syntax.
29. What are Dict and List comprehensions?
They are syntax constructions that ease the creation of a dictionary or list from an
existing iterable.
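For example:
squares_list = [n * n for n in range(5)]      # [0, 1, 4, 9, 16]
squares_dict = {n: n * n for n in range(5)}   # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}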
30. What is lambda in Python?
It is a single-expression anonymous function, often used as an inline function.
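For example:
add = lambda a, b: a + b
print(add(2, 3))  # 5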
31. What is pass in Python?
pass is a no-operation Python statement. It acts as a placeholder in a compound statement
where the syntax requires a statement but nothing needs to be executed.
32. In Python what are iterators?
In Python, iterators are objects used to traverse a group of elements, such as containers like lists.
33. In Python what is slicing?
A mechanism to select a range of items from sequence types like list, tuple, strings
etc. is known as slicing.
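For example:
nums = [0, 1, 2, 3, 4, 5]
print(nums[1:4])   # [1, 2, 3]
print(nums[::-1])  # [5, 4, 3, 2, 1, 0]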
34. What is docstring in Python?
A Python documentation string is known as docstring, it is a way of documenting
Python functions, modules and classes.
35. How can you copy an object in Python?
To copy an object in Python, you can use copy.copy() for a shallow copy or copy.deepcopy()
for the general case. Most objects can be copied, though not all.
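For example:
import copy
original = [[1, 2], [3, 4]]
shallow = copy.copy(original)    # the inner lists are shared with the original
deep = copy.deepcopy(original)   # the inner lists are copied as well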
36. Explain how to delete a file in Python?
By using os.remove(filename) or os.unlink(filename).
37. Is It Mandatory For A Python Function To Return A Value?
It is not at all necessary for a function to return any value. However, if needed, we can
use None as a return value.
38. What Is Whitespace In Python?
Whitespace represents the characters that we use for spacing and separation. They
possess an “empty” representation. In Python, it could be a tab or space.
39. What Is Isalpha() In Python?
Python provides this built-in isalpha() function for the string handling purpose. It
returns True if all characters in the string are of alphabet type, else it returns False.
40. What Does The Join Method Do In Python?
Python provides the join() method on strings: it concatenates the string elements of an
iterable (such as a list or tuple) into a single string, using the string it is called on as
the separator.
41. What Makes The CPython Different From Python?
CPython has its core implemented in C; the prefix 'C' represents this fact. It compiles
Python source to bytecode and executes it in an interpreter loop written in C.
42. What Is A Tuple In Python?
A tuple is a collection type data structure in Python which is immutable. They are
similar to sequences, just like the lists. However, there are some differences between a tuple
and list; the former doesn’t allow modifications whereas the list does.
43. What is Artificial Intelligence?
Artificial Intelligence is an area of computer science that emphasizes the creation of
intelligent machines that work and react like humans.
44. What are the various areas where AI (Artificial Intelligence) can be used?
Artificial Intelligence can be used in many areas like computing, speech recognition,
bio-informatics, humanoid robots, computer software, space and aeronautics, etc.