DSA Lab Manual

The document contains various programming tasks related to data analysis and machine learning, including plotting student performance, analyzing a book dataset, training logistic regression and SVM classifiers, implementing a decision tree algorithm, and performing clustering on a dataset. Each task includes code snippets and expected outputs, demonstrating the use of libraries like matplotlib, pandas, and scikit-learn. The tasks cover data visualization, data cleaning, model training, and evaluation metrics.

Uploaded by

sowmyabandaru1021


Program 1a. Students' performance in the final exams.

A study was conducted to understand the effect of the number of hours students spent studying on their performance in the final
exams. Write code to plot a line chart with the number of hours spent studying on the x-axis and the score in the final exam on the y-axis. Use a red '*'
as the point character, label the axes, and give the plot a title.

import matplotlib.pyplot as plt

hours = [10,9,2,15,10,16,11,16]
score = [95,80,10,50,45,98,38,93]

# Plotting the line chart

plt.plot(hours, score, marker='*', color='red', linestyle='-')

# Adding labels and title

plt.xlabel('Number of Hours Studied')
plt.ylabel('Score in Final Exam')
plt.title('Effect of Hours Studied on Exam Score')

# Displaying the plot

plt.grid(True)
plt.show()

OUTPUT: (plot not reproduced here)
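Because the hours list is not in ascending order, plt.plot connects the points in list order and the line doubles back on itself. A minimal sketch, assuming a left-to-right trend line is wanted, sorts the (hours, score) pairs before plotting. The names pairs, hours_sorted, and score_sorted are introduced here for illustration and are not part of the original program.

```python
hours = [10, 9, 2, 15, 10, 16, 11, 16]
score = [95, 80, 10, 50, 45, 98, 38, 93]

# Pair each hours value with its score and sort by hours (ties are ordered by score)
pairs = sorted(zip(hours, score))
hours_sorted = [h for h, _ in pairs]
score_sorted = [s for _, s in pairs]

print(hours_sorted)
print(score_sorted)
```

Passing hours_sorted and score_sorted to plt.plot in place of hours and score keeps the same markers but draws the line from left to right.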

Program 1b. Histogram to check the frequency distribution

For the given dataset mtcars.csv (www.kaggle.com/ruiromanini/mtcars), plot a histogram to check the frequency distribution of the
variable ‘mpg’ (Miles per gallon)

import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset

mtcars = pd.read_csv('mtcars.csv')  # Replace 'mtcars.csv' with the actual path to your copy of the file

# Plotting the histogram

plt.hist(mtcars['mpg'], bins=10, color='skyblue', edgecolor='black')

# Adding labels and title

plt.xlabel('Miles per gallon (mpg)')
plt.ylabel('Frequency')
plt.title('Histogram of Miles per gallon (mpg)')

# Displaying the plot

plt.show()

OUTPUT: (plot not reproduced here)
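With bins=10, plt.hist splits the range of the data into ten equal-width intervals and counts the values that fall in each. The sketch below reproduces that counting in plain Python so the bin edges and counts can be inspected directly. The function name equal_width_counts and the list sample_mpg are hypothetical; the values are made up and are not taken from mtcars.csv.

```python
def equal_width_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max.

    Assumes the values are not all identical (otherwise the width is zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin, which is closed on the right
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


sample_mpg = [10.0, 12.0, 15.0, 15.5, 18.0, 21.0, 21.0, 24.0, 30.0]  # hypothetical values
edges, counts = equal_width_counts(sample_mpg, bins=4)
print(edges)
print(counts)
```

Four bins are used here only to keep the printed result short; the same function works with bins=10.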

Program 2. Kaggle book dataset

Consider the books dataset BL-Flickr-Images-Book.csv from Kaggle (https://www.kaggle.com/adeyoyintemidayo/publication-of-books),
which contains information about books. Write a program to demonstrate the following.

• Import the data into a DataFrame.
• Find and drop the columns which are irrelevant for the book information.
• Change the index of the DataFrame.
• Tidy up fields in the data, such as the date of publication, with the help of a simple regular expression.
• Combine str methods with NumPy to clean columns.

import pandas as pd
import numpy as np

# Import the data into a DataFrame

df = pd.read_csv('BL-Flickr-Images-Book.csv')

# Display the first few rows of the DataFrame

print("Original DataFrame:")
print(df.head())

# Find and drop the columns which are irrelevant for the book information

irrelevant_columns = ['Edition Statement', 'Corporate Author', 'Corporate Contributors', 'Former owner', 'Engraver', 'Contributors',
                      'Issuance type', 'Shelfmarks']
df.drop(columns=irrelevant_columns, inplace=True)

# Change the Index of the DataFrame

df.set_index('Identifier', inplace=True)

# Tidy up fields in the data such as date of publication with the help of simple regular expression
df['Date of Publication'] = df['Date of Publication'].str.extract(r'^(\d{4})', expand=False)

# Combine str methods with NumPy to clean columns

# na=False treats missing places as "not London" instead of passing NaN to np.where
df['Place of Publication'] = np.where(df['Place of Publication'].str.contains('London', na=False), 'London',
                                      df['Place of Publication'].str.replace('-', ' '))

# Display the cleaned DataFrame

print("\nCleaned DataFrame:")
print(df.head())
OUTPUT:

Original DataFrame:
   Identifier             Edition Statement      Place of Publication  \
0         206                           NaN                    London
1         216                           NaN  London; Virtue & Yorston
2         218                           NaN                    London
3         472                           NaN                    London
4         480  A new edition, revised, etc.                    London

  Date of Publication              Publisher  \
0         1879 [1878]       S. Tinsley & Co.
1                1868           Virtue & Co.
2                1869  Bradbury, Evans & Co.
3                1851          James Darling
4                1857   Wertheim & Macintosh

                                               Title     Author  \
0                  Walter Forbes. [A novel.] By A. A      A. A.
1  All for Greed. [A novel. The dedication signed...  A., A. A.
2  Love the Avenger. By the author of “All for Gr...  A., A. A.
3  Welsh Sketches, chiefly ecclesiastical, to the...  A., E. S.
4  [The World in which I live, and my place in it...  A., E. S.

                                   Contributors Corporate Author  \
0                               FORBES, Walter.              NaN
1  BLAZE DE BURY, Marie Pauline Rose - Baroness              NaN
2  BLAZE DE BURY, Marie Pauline Rose - Baroness              NaN
3                   Appleyard, Ernest Silvanus.              NaN
4                           BROOME, John Henry.              NaN

  Corporate Contributors Former owner Engraver Issuance type  \
0                    NaN          NaN      NaN   monographic
1                    NaN          NaN      NaN   monographic
2                    NaN          NaN      NaN   monographic
3                    NaN          NaN      NaN   monographic
4                    NaN          NaN      NaN   monographic

                                          Flickr URL  \
0  http://www.flickr.com/photos/britishlibrary/ta...
1  http://www.flickr.com/photos/britishlibrary/ta...
2  http://www.flickr.com/photos/britishlibrary/ta...
3  http://www.flickr.com/photos/britishlibrary/ta...
4  http://www.flickr.com/photos/britishlibrary/ta...

                            Shelfmarks
0    British Library HMNTS 12641.b.30.
1    British Library HMNTS 12626.cc.2.
2    British Library HMNTS 12625.dd.1.
3  British Library HMNTS 10369.bbb.15.
4     British Library HMNTS 9007.d.28.

Cleaned DataFrame:
           Place of Publication Date of Publication              Publisher  \
Identifier
206                      London                1879       S. Tinsley & Co.
216                      London                1868           Virtue & Co.
218                      London                1869  Bradbury, Evans & Co.
472                      London                1851          James Darling
480                      London                1857   Wertheim & Macintosh

                                                        Title     Author  \
Identifier
206                         Walter Forbes. [A novel.] By A. A      A. A.
216         All for Greed. [A novel. The dedication signed...  A., A. A.
218         Love the Avenger. By the author of “All for Gr...  A., A. A.
472         Welsh Sketches, chiefly ecclesiastical, to the...  A., E. S.
480         [The World in which I live, and my place in it...  A., E. S.

                                                   Flickr URL
Identifier
206         http://www.flickr.com/photos/britishlibrary/ta...
216         http://www.flickr.com/photos/britishlibrary/ta...
218         http://www.flickr.com/photos/britishlibrary/ta...
472         http://www.flickr.com/photos/britishlibrary/ta...
480         http://www.flickr.com/photos/britishlibrary/ta...
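The pattern r'^(\d{4})' keeps only a four-digit year at the very start of the field. A small sketch with Python's re module shows the effect on individual strings. The values '1879 [1878]' and '1868' appear in the original DataFrame output, while '[1897?]' is a hypothetical value added to show a non-matching case. The helper name extract_year is an assumption for this example.

```python
import re

YEAR_AT_START = re.compile(r'^(\d{4})')


def extract_year(text):
    """Return the leading four-digit year, or None if the field does not start with one."""
    match = YEAR_AT_START.match(text)
    return match.group(1) if match else None


print(extract_year('1879 [1878]'))  # 1879
print(extract_year('1868'))         # 1868
print(extract_year('[1897?]'))      # None (hypothetical value with no leading digits)
```

In the pandas version, rows that do not match the pattern become NaN rather than None.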

Program 3a. Logistic Regression

Train a regularized logistic regression classifier on the iris dataset (https://archive.ics.uci.edu/ml/machine-learning-databases/iris/ or
the built-in iris dataset) using sklearn. Train the model with the hyperparameter C = 1e4 and report the best classification
accuracy.

from sklearn.datasets import load_iris

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Load the Iris dataset

iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a pipeline with StandardScaler and LogisticRegression with regularization

pipeline = make_pipeline(StandardScaler(), LogisticRegression(C=1e4, max_iter=1000))

# Train the model

pipeline.fit(X_train, y_train)

# Calculate the accuracy on the testing set

accuracy = pipeline.score(X_test, y_test)
print("Classification accuracy:", accuracy)

OUTPUT:
Classification accuracy: 1.0
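In scikit-learn, C is the inverse of the regularization strength, so C = 1e4 penalizes the weights only weakly. As a rough illustration (not scikit-learn's internal code), the sketch below evaluates one common form of the L2-regularized objective for a binary problem with labels in {-1, +1}: 0.5 * ||w||^2 + C * sum(log(1 + exp(-y_i * (w . x_i + b)))). The function name and the toy data are assumptions made for this example.

```python
import math


def l2_logistic_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * total logistic loss, for labels y in {-1, +1}."""
    penalty = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        loss += math.log(1.0 + math.exp(-margin))
    return penalty + C * loss


# Toy data (hypothetical): two points, one per class
X = [[1.0, 2.0], [-1.0, -2.0]]
y = [1, -1]
w, b = [0.5, 0.5], 0.0

for C in (0.01, 1.0, 1e4):
    print(C, l2_logistic_objective(w, b, X, y, C))
```

With C = 1e4 the data-fit term dominates the penalty, which is why this setting behaves almost like an unregularized model.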

Program 3b. SVM classifier

Train an SVM classifier on the iris dataset using sklearn. Try different kernels and the associated hyperparameters. Train the model with
the following set of hyperparameters: RBF kernel, gamma = 0.5,
one-vs-rest classifier, no feature normalization. Also try C = 0.01, 1, 10. For the above set of hyperparameters, find the best
classification accuracy along with the total number of support vectors on the test data.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the Iris dataset

iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Set of hyperparameters to try

hyperparameters = [
{'kernel': 'rbf', 'gamma': 0.5, 'C': 0.01},
{'kernel': 'rbf', 'gamma': 0.5, 'C': 1},
{'kernel': 'rbf', 'gamma': 0.5, 'C': 10}
]

best_accuracy = 0
best_model = None
best_support_vectors = None

# Train SVM models with different hyperparameters and find the best accuracy
for params in hyperparameters:
    model = SVC(kernel=params['kernel'], gamma=params['gamma'], C=params['C'], decision_function_shape='ovr')
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # n_support_ holds the number of support vectors per class (found during training)
    support_vectors = model.n_support_.sum()
    print(f"For hyperparameters: {params}, Accuracy: {accuracy}, Total Support Vectors: {support_vectors}")
    # Keep the first model that reaches the highest test accuracy
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        best_model = model
        best_support_vectors = support_vectors

print("\nBest accuracy:", best_accuracy)

print("Total support vectors on test data:", best_support_vectors)

OUTPUT:

For hyperparameters: {'kernel': 'rbf', 'gamma': 0.5, 'C': 0.01}, Accuracy: 0.3, Total Support Vectors: 120
For hyperparameters: {'kernel': 'rbf', 'gamma': 0.5, 'C': 1}, Accuracy: 1.0, Total Support Vectors: 39
For hyperparameters: {'kernel': 'rbf', 'gamma': 0.5, 'C': 10}, Accuracy: 1.0, Total Support Vectors: 31

Best accuracy: 1.0

Total support vectors on test data: 39
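The RBF kernel used in this program measures similarity as K(x, z) = exp(-gamma * ||x - z||^2), so gamma = 0.5 controls how quickly similarity decays with distance. A minimal plain-Python sketch of that formula follows. The function name rbf_kernel_value and the two sample points are illustrative assumptions, not part of the original program.

```python
import math


def rbf_kernel_value(x, z, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


a = [5.1, 3.5, 1.4, 0.2]  # illustrative iris-like measurements
b = [6.1, 3.5, 2.4, 0.2]

print(rbf_kernel_value(a, a))  # identical points give 1.0
print(rbf_kernel_value(a, b))  # smaller value for points that are farther apart
```

A larger gamma makes the kernel value fall off faster, so each training point influences a smaller neighbourhood.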

Program 4a. Decision Tree based ID3 algorithm

Consider the following dataset (reproduced in the code below). Write a program to demonstrate the working of the decision tree
based ID3 algorithm.

from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import pandas as pd
from io import StringIO
from IPython.display import Image
import pydotplus

# Define the dataset

data = {
'Price': ['Low', 'Low', 'Low', 'Low', 'Low', 'Med', 'Med', 'Med', 'Med', 'High', 'High', 'High', 'High'],
'Maintenance': ['Low', 'Med', 'Low', 'Med', 'High', 'Med', 'Med', 'High', 'High', 'Med', 'Med', 'High', 'High'],
'Capacity': ['2', '4', '4', '4', '4', '4', '4', '2', '5', '4', '2', '2', '5'],
'Airbag': ['No', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes'],
'Profitable': [1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1]
}

df = pd.DataFrame(data)

# Convert categorical variables into numerical ones

df = pd.get_dummies(df, columns=['Price', 'Maintenance', 'Airbag'])

# Separate features and target variable

X = df.drop('Profitable', axis=1)
y = df['Profitable']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a decision tree classifier

clf = DecisionTreeClassifier(criterion='entropy')

# Train the classifier on the training data

clf.fit(X_train, y_train)

# Predict on the testing data

y_pred = clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
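
# Visualize the decision tree (Image renders inline only in a Jupyter notebook)

dot_data = StringIO()
# feature_names is passed as a plain list, which all scikit-learn versions accept
export_graphviz(clf, out_file=dot_data, filled=True, rounded=True, special_characters=True, feature_names=list(X.columns))
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
Image(graph.create_png())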

OUTPUT:

Accuracy: 0.6666666666666666
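ID3 chooses, at each node, the attribute with the highest information gain, that is, the largest drop in the entropy of the target. Note that scikit-learn's DecisionTreeClassifier is a CART implementation: criterion='entropy' makes it use the same impurity measure, but it builds binary splits rather than ID3's multiway splits. The sketch below computes the entropy and the root-level information gain of each attribute for the dataset defined in this program, in plain Python. The helper names entropy and information_gain are introduced for this example.

```python
import math
from collections import Counter

data = {
    'Price': ['Low', 'Low', 'Low', 'Low', 'Low', 'Med', 'Med', 'Med', 'Med', 'High', 'High', 'High', 'High'],
    'Maintenance': ['Low', 'Med', 'Low', 'Med', 'High', 'Med', 'Med', 'High', 'High', 'Med', 'Med', 'High', 'High'],
    'Capacity': ['2', '4', '4', '4', '4', '4', '4', '2', '5', '4', '2', '2', '5'],
    'Airbag': ['No', 'Yes', 'No', 'No', 'No', 'No', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes'],
    'Profitable': [1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of labels minus the weighted entropy after splitting on values."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder


target = data['Profitable']
gains = {attr: information_gain(col, target) for attr, col in data.items() if attr != 'Profitable'}

print("Entropy of Profitable:", round(entropy(target), 4))
for attr, gain in gains.items():
    print(f"Information gain of {attr}: {gain:.4f}")
print("Attribute ID3 would split on first:", max(gains, key=gains.get))
```

On this dataset Maintenance has the largest gain at the root (about 0.164), so an ID3 tree would split on it first and then repeat the same calculation within each branch.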

Program 4b. Clustering

Consider the dataset spiral.txt (https://bit.ly/2Lm75Ly). The first two columns in the dataset correspond to the coordinates of each
data point. The third column corresponds to the actual cluster label. Compute the Rand index for the following methods:

• K-means clustering
• Single-link hierarchical clustering
• Complete-link hierarchical clustering

Also visualize the dataset and identify which algorithm is able to recover the true clusters.

import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score
import matplotlib.pyplot as plt

# Load the dataset

data = np.loadtxt("Spiral.txt", delimiter=",", skiprows=1)
X = data[:, :2] # Features
y_true = data[:, 2] # Actual cluster labels

# Visualize the dataset

plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=y_true, cmap='viridis')
plt.title('True Clusters')
plt.xlabel('X1')
plt.ylabel('X2')
plt.show()

# K-means clustering (n_init is set explicitly because its default has changed across scikit-learn versions)

kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
kmeans_clusters = kmeans.fit_predict(X)

# Single-link Hierarchical Clustering

single_link = AgglomerativeClustering(n_clusters=3, linkage='single')
single_link_clusters = single_link.fit_predict(X)

# Complete-link Hierarchical Clustering

complete_link = AgglomerativeClustering(n_clusters=3, linkage='complete')
complete_link_clusters = complete_link.fit_predict(X)

# Compute the Rand Index

rand_index_kmeans = adjusted_rand_score(y_true, kmeans_clusters)
rand_index_single_link = adjusted_rand_score(y_true, single_link_clusters)
rand_index_complete_link = adjusted_rand_score(y_true, complete_link_clusters)

print("Rand Index for K-means Clustering:", rand_index_kmeans)

print("Rand Index for Single-link Hierarchical Clustering:", rand_index_single_link)
print("Rand Index for Complete-link Hierarchical Clustering:", rand_index_complete_link)

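# This code computes a Rand-type score for each clustering method and visualizes the true clusters.
# Note: adjusted_rand_score returns the Adjusted Rand Index (chance-corrected): 1 indicates perfect agreement with the
# true clusters, values near 0 indicate agreement no better than random labeling, and negative values are possible.
# For the unadjusted Rand Index, which ranges from 0 to 1, sklearn.metrics.rand_score can be used instead.
# The method with the higher score is better at recovering the true clusters.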
OUTPUT: (not reproduced here)
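The unadjusted Rand index is the fraction of point pairs on which two labelings agree: pairs placed together in both, or placed apart in both. The program calls adjusted_rand_score, which is the chance-corrected variant, so a plain-Python pair-counting sketch of the unadjusted index is given here for comparison. The function name rand_index and the toy labelings are assumptions for illustration.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair agrees if it is together in both labelings or apart in both
        agree += same_true == same_pred
        total += 1
    return agree / total


y_true = [0, 0, 1, 1, 2, 2]
y_same = [1, 1, 0, 0, 2, 2]  # same grouping, different label names
y_bad = [0, 1, 0, 1, 0, 1]   # splits every true cluster

print(rand_index(y_true, y_same))  # 1.0
print(rand_index(y_true, y_bad))
```

The index depends only on which points are grouped together, not on the label values, so renaming the clusters leaves it unchanged. The loop visits every pair and is therefore quadratic in the number of points.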
