Machine learning tools with Python in Excel
Make smarter decisions with your data using machine learning.
With just a few steps, you can use the scikit-learn library to uncover insights (no complicated setup needed)
and bring your data to life with easy-to-read visualizations.
Note: Use of this template requires an active Microsoft 365 subscription.
Start with a dataset
Iris dataset (as Excel table)
sepal_length sepal_width petal_length petal_width
5.1 3.5 1.4 0.2
4.9 3 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5 3.6 1.4 0.2
5.4 3.9 1.7 0.4
4.6 3.4 1.4 0.3
5 3.4 1.5 0.2
4.4 2.9 1.4 0.2
4.9 3.1 1.5 0.1
5.4 3.7 1.5 0.2
4.8 3.4 1.6 0.2
4.8 3 1.4 0.1
4.3 3 1.1 0.1
5.8 4 1.2 0.2
5.7 4.4 1.5 0.4
5.4 3.9 1.3 0.4
5.1 3.5 1.4 0.3
5.7 3.8 1.7 0.3
5.1 3.8 1.5 0.3
5.4 3.4 1.7 0.2
5.1 3.7 1.5 0.4
4.6 3.6 1 0.2
5.1 3.3 1.7 0.5
4.8 3.4 1.9 0.2
5 3 1.6 0.2
5 3.4 1.6 0.4
5.2 3.5 1.5 0.2
5.2 3.4 1.4 0.2
4.7 3.2 1.6 0.2
4.8 3.1 1.6 0.2
5.4 3.4 1.5 0.4
5.2 4.1 1.5 0.1
5.5 4.2 1.4 0.2
4.9 3.1 1.5 0.1
5 3.2 1.2 0.2
5.5 3.5 1.3 0.2
4.9 3.1 1.5 0.1
4.4 3 1.3 0.2
5.1 3.4 1.5 0.2
5 3.5 1.3 0.3
4.5 2.3 1.3 0.3
4.4 3.2 1.3 0.2
5 3.5 1.6 0.6
5.1 3.8 1.9 0.4
4.8 3 1.4 0.3
5.1 3.8 1.6 0.2
4.6 3.2 1.4 0.2
5.3 3.7 1.5 0.2
5 3.3 1.4 0.2
7 3.2 4.7 1.4
6.4 3.2 4.5 1.5
6.9 3.1 4.9 1.5
5.5 2.3 4 1.3
6.5 2.8 4.6 1.5
5.7 2.8 4.5 1.3
6.3 3.3 4.7 1.6
4.9 2.4 3.3 1
6.6 2.9 4.6 1.3
5.2 2.7 3.9 1.4
5 2 3.5 1
5.9 3 4.2 1.5
6 2.2 4 1
6.1 2.9 4.7 1.4
5.6 2.9 3.6 1.3
6.7 3.1 4.4 1.4
5.6 3 4.5 1.5
5.8 2.7 4.1 1
6.2 2.2 4.5 1.5
5.6 2.5 3.9 1.1
5.9 3.2 4.8 1.8
6.1 2.8 4 1.3
6.3 2.5 4.9 1.5
6.1 2.8 4.7 1.2
6.4 2.9 4.3 1.3
6.6 3 4.4 1.4
6.8 2.8 4.8 1.4
6.7 3 5 1.7
6 2.9 4.5 1.5
5.7 2.6 3.5 1
5.5 2.4 3.8 1.1
5.5 2.4 3.7 1
5.8 2.7 3.9 1.2
6 2.7 5.1 1.6
5.4 3 4.5 1.5
6 3.4 4.5 1.6
6.7 3.1 4.7 1.5
6.3 2.3 4.4 1.3
5.6 3 4.1 1.3
5.5 2.5 4 1.3
5.5 2.6 4.4 1.2
6.1 3 4.6 1.4
5.8 2.6 4 1.2
5 2.3 3.3 1
5.6 2.7 4.2 1.3
5.7 3 4.2 1.2
5.7 2.9 4.2 1.3
6.2 2.9 4.3 1.3
5.1 2.5 3 1.1
5.7 2.8 4.1 1.3
6.3 3.3 6 2.5
5.8 2.7 5.1 1.9
7.1 3 5.9 2.1
6.3 2.9 5.6 1.8
6.5 3 5.8 2.2
7.6 3 6.6 2.1
4.9 2.5 4.5 1.7
7.3 2.9 6.3 1.8
6.7 2.5 5.8 1.8
7.2 3.6 6.1 2.5
6.5 3.2 5.1 2
6.4 2.7 5.3 1.9
6.8 3 5.5 2.1
5.7 2.5 5 2
5.8 2.8 5.1 2.4
6.4 3.2 5.3 2.3
6.5 3 5.5 1.8
7.7 3.8 6.7 2.2
7.7 2.6 6.9 2.3
6 2.2 5 1.5
6.9 3.2 5.7 2.3
5.6 2.8 4.9 2
7.7 2.8 6.7 2
6.3 2.7 4.9 1.8
6.7 3.3 5.7 2.1
7.2 3.2 6 1.8
6.2 2.8 4.8 1.8
6.1 3 4.9 1.8
6.4 2.8 5.6 2.1
7.2 3 5.8 1.6
7.4 2.8 6.1 1.9
7.9 3.8 6.4 2
6.4 2.8 5.6 2.2
6.3 2.8 5.1 1.5
6.1 2.6 5.6 1.4
7.7 3 6.1 2.3
6.3 3.4 5.6 2.4
6.4 3.1 5.5 1.8
6 3 4.8 1.8
6.9 3.1 5.4 2.1
6.7 3.1 5.6 2.4
6.9 3.1 5.1 2.3
5.8 2.7 5.1 1.9
6.8 3.2 5.9 2.3
6.7 3.3 5.7 2.5
6.7 3 5.2 2.3
6.3 2.5 5 1.9
6.5 3 5.2 2
6.2 3.4 5.4 2.3
5.9 3 5.1 1.8
species (fifth column of the table above, one value per row)
rows 1-50: setosa
rows 51-100: versicolor
rows 101-150: virginica

TIP
Make sure your dataset mainly has numbers as values. In this example, the "species" column will be excluded from the clustering in the Python formula.

GOOD TO KNOW
The target variable (what the model predicts) must be in numbers. This is the "species" column in this example. The Python script handles this using label encoding, converting the categories into numbers:
- setosa -> 0
- versicolor -> 1
- virginica -> 2
K-means clustering
GOOD TO KNOW
K-means clustering is an unsupervised machine learning algorithm that groups similar data points into clusters. In this example, the Iris dataset contains measurements of iris flowers from three species. The goal is to partition the data into groups with shared characteristics.
Learn more.
1. Define the number of clusters
TIP
Use the elbow method to find the optimal number of clusters. Look for the elbow
point where the graph sharply changes from steep to flat.
Select cell G14 to see the Python formula.
TRY HERE
n_clusters: 10
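The formula in cell G14 is not reproduced in this text version. As a rough guide, the stand-alone sketch below shows how the n_clusters input could drive a scikit-learn KMeans model, with an elbow-method scan as suggested in the TIP; load_iris stands in for the Excel table, and the variable names are illustrative.

# K-means on the iris measurements with a chosen number of clusters.
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris(as_frame=True)
X = iris.data  # numeric measurement columns only; "species" is not included

n_clusters = 10  # value currently entered in the "TRY HERE" cell above
model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
print(model.labels_[:10])

# Elbow method (see TIP): inertia for k = 1..10; look for the point where
# the curve changes from steep to flat.
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 11)
]
print(inertias)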
2. Visualize the clusters
Select cell Q11 to see the Python formula.
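The chart behind the formula in cell Q11 is not reproduced here either. One common way to visualize the clusters, sketched below under the same stand-in assumptions, is a scatter plot of two measurements colored by cluster label.

# Scatter plot of the clusters from the previous step.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

X = load_iris(as_frame=True).data  # stand-in for the Excel table
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Color each point by its cluster label; any pair of measurements works.
plt.scatter(X["petal length (cm)"], X["petal width (cm)"], c=labels)
plt.xlabel("petal length (cm)")
plt.ylabel("petal width (cm)")
plt.title("K-means clusters")
plt.show()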
Logistic regression
GOOD TO KNOW
Logistic regression is a supervised machine learning algorithm used for classification tasks. In this example, the Iris dataset contains measurements of iris flowers from three species. The goal is to predict a flower's species (setosa, versicolor, or virginica) based on its features (sepal length, sepal width, petal length, and petal width).
Learn more.
1. Define the train split percentage
TIP
A train-test split is needed for this model. The
training set teaches the model using examples
with known species, while the testing set
evaluates its performance on new, unseen data.
TRY HERE
train percentage split: 40
Based on the input percentage, the data will
be split as: 40% train, 60% test.
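The workbook's formula is not shown in this text version; the sketch below illustrates how the train split percentage could feed scikit-learn's train_test_split and a LogisticRegression model, again using load_iris as a stand-in for the Excel table.

# Train-test split and logistic regression on the iris data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

iris = load_iris()
X, y = iris.data, iris.target  # target is already encoded as 0/1/2

train_percentage = 40  # the "TRY HERE" input above: 40% train, 60% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=train_percentage / 100, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=200).fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.2f}")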
2. Visualize the model's performance
Select cell P11 to see the sample Python formula.
GOOD TO KNOW
The confusion matrix provides a detailed breakdown of the
model's performance by comparing the true labels with the
predicted labels for each variable. The chart shows the counts of
true positives (TP), true negatives (TN), false positives (FP), and
false negatives (FN).
Confusion matrix layout (rows = actual class, columns = predicted class):

              predicted A   predicted B   predicted C
actual A          TP            FN            FN
actual B          FP            TP            TN
actual C          FP            FP            TP

Definitions
TP: Actual class matches predicted class
FN: Actual class is positive but predicted as another class
FP: Actual class is negative but predicted as positive
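As a rough guide to how such a chart can be produced, the sketch below computes and plots a confusion matrix with scikit-learn's ConfusionMatrixDisplay, re-fitting the same kind of model as in step 1; the actual chart in the workbook is not reproduced here.

# Confusion matrix for the logistic regression model.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import ConfusionMatrixDisplay

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, train_size=0.4, random_state=0, stratify=iris.target
)
model = LogisticRegression(max_iter=200).fit(X_train, y_train)

# Rows are actual species, columns are predicted species.
ConfusionMatrixDisplay.from_estimator(model, X_test, y_test, display_labels=iris.target_names)
plt.show()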
Random forest
GOOD TO KNOW
Random forest is a supervised machine learning algorithm used for classification and feature importance. It combines multiple decision trees to make predictions, essentially a "forest" of trees working together. Each tree in the forest is trained on a random subset of the data and considers a random subset of features when making splits, a process called randomization. In this example, the Iris dataset contains measurements of iris flowers from three species. The goal is to determine which features (sepal length, sepal width, petal length, and petal width) are most important in predicting the species (setosa, versicolor, or virginica).
Learn more.
1. Define the number of trees
TRY HERE
tree count: 10
TIP
Experiment with different tree counts to see how it affects the model's feature importance shown in step 2.
2. Your features below
Select cell O10 to see the sample Python formula.
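The formula in cell O10 is not reproduced in this text version. The stand-alone sketch below shows how the tree count could drive a scikit-learn RandomForestClassifier and how the resulting feature importances could be charted; load_iris stands in for the Excel table.

# Random forest with a chosen tree count, plus a feature-importance chart.
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

tree_count = 10  # the "TRY HERE" input above
model = RandomForestClassifier(n_estimators=tree_count, random_state=0).fit(X, y)

# Feature importance: how much each measurement contributes to the predictions.
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values()
importances.plot(kind="barh", title="Random forest feature importance")
plt.show()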