
K-MEANS CLUSTERING IN ML

Academic year 2021/2022

[From the Computational Intelligence course]

[2021/12/24]

Course name: Computational Intelligence
Year of study: Third year
Course code: AI304

k-means clustering in ML

A research study submitted by the students:

Student name: Ahmed Reda Mohamed Abdel Naim    Seat number: 3008
Student name: Ahmed Khaled Fawzy Mansour       Seat number: 3005
Student name: Abdelrahman Qasem Mohamed        Seat number: 3080
Student name: Mostafa Kamel Ali                Seat number: 3168
Student name: Mahmoud Ashraf Amer              Seat number: 3153

Under the supervision of
Dr. Samar El-Bedaihy

1442 AH - 2022 AD
Introduction

Machine learning:
Machine learning is a sub-field of computer science that gives computers the ability to learn without being explicitly programmed.
Machine learning types:
a) supervised learning:
In this type of learning, we supervise and direct the learning model as it performs its tasks so that it can handle new, untrained situations; this is done by training it on the data set that we have.

We teach the model using the data set we have, and the trained model can then predict unknown or future instances.

b) unsupervised learning:
Here we let the model discover the information we want to obtain without guidance or supervision: we give it the data, it trains on that data set, and it then produces the results.

c) reinforcement learning:
In this type, models are trained to make decisions. Everything comes down to making the appropriate decision: the model learns through trial and error until it reaches the best decision to take in a particular situation. The model is not given a data set that contains the decisions to be taken; instead, it makes the decisions by itself in order to perform the task given to it. When there is no data set, the model learns by trial and error.
Major Machine learning techniques:

o Supervised Learning (data with labels):
  A. Classification
  B. Regression

o Unsupervised Learning (data without labels):
  A. Clustering
  B. Association Analysis
  C. Dimensionality Reduction

o Reinforcement Learning (state and action):
  A. Model-Free
  B. Model-Based
K-means clustering in ML
K-Means Clustering is an unsupervised learning algorithm that is used to solve clustering problems in machine learning or data science: it groups the unlabeled dataset into different clusters. Here K defines the number of pre-defined clusters that need to be created in the process; if K=2 there will be two clusters, for K=3 there will be three clusters, and so on.
It allows us to cluster the data into different groups and is a convenient way to discover the categories of groups in the unlabeled dataset on its own, without the need for any training.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this algorithm is to minimize the sum of distances between the data points and their corresponding cluster centroids.
The algorithm takes the unlabeled dataset as input, divides the dataset into K clusters, and repeats the process until it finds the best clusters. The value of K should be predetermined in this algorithm.
The k-means clustering algorithm mainly performs two tasks:
o Determines the best positions for the K center points (centroids) through an iterative process.
o Assigns each data point to its closest centroid; the data points that are near a particular centroid form a cluster.
Hence each cluster contains data points with some commonalities and is kept apart from the other clusters.
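To make these two tasks concrete, here is a minimal sketch of the core k-means loop written with NumPy (the function name, the random initialization, and the stopping rule are illustrative assumptions, not something prescribed by the text above):

import numpy as np

def simple_kmeans(x, k, n_iters=100, seed=42):
    # pick k random data points as the initial centroids
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iters):
        # task 1: assign every point to its closest centroid
        distances = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # task 2: move each centroid to the mean of the points assigned to it
        # (empty clusters are not handled in this simplified sketch)
        new_centroids = np.array([x[labels == j].mean(axis=0) for j in range(k)])
        # stop when the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids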
The below diagram explains the working of the K-means Clustering Algorithm:
Python Implementation of the K-Means Clustering Algorithm

In the above section, we discussed the K-means algorithm; now let's see how it can be implemented using Python.
Here, we have a Mall_Customers dataset, which contains data about customers who visit the mall and spend there.
In the given dataset, we have Customer_Id, Gender, Age, Annual Income (k$), and Spending Score (a calculated value of how much a customer has spent in the mall; the higher the value, the more they have spent). From this dataset, we need to find some patterns; as this is an unsupervised method, we don't know exactly what to calculate.
The steps to be followed for the implementation are given below:
o Data Pre-processing
o Finding the optimal number of clusters using the elbow method
o Training the K-means algorithm on the training dataset
o Visualizing the clusters

Step-1: Data Pre-processing

The first step is data pre-processing, as we did in the earlier topics of Regression and Classification. For the clustering problem, however, it differs from the other models. Let's discuss it:

o Importing Libraries
As we did in previous topics, firstly, we will import the libraries for our model, which is part of
data pre-processing. The code is given below:
# importing the libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# importing the dataset
dataset = pd.read_csv('Mall_Customers_data.csv')
By executing the above lines of code, we will get our dataset in the Spyder IDE.
The dataset looks like the below image:

From the above dataset, we need to find some patterns.


o Extracting Independent Variables
Here we don't need any dependent variable for the data pre-processing step, as this is a clustering problem and we have no idea what to determine. So we will just add a line of code for the matrix of features:

x = dataset.iloc[:, [3, 4]].values
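For clarity, the column indices 3 and 4 are assumed here to correspond to the Annual Income and Spending Score columns of the Mall_Customers file (with Customer_Id, Gender and Age in columns 0-2); this assumption can be checked quickly before continuing, using the dataset and x defined above:

# quick check of which columns were selected for clustering
print(dataset.columns[[3, 4]])   # expected: the Annual Income and Spending Score columns
print(x[:5])                     # first five rows of the feature matrix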

Step-2: Finding the optimal number of clusters using the elbow method
#finding the optimal number of clusters using the elbow method
from sklearn.cluster import KMeans
wcss_list = []  # initializing the list for the values of WCSS

# using a for loop for iterations from 1 to 10
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)
mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('wcss_list')
mtp.show()

Output: After executing the above code, we will get the output below.
From the above plot, we can see that the elbow point is at K = 5, so the number of clusters here will be 5.
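For reference, WCSS (the within-cluster sum of squares) is the quantity scikit-learn exposes as the inertia_ attribute: the sum of squared distances from every point to the centroid of the cluster it is assigned to. The elbow is the value of K beyond which WCSS stops dropping sharply. As an optional sanity check, not part of the original walkthrough, WCSS can be recomputed by hand for any fitted KMeans model:

# recompute WCSS by hand and compare it with scikit-learn's inertia_
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]   # centroid of each point's cluster
wcss_manual = ((x - assigned_centers) ** 2).sum()
print(wcss_manual, kmeans.inertia_)   # the two values should agree up to rounding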

Step-3: Training the K-means algorithm on the training dataset

#training the K-means model on the dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)
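fit_predict both fits the model and returns, for every row of x, the index (0 to 4) of the cluster that row was assigned to; the learned centroids are stored on the fitted model. A quick, optional way to inspect the result (variable names follow the code above; nm is the numpy alias imported in Step-1):

print(y_predict[:10])             # cluster index of the first ten customers
print(kmeans.cluster_centers_)    # centroid coordinates (annual income, spending score) per cluster
print(nm.bincount(y_predict))     # how many customers ended up in each of the five clusters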
Step-4: Visualizing the Clusters
The last step is to visualize the clusters. Since we have 5 clusters for our model, we will visualize them one by one.

To visualize the clusters, we will use a scatter plot drawn with the mtp.scatter() function of matplotlib.

#visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')     # first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')    # second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')      # third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')     # fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')  # fifth cluster
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
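A note on the indexing used above: because x is a NumPy array, an expression such as x[y_predict == 0, 0] selects the first column (annual income) of only those rows assigned to cluster 0, and x[y_predict == 0, 1] selects their spending scores. The same idea spelled out step by step (the names mask and cluster0 are introduced here purely for illustration):

mask = (y_predict == 0)     # boolean array: True for customers placed in the first cluster
cluster0 = x[mask]          # only the rows of x that belong to that cluster
print(cluster0[:5, 0])      # first five annual-income values in the cluster
print(cluster0[:5, 1])      # first five spending-score values in the cluster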
Output:

The output image clearly shows the five different clusters in different colors. The clusters are formed between two parameters of the dataset: the customer's annual income and spending score. We can change the colors and labels as per requirement or choice. We can also observe some points from the above patterns, which are given below:

o Cluster 1 shows the customers with average salary and average spending, so we can categorize these customers as standard.

o Cluster 2 shows the customers with high income but low spending, so we can categorize them as careful.

o Cluster 3 shows the customers with low income and also low spending, so they can be categorized as sensible.

o Cluster 4 shows the customers with low income but very high spending, so they can be categorized as careless.

o Cluster 5 shows the customers with high income and high spending, so they can be categorized as target; these customers can be the most profitable customers for the mall owner.
