
Introduction to Unsupervised Learning

Yashil Sukurdeep
July 2, 2024

1 Unsupervised Learning: Fundamentals


Unsupervised learning is a type of machine learning where models are trained
on unlabelled data, i.e., data without labeled responses. The goal is therefore to train the
model so that it learns patterns and structure from the
input data without any explicit instructions on what to predict. Unsupervised
learning is important because it helps to discover hidden patterns or intrinsic
structures in data, reduce the complexity of data, and prepare data for super-
vised learning tasks.

Unsupervised learning is used for:


• Clustering: Grouping data points into clusters based on their similarity.
• Association: Discovering rules that describe large portions of data.

• Dimensionality Reduction: Reducing the number of random variables under consideration.
Some real-life use cases of unsupervised learning include:
• Customer segmentation in marketing.

• Anomaly detection in fraud detection.


• Grouping news articles by topic.

2 Clustering
Clustering is the process of dividing a dataset into groups, where the members of
each group are similar in some way. The goal of clustering is to identify structure
in an unlabeled dataset. There are several methods used for clustering, each
with its own approach to grouping data points.

2.1 Centroid-Based Clustering
Centroid-based clustering methods partition the data into clusters by assign-
ing each data point to the cluster corresponding to the nearest centroid. The
centroids are then updated iteratively to minimize the distance between data
points and their respective centroids.

The K-Means clustering algorithm is an example of a centroid-based clustering method. Given (unlabelled) data points $\{x_1, \dots, x_n\} \subset \mathbb{R}^d$, it works as follows:

1. Initialization: Choose $K$ initial centroids $\{c_1, \dots, c_K\}$ randomly from the dataset. At this stage, each centroid is one of the data points themselves. Let the cluster corresponding to centroid $c_j$ be denoted by $C_j$ for $j = 1, \dots, K$.

2. Assignment step: Assign each data point to the cluster corresponding to the nearest centroid. Mathematically, for each data point $x_i$, find the centroid $c_j$ that minimizes the squared Euclidean distance:
$$c_j = \operatorname*{argmin}_{\ell = 1, \dots, K} \|x_i - c_\ell\|^2, \qquad \text{where } \|x_i - c_\ell\|^2 = \sum_{m=1}^{d} (x_{im} - c_{\ell m})^2,$$
and assign data point $x_i$ to cluster $C_j$.


3. Update step: Recalculate the centroid of each cluster by taking the mean of all data points assigned to that cluster. For each cluster $C_j$, update its centroid as follows:
$$c_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i$$

4. Repeat: Repeat the assignment and update steps until the centroids no
longer change or the change is below a certain threshold.

The algorithm aims to minimize the within-cluster sum of squares (WCSS), which is defined as:
$$\mathrm{WCSS} = \sum_{j=1}^{K} \sum_{x_i \in C_j} \|x_i - c_j\|^2$$

This objective function ensures that the data points within each cluster are as
close as possible to the centroid, leading to compact and well-separated clusters.
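To make the steps above concrete, here is a minimal NumPy sketch of the algorithm. The function name k_means and its default arguments are illustrative choices, not part of these notes:

# Minimal K-means sketch (illustrative; not from the notes).
import numpy as np

def k_means(X, K, n_iters=100, tol=1e-6, seed=0):
    """Cluster the rows of X (shape n x d) into K clusters."""
    rng = np.random.default_rng(seed)
    # Initialization: pick K data points at random as the initial centroids.
    centroids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points
        # (empty clusters keep their previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(K)
        ])
        # Repeat until the centroids stop moving (up to a tolerance).
        if np.linalg.norm(new_centroids - centroids) < tol:
            centroids = new_centroids
            break
        centroids = new_centroids
    # Final assignment and within-cluster sum of squares (WCSS).
    labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
    wcss = np.sum((X - centroids[labels]) ** 2)
    return labels, centroids, wcss

Running k_means(X, K) on a data matrix X of shape (n, d) returns the cluster assignments, the final centroids, and the WCSS of the resulting clustering.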

Figure 1: Illustration of K-means clustering with the Iris dataset. The first 3
features of the dataset were selected for visualization. The plot on the right
displays the clusters assigned by the K-means algorithm. The plot on the left
displays the data points colored by their true labels.

An important consideration when running the K-means algorithm is how to choose the best value of K. This can be achieved with the Elbow method, which consists of computing the 'quality' of your clusters, as defined by an objective function (such as the WCSS) or a performance metric (such as the Silhouette score), for a range of values of K, and then choosing the value of K at the 'elbow' of the graph of the objective function versus K, i.e., the point beyond which increasing K yields only marginal improvement:

Figure 2: Elbow method for choosing ‘best’ value of K for the K-means algo-
rithm.
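As a sketch of the Elbow method in practice, the WCSS can be computed for a range of values of K with scikit-learn's KMeans, assuming scikit-learn is available (the placeholder data and range of K below are illustrative):

# Elbow method sketch: compute the WCSS for K = 1, ..., 10.
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 3)           # placeholder data; use your own dataset
wcss_values = []
for K in range(1, 11):
    km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(X)
    wcss_values.append(km.inertia_)   # inertia_ is the WCSS of the fitted model
# Plot wcss_values against K and look for the 'elbow' in the curve.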

2.2 Hierarchical Clustering
Hierarchical clustering is another clustering technique which builds a hierarchy
of clusters either by merging small clusters into larger ones (agglomerative) or
by splitting larger clusters into smaller ones (divisive). This method is often
visualized using a dendrogram.

Figure 3: Illustration of Hierarchical Clustering. On the left, we display the data to be clustered, with the Euclidean distance used as the distance metric. On the right, we display the hierarchical clustering dendrogram obtained.
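As an illustrative sketch (assuming SciPy and Matplotlib are available), an agglomerative clustering and a dendrogram like the one in Figure 3 can be produced as follows:

# Agglomerative (bottom-up) hierarchical clustering with a dendrogram.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

X = np.random.rand(30, 2)                           # placeholder data
Z = linkage(X, method="ward", metric="euclidean")   # merge small clusters into larger ones
dendrogram(Z)                                       # visualize the hierarchy of merges
plt.show()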

2.3 Distribution-Based Clustering


Distribution-based clustering assumes that the data points are generated from a
mixture of several distributions. The goal is to identify the parameters of these
distributions.

Example: An example of this method is Gaussian Mixture Models (GMM).
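As a short sketch, a GMM can be fit with scikit-learn's GaussianMixture; the number of components and the placeholder data below are illustrative, not from the notes:

# Distribution-based clustering with a Gaussian Mixture Model.
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.rand(200, 2)                               # placeholder data
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
labels = gmm.predict(X)                                  # most likely component per point
print(gmm.means_, gmm.covariances_)                      # fitted distribution parameters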

2.4 Density-Based Clustering


Density-based clustering groups data points that are closely packed together,
marking as outliers the points that lie alone in low-density regions.

Example: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a popular algorithm for this type of clustering.
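A minimal sketch using scikit-learn's DBSCAN (the eps and min_samples values below are illustrative, not tuned recommendations):

# Density-based clustering with DBSCAN; points labeled -1 are outliers.
import numpy as np
from sklearn.cluster import DBSCAN

X = np.random.rand(200, 2)                     # placeholder data
db = DBSCAN(eps=0.1, min_samples=5).fit(X)
labels = db.labels_                            # the label -1 marks low-density outliers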

2.5 Evaluation Metrics for Clustering
To evaluate the performance of clustering algorithms, several metrics can be
used. The Silhouette score, in particular, is a powerful metric.

• Silhouette Score: Measures how similar a point is to its own cluster (cohesion) compared to other clusters (separation). The Silhouette score $s(x_i)$ for a data point $x_i$ is defined as:
$$s(x_i) = \frac{b(x_i) - a(x_i)}{\max(a(x_i), b(x_i))},$$
where $a(x_i)$ is the average distance from the point $x_i$ to the other points in the same cluster, and $b(x_i)$ is the average distance from the point $x_i$ to the points in the nearest cluster to which $x_i$ does not belong.

The average of the Silhouette scores of the data points is then used as the overall metric for the quality of the clusters obtained across the entire dataset:
$$S = \frac{1}{n} \sum_{i=1}^{n} s(x_i).$$
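In practice, this average Silhouette score can be computed with scikit-learn's silhouette_score, as in the following sketch (the placeholder data and the number of clusters are illustrative):

# Average Silhouette score for a K-means clustering.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.rand(200, 3)                                        # placeholder data
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
S = silhouette_score(X, labels)   # mean of s(x_i) over all points; closer to 1 is better
print(S)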

3 Dimensionality Reduction
Dimensionality reduction is a process used in machine learning and statistics to
reduce the number of features or variables under consideration. This technique
simplifies the dataset by decreasing its dimensions without losing significant
information. It is essential for handling high-dimensional data efficiently and
is used to improve the performance of machine learning models, reduce storage
space, and decrease computation time.

3.1 Principal Component Analysis (PCA)


Principal Component Analysis (PCA) is a statistical technique that transforms
the data into a set of orthogonal components (i.e., axes which are at right angles
to each other), which are ordered by the amount of variance they explain. The
first few components capture most of the variability in the data, allowing for a
reduction in the number of dimensions while preserving the essential patterns.
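A short sketch of PCA with scikit-learn, keeping the first two principal components (the choice of two components and the placeholder data are illustrative):

# PCA: project the data onto the top orthogonal components.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 5)                     # placeholder high-dimensional data
pca = PCA(n_components=2).fit(X)
X_reduced = pca.transform(X)                   # data projected onto the top 2 components
print(pca.explained_variance_ratio_)           # fraction of variance explained by each component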

3.2 Linear Discriminant Analysis (LDA)


Linear Discriminant Analysis (LDA) is a technique used for both dimensionality
reduction and classification. It projects the data onto a lower-dimensional space
with the goal of maximizing the separation between multiple classes. LDA is
particularly useful when the classes are well-separated.
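A short sketch of LDA for dimensionality reduction with scikit-learn; note that, unlike the other techniques in these notes, LDA requires class labels y (the placeholder data below is illustrative):

# LDA: project the data onto directions that maximize class separation.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.random.rand(150, 4)                     # placeholder features
y = np.random.randint(0, 3, size=150)          # placeholder class labels (3 classes)
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
X_proj = lda.transform(X)                      # lower-dimensional projection of the data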

Figure 4: Illustration of Principal Component Analysis (PCA).

Figure 5: Illustration of Linear Discriminant Analysis (LDA) using the Iris dataset. The different colors represent the three species of Iris flowers: setosa (navy), versicolor (turquoise), and virginica (dark orange). The plot shows how LDA projects the data onto a lower-dimensional space (with axes $X_1$ and $X_2$) with the goal of maximizing the separation between these classes.
