Support Vector Machines (SVM)
Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification problems. It separates the samples into different classes with a hyperplane.
The hyperplane is a line in two-dimensional space, a plane in three-dimensional space, and a higher-dimensional analogue in general; it divides the samples into different classes. SVM algorithms aim to find the hyperplane with the largest margin, which is the distance between the hyperplane and the closest samples from each class. These closest samples are known as support vectors.
SVM can handle both linear and non-linear data distributions by transforming the data into a higher-dimensional space using a technique called the kernel trick. In the transformed space the separating hyperplane is still linear, but it corresponds to a curved decision boundary in the original space, so samples that are not linearly separable in the original space can still be separated.
Once the hyperplane is determined, it can be used to make predictions for new samples.
Samples that fall on one side of the hyperplane are classified as one class, and samples
that fall on the other side are classified as the other class.
SVM is widely used for binary classification problems, and it can also be used for multi-
class classification problems by training multiple binary classifiers and combining the
results. However, SVM can be computationally expensive for large datasets, and it can
also be sensitive to the choice of kernel function and the hyperparameters used to train
the model.
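As an illustration, a minimal classification sketch assuming scikit-learn is available; the synthetic dataset and the hyperparameter values are placeholder choices, not a recommended configuration:

# Minimal SVM classification sketch using scikit-learn (assumed available).
# The synthetic dataset and hyperparameters are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Generate a small two-class dataset
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=42)

# The RBF kernel handles non-linearly separable data via the kernel trick;
# C and gamma are the hyperparameters the text warns the model is sensitive to.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

print("Support vectors per class:", clf.n_support_)
print("Test accuracy:", clf.score(X_test, y_test))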
Sigmoid function
Multiclass classification
Multi-Class Classification – classification tasks with more than two class labels are referred to as multi-class classification. Unlike binary classification, which distinguishes between exactly two outcomes (for example, normal versus abnormal), multiclass classification assigns each example to one of several pre-defined classes.
Confusion matrix for a four-class problem (rows: true class, columns: predicted class):

True \ Predicted    Class A                Class B                Class C                Class D
Class A             TP class A             FP (A samples as B)    FP (A samples as C)    FP (A samples as D)
Class B             FP (B samples as A)    TP class B             FP (B samples as C)    FP (B samples as D)
Class C             FP (C samples as A)    FP (C samples as B)    TP class C             FP (C samples as D)
Class D             FP (D samples as A)    FP (D samples as B)    FP (D samples as C)    TP class D

Each off-diagonal cell is a false positive for the predicted (column) class and, at the same time, a false negative for the true (row) class.
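As a sketch of how these per-class counts can be read off a confusion matrix (scikit-learn assumed available; the labels below are placeholder data):

# Sketch: deriving per-class TP/FP/FN counts from a multiclass confusion matrix.
# y_true and y_pred are illustrative placeholder labels.
from sklearn.metrics import confusion_matrix

y_true = ["A", "A", "B", "C", "D", "B", "C", "A", "D", "B"]
y_pred = ["A", "B", "B", "C", "D", "B", "A", "A", "D", "C"]
labels = ["A", "B", "C", "D"]

cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, columns: predicted

for i, cls in enumerate(labels):
    tp = cm[i, i]                # diagonal: correctly classified samples of class i
    fn = cm[i, :].sum() - tp     # rest of the row: class i predicted as something else
    fp = cm[:, i].sum() - tp     # rest of the column: other classes predicted as i
    print(f"Class {cls}: TP={tp}, FP={fp}, FN={fn}")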
Transformation to binary
One-vs-rest
One-vs-one
The one-vs-rest (OvR, also called one-vs-all, OvA, or one-against-all, OAA) strategy involves training a single classifier per class, with the samples of that class as positive samples and all other samples as negatives. This strategy requires the base classifiers to produce a real-valued confidence score for their decisions, rather than just a class label; discrete class labels alone can lead to ambiguities, where multiple classes are predicted for a single sample. The number of classifiers is equal to the number of classes.
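A minimal one-vs-rest sketch using scikit-learn's OneVsRestClassifier (assumed available); the synthetic dataset and the LinearSVC base classifier are placeholder choices:

# One-vs-rest: one binary classifier per class, each scoring "this class vs. the rest".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=4, random_state=0)

ovr = OneVsRestClassifier(LinearSVC())   # trains K = 4 binary classifiers
ovr.fit(X, y)

print("Number of binary classifiers:", len(ovr.estimators_))  # equals the number of classes
scores = ovr.decision_function(X[:1])    # real-valued confidence score per class
print("Confidence scores:", scores, "-> predicted class:", np.argmax(scores))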
In the one-vs-one (OvO) reduction, one trains K(K − 1)/2 binary classifiers for a K-way multiclass problem; each receives the samples of a pair of classes from the original training set and must learn to distinguish these two classes. At prediction time, a voting scheme is applied: all K(K − 1)/2 classifiers are applied to an unseen sample, and the class that receives the highest number of votes is the prediction of the combined classifier.
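A matching one-vs-one sketch with scikit-learn's OneVsOneClassifier (same assumptions and placeholder data as above):

# One-vs-one: K(K-1)/2 pairwise classifiers, combined by voting.
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import LinearSVC

K = 4
X, y = make_classification(n_samples=300, n_features=5, n_informative=4,
                           n_redundant=0, n_classes=K, random_state=0)

ovo = OneVsOneClassifier(LinearSVC())
ovo.fit(X, y)

print("Number of pairwise classifiers:", len(ovo.estimators_))  # K*(K-1)/2 = 6
print("Predicted class for first sample:", ovo.predict(X[:1])[0])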
Multiclass Classification using Support Vector Machine
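In scikit-learn, SVC handles more than two classes by applying a one-vs-one scheme internally, so it can be used directly on multiclass data. A short sketch on the built-in Iris dataset (scikit-learn assumed available):

# Multiclass SVM: SVC builds pairwise (one-vs-one) classifiers internally
# and combines them by voting.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)        # 3 classes, 4 features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))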
k-Nearest Neighbors (k-NN) classification algorithm
The k-Nearest Neighbors (k-NN) classification algorithm is a simple and effective
machine learning technique used for classification tasks. The basic idea behind the
algorithm is to find the k data points in the training set that are closest to the new data
point, and then use those k data points to determine the class of the new data point.
Here's how the algorithm works:
1. The algorithm starts by storing the entire training dataset in memory.
2. When a new data point is encountered, the algorithm calculates the distance
between the new data point and each of the data points in the training set.
3. The algorithm then selects the k data points from the training set that are
closest to the new data point, based on the calculated distances.
4. The class of the new data point is determined by a majority vote of the classes
of the k nearest neighbours. If k = 1, the class of the new data point is the
same as the class of the closest data point in the training set.
5. This process is repeated for each new data point in the test set.
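These steps translate almost directly into code; a minimal from-scratch sketch, assuming numeric features, Euclidean distance, and the small placeholder dataset below:

# Minimal k-NN classifier following the steps above (Euclidean distance, majority vote).
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Step 2: distance from the new point to every training point
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # Step 3: indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Step 4: majority vote among the k nearest neighbours
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Placeholder training data: two features, two classes
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])

print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> 0
print(knn_predict(X_train, y_train, np.array([4.9, 5.1]), k=3))  # -> 1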
The k-NN algorithm is considered a "lazy" learning algorithm, because it doesn't explicitly learn a model from the training data. Instead, it simply memorizes the training data and uses it to make predictions on new data points. This makes the algorithm simple to implement and essentially free to "train", but prediction requires comparing the new point against the whole training set, and the algorithm is sensitive to the choice of k and the quality of the training data.
In summary, the k-NN classification algorithm is a simple and effective algorithm that
can be used for classification tasks. It is particularly useful when the relationship
between the features and the classes is not well understood, and it can be easily
implemented in a variety of contexts.
There are several ways to calculate the distance between two data points in a machine
learning problem, including:
Euclidean distance
Manhattan Distance
Minkowski Distance
Cosine Similarity
Jaccard Similarity
Mahalanobis Distance
Euclidean Distance: This is the most commonly used distance metric, and it is based on the Pythagorean theorem. The Euclidean distance between two points is defined as the square root of the sum of the squares of the differences between the corresponding coordinates.
The equation for Euclidean distance between two points, x and y, with d features is given by:
d(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xd - yd)^2)
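A quick numerical check with numpy (the two example points are arbitrary):

import numpy as np
x, y = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 8.0])
d_euclidean = np.sqrt(np.sum((x - y) ** 2))   # equivalently: np.linalg.norm(x - y)
print(d_euclidean)  # sqrt(9 + 16 + 25) = sqrt(50) ~= 7.07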
Manhattan Distance: Also known as the taxicab or city block distance, this distance
metric is calculated as the sum of the absolute differences between the coordinates of
the two points. It is used when the distance between two points is defined as the
minimum distance a person would have to travel if moving only vertically or
horizontally.
The equation for Manhattan distance (also known as L1 distance) between two points,
x and y, with d features is given by:
d(x, y) = |x1 - y1| + |x2 - y2| + ... + |xd - yd|
where x1, x2, ..., xd are the features of the first point, and y1, y2, ..., yd are the features of the second point. The Manhattan distance is the sum of the absolute differences between the corresponding features of the two points. It is commonly used, for example, in image processing and robotics problems and, more generally, on grid-like data where movement is restricted to horizontal and vertical steps.
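The same two example points under the Manhattan distance:

import numpy as np
x, y = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 8.0])
d_manhattan = np.sum(np.abs(x - y))   # |1-4| + |2-6| + |3-8| = 3 + 4 + 5
print(d_manhattan)  # 12.0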
Minkowski Distance: This is a generalization of the Euclidean and Manhattan
distances, and it is defined as the Lp-norm of the differences between the two points.
The parameter p determines whether the distance is equivalent to the Euclidean or
Manhattan distance.
The equation for Minkowski distance between two points, x and y, with d features is given by:
d(x, y) = (|x1 - y1|^p + |x2 - y2|^p + ... + |xd - yd|^p)^(1/p)
where x1, x2, ..., xd are the features of the first point, and y1, y2, ..., yd are the features of
the second point. The parameter p determines the degree of the Minkowski distance and
can take on any positive real value. When p = 1, the Minkowski distance becomes the
Manhattan distance. When p = 2, the Minkowski distance becomes the Euclidean
distance.
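A small sketch showing how the parameter p recovers both special cases (same example points as above):

import numpy as np

def minkowski(x, y, p):
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x, y = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 8.0])
print(minkowski(x, y, p=1))   # 12.0  -> Manhattan distance
print(minkowski(x, y, p=2))   # ~7.07 -> Euclidean distance
print(minkowski(x, y, p=3))   # 6.0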
Cosine Similarity: This distance metric is used when working with high-dimensional
data, such as text or image data. It is calculated as the dot product of two vectors
divided by the product of their magnitudes.
The equation for cosine similarity between two vectors, x and y, with d features is given by:
cosine_similarity(x, y) = (x1*y1 + x2*y2 + ... + xd*yd) / (sqrt(x1^2 + ... + xd^2) * sqrt(y1^2 + ... + yd^2))
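The same example vectors in numpy:

import numpy as np
x, y = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 8.0])
cos_sim = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_sim)  # close to 1: the vectors point in nearly the same direction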
Jaccard Similarity: This distance metric is used to compare the similarity between two
sets of data, such as in text classification or image segmentation problems. It is
calculated as the size of the intersection of the two sets divided by the size of the union
of the two sets.
The equation for Jaccard similarity between two sets, X and Y, is given by:
Jaccard_similarity(X, Y) = |X intersection Y| / |X union Y|
where |X| is the number of elements in set X, |Y| is the number of elements in set Y,
and |X intersection Y| is the number of elements that are common to both sets X and
Y. The Jaccard similarity is a measure of the similarity between two sets, and it ranges
from 0 to 1, with values close to 1 indicating high similarity and values close to 0
indicating high dissimilarity.
Jaccard similarity is widely used in information retrieval and text classification
problems, where the data is represented as sets of words or terms. It is also used in image
processing and computer vision problems to measure the similarity between two images
based on the objects or features present in the images. In addition, it is used in
bioinformatics to measure the similarity between two biological sequences, such as
DNA or protein sequences.
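A small sketch with two placeholder word sets, using Python's built-in set operations:

# Jaccard similarity on two sets of words (e.g., two short documents).
X = {"the", "cat", "sat", "on", "the", "mat"}   # duplicates collapse in a set
Y = {"the", "dog", "sat", "on", "the", "rug"}
jaccard = len(X & Y) / len(X | Y)   # |intersection| / |union|
print(jaccard)  # 3 common words ("the", "sat", "on") out of 7 distinct -> ~0.43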
Mahalanobis Distance: This is a more advanced distance metric that takes into
account the covariance of the data. It is particularly useful when working with data
that has correlations between features.
The equation for Mahalanobis distance between a vector x and the mean vector μ of a
set of n vectors, in a d-dimensional feature space, is given by:
d(x, μ) = sqrt((x - μ)^T * Σ^-1 * (x - μ))
where x is the vector representing the point, μ is the mean vector of the set of n
vectors, Σ is the covariance matrix of the set of n vectors, and (x - μ)^T * Σ^-1 * (x -
μ) is the quadratic form. The Mahalanobis distance is a measure of the distance
between a point and the mean of a set of points, taking into account the covariance
structure of the data.
Mahalanobis distance is commonly used in multivariate statistical analysis to measure
the distance between a point and a distribution. It is especially useful when the data
has correlations between the features, as it accounts for these correlations in the
calculation of the distance. It is also used in outlier detection, pattern recognition, and
classification problems, where the goal is to distinguish between normal and abnormal
data points based on their distance from the mean of the data.
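A minimal numpy sketch with a small placeholder dataset; the covariance matrix and its inverse are computed from the data:

# Mahalanobis distance of a point from the mean of a small 2-D dataset.
import numpy as np

data = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [4.0, 6.0], [5.0, 8.0]])
mu = data.mean(axis=0)
cov = np.cov(data, rowvar=False)        # 2x2 covariance matrix of the features
cov_inv = np.linalg.inv(cov)

x = np.array([3.0, 3.0])
diff = x - mu
d_mahalanobis = np.sqrt(diff @ cov_inv @ diff)   # sqrt((x - mu)^T * Sigma^-1 * (x - mu))
print(d_mahalanobis)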
The choice of distance metric depends on the specific requirements of the problem and
the nature of the data. In some cases, the choice of distance metric can significantly
impact the performance of a machine learning algorithm, so it is important to carefully
consider the appropriate distance metric for each problem.
Identification and Verification
Identification – classification into one of many classes (a one-to-many comparison).
Verification – confirmation that a sample belongs to a claimed class (a one-to-one comparison).
Identification refers to the process of determining who a person is. This is typically done
by requiring the person to provide some form of identification, such as a username and
password, a government-issued ID, or biometric data such as a fingerprint or facial
recognition. The goal of identification is to establish the identity of an individual so that
they can be granted access to certain resources or privileges.
Verification, on the other hand, is the process of confirming that a claimed identity is
accurate. This is typically done by comparing the information provided by the person
to some other source of information, such as a database or a trusted third-party service.
The goal of verification is to ensure that the person claiming to be someone is actually
who they say they are, and not an impostor.
In summary, identification is about establishing who a person is, while verification
is about confirming that a claimed identity is accurate.
ROC (Receiver Operating Characteristic)
• The performance of a classifier is represented as a point in ROC space (false positive rate on the x-axis, true positive rate on the y-axis)
• Changing a parameter of the algorithm, the sample distribution, or the cost matrix changes the location of the point
ROC Curve
- 1-dimensional data set containing 2 classes (positive and negative)
- any point located at x > t is classified as positive
At threshold t:
TP = 0.5, FN = 0.5, FP = 0.12, TN = 0.88
Using ROC for Model Comparison
• No model consistently outperforms the other:
  • M1 is better for small FPR
  • M2 is better for large FPR
• Area Under the ROC Curve (AUC):
  • Ideal classifier: Area = 1
  • Random guess: Area = 0.5
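A short scikit-learn sketch of such a comparison; the two models standing in for M1 and M2 are arbitrary placeholder choices, and the data is synthetic:

# Comparing two classifiers by their ROC curves and AUC (scikit-learn assumed available).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

m1 = LogisticRegression(max_iter=1000).fit(X_train, y_train)
m2 = SVC(probability=True, random_state=1).fit(X_train, y_train)

for name, model in [("M1", m1), ("M2", m2)]:
    scores = model.predict_proba(X_test)[:, 1]           # probability of the positive class
    fpr, tpr, thresholds = roc_curve(y_test, scores)     # points of the ROC curve
    print(name, "AUC =", roc_auc_score(y_test, scores))  # 1.0 ideal, 0.5 random guess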