Chapter Four - Part One

Chapter 4 discusses classification and clustering algorithms, focusing on classification algorithms such as Logistic Regression, Decision Trees, Naïve Bayes, Support Vector Machines, and K-Nearest Neighbors. It explains the types of classifiers (binary and multi-class) and elaborates on the implementation and evaluation of these algorithms, including data preprocessing and performance metrics like the confusion matrix, precision, recall, and F1-score. It also covers specific algorithms in detail, including their advantages and disadvantages.

Chapter 4: Classification and Clustering Algorithms

Classification Algorithms

Classification algorithms are used when the output variable is categorical, i.e., it takes discrete class labels such as Yes/No, Male/Female, or True/False.

Popular classification algorithms:
➔ Logistic Regression
➔ Decision Trees
➔ Naïve Bayes
➔ Support Vector Machines
➔ K-Nearest Neighbors (KNN)
➔ Random Forest
Classification Algorithms
● In a classification algorithm, a discrete output function (y) is learned that maps the input variable (x) to a categorical output:
y = f(x), where y is a categorical output.
Classification Algorithms

The algorithm that implements classification on a dataset is known as a classifier.
There are two types of classification:
 Binary Classifier: if a classification problem has only two possible outcomes, it is called a binary classifier.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
 Multi-class Classifier: if a classification problem has more than two outcomes, it is called a multi-class classifier.
Examples: classification of types of crops, classification of types of music.
Types of ML Classification Algorithms

Classification algorithms can be further divided into two main categories:
Linear Models
✔ Logistic Regression
✔ Support Vector Machines
Non-linear Models
✔ K-Nearest Neighbours
✔ Decision Tree
✔ Naïve Bayes
✔ Random Forest
Types of Logistic Regression

 On the basis of the categories of the dependent variable, logistic regression can be classified into three types:

 Binomial: in binomial logistic regression, there can be only two possible types of the dependent variable, such as 0 or 1, Pass or Fail, etc.

 Multinomial: in multinomial logistic regression, there can be 3 or more possible unordered types of the dependent variable, such as "cat", "dog", or "sheep".

 Ordinal: in ordinal logistic regression, there can be 3 or more possible ordered types of the dependent variable, such as "low", "medium", or "high".
ML Implementation of Logistic Regression

 Data pre-processing step
 Fitting logistic regression to the training set
 Predicting the test results
 Testing the accuracy of the results
Code samples
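A minimal sketch of the four steps above using scikit-learn; since the slide's own code is not reproduced here, synthetic stand-in data (make_classification) is used in place of a real dataset:

# Minimal sketch of the logistic regression workflow (synthetic stand-in data)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Stand-in data; replace with a real feature matrix X and label vector y
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# 1. Data pre-processing: split the data, then scale the features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# 2. Fit logistic regression to the training set
classifier = LogisticRegression()
classifier.fit(X_train, y_train)

# 3. Predict the test results
y_pred = classifier.predict(X_test)

# 4. Test the accuracy of the results
print("Accuracy:", accuracy_score(y_test, y_pred))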
Example
A car-making company has recently launched a new car.
The company wants to predict whether a user will purchase the product or not; to do so, one needs to find out the relationship between Age and Estimated Salary.
Source of data:
https://www.kaggle.com/code/sandragracenelson/logistic-regression-on-user-data-csv/input
Data Preprocessing - Related to Our Dataset
Data preprocessing techniques you should apply (a code sketch of these steps follows below):
1. Learn about your data using pandas: df.shape, df.describe(), df.isnull().sum()
 If we want to include gender as an independent variable, replace "male" and "female" with the discrete values 0 and 1.
 Select the appropriate data:
X = df.iloc[:, :] or X = df[[...]] (listing the chosen columns)
2. As we can see, there is a large variation between the age and salary values, which may create bias, so we need to apply feature scaling/normalization using StandardScaler or MinMaxScaler.
3. Split the data, then train and test your algorithm.
K-Nearest Neighbors(KNN)

K-Nearest Neighbors (KNN) is a simple and versatile machine learning algorithm used for both classification and regression tasks.

The fundamental idea behind KNN is to predict the label of a data point by looking at its k nearest neighbors in the feature space.

Technique to classify:
 Given a new, unseen data point, find the k nearest neighbors in the training set based on some distance metric (e.g., Euclidean distance).
 For classification: assign the majority class label among the k nearest neighbors to the new data point.
A short scikit-learn sketch follows below.
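A minimal classification sketch with scikit-learn's KNeighborsClassifier, assuming the scaled X_train/X_test split prepared in the preprocessing step above; k = 5 is an illustrative choice:

from sklearn.neighbors import KNeighborsClassifier

# k = 5 neighbours; metric='minkowski' with p=2 gives Euclidean distance
knn = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
knn.fit(X_train, y_train)   # "training" mostly just stores the data

# Each test point gets the majority class among its 5 nearest neighbours
y_pred = knn.predict(X_test)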
K-Nearest Neighbors(KNN)
Advantages
– Conceptually simple; easy to understand and explain
– Very flexible decision boundaries
– No explicit training phase (a "lazy" learner)

Disadvantages
– It can be hard to find a good distance measure
– Irrelevant features and noise can be very detrimental
– Typically cannot handle more than a few dozen attributes
– Computational cost: requires a lot of computation and memory at prediction time
SVM Machine Learning algorithm

Support Vector Machine (SVM) is one of the most useful supervised ML algorithms.

It can be used for both classification and regression tasks.

Basic idea of support vector machines:
• SVM is a geometric model that views the input data as two sets of vectors in an n-dimensional space.
• It constructs a separating hyperplane in that space, one which maximizes the margin between the two data sets.
SVM Machine Learning algorithm
A good separation is achieved by the hyperplane that has the largest distance to the neighbouring data points of both classes.
• The vectors (points) that constrain the width of the margin are the support vectors.
● Support vectors are the data points that lie closest to the decision surface.
An SVM analysis finds the line (or, in general, hyperplane) that is oriented so that the margin between the support vectors is maximized.
Given two candidate separators, the one with the larger margin (Solution 2 rather than Solution 1 in the slide's figure) is superior.
SVM Machine Learning algorithm

SVMs maximize the margin around the separating hyperplane.
• The decision function is fully specified by a subset of the training samples: the support vectors.
• In 2 dimensions, the separator is a line.
• In 3 dimensions, it is a plane.
• In more dimensions, we call it a hyperplane.
SVM Machine Learning algorithm
Basic idea of support vector machines:
– Find a hyperplane for linearly separable patterns.
– A hyperplane is a linear decision surface that splits the space into two parts.
– For non-linearly separable data, transform the original data to map it into a new space, using a kernel function.
SVM Machine Learning algorithm
SVMs are important because they:
– Are robust to a very large number of variables and small samples
– Can learn both simple and highly complex classification models
– Employ sophisticated mathematical principles to avoid overfitting
– Can be used for both classification and regression tasks
– Are effective in cases of limited data
SVM Implementation Python
Scenario

Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray).

After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous or not.
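A sketch of this scenario using the Wisconsin breast-cancer dataset that ships with scikit-learn, as a stand-in for the data used on the original slides (not shown here):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Binary labels: malignant vs. benign tumours
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# SVMs are sensitive to feature scales, so standardize first
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Linear kernel: find the maximum-margin separating hyperplane
model = SVC(kernel="linear")
model.fit(X_train, y_train)

print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))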
Naïve Bayes ML Algorithm

Naïve Bayes is one of the simplest and most effective classification algorithms; it helps build fast machine learning models that can make quick predictions.

It is mainly used in text classification, which typically involves a high-dimensional training dataset.

Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.
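As a reminder of the underlying math, the classifier rests on Bayes' theorem together with the "naïve" assumption that the features x_1, ..., x_n are conditionally independent given the class y:

P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)

\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)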
Example: Naïve Bayes ML
Problem: using the given dataset, classify (predict) whether a person, under the given conditions, will play tennis or not.
Example: Naïve Bayes ML
Step 1: calculate the prior (class-label) probability for the Yes/No conditions. "Yes" appears 9 times and "No" appears 5 times out of 14 records, so P(Yes) = 9/14 and P(No) = 5/14.
Example: Naïve Bayes ML
Step 2: calculate the conditional probability of each individual attribute/predictor (Outlook, Temperature, Humidity, Windy) given the class.
Example: Naïve Bayes ML

Step 3: apply the Naïve Bayes formula to classify the new instance: multiply the prior by the conditional probabilities of all features for "Yes", do the same for "No", compare the two values, and finally normalize them.

For the given features, we conclude that the person will not play tennis.
Example: Naïve Bayes ML
Step 3 (second example): based on the following data, classify the new species.
Example: Naïve Bayes ML
From the computed posteriors, the value for class H is higher than for class M, so the new instance is classified as H.
Naïve Bayes ML
Advantages

– Simple

– Supports incremental learning

– Is naturally a probability estimator

– Easily handles missing values

Disadvantages / Weaknesses

– The independence assumption rarely holds exactly

– Works best with categorical/discrete attributes

– Zero-frequency problem: an attribute value unseen in training gets zero probability unless smoothing is applied
Example: Naïve Bayes ML Python Implementation

from sklearn.naive_bayes import BernoulliNB
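Expanding that import into a small runnable sketch; the binary feature matrix below is made-up illustration data (e.g., word presence/absence), not the play-tennis table from the slides:

import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Toy binary features and class labels (illustrative only)
X = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])
y = np.array([1, 1, 0, 0])

model = BernoulliNB()
model.fit(X, y)

print(model.predict([[1, 0, 0]]))         # predicted class for a new instance
print(model.predict_proba([[1, 0, 0]]))   # normalized posterior probabilities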
Decision Tree
Solving a classification problem using a DT is a two-step process:
• Decision tree induction: construct a DT from the training data.
• Tree application: use the constructed DT to classify new, unseen instances.
Decision Tree...

In order to build a tree, we can use the CART algorithm, which stands for Classification And Regression Tree.

● Pruning: pruning is the process of removing unwanted branches from the tree.

● Entropy measures the randomness or disorder of the information being processed.

● Every piece of information has a specific value and can be used to draw conclusions from it.

● The higher the entropy, the more difficult it is to draw any conclusion from that piece of information.
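For a set S whose classes occur with proportions p_1, ..., p_c, entropy is computed as:

H(S) = -\sum_{i=1}^{c} p_i \log_2 p_i

For example, a pure node gives H = 0, while an even 50/50 binary mix gives H = 1.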
Decision Tree...


Let's consider a case when all observations belong to the same class; the entropy is then always 0.

When the entropy is 0, the dataset has no impurity: it is pure, so there is nothing left to learn from splitting it further.

Conversely, for a binary problem the entropy reaches its maximum of 1 when the classes are evenly mixed; such a dataset offers the most to learn from.
Attribute Selection Measures (ASM)
In a DT, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes.
To solve this problem, there is a technique called an Attribute Selection Measure (ASM).
 Information Gain:
✔ is the measurement of the change in entropy after segmenting the dataset based on an attribute.
✔ According to the value of information gain, we split the node and build the decision tree.
✔ The DT algorithm always tries to maximize the value of information gain, and the node/attribute with the highest information gain is split first.
 Gini Index:
✔ is a measure of impurity used while creating a decision tree in CART (which uses the Gini index for splitting).
✔ An attribute with a low Gini index should be preferred over one with a high Gini index.
The formulas for both measures are given below.
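In formula form (standard definitions, since the slide gives none): information gain is the drop in entropy obtained by splitting S on attribute A into subsets S_v, and the Gini index measures impurity from the class proportions p_i:

\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \, H(S_v)

\mathrm{Gini}(S) = 1 - \sum_{i=1}^{c} p_i^{2}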
Decision Tree - Python Implementation

Step: import the library and train the model

from sklearn.tree import DecisionTreeClassifier

classifier = DecisionTreeClassifier()

Key term

Check more on how to calculate the Gini index and information gain:

https://www.youtube.com/watch?v=wefc_36d5mU&ab_channel=MaheshHuddar
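A slightly fuller sketch of those two lines, assuming the X_train/X_test split from the earlier preprocessing step:

from sklearn.tree import DecisionTreeClassifier

# criterion='entropy' splits by information gain; 'gini' (the default) uses the Gini index
classifier = DecisionTreeClassifier(criterion="entropy", random_state=0)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)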
Evaluating a Classification Model

The confusion matrix presents the prediction results in summarized form, giving the total numbers of correct and incorrect predictions.
For a binary problem, the matrix takes the following form:

                      Actual: Positive       Actual: Negative
Predicted: Positive   True Positive (TP)     False Positive (FP)
Predicted: Negative   False Negative (FN)    True Negative (TN)
Evaluating a Classification Model

Use a confusion matrix:
 The confusion matrix provides a matrix/table as output that describes the performance of the model.
 It is also known as the error matrix.
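A sketch of computing it with scikit-learn, assuming y_test and y_pred from any of the classifiers above:

from sklearn.metrics import confusion_matrix, classification_report

cm = confusion_matrix(y_test, y_pred)
print(cm)                                     # rows: actual classes, columns: predicted classes
print(classification_report(y_test, y_pred))  # precision, recall, F1 per class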
Evaluating a Classification Model

Precision, Recall, and F1-Score
 These metrics are particularly useful in binary or multiclass classification.
 Precision: the ratio of correctly predicted positive observations to the total predicted positives.
 Recall: the ratio of correctly predicted positive observations to all actual positives.
 F1-Score: the harmonic mean of precision and recall.
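In terms of the confusion-matrix counts:

\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}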
Evaluating Regression Models

Regression metrics:
 Mean Absolute Error (MAE): the average of the absolute differences between predicted and actual values.
 Mean Squared Error (MSE): the average of the squared differences between predicted and actual values.
 Root Mean Squared Error (RMSE): the square root of the MSE, which is on the same scale as the target and therefore easier to interpret.
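For n samples with actual values y_i and predictions \hat{y}_i:

\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert, \qquad \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}}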
