
Naïve Bayes Classifier Algorithm
Dr Preeti Rai
Naïve Bayes Classifier Algorithm

• Naïve Bayes is a supervised learning algorithm based on Bayes' theorem and used for solving classification problems.
• It is mainly used in text classification, which involves high-dimensional training datasets.
• Naïve Bayes is one of the simplest and most effective classification algorithms; it helps build fast machine learning models that can make quick predictions.
• It is a probabilistic classifier, which means it predicts on the basis of the probability that an object belongs to each class.
• Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.
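As a quick illustration of the spam-filtering application above, here is a minimal sketch using scikit-learn (the library choice and the tiny toy corpus are assumptions added for illustration; the slides do not name an implementation):

# A minimal spam-filtering sketch (assumed library: scikit-learn; toy data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus: 1 = spam, 0 = not spam.
texts = [
    "win a free prize now",
    "meeting rescheduled to monday",
    "free money click here",
    "lunch with the project team",
]
labels = [1, 0, 1, 0]

# Convert texts to word-count features (the high-dimensional representation
# mentioned above), then fit the Naive Bayes model.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
model = MultinomialNB()
model.fit(X, labels)

# Predict the class of a new message.
print(model.predict(vectorizer.transform(["free prize money"])))  # likely [1]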
Why is it called Naïve Bayes?

• The Naïve Bayes algorithm is made up of two words, Naïve and Bayes, which can be described as follows:
• Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature individually contributes to identifying it as an apple, without depending on the others.
• Bayes: It is called Bayes because it depends on the principle of Bayes' theorem.
Bayes' Theorem:

• Bayes' theorem is also known as Bayes' rule or Bayes' law. It is used to determine the probability of a hypothesis given prior knowledge, and it depends on conditional probability.
• The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) * P(A) / P(B)

Where,
P(A|B) is the posterior probability: the probability of hypothesis A given the observed event B.
P(B|A) is the likelihood: the probability of evidence B given that hypothesis A is true.
P(A) is the prior probability: the probability of the hypothesis before observing the evidence.
P(B) is the marginal probability: the probability of the evidence.
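To make the formula concrete, the theorem can be written as a one-line Python function (an illustrative addition, not part of the original slides); the numbers plugged in below come from the weather example worked out later in this deck:

# Bayes' theorem as a function: P(A|B) = P(B|A) * P(A) / P(B).
def posterior(likelihood, prior, evidence):
    """Return P(A|B) given P(B|A), P(A), and P(B)."""
    return likelihood * prior / evidence

# Numbers from the weather example later in these slides:
# P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
print(round(posterior(0.3, 0.71, 0.35), 2))  # 0.61 (the slides round to 0.60)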
Working of Naïve Bayes Classifier:

• Convert the given dataset into frequency tables.
• Generate a likelihood table by finding the probabilities of the given features.
• Now, use Bayes' theorem to calculate the posterior probability.

• Problem: If the weather is Sunny, should the player play or not?
Day  Outlook   Play
0    Rainy     Yes
1    Sunny     Yes
2    Overcast  Yes
3    Overcast  Yes
4    Sunny     No
5    Rainy     Yes
6    Sunny     Yes
7    Overcast  Yes
8    Rainy     No
9    Sunny     No
10   Sunny     Yes
11   Rainy     No
12   Overcast  Yes
13   Overcast  Yes
Step 1: Frequency table for the Weather Conditions

Weather Yes No
Overcast 5 0
Rainy 2 2
Sunny 3 2
Total 10 4
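
This frequency table can be reproduced in a few lines of Python (an illustrative sketch using only the standard library; the data list simply transcribes the table above):

# Build the Step 1 frequency table from the raw (Outlook, Play) pairs.
from collections import Counter

data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

freq = Counter(data)  # e.g. freq[("Sunny", "Yes")] == 3
for weather in ["Overcast", "Rainy", "Sunny"]:
    print(weather, freq[(weather, "Yes")], freq[(weather, "No")])
# Overcast 5 0
# Rainy 2 2
# Sunny 3 2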

Step 2: Likelihood table for the weather conditions:

Weather    No           Yes          P(Weather)
Overcast   0            5            5/14 = 0.35
Rainy      2            2            4/14 = 0.29
Sunny      2            3            5/14 = 0.35
All        4/14 = 0.29  10/14 = 0.71
• Applying Bayes' theorem:
• P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
• P(Sunny|Yes) = 3/10 = 0.3
• P(Sunny) = 0.35
• P(Yes) = 0.71
• So P(Yes|Sunny) = 0.3 * 0.71 / 0.35 = 0.60
• P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
• P(Sunny|No) = 2/4 = 0.5
• P(No) = 0.29
• P(Sunny) = 0.35
• So P(No|Sunny) = 0.5 * 0.29 / 0.35 = 0.41
• As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny).
• Hence, on a sunny day, the player can play the game.
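Continuing the sketch above, the two posteriors can be computed directly from the frequency counts. Note that using exact fractions gives P(No|Sunny) = 0.40 rather than the 0.41 obtained from the rounded values in the slides:

# Compute P(Yes|Sunny) and P(No|Sunny) from the frequency counts above.
total = sum(freq.values())                                   # 14
n_yes = sum(v for (w, p), v in freq.items() if p == "Yes")   # 10
n_no = total - n_yes                                         # 4

p_yes = n_yes / total                                        # P(Yes) = 10/14
p_no = n_no / total                                          # P(No)  = 4/14
p_sunny = (freq[("Sunny", "Yes")] + freq[("Sunny", "No")]) / total  # 5/14

p_yes_given_sunny = (freq[("Sunny", "Yes")] / n_yes) * p_yes / p_sunny
p_no_given_sunny = (freq[("Sunny", "No")] / n_no) * p_no / p_sunny

print(round(p_yes_given_sunny, 2), round(p_no_given_sunny, 2))  # 0.6 0.4
# P(Yes|Sunny) > P(No|Sunny), so the classifier predicts "play".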
Advantages of Naïve Bayes Classifier:

• Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.
• It can be used for binary as well as multi-class classification.
• It performs well in multi-class predictions compared to many other algorithms.
• It is one of the most popular choices for text classification problems.
Disadvantages of Naïve Bayes Classifier:

• Naïve Bayes assumes that all features are independent or unrelated, so it cannot learn relationships between features.
Applications of Naïve Bayes Classifier:

• It is used for credit scoring.
• It is used in medical data classification.
• It can be used for real-time predictions because the Naïve Bayes classifier is an eager learner.
• It is used in text classification tasks such as spam filtering and sentiment analysis.
Day  Outlook   Temperature  Humidity  Windy  Play Golf
0    Rainy     Hot          High      False  No
1    Rainy     Hot          High      True   No
2    Overcast  Hot          High      False  Yes
3    Sunny     Mild         High      False  Yes
4    Sunny     Cool         Normal    False  Yes
5    Sunny     Cool         Normal    True   No
6    Overcast  Cool         Normal    True   Yes
7    Rainy     Mild         High      False  No
8    Rainy     Cool         Normal    False  Yes
9    Sunny     Mild         Normal    False  Yes
10   Rainy     Mild         Normal    True   Yes
11   Overcast  Mild         High      True   Yes
12   Overcast  Hot          Normal    False  Yes
13   Sunny     Mild         High      True   No
• Now it's time to apply the naïve assumption to Bayes' theorem: independence among the features. So we split the evidence into its independent parts.
• If any two events A and B are independent, then:

P(A,B) = P(A) * P(B)

• Hence, for a class y and features x1, ..., xn (here y is Play Golf and the features are Outlook, Temperature, Humidity, and Windy), we reach the result:

P(y | x1, ..., xn) = P(x1 | y) * P(x2 | y) * ... * P(xn | y) * P(y) / (P(x1) * P(x2) * ... * P(xn))

which can be expressed as:

P(y | x1, ..., xn) = P(y) * Π P(xi | y) / (P(x1) * P(x2) * ... * P(xn))

Now, as the denominator remains constant for a given input, we can remove that term:

P(y | x1, ..., xn) ∝ P(y) * Π P(xi | y)
Let us test it on a new set of features (let us call it "today"):

today = (Sunny, Hot, Normal, False)
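
The deck ends here, so the following from-scratch sketch (an illustrative addition, not part of the original slides) applies the proportionality rule above to the Play Golf table and classifies today:

# Naive Bayes by hand on the Play Golf dataset (illustrative sketch).
from collections import Counter

# Rows: (Outlook, Temperature, Humidity, Windy, Play Golf), from the table above.
rows = [
    ("Rainy", "Hot", "High", "False", "No"),
    ("Rainy", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Sunny", "Mild", "High", "False", "Yes"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Sunny", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Rainy", "Mild", "High", "False", "No"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "High", "True", "No"),
]

today = ("Sunny", "Hot", "Normal", "False")

class_counts = Counter(r[-1] for r in rows)  # Counter({'Yes': 9, 'No': 5})
scores = {}
for label, n_label in class_counts.items():
    # P(y) * product over the features of P(xi | y)
    score = n_label / len(rows)
    for i, value in enumerate(today):
        n_match = sum(1 for r in rows if r[i] == value and r[-1] == label)
        score *= n_match / n_label
    scores[label] = score

# Unnormalized scores; the larger one wins.
print(scores)                       # {'No': ~0.0046, 'Yes': ~0.0212}
print(max(scores, key=scores.get))  # 'Yes' -> play golf today

Because 0.0212 > 0.0046 (equivalently, P(Yes | today) ≈ 0.82 after normalizing), the prediction is that golf is played today.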
