
SUPERVISED MACHINE LEARNING

UNIT – 3
Syllabus
➢ Classifying with probability theory - Naïve Bayes
➢ Using probability distributions for classification
➢ Learning the naïve Bayes classifier
➢ Parsing data from RSS feeds
➢ Using naïve Bayes to reveal regional attitudes

Probability:
Probability is defined as the likelihood of an event occurring: it equals the
ratio of the number of favourable outcomes to the total number of outcomes.

• P(E) = Number of favourable outcomes / Total number of outcomes
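As a quick sketch of the formula (the fair six-sided die below is an assumed example, not from the text):

```python
# Probability of rolling an even number on a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]                      # total outcomes
favourable = [n for n in outcomes if n % 2 == 0]   # favourable outcomes

p_even = len(favourable) / len(outcomes)
print(p_even)  # 0.5
```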

Bayes Theorem:-
Bayes' theorem is a fundamental concept in probability theory and statistics. It
gives the conditional probability of an event, i.e. the likelihood of something
happening given that something else has already occurred:

P(A|B) = P(B|A) × P(A) / P(B)
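A small numeric sketch of Bayes' theorem applied to spam detection; the probabilities below are assumed purely for illustration:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Toy question: how likely is an email to be spam, given it contains "free"?
p_spam = 0.2               # prior P(spam), assumed
p_free_given_spam = 0.5    # likelihood P("free" | spam), assumed
p_free_given_ham = 0.05    # likelihood P("free" | not spam), assumed

# Total probability of seeing "free" in any email: P(B)
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Posterior P(spam | "free")
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 3))  # 0.714
```

Even though only 20% of emails are spam, seeing the word "free" raises the posterior to about 71%.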
What is meant by Classification

A classification algorithm is a supervised learning technique used to assign a
new observation to a category based on training data. The output of
classification is a class label rather than a continuous value, such as "green
or blue", "fruit or animal", etc.
Probability in Classification
This section examines how probability is integrated into classification models,
and how probability scores can be interpreted and used to improve model accuracy.
1. Classification Steps:
Step 1: Calculate distribution parameters (mean, standard deviation) for each
class.
Step 2: Use probability density functions to calculate feature likelihoods for new
data.
Step 3: Apply Bayes' theorem to combine likelihoods with prior probabilities.
Step 4: Classify new data based on the class with the highest probability.
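The four steps above can be sketched for a single Gaussian-distributed feature; the training measurements below are assumed toy values, not from the text:

```python
import math

# Step 1: distribution parameters (mean, standard deviation) per class,
# estimated from toy training data.
train = {"A": [1.0, 1.2, 0.8, 1.1], "B": [3.0, 3.2, 2.9, 3.1]}
params = {}
for label, xs in train.items():
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    params[label] = (mean, math.sqrt(var))

def gaussian_pdf(x, mean, std):
    # Step 2: probability density of feature value x under N(mean, std^2).
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def classify(x, priors):
    # Step 3: combine likelihood with prior (Bayes' theorem, unnormalised);
    # Step 4: pick the class with the highest score.
    scores = {c: gaussian_pdf(x, *params[c]) * priors[c] for c in params}
    return max(scores, key=scores.get)

print(classify(1.05, {"A": 0.5, "B": 0.5}))  # A
```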
Understanding Naive Bayes
• Naive Bayes is a simple supervised machine learning algorithm that applies
Bayes' theorem with a strong ("naïve") independence assumption: each input
feature is assumed to be independent of the others given the class.
Example: predicting whether an email is spam.
Classifying with probability theory - Naïve Bayes:-

Naïve Bayes classification is a popular technique in machine learning,
particularly for text classification and spam filtering. It is based on Bayes'
theorem and is a probabilistic classifier: it makes predictions based on the
probability that an object belongs to each class.

A Naive Bayes classifier calculates the probability in the following steps:

Step 1: Calculate the prior probability for each class label.
Step 2: Find the likelihood of each attribute for each class.
Step 3: Put these values into Bayes' formula and calculate the posterior probability.
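Staying with the spam example, these three steps can be carried out by hand; the training counts below are assumed for illustration:

```python
# Toy corpus (assumed): 10 training emails, 3 spam and 7 ham;
# "free" appears in 2 of the spam emails and 1 of the ham emails.
n_spam, n_ham = 3, 7
count_free = {"spam": 2, "ham": 1}

# Step 1: prior probability of each class label.
prior = {"spam": n_spam / 10, "ham": n_ham / 10}

# Step 2: likelihood of the attribute "contains 'free'" for each class.
likelihood = {"spam": count_free["spam"] / n_spam,
              "ham": count_free["ham"] / n_ham}

# Step 3: Bayes' formula gives the normalised posterior for each class.
evidence = sum(likelihood[c] * prior[c] for c in prior)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
print(max(posterior, key=posterior.get))  # spam
```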
• The above is an overview of the Naïve Bayes classifier and its applications
in machine learning; we will also explore the theory and limitations of this
popular classification algorithm.
Whenever you perform classification, the first step is to understand the
problem and identify potential features and the label; the final step is to
test the classifier's performance. Performance is evaluated using parameters
such as accuracy, error, precision, and recall.
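As a sketch, these evaluation parameters can be computed with scikit-learn; the label vectors below are hypothetical:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical true vs. predicted labels for a binary spam classifier (1 = spam).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)    # fraction of predictions that are correct
prec = precision_score(y_true, y_pred)  # of emails predicted spam, how many were spam
rec = recall_score(y_true, y_pred)      # of actual spam emails, how many were caught
print(acc, prec, rec)  # 0.75 0.75 0.75
```

Error is simply 1 − accuracy (here 0.25).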
Daily life Example:-
• Imagine you're sorting your groceries after a shopping trip. You have a
basket full of items and want to categorize them efficiently. Probability
distributions can help you do this effectively.
Features: Each grocery item has features that determine where it should go.
These features could be temperature requirements (fresh vs. frozen), shelf life
(perishables vs. dry goods), or preparation needs (cooking required vs.
ready-to-eat).
Classes: Your goal is to classify each item into one of three classes: "Fridge"
(cold storage required), "Pantry" (room temperature storage), or "Freezer"
(frozen storage).
Real-World Applications
Naïve Bayes has diverse applications in email filtering, document classification,
sentiment analysis, and medical diagnosis.
Medical Diagnosis: Naïve Bayes can be used in medical diagnosis systems to
classify patients into different disease categories based on their symptoms,
medical history, and test results.

Parsing data from RSS feeds


An RSS (Really Simple Syndication) feed is a type of web feed that allows users
and applications to access updates to websites in a standardized,
computer-readable format. It is commonly used by news websites, blogs, and
other online publishers to distribute frequently updated content. A feed
consists of a channel (with title, link, and description metadata) and a list
of items, each typically carrying a title, link, publication date, and summary.
Code:
import feedparser

url = "[Link]"  # feed URL (elided in the original)
data = feedparser.parse(url)
for entry in data.entries:
    print(f"title: {entry.title}")
    print(f"published: {entry.published}")
    print(f"link: {entry.link}")
    print(f"summary: {entry.summary}")
    print()
Using naïve Bayes to reveal regional attitudes:

1. Collect and Label Data: Gather text data from sources like social media or
surveys, ensuring each piece of data is labeled with a regional identifier.
2. Preprocess Text: Clean the text by removing unwanted characters, tokenize
the text into words, and convert text into numerical features using methods like
Bag of Words (BoW) or TF-IDF.
3. Train Naïve Bayes Model: Split your data into training and test sets, and use
the training set to train a naïve Bayes model, which will learn to associate words
with specific regions.
4. Analyze Results: Evaluate the model's accuracy using the test set, identify key
words that indicate regional attitudes, and use this insight for applications like
policy-making or marketing.

Code:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split as tts
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

data = {
    'text': [
        "I love this place", "This area is not good",
        "Amazing experience in this region", "Worst place ever",
        "Beautiful scenery and friendly people",
        "Terrible service and dirty environment",
        "Enjoyed the cultural festival here", "Had an unpleasant stay",
        "The food was fantastic", "Not a great place to visit"
    ],
    'region': ["North", "South", "East", "West", "North", "South", "East",
               "West", "North", "South"]
}
d = pd.DataFrame(data)
vector = CountVectorizer(stop_words='english')
x = vector.fit_transform(d['text'])
y = d['region']
x_train, x_test, y_train, y_test = tts(x, y, test_size=0.3, random_state=45)
nb = MultinomialNB()
nb.fit(x_train, y_train)
y_pred = nb.predict(x_test)
acc = accuracy_score(y_test, y_pred)
print(acc)
