Bayesian Classifier for Document Classification

This document walks through a Bayesian (naive Bayes) classifier that labels movie reviews as positive (+) or negative (-) from the words they contain, such as "loved", "hated", "great" and "poor". It presents a training set of five labeled reviews, estimates the smoothed probability of each vocabulary word in the positive and negative classes, and then classifies a new review, "I hated the poor acting", as negative because that combination of words is more probable under the negative class.

Uploaded by

Abu Talha

Document Classification using Bayesian Classifier

Document  Text                         Class
1         I loved the movie            +
2         I hated the movie            -
3         A great movie. Good movie    +
4         Poor acting                  -
5         Great acting. A good movie   +

Total 10 unique words (Vocabulary): I, loved, the, movie, hated, a, great, good, poor, acting
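
The vocabulary above can be reproduced with a few lines of Python. This is a minimal sketch, not the author's code: it assumes a tokenization rule the source does not state (lowercase, drop periods, split on whitespace), and the names `docs` and `tokenize` are illustrative.

```python
# Training documents from the table above: (text, class label).
docs = [
    ("I loved the movie", "+"),
    ("I hated the movie", "-"),
    ("A great movie. Good movie", "+"),
    ("Poor acting", "-"),
    ("Great acting. A good movie", "+"),
]

def tokenize(text):
    # Assumed rule: lowercase, drop periods, split on whitespace.
    return text.lower().replace(".", "").split()

# Collect unique words in order of first appearance.
vocabulary = []
for text, _ in docs:
    for word in tokenize(text):
        if word not in vocabulary:
            vocabulary.append(word)

print(len(vocabulary), vocabulary)
```

Under these assumptions the script finds the same 10 words, in the same order as the list above.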

Table 1: Documents converted into a feature set (word counts) with class labels

Doc  I  loved  the  movie  hated  a  great  good  poor  acting  class
1    1  1      1    1      0      0  0      0     0     0       +
2    1  0      1    1      1      0  0      0     0     0       -
3    0  0      0    2      0      1  1      1     0     0       +
4    0  0      0    0      0      0  0      0     1     1       -
5    0  0      0    1      0      1  1      1     0     1       +

p(+) = 3/5 = 0.6
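
The count table and the prior p(+) above can be derived mechanically from the five reviews. The sketch below is illustrative only; the tokenization rule (lowercase, drop periods) and the variable names are assumptions, and the vocabulary is hard-coded in the order the document lists it.

```python
from collections import Counter

docs = [
    ("I loved the movie", "+"),
    ("I hated the movie", "-"),
    ("A great movie. Good movie", "+"),
    ("Poor acting", "-"),
    ("Great acting. A good movie", "+"),
]
vocabulary = ["i", "loved", "the", "movie", "hated",
              "a", "great", "good", "poor", "acting"]

def tokenize(text):
    # Assumed rule: lowercase, drop periods, split on whitespace.
    return text.lower().replace(".", "").split()

# One row of word counts per document, columns in vocabulary order.
rows = []
for text, label in docs:
    counts = Counter(tokenize(text))
    rows.append([counts[w] for w in vocabulary])
    print(rows[-1], label)

# Class prior: fraction of training documents labeled "+".
labels = [label for _, label in docs]
prior_pos = labels.count("+") / len(labels)
print("p(+) =", prior_pos)  # p(+) = 0.6
```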

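Let n be the number of words in the + class: n = 14.

Let n_k be the number of times word k occurs in the + class.

p(w_k|+) = (n_k + 1)/(n + |Vocabulary|), where |Vocabulary| = 10. The +1 is add-one (Laplace) smoothing, which keeps words that never occur in a class from receiving probability zero.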
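
As a quick check of the estimate above, the formula can be written as a one-line function. This is a sketch; the function name `smoothed_likelihood` is hypothetical, and the counts used (4 occurrences of "movie" and 0 of "hated" in the + class) are taken from the document's own tables.

```python
def smoothed_likelihood(n_k, n, vocab_size):
    # (n_k + 1) / (n + |Vocabulary|): add-one smoothed word probability.
    return (n_k + 1) / (n + vocab_size)

# + class: n = 14 words in total, vocabulary of 10 words.
p_movie = smoothed_likelihood(4, 14, 10)  # "movie" occurs 4 times in +
p_hated = smoothed_likelihood(0, 14, 10)  # "hated" never occurs in +

print(round(p_movie, 4), round(p_hated, 4))  # 0.2083 0.0417
```

Without the +1, the unseen word "hated" would get probability 0 and would zero out the whole product for any review that contains it.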

Table 2: Documents converted into a feature set for the positive class

Doc  I  loved  the  movie  hated  a  great  good  poor  acting  class
1    1  1      1    1      0      0  0      0     0     0       +
3    0  0      0    2      0      1  1      1     0     0       +
5    0  0      0    1      0      1  1      1     0     1       +

P(I|+)      = (1+1)/(14 + 10) = 0.0833
P(loved|+)  = (1+1)/(14 + 10) = 0.0833
P(the|+)    = (1+1)/(14 + 10) = 0.0833
P(movie|+)  = (4+1)/(14 + 10) = 0.2083
P(a|+)      = (2+1)/(14 + 10) = 0.125
P(great|+)  = (2+1)/(14 + 10) = 0.125
P(acting|+) = (1+1)/(14 + 10) = 0.0833
P(good|+)   = (2+1)/(14 + 10) = 0.125
P(hated|+)  = (0+1)/(14 + 10) = 0.0417
P(poor|+)   = (0+1)/(14 + 10) = 0.0417

Table 3: Documents converted into a feature set for the negative class

Doc  I  loved  the  movie  hated  a  great  good  poor  acting  class
2    1  0      1    1      1      0  0      0     0     0       -
4    0  0      0    0      0      0  0      0     1     1       -

p(-) = 2/5 = 0.4

Let n be the number of words in the - class: n = 6.

Let n_k be the number of times word k occurs in the - class.

p(w_k|-) = (n_k + 1)/(n + |Vocabulary|), where |Vocabulary| = 10

P(I|-)      = (1+1)/(6 + 10) = 0.125
P(loved|-)  = (0+1)/(6 + 10) = 0.0625
P(the|-)    = (1+1)/(6 + 10) = 0.125
P(movie|-)  = (1+1)/(6 + 10) = 0.125
P(hated|-)  = (1+1)/(6 + 10) = 0.125
P(a|-)      = (0+1)/(6 + 10) = 0.0625
P(great|-)  = (0+1)/(6 + 10) = 0.0625
P(good|-)   = (0+1)/(6 + 10) = 0.0625
P(poor|-)   = (1+1)/(6 + 10) = 0.125
P(acting|-) = (1+1)/(6 + 10) = 0.125
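
The per-class word probabilities listed above can be recomputed for both classes at once. The following is a sketch under the same assumed tokenization (lowercase, drop periods); the helper name `likelihoods` is hypothetical. It also checks a property the hand calculation does not show explicitly: within each class, the smoothed probabilities over the whole vocabulary sum to 1.

```python
from collections import Counter

docs = [
    ("I loved the movie", "+"),
    ("I hated the movie", "-"),
    ("A great movie. Good movie", "+"),
    ("Poor acting", "-"),
    ("Great acting. A good movie", "+"),
]

def tokenize(text):
    # Assumed rule: lowercase, drop periods, split on whitespace.
    return text.lower().replace(".", "").split()

vocabulary = sorted({w for text, _ in docs for w in tokenize(text)})

def likelihoods(label):
    # Word counts over all documents of the given class.
    counts = Counter(w for text, c in docs if c == label
                     for w in tokenize(text))
    n = sum(counts.values())  # 14 for "+", 6 for "-"
    # Add-one smoothed estimate for every vocabulary word.
    return {w: (counts[w] + 1) / (n + len(vocabulary)) for w in vocabulary}

for label in ("+", "-"):
    table = likelihoods(label)
    print(label, {w: round(p, 4) for w, p in table.items()})
    print("sum:", round(sum(table.values()), 10))
```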

Test data: "I hated the poor acting", class = ?

If C_j = +: p(+) × p(I|+) × p(hated|+) × p(the|+) × p(poor|+) × p(acting|+) = 6.03 × 10^-7

If C_j = -: p(-) × p(I|-) × p(hated|-) × p(the|-) × p(poor|-) × p(acting|-) = 1.22 × 10^-5

Since 1.22 × 10^-5 > 6.03 × 10^-7, the class label of the test data "I hated the poor acting" is -.
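
The whole classification step can be sketched end to end as follows. This is an illustrative implementation, not the author's: it assumes the same tokenization as before (lowercase, drop periods), skips words that never appear in training (the source does not say how to handle them), and sums logarithms instead of multiplying probabilities, a common way to avoid numerical underflow on longer documents. The function name `log_score` is hypothetical.

```python
import math
from collections import Counter

docs = [
    ("I loved the movie", "+"),
    ("I hated the movie", "-"),
    ("A great movie. Good movie", "+"),
    ("Poor acting", "-"),
    ("Great acting. A good movie", "+"),
]

def tokenize(text):
    # Assumed rule: lowercase, drop periods, split on whitespace.
    return text.lower().replace(".", "").split()

vocabulary = {w for text, _ in docs for w in tokenize(text)}

def log_score(text, label):
    # log p(class) + sum of log p(word | class), with add-one smoothing.
    class_words = [w for t, c in docs if c == label for w in tokenize(t)]
    counts = Counter(class_words)
    prior = sum(1 for _, c in docs if c == label) / len(docs)
    total = math.log(prior)
    for w in tokenize(text):
        if w in vocabulary:  # assumption: ignore words unseen in training
            total += math.log((counts[w] + 1)
                              / (len(class_words) + len(vocabulary)))
    return total

review = "I hated the poor acting"
scores = {label: log_score(review, label) for label in ("+", "-")}
predicted = max(scores, key=scores.get)

for label, s in scores.items():
    print(label, f"{math.exp(s):.2e}")  # + 6.03e-07, then - 1.22e-05
print("predicted:", predicted)          # predicted: -
```

Exponentiating the log scores recovers the two products computed by hand, and the larger one selects the negative class.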
