Unit – 4
(Learning Notes)
SYLLABUS:
Advanced Analytics and Statistical Modelling for Big Data:
Naive Bayesian Classifier,
K-means Clustering,
Linear and Logistic Regression
Naive Bayesian Classifier
Bayes' theorem: P(c|x) = P(x|c) * P(c) / P(x), where:
P(c|x) is the posterior probability of class c (target) given predictor x (attributes).
P(c) is the prior probability of the class.
P(x|c) is the likelihood, i.e. the probability of the predictor given the class.
P(x) is the prior probability of the predictor.
Example: What is the probability of playing golf on a windy, sunny day with
high temperature and high humidity? That is, classify
X = (Outlook = Sunny, Temperature = High, Humidity = High, Wind = True).
Row No  Outlook   Temperature  Humidity  Wind   Play
1       Sunny     85           85        False  No
2       Sunny     80           90        True   No
3       Overcast  83           78        False  Yes
4       Rain      70           96        False  Yes
5       Rain      68           80        False  Yes
6       Rain      65           70        True   No
7       Overcast  64           65        True   Yes
8       Sunny     72           95        False  No
9       Sunny     69           70        False  Yes
10      Rain      75           80        False  Yes
Class priors:
P(Play = Yes) = 6/10 = 0.6
P(Play = No) = 4/10 = 0.4
Outlook (Played or Not Played):
          Yes  No  P(x|Yes)  P(x|No)
Sunny      1    3    1/6       3/4
Overcast   2    0    2/6       0/4
Rain       3    1    3/6       1/4
Temperature (Played or Not Played):
             Yes  No  P(x|Yes)  P(x|No)
Low (<70)     3    1    3/6       1/4
High (>=70)   3    3    3/6       3/4
Humidity (Played or Not Played):
             Yes  No  P(x|Yes)  P(x|No)
Low (<85)     5    1    5/6       1/4
High (>=85)   1    3    1/6       3/4
Wind (Played or Not Played):
        Yes  No  P(x|Yes)  P(x|No)
True     1    2    1/6       2/4
False    5    2    5/6       2/4
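These conditional tables can be cross-checked in R; here is a minimal sketch for the Outlook attribute (the two vectors are typed in from the ten rows above):
#Cross-check the Outlook table in R
play <- c("No","No","Yes","Yes","Yes","No","Yes","No","Yes","Yes")
outlook <- c("Sunny","Sunny","Overcast","Rain","Rain","Rain","Overcast","Sunny","Sunny","Rain")
counts <- table(outlook, play)
counts
#Each column sums to 1, giving P(Outlook | Play)
prop.table(counts, margin = 2)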
P(Wind = True | Yes) = 1/6      P(Wind = True | No) = 2/4
P(Outlook = Sunny | Yes) = 1/6  P(Outlook = Sunny | No) = 3/4
P(Temp = High | Yes) = 3/6      P(Temp = High | No) = 3/4
P(Humidity = High | Yes) = 1/6  P(Humidity = High | No) = 3/4
Given the conditions and Played (Yes):
P(X|Yes) * P(Yes)
= 1/6 * 1/6 * 3/6 * 1/6 * 6/10
= (0.167) * (0.167) * (0.5) * (0.167) * (0.6)
= 0.00139
Given the conditions and Not Played (No):
P(X|No) * P(No)
= 2/4 * 3/4 * 3/4 * 3/4 * 4/10
= (0.5) * (0.75) * (0.75) * (0.75) * (0.4)
= 0.08438
Normalization Factor
P(X) is the sum of the two unnormalized scores, which guarantees the
posteriors sum to 1:
P(X) = P(X|Yes) * P(Yes) + P(X|No) * P(No)
= 0.00139 + 0.08438
= 0.08576
Bayes Probability for Played
P(Yes|X) = P(X|Yes) * P(Yes) / P(X) = 0.00139 / 0.08576 = 0.0162
Bayes Probability for Not Played
P(No|X) = P(X|No) * P(No) / P(X) = 0.08438 / 0.08576 = 0.9838
The classifier therefore predicts No: golf is very unlikely to be played
under these conditions.
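The same arithmetic can be verified in R; this minimal sketch hard-codes the probabilities read from the tables above:
#Verify the hand computation in R
p_x_yes <- (1/6) * (1/6) * (3/6) * (1/6)   #Wind=True, Sunny, Temp High, Humidity High, given Yes
p_x_no  <- (2/4) * (3/4) * (3/4) * (3/4)   #the same attribute values, given No
p_yes <- 6/10
p_no  <- 4/10
evidence <- p_x_yes * p_yes + p_x_no * p_no   #normalization factor P(X)
p_x_yes * p_yes / evidence   #posterior for Yes, about 0.0162
p_x_no  * p_no  / evidence   #posterior for No, about 0.9838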
#Getting started with Naive Bayes
#Install the package
#install.packages("e1071")
#Loading the library
library(e1071)
?naiveBayes
#The documentation also contains an example implementation for the
#Titanic dataset
#Next load the Titanic dataset
data("Titanic")
str(Titanic)
#Save into a data frame and view it
Titanic_df=as.data.frame(Titanic)
head(Titanic_df,10)
#Creating data from table
repeating_sequence=rep(seq_len(nrow(Titanic_df)), Titanic_df$Freq)
#This repeats each combination as many times as its frequency
#Create the dataset using the repetition sequence
Titanic_dataset=Titanic_df[repeating_sequence,]
#We no longer need the frequency, drop the feature
Titanic_dataset$Freq=NULL
#Fitting the Naive Bayes model
Naive_Bayes_Model=naiveBayes(Survived ~., data=Titanic_dataset)
#What does the model say? Print the model summary
Naive_Bayes_Model
#Prediction on the dataset
NB_Predictions=predict(Naive_Bayes_Model,Titanic_dataset)
NB_Predictions
#Confusion matrix to check accuracy
table(NB_Predictions,Titanic_dataset$Survived)
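As a follow-up (not part of the e1071 example itself), the overall accuracy is the fraction of predictions on the diagonal of that confusion matrix:
#Overall accuracy from the confusion matrix
conf_mat <- table(NB_Predictions, Titanic_dataset$Survived)
sum(diag(conf_mat)) / sum(conf_mat)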
K-Means Clustering
Algorithm:
1. Accept the data set.
2. Choose k initial means (for k = 2, two random values or two points from
the data).
3. Assign each point to its nearest mean, recompute each mean from its
cluster, and repeat until the means stop changing, i.e. two successive
iterations produce the same clusters (see the manual sketch after the
kmeans() example below).
Sample – 1:
Dataset: {2, 3, 4, 10, 11, 12, 20, 25, 30}
Initial means: 3, 20
#Getting started with K-Mean
dataset <- c(2, 3, 4, 10, 11, 12, 20, 25, 30)
dataset
cluster <- kmeans(dataset, centers = c(3, 20), iter.max = 10)
cluster$cluster               #cluster assignment for each point
cluster$size                  #number of points in each cluster
dataset[cluster$cluster==1]   #members of cluster 1
dataset[cluster$cluster==2]   #members of cluster 2
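To make step 3 of the algorithm concrete, the loop below re-runs the same clustering by hand. This is a minimal sketch for one-dimensional data and two means; kmeans() itself is more general, so treat this as an illustration rather than its internal implementation:
#Manual k-means iteration for the same dataset and initial means
m <- c(3, 20)
repeat {
  #Step 3a: assign each point to the nearest mean
  assignment <- ifelse(abs(dataset - m[1]) <= abs(dataset - m[2]), 1, 2)
  #Step 3b: recompute each mean from its cluster
  new_m <- c(mean(dataset[assignment == 1]), mean(dataset[assignment == 2]))
  #Stop when two successive iterations give the same means
  if (all(new_m == m)) break
  m <- new_m
}
m            #converged means: 7 and 25
assignment   #clusters {2, 3, 4, 10, 11, 12} and {20, 25, 30}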
#Getting started with Regression Model
# The predictor vector.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
# The response vector.
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
# Apply the lm() function.
relation <- lm(y~x)
# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation,a)
print(result)
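To read off the fitted line itself (y = a + b*x), inspect the coefficients and the model summary:
# Inspect the intercept a and slope b of the fitted line.
print(coef(relation))
# Full summary: coefficients, residuals, and R-squared.
print(summary(relation))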