Data Mining

WEEK 8 DATE : 22ND MARCH


Aim :
Demonstrate the classification process on a given dataset using the Naïve Bayesian Classifier.
Description :
Naïve Bayes algorithm:
There are several types of classification algorithms, and the Naïve Bayes classifier is one of them. The Naive Bayes classifier is a simple probabilistic classifier based on Bayes' theorem with the assumption of independence between features. Despite its simplicity, it is quite effective for many classification tasks, especially in natural language processing (NLP) and document classification.
Here's a breakdown of its key components and workings:
1. Bayes' Theorem: Bayes' theorem is a fundamental theorem in probability theory that describes the probability of an event based on prior knowledge of conditions that might be related to the event. It is formulated as:
P(A|B) = (P(B|A) × P(A)) / P(B)
2. Naive Assumption: The "naive" assumption in the Naive Bayes classifier is that features are independent of each other given the class label. In reality, this assumption may not hold for many datasets, but despite this simplification, Naive Bayes often performs well in practice.
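The independence assumption is what makes the classifier easy to implement: the joint likelihood factorizes into a product of per-feature likelihoods. Below is a minimal sketch in plain Python (not Weka's implementation); the toy weather-style data, the feature names, and the simple Laplace smoothing are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy weather-style dataset (assumed for illustration, not the lab's .arff file):
# features are (outlook, windy), the class label is whether to play
data = [
    (("sunny", "no"), "yes"),
    (("sunny", "yes"), "no"),
    (("rainy", "no"), "yes"),
    (("rainy", "yes"), "no"),
    (("sunny", "no"), "yes"),
]

priors = Counter(label for _, label in data)  # class counts for P(class)
likelihoods = defaultdict(Counter)            # (feature index, class) -> value counts
for features, label in data:
    for i, value in enumerate(features):
        likelihoods[(i, label)][value] += 1

def predict(features):
    """Pick the class maximizing P(class) * prod_i P(feature_i | class)."""
    total = sum(priors.values())
    best_label, best_score = None, -1.0
    for label, count in priors.items():
        score = count / total
        for i, value in enumerate(features):
            # Laplace smoothing: +1 to the count, +2 in the denominator
            # because each toy feature takes two possible values
            score *= (likelihoods[(i, label)][value] + 1) / (count + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict(("sunny", "no")))   # "yes" on this toy data
```

With many features, the product of small probabilities can underflow, which is why real implementations usually sum log-probabilities instead of multiplying.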
Process :
In Weka, we can perform Naïve Bayes classification using the following steps. The analysis is done on an .arff dataset file.
1. First, load your dataset into the Weka interface using the steps discussed in the earlier weeks.
2. Choose the Classify tab, which is present on the ribbon.
3. Under the Classify tab, the classifier is set to ZeroR by default. Click Choose → weka → classifiers → bayes → NaiveBayes.
4. Click the Start button.
5. This performs Naïve Bayes classification on our dataset and displays the results.

Pranay Varanasi 322103383048



WEEK 9 DATE : 22ND MARCH


Aim :
Demonstrate the classification process on a given dataset using a Rule-based Classifier.
Description :
Rule Based Classifier:
Rule-based classification is a technique in machine learning and artificial intelligence where data is classified into predefined categories based on a set of explicitly defined rules.
These rules are typically created manually by domain experts or derived from existing knowledge about the problem domain.
The classification process involves evaluating the input data against these rules to determine the appropriate category.
Rule-based classification is transparent and easy to interpret, since the decision-making process is based on explicit rules.
However, it may struggle with complex or ambiguous data patterns that are not adequately captured by the predefined rules.
A rule-based classifier is a type of classifier in machine learning that uses a set of if-then rules to make predictions or decisions about the class label of input data.
Each rule consists of a condition (if) and an associated class label or action (then).
These rules are typically derived from analyzing the training data or provided by domain experts. The classification process involves sequentially applying the rules to the input data until a matching rule is found, which determines the predicted class label or action.
Rule-based classifiers are often simple, interpretable, and suitable for domains where decision-making can be expressed as logical rules.
However, their performance may be limited when dealing with complex data patterns or when the rule set is not comprehensive enough to cover all possible cases.
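The if-then structure described above can be sketched in a few lines of plain Python. The weather-style rules below are written by hand for illustration; they are assumptions, not rules learned by JRip from the lab's dataset.

```python
# Ordered if-then rules: each pairs a condition (if) with a class label (then).
# Rules are tried in order, the first matching rule decides, and a
# catch-all default rule ends the list.
rules = [
    (lambda r: r["outlook"] == "sunny" and r["humidity"] == "high", "no"),
    (lambda r: r["outlook"] == "rainy" and r["windy"] == "yes", "no"),
    (lambda r: True, "yes"),  # default rule
]

def classify(record):
    for condition, label in rules:
        if condition(record):
            return label

print(classify({"outlook": "sunny", "humidity": "high", "windy": "no"}))    # no
print(classify({"outlook": "overcast", "humidity": "normal", "windy": "no"}))  # yes
```

JRip induces such an ordered rule list automatically from the training data; writing the rules by hand here just makes the first-match evaluation order visible.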


Process :
1. First, load your dataset into the Weka interface using the steps discussed in the earlier weeks.
2. Choose the Classify tab, which is present on the ribbon.
3. Under the Classify tab, the classifier is set to ZeroR by default. Click Choose → weka → classifiers → rules → JRip.
4. Click the Start button.
5. This performs JRip (a Java implementation of the RIPPER rule-induction algorithm) rule-based classification on our dataset and displays the results.

Output :


WEEK 10 DATE : 22ND MARCH


AIM :
Demonstrate the classification process on a given dataset using the Nearest Neighbor Classifier.
Description :
Nearest Neighbor classifier: The nearest neighbor classifier is a simple yet effective algorithm used for classification tasks in machine learning.
It operates on the principle of finding the most similar training instance (nearest neighbor) to a given test instance and assigning it the same label.
This algorithm doesn't require a training phase, as it memorizes the entire training dataset. However, its prediction cost grows linearly with the size of the training set, making it less suitable for large datasets.
One of the key decisions in implementing the nearest neighbor classifier is choosing an appropriate distance metric: commonly the Euclidean distance for numerical features, and other metrics such as the Hamming distance for categorical features.
Additionally, the classifier's performance relies heavily on the quality and representativeness of the training data, as noisy or unbalanced data can lead to misclassifications.
Despite its simplicity, nearest neighbor classifiers can perform remarkably well, especially in low-dimensional feature spaces, but they may struggle with high-dimensional data due to the curse of dimensionality.
Techniques such as dimensionality reduction and distance-metric optimization are often employed to enhance performance in such scenarios.
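The idea generalizes from the single nearest neighbor to the k nearest, which is what Weka's IBk implements. Here is a minimal plain-Python sketch using Euclidean distance and majority voting; the 2-D toy training set and k=3 are assumptions for illustration.

```python
import math
from collections import Counter

# Toy 2-D training set (assumed for illustration): two numeric features, two classes
train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((5.0, 5.0), "b"),
    ((4.8, 5.2), "b"),
]

def knn_predict(x, k=3):
    # Rank training instances by Euclidean distance to x and let the
    # k nearest vote; the majority label wins
    nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((1.1, 0.9)))  # "a": the closest neighbors belong to class a
```

The sort over the whole training set is what makes prediction cost linear in the training-set size; practical implementations replace it with spatial index structures.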
Process :
1. First, load your dataset into the Weka interface using the steps discussed in the earlier weeks.
2. Choose the Classify tab, which is present on the ribbon.
3. Under the Classify tab, the classifier is set to ZeroR by default. Click Choose → weka → classifiers → lazy → IBk.
4. Click the Start button.


5. This performs IBk (instance-based k-nearest-neighbor) classification on our dataset and displays the results.
Output :


WEEK 12 DATE : 12TH APRIL


AIM :
Cluster the given dataset using a hierarchical clustering algorithm.
Description :
The hierarchical clustering algorithm is a method used to group data points into a hierarchy of clusters.
It starts by considering each data point as a separate cluster and then iteratively merges the closest clusters based on a similarity measure until only one cluster remains, forming a hierarchical tree-like structure called a dendrogram. There are two main approaches to hierarchical clustering: agglomerative and divisive.
Agglomerative clustering begins with each data point as a singleton cluster and merges the most similar clusters at each step until a single cluster containing all data points is formed.
Divisive clustering, on the other hand, starts with all data points in a single cluster and recursively splits it into smaller clusters until each cluster contains only a single data point.
Hierarchical clustering is intuitive and does not require the number of clusters to be specified beforehand, making it useful for exploratory data analysis and for visualizing the structure of the data. However, it can be computationally expensive for large datasets.
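The agglomerative procedure described above (start with singletons, repeatedly merge the closest pair) can be sketched in plain Python. Single-linkage distance and the 1-D toy points are assumptions for illustration; Weka's HierarchicalClusterer offers several linkage options.

```python
import math

# Toy 1-D points (assumed for illustration); each starts as its own cluster
points = [(0.0,), (0.3,), (5.0,), (5.4,), (9.0,)]
clusters = [[p] for p in points]

def single_link(a, b):
    # Single linkage: distance between the closest pair across the two clusters
    return min(math.dist(p, q) for p in a for q in b)

merges = []
while len(clusters) > 1:
    # Find the closest pair of clusters and merge them
    i, j = min(((i, j) for i in range(len(clusters))
                for j in range(i + 1, len(clusters))),
               key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
    merges.append((clusters[i], clusters[j]))
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

# Reading the merge list bottom-up gives the dendrogram structure
for a, b in merges:
    print(a, "+", b)
```

Swapping `min` for `max` in `single_link` turns it into complete linkage; the merge order in `merges` is exactly what a dendrogram draws.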

Process :
1. First, load your dataset into the Weka interface using the steps discussed in the earlier weeks.
2. Choose the Cluster tab, which is present on the ribbon.
3. Under the Cluster tab, click Choose → weka → clusterers → HierarchicalClusterer.
4. Click the Start button.
5. This performs hierarchical clustering on our dataset and displays the results.


Output :


WEEK 13 DATE : 12TH APRIL


AIM :
Cluster the given dataset using the DBSCAN algorithm.
Description :
DBSCAN:
The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is a popular density-based clustering algorithm used in machine learning and data mining.
Unlike k-means, which requires the number of clusters to be specified beforehand, DBSCAN automatically identifies the number of clusters based on the density of data points in the feature space. The algorithm defines clusters as dense regions of data points separated by regions of lower density.
It works by categorizing each data point as a core point, border point, or noise point, based on the density of data points around it and a predefined distance threshold.
Core points are those that have a sufficient number of neighboring points within the specified distance, while border points are reachable from core points but do not have enough neighbors to be considered core points themselves.
Noise points are outliers that do not belong to any cluster. DBSCAN is robust to noise and capable of discovering clusters of arbitrary shape, making it suitable for a wide range of applications such as spatial data analysis, anomaly detection, and image segmentation.
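The core/border/noise logic described above can be written out as a short plain-Python sketch. This is a didactic version, not Weka's implementation; the toy points and the eps/min_pts values are assumptions chosen so that two dense groups and one outlier emerge.

```python
import math

def dbscan(points, eps, min_pts):
    """Label every point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisionally noise (may become border later)
            continue
        labels[i] = cluster         # i is a core point: grow a new cluster from it
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors(j)) >= min_pts:
                queue.extend(neighbors(j))  # j is also core: keep expanding
        cluster += 1
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The brute-force neighbor search makes this O(n²); practical implementations use spatial index structures to find eps-neighborhoods efficiently.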
Process :
1. First, load your dataset into the Weka interface using the steps discussed in the earlier weeks.
2. Choose the Cluster tab, which is present on the ribbon.
3. Under the Cluster tab, click Choose → weka → clusterers → MakeDensityBasedClusterer.
4. Click the Start button.


5. This performs density-based clustering on our dataset and displays the results.
Output :
