Exp 8

Lab Experiment of data warehouse

Uploaded by

Kratos grime


LAB Manual

PART A
(PART A: TO BE REFERRED BY STUDENTS)

Experiment No.08

A.1 Aim:

Implementation of agglomerative hierarchical clustering in any programming language such as Java, C++ or Python, or in the WEKA tool.

A.2 Prerequisite:
Familiarity with the WEKA tool and programming languages.

A.3 Outcome:
After successful completion of this experiment students will be able to
➢ Use classification and clustering algorithms of data mining.

A.4 Theory:
Hierarchical Clustering:-
Build a tree-based hierarchical taxonomy (dendrogram) from a set of documents.
One approach: recursive application of a partitional clustering algorithm.
Dendrogram: Hierarchical Clustering
• A clustering is obtained by cutting the dendrogram at a desired level: each connected
component forms a cluster.
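
Cutting the dendrogram can be sketched in a few lines of Python. This is a minimal illustration, not part of the lab procedure: the function name `cut_dendrogram`, the `(item_a, item_b, height)` merge format and the sample heights are all assumptions made for the example. It also assumes merge heights never decrease toward the root, which holds for single-, complete- and average-link but not always for centroid linkage.

```python
def cut_dendrogram(n_items, merges, level):
    """Return the clusters left after cutting the dendrogram at `level`.

    `merges` is a list of (item_a, item_b, height) tuples: the clusters
    containing item_a and item_b were joined at that height.
    """
    parent = list(range(n_items))          # union-find forest, one tree per cluster

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b, height in merges:
        if height <= level:                # merges above the cut are ignored
            parent[find(a)] = find(b)

    groups = {}
    for item in range(n_items):
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


# Hypothetical dendrogram over five items (heights are made up for illustration).
merges = [(0, 1, 0.5), (2, 3, 0.6), (0, 2, 5.3), (0, 4, 5.5)]

low_cut = cut_dendrogram(5, merges, level=1.0)    # [[0, 1], [2, 3], [4]]
high_cut = cut_dendrogram(5, merges, level=5.4)   # [[0, 1, 2, 3], [4]]
```

A lower cut keeps more, smaller clusters; a higher cut keeps fewer, larger ones. Each surviving connected component is one cluster.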

Hierarchical Clustering algorithms:-

Agglomerative (bottom-up):

1. Start with each document being a single cluster.

2. Repeatedly merge the closest pair of clusters.

3. Eventually all documents belong to the same cluster.

Divisive (top-down):
1. Start with all documents belonging to the same cluster.
2. Repeatedly split a cluster into smaller clusters.
3. Eventually each node forms a cluster on its own.

Both approaches:
1. Do not require the number of clusters k in advance.
2. Need a termination/readout condition.
3. The final state in both agglomerative and divisive clustering (one all-inclusive cluster, or one cluster per document) is of no use.
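
The bottom-up procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the WEKA implementation: it uses single-link Euclidean distance on made-up 2-D points, and it takes "stop when k clusters remain" as the termination condition. The names `agglomerative` and `single_link_distance` are hypothetical.

```python
import math
from itertools import combinations


def single_link_distance(a, b):
    """Smallest Euclidean distance between any point of a and any point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def agglomerative(points, k):
    """Merge the closest pair of clusters until only k clusters remain."""
    clusters = [[p] for p in points]              # every point starts as its own cluster
    while len(clusters) > k:                      # termination condition
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into cluster i
        del clusters[j]                           # j > i, so index i stays valid
    return clusters


# Made-up sample points: two tight pairs and one isolated point.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.5, 5.2), (9.0, 1.0)]
result = agglomerative(points, k=3)
```

With `k=3` the two tight pairs are merged and the isolated point stays alone. With `k=1` the loop runs to the uninformative final state in which every point is in one cluster.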

There are many ways to define the closest pair of clusters:-

Single-link: Similarity of the two most cosine-similar members, one from each cluster

Complete-link: Similarity of the “furthest” points, i.e. the least cosine-similar pair

Centroid: Clusters whose centroids (centers of gravity) are the most cosine-similar

Average-link: Average cosine similarity between pairs of elements
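
The four criteria can be written as small Python functions. This is a sketch with hypothetical function names and made-up vectors. Here average-link averages over cross-cluster pairs only; some definitions (group-average) also include pairs inside the merged cluster.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def single_link(A, B):
    """Similarity of the most similar cross-cluster pair."""
    return max(cosine(a, b) for a in A for b in B)


def complete_link(A, B):
    """Similarity of the least similar cross-cluster pair."""
    return min(cosine(a, b) for a in A for b in B)


def average_link(A, B):
    """Mean similarity over all cross-cluster pairs."""
    return sum(cosine(a, b) for a in A for b in B) / (len(A) * len(B))


def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    return [sum(column) / len(cluster) for column in zip(*cluster)]


def centroid_link(A, B):
    """Similarity of the two cluster centroids."""
    return cosine(centroid(A), centroid(B))


# Made-up clusters of 2-D vectors.
A = [(1.0, 0.0), (1.0, 1.0)]
B = [(0.0, 1.0), (1.0, 2.0)]

scores = {
    "single": single_link(A, B),
    "complete": complete_link(A, B),
    "average": average_link(A, B),
    "centroid": centroid_link(A, B),
}
```

For any pair of clusters the complete-link score is at most the average-link score, which is at most the single-link score, so the choice of criterion changes which pair looks "closest" at each merge.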

PART B
(PART B: TO BE COMPLETED BY STUDENTS)

(Students must submit the soft copy as per the following segments within two hours of the
practical. The soft copy must be uploaded on Blackboard, or emailed to the concerned
lab in-charge faculty at the end of the practical if Blackboard access is not
available.)

Roll. No. Name:

Class: Batch:
Date of Experiment: Date of Submission:
Grade:

B.1 Software Code written by student:

@relation bank_customers

@attribute age numeric

@attribute job {manager, developer, technician, analyst, retired, accountant}
@attribute qualification {tertiary, primary, secondary}
@attribute communication_type {cellular, telephonic}
@attribute acc_balance numeric
@attribute marital_status {married, unmarried}

@data
32,manager,tertiary,cellular,30000,married
30,developer,secondary,telephonic,28000,married
40,manager,tertiary,cellular,40000,unmarried
70,retired,secondary,telephonic,52000,married
54,analyst,primary,cellular,32000,married
58,manager,tertiary,telephonic,60000,married
44,technician,secondary,cellular,40000,unmarried
35,manager,tertiary,telephonic,55000,married
42,technician,primary,telephonic,35000,married
28,accountant,secondary,cellular,32000,married
50,manager,tertiary,telephonic,70000,married
51,retired,primary,cellular,44000,unmarried
64,retired,secondary,cellular,30000,married
90,retired,tertiary,telephonic,85000,married
76,analyst,secondary,cellular,80000,unmarried
79,accountant,secondary,telephonic,72500,married
30,developer,primary,telephonic,50000,married
42,developer,secondary,telephonic,55000,married
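
For readers working in Python rather than WEKA, the sketch below shows one common way to turn these mixed numeric and nominal rows into the pairwise distance matrix that an agglomerative clusterer starts from. It is an illustration under assumptions, not the code used in this experiment: numeric attributes are min-max scaled, nominal attributes count as 0 (match) or 1 (mismatch), and the names `parse`, `distance` and `matrix` are hypothetical.

```python
import math

# Rows copied from the @data section above. Columns:
# age, job, qualification, communication_type, acc_balance, marital_status
ROWS = """\
32,manager,tertiary,cellular,30000,married
30,developer,secondary,telephonic,28000,married
40,manager,tertiary,cellular,40000,unmarried
70,retired,secondary,telephonic,52000,married
54,analyst,primary,cellular,32000,married
58,manager,tertiary,telephonic,60000,married
44,technician,secondary,cellular,40000,unmarried
35,manager,tertiary,telephonic,55000,married
42,technician,primary,telephonic,35000,married
28,accountant,secondary,cellular,32000,married
50,manager,tertiary,telephonic,70000,married
51,retired,primary,cellular,44000,unmarried
64,retired,secondary,cellular,30000,married
90,retired,tertiary,telephonic,85000,married
76,analyst,secondary,cellular,80000,unmarried
79,accountant,secondary,telephonic,72500,married
30,developer,primary,telephonic,50000,married
42,developer,secondary,telephonic,55000,married
""".strip().splitlines()

NUMERIC = {0, 4}  # column indexes of age and acc_balance


def parse(line):
    """Split one data row, converting the numeric columns to float."""
    fields = line.split(",")
    return [float(v) if i in NUMERIC else v for i, v in enumerate(fields)]


data = [parse(line) for line in ROWS]

# (min, max) of each numeric column, used for min-max scaling.
ranges = {i: (min(r[i] for r in data), max(r[i] for r in data)) for i in NUMERIC}


def distance(a, b):
    """Euclidean-style distance over mixed numeric and nominal attributes."""
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in NUMERIC:
            lo, hi = ranges[i]
            diff = (x - y) / (hi - lo)      # scaled difference in [-1, 1]
        else:
            diff = 0.0 if x == y else 1.0   # nominal: match or mismatch
        total += diff * diff
    return math.sqrt(total)


# Pairwise distance matrix over all 18 customers.
matrix = [[distance(a, b) for b in data] for a in data]
```

Scaling matters here because acc_balance is in the tens of thousands while age is below 100; without it the balance would dominate every distance.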
B.2 Input and Output:
B.3 Observations and learning:
We observed how the data is preprocessed and how clustering is performed.

B.4 Conclusion:
We successfully implemented agglomerative hierarchical clustering in the WEKA tool.

B.5 Question of Curiosity

(To be answered by student based on the practical performed and learning/observations)

Q1: Explain the advantages and disadvantages of agglomerative hierarchical clustering.
Ans:
Advantages
1) No a priori information about the number of clusters is required.
2) Easy to implement, and gives the best results in some cases.
Disadvantages
1) The algorithm can never undo a merge that was made previously.
2) Time complexity is high: O(n³) for the naive algorithm and O(n² log n) with a priority queue, where ‘n’ is the number of data points.
3) Depending on the distance measure chosen for merging, the algorithm can suffer
from one or more of the following:
i) Sensitivity to noise and outliers
ii) Breaking large clusters
iii) Difficulty handling different sized clusters and convex shapes
4) No objective function is directly minimized.
5) Sometimes it is difficult to identify the correct number of clusters from the dendrogram.

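Q2: What is the relationship between top-down, bottom-up and division/agglomeration?
Ans:
The top-down approach corresponds to divisive clustering: it starts with all data points in one
cluster and recursively splits it into smaller clusters.
The bottom-up approach corresponds to agglomerative clustering: it starts with each data point
as its own cluster and repeatedly merges the closest pair of clusters.
Both build the same kind of hierarchy (a dendrogram), working in opposite directions.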
