Data Visualization 13

Uploaded by

Keerthana Venkadapathi

13. Visualization Techniques for Classification & Clustering

Prof. Pattabiraman.V
SCSE, VIT, Chennai
KDD Process

• Selection
  • Obtain data from all sources
• Preprocessing
  • After selecting the data, clean it to make sure it is consistent
• Transformation
  • After preprocessing the data, analyze the format/amount of data
• Data Mining
  • Once the data is in a usable format, apply various algorithms based on the results to be achieved
• Interpretation/Evaluation
  • Finally, present the results of the data mining step to the user, so that the results can be used to solve the business need at hand
Importance of Data Visualization

• The final step in the KDD process (Interpretation/Evaluation):
  • Is highly dependent on the data visualization technique
  • A bad or inappropriate technique may result in misunderstanding
  • Misunderstanding may cause an incorrect (or no) decision

It is important to consider that the KDD process is useless if the results are not understandable.
Suggested Direction

• Need to determine techniques that balance simplicity with completeness
• If this can be done for non-expert users:
  • Simplicity & Completeness → Understanding
  • Understanding → Trust
  • Trust → more use of KDD/DM
• Result will be:
  • Better business value
  • Higher ROI
Common Visualization Techniques

• Visualization techniques depend on:
  • The type of data mining technique chosen
  • The underlying structure and attributes of the data

Classification                   Clustering
- Decision Trees                 - Scatter Plots
- Scatter Plots                  - Dendrograms
- Axis-Parallel Decision Trees   - Smoothed Data Histograms
- Circle Segments                - Self-Organizing Maps
- Decision Tables                - Proximity Matrices
Classification

Decision Tree

• Information limited to:
  • Attributes
  • Splitting values
  • Terminal node class assignments
Decision Tree with Histograms

• Data mining rarely classifies 100% of the data correctly:
  • To show how well the data was classified, a histogram is added to each terminal node
  • The histogram shows the percentage of data that was classified correctly/incorrectly
  • Assists users in determining if the classification is "good enough"
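
The per-node histogram described above can be sketched as a small computation. This is a minimal illustration, not the tool used in the slides: the record layout `(leaf_id, predicted_class, actual_class)`, the function names, and the sample data are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def leaf_histograms(records):
    """Map each terminal node to (percent correct, percent incorrect).

    records: iterable of (leaf_id, predicted_class, actual_class) tuples
    (a hypothetical layout chosen for this sketch).
    """
    counts = defaultdict(Counter)
    for leaf_id, predicted, actual in records:
        counts[leaf_id][predicted == actual] += 1
    result = {}
    for leaf_id, c in counts.items():
        total = c[True] + c[False]
        result[leaf_id] = (100.0 * c[True] / total, 100.0 * c[False] / total)
    return result

def text_bar(pct, width=20):
    """Render a percentage as a fixed-width text bar."""
    filled = round(pct / 100 * width)
    return "#" * filled + "." * (width - filled)

# Hypothetical data: (terminal node, predicted class, actual class)
records = [
    ("leaf_A", "yes", "yes"), ("leaf_A", "yes", "yes"),
    ("leaf_A", "yes", "no"), ("leaf_A", "yes", "yes"),
    ("leaf_B", "no", "no"), ("leaf_B", "no", "yes"),
]
hist = leaf_histograms(records)
for leaf_id, (ok, bad) in sorted(hist.items()):
    print(f"{leaf_id}: {text_bar(ok)} {ok:.0f}% correct, {bad:.0f}% incorrect")
```

A user can then judge each terminal node separately: a node with a short bar may need further splitting even if the tree as a whole looks accurate.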

Decision Tree: Different Format

• Vertical representation allows for easy user interaction
• Combines the split points and classification accuracy compactly
• Key difference: colors are matched with a specific classification
Scatter Plot with Regression Line

• Excellent way to view 2-dimensional data
• Familiar to anyone who has taken high-school algebra
• Regression lines provide descriptive techniques for classification
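
As a rough sketch of how such a line could be obtained, the code below fits an ordinary least-squares line to 2-dimensional points and reports which side of the line a point falls on. The slides do not say how the line is computed or used, so the least-squares fit, the use of the line as a boundary, the function names, and the sample points are all assumptions.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def side_of_line(point, slope, intercept):
    """Return 'above', 'below', or 'on' relative to the fitted line."""
    x, y = point
    predicted = slope * x + intercept
    if y > predicted:
        return "above"
    if y < predicted:
        return "below"
    return "on"

# Hypothetical points that lie exactly on y = 2x + 1
points = [(1, 3), (2, 5), (3, 7), (4, 9)]
slope, intercept = fit_line(points)
print(slope, intercept)
print(side_of_line((2, 10), slope, intercept))
```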

Axis-Parallel Decision Tree

• Combination of a Scatter Plot and a Decision Tree
• The plot area is divided into regions by splits parallel to the axes
• Well suited for classification problems with two attribute values
• High visibility into the impact of outliers
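
The idea can be sketched as follows: each split of a two-attribute decision tree is a line parallel to one axis, so the tree partitions the plot into rectangles. The split values (5.0 and 3.0), the class labels, and the function names are hypothetical and chosen only for illustration.

```python
def classify(point, x_split=5.0, y_split=3.0):
    """A two-level tree: split on x first, then on y for the right-hand side."""
    x, y = point
    if x < x_split:
        return "A"
    return "B" if y < y_split else "C"

def regions(x_range, y_range, x_split=5.0, y_split=3.0):
    """Rectangles (x_min, x_max, y_min, y_max) matching the leaves of classify().

    These could be drawn as shaded areas behind the scatter plot.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    return {
        "A": (x_min, x_split, y_min, y_max),
        "B": (x_split, x_max, y_min, y_split),
        "C": (x_split, x_max, y_split, y_max),
    }

rects = regions((0.0, 10.0), (0.0, 6.0))
for label, rect in rects.items():
    print(label, rect)
print(classify((1.0, 5.5)), classify((6.0, 1.0)), classify((6.0, 3.0)))
```

An outlier shows up as a point whose own class differs from the class of the rectangle it falls in.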

Circle Segments

• Multi-dimensional data
• Maps a dataset with n dimensions onto a circle divided into n segments
• Each segment is a different attribute
• Each pixel inside a segment is a single value of the attribute
• Values of each attribute are then sorted (independently), and each is assigned a color based upon its class
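
A minimal sketch of the mapping described above, under simplifying assumptions: each attribute gets an equal angular slice of the circle, and the class labels stand in for colors. The actual pixel layout inside a segment is not described in the slides and is not modeled here; the function name and data are hypothetical.

```python
import math

def circle_segments(rows, classes):
    """Build one segment per attribute.

    rows: list of equal-length attribute tuples; classes: class label per row.
    Returns a list of (start_angle, end_angle, labels), where labels are the
    class labels ordered by that attribute's sorted values.
    """
    n = len(rows[0])
    width = 2 * math.pi / n
    segments = []
    for i in range(n):
        # Sort row indices by attribute i only (independently of other attributes)
        order = sorted(range(len(rows)), key=lambda r: rows[r][i])
        segments.append((i * width, (i + 1) * width, [classes[r] for r in order]))
    return segments

# Hypothetical two-attribute dataset with three classes
rows = [(5.1, 0.2), (7.0, 1.4), (6.3, 2.5)]
classes = ["a", "b", "c"]
segments = circle_segments(rows, classes)
for start, end, labels in segments:
    print(f"{start:.2f}..{end:.2f} rad: {labels}")
```

If an attribute separates the classes well, its segment shows long runs of the same label (color); a mixed sequence suggests the attribute is less useful for classification.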

Decision Table

• Interactive technique
• Maps attribute data to a 2D hierarchical matrix
• Levels can be drilled down to reveal another set of attributes
• Height of a cell conveys the number of data entities
• Cells are color coded:
  • Neutral color: no data at that intersection point
  • Otherwise color coded by class (percentage)
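
The contents of one level of such a table can be sketched as a count per cell plus a class breakdown. This is only an illustration of the data behind the display: the record layout `(attr1, attr2, class)`, the function name, and the sample values are assumptions, and drill-down and rendering are left out.

```python
from collections import Counter, defaultdict

def decision_table(records):
    """Return {(attr1, attr2): (count, {class: percent})} for non-empty cells.

    A cell that is missing from the result has no data and would be drawn
    in the neutral color.
    """
    cells = defaultdict(Counter)
    for a1, a2, label in records:
        cells[(a1, a2)][label] += 1
    table = {}
    for key, counter in cells.items():
        total = sum(counter.values())
        table[key] = (total, {lab: 100.0 * k / total for lab, k in counter.items()})
    return table

# Hypothetical records: (outlook, humidity, class)
records = [
    ("sunny", "high", "no"), ("sunny", "high", "no"),
    ("sunny", "high", "no"), ("sunny", "high", "yes"),
    ("rainy", "low", "yes"),
]
table = decision_table(records)
for cell, (count, shares) in sorted(table.items()):
    print(cell, count, shares)
```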

Clustering

Scatter Plot

• Extensions include displaying points using:
  • Various sizes and colors to indicate additional attributes
  • Shading of points to introduce a third dimension
  • Different brightness levels of the same color to represent continuous values for the same attribute
  • Various point markers or classification identifiers (e.g., numbers, symbols)
  • Various glyphs to display additional attributes

Scatter Plot

• Map decision trees on top of scatter plots to describe clusters

Scatter Plot with Regression Lines

Scatter Plot with Minimum Spanning Tree
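
One way to obtain the tree drawn over the points is Prim's algorithm on the complete graph of Euclidean distances. The slide does not name an algorithm, so this choice, the function name, and the sample points are assumptions made for the sketch.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of edges (i, j, distance), where i and j index into points.
    """
    n = len(points)
    in_tree = {0}  # start from the first point
    edges = []
    while len(in_tree) < n:
        # Cheapest edge that connects the tree to a point outside it
        d, i, j = min(
            (math.dist(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append((i, j, d))
        in_tree.add(j)
    return edges

# Hypothetical 2-D points
points = [(0, 0), (1, 0), (0, 2), (5, 5)]
edges = minimum_spanning_tree(points)
for i, j, d in edges:
    print(f"{points[i]} - {points[j]}: {d:.2f}")
```

Unusually long edges in the tree are natural places to cut it into clusters; in this data the edge to (5, 5) is clearly longer than the others.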

Dendrogram

• Intuitive representation: hierarchical decomposition of data into sets of nested clusters
• From an agglomerative perspective:
  • Each leaf: a single data entity
  • Each internal node: the union of all data entities in its sub-tree
  • The root: the entire dataset
  • The height of any internal node: the similarity between its "children"
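
The agglomerative view can be sketched with a small single-linkage clustering of 1-dimensional values. Two assumptions are made here that the slides do not state: single linkage is used as the merge rule, and the height of a node is taken to be the distance at which its two children were merged (so a lower node means more similar children). Names and data are hypothetical.

```python
def single_linkage(values):
    """Agglomerative clustering of 1-D values.

    Returns the merges in order as (left_members, right_members, height),
    where members are tuples of indices into values and height is the
    smallest distance between the two merged clusters.
    """
    clusters = [(i,) for i in range(len(values))]  # each leaf is one entity
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(values[i] - values[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]  # internal node: union of its sub-tree
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

values = [1.0, 2.0, 6.0, 7.0, 15.0]  # hypothetical data
merges = single_linkage(values)
for left, right, height in merges:
    print(left, "+", right, "at height", height)
```

The last merge is the root and covers every index, matching the statement that the root is the entire dataset.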

Dendrogram with Exemplars

• Exemplars: the "most typical member of each cluster" [Wishart99]
• Shown as underlined labels of the leaves
• Done in combination with shading to identify the clustering level

Smoothed Data Histogram

• Represents data on a "display map"
• Similar data items are located close to each other
• The more defined the clusters, the lighter the colors

Self-Organizing Map "Grid"

• Source of the Smoothed Data Histogram
• Numbers indicate the most "common" cluster

[Figure: self-organizing map grid; each cell is labeled with the number of its most common cluster]

Proximity Matrix

• Graphically displays the relationship between data elements
• Usually symmetric, but can be sorted by the strength of relationships
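
A minimal sketch of the matrix behind such a display, assuming Euclidean distance as the proximity measure (the slides do not specify one). The function name and the points are hypothetical.

```python
import math

def proximity_matrix(points):
    """Pairwise Euclidean distances; entry [i][j] relates point i to point j."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Hypothetical 2-D data elements
points = [(0, 0), (3, 4), (6, 8)]
matrix = proximity_matrix(points)
for row in matrix:
    print("  ".join(f"{d:5.1f}" for d in row))
```

Because distance does not depend on direction, `matrix[i][j]` equals `matrix[j][i]` and the diagonal is zero. A graphical version would map each entry to a color or shade.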

Proximity Matrix and Dendrogram

Summary

• Data visualization techniques are extremely important for understanding the KDD process
• A balance of simplicity and completeness is important
• The techniques discussed allow average users to understand the results of the KDD process
• Understanding → KDD results can be interpreted/trusted by non-expert users → extended business value
• If data visualization techniques do not establish a high level of trust in the KDD process, the process will fail

Thank You
