ISOM3360 Data Mining for Business Analytics, Session 4
Decision Trees (I)
Instructor: Rong Zheng
Department of ISOM
Fall 2020
Recap: Data Understanding
Preliminary investigation of the data to better understand
its specific characteristics
Helps in selecting appropriate data mining algorithms
Things to look at
Class imbalance
Dispersion of data attribute values
Skewness, outliers, missing values
Correlation analysis
Visualization tools are important
Histograms, box plots
Scatter plot
Recap: Data Preprocessing
Data transformation might be needed
Handling missing values
Handling categorical variables
Feature transformation
e.g., log transformation
Normalization (back to this when clustering is discussed)
Feature discretization
Question
Which of the following normalization method(s)
may transform the original variable into a negative
value: min-max, z-score, both, or neither?
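As a quick illustration, here is a minimal Python sketch (not from the slides) applying both methods to a hypothetical toy vector: min-max rescales values into [0, 1], while z-score subtracts the mean, so values below the mean become negative.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0, 100.0])   # hypothetical toy data

# Min-max normalization: rescales into [0, 1], so results are never negative.
min_max = (values - values.min()) / (values.max() - values.min())

# Z-score normalization: subtracts the mean, so values below the mean become negative.
z_score = (values - values.mean()) / values.std()

print(min_max)   # all entries in [0, 1]
print(z_score)   # entries below the mean are negative
```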
What Induction Algorithm Shall We Use?
Commonly Used Induction Algorithms
Post-mortem analysis of a popular data mining competition
Why Decision Trees?
Decision trees are one of the most popular data mining
tools.
Classification trees (used when the target variable is categorical)
Regression trees (used when the target variable is numeric)
They are easy to understand, implement, and use, and computationally cheap.
Model comprehensibility is important for communication to non-DM-savvy stakeholders.
Classification Tree
Employed Balance Age Default
Yes 123,000 50 No
No 51,100 40 Yes
No 68,000 55 No
Yes 34,000 46 Yes
Yes 50,000 44 No
No 100,000 50 Yes
Objective: predicting borrowers who will default on loan payments.
Classification Tree: Upside-Down
[Tree diagram, read top-down:]
Employed?  (root node)
  Yes -> Class = Not Default  (leaf)
  No  -> Balance?  (node)
    >= 50K -> Class = Not Default
    < 50K  -> Age?
      >= 45 -> Class = Not Default
      < 45  -> Class = Default  (leaf)
Classification Tree: Divide and Conquer
“Recursive Partitioning”

Nodes
  Each node represents a test on one attribute.
  Tests on a nominal attribute: the number of splits (branches) is the number of possible values, or 2 (one value vs. the rest).
  Continuous attributes are discretized.

Leaves
  A class assignment (e.g., default / not default).
  Leaves also provide a distribution over all possible classes (e.g., default with probability 0.25, not default with prob. 0.75).

[Tree diagram: the same Employed / Balance / Age tree shown on the previous slide.]
How a Tree is Used for Classification?
To determine the class of a new example, e.g., Mark, age 40, retired, balance 38K:
  The example is routed down the tree according to the values of its attributes.
  At each node, a test is applied to one attribute.
  When a leaf is reached, the example is assigned to a class, or alternatively to a distribution over the possible classes (e.g., default with probability 0.25, not default with prob. 0.75).

[Tree diagram: the same Employed / Balance / Age tree.]
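As a concrete illustration, here is a minimal Python sketch (not the instructor's code) of the tree shown above, routing the example of Mark down to a leaf; treating "retired" as Employed = "No" is an assumption.

```python
def classify(employed, balance, age):
    """Route an example down the Employed -> Balance -> Age tree from the slides
    and return the class assigned at the leaf that is reached."""
    if employed == "Yes":
        return "Not Default"
    if balance >= 50_000:      # not employed: test the balance next
        return "Not Default"
    if age >= 45:              # balance below 50K: test the age
        return "Not Default"
    return "Default"

# Mark, age 40, balance 38K; "retired" is treated here as Employed = "No" (an assumption).
print(classify(employed="No", balance=38_000, age=40))  # -> "Default"
```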
Assigning Probability Estimates
[Figure: the entire population partitioned into regions by Balance (>= 50k vs. < 50k) and Age (>= 45 vs. < 45).]

Q: How would you assign estimates of class probability (e.g.,
probability of default/not default) based on such trees?
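One standard way to do this (a common approach, not spelled out on the slide): use the class frequencies of the training examples that fall into each leaf or region. For example, if a hypothetical leaf contains 3 defaulters among its 12 training examples, the tree would estimate P(default) = 3/12 = 0.25 and P(not default) = 9/12 = 0.75, which is the kind of 0.25 / 0.75 distribution mentioned on the earlier slides.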
Exercise: Assume this is the classification tree you learned from the
past defaulting data. A new guy, who is 45 and has a 20K balance, is
applying for a credit card issued by your company. Can you predict if
he is going to default? How confident are you about your prediction?
Another girl is also applying for the same credit card, but the only
information we have about her is that she has a 70K balance. Can you
predict if she will default? How sure are you about that?
Classification Tree Learning
Objective: based on customer attributes, partition the customers into subgroups that are less impure with respect to the class (i.e., such that in each group most instances belong to the same class).

[Figure: a mixed group of "Yes" and "No" customers before any partitioning.]
Classification Tree Learning
Partitioning into “purer” groups
[Figure: the population split by body color into an orange-body group (mostly "Yes") and a purple-body group (mostly "No").]
Classification Tree Learning
Partitioning into “purer” groups recursively
[Figure: the purple-body group on its own, still a mix of "Yes" and "No" instances.]
Classification Tree Learning
[Figure: the purple-body group split further by head color (red head vs. green head) into purer subgroups.]
Classification Tree Learning
Partitioning into “purer” groups
[Figure: as before, the population split by body color into the orange-body and purple-body groups.]
Classification Tree Learning
Partitioning into “purer” groups recursively
[Figure: the orange-body group on its own, still containing a mix of "Yes" and "No" instances.]
Classification Tree Learning
[Figure: the orange-body group split further by limb color (blue limbs vs. yellow limbs) into purer subgroups.]
Classification Tree Learning
[Figure: the full tree. The population is split on body color at the root; the purple-body branch is then split on head color, and the orange-body branch on limb color.]
Summary: Classification Tree Learning
A tree is constructed by recursively partitioning the examples.
With each partition, the examples are split into subgroups that are "increasingly pure".
Let’s play a game. I have someone in mind, and
your job is to guess this person. You can only ask
yes/no questions.
This person is an entrepreneur.
Go!
Next…
Some important questions have not been answered yet:
How do we automatically choose which attribute to split the data on?
When do we stop splitting?
How to Choose Which Attribute to Split Over?
Objectives
  For each splitting node, choose the attribute that best partitions the population into less impure groups.
  All else being equal, fewer nodes is better (more comprehensible, easier to use, and less prone to overfitting).
Impurity measures: many are available, but the most common one (from information theory) is entropy.
Entropy
Entropy $= -\sum_i p_i \log_2 p_i$, where $p_i$ is the proportion of class $i$ in the data.
For example: our initial population is composed of 16 cases
of class “Default” and 14 cases of class “Not default”.
Entropy (entire population of examples) $= -\frac{16}{30}\log_2\frac{16}{30} - \frac{14}{30}\log_2\frac{14}{30} \approx 0.997$
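A minimal Python sketch (not from the slides) of this two-class entropy computation; the counts 16 and 14 are the example above.

```python
import math

def entropy(class_counts):
    """Entropy = -sum(p_i * log2(p_i)), using the convention 0 * log2(0) = 0."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

print(entropy([16, 14]))  # entire population of 30 examples: ~0.997
```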
Entropy Exercise
A dataset is composed of 10 cases of class “Positive” and 10 cases of class “Negative”. Entropy = ?

A dataset is composed of 0 cases of class “Positive” and 20 cases of class “Negative”. Entropy = ?

[Figure: the two-class entropy function, plotting entropy (0 to 1) against the proportion of one class (0 to 1).]

Tip: $0 \log_2 0 = 0$
Information Gain (based on Entropy)
The information gain is based on the reduction in entropy after a dataset is split on an attribute.
Information Gain = entropy (parent) – [weighted average] entropy (children),
where the weight of each child is given by the proportion of the examples in that child.

[Figure: a parent node split into two children, Balance < 50k and Balance >= 50k.]
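A small Python sketch (not from the slides) of this formula, reusing the `entropy` helper from the earlier sketch; `child_counts` is a hypothetical list of per-child class-count lists.

```python
def information_gain(parent_counts, child_counts):
    """Information gain = entropy(parent) - weighted average entropy(children),
    where each child is weighted by its share of the parent's examples."""
    total = sum(parent_counts)
    weighted = sum((sum(counts) / total) * entropy(counts) for counts in child_counts)
    return entropy(parent_counts) - weighted
```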
Information Gain Example
Parent node: 30 instances, entropy = 0.997.
Split on Balance:
  Balance < 50k: 17 instances
  Balance >= 50k: 13 instances
(Weighted) Average Impurity of Children = 0.615
Information Gain = 0.997 - 0.615 = 0.382
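Using the `information_gain` sketch above: the slide gives only the child sizes and the weighted impurity, so the per-child class counts below are a hypothetical reconstruction that is roughly consistent with those numbers, not necessarily the original figure's.

```python
# Hypothetical class counts [default, not default]: a 13-instance child with 12/1
# and a 17-instance child with 4/13, giving a weighted child entropy of about 0.615.
print(information_gain([16, 14], [[12, 1], [4, 13]]))  # ~0.38
```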
What If We Split Over “Age” First?
Parent node: 30 instances.
Split on Age:
  Age < 45: 15 instances
  Age >= 45: 15 instances
Impurity of each child = ?  Exercise!
Information Gain = ?
Recall: the gain from first splitting on “Balance” was 0.382.
Our Original Question
Now we are ready to answer the question we have been waiting for:
How do we choose which attribute to split over?
Answer: at each node, choose the attribute that obtains the maximum information gain!
Decision Tree Algorithm (Full Tree)
Step 1: Calculate the information gain from splitting over
each attribute using the dataset
Step 2: Split the set into subsets using the attribute for which
the information gain is maximum
Step 3: Make a decision tree node containing that attribute,
divide the dataset by its branches and repeat the same
process on every node.
Step 4a: A node with entropy of 0 is a leaf.
Step 4b: A node with entropy more than 0 needs further
splitting.
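Putting these steps together, here is a minimal Python sketch (an ID3-style illustration on categorical attributes, not the instructor's code); the tiny dataset at the bottom is hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """-sum(p * log2(p)) over the class proportions in `labels`."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy(parent) minus the weighted average entropy of the children
    produced by splitting on the categorical attribute `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

def build_tree(rows, labels, attributes):
    """Grow a full tree by recursive partitioning (Steps 1-4 on the slide)."""
    # Step 4a: a node with entropy 0 (all examples in one class) is a leaf;
    # also stop if no attributes are left, returning the majority class.
    if entropy(labels) == 0 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-2: choose the attribute with maximum information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    # Step 3: make a node for that attribute and recurse on each branch.
    branches = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = branches.setdefault(row[best], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    remaining = [a for a in attributes if a != best]
    return {best: {value: build_tree(r, l, remaining)
                   for value, (r, l) in branches.items()}}

# Hypothetical toy data loosely modeled on the slides (categorical attributes only).
rows = [
    {"Employed": "Yes", "Balance": ">=50K"},
    {"Employed": "No",  "Balance": ">=50K"},
    {"Employed": "No",  "Balance": "<50K"},
    {"Employed": "Yes", "Balance": "<50K"},
]
labels = ["Not Default", "Not Default", "Default", "Not Default"]
print(build_tree(rows, labels, ["Employed", "Balance"]))
```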