NPTEL
Video Course on Machine Learning
Professor Carl Gustaf Jansson, KTH
Week 4 Inductive Learning based on
Symbolic Representations
and Weak Theories
Video 4.3 Decision Tree Learning Algorithms Part 1
Agenda for the lecture
• Decision Trees in general
• TDIDT algorithms
• Information theoretical measures
• ID3 algorithm
• Treating overfitting through pruning
• Alternative algorithms
Decision Trees
The challenge is to design a decision tree that both fits the considered
data-items well and gives good predictive performance on still unseen
data-items (minimal prediction error).
Decision tree analysis is of two main types:
- Classification tree analysis when the leaves are labelled according to
the k target classes included in the data-set
- Regression tree analysis when the leaves are real numbers or intervals.
Decision trees represent a disjunction of conjunctions of constraints on
the feature values of instances, i.e. (... ∧ ... ∧ ...) ∨ (... ∧ ... ∧ ...) ∨ ...
A decision tree can also be seen as equivalent to a set of if-then rules,
where each branch represents one if-then rule: the if-part corresponds to the
conjunction of feature tests on the nodes along the branch, and the then-part
corresponds to the class label or numerical range of the branch.
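As an illustration (a minimal sketch, not taken from the lecture slides), the Python fragment below encodes one possible tree over the Outlook, Humidity and Wind features that appear in the worked example later in this lecture; each root-to-leaf branch is written as one if-then rule whose if-part is a conjunction of feature tests.

```python
def classify(instance):
    """Each root-to-leaf branch corresponds to one if-then rule:
    a conjunction of feature tests (if-part) ending in a class label (then-part)."""
    if instance["Outlook"] == "Sunny" and instance["Humidity"] == "Normal":
        return "Yes"   # (Outlook = Sunny ∧ Humidity = Normal)
    if instance["Outlook"] == "Overcast":
        return "Yes"   # (Outlook = Overcast)
    if instance["Outlook"] == "Rain" and instance["Wind"] == "Weak":
        return "Yes"   # (Outlook = Rain ∧ Wind = Weak)
    return "No"        # all remaining branches

print(classify({"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Weak"}))  # -> Yes
```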
Decision Trees and Learning Algorithms
Pros
• Easily interpretable by humans
• Very compact formalism
• Easy handling of irrelevant attributes
• Reasonable handling of missing data and noise
• Very fast at testing time
Cons
• Only axis-aligned splits of data-items
• Greedy and may not find the globally optimal tree.
Learning for this Representation
We will now focus on a particular category of learning techniques called
Top-Down Induction of Decision Trees (TDIDT).
The scenario for learning is supervised non-incremental data-driven learning from examples.
The systems are presented with a set of instances and develop a decision tree from the top down, guided
by frequency information in the examples. The trees are constructed beginning with the root of the tree
and proceeding down to its leaves.
The order in which instances are handled is not supposed to influence the build-up of the trees.
The systems typically examine and re-examine all of the instances at many stages during learning.
Building the tree from the top downward, the issue is to choose and order features that
discriminate the data-items (instances) in an optimal way; a minimal skeleton of this procedure is sketched after the subtopics below.
Subtopics:
- Use of information theoretic measures to guide the selection and ordering of features
- Avoiding underfitting and overfitting by pruning of the tree
- Generation of several decision trees in parallel (e.g. random forest).
- Introduction of some kind of Inductive Bias (e.g. Occam's razor)
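As a hedged sketch of the TDIDT scheme described above, assuming categorical features, training examples given as (feature_dict, label) pairs, and a caller-supplied choose_best_feature function (for instance one based on the Information Gain measure introduced below):

```python
from collections import Counter

def tdidt(examples, features, choose_best_feature):
    """Builds a tree top-down; examples is a list of (feature_dict, label) pairs."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:                      # pure node -> leaf with that class
        return labels[0]
    if not features:                               # no features left -> majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    best = choose_best_feature(examples, features) # greedy, frequency-guided choice
    tree = {best: {}}
    for value in {x[best] for x, _ in examples}:   # one branch per observed value (subsets are non-empty)
        subset = [(x, y) for x, y in examples if x[best] == value]
        remaining = [f for f in features if f != best]
        tree[best][value] = tdidt(subset, remaining, choose_best_feature)
    return tree
```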
Purity or Homogeneity
The entire Data-set (all training instances) is associated
with the tree as a whole (the Root).
For every decision split based on a chosen feature and its
values, the Data-set is partitioned and the sub-sets
become associated with the nodes.
This is repeated recursively down to the leaves.
Purity or homogeneity refers to the distribution of data-items over the k
target classes, both for the root and for each of the nodes. A lower degree
of class mixing implies higher purity.
Most algorithms aim to maximize the purity of all nodes.
The purity or impurity of nodes is measured by one of several
alternative information-theoretic metrics.
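As a small illustration using the class counts that appear in the worked example later in this lecture (a [9+, 5−] root split into [6+, 2−] and [3+, 3−] by the Wind feature), the sketch below prints the class distribution of each node; the majority-class fraction is used here only as a rough purity indicator, while the lecture's actual metrics follow next.

```python
from collections import Counter

root   = ["+"] * 9 + ["-"] * 5        # all 14 training instances
weak   = ["+"] * 6 + ["-"] * 2        # subset with Wind = Weak
strong = ["+"] * 3 + ["-"] * 3        # subset with Wind = Strong

for name, node in [("root", root), ("Wind=Weak", weak), ("Wind=Strong", strong)]:
    dist = Counter(node)
    purity = max(dist.values()) / len(node)   # fraction of the majority class
    print(name, dict(dist), f"majority-class purity = {purity:.2f}")
```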
Information Theoretic Measures
Information Gain based on Entropy
Information Gain based on Gini measure
Variance reduction
Information Gain and Entropy measures
Information Gain is a statistical measure that indicates how well
a given feature F separates (discriminates) the instances of an arbitrary collection
of examples S according to the target classes. |S| = cardinality of S.
Gain(S, F) = Entropy(S) − Σ_{v ∈ values(F)} (|Sv| / |S|) * Entropy(Sv)
Sv = the subset of S whose instances have value v for feature F.
Entropy is a statistical measure from information theory that characterizes
impurity of an arbitrary collection of examples = S.
For binary classification: H(S) = −p(+) log2 p(+) − p(−) log2 p(−)
For n-ary classification: H(S) = − Σ_{c ∈ target classes} p(c) * log2 p(c)
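A minimal Python sketch of these two formulas, assuming (as a representation choice for this example, not taken from the lecture) that a data set is a list of (feature_dict, class_label) pairs:

```python
from collections import Counter
from math import log2

def entropy(examples):
    """H(S) = - sum over classes c of p(c) * log2 p(c)."""
    labels = [y for _, y in examples]
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, feature):
    """Gain(S, F) = H(S) - sum over values v of (|Sv|/|S|) * H(Sv)."""
    n = len(examples)
    gain = entropy(examples)
    for value in {x[feature] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[feature] == value]
        gain -= (len(subset) / n) * entropy(subset)   # weighted entropy of the subset Sv
    return gain
```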
Gini Gain and Gini Impurity
Gini Gain is a statistical measure that indicates how well a given feature F
separates the instances in a given set S. |S|= cardinality of S.
Gini Gain(S, F) = Gini(S) − Σ_{v ∈ values(F)} (|Sv| / |S|) * Gini(Sv)
Sv = the subset of S whose instances have value v for feature F.
Gini impurity measures the likelihood that a new instance would be incorrectly
classified if it were randomly labelled according to the distribution of class
labels in the data-set.
Gini impurity: Gini(S) = 1 − Σ_{c ∈ target classes} p(c)^2
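The corresponding sketch for the Gini measures, under the same assumed (feature_dict, class_label) representation as in the entropy example:

```python
from collections import Counter

def gini(examples):
    """Gini(S) = 1 - sum over classes c of p(c)^2."""
    labels = [y for _, y in examples]
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(examples, feature):
    """Gini Gain(S, F) = Gini(S) - sum over values v of (|Sv|/|S|) * Gini(Sv)."""
    n = len(examples)
    gain = gini(examples)
    for value in {x[feature] for x, _ in examples}:
        subset = [(x, y) for x, y in examples if x[feature] == value]
        gain -= (len(subset) / n) * gini(subset)      # weighted Gini impurity of Sv
    return gain
```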
Example
Example cont.
Entropy H of S (whole Data-set)
S={D1,...,D14}=[9+,5−]
H(S) = −(9/14) * log2(9/14) − (5/14) * log2(5/14) = 0.940
Information Gain for Wind feature
S for Weak value ={D1,D3,D4,D5,D8,D9,D10,D13}=[6+,2−]
S for Strong value = {D2,D6,D7,D11,D12,D14} = [3+,3−]
Gain(S, Wind) = H(S) − Σ_{v ∈ values(Wind)} (|Sv| / |S|) * H(Sv)
= H(S) − (8/14) * H(S_Weak) − (6/14) * H(S_Strong) = 0.940 − (8/14) * 0.811 − (6/14) * 1.0 = 0.048
Information gains for the four features:
Gain(S,Outlook)=0.246 Gain(S,Humidity)=0.151
Gain(S,Wind)=0.048 Gain(S,Temperature)=0.029
Outlook has the highest Information Gain and is the preferred feature to discriminate
among data-items.
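The numbers above can be checked directly from the class counts [9+, 5−], [6+, 2−] and [3+, 3−] given in the example (the full table of days is not reproduced here):

```python
from math import log2

def H(pos, neg):
    """Binary entropy from class counts."""
    total = pos + neg
    return -sum((c / total) * log2(c / total) for c in (pos, neg) if c > 0)

H_S      = H(9, 5)   # whole data set, about 0.940
H_weak   = H(6, 2)   # Wind = Weak,   about 0.811
H_strong = H(3, 3)   # Wind = Strong, exactly 1.0
gain_wind = H_S - (8 / 14) * H_weak - (6 / 14) * H_strong
print(round(H_S, 3), round(gain_wind, 3))   # -> 0.94 0.048
```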
Example cont.
Gini impurity of S (whole Data-set)
S={D1,...,D14}=[9+,5−]
Gini(S) = 1 − (9/14)^2 − (5/14)^2 = ...
Gini impurity for Wind feature
S for Weak value ={D1,D3,D4,D5,D8,D9,D10,D13}=[6+,2−]
S for Strong value = {D2,D6,D7,D11,D12,D14} = [3+,3−]
Gini(S_Weak) = 1 − (6/8)^2 − (2/8)^2 = ...
Gini(S_Strong) = 1 − (3/6)^2 − (3/6)^2 = ...
Gini Gain for Wind feature
Gini Gain(S, Wind) = Gini(S) − Σ_{v ∈ values(Wind)} (|Sv| / |S|) * Gini(Sv)
= Gini(S) − (8/14) * Gini(S_Weak) − (6/14) * Gini(S_Strong) = ...
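The Gini values left open above can be filled in the same way from the class counts:

```python
def gini_counts(pos, neg):
    """Binary Gini impurity from class counts."""
    total = pos + neg
    return 1.0 - (pos / total) ** 2 - (neg / total) ** 2

G_S      = gini_counts(9, 5)   # about 0.459
G_weak   = gini_counts(6, 2)   # 0.375
G_strong = gini_counts(3, 3)   # 0.5
gini_gain_wind = G_S - (8 / 14) * G_weak - (6 / 14) * G_strong
print(round(G_S, 3), round(gini_gain_wind, 3))   # -> 0.459 0.031
```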
To be continued in Part 2