Module-2
Decision Tree Learning
• Introduction,
• Decision tree representation,
• Appropriate problems for decision tree learning,
• Issues in decision tree learning,
• Basic decision tree learning algorithm,
• Tree construction,
• Testing and storing the classifier,
• Example: using decision trees to predict contact lens type
Decision tree
Decision tree learning is a method for approximating discrete-valued target functions,
in which the learned function is represented by a decision tree.
(Figure: raw data being turned into a decision-making tree.)
Terminology used in Decision Tree
• Root Node: the topmost node; it represents the entire sample and is split first
• Splitting: dividing a node into two or more sub-nodes
• Decision Node: a sub-node that splits into further sub-nodes
• Leaf / Terminal Node: a node that is not split any further
• Pruning: removing sub-nodes of a decision node (the opposite of splitting)
• Branch / Sub-Tree: a subsection of the entire tree
• Parent and Child Node: a node that is divided into sub-nodes is the parent; the sub-nodes are its children
Decision tree representation
Convert the following Boolean functions into decision trees.
1. A Λ B (A AND B)
Truth table:
A  B  | A Λ B
T  T  |   T
T  F  |   F
F  T  |   F
F  F  |   F
Decision tree: test A at the root. The A = F branch is a leaf labelled F (the function is
false whenever A is false). The A = T branch tests B: the B = T branch is a leaf labelled T,
and the B = F branch is a leaf labelled F. (A small Python sketch of this tree follows the
exercise list below.)
2. A XOR B
3. A Λ ¬B
4. A V [B Λ C]
5. [A Λ B] V [C Λ D]
Appropriate problems for decision tree learning
o Instances are represented by attribute-value pairs
o The target function has discrete output values
o Disjunctive descriptions may be required
o The training data may contain errors
o The training data may contain missing attribute values
Problems:
Classify medical patients by their disease
Equipment malfunctions by their cause
Loan applicants by their likelihood of defaulting on payments
Entropy
Entropy:
Entropy is a measure of the impurity, disorder, or uncertainty in a
collection of examples.
• Example: S is a collection of 14 examples → [9+, 5-]
Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14)
Entropy(S) = 0.940
• The entropy is 0 if all members of S belong to the same class.
For example, if all members are positive (taking 0 log2 0 = 0):
Entropy(S) = -(1) log2(1) - (0) log2(0) = 0
• The entropy is 1 when the collection contains an equal number of
positive and negative examples.
• If the collection contains unequal numbers of positive and negative
examples, the entropy is between 0 and 1.
• Entropy of S (p+ = proportion of positive examples, p- = proportion of negative examples):
Entropy(S) = - p+ log2 p+ - p- log2 p-
(More generally, for c classes: Entropy(S) = Σi - pi log2 pi.)
• Interpretation of entropy:
p+ = 1   → Entropy(S) = 0
p+ = 0.5 → Entropy(S) = 1
p+ = 0.8 → Entropy(S) = -0.8 log2 0.8 - 0.2 log2 0.2 ≈ 0.72
(Figure: entropy as a function of p+, equal to 0 at p+ = 0 and p+ = 1, with its maximum of 1 at p+ = 0.5.)
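A small Python sketch (an illustration, not from the slides) of the entropy calculation above; entropy([9, 5]) reproduces the 0.940 value.

# Minimal sketch: entropy of a collection given its class counts.
import math

def entropy(counts):
    """Entropy in bits of a collection with the given class counts, e.g. [9, 5]."""
    total = sum(counts)
    result = 0.0
    for c in counts:
        if c > 0:                      # convention: 0 * log2(0) = 0
            p = c / total
            result -= p * math.log2(p)
    return result

print(round(entropy([9, 5]), 3))   # 0.94
print(entropy([14, 0]))            # 0.0  (all one class)
print(entropy([7, 7]))             # 1.0  (evenly split)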
INFORMATION GAIN MEASURES THE EXPECTED
REDUCTION IN ENTROPY
• Given entropy as a measure of the impurity in a collection of training examples,
we can now define a measure of the effectiveness of an attribute in classifying
the training data.
• The measure we will use, called information gain, is simply the expected
reduction in entropy caused by partitioning the examples according to this
attribute. More precisely, the information gain, Gain(S, A), of an attribute A,
relative to a collection of examples S, is defined as
Gain(S, A) = Entropy(S) - Σ (|Sv| / |S|) Entropy(Sv),  summed over v ∈ Values(A)
S → collection of examples, A → attribute, Values(A) → possible values of attribute A
Sv → subset of S for which attribute A has value v
Gain(S, A) is therefore the expected reduction in entropy caused by knowing the value of
attribute A. Put another way, Gain(S, A) is the information provided about the target
function value, given the value of some other attribute A.
Example:
• Suppose S is a collection of training-example days described by attributes
including Wind, which can have the values Weak or Strong.
• As before, assume S is a collection containing 14 examples, [9+, 5-].
• Of these 14 examples, suppose 6 of the positive and 2 of the negative
examples have Wind = Weak, and the remainder have Wind = Strong.
The information gain due to sorting the original 14 examples by the attribute
Wind may then be calculated as Gain(S, Wind):
Values(Wind) = { Weak, Strong }
S = [9+, 5-]
Sweak = [6+, 2-] → Entropy(Sweak) = 0.811
Sstrong = [3+, 3-] → Entropy(Sstrong) = 1.00
Gain(S, Wind) = Entropy(S) - (8/14) Entropy(Sweak) - (6/14) Entropy(Sstrong)
             = 0.940 - (8/14)(0.811) - (6/14)(1.00)
             = 0.048
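A Python sketch (an illustration, not from the slides) of the Gain(S, A) formula; applied to the Wind split above it reproduces approximately 0.048.

# Minimal sketch: information gain of splitting a labelled sample on one attribute.
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Gain(S, A) for examples given as dicts, e.g. {"Wind": "Weak", "PlayTennis": "Yes"}."""
    labels = [e[target] for e in examples]
    gain = entropy(labels)
    for value in set(e[attribute] for e in examples):
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    return gain

# The Wind example above: 6+/2- with Wind = Weak, 3+/3- with Wind = Strong.
S = ([{"Wind": "Weak", "PlayTennis": "Yes"}] * 6 + [{"Wind": "Weak", "PlayTennis": "No"}] * 2 +
     [{"Wind": "Strong", "PlayTennis": "Yes"}] * 3 + [{"Wind": "Strong", "PlayTennis": "No"}] * 3)
print(round(information_gain(S, "Wind", "PlayTennis"), 3))   # about 0.048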
• Consider the following set of training examples (the table itself is a figure in the slides:
six examples, three labelled + and three labelled -, described by Boolean attributes including a2).
a) What is the entropy of this collection of training example with respect to
the target function classification?
b) What is the information gain of a2 relative to these training examples?
a) Number of +ve instances = 3
   Number of -ve instances = 3
   Entropy(S) = 1
b) Gain(S, a2) = Entropy(S) - [ (4/6) Entropy(ST) + (2/6) Entropy(SF) ]
   Entropy(ST) = -(2/4) log2(2/4) - (2/4) log2(2/4) = 1
   Entropy(SF) = -(1/2) log2(1/2) - (1/2) log2(1/2) = 1
   Gain(S, a2) = 1 - [ (4/6)(1) + (2/6)(1) ]
              = 1 - 1
              = 0
Algorithms and splitting criteria used in decision trees:
ID3
Gini Index
Chi-Square
Reduction in Variance
• ID3 (Iterative Dichotomiser 3) was introduced by J. R. Quinlan in 1986.
Basic decision tree learning algorithm
• Most algorithms that have been developed for learning decision trees
are variations on a core algorithm that employs a top-down, greedy
search through the space of possible decision trees.
• The basic algorithm, ID3, learns decision trees by constructing them
top-down, beginning with the question "which attribute should be tested
at the root of the tree?"
• To answer this question, each instance attribute is evaluated using a
statistical test to determine how well it alone classifies the training
examples.
• The best attribute is selected and used as the test at the root node of the
tree. A descendant of the root node is then created for each possible
value of this attribute, and the training examples are sorted to the
appropriate descendant node.
• "Which attribute should be tested at the root of the tree?"
  - Each instance attribute is evaluated using a statistical test.
  - The best attribute is chosen, and a descendant of the root node is created for each of its values.
  - (For boolean-valued target functions, this is a form of concept learning.)
• Which attribute is the best classifier?
  - Attributes are selected using information gain, which is defined in terms of entropy.
Basic decision tree learning algorithm
• A simplified version of the algorithm, specialized to learning
boolean-valued functions (i.e., concept learning)
Algorithm ID3 (Examples, TargetAttribute, Attributes)
1. Create a Root node for the tree
2. If all Examples are positive, return the single-node tree Root with label = +
3. If all Examples are negative, return the single-node tree Root with label = -
4. If Attributes is empty, return the single-node tree Root with label = the most
   common value of TargetAttribute in Examples
   Otherwise Begin
     A ← the attribute from Attributes that best classifies Examples
     The decision attribute for Root ← A
     For each possible value vi of A,
       Add a new tree branch below Root, corresponding to the test A = vi
       Let Examples_vi be the subset of Examples that have value vi for A
       If Examples_vi is empty
         Then below this new branch add a leaf node with label = most
         common value of TargetAttribute in Examples
         Else below this new branch add the subtree
         ID3(Examples_vi, TargetAttribute, Attributes - {A})
   End
5. Return Root
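A compact Python sketch of the pseudocode above (an illustration, not the textbook's code): examples are assumed to be dicts mapping attribute names to values, the returned tree is a nested dict of the form {attribute: {value: subtree}}, and leaves are class labels.

# Minimal ID3 sketch: examples are dicts, e.g. {"Outlook": "Sunny", ..., "PlayTennis": "No"}.
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute, target):
    labels = [e[target] for e in examples]
    gain = entropy(labels)
    for value in set(e[attribute] for e in examples):
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    return gain

def most_common(labels):
    return Counter(labels).most_common(1)[0][0]

def id3(examples, target, attributes):
    labels = [e[target] for e in examples]
    # Steps 2-4: stop if the examples are pure or no attributes remain.
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return most_common(labels)
    # Choose the attribute that best classifies the examples (highest information gain).
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    remaining = [a for a in attributes if a != best]
    for value in set(e[best] for e in examples):
        subset = [e for e in examples if e[best] == value]
        # Branches are created only for values seen in the examples, so subsets are never empty;
        # the pseudocode's "most common value" leaf covers values unseen at this node.
        tree[best][value] = id3(subset, target, remaining)
    return tree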
Summary of the ID3 algorithm
• Summary of the ID3 algorithm specialized to learning
boolean-valued functions. ID3 is a greedy algorithm that
grows the tree top-down, at each node selecting the attribute
that best classifies the local training examples.
• This process continues until the tree perfectly classifies the
training examples, or until all attributes have been used.
An Illustrative Example
• Training examples for the target concept PlayTennis (the standard data set used with ID3):
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
• Entropy(S), S = [9+, 5-]
  = - PYes log2 PYes - PNo log2 PNo = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
• Information Gain
1. Gain(S, Outlook)
Values(Outlook) = { Sunny, Overcast, Rain }
S = [9+, 5-]
OutlookSunny = [2+, 3-]      Entropy(OutlookSunny) = 0.971
OutlookOvercast = [4+, 0-]   Entropy(OutlookOvercast) = 0.0
OutlookRain = [3+, 2-]       Entropy(OutlookRain) = 0.971
Gain(S, Outlook) = E(S) - (5/14) E(OutlookSunny) - (4/14) E(OutlookOvercast) - (5/14) E(OutlookRain)
                 = 0.940 - (5/14)(0.971) - 0 - (5/14)(0.971)
                 = 0.246
2. Gain(S, Temperature)
Values(Temperature) = { Hot, Mild, Cool }
S = [9+, 5-]
TemperatureHot = [2+, 2-]    Entropy(TemperatureHot) = 1.0
TemperatureMild = [4+, 2-]   Entropy(TemperatureMild) = 0.918
TemperatureCool = [3+, 1-]   Entropy(TemperatureCool) = 0.811
Gain(S, Temperature) = E(S) - (4/14) E(TemperatureHot) - (6/14) E(TemperatureMild) - (4/14) E(TemperatureCool)
                     = 0.940 - (4/14)(1.0) - (6/14)(0.918) - (4/14)(0.811)
                     = 0.029
3. Gain(S, Humidity)
Values(Humidity) = { High, Normal }
S = [9+, 5-]
HumidityHigh = [3+, 4-]     Entropy(HumidityHigh) = 0.985
HumidityNormal = [6+, 1-]   Entropy(HumidityNormal) = 0.592
Gain(S, Humidity) = E(S) - (7/14) Entropy(HumidityHigh) - (7/14) Entropy(HumidityNormal)
                  = 0.940 - (7/14)(0.985) - (7/14)(0.592)
                  = 0.151
4. Gain(S, Wind)
Values(Wind) = { Weak, Strong }
S = [9+, 5-]
Sweak = [6+, 2-]    Entropy(Sweak) = 0.811
Sstrong = [3+, 3-]  Entropy(Sstrong) = 1.00
Gain(S, Wind) = Entropy(S) - (8/14) Entropy(Sweak) - (6/14) Entropy(Sstrong)
              = 0.940 - (8/14)(0.811) - (6/14)(1.00)
              = 0.048
1. Gain(S, Outlook) = 0.246
2. Gain(S, Temperature) = 0.029
3. Gain(S, Humidity) = 0.151
4. Gain(S, Wind) = 0.048
• Outlook has the highest information gain, so it is selected as the decision attribute for
  the root node, with one branch for each of its values (Sunny, Overcast, Rain).
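A short check (illustrative, not from the slides) that reproduces the four gains above from the PlayTennis table shown earlier; the helper functions are repeated so the snippet runs on its own.

# Minimal check of the four information gains on the PlayTennis examples.
import math
from collections import Counter

DATA = [  # (Outlook, Temperature, Humidity, Wind, PlayTennis)
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]
COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
EXAMPLES = [dict(zip(COLUMNS, row)) for row in DATA]

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute, target="PlayTennis"):
    labels = [e[target] for e in examples]
    gain = entropy(labels)
    for value in set(e[attribute] for e in examples):
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    return gain

for attribute in COLUMNS[:-1]:
    print(attribute, round(information_gain(EXAMPLES, attribute), 3))
# Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048
# (the slides round these to 0.246 and 0.151)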
Overcast outlook on decision
• If Outlook is Overcast → Yes (all four Overcast examples are positive, so this branch
becomes a leaf).
• Sunny outlook on decision
• Entropy(OutlookSunny) → 5 instances → S(2+, 3-) → 0.9709
1. Gain(Outlook=Sunny | Temperature) = 0.570
2. Gain(Outlook=Sunny | Humidity) = 0.970
3. Gain(Outlook=Sunny | Wind) = 0.019
• Humidity has the highest gain, so it is tested below the Sunny branch.
• If humidity is High → No
• If humidity is Normal → Yes
• Rain outlook on decision
• Entropy(OutlookRain) → 5 instances → S(3+, 2-) → 0.9709
1. Gain(Outlook=Rain | Temperature) = 0.0199
2. Gain(Outlook=Rain | Humidity) = 0.0199
3. Gain(Outlook=Rain | Wind) = 0.9709
• Wind has the highest gain, so it is tested below the Rain branch.
• If Wind is Weak and Outlook is Rain → Yes
• If Wind is Strong and Outlook is Rain → No
• Final version of the decision tree:
Outlook = Sunny    → test Humidity:  High → No,   Normal → Yes
Outlook = Overcast → Yes
Outlook = Rain     → test Wind:      Strong → No, Weak → Yes
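A small sketch (an illustration, not from the slides) of how the final tree can be applied to a new example when it is stored as a nested dict like the ones produced by the ID3 sketch above.

# The learned PlayTennis tree as a nested dict, mirroring the structure above.
PLAY_TENNIS_TREE = {"Outlook": {
    "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
    "Overcast": "Yes",
    "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}}}}

def classify(tree, example):
    """Walk the nested-dict tree until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        tree = tree[attribute][example[attribute]]
    return tree

print(classify(PLAY_TENNIS_TREE,
               {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Weak"}))  # No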
• Identify the entropy, information gain and draw the decision tree for the
following training examples (the example table is a figure in the slides: each example is
described by Gender, Car Ownership, Travel Cost and Income Level, and the class to
predict is the mode of transport, e.g. Bus or Train).
Information gains for the four attributes:
Gain(S, Gender) = 0.101
Gain(S, Car Ownership) = 0.3382
Gain(S, Travel Cost) = 0.76
Gain(S, Income Level) = 0.437
Travel Cost has the highest information gain, so it is chosen as the root node.
(Figure: the resulting decision tree. Travel Cost is the root with branches Cheap, Standard
and Expensive; the Standard and Expensive branches end in leaves (Bus and Train in the
figure's layout), while the Cheap branch is split on Gender, with Male → Bus and Female
split further on Income Level into Bus and Train leaves.)
HYPOTHESIS SPACE SEARCH IN DECISION TREE LEARNING
• As with other inductive learning methods, ID3 can be characterized as
searching a space of hypotheses for one that fits the training examples.
• The hypothesis space searched by ID3 is the set of possible decision trees.
• ID3 performs a simple-to-complex, hill-climbing search through this
hypothesis space, beginning with the empty tree, then considering
progressively more elaborate hypotheses in search of a decision tree that
correctly classifies the training data.
• The evaluation function that guides this hill-climbing search is the
information gain measure.
(Figure: the hypothesis space search performed by ID3, growing from the empty tree to
progressively more elaborate decision trees.)
Capabilities and Limitations Of ID3 algorithm
• ID3's hypothesis space of all decision trees is a complete space of finite
discrete-valued functions, relative to the available attributes.
– Because every finite discrete-valued function can be represented by some decision tree, ID3 avoids one of the major
risks of methods that search incomplete hypothesis spaces (such as methods that consider only conjunctive
hypotheses): that the hypothesis space might not contain the target function.
• ID3 maintains only a single current hypothesis as it searches through the
space of decision trees.
• ID3 in its pure form performs no backtracking in its search.
– Once it selects an attribute to test at a particular level in the tree, it never backtracks to reconsider this choice.
Therefore, it is susceptible to the usual risks of hill-climbing search without backtracking: converging to locally
optimal solutions that are not globally optimal.
• ID3 uses all training examples at each step in the search to make statistically
based decisions regarding how to refine its current hypothesis.
Issues in Decision Tree Learning
1. Avoiding Overfitting the Data
Reduced error pruning
Rule post-pruning
2. Incorporating Continuous-Valued Attributes
3. Alternative Measures for Selecting Attributes
4. Handling Training Examples with Missing Attribute Values
5. Handling Attributes with Differing Costs
Avoiding Overfitting the Data
o Informally: a hypothesis h may fit a handful of training points (e.g. [0, 1, 2]) exactly,
while an alternative hypothesis h' fits the whole range of instances (e.g. [0..100]) better.
o Overfit: Given a hypothesis space H, a hypothesis h ∈ H is said to overfit the
training data if there exists some alternative hypothesis h' ∈ H, such that h has
smaller error than h' over the training examples, but h' has a smaller error than h
over the entire distribution of instances.
Reasons for overfitting:
o Overfitting can occur when the training examples contain random
errors or noise
o When small numbers of examples are associated with leaf nodes.
Approaches to avoid overfitting
o Pre-pruning (avoidance)
o Post-pruning (recovery)
o Reduced Error Pruning
o Rule post pruning
• Reduced-Error Pruning
Consider each decision node as a candidate for pruning: remove the subtree rooted at that
node, make it a leaf, and assign it the most common classification of the training examples
affiliated with that node. A node is pruned only if the resulting tree performs no worse than
the original over a separate validation set (a sketch follows the list below).
• Training set: used to form the learned tree
• Validation set: used to evaluate the effect of pruning nodes from the tree
• Test set: used to estimate the accuracy of the final tree on unseen data
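A minimal Python sketch of reduced-error pruning for nested-dict trees (illustrative only; it assumes the tree was grown from the same training examples passed in, and example/function names are not from the slides).

# Bottom-up reduced-error pruning for trees of the form {attribute: {value: subtree}}.
from collections import Counter

def classify(tree, example):
    """Walk the tree; fall back to an arbitrary branch for attribute values unseen in training."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        branches = tree[attribute]
        tree = branches.get(example[attribute], next(iter(branches.values())))
    return tree

def accuracy(tree, examples, target):
    if not examples:
        return 1.0          # no validation examples reach this node, so pruning cannot hurt
    return sum(classify(tree, e) == e[target] for e in examples) / len(examples)

def reduced_error_prune(tree, train, valid, target):
    """Replace a subtree with a leaf whenever validation accuracy does not drop."""
    if not isinstance(tree, dict):
        return tree
    attribute = next(iter(tree))
    # First prune the children, passing down the examples that reach each branch.
    for value, subtree in tree[attribute].items():
        tree[attribute][value] = reduced_error_prune(
            subtree,
            [e for e in train if e[attribute] == value],
            [e for e in valid if e[attribute] == value],
            target)
    # Candidate leaf: the most common classification of the training examples at this node.
    leaf = Counter(e[target] for e in train).most_common(1)[0][0]
    # Comparing on the validation examples that reach this node is equivalent to comparing
    # whole-tree validation accuracy, since the rest of the tree is unchanged.
    if accuracy(leaf, valid, target) >= accuracy(tree, valid, target):
        return leaf
    return tree

A single bottom-up pass is used here; the textbook description iterates pruning greedily until further pruning is harmful, which this sketch approximates.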
• Rule Post-Pruning
Convert the learned tree into an equivalent set of rules, one rule per path from the root to a
leaf; prune each rule by removing any precondition whose removal does not worsen the
rule's estimated accuracy; then sort the pruned rules by estimated accuracy and apply them
in that order when classifying new instances (a sketch of the rule-extraction step is below).
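A sketch (illustrative only) of the rule-extraction step for a nested-dict tree; each rule is a list of (attribute, value) preconditions plus the class label at the leaf. The precondition-pruning and sorting steps are not shown.

# Minimal sketch: convert a nested-dict decision tree into IF-THEN rules.
def extract_rules(tree, preconditions=()):
    """Return a list of (preconditions, class_label) pairs, one per root-to-leaf path."""
    if not isinstance(tree, dict):                      # reached a leaf
        return [(list(preconditions), tree)]
    attribute = next(iter(tree))
    rules = []
    for value, subtree in tree[attribute].items():
        rules.extend(extract_rules(subtree, preconditions + ((attribute, value),)))
    return rules

tree = {"Outlook": {"Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
                    "Overcast": "Yes",
                    "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}}}}
for conditions, label in extract_rules(tree):
    print("IF " + " AND ".join(f"{a} = {v}" for a, v in conditions) + f" THEN PlayTennis = {label}")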
• Incorporating Continuous-Valued Attributes
For a continuous attribute, ID3 can dynamically define new boolean attributes of the form
A > c that are true when the attribute value exceeds a threshold c. Candidate thresholds are
placed midway between adjacent values where the classification changes:
Temperature: 40   48   60   72   80   90
PlayTennis:  No   No   Yes  Yes  Yes  No
• Candidate thresholds: (48 + 60)/2 = 54 and (80 + 90)/2 = 85
• Candidate boolean attributes: Temperature > 54 and Temperature > 85
• The candidate with the highest information gain is then used like any other attribute.
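A small sketch (illustrative, not the slides' code) that finds the candidate thresholds for the Temperature example above and picks the one with the highest information gain.

# Minimal sketch: candidate split points for a continuous attribute.
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def candidate_thresholds(values, labels):
    """Midpoints between adjacent values where the class label changes."""
    pairs = sorted(zip(values, labels))
    return [(pairs[i][0] + pairs[i + 1][0]) / 2
            for i in range(len(pairs) - 1)
            if pairs[i][1] != pairs[i + 1][1]]

def threshold_gain(values, labels, threshold):
    """Information gain of the boolean attribute (value > threshold)."""
    above = [l for v, l in zip(values, labels) if v > threshold]
    below = [l for v, l in zip(values, labels) if v <= threshold]
    return (entropy(labels)
            - (len(above) / len(labels)) * entropy(above)
            - (len(below) / len(labels)) * entropy(below))

temperature = [40, 48, 60, 72, 80, 90]
play_tennis = ["No", "No", "Yes", "Yes", "Yes", "No"]
thresholds = candidate_thresholds(temperature, play_tennis)
print(thresholds)                                                                   # [54.0, 85.0]
print(max(thresholds, key=lambda t: threshold_gain(temperature, play_tennis, t)))   # 54.0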
Alternative Measures for Selecting Attributes
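Information gain favours attributes with many distinct values (an extreme case is an attribute such as Date, which perfectly separates the training data but is useless for new instances). One alternative measure is the gain ratio, which divides the gain by the attribute's split information, SplitInformation(S, A) = - Σ (|Si|/|S|) log2(|Si|/|S|) over the values of A. A minimal sketch (illustrative; the helper functions are repeated so the snippet stands alone):

# Minimal sketch: gain ratio = information gain / split information.
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute, target):
    labels = [e[target] for e in examples]
    gain = entropy(labels)
    for value in set(e[attribute] for e in examples):
        subset = [e[target] for e in examples if e[attribute] == value]
        gain -= (len(subset) / len(examples)) * entropy(subset)
    return gain

def split_information(examples, attribute):
    """Entropy of S with respect to the values of the attribute itself."""
    return entropy([e[attribute] for e in examples])

def gain_ratio(examples, attribute, target):
    # Note: undefined when the attribute has a single value (split information = 0).
    return information_gain(examples, attribute, target) / split_information(examples, attribute)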
• Tree construction: refer to the Python implementation in Harrington, page 39.
• Testing and storing the classifier: Harrington, page 56.
• Example: using decision trees to predict contact lens type: Harrington, page 57.
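Harrington's chapter persists the learned tree so it does not have to be rebuilt for every query. A sketch in that spirit (function names here are illustrative, not necessarily the book's), assuming the tree is stored as a nested dict like the ones above:

# Minimal sketch: persist a learned tree with pickle and reload it for classification.
import pickle

def store_tree(tree, filename):
    """Serialize the nested-dict tree to disk."""
    with open(filename, "wb") as f:
        pickle.dump(tree, f)

def grab_tree(filename):
    """Reload a previously stored tree."""
    with open(filename, "rb") as f:
        return pickle.load(f)

tree = {"Outlook": {"Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
                    "Overcast": "Yes",
                    "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}}}}
store_tree(tree, "play_tennis_tree.pkl")
print(grab_tree("play_tennis_tree.pkl") == tree)   # True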