
Exercises on Decision Trees

1. Present a decision tree that represents each of the following boolean functions:

a) A ∧ ¬B
b) (A ∧ B) ∨ (C ∧ D)

2. Build the complete set of decision trees from the following data:

A1 A2 Y
0 1 0
1 0 1
0 0 0
1 1 1

3. Suppose that one intends to build the decision tree using the ID3 algorithm, based on the dataset presented in the following table:

A1 A2 A3 Y
0 1 1 1
1 0 1 1
0 0 0 0
1 1 1 0

a) Calculate the average entropy of each of the attributes.
b) Which attribute would be chosen to split the data?
c) Build the decision tree.
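
As a check on the hand calculations, here is a minimal Python sketch (the tuple encoding of the table is an assumption for illustration) that computes the entropy of the class and the average (conditional) entropy of each attribute:

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Shannon entropy, in bits, of a list of class labels."""
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def avg_entropy(rows, attr, target=-1):
        """Average entropy of the target after splitting the rows on column attr."""
        n = len(rows)
        groups = {}
        for row in rows:
            groups.setdefault(row[attr], []).append(row[target])
        return sum(len(g) / n * entropy(g) for g in groups.values())

    # Dataset of this exercise: columns A1, A2, A3, Y
    rows = [(0, 1, 1, 1), (1, 0, 1, 1), (0, 0, 0, 0), (1, 1, 1, 0)]
    for i, name in enumerate(["A1", "A2", "A3"]):
        # Information gain = entropy of Y minus the average entropy of the split
        print(name, avg_entropy(rows, i))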

4. Consider the following training set of examples:

Example A1 A2 Class
1 dark tall -
2 light tall +
3 light short -
4 dark short +
5 dark tall -
6 light tall +

a) Calculate the expected information (entropy) of this set of examples and the information gain associated with attribute A2. Can A2 be the root of the decision tree obtained by the ID3 algorithm?
b) Present the complete decision tree that would be produced by the ID3 algorithm.
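
For exercises like this one, the following compact ID3 sketch (building on the entropy helpers from exercise 3; the string encoding of the data is an assumption) returns the tree as nested dictionaries:

    def id3(rows, attrs, target=-1):
        """ID3: recursively pick the attribute with maximum gain; leaves are class labels."""
        labels = [r[target] for r in rows]
        if len(set(labels)) == 1:                # pure node: return its class
            return labels[0]
        if not attrs:                            # no attributes left: majority class
            return Counter(labels).most_common(1)[0][0]
        # Minimizing average entropy is the same as maximizing information gain
        best = min(attrs, key=lambda a: avg_entropy(rows, a, target))
        return {(best, v): id3([r for r in rows if r[best] == v],
                               [a for a in attrs if a != best], target)
                for v in set(r[best] for r in rows)}

    # Data of this exercise, as translated above: A1 in {dark, light}, A2 in {tall, short}
    rows = [("dark", "tall", "-"), ("light", "tall", "+"), ("light", "short", "-"),
            ("dark", "short", "+"), ("dark", "tall", "-"), ("light", "tall", "+")]
    print(id3(rows, [0, 1]))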


5. Consider the following dataset:

A1 A2 Y
0 0 1
0 0 0
0 0 1
0 1 0
0 1 0
0 1 1
1 0 1
1 0 0
1 0 1
1 1 1
1 1 1
1 1 1

a) Is it possible to obtain a hypothesis that makes no errors on these data?
b) What will be the average entropy if we choose attribute A1?
c) And what if we choose A2?
d) What is the tree obtained, considering that when a leaf cannot be made uniform, the most common class is chosen as its value?
e) Could this tree still be simplified?
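
The avg_entropy helper from exercise 3 answers b) and c) directly; a quick check, with the table encoded as (A1, A2, Y) tuples:

    rows = [(0, 0, 1), (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 0), (0, 1, 1),
            (1, 0, 1), (1, 0, 0), (1, 0, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1)]
    print(avg_entropy(rows, 0))  # average entropy when splitting on A1
    print(avg_entropy(rows, 1))  # average entropy when splitting on A2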

6. Imagine that you want to use the ID3 algorithm to learn a function, and that the following set of examples and counter-examples is presented:

A1 A2 A3 A4 A5
0 1 1 0 0
1 0 1 0 0
1 1 1 0 1
0 0 0 1 1
1 0 0 1 0
0 1 0 1 0

a) Build the decision tree.
b) What would the decision tree be if the subdivision process stopped whenever the average gain cannot be improved, choosing as a leaf's value the most common class?
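
For b), one reading of the stopping rule is sketched below (an assumption about the intended variant, reusing the helpers from exercises 3 and 4): a node is expanded only if the best split strictly improves on the node's own entropy; otherwise the most common class is returned.

    def id3_gain_guard(rows, attrs, target=-1):
        """ID3 variant: stop expanding when the best split yields no positive gain."""
        labels = [r[target] for r in rows]
        majority = Counter(labels).most_common(1)[0][0]
        if len(set(labels)) == 1 or not attrs:
            return majority
        best = min(attrs, key=lambda a: avg_entropy(rows, a, target))
        if entropy(labels) - avg_entropy(rows, best, target) <= 0:
            return majority                      # no improvement in gain: majority leaf
        return {(best, v): id3_gain_guard([r for r in rows if r[best] == v],
                                          [a for a in attrs if a != best], target)
                for v in set(r[best] for r in rows)}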

7. Admission of candidates to the doctoral programme at the fictional University of Martinland is based on four criteria: the final grade, the ranking of the university where the degree was taken, the publication record, and the letters of recommendation. For simplicity, the grade can take three values: 4.0, 3.7, and 3.5. The university can be ranked among the top 10 (top-10), between the 10th and 20th best (top-20), or between the 20th and 30th best (top-30). The publication record is a binary attribute (the candidate has either published or not), and the letters of recommendation can be good or normal. Finally, candidates are classified as accepted (A) or rejected (R). The following table shows a set of examples of PhD candidates and their respective classifications.


Grade Ranking Published Recommendation Class
4.0 top-10 yes good A
4.0 top-10 no good A
4.0 top-20 no normal A
3.7 top-10 yes good A
3.7 top-20 no good R
3.7 top-30 yes good A
3.7 top-30 no good R
3.7 top-10 no good R
3.5 top-20 yes normal R
3.5 top-10 no normal R
3.5 top-30 yes normal R
3.5 top-30 no good R

Present the complete decision tree that would be produced by the ID3 algorithm.

8. Consider the problem of deciding whether or not to wait for a table at a restaurant. The objective is to learn a definition for the goal predicate Wait, expressed as a decision tree. The following attributes are available to describe the example situations:

Alternative: whether or not there is an alternative restaurant nearby.
Bar: whether or not the restaurant has a bar.
Friday/Saturday: whether the day is a Friday or a Saturday.
Hunger: whether or not we are hungry.
Customers: how many people are in the restaurant (none, some, full).
Price: three possible price ranges ($, $$, $$$).
Rain: whether or not it is raining outside.
Reservation: whether or not a reservation was made.
Type: the type of restaurant (French, Italian, Thai, or burger).
Wait estimate: the estimated waiting time (0-10 minutes, 10-30, 30-60, >60).

Alt Bar Fri/Sat Hungry Customers Price Rain Reserv Type WaitEst Wait
Yes No No Yes Some $$$ No Yes French 0-10 Yes
Yes No No Yes Full $ No No Thai 30-60 No
No Yes No No Some $ No No Burger 0-10 Yes
Yes No Yes Yes Full $ No No Thai 10-30 Yes
Yes No Yes No Full $$$ No Yes French >60 No
No Yes No Yes Some $$ Yes Yes Italian 0-10 Yes
No Yes No No None $ Yes No Burger 0-10 No
No No No Yes Some $$ Yes Yes Thai 0-10 Yes
No Yes Yes No Full $ Yes No Burger >60 No
Yes Yes Yes Yes Full $$$ No Yes Italian 10-30 No
No No No No None $ No No Thai 0-10 No
Yes Yes Yes Yes Full $ No No Burger 30-60 Yes


a) Use the Weka tool, with the ID3 algorithm, to build the tree that represents these data.
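
Weka reads datasets in its native ARFF format, and its ID3 implementation handles only nominal attributes, which these already are. A sketch of such a file for this table (the attribute and value names are translations chosen here, not fixed by the exercise):

    @relation restaurant
    @attribute Alt {Yes,No}
    @attribute Bar {Yes,No}
    @attribute FriSat {Yes,No}
    @attribute Hungry {Yes,No}
    @attribute Customers {None,Some,Full}
    @attribute Price {'$','$$','$$$'}
    @attribute Rain {Yes,No}
    @attribute Reserv {Yes,No}
    @attribute Type {French,Italian,Thai,Burger}
    @attribute WaitEst {'0-10','10-30','30-60','>60'}
    @attribute Wait {Yes,No}
    @data
    Yes,No,No,Yes,Some,'$$$',No,Yes,French,'0-10',Yes
    Yes,No,No,Yes,Full,'$',No,No,Thai,'30-60',No
    No,Yes,No,No,Some,'$',No,No,Burger,'0-10',Yes
    Yes,No,Yes,Yes,Full,'$',No,No,Thai,'10-30',Yes
    Yes,No,Yes,No,Full,'$$$',No,Yes,French,'>60',No
    No,Yes,No,Yes,Some,'$$',Yes,Yes,Italian,'0-10',Yes
    No,Yes,No,No,None,'$',Yes,No,Burger,'0-10',No
    No,No,No,Yes,Some,'$$',Yes,Yes,Thai,'0-10',Yes
    No,Yes,Yes,No,Full,'$',Yes,No,Burger,'>60',No
    Yes,Yes,Yes,Yes,Full,'$$$',No,Yes,Italian,'10-30',No
    No,No,No,No,None,'$',No,No,Thai,'0-10',No
    Yes,Yes,Yes,Yes,Full,'$',No,No,Burger,'30-60',Yes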
b) Assuming that D1 and D2 are decision trees representing boolean functions, and that D2 is considered an elaboration of D1 if the ID3 algorithm can extend D1 into D2, indicate whether the following statement is true or false: if the tree D2 is an elaboration of D1, then D1 is more general than D2. If you consider it true, prove it; if you consider it false, present a counter-example.

9. The ID3 algorithm finds only one consistent hypothesis, while the version space algorithm (also called the candidate elimination algorithm) finds all consistent hypotheses. Consider the correspondence between these two algorithms:

a) Present the results obtained by each of these two algorithms from the following training examples, for the target concept PlaysSport:

Training examples:
Sky Temp Hum Forecast PlaysSport
0 sun warm normal same yes
1 sun warm high same yes
2 rain cold high change no
3 sun warm high change yes
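
A hedged sketch of the candidate elimination side of a), for conjunctive hypotheses over these four attributes ('?' is a wildcard; the value domains are read from the table):

    def covers(h, x):
        """True if hypothesis h matches example x ('?' matches anything)."""
        return all(hv in ("?", xv) for hv, xv in zip(h, x))

    def specializations(g, x, domains):
        """Minimal specializations of g that exclude the negative example x."""
        return [g[:i] + (v,) + g[i + 1:]
                for i, gv in enumerate(g) if gv == "?"
                for v in domains[i] if v != x[i]]

    domains = [("sun", "rain"), ("warm", "cold"), ("normal", "high"), ("same", "change")]
    examples = [(("sun", "warm", "normal", "same"), True),
                (("sun", "warm", "high", "same"), True),
                (("rain", "cold", "high", "change"), False),
                (("sun", "warm", "high", "change"), True)]

    S, G = None, [("?",) * 4]
    for x, positive in examples:
        if positive:
            # Minimally generalize S to cover x; drop members of G that miss x
            S = x if S is None else tuple(s if s == v else "?" for s, v in zip(S, x))
            G = [g for g in G if covers(g, x)]
        else:
            # Replace each g covering x by specializations still more general than S
            G = [h for g in G
                 for h in ([g] if not covers(g, x) else
                           [h for h in specializations(g, x, domains) if covers(h, S)])]
    print("S =", S, "\nG =", G)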

b) What is the relationship between the learned decision tree and the version space obtained? Is the decision tree equivalent to one of the members of the version space?

c) Suppose you want to define a learning algorithm that, like ID3, searches the space of decision trees and, like the version space algorithm, finds all the hypotheses consistent with the data. That is, the intention is to apply the version space algorithm to a hypothesis space in which the hypotheses are decision trees. Present the sets S and G that result from the first training example given. Show how S and G would be refined by the second training example (you may omit syntactically distinct trees that represent the same concept). What difficulties do you anticipate in applying the version space approach to hypothesis spaces of decision trees?

10. Use the data from Table 1 as a training set to learn to classify according to the 3 classes indicated by the attribute 'Burn'.

a. Calculate the entropy (or expected information) of the training set. Calculate the information gain relative to each of the attributes and indicate the attribute that would be chosen, according to this measure, as the root of the decision tree by the ID3 algorithm. Comment on the result obtained.
b. Now, removing the attribute 'Name' and continuing to use the information gain measure, determine the complete decision tree that would be produced by ID3.


c. Using the gain ratio as the measure and considering all the attributes (including 'Name'), which attribute would be chosen as the root of the decision tree?

Name Hair Skin Sunscreen Burn
Emília Brown Dark No None
Sara Blonde Dark No Light
Diana Blonde Dark Yes None
Andreia Blonde White Yes None
Leonor Blonde White No Severe
Emília Red White Yes Severe
Diana Brown White No None
Fernão Red Dark No Light
Carlos Red Dark Yes Light
Joana White Dark Yes None
Table 1: Data on sunburns
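
For c), the gain ratio divides the information gain by the split information of the attribute; a sketch reusing the earlier helpers (the penalty this imposes on many-valued attributes such as 'Name' is exactly the point of the question):

    def split_info(rows, attr):
        """Entropy of the attribute's own value distribution."""
        n = len(rows)
        return -sum((c / n) * log2(c / n)
                    for c in Counter(r[attr] for r in rows).values())

    def gain_ratio(rows, attr, target=-1):
        labels = [r[target] for r in rows]
        gain = entropy(labels) - avg_entropy(rows, attr, target)
        si = split_info(rows, attr)
        return gain / si if si else 0.0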

11. Polybius, in his assessment of meals at restaurants, considers the following attributes and their respective possible values:
Restaurant: {Copélia, Palma, Primavera}
Quality: {good, bad}
Price: an integer
Meal: {lunch, dinner, breakfast}
Polybius uses natural-language statements to express how he classifies meals as satisfactory or unsatisfactory, instead of using decision trees. For example, he would say:
'I am satisfied with any meal costing 10€ or less, but there is no meal from the Palma restaurant costing 7€ or less that I like.'
Statements like this can be translated into different decision trees. One tree that could be suggested for the above sentence is the following, although others are possible, with different numbers of nodes, possibly testing the attributes in a different order or performing different tests:

Indicate decision trees, with a minimum number of nodes, corresponding to each of the following statements:

a) I am satisfied with any good-quality meal that costs no more than 10€, and satisfied with bad-quality meals that cost no more than 5€.


b) I am satisfied with a breakfast costing 8€ or less, or with a dinner costing 15€ or less. I am never satisfied with a meal of bad quality, nor if I have to pay for lunch (since at lunch I eat well and for free).

12. Imagine that we have the following dataset, where Y is the target classification attribute.

A B C Y
0 0 0 0
0 1 0 1
1 0 0 1
1 1 1 0

Which decision tree is chosen by the ID3 algorithm? Is there not another tree, one that tests fewer attributes (and possibly has less depth), capable of correctly representing the data? How do you justify that ID3 does not return this tree?

13. Consider the data presented in the grid below, described by two attributes (the x and y axes, with integer values between 0 and 8) and classified into 3 classes, represented by squares, crosses, or circles. Using the ID3 algorithm extended to perform binary tests on numerical attributes, build the decision tree so as to correctly classify the data.
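
One common way to extend ID3 to numeric attributes, as this exercise asks, is to try binary tests of the form value <= t at midpoints between consecutive observed values; a sketch reusing the entropy helper:

    def best_threshold(rows, attr, target=-1):
        """Return the threshold t minimizing average entropy of the test value <= t."""
        n = len(rows)
        values = sorted(set(r[attr] for r in rows))
        best_t, best_h = None, float("inf")
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2                      # candidate midpoint
            left = [r[target] for r in rows if r[attr] <= t]
            right = [r[target] for r in rows if r[attr] > t]
            h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if h < best_h:
                best_t, best_h = t, h
        return best_t, best_h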

14. The following data table is based on Tolkien's book 'The Lord of the Rings'; it presents information about a group of people/entities that appear in the book.
Name Race Weight Ring-lord Class
Frodo Hobbit Light Yes Good
Rosie Hobbit Light No Good
Bilbo Hobbit Light Yes Good
Gollum Hobbit Light Yes Bad
Faramir Human Medium No Good
Aragorn Human Medium No Good
Wormtongue Human Medium No Bad
Celeborn Elf Light No Good
Galadriel Elf Light Yes Good
Sharku Orc Heavy No Bad


Each entity has three attributes (Race, Weight, Ring-lord) and is classified as Good or Bad (i.e., whether or not it wants to kill Frodo). The attribute 'Ring-lord' indicates whether the person/entity has ever possessed the magic ring. Apply the ID3 algorithm to the data in the table, considering that leaf nodes are classified with the majority class; however, introduce a small variation in which a node is expanded only when doing so improves the gain.
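
The id3_gain_guard sketch from exercise 6 matches the variation described here; the tuple encoding of the table below is an assumption for illustration:

    rows = [("Hobbit", "Light", "Yes", "Good"), ("Hobbit", "Light", "No", "Good"),
            ("Hobbit", "Light", "Yes", "Good"), ("Hobbit", "Light", "Yes", "Bad"),
            ("Human", "Medium", "No", "Good"), ("Human", "Medium", "No", "Good"),
            ("Human", "Medium", "No", "Bad"), ("Elf", "Light", "No", "Good"),
            ("Elf", "Light", "Yes", "Good"), ("Orc", "Heavy", "No", "Bad")]
    print(id3_gain_guard(rows, [0, 1, 2]))  # columns: Race, Weight, Ring-lord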

15. Consider the following dataset, where Y corresponds to the class attribute. We will consider ways of pruning the decision tree produced by ID3 that do not involve the use of a test set.

V W X Y
0 0 0 0
0 1 0 1
1 0 0 1
1 1 0 0
1 1 1 0

a) Present the decision tree that would be constructed by ID3, without pruning.
b) One possible way to prune the tree is, starting from the root node, to prune the subtree rooted at a node if the information gain (or another given criterion) associated with that node is less than some small amount ε. This type of pruning is called top-down pruning. What is the decision tree returned by applying this type of pruning with ε = 0.0001? What is the error produced by this pruned tree on the given training set (% of misclassified examples)?
c) Another possible way to prune the tree is, starting from the parents of the tree's leaves, to prune the subtree rooted at a node if the information gain (or another given criterion) is less than some small amount ε. With this method, no ancestor of a node with high information gain is pruned. This type of pruning is called bottom-up pruning. What is the decision tree returned by applying this type of pruning with ε = 0.0001? What is the error produced by this pruned tree on the given training set (% of misclassified examples)?
Discuss the possible advantages and disadvantages of these two types of pruning, taking into account, for example, the computational complexity involved and the classification accuracy.
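
A sketch of the bottom-up variant in c), over the nested-dictionary trees produced by the id3 sketch above (the top-down variant differs only in testing the gain, and possibly collapsing, before recursing into the children):

    def prune_bottom_up(tree, rows, eps, target=-1):
        """Collapse a node to the majority class when its gain is below eps;
        children are pruned first, so ancestors of high-gain nodes survive."""
        if not isinstance(tree, dict):                 # already a leaf
            return tree
        attr = next(iter(tree))[0]
        pruned = {(a, v): prune_bottom_up(sub, [r for r in rows if r[a] == v], eps, target)
                  for (a, v), sub in tree.items()}
        labels = [r[target] for r in rows]
        gain = entropy(labels) - avg_entropy(rows, attr, target)
        if gain < eps and all(not isinstance(s, dict) for s in pruned.values()):
            return Counter(labels).most_common(1)[0][0]
        return pruned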
