Decision Tree Rules - Study Notes
DECISION TREE RULES
The decision tree method is a popular and relatively simple supervised classification method. A
decision tree is built from three elements:
1. Nodes: Each node of the tree specifies a test of some attribute.
2. Branches: Each branch corresponds to one of the values of the attribute.
3. Paths: Each path from the root to a leaf of the decision tree is a sequence of attribute tests
that ends at a leaf, and the leaf gives the class.
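As a minimal sketch of this structure, the Python snippet below stores a tree as nested dictionaries and follows one path from the root to a leaf. The attribute names Gender and Married and the classes A, B, and C are borrowed from the example rules later in these notes. The nested-dictionary layout, the `classify` helper, and the assumption that Married only takes the values "Yes" and "No" are illustrative choices, not part of the method itself.

```python
# Hypothetical toy tree: an internal node tests one attribute,
# each branch is one value of that attribute, and a leaf is a class label.
tree = {
    "attribute": "Gender",
    "branches": {
        "Male": "B",
        "Female": {
            "attribute": "Married",
            "branches": {"Yes": "C", "No": "A"},
        },
    },
}


def classify(node, record):
    """Follow a single root-to-leaf path by applying each node's test."""
    while isinstance(node, dict):           # still at an internal node
        value = record[node["attribute"]]   # the attribute test
        node = node["branches"][value]      # take the matching branch
    return node                             # a leaf: the class label


print(classify(tree, {"Gender": "Female", "Married": "Yes"}))
```

A record is classified by exactly one path, which is why each path can later be read as one rule.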
Decision trees are popular because the rules specified by their nodes are easy to understand. Each
rule can also be written as an SQL query that retrieves the rows satisfying it from a relational
database.
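To illustrate the SQL connection, here is a small sketch that turns the antecedents of one rule into a parameterized SELECT statement and runs it against an in-memory SQLite database. The table name `customers`, its columns, the sample rows, and the `rule_to_sql` helper are all hypothetical. Only the standard-library `sqlite3` module is used.

```python
import sqlite3


def rule_to_sql(table, antecedents):
    """Build a parameterized SELECT for rows that satisfy every antecedent."""
    where = " AND ".join(f'"{attr}" = ?' for attr in antecedents)
    return f'SELECT * FROM "{table}" WHERE {where}', list(antecedents.values())


# Hypothetical table and rows, used only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, Gender TEXT, Married TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ann", "Female", "Yes"), ("Bob", "Male", "No"), ("Cy", "Female", "No")],
)

# Antecedents of the rule: Gender = "Female" and Married = "Yes".
sql, params = rule_to_sql("customers", {"Gender": "Female", "Married": "Yes"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Values are passed as query parameters rather than pasted into the SQL string, which avoids quoting problems when attribute values contain special characters.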
Advantages of Converting Decision Trees to Rules:
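1. Ease of Pruning Decisions:
- Each root-to-leaf path becomes a separate rule, so the context of every attribute test is visible
and a test can be pruned from one rule without affecting the others.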
2. Better Understanding:
- Rules remove the distinction between attribute tests near the root and those near the leaves.
- Rules are easier to read and understand compared to a decision tree.
IF-THEN Rules:
- Derived Rules: Rules are derived from the paths from the root to the leaf nodes.
- A simple approach generates as many rules as there are leaf nodes.
- Rules can often be combined to produce a smaller set of rules.
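The simple approach of one rule per leaf can be sketched as a depth-first traversal that records the attribute tests along each path. The nested-dictionary tree and the `extract_rules` helper are assumptions made for illustration. The attributes and classes mirror the example used in these notes, with Married assumed to take only "Yes" and "No".

```python
# Hypothetical toy tree: internal nodes are dicts, leaves are class labels.
tree = {
    "attribute": "Gender",
    "branches": {
        "Male": "B",
        "Female": {
            "attribute": "Married",
            "branches": {"Yes": "C", "No": "A"},
        },
    },
}


def extract_rules(node, path=()):
    """Return one (antecedents, class) pair for every leaf of the tree."""
    if not isinstance(node, dict):              # leaf: the path is a full rule
        return [(list(path), node)]
    rules = []
    for value, child in node["branches"].items():
        test = (node["attribute"], value)       # antecedent added by this branch
        rules.extend(extract_rules(child, path + (test,)))
    return rules


rules = extract_rules(tree)
for antecedents, label in rules:
    conditions = " and ".join(f'{a} = "{v}"' for a, v in antecedents)
    print(f"If {conditions}, then class = {label}.")
```

The tree has three leaves, so the traversal yields three rules. Combining them into a smaller set is a separate step.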
Example Rules:
1. If Gender = "Male", then class = B.
2. If Gender = "Female" and Married = "Yes", then class = C; otherwise (Gender = "Female" and Married is not "Yes"), class = A.
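One way to read these two rules is as an ordered rule list with a default class: the rules are tried in order, and a record matched by neither rule falls through to class A. The sketch below encodes that reading. The names `RULES`, `DEFAULT_CLASS`, and `apply_rules` are hypothetical, and the first-match-wins ordering is an assumption about how "otherwise" is meant.

```python
# Each rule is (antecedents, class); antecedents map attribute -> required value.
RULES = [
    ({"Gender": "Male"}, "B"),
    ({"Gender": "Female", "Married": "Yes"}, "C"),
]
DEFAULT_CLASS = "A"  # the "otherwise" case


def apply_rules(record, rules=RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose antecedents all hold."""
    for antecedents, label in rules:
        if all(record.get(attr) == value for attr, value in antecedents.items()):
            return label
    return default


print(apply_rules({"Gender": "Male", "Married": "Yes"}))
print(apply_rules({"Gender": "Female", "Married": "No"}))
```

Using a default class is what lets two explicit rules cover a tree with three leaves.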
Simplification of Rules:
1. Simplifying Rules:
- Rules with only one antecedent (e.g., "If Gender = 'Male' then class = B") cannot be simplified.
- Only rules with two or more antecedents are considered for simplification.
2. Removing Unnecessary Antecedents:
- Antecedents that do not affect the rule's conclusion are unnecessary and may be removed.
3. Combining Rules:
- Rules leading to the same class may be combined.
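The simplification steps above can be sketched in Python as follows. The criterion used here for "unnecessary" is an assumption chosen for simplicity: an antecedent is dropped when every record in a small, hypothetical data set that still matches the shortened rule keeps the same class. Practical rule learners typically use a statistical error estimate instead. The data, the rules, and the helpers `covers`, `drop_unneeded`, and `combine` are all illustrative.

```python
def covers(antecedents, record):
    """True if the record satisfies every antecedent."""
    return all(record[attr] == value for attr, value in antecedents.items())


def drop_unneeded(antecedents, label, data):
    """Remove antecedents whose removal misclassifies no record in data."""
    kept = dict(antecedents)
    for attr in list(antecedents):
        if len(kept) == 1:                  # single-antecedent rules stay as-is
            break
        trial = {a: v for a, v in kept.items() if a != attr}
        if all(r["class"] == label for r in data if covers(trial, r)):
            kept = trial                    # the conclusion is unaffected
    return kept


def combine(rules):
    """Group rules by class and drop duplicate antecedent sets."""
    grouped = {}
    for antecedents, label in rules:
        alternatives = grouped.setdefault(label, [])
        if antecedents not in alternatives:
            alternatives.append(antecedents)
    return grouped


# Hypothetical data and rules, one rule per leaf of an unsimplified tree.
data = [
    {"Gender": "Male", "Married": "Yes", "class": "B"},
    {"Gender": "Male", "Married": "No", "class": "B"},
    {"Gender": "Female", "Married": "Yes", "class": "C"},
    {"Gender": "Female", "Married": "No", "class": "A"},
]
rules = [
    ({"Gender": "Male", "Married": "Yes"}, "B"),
    ({"Gender": "Male", "Married": "No"}, "B"),
    ({"Gender": "Female", "Married": "Yes"}, "C"),
    ({"Gender": "Female", "Married": "No"}, "A"),
]

simplified = [(drop_unneeded(a, label, data), label) for a, label in rules]
combined = combine(simplified)
print(combined)
```

In this toy case the Married test is dropped from both class B rules, which then collapse into the single rule "If Gender = 'Male' then class = B". The class C and class A rules keep both antecedents.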
Quality of Decision Trees and Rules:
- The quality of the decision tree and the derived rules depends on the training sample.
- If the training sample does not represent the population well, the derived rules may not be reliable.
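A rough way to check this dependence is to compare the accuracy of the rules on the training sample with their accuracy on a separate sample drawn from the population. The sketch below does this for the example rules. Both samples are invented purely for illustration, and the helpers `predict` and `accuracy` are hypothetical.

```python
# Example rules from these notes; the first matching rule wins, else the default.
RULES = [
    ({"Gender": "Male"}, "B"),
    ({"Gender": "Female", "Married": "Yes"}, "C"),
]
DEFAULT_CLASS = "A"


def predict(record):
    """Classify a record with the ordered rule list."""
    for antecedents, label in RULES:
        if all(record.get(attr) == value for attr, value in antecedents.items()):
            return label
    return DEFAULT_CLASS


def accuracy(sample):
    """Fraction of records whose predicted class equals the recorded class."""
    hits = sum(predict(r) == r["class"] for r in sample)
    return hits / len(sample)


# Invented training sample: every male record happens to be class B.
training = [
    {"Gender": "Male", "Married": "Yes", "class": "B"},
    {"Gender": "Female", "Married": "Yes", "class": "C"},
    {"Gender": "Female", "Married": "No", "class": "A"},
]
# Invented second sample: it also contains a male record of class A.
held_out = [
    {"Gender": "Male", "Married": "Yes", "class": "B"},
    {"Gender": "Male", "Married": "No", "class": "A"},
    {"Gender": "Female", "Married": "Yes", "class": "C"},
    {"Gender": "Female", "Married": "No", "class": "A"},
]

print(accuracy(training), accuracy(held_out))
```

A noticeable drop from the first figure to the second suggests that the training sample missed part of the population, so the rules derived from it should be treated with caution.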