
Program: B.Tech VII Semester

CSL0777: Machine Learning


Unit No. 3
Supervised Learning Part-2

Lecture No. 25
Decision Tree Classification Algorithm
Mr. Praveen Gupta
Assistant Professor, CSA/SOET
Outlines
• Decision Tree Classification Algorithm for ML
• Why use Decision Trees?
• Decision Tree Terminologies
• How does the Decision Tree algorithm Work?
• Attribute Selection Measures
• Pruning: Getting an Optimal Decision tree
• Advantages and Disadvantages of Decision Tree
• Python implementation of the Decision Tree algorithm
• References
Student Effective Learning Outcomes (SELO)
01: Ability to understand subject-related concepts clearly, along with contemporary issues.
02: Ability to use updated tools, techniques, and skills for effective domain-specific practices.
03: Understanding available tools and products, and the ability to use them effectively.
Decision Tree Classification Algorithm
• Decision Tree is a supervised learning technique that can be used for both classification and regression problems, but it is mostly preferred for solving classification problems.
• It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.
• A decision tree contains two types of nodes: the decision node and the leaf node.
• Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.
Decision Tree Classification Algorithm
• The decisions or tests are performed on the basis of the features of the given dataset.
• It is a graphical representation for getting all the possible solutions to a problem/decision based on the given conditions.
• It is called a decision tree because, similar to a tree, it starts with a root node, which expands on further branches and constructs a tree-like structure.
• To build a tree, we use the CART algorithm, which stands for Classification and Regression Tree algorithm.
• A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.
Decision Tree Classification Algorithm
[Figure: an example decision tree structure]
Why use Decision Trees?
There are various algorithms in machine learning, so choosing the best algorithm for the given dataset and problem is the main point to remember while creating a machine learning model. Two reasons for using decision trees:
• Decision trees usually mimic human thinking ability while making a decision, so they are easy to understand.
• The logic behind a decision tree can be easily understood because it shows a tree-like structure.
Decision Tree Terminologies
Root Node: The root node is where the decision tree starts. It represents the entire dataset, which further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated further after reaching a leaf node.
Splitting: Splitting is the process of dividing a decision node/root node into sub-nodes according to the given conditions.
Decision Tree Terminologies
Branch/Sub-Tree: A subtree formed by splitting the tree.
Pruning: Pruning is the process of removing unwanted branches from the tree.
Parent/Child node: A node that splits into sub-nodes is called a parent node, and the sub-nodes are its child nodes.
How does the Decision Tree algorithm Work?
• In a decision tree, for predicting the class of a given record, the algorithm starts from the root node of the tree. It compares the value of the root attribute with the corresponding attribute of the record (from the real dataset) and, based on the comparison, follows the branch and jumps to the next node.
• For the next node, the algorithm again compares the attribute value with those of the sub-nodes and moves further. It continues this process until it reaches a leaf node of the tree. This traversal is sketched below.
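This traversal can be written in a few lines of Python. The nested-dictionary tree representation and its key names are illustrative assumptions, not part of the lecture's implementation:

#Sketch: predicting a class by walking a tree stored as nested dicts
def predict(tree, record):
    #A leaf stores the class label directly
    if not isinstance(tree, dict):
        return tree
    attribute = tree["attribute"]                  #attribute tested at this node
    subtree = tree["branches"][record[attribute]]  #follow the matching branch
    return predict(subtree, record)                #repeat until a leaf is reached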
How does the Decision Tree algorithm Work?
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node, which contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in Step-3. Continue this process until a stage is reached where you cannot further classify the nodes; the final nodes are then called leaf nodes. A sketch of this recursion is given below.
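A minimal sketch of Steps 1-5 in Python, assuming a hypothetical best_attribute helper that implements an ASM (information gain or the Gini index, defined later in this lecture). It illustrates the recursion only and is not the CART implementation used by scikit-learn:

#Sketch: recursive tree building; records are dicts of attribute values
def build_tree(records, labels, attributes):
    #Leaf cases: the node is pure, or no attributes are left to test
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:
        return max(set(labels), key=labels.count)  #majority class
    #Step-2: choose the best attribute using an ASM (assumed helper)
    best = best_attribute(records, labels, attributes)
    node = {"attribute": best, "branches": {}}
    #Steps 3-5: split on each value of the best attribute and recurse
    for value in set(r[best] for r in records):
        subset = [(r, l) for r, l in zip(records, labels) if r[best] == value]
        node["branches"][value] = build_tree([r for r, l in subset],
                                             [l for r, l in subset],
                                             [a for a in attributes if a != best])
    return node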
How does the Decision Tree algorithm Work?
Example: Suppose there is a candidate who has a job offer and wants to decide whether he should accept the offer or not.
Solution: To solve this problem, the decision tree starts with the root node (the Salary attribute, chosen by ASM). The root node splits further into the next decision node (distance from the office) and one leaf node, based on the corresponding labels. The next decision node further splits into one decision node (cab facility) and one leaf node. Finally, the decision node splits into two leaf nodes (Accepted offer and Declined offer). This chain of tests is sketched below.
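The example tree can be written as a chain of Yes/No tests. The boolean field names below are hypothetical: the slides name the attributes (salary, distance from the office, cab facility) but not their split values:

#Sketch: the job-offer decision tree as nested Yes/No tests
def decide(offer):
    if not offer["salary_acceptable"]:   #root node: Salary attribute
        return "Declined offer"          #leaf node
    if offer["distance_ok"]:             #decision node: distance from the office
        return "Accepted offer"          #leaf node
    if offer["cab_facility"]:            #decision node: cab facility
        return "Accepted offer"          #leaf node
    return "Declined offer"              #leaf node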
How does the Decision Tree algorithm Work?
[Figure: the decision tree for the job offer example]
Attribute Selection Measures
• While implementing a decision tree, the main issue that arises is how to select the best attribute for the root node and for the sub-nodes. To solve such problems, there is a technique called the Attribute Selection Measure, or ASM.
• By this measurement, we can easily select the best attribute for the nodes of the tree. There are two popular techniques for ASM:
1. Information Gain
2. Gini Index
Attribute Selection Measures
1. Information Gain:
• Information gain is the measurement of the change in entropy after the segmentation of a dataset based on an attribute.
• It calculates how much information a feature provides us about a class.
• According to the value of information gain, we split the node and build the decision tree.
• A decision tree algorithm always tries to maximize the value of information gain, and the node/attribute having the highest information gain is split first. It can be calculated using the formula below:
Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]
Attribute Selection Measures
1. Information Gain:
Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies the randomness in the data. For a two-class problem, entropy can be calculated as:
Entropy(S) = −P(yes) log₂ P(yes) − P(no) log₂ P(no)
Where,
S = the total set of samples
P(yes) = the probability of yes
P(no) = the probability of no
Both formulas are sketched in code below.
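The two formulas translate directly into code. Below is a small sketch for illustration (not the scikit-learn implementation), with a worked example:

import math

def entropy(labels):
    #Entropy(S) = -sum over classes of P(class) * log2 P(class)
    total = len(labels)
    return -sum((labels.count(c) / total) * math.log2(labels.count(c) / total)
                for c in set(labels))

def information_gain(labels, subsets):
    #Entropy(S) minus the weighted average entropy of the subsets
    total = len(labels)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(labels) - weighted

#Example: 9 "yes" and 5 "no" samples give Entropy(S) ≈ 0.940
print(entropy(["yes"] * 9 + ["no"] * 5))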
Attribute Selection Measures
2. Gini Index:
The Gini index is a measure of impurity or purity used while creating a decision tree in the CART (Classification and Regression Tree) algorithm.
An attribute with a low Gini index should be preferred over one with a high Gini index.
The CART algorithm creates only binary splits, and it uses the Gini index to create them.
The Gini index can be calculated using the formula below, and is sketched in code after it:
Gini Index = 1 − ∑ⱼ Pⱼ²
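The same formula as a small helper, again a sketch for illustration rather than scikit-learn's implementation:

def gini_index(labels):
    #Gini = 1 - sum over classes of P(class)^2
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

#A pure node scores 0.0; an even 50/50 split scores 0.5
print(gini_index(["yes", "yes", "no", "no"]))   #0.5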
Pruning: Getting an Optimal Decision tree
• Pruning is the process of deleting unnecessary nodes from a tree in order to obtain the optimal decision tree.
• A tree that is too large increases the risk of overfitting, while a small tree may not capture all the important features of the dataset. A technique that decreases the size of the learned tree without reducing accuracy is known as pruning. There are mainly two types of tree pruning technique used (a cost complexity sketch follows below):
• Cost Complexity Pruning
• Reduced Error Pruning
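As a sketch, scikit-learn exposes cost complexity pruning through the ccp_alpha parameter of DecisionTreeClassifier; this assumes training data x_train/y_train like that prepared later in this lecture:

from sklearn.tree import DecisionTreeClassifier

#Compute the candidate alpha values for cost complexity pruning
tree = DecisionTreeClassifier(random_state=0)
path = tree.cost_complexity_pruning_path(x_train, y_train)

#A larger ccp_alpha prunes more aggressively, giving a smaller tree
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=path.ccp_alphas[-2])
pruned.fit(x_train, y_train)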
Advantages and Disadvantages of Decision Tree
Advantages:
• It is simple to understand, as it follows the same process that a human follows while making a decision in real life.
• It can be very useful for solving decision-related problems.
• It helps us think about all the possible outcomes of a problem.
• There is less requirement for data cleaning compared to other algorithms.
Advantages and Disadvantages of Decision Tree
Disadvantages:
• A decision tree can contain many layers, which makes it complex.
• It may have an overfitting issue, which can be resolved using the Random Forest algorithm.
• For more class labels, the computational complexity of the decision tree may increase.
Python implementation of the Decision Tree
Example: There is a given dataset which contains the information of various users obtained from social networking sites. A car-making company has recently launched a new SUV car, and the company wants to check how many users from the dataset want to purchase the car.
Python implementation of the Decision Tree
[Figure: snapshot of the user_data.csv dataset]
Python implementation of the Decision Tree
Steps to implement the Decision Tree algorithm:
• Data pre-processing step
• Fitting the Decision Tree algorithm to the training set
• Predicting the test result
• Testing the accuracy of the result (creation of a confusion matrix)
• Visualizing the test set result
Python implementation of the Decision Tree
1. Data Pre-processing step: In this step, we will pre-process/prepare the data so that we can use it in our code efficiently.

#Data Pre-processing step
#importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

#importing datasets
data_set= pd.read_csv('user_data.csv')
Python implementation of the Decision Tree
Now, we will extract the dependent and independent variables from the given dataset. Below is the code for it:

#Extracting Independent and dependent Variable
x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values
Python implementation of the Decision Tree
Now we will split the dataset into a training set and a test set. Below is the code for it:

# Splitting the dataset into training and test set.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)
Python implementation of the Decision Tree
Now we will do feature scaling, because we want accurate prediction results. Here we will only scale the independent variables, because the dependent variable has only 0 and 1 values. Below is the code for it:

#feature Scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)
Python implementation of the Decision Tree
2. Fitting Decision Tree to the Training data:
Now the training set will be fitted to the Decision Tree classifier. To create the Decision Tree classifier, we will import the DecisionTreeClassifier class from the sklearn.tree library.
We create a classifier object, in which we pass two main parameters:
• criterion='entropy': Criterion is used to measure the quality of a split, which is calculated by the information gain given by entropy.
• random_state=0: For generating the random states.
Python implementation of the Decision Tree
2. Fitting Decision Tree to the Training data:

#Fitting Decision Tree classifier to the training set
from sklearn.tree import DecisionTreeClassifier
classifier= DecisionTreeClassifier(criterion='entropy', random_state=0)
classifier.fit(x_train, y_train)
Python implementation of the Decision Tree
3. Predicting the Test Result
Our model is now well trained on the training set, so we will predict the result using the test set data. Below is the code for it:

#Predicting the test set result
y_pred= classifier.predict(x_test)
Python implementation of the Decision Tree
4. Test Accuracy of the result
Now we will create the confusion matrix to check the accuracy of the classification. To create it, we need to import the confusion_matrix function from the sklearn library. After importing the function, we will call it and store the result in a new variable, cm. The function takes two main parameters: y_test (the actual values) and y_pred (the predicted values returned by the classifier).
Python implementation of the Decision Tree
4. Test Accuracy of the result

#Creating the Confusion matrix
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)
Python implementation of the Decision Tree
We can find the accuracy of the predicted result by interpreting the confusion matrix. From the output above, we can see that 62+29 = 91 predictions are correct and 3+6 = 9 predictions are incorrect.

Accuracy = (TP+TN)/Total
= (62+29)/100 = 0.91
Therefore, the accuracy of the model is 91%. An equivalent check in code is sketched below.
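The same figure can be computed directly with scikit-learn's accuracy_score, as a quick check:

#Accuracy = (TP+TN)/Total, computed from the predictions
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))   #0.91 for the run described above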
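The final listed step, visualizing the test set result, is not shown on these slides. Below is a minimal sketch in the same style, assuming the two scaled features are Age and Estimated Salary as in the user_data dataset:

#Visualizing the test set result (decision regions over the two features)
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
x1, x2 = nm.meshgrid(nm.arange(x_set[:, 0].min() - 1, x_set[:, 0].max() + 1, 0.01),
                     nm.arange(x_set[:, 1].min() - 1, x_set[:, 1].max() + 1, 0.01))
#Colour each grid point with the class the classifier predicts for it
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha=0.75, cmap=ListedColormap(('purple', 'green')))
#Overlay the actual test samples
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color=['purple', 'green'][i], label=j)
mtp.title('Decision Tree (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()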
Learning Outcomes
The students have learned and understood the following:
• Decision Tree Classification Algorithm for ML
• Why use Decision Trees?
• Decision Tree Terminologies
• How does the Decision Tree algorithm Work?
• Attribute Selection Measures
• Pruning: Getting an Optimal Decision tree
• Advantages and Disadvantages of Decision Tree
• Python implementation of the Decision Tree algorithm
References
1. Machine Learning for Absolute Beginners by Oliver Theobald, 2019.
2. http://noracook.io/Books/Python/introductiontomachinelearningwithpython.pdf
3. https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_tutorial.pdf
4. https://www.javatpoint.com/k-nearest-neighbor-algorithm-for-machine-learning
Thank you