The Decision Tree Algorithm
A Visual Explanation for Classification and Regression

Presented by:
• Anirudh Champawat
• Pranjal
• Chinmay Sahu
• Aarnav Ray

Under the Guidance of:
Dr. Gokulnath C
S.R.M Institute of Science and Technology
Introduction to Decision Trees
The Decision Tree Algorithm is a powerful, non-parametric supervised machine learning method used for both
classification and regression tasks. It models decisions as a set of rules represented by a tree structure.
1. Supervised Learning
Uses labeled training data to learn how to map inputs to outputs.

2. Classification & Regression
Effective for predicting categorical outcomes (classification) or continuous values (regression).
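As a quick illustration of both ideas, below is a minimal sketch of training a Decision Tree classifier, assuming scikit-learn and its bundled Iris dataset (both are illustrative choices, not part of these slides). For regression, DecisionTreeRegressor is used in exactly the same way.

```python
# Minimal sketch: a Decision Tree as a supervised classifier
# (scikit-learn and the Iris dataset are illustrative assumptions).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)                        # labeled training data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(random_state=42)            # non-parametric supervised learner
clf.fit(X_train, y_train)                                # learn the input-to-output mapping
print("Test accuracy:", clf.score(X_test, y_test))       # evaluate on unseen samples
```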
The Anatomy of a Decision Tree
A Decision Tree is structured like a flow chart, where each component represents a step in the decision-making process,
leading to a final outcome.
Root Node
The starting point, representing the entire dataset, which is then split into two or more homogeneous sets.

Internal Node
Represents a feature test or attribute, where the data is branched based on the outcome of the test.

Leaf (Terminal) Node
Represents the final decision or classification result; no further splitting occurs here.

Branch
Represents the outcome of the test or decision made at the internal node (e.g., "Yes" or "No").
Summary: Each node is a test, each branch is a decision, and each leaf is the final output.
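To see this anatomy on a real fitted tree, scikit-learn's export_text can print the learned rules; the dataset and the depth cap below are illustrative assumptions, not part of the slides.

```python
# Sketch: printing a fitted tree so the root test, internal tests,
# branches, and leaf decisions are visible as indented rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each "feature <= threshold" line is a node's test (one branch per outcome);
# each "class: ..." line is a leaf holding the final decision.
print(export_text(clf, feature_names=list(iris.feature_names)))
```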
How the Tree Grows: Recursive Splitting
The core principle of building a Decision Tree is to recursively partition the data into subsets that are as "pure"
(homogeneous) as possible, based on the most informative features.
Step 1: Choose the Best Attribute
Evaluate all potential features to determine which one yields the highest purity/information gain when split.
Step 2: Split the Data
Partition the dataset into branches according to the value of the chosen best attribute.
Step 3: Repeat Recursively
Apply the process (Steps 1 & 2) to each new subset until the nodes are pure or a stopping condition is met.
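The sketch below spells out these three steps as plain Python for a small dataset of categorical features stored as dicts. It is an illustrative toy implementation using the entropy-based scoring introduced on the next slide, not how production libraries implement trees.

```python
import math
from collections import Counter

def entropy(labels):
    """Impurity of a set of class labels (formula given on the next slide)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Reduction in entropy obtained by splitting on `feature`."""
    total = len(labels)
    child_entropy = 0.0
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        child_entropy += len(subset) / total * entropy(subset)
    return entropy(labels) - child_entropy

def build_tree(rows, labels, features, depth=0, max_depth=5):
    # Stop when the node is pure, no features remain, or the depth limit is hit.
    if len(set(labels)) == 1 or not features or depth == max_depth:
        return Counter(labels).most_common(1)[0][0]        # leaf: majority class
    # Step 1: choose the attribute with the highest information gain.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    # Step 2: split the data into one branch per value of the chosen attribute.
    tree = {best: {}}
    for value in set(row[best] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        sub_rows = [rows[i] for i in keep]
        sub_labels = [labels[i] for i in keep]
        remaining = [f for f in features if f != best]
        # Step 3: repeat the process recursively on each subset.
        tree[best][value] = build_tree(sub_rows, sub_labels, remaining, depth + 1, max_depth)
    return tree
```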
Measuring Purity: Entropy and Information Gain
To determine the "best" attribute for splitting (Step 1), Decision Tree algorithms use metrics like Entropy and Information Gain to quantify the homogeneity of the subsets.
Entropy (Measure of Impurity)
Entropy measures the randomness or uncertainty in a dataset. Lower entropy means higher purity.

Entropy(S) = -Σ p_i · log2(p_i)

Where p_i is the proportion of samples belonging to class i.

Information Gain (Measure of Effectiveness)
Information Gain calculates the reduction in entropy achieved after a dataset is split on an attribute A. The goal is to maximize this value.

Gain(S, A) = Entropy(S) - Σ (|S_v| / |S|) · Entropy(S_v)

Where S_v is the subset of samples for which attribute A takes the value v.

The attribute that provides the maximum Information Gain is chosen for the split.
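The following worked example (with made-up class counts) shows the arithmetic: a fifty-fifty parent node has an entropy of 1 bit, and a split that isolates one pure child yields an information gain of roughly 0.61 bits.

```python
import math

def entropy(proportions):
    """Entropy = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

# Parent node: 10 samples, 5 positive and 5 negative -> maximum impurity.
parent = entropy([0.5, 0.5])                 # 1.0 bit

# Candidate split producing children with 6 and 4 samples.
left  = entropy([5/6, 1/6])                  # mostly positive, still somewhat impure
right = entropy([0.0, 1.0])                  # pure negative, entropy 0

# Information Gain = parent entropy - weighted average of child entropies.
gain = parent - (6/10 * left + 4/10 * right)
print(round(gain, 3))                        # ~0.61, so this split is attractive
```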
Visualizing the Split Process
The tree-building process is an iterative one, continuously optimizing the splits to achieve highly predictive, pure leaf nodes.
1. Mixed Dataset: the starting, heterogeneous node containing all samples.
2. Feature A > X?: the first split decision separates the data into less-mixed subsets (e.g., Subset 1).
3. Feature B > Y?: a second split drives each subset toward purity.
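To reproduce this kind of visualization for a real model, scikit-learn's plot_tree draws each split with its impurity and class distribution; the dataset, depth cap, and use of matplotlib here are illustrative assumptions.

```python
# Sketch: drawing the split structure of a fitted tree
# (matplotlib and the Iris dataset are illustrative choices).
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each box shows the node's test, its impurity, and its class distribution,
# making the progression from a mixed root to purer leaves visible.
plot_tree(clf, feature_names=iris.feature_names, class_names=iris.target_names, filled=True)
plt.show()
```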
Diverse Applications Across Industries
Decision Trees are widely adopted due to their versatility and ease of interpretation, making them valuable tools in
various sectors.
Finance
Credit risk assessment, loan default prediction, and fraud detection by analyzing transaction patterns.

Healthcare
Disease diagnosis based on symptoms and patient history, aiding clinical decision support systems.

E-commerce
Predictive modeling for customer behavior, product recommendations, and churn prediction.

Agriculture
Predicting optimal crop yield, assessing the impact of weather, and determining pest control strategies.
Strengths: Why Choose Decision Trees?
Decision Trees offer several compelling advantages, particularly in scenarios requiring transparency and ease of implementation.
Ease of Interpretation
The tree structure is intuitive and easy to follow, making it a "white box" model where the logic behind the prediction is clear.

Handles Mixed Data Types
They can handle both categorical features (like color or country) and numerical features (like age or income) without complex conversion.

Minimal Preprocessing
Unlike many other algorithms, Decision Trees do not require feature scaling or normalization.
Limitations and Challenges
While powerful, Decision Trees are not without drawbacks. Understanding their limitations is crucial for effective model deployment.
Prone to Overfitting
Especially if the tree is allowed to grow too deep, it may fit the
training data too closely, leading to poor performance on unseen
data.
Sensitivity to Data Changes
A small change in the data can result in a completely different tree
structure, making the model unstable.
Bias towards Dominant Classes
In imbalanced datasets, the tree may be biased toward the majority classes, necessitating techniques like class weighting or resampling.
Pruning methods are often used to address overfitting by removing branches that have low predictive power, simplifying the model.
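A hedged sketch of how these mitigations commonly look in practice, assuming scikit-learn; the parameter values are illustrative, not tuned recommendations.

```python
# Sketch: common mitigations for overfitting and class imbalance
# (parameter values are illustrative, not tuned recommendations).
from sklearn.tree import DecisionTreeClassifier

# Pre-pruning: cap the depth and require a minimum leaf size so the tree
# cannot grow deep enough to memorize the training data.
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10)

# Post-pruning: cost-complexity pruning removes branches with low predictive
# power after the tree is grown (a larger ccp_alpha gives a simpler tree).
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01)

# Imbalanced data: weight classes inversely to their frequency so the tree
# is not dominated by the majority class.
weighted = DecisionTreeClassifier(class_weight="balanced")
```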
Conclusion: Foundation for Advanced ML
Decision Trees serve as fundamental building blocks for many modern, high-performing machine learning systems.
Ensemble Methods
Decision Trees form the foundation of powerful ensemble models, which combine multiple tree
predictors to significantly enhance performance and robustness, mitigating issues like overfitting
and instability.
• Random Forest: Builds multiple Decision Trees during training and outputs the mode of the classes (for classification) or the mean prediction (for regression).
• Gradient Boosting Machines (GBM): Builds trees sequentially, where each new tree corrects the errors of the previous ones.
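A minimal sketch of both ensembles, assuming scikit-learn; the hyperparameters, dataset, and cross-validation setup are illustrative.

```python
# Sketch: the two ensemble families built from Decision Trees
# (hyperparameters are illustrative, not tuned values).
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Random Forest: many trees on bootstrapped samples; predictions are combined
# by majority vote (classification) or averaging (regression).
forest = RandomForestClassifier(n_estimators=200, random_state=0)

# Gradient Boosting: trees built sequentially, each one fitting the residual
# errors of the ensemble so far.
boosted = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1, random_state=0)

for name, model in [("Random Forest", forest), ("Gradient Boosting", boosted)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```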
Thank you!