Trees and Random Forest
- Kumar Sundram
Example 1: Salary
Example 2: Car Mileage
Decision Trees
Why Decision Tree?
▪ Decision trees are powerful and popular tools for classification and prediction.
▪ Decision trees represent rules, which can be understood by humans
▪ Relatively fast compared to other classification models
▪ Obtain similar and sometimes better accuracy compared to other models
Key Requirements
▪ Attribute-value description
▪ Predefined classes (target values): the target function has discrete output values (Boolean
or multiclass)
▪ Sufficient data: enough training cases should be provided to learn the model
Example
▪ Predicting Commute Time
Inductive Learning
▪ In this decision tree, we make a series of Boolean decisions and follow the corresponding
branch
▪ Did we leave at 10 AM?
▪ Did a car stall on the road?
▪ Is there an accident on the road?
▪ By answering each of these yes/no questions, we then come to a conclusion on how long our commute might take (illustrated in the short code sketch below)
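As a purely illustrative sketch (the labels and questions below are hypothetical, not taken from the slides), the same chain of yes/no questions can be written as nested conditionals, where each question is an internal node and each returned label is a leaf:

```python
# Hypothetical commute-time tree: each question is an internal node,
# each returned label is a leaf of the tree.
def predict_commute(left_at_10am: bool, car_stalled: bool, accident: bool) -> str:
    if left_at_10am:
        return "Long"
    if car_stalled:
        return "Long"
    if accident:
        return "Medium"
    return "Short"

print(predict_commute(left_at_10am=False, car_stalled=False, accident=True))  # Medium
```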
Splitting
Identifying the Best Attributes
▪ How did we decide to split on leave at and then on stall and accident and not weather?
Decision Tree Algorithms
▪ The basic idea behind any decision tree algorithm is as follows:
▪ Choose the best attribute(s) to split the remaining instances and make that attribute a decision
node
▪ Repeat this process recursively for each child (a code sketch follows this list)
▪ Stop when:
▪ All the instances have the same target attribute value
▪ There are no more attributes
▪ There are no more instances
▪ Choose the best attribute to split
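A minimal, self-contained Python sketch of this recursion (my own illustration, not code from the slides), which scores candidate splits with the Gini measure introduced later in the deck:

```python
from collections import Counter

def gini_split_score(instances, attr, target):
    """Weighted Gini score of splitting on `attr` (sum of squared class
    probabilities per child, weighted by child size; higher = more homogeneous)."""
    total = len(instances)
    score = 0.0
    for value in {row[attr] for row in instances}:
        subset = [row for row in instances if row[attr] == value]
        counts = Counter(row[target] for row in subset)
        score += (len(subset) / total) * sum((c / len(subset)) ** 2 for c in counts.values())
    return score

def build_tree(instances, attributes, target):
    """Recursively build a decision tree as nested dicts (one key per decision node)."""
    labels = [row[target] for row in instances]
    if len(set(labels)) == 1:          # stop: all instances share the same target value
        return labels[0]
    if not attributes:                 # stop: no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    # Choose the best attribute (here scored by Gini) and make it a decision node.
    attr = max(attributes, key=lambda a: gini_split_score(instances, a, target))
    node = {attr: {}}
    for value in {row[attr] for row in instances}:
        subset = [row for row in instances if row[attr] == value]
        node[attr][value] = build_tree(subset, [a for a in attributes if a != attr], target)
    return node
```

In this sketch a branch is created only for values actually observed in the data, so the "no more instances" stopping case never arises; real implementations also handle that case.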
How does a tree decide where to split?
▪ Gini Index
▪ Chi-Square
▪ Information Gain
▪ Reduction in Variance
How does a tree decide where to split?
▪ Gini Index
▪ It works with a categorical target variable, “Success” or “Failure”
▪ It performs only binary splits
▪ Gini for a node = p² + q², the sum of squares of the probabilities of success and failure
▪ The higher the value of Gini, the higher the homogeneity
▪ CART (Classification and Regression Trees) uses the Gini method to create binary splits
Split by Gini Index
▪ Split on Gender:
▪ Calculate Gini for sub-node Female = (0.2)*(0.2)+(0.8)*(0.8)=0.68
▪ Gini for sub-node Male = (0.65)*(0.65)+(0.35)*(0.35)=0.55
▪ Calculate weighted Gini for Split Gender = (10/30)*0.68+(20/30)*0.55 = 0.59
▪ Similarly, for Split on Class:
▪ Gini for sub-node Class IX = (0.43)*(0.43)+(0.57)*(0.57)=0.51
▪ Gini for sub-node Class X = (0.56)*(0.56)+(0.44)*(0.44)=0.51
▪ Calculate weighted Gini for Split Class = (14/30)*0.51+(16/30)*0.51 = 0.51
▪ Above, you can see that the Gini score for the split on Gender is higher than for the split on Class; hence, the node split will take place on Gender (the calculation is reproduced in the code sketch below).
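The arithmetic above can be reproduced in a few lines of Python. The counts are those implied by the example (30 students in total: 10 female of whom 20% play cricket, 20 male of whom 65% play; 14 in Class IX and 16 in Class X); the helper name is my own:

```python
def gini_score(p_success: float) -> float:
    """Gini score as used in the slides: p^2 + q^2 (higher = more homogeneous)."""
    q = 1.0 - p_success
    return p_success ** 2 + q ** 2

# Split on Gender: 10 females (p = 0.2) and 20 males (p = 0.65), 30 students total.
gini_female = gini_score(0.2)                                        # 0.68
gini_male = gini_score(0.65)                                         # ~0.55
weighted_gender = (10 / 30) * gini_female + (20 / 30) * gini_male    # ~0.59

# Split on Class: 14 in Class IX (p = 0.43) and 16 in Class X (p = 0.56).
gini_ix = gini_score(0.43)                                           # ~0.51
gini_x = gini_score(0.56)                                            # ~0.51
weighted_class = (14 / 30) * gini_ix + (16 / 30) * gini_x            # ~0.51

print(weighted_gender, weighted_class)   # the Gender split has the higher Gini score
```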
How does a tree decide where to split?
▪ Information Gain = 1 – Entropy, where Entropy = -p log2(p) – q log2(q)
▪ Here, p and q are the probabilities of success and failure respectively in that node.
▪ Entropy is also used with a categorical target variable.
▪ It chooses the split which has the lowest entropy compared to the parent node and other splits.
▪ The lower the entropy, the better it is.
▪ Calculate entropy of parent node
▪ Calculate entropy of each individual node of split and calculate weighted average of all sub-
nodes available in split.
Split by Information Gain
▪ Entropy for Female node = -(2/10) log2 (2/10) – (8/10) log2 (8/10) = 0.72; for Male node = -(13/20) log2 (13/20) – (7/20) log2 (7/20) = 0.93
▪ Entropy for split Gender = weighted entropy of sub-nodes = (10/30)*0.72 + (20/30)*0.93 = 0.86
▪ Entropy for Class IX node = -(6/14) log2 (6/14) – (8/14) log2 (8/14) = 0.99; for Class X node = -(9/16) log2 (9/16) – (7/16) log2 (7/16) = 0.99
▪ Entropy for split Class = (14/30)*0.99 + (16/30)*0.99 = 0.99
▪ Above, you can see that the entropy for the split on Gender is the lowest among the candidates, so the tree will split on Gender (the calculation is reproduced in the sketch below).
▪ We can derive Information Gain from entropy as 1 – Entropy.
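The same calculation in Python, using the fractions quoted above (2 of 10 females, 13 of 20 males, 6 of 14 Class IX students and 9 of 16 Class X students play cricket); the helper name is my own:

```python
from math import log2

def entropy(p_success: float) -> float:
    """Entropy of a binary node: -p*log2(p) - q*log2(q); lower = purer."""
    if p_success in (0.0, 1.0):
        return 0.0
    q = 1.0 - p_success
    return -p_success * log2(p_success) - q * log2(q)

# Split on Gender: 2/10 females and 13/20 males play cricket.
e_female = entropy(2 / 10)                              # ~0.72
e_male = entropy(13 / 20)                               # ~0.93
e_gender = (10 / 30) * e_female + (20 / 30) * e_male    # ~0.86

# Split on Class: 6/14 in Class IX and 9/16 in Class X play cricket.
e_ix = entropy(6 / 14)                                  # ~0.99
e_x = entropy(9 / 16)                                   # ~0.99
e_class = (14 / 30) * e_ix + (16 / 30) * e_x            # ~0.99

print(e_gender, e_class)   # the Gender split has the lower entropy, so it is chosen
```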
Overfitting and pruning
Preventing Overfitting
Setting constraints on tree size (a code sketch follows the list below)
▪ Minimum samples for a node split
▪ Defines the minimum number of samples (or observations) which are required in a node to be considered for
splitting.
▪ Minimum samples for a terminal node (leaf)
▪ Defines the minimum samples (or observations) required in a terminal node or leaf.
▪ Maximum depth of tree (vertical depth)
▪ Defines the maximum depth of a tree.
▪ Maximum number of terminal nodes
▪ The maximum number of terminal nodes or leaves in a tree.
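These four constraints map directly onto hyperparameters in common implementations; for example, in scikit-learn's DecisionTreeClassifier they look roughly as follows (the numeric values are arbitrary illustrations, not recommendations):

```python
from sklearn.tree import DecisionTreeClassifier

# Illustrative values only; tune them via cross-validation for a real data set.
tree = DecisionTreeClassifier(
    min_samples_split=30,   # minimum samples a node needs before it can be split
    min_samples_leaf=10,    # minimum samples required in a terminal node (leaf)
    max_depth=5,            # maximum (vertical) depth of the tree
    max_leaf_nodes=20,      # maximum number of terminal nodes
)
# tree.fit(X_train, y_train)   # X_train / y_train assumed to be defined elsewhere
```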
Ensemble Methods
What are ensemble methods in tree based modeling ?
▪ The literal meaning of the word ‘ensemble’ is group. Ensemble methods combine a group of predictive models to achieve better accuracy and model stability. Ensemble methods are known to give a significant boost to tree-based models.
▪ Like every other model, a tree-based model also suffers from the plague of bias and variance. Bias means ‘how much, on average, the predicted values differ from the actual values.’ Variance means ‘how different the predictions of the model would be at the same point if different samples were taken from the same population.’
▪ If you build a small tree, you will get a model with low variance and high bias. How do you manage to balance the trade-off between bias and variance?
▪ Normally, as you increase the complexity of your model, you will see a reduction in prediction error due to
lower bias in the model. As you continue to make your model more complex, you end up over-fitting your
model and your model will start suffering from high variance.
▪ A champion model should maintain a balance between these two types of errors. This is known as the trade-off management of bias-variance errors. Ensemble learning is one way to manage this trade-off.
Bagging
▪ One of the commonly used ensemble methods is Bagging.
▪ Bagging is a technique used to reduce the variance of our predictions by combining the result of
multiple classifiers modelled on different sub-samples of the same data set. The following figure
will make it clearer:
Bagging
▪ The steps followed in bagging are (a short code sketch follows this list):
▪ Create Multiple DataSets:
▪ Sampling is done with replacement on the original data and new datasets are formed.
▪ The new data sets can have a fraction of the columns as well as rows, which are generally hyper-
parameters in a bagging model
▪ Taking row and column fractions less than 1 helps in making robust models, less prone to overfitting
▪ Build Multiple Classifiers:
▪ Classifiers are built on each data set.
▪ Generally the same classifier is modelled on each data set and predictions are made.
▪ Combine Classifiers:
▪ The predictions of all the classifiers are combined using a mean, median or mode value depending on the
problem at hand.
▪ The combined values are generally more robust than a single model.
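As one concrete sketch of these steps, scikit-learn's BaggingClassifier draws row (and optionally column) samples, fits the same base classifier on each, and combines the predictions by voting; the settings below are illustrative assumptions only:

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative settings: 50 trees, each trained on 80% of the rows (sampled
# with replacement) and 80% of the columns, combined by majority vote.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # same base classifier on every data set
                                         # (named base_estimator in older scikit-learn)
    n_estimators=50,                     # number of data sets / classifiers
    max_samples=0.8,                     # row fraction per bootstrap sample
    max_features=0.8,                    # column fraction per sample
    bootstrap=True,                      # sample rows with replacement
    random_state=42,
)
# bagging.fit(X_train, y_train)   # training data assumed to be available
# bagging.predict(X_test)         # predictions combined across classifiers
```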
Bagging
▪ A higher number of models is generally better than, or gives similar performance to, a smaller number.
▪ There are various implementations of bagging models.
▪ Random forest is one of them and we’ll discuss it next.
Random Forest
Random Forest - How does it work?
▪ Random Forest is considered to be a panacea for all data science problems. On a funny note, when you can’t think of any algorithm (irrespective of the situation), use Random Forest!
▪ Each tree is planted & grown as follows (a code sketch follows this list):
▪ Assume the number of cases in the training set is N. A sample of these N cases is taken at random but with replacement. This sample will be the training set for growing the tree.
▪ If there are M input variables, a number m<M is specified such that at each node, m variables are
selected at random out of the M. The best split on these m is used to split the node. The value of m is
held constant while we grow the forest.
▪ Each tree is grown to the largest extent possible and there is no pruning.
▪ Predict new data by aggregating the predictions of the ntree trees (i.e., majority votes for classification,
average for regression).
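In scikit-learn these steps correspond to the hyperparameters of RandomForestClassifier; a minimal sketch with illustrative values (the training data are assumed to exist):

```python
from sklearn.ensemble import RandomForestClassifier

# ntree trees, each grown on a bootstrap sample of the N training cases,
# trying m = sqrt(M) randomly chosen features at each split, with no pruning.
forest = RandomForestClassifier(
    n_estimators=500,        # number of trees (ntree)
    max_features="sqrt",     # m features tried at each node, m < M
    max_depth=None,          # grow each tree to the largest extent possible
    bootstrap=True,          # sample N cases with replacement per tree
    random_state=42,
)
# forest.fit(X_train, y_train)    # assumed training data
# forest.predict(X_test)          # majority vote across trees (classification)
# forest.feature_importances_     # variable importance scores
```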
RF – Advantages and Disadvantages
Advantages
▪ This algorithm can solve both types of problems, i.e. classification and regression, and does a decent job on both fronts.
▪ One of the benefits of Random Forest that excites me most is its power to handle large data sets with high dimensionality. It can handle thousands of input variables and identify the most significant ones, so it is considered one of the dimensionality reduction methods. Further, the model outputs variable importance, which can be a very handy feature (on some data sets).
Disadvantages
▪ It surely does a good job at classification, but not as good for regression problems, as it does not give precise continuous predictions. In the case of regression, it does not predict beyond the range of the training data, and it may over-fit data sets that are particularly noisy.
▪ Random Forest can feel like a black-box approach for statistical modelers: you have very little control over what the model does. At best, you can try different parameters and random seeds!
Model Validation
Accuracy Matrix
ROC - AUC
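A hedged, self-contained sketch of these validation outputs using scikit-learn on synthetic data (purely illustrative; substitute the actual data set and fitted model):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data purely for illustration; replace with the real data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

forest = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
y_pred = forest.predict(X_test)
y_prob = forest.predict_proba(X_test)[:, 1]      # probability of the positive class

print(confusion_matrix(y_test, y_pred))          # confusion (accuracy) matrix
print(accuracy_score(y_test, y_pred))            # overall accuracy
print(roc_auc_score(y_test, y_prob))             # area under the ROC curve
```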