Lecture 16: Random Forest

This lecture covers the random forest algorithm: how it works, important terms, and applications. Random forests build multiple decision trees and output the class that is the mode of the individual trees' predictions. They can handle both classification and regression tasks.

Slide 1: Random Forest
Slide 2: Introduction
Slide 3: Supervised Learning
Slide 4: Why Random Forest?
Slide 5: What is Random Forest?
Slide 6: Random Forest
Slide 7: Applications of Random Forest
Slide 8: Important Terms

• Entropy
• Information Gain
• Leaf Node
• Decision Node
• Root Node
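
The slides list these terms without definitions. For reference, the standard formulas for entropy and information gain used when splitting decision tree nodes are (these formulas are not in the original slides):

```latex
% Entropy of a sample set S with classes i = 1, ..., c, where p_i is
% the proportion of samples in S that belong to class i:
H(S) = -\sum_{i=1}^{c} p_i \log_2 p_i

% Information gain from splitting S on attribute A, where S_v is the
% subset of S for which attribute A takes the value v:
IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
```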
Slides 9–25: How does a Decision Tree Work? (a worked example developed step by step across these slides; the illustrations are not reproduced here)
Slide 27: Random Forest Prediction Pseudocode

To make a prediction, the trained random forest algorithm follows these steps:

1. Take the test features, apply the rules of each randomly created decision tree to predict an outcome, and store each predicted outcome (target).
2. Count the votes for each predicted target.
3. Take the highest-voted target as the final prediction of the random forest algorithm.
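
A minimal Python sketch of these three steps, assuming each trained tree is an object exposing a predict method that returns one class label for a single sample (the names trees and sample are illustrative, not from the slides; scikit-learn performs this aggregation internally):

```python
from collections import Counter

def random_forest_predict(trees, sample):
    """Majority-vote prediction for one sample across trained trees."""
    # Step 1: each tree applies its own rules to the test features
    votes = [tree.predict(sample) for tree in trees]
    # Step 2: count the votes for each predicted target
    tally = Counter(votes)
    # Step 3: the highest-voted target is the forest's final prediction
    return tally.most_common(1)[0][0]
```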
Slides 28–30: How Random Forest Works

Three decision trees (Tree 1, Tree 2, and Tree 3) are each shown making a prediction on the same input (illustrations not reproduced here).

Slide 31: Results Compared

The predictions of Tree 1, Tree 2, and Tree 3 are tallied.

• This concept of voting is known as majority voting.
• The highest-voted outcome is the final outcome.
Slides 32–34: Missing Data Example

A fruit with the following features is classified by each tree:

1. Diameter = 3
2. Color = Orange
3. Grows in Summer = Yes
4. Shape = Circle

• Tree 1 classifies it as an Orange.
• Tree 2 classifies it as a Cherry.
• Tree 3 classifies it as an Orange.

Slide 35: Results Compared

Two of the three trees (Tree 1 and Tree 3) vote Orange, so by majority voting the forest's final classification is Orange.
Slide 36: Conclusions

• Random forests are an effective tool for prediction.
• Random inputs and random features produce good results in classification.
• For larger data sets, accuracy can be gained by combining random features.
The problem is the same as before: your manager gives you data on customers who have previously bought some older makes of your company's SUV. The data includes a total of 400 instances.

Independent variables:
• Age
• Estimated salary

Dependent variable:
• Purchased (0 = no SUV purchased, 1 = SUV purchased)

Your company has just introduced a new SUV, and you are asked to predict who will buy it.
Workflow:

1. Import the libraries
2. Import the dataset
3. Split the dataset into training and testing samples
4. Feature scaling
5. Train the model
6. Predict the new results
7. Predict the test set results
8. Make the confusion matrix
9. Visualize the training set results
10. Visualize the test set results


Model hyperparameters used: n_estimators=10, criterion='entropy', random_state=0.
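
A runnable sketch of this workflow in Python with scikit-learn. The CSV filename Social_Network_Ads.csv and its column names are assumptions, as they are not given in the slides; the 25% test split is inferred from the 100-sample confusion matrix reported below, and the visualization steps are omitted for brevity:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

# Import the dataset (filename and column names are assumptions)
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset[['Age', 'EstimatedSalary']].values
y = dataset['Purchased'].values

# Split into training and test samples; a 25% test split of the
# 400 instances yields the 100 test samples tallied below
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Feature scaling: standardize Age and EstimatedSalary
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Train the model with the hyperparameters from the slides
classifier = RandomForestClassifier(
    n_estimators=10, criterion='entropy', random_state=0)
classifier.fit(X_train, y_train)

# Predict a new result (a 30-year-old earning 87,000; values illustrative)
print(classifier.predict(sc.transform([[30, 87000]])))

# Predict the test set results and make the confusion matrix
y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))
```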
The confusion matrix shows that 91 samples are correctly classified and 9 are misclassified, for an accuracy of 91/100 = 91%:

• True negatives: 63 (expected value 0, predicted value 0)
• False negatives: 5 (expected value 1, predicted value 0)
• False positives: 4 (expected value 0, predicted value 1)
• True positives: 28 (expected value 1, predicted value 1)
