Emaan Tahir-442
Kaneez Amna-448
Izza Shah-401
Aiman Arshad-445
Summaya Nazir-440
Presented To:
Ma’am Nazia
1977: Breiman, Stone, Friedman, and Olshen developed the first version of CART.
1984: the official publication appeared, together with CART decision tree software.
It was a revolution in the world of algorithms, and even today CART remains one of
the most widely used methods for decision tree analytics.
CART is an umbrella word that refers to the following types of
decision trees:
Classification trees: When the target variable is categorical, the tree is used to
identify the "class" into which the target variable is most likely to fall.
Regression trees: These are used to forecast the value of a
continuous variable.
The representation used for CART is a binary tree.
Stopping criteria define how much the tree learns, and pruning can be used to
improve a learned tree.
Predictions are made with CART by traversing the binary tree given
a new input record.
The tree is learned using a greedy algorithm on the training data to
pick splits in the tree.
The nodes are split into subnodes on the basis of a threshold value of an
attribute. The CART algorithm does that by searching for the best
homogeneity for the subnodes, with the help of the Gini Index criterion.
If Height > 180 cm Then Male
If Height <= 180 cm AND Weight > 80 kg Then Male
If Height <= 180 cm AND Weight <= 80 kg Then Female
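As a minimal illustration (not part of the original slides), the rules above can be written as a small hand-built binary tree and traversed to classify a new record; the dictionary representation and the sample record are assumptions made for this sketch.

```python
# Minimal sketch (assumed representation, not from the slides): the rules above
# expressed as a hand-built binary tree and traversed to classify a new record.

# Each internal node stores a feature, a threshold, and left/right subtrees;
# a leaf node is just a class label.
tree = {
    "feature": "height", "threshold": 180.0,
    "left": {                      # height <= 180 cm
        "feature": "weight", "threshold": 80.0,
        "left": "Female",          # weight <= 80 kg
        "right": "Male",           # weight > 80 kg
    },
    "right": "Male",               # height > 180 cm
}

def predict(node, record):
    """Walk the tree until a leaf (a plain string) is reached."""
    while isinstance(node, dict):
        branch = "left" if record[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

print(predict(tree, {"height": 170, "weight": 65}))  # -> Female
```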
Make Predictions With CART Models
The root node is taken as the training set and is split into two by considering
the best attribute and threshold value. The resulting subsets are then split
using the same logic, and this continues until every remaining subset is pure or
the maximum number of leaves for the growing tree is reached. This process is
known as recursive binary splitting.
Greedy Algorithm:
The input variables and the split points are selected through a greedy
algorithm.
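As a rough sketch of what such a greedy search can look like (the function names, the toy data, and the exhaustive threshold scan are assumptions for illustration, not the slides' exact procedure):

```python
# Rough sketch of a greedy, Gini-based split search on a single feature.
from collections import Counter

def gini(labels):
    """Gini = 1 - sum(p_i^2) over the classes present in `labels`."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Try every candidate threshold on one feature; keep the one with the
    lowest weighted Gini of the two resulting child nodes."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left  = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

heights = [150, 160, 170, 185, 190, 195]
genders = ["Female", "Female", "Female", "Male", "Male", "Male"]
print(best_split(heights, genders))   # -> (170, 0.0): a perfect split
```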
Stopping Criterion:
The recursive binary splitting method described above must know when to stop
splitting. A common rule is a minimum count of training instances per node: if a
split would leave a node with fewer instances than this minimum, the split is
rejected and that node becomes a leaf node.
Tree pruning:
A decision tree's complexity is defined as the number of splits in the tree.
Leaf nodes are removed only if doing so decreases the total cost function on the
complete test set.
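In practice, these stopping and pruning controls appear as parameters of CART-style implementations; for example, scikit-learn's DecisionTreeClassifier exposes them roughly as below (the dataset and parameter values are arbitrary placeholders, not recommendations).

```python
# Hedged example: scikit-learn's DecisionTreeClassifier implements a CART-style
# tree and exposes stopping criteria and cost-complexity pruning as parameters.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

clf = DecisionTreeClassifier(
    criterion="gini",        # impurity measure used to pick splits
    min_samples_split=10,    # stopping criterion: do not split very small nodes
    max_depth=4,             # stopping criterion: cap the tree depth
    ccp_alpha=0.01,          # cost-complexity pruning strength
    random_state=0,
)
clf.fit(X, y)
print(clf.get_n_leaves(), clf.get_depth())
```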
EXAMPLE
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No
The Gini index is the metric CART uses for classification tasks. It is based on
the sum of the squared probabilities of each class, and can be formulated as
illustrated below.
Gini = 1 – Σ (Pi)², for i = 1 to the number of classes
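For reference, the formula can be checked with a few lines of Python; the helper name is illustrative, and the example counts are taken from the Outlook table that follows.

```python
# Small helper reproducing the Gini formula above (illustrative name).
def gini_from_counts(*counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(round(gini_from_counts(2, 3), 2))  # Outlook = Sunny    -> 0.48
print(round(gini_from_counts(4, 0), 2))  # Outlook = Overcast -> 0.0
```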
Outlook Yes No Number of instances
Sunny 2 3 5
Overcast 4 0 4
Rain 3 2 5
Gini(Outlook=Sunny) = 1 – (2/5)² – (3/5)² = 1 – 0.16 – 0.36 = 0.48
Gini(Outlook=Overcast) = 1 – (4/4)² – (0/4)² = 0
Gini(Outlook=Rain) = 1 – (3/5)² – (2/5)² = 1 – 0.36 – 0.16 = 0.48
Then we calculate the weighted sum of the Gini indexes for the Outlook feature.
Gini(Outlook) = (5/14) x 0.48 + (4/14) x 0 + (5/14) x 0.48 = 0.171 + 0
+ 0.171 = 0.342
Temperature Yes No Number of instances
Hot 2 2 4
Cool 3 1 4
Mild 4 2 6
Gini(Temp=Hot) = 1 – (2/4)² – (2/4)² = 0.5
Gini(Temp=Cool) = 1 – (3/4)² – (1/4)² = 1 – 0.5625 – 0.0625 = 0.375
Gini(Temp=Mild) = 1 – (4/6)² – (2/6)² = 1 – 0.444 – 0.111 = 0.445
Next, we calculate the weighted sum of the Gini index for the Temperature feature.
Gini(Temp) = (4/14) x 0.5 + (4/14) x 0.375 + (6/14) x 0.445 = 0.142 +
0.107 + 0.190 = 0.439
Humidity Yes No Number of instances
High 3 4 7
Normal 6 1 7
Gini(Humidity=High) = 1 – (3/7)² – (4/7)² = 1 – 0.184 – 0.327 = 0.489
Gini(Humidity=Normal) = 1 – (6/7)² – (1/7)² = 1 – 0.735 – 0.020 = 0.245
The weighted sum for the Humidity feature is calculated next.
Gini(Humidity) = (7/14) x 0.489 + (7/14) x 0.245 = 0.367
Wind Yes No Number of instances
Weak 6 2 8
Strong 3 3 6
Gini(Wind=Weak) = 1 – (6/8)² – (2/8)² = 1 – 0.5625 – 0.0625 = 0.375
Gini(Wind=Strong) = 1 – (3/6)² – (3/6)² = 1 – 0.25 – 0.25 = 0.5
Gini(Wind) = (8/14) x 0.375 + (6/14) x 0.5 = 0.428
We have now calculated the Gini index value for each feature. We select the
Outlook feature for the root node because its cost is the lowest.
Feature Gini index
Outlook 0.342
Temperature 0.439
Humidity 0.367
Wind 0.428
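As a cross-check (not part of the slides), the whole table above can be reproduced from the 14-row dataset; the dataset encoding and helper names below are assumptions made for this sketch.

```python
# Sketch that recomputes the weighted Gini index of each feature on the
# 14-row weather dataset; encoding and helper names are illustrative.
data = [
    ("Sunny","Hot","High","Weak","No"),    ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"),("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"), ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"), ("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"),("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"), ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Hot","Normal","Weak","Yes"), ("Rain","Mild","High","Strong","No"),
]
features = ["Outlook", "Temperature", "Humidity", "Wind"]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

for i, name in enumerate(features):
    weighted = 0.0
    for value in set(row[i] for row in data):
        subset = [row[-1] for row in data if row[i] == value]
        weighted += len(subset) / len(data) * gini(subset)
    print(name, round(weighted, 3))
# Outlook has the lowest weighted Gini (≈0.343; the slides round to 0.342),
# so it becomes the root node.
```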
[Decision tree so far: OUTLOOK is the root node; the Overcast branch is pure, so it leads directly to a YES leaf.]
Sunny outlook
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
Gini of temperature for sunny outlook
Temperature Yes No Number of instances
Hot 0 2 2
Cool 1 0 1
Mild 1 1 2
Gini(Outlook=Sunny and Temp.=Hot) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Sunny and Temp.=Cool) = 1 – (1/1)² – (0/1)² = 0
Gini(Outlook=Sunny and Temp.=Mild) = 1 – (1/2)² – (1/2)² = 1 – 0.25 – 0.25 = 0.5
Gini(Outlook=Sunny and Temp.) = (2/5) x 0 + (1/5) x 0 + (2/5) x 0.5 = 0.2
Gini of humidity for sunny outlook
Humidity Yes No Number of instances
High 0 3 3
Normal 2 0 2
Gini(Outlook=Sunny and Humidity=High) = 1 – (0/3)² – (3/3)² = 0
Gini(Outlook=Sunny and Humidity=Normal) = 1 – (2/2)² – (0/2)² = 0
Gini(Outlook=Sunny and Humidity) = (3/5) x 0 + (2/5) x 0 = 0
Gini of wind for sunny outlook
Wind Yes No Number of instances
Weak 1 2 3
Strong 1 1 2
Gini(Outlook=Sunny and Wind=Weak) = 1 – (1/3)² – (2/3)² = 0.444
Gini(Outlook=Sunny and Wind=Strong) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Sunny and Wind) = (3/5) x 0.444 + (2/5) x 0.5 = 0.466
Decision for sunny outlook
Feature Gini index
Temperature 0.2
Humidity 0
Wind 0.466
Humidity is selected at this node because its Gini index is the lowest. Both of
its branches are pure, so High humidity leads to a No leaf and Normal humidity
leads to a Yes leaf.
Rain outlook
Day Outlook Temp. Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
10 Rain Mild Normal Weak Yes
14 Rain Mild High Strong No
Gini of temperature for rain outlook
Temperature Yes No Number of instances
Cool 1 1 2
Mild 2 1 3
Gini(Outlook=Rain and Temp.=Cool) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Temp.=Mild) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Temp.) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of humidity for rain outlook
Humidity Yes No Number of instances
High 1 1 2
Normal 2 1 3
Gini(Outlook=Rain and Humidity=High) = 1 – (1/2)² – (1/2)² = 0.5
Gini(Outlook=Rain and Humidity=Normal) = 1 – (2/3)² – (1/3)² = 0.444
Gini(Outlook=Rain and Humidity) = (2/5) x 0.5 + (3/5) x 0.444 = 0.466
Gini of wind for rain outlook
Wind Yes No Number of instances
Weak 3 0 3
Strong 0 2 2
Gini(Outlook=Rain and Wind=Weak) = 1 – (3/3)² – (0/3)² = 0
Gini(Outlook=Rain and Wind=Strong) = 1 – (0/2)² – (2/2)² = 0
Gini(Outlook=Rain and Wind) = (3/5) x 0 + (2/5) x 0 = 0
Decision for rain outlook
Feature Gini index
Temperature 0.466
Humidity 0.466
Wind 0
Wind is selected at this node because its Gini index is the lowest. Both of its
branches are pure, so Weak wind leads to a Yes leaf and Strong wind leads to a
No leaf, and the tree for this example is complete.
Example
Age Job House Credit Loan Approved
Young False No Fair No
Young False No Good No
Young True No Good Yes
Young True Yes Fair Yes
Young False No Fair No
Middle False No Fair No
Middle False No Good No
Middle True Yes Good Yes
Middle False Yes Excellent Yes
Middle False Yes Excellent Yes
Old False Yes Excellent Yes
Old False Yes Good Yes
Old True No Good Yes
Old True No Excellent Yes
Old False No Fair No
Young False No Good ?
Age Yes No Number of instances
Young 2 3 5
Middle 3 2 5
Old 4 1 5
Individual errors and total error for the Age attribute:
Attribute Rules Error Total Error
Age Young->No 2/5 5/15
Middle->Yes 2/5
Old->Yes 1/5
Job Yes No Number of instances
False 4 6 10
True 5 0 5
Individual errors and total error for the Job attribute:
Attribute Rules Error Total Error
Job False->No 4/10 4/15
True->Yes 0/5
House Yes No Number of instances
No 3 6 9
Yes 6 0 6
Individual errors and total error for the House attribute:
Attribute Rules Error Total Error
House No->No 3/9 3/15
Yes->Yes 0/6
Credit Yes No Number of instances
Fair 1 4 5
Good 4 2 6
Excellent 4 0 4
Individual errors and total error for the Credit attribute:
Attribute Rules Error Total Error
Credit Fair->No 1/5 3/15
Good->Yes 2/6
Excellent->Yes 0/4
Attribute Rules Error Total Error
Age Young->No 2/5 5/15
Middle->Yes 2/5
Old->Yes 1/5
Job False->No 4/10 4/15
True->Yes 0/5
House No->No 3/9 3/15
Yes->Yes 0/6
Credit Fair->No 1/5 3/15
Good->Yes 2/6
Excellent->Yes 0/4
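As a cross-check (not part of the slides), the total-error column can be recomputed from the 15-row loan table; the tuple encoding of the rows and the majority-class rule logic are assumptions made for this sketch.

```python
# Sketch recomputing the total error of each attribute for the loan data.
from collections import Counter

rows = [  # (Age, Job, House, Credit, LoanApproved)
    ("Young", False, "No", "Fair", "No"),   ("Young", False, "No", "Good", "No"),
    ("Young", True,  "No", "Good", "Yes"),  ("Young", True,  "Yes", "Fair", "Yes"),
    ("Young", False, "No", "Fair", "No"),   ("Middle", False, "No", "Fair", "No"),
    ("Middle", False, "No", "Good", "No"),  ("Middle", True,  "Yes", "Good", "Yes"),
    ("Middle", False, "Yes", "Excellent", "Yes"), ("Middle", False, "Yes", "Excellent", "Yes"),
    ("Old", False, "Yes", "Excellent", "Yes"), ("Old", False, "Yes", "Good", "Yes"),
    ("Old", True,  "No", "Good", "Yes"),    ("Old", True,  "No", "Excellent", "Yes"),
    ("Old", False, "No", "Fair", "No"),
]
attributes = ["Age", "Job", "House", "Credit"]

for i, name in enumerate(attributes):
    errors = 0
    for value in set(r[i] for r in rows):
        labels = [r[-1] for r in rows if r[i] == value]
        # One-level rule: predict the majority class for this attribute value;
        # every instance of the minority class counts as an error.
        errors += len(labels) - Counter(labels).most_common(1)[0][1]
    print(f"{name}: {errors}/{len(rows)}")
# House and Credit tie at 3/15; the slides pick House as the root.
```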
House and Credit tie for the lowest total error (3/15), and House is chosen as the root node.
[Decision tree so far: HOUSE is the root; the House = Yes branch is pure and leads to a YES leaf. The House = No subset below is split further.]
Age Job Credit Loan Approved
Young False Fair No
Young False Good No
Young True Good Yes
Young False Fair No
Middle False Fair No
Middle False Good No
Old True Good Yes
Old True Excellent Yes
Old False Fair No
Attribute Rules Error Total Error
Age Young->No 1/4 2/9
Middle->No 0/2
Old->Yes 1/3
Job False->No 0/6 0/9
True->Yes 0/3
Credit Fair->No 0/4 2/9
Good->Yes/No 2/4
Excellent->Yes 0/1
Job has the lowest total error (0/9) on the House = No subset, so it is chosen for the next split.
[Final decision tree: HOUSE is the root. House = Yes leads to YES; House = No leads to a JOB node, where Job = True leads to YES and Job = False leads to NO.]
Age Job House Credit Loan Approved
Young False No Good No
The query record (Young, Job = False, House = No, Credit = Good) follows the
House = No branch and then the Job = False branch, so the predicted decision is
Loan Approved = No.
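As a minimal sketch (not from the slides), the finished tree can be written as two nested conditions and applied to the query record; the function and variable names are illustrative.

```python
# Final loan tree as nested conditionals: House first, then Job.
def approve_loan(record):
    if record["House"] == "Yes":
        return "Yes"
    return "Yes" if record["Job"] else "No"

query = {"Age": "Young", "Job": False, "House": "No", "Credit": "Good"}
print(approve_loan(query))  # -> No
```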
Advantages of CART Algorithm
CART requires minimal supervision and produces easy-to-understand models.
It can capture interactions between variables and discontinuities in the signal.
It finds important variables automatically.
It can handle any combination of continuous and discrete variables.
Disadvantages of CART Algorithm
Each split uses a single variable, so CART does not exploit combinations of variables.
The tree structure may be unstable: small changes in the data can produce a very different tree.
A tree has only a limited number of split positions, so it may not accommodate all available predictors.