Classification as a Machine Learning Problem
Overview
Classification is a canonical problem in Machine Learning
Classifiers can be measured using accuracy, precision, and recall
Traditional ML models for classification include SVM and Naive Bayes
Neural networks perform very well on classification problems
Classification and Classifiers
Machine Learning: work with a huge maze of data, find patterns, and make intelligent decisions
Example: emails on a server - Spam or Ham? Trash or Inbox?
Types of Machine Learning Problems
Classification Regression Clustering Rule-extraction
Whales: Fish or Mammals?
Whales are mammals - members of the infraorder Cetacea - yet they look like fish, swim like fish, and move with fish.
ML-based Classifier
Training: feed in a large corpus of correctly classified data
Prediction: use the trained classifier to classify new instances it has not seen before
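As a minimal sketch of this training/prediction split (an assumed toy spam/ham example using scikit-learn, not from the slides):

```python
# A minimal sketch (assumed toy data): train on a labeled corpus, then
# predict labels for new, unseen instances.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Training: feed in a corpus of correctly classified data
corpus = ["win money now", "meeting at noon", "cheap pills offer", "lunch tomorrow?"]
labels = ["spam", "ham", "spam", "ham"]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(corpus)

model = MultinomialNB()
model.fit(X_train, labels)

# Prediction: classify new instances the model has not seen before
new_emails = ["cheap money offer", "see you at lunch"]
print(model.predict(vectorizer.transform(new_emails)))  # e.g. ['spam' 'ham']
```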
Training the ML-based Classifier
Corpus → ML-based Classifier → Classification
Feedback from a loss function (or cost function) improves the model parameters
An algorithm might have high accuracy but still be a poor machine learning model: its predictions can be useless.
Accuracy, Precision, Recall
All-is-well Binary Classifier
Medical reports: always classify as "normal" (No Cancer)
Here, accuracy for a rare cancer may be 99.9999%, but…
Accuracy
Some labels may be much more common or rare than others
Such a dataset is said to be skewed
Accuracy is a poor evaluation metric here
Confusion Matrix
                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       10 instances         4 instances
Actual: No Cancer    5 instances          1000 instances
True Positive
Actual Label = Predicted Label = Cancer
The 10 instances predicted as Cancer where the patient actually has cancer are True Positives (TP).
False Positive
Actual Label ≠ Predicted Label
The 5 instances predicted as Cancer where the patient actually has no cancer are False Positives (FP).
True Negative
Actual Label = Predicted Label = No Cancer
The 1000 instances predicted as No Cancer where the patient actually has no cancer are True Negatives (TN).
False Negative
Actual Label ≠ Predicted Label
The 4 instances predicted as No Cancer where the patient actually has cancer are False Negatives (FN).
Confusion Matrix
                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       10 (TP)              4 (FN)
Actual: No Cancer    5 (FP)               1000 (TN)
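As a sketch (assumed toy labels, not the slides' 1019 instances), the four cells can be counted directly from paired actual and predicted labels:

```python
# Count TP, FN, FP, TN for a binary Cancer / No Cancer classifier.
actual    = ["Cancer", "Cancer", "No Cancer", "No Cancer", "No Cancer"]
predicted = ["Cancer", "No Cancer", "Cancer", "No Cancer", "No Cancer"]

pairs = list(zip(actual, predicted))
TP = sum(a == "Cancer"    and p == "Cancer"    for a, p in pairs)
FN = sum(a == "Cancer"    and p == "No Cancer" for a, p in pairs)
FP = sum(a == "No Cancer" and p == "Cancer"    for a, p in pairs)
TN = sum(a == "No Cancer" and p == "No Cancer" for a, p in pairs)
print(TP, FN, FP, TN)  # 1 1 1 2 for this toy data
```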
Accuracy
Accuracy counts the instances where Actual Label = Predicted Label (the TP and TN cells).
Accuracy = (TP + TN) / Num Instances = (10 + 1000) / 1019 = 99.12%
Accuracy = 99.12%: the classifier gets it right 99.12% of the time.
But…
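A quick check of that number in code (a sketch using the slides' confusion matrix counts):

```python
# Accuracy from the slides' confusion matrix
TP, FN, FP, TN = 10, 4, 5, 1000
accuracy = (TP + TN) / (TP + FN + FP + TN)
print(f"Accuracy: {accuracy:.2%}")  # Accuracy: 99.12%
```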
The 5 false positives put people on chemotherapy or radiation when it is not required.
The 4 false negatives mean cancer is not detected and no treatment is prescribed.
Accuracy is not a good metric to evaluate whether this model performs well.
Precision
Precision is the accuracy when the classifier flags cancer, i.e. over the instances predicted as Cancer.
Precision = TP / (TP + FP) = 10 / 15 = 66.67%
1 in 3 cancer diagnoses is incorrect
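The same calculation in code (a sketch using the slides' counts):

```python
# Precision: of all instances flagged as Cancer, how many actually are Cancer
TP, FP = 10, 5
precision = TP / (TP + FP)
print(f"Precision: {precision:.2%}")  # Precision: 66.67%
```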
Recall
Recall is the accuracy when cancer is actually present, i.e. over the instances whose actual label is Cancer.
Recall = TP / (TP + FN) = 10 / 14 = 71.43%
2 in 7 cancer cases missed
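Again in code (a sketch using the slides' counts):

```python
# Recall: of all actual Cancer cases, how many the classifier caught
TP, FN = 10, 4
recall = TP / (TP + FN)
print(f"Recall: {recall:.2%}")  # Recall: 71.43%
```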
Choosing a Machine Learning Model
An ML-based binary classifier, trained on a corpus, sees an animal that breathes like a mammal and gives birth like a mammal.
A hard classifier outputs a label directly: Mammal.
A probabilistic classifier outputs a probability instead, e.g. P(fish) = 0.45.
Applying Logistic Regression
The model outputs the probability of an animal being a fish, for example:
Lives in water, breathes with gills, lays eggs: 95%
Lives in water, breathes with lungs, does not lay eggs: 60%
Lives on land, breathes with lungs, does not lay eggs: 5%
Whales: fish or mammals?
Choosing a Decision Threshold
Pick a threshold on the probability of the animal being a fish, e.g. Pthreshold = 80%.
If probability < Pthreshold, it's a mammal
If probability > Pthreshold, it’s a fish
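Applying the threshold is a simple comparison; here is a sketch with assumed example probabilities like those on the chart:

```python
# Turn P(fish) into a hard label using a decision threshold of 0.8
p_fish = [0.95, 0.60, 0.40, 0.20, 0.05]
P_THRESHOLD = 0.80

labels = ["fish" if p > P_THRESHOLD else "mammal" for p in p_fish]
print(labels)  # ['fish', 'mammal', 'mammal', 'mammal', 'mammal']
```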
Pthreshold = 1: the "Always Negative" classifier

                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       0 (TP)               14 (FN)
Actual: No Cancer    0 (FP)               1005 (TN)

- Recall = 0%
- Precision = 0 / 0 is undefined: the classifier never makes a positive prediction
- The classifier is too conservative
Precision vs. "Conservativeness"
[Plot: precision (0 to 1.0) against the "conservativeness" of the decision threshold; precision rises as the threshold becomes more conservative]
Pthreshold = 0: the "Always Positive" classifier

                     Predicted: Cancer    Predicted: No Cancer
Actual: Cancer       14 (TP)              0 (FN)
Actual: No Cancer    1005 (FP)            0 (TN)

- Recall = 100%
- Precision = 14 / 1019 = 1.37%
- The classifier is not conservative enough
Recall vs. "Conservativeness"
[Plot: recall (0 to 1.0) against the "conservativeness" of the decision threshold; recall falls as the threshold becomes more conservative]
Precision-Recall Tradeoff
[Plot: precision rises while recall falls as the "conservativeness" of the decision threshold increases; improving one generally worsens the other]
Heuristics to Choose a Model
ROC Curve: plot a curve to maximize true positives and minimize false positives
F1 Score: the harmonic mean of precision and recall

F1 = 2 × (Precision × Recall) / (Precision + Recall)

The F1 score is the harmonic mean of precision and recall: it sits closer to the lower of the two, so it favors an even tradeoff between them.
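With the precision and recall computed earlier (a sketch using the slides' counts):

```python
# F1 score from the slides' precision (10/15) and recall (10/14)
precision, recall = 10 / 15, 10 / 14
f1 = 2 * precision * recall / (precision + recall)
print(f"F1: {f1:.2%}")  # F1: 68.97%
```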
Choosing Pthreshold with the F1 Score
Tweak threshold values: run training with a different threshold value for each execution
Calculate precision and recall for each training run
Calculate the F1 score: each training run produces a model; compute the F1 score for each model
A higher F1 score is better: choose the threshold that results in the highest F1 score
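A minimal sketch of that procedure (assumed toy labels and probabilities; in practice one trained model's predicted probabilities can simply be re-thresholded rather than retraining for every candidate):

```python
import numpy as np
from sklearn.metrics import f1_score

def choose_threshold(y_true, y_prob, candidates=np.arange(0.1, 1.0, 0.1)):
    """Return the candidate threshold with the highest F1 score on the given data."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        y_pred = (y_prob >= t).astype(int)            # apply the candidate threshold
        f1 = f1_score(y_true, y_pred, zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

y_true = np.array([1, 0, 1, 1, 0, 0])                 # toy actual labels
y_prob = np.array([0.9, 0.4, 0.7, 0.3, 0.2, 0.6])     # toy predicted P(positive)
print(choose_threshold(y_true, y_prob))               # best threshold ≈ 0.7 for this toy data
```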
Choosing Pthreshold with the ROC Curve
The ROC (Receiver Operating Characteristic) curve plots the True Positive Rate against the False Positive Rate.
The True Positive Rate should be as high as possible; the False Positive Rate should be as low as possible.
Each value of Pthreshold gives one point on the plot; trying different values is a form of hyperparameter tuning.
Fit the ROC curve through the points obtained from the different values of Pthreshold.
Pick the point nearest the top-left corner as Pthreshold. Why? It maximises the True Positive Rate and minimises the False Positive Rate.
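A sketch of that selection with scikit-learn (assumed toy labels and probabilities):

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # toy actual labels
y_prob = np.array([0.9, 0.4, 0.7, 0.3, 0.2, 0.6, 0.8, 0.1])   # toy predicted probabilities

# One (FPR, TPR) point per candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

# Pick the threshold whose point is closest to the top-left corner (FPR = 0, TPR = 1)
best = np.argmin(fpr**2 + (1 - tpr)**2)
print("Chosen Pthreshold:", thresholds[best])                 # ≈ 0.7 for this toy data
```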
ROC of a Perfect Classifier
[Plot: the curve reaches the top-left corner, where the True Positive Rate = 100% and the False Positive Rate = 0%]
ROC of a Random Classifier
[Plot: the diagonal line where the True Positive Rate equals the False Positive Rate]