ED5340 - Data Science: Theory and Practise
L24 - Evaluation Metrics
Ramanathan Muthuganapathy ([Link]
Course web page: [Link]
Moodle page: Available at [Link]
Classification
• Confusion Matrix
• Precision
• Recall
• F1-Score
• True positive rate
• False positive rate
• Accuracy
• AUC
Ramanathan Muthuganapathy, Department of Engineering Design, IIT Madras
Confusion Matrix
                     Actual Class
                     1      0
Predicted Class  1   TP     FP
                 0   FN     TN

TP - True Positive
FP - False Positive
FN - False Negative
TN - True Negative
Details
• TP (True Positive) - Actual and prediction are both positive.
• FP (False Positive) - Actual is negative but the prediction is positive (predicting cancer when there is no such case).
• FN (False Negative) - Actual is positive but the prediction is negative (predicting no cancer when there is one).
• TN (True Negative) - Actual and prediction are both negative.
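The four definitions above can be turned into a direct count over paired labels. This is a minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative); the function name `confusion_counts` and the sample lists are hypothetical and used only for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1          # predicted positive, actually positive
        elif p == 1 and a == 0:
            fp += 1          # predicted positive, actually negative
        elif p == 0 and a == 1:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn

# Made-up labels for illustration only
actual    = [1, 1, 1, 0, 0, 1, 0, 1]
predicted = [1, 0, 1, 1, 0, 1, 0, 0]
print(confusion_counts(actual, predicted))  # (3, 1, 2, 2)
```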
Precision and recall
• Precision - Of all the cases predicted positive, what fraction is actually positive?
• $P = \frac{TP}{TP + FP}$ (the Predicted Class = 1 row of the confusion matrix)
• Recall - Of all the actual positive cases, what fraction has been correctly predicted?
• $R = \frac{TP}{TP + FN}$ (the Actual Class = 1 column of the confusion matrix)
Example
• Dataset - 50 cases, 40 true and 10 false
• $P = \frac{TP}{TP + FP}$
• $R = \frac{TP}{TP + FN}$

                     Actual Class
                     1      0
Predicted Class  1   30     FP
                 0   FN     3
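The example gives TP = 30 and TN = 3 but leaves FP and FN open. A minimal sketch of how they can be filled in, assuming "40 true and 10 false" refers to the actual class counts (40 actual positives, 10 actual negatives); that reading is an assumption, not something the slide states explicitly.

```python
# Assumption: 40 actual positives and 10 actual negatives
actual_positive, actual_negative = 40, 10
tp, tn = 30, 3                     # given in the example

fn = actual_positive - tp          # positives that were missed
fp = actual_negative - tn          # negatives flagged as positive

precision = tp / (tp + fp)         # 30 / 37
recall = tp / (tp + fn)            # 30 / 40

print(fn, fp)                                  # 10 7
print(round(precision, 3), round(recall, 3))   # 0.811 0.75
```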
Precision or Recall
• E.g. - Email spam filter
• High precision or high recall?
• FP - Genuine email getting classified as spam
• FN - Spam coming to your inbox

                    Actual Spam
                    1      0
Predicted Spam  1   TP     FP
                0   FN     TN
F1 - Score (Harmonic mean)
• Dataset - 50 cases, 40 true and 10 false
• $F_1 = 2 \cdot \frac{P \cdot R}{P + R}$

                     Actual Class
                     1      0
Predicted Class  1   30     FP
                 0   FN     3
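A short sketch of the F1 formula applied to this dataset. It assumes P = 30/37 and R = 30/40, which follow only if the 40/10 split refers to the actual classes (so FP = 7 and FN = 10); the helper name `f1_score` is hypothetical. The second call illustrates why a harmonic mean is used: a very low recall pulls the score down far more than an arithmetic mean would.

```python
def f1_score(p, r):
    """Harmonic mean of precision p and recall r (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

# Assumed values for the 50-case dataset: P = 30/37, R = 30/40
print(round(f1_score(30 / 37, 30 / 40), 3))   # 0.779

# Perfect precision but poor recall: arithmetic mean would be 0.55
print(round(f1_score(1.0, 0.1), 3))           # 0.182
```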
Accuracy
• 50 cases, 40 true and 10 false
• $Acc = \frac{TP + TN}{TP + FP + FN + TN}$

                     Actual Class
                     1      0
Predicted Class  1   30     FP
                 0   FN     3
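The accuracy formula can be checked on the same numbers. This sketch assumes FP = 7 and FN = 10 (derived by reading the 40/10 split as actual class counts) and compares the result with a hypothetical classifier that always predicts positive, to show that accuracy alone can be misleading on an imbalanced dataset.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all cases that were predicted correctly."""
    return (tp + tn) / (tp + fp + fn + tn)

model = accuracy(tp=30, fp=7, fn=10, tn=3)            # 33 / 50
always_positive = accuracy(tp=40, fp=10, fn=0, tn=0)  # 40 / 50

print(model, always_positive)  # 0.66 0.8
```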
TPR and FPR
• Dataset - 50 cases, 40 true and 10 false
• $TPR = \frac{TP}{TP + FN}$
• $FPR = \frac{FP}{FP + TN}$ (negative cases being predicted incorrectly)

                     Actual Class
                     1      0
Predicted Class  1   TP     FP
                 0   FN     TN
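A minimal sketch of both rates, using the counts TP = 30, FP = 7, FN = 10, TN = 3 (FP and FN are assumed, derived from 40 actual positives and 10 actual negatives). The function name `tpr_fpr` is hypothetical. Note that TPR is the same quantity as recall.

```python
def tpr_fpr(tp, fp, fn, tn):
    """Return (TPR, FPR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)   # fraction of actual positives predicted positive
    fpr = fp / (fp + tn)   # fraction of actual negatives predicted positive
    return tpr, fpr

print(tpr_fpr(tp=30, fp=7, fn=10, tn=3))  # (0.75, 0.7)
```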
Area under ROC curve (AUC)
• The higher the area, the better.
• Qn: How to get this curve?
[Figure: ROC curve, with TPR on the vertical axis and FPR on the horizontal axis]
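One common way to obtain the curve, sketched here under assumptions: the classifier outputs a score per case, a case is predicted positive when its score is at or above a threshold, and the threshold is swept over the observed scores. Each threshold yields one (FPR, TPR) point, and the area is taken with the trapezoidal rule. The function names and the sample scores are hypothetical, and the sketch assumes both classes are present in the data.

```python
def roc_points(actual, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Use each distinct score as a threshold, from strictest to most lenient
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the piecewise-linear curve through the points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up labels and scores for illustration only
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(actual, scores)
print(points[-1])              # (1.0, 1.0): lowest threshold predicts everything positive
print(round(auc(points), 3))   # 0.889
```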
Dice score / coefficient (pixel data)
• $DC = 2 \cdot \frac{|A \cap B|}{|A| + |B|}$
• $DC = 2 \cdot \frac{\text{Area of intersection}}{\text{Sum of the two areas}}$
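A small sketch of the Dice coefficient, assuming each region is represented as a set of (row, column) pixel coordinates so that set operations mirror the formula directly. The function name `dice`, the two sample regions, and the convention of returning 1.0 for two empty regions are assumptions made for illustration.

```python
def dice(a, b):
    """Dice coefficient of two pixel sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumed convention: two empty regions match perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Two hypothetical 2x2 regions that share one column of pixels
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
B = {(0, 1), (0, 2), (1, 1), (1, 2)}
print(dice(A, B))  # 0.5
```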
IoU (Intersection over union)
• $IoU = \frac{|A \cap B|}{|A \cup B|}$
• $IoU = \frac{\text{Area of intersection}}{\text{Area of the union}}$
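A matching sketch for IoU, again assuming regions are sets of pixel coordinates (the function name `iou` and the sample regions are hypothetical). The last lines check the known identity DC = 2 IoU / (1 + IoU), which links this measure to the Dice coefficient.

```python
def iou(a, b):
    """Intersection over union of two pixel sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty regions match perfectly
    return len(a & b) / len(union)

# Two hypothetical 2x2 regions that share one column of pixels
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
B = {(0, 1), (0, 2), (1, 1), (1, 2)}

j = iou(A, B)                                      # 2 / 6
dice_direct = 2 * len(A & B) / (len(A) + len(B))   # 4 / 8
dice_from_iou = 2 * j / (1 + j)

print(round(j, 3), dice_direct, round(dice_from_iou, 3))  # 0.333 0.5 0.5
```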