Week 05 Classification Performance

05-09-2024

TOD 533
Classification Performance: Validation and metrics
Amit Das
TODS / AMSOM / AU
[email protected]

Model validation: Holdout sample

• Training set: data used to train the model (find optimum parameter values)
• Validation set: data withheld from training, used to assess performance
  • An opportunity to set / refine some model (hyper)parameters
• Test set: the data of interest, to which the model is finally applied for prediction

• Avoid overfitting: customizing the model to quirks of the training data that are absent in other (particularly, target) data
• Prefer simpler models (Occam’s razor); see the sketch of the three-way split below
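A minimal sketch of this three-way split in Python with scikit-learn, assuming synthetic stand-in data; the 60/20/20 proportions are an illustrative choice, not something the slides prescribe:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)  # stand-in data

# Carve out the test set first, then split the remainder into train / validation.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42)  # 0.25 of 80% = 20%

# Train on X_train, tune (hyper)parameters against X_val,
# and touch X_test only once, for the final performance estimate.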


k-fold Cross-validation
• Divide the training data into k equally-sized subsets
  • Randomize the order first, if necessary
• Train the model on subsets 2, 3, …, k
• Hold out subset 1 for testing the model
• Repeat with subsets 2, 3, …, k each serving in turn as the testing set
• Stratified k-fold cross-validation keeps the class proportions the same in every subset
• Average performance over the k runs (accuracy, …); see the sketch below
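A sketch of stratified k-fold cross-validation with scikit-learn; the synthetic data and the choice of logistic regression are placeholders for whatever model is being validated:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)  # stand-in data

# shuffle=True randomizes the order before the k subsets are formed.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
print(scores)         # accuracy on each of the k held-out subsets
print(scores.mean())  # average performance over the k runs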

Comparing predicted to actual

[Figure: the confusion matrix, also presented as a classification table]


Performance: accuracy

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Performance: precision

Precision = TP / (TP + FP)


Performance: sensitivity (recall)

Sensitivity (Recall) = TP / (TP + FN)

Performance: specificity

Specificity = TN / (TN + FP)


Accuracy, precision, sensitivity and specificity

                               Actual
                     Positive             Negative
Predicted  Positive  True Positive (TP)   False Positive (FP)
           Negative  False Negative (FN)  True Negative (TN)

Accuracy              (TP + TN) / (TP + TN + FP + FN)
Precision             TP / (TP + FP)
Sensitivity (Recall)  TP / (TP + FN)
Specificity           TN / (TN + FP)

In the Diabetes context

                             Predicted
                   Diabetic              Healthy
Actual  Diabetic   True Positive (TP)    False Negative (FN)
                   153                   115
        Healthy    False Positive (FP)   True Negative (TN)
                   60                    440

Accuracy              (TP + TN) / (TP + TN + FP + FN) = 593 / 768 = 0.772
Precision             TP / (TP + FP) = 153 / 213 = 0.718
Sensitivity (Recall)  TP / (TP + FN) = 153 / 268 = 0.571
Specificity           TN / (TN + FP) = 440 / 500 = 0.880
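The same arithmetic in a few lines of Python, using the counts from the table above:

# Counts from the diabetes confusion matrix above.
TP, FN, FP, TN = 153, 115, 60, 440

accuracy    = (TP + TN) / (TP + TN + FP + FN)  # 0.772
precision   = TP / (TP + FP)                   # 0.718
sensitivity = TP / (TP + FN)                   # 0.571 (recall)
specificity = TN / (TN + FP)                   # 0.880

print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={sensitivity:.3f}  specificity={specificity:.3f}")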


Jamovi output: Classification table

Results
Classification Table – …
                   Predicted
Observed           tested_negative  tested_positive  % Correct
tested_negative    445              55               89.0
tested_positive    112              156              58.2
Note. The cut-off value is set to 0.5

Results
Predictive Measures
Accuracy  Specificity  Sensitivity
0.783     0.890        0.582
Note. The cut-off value is set to 0.5

Accuracy of classification: Logistic Regression

[Classifier output with the overall accuracy highlighted]


Confusion Matrix: Logistic Regression

[Confusion matrix annotated to show where specificity, precision, and sensitivity are read off]

F-measure
• Harmonic mean of precision and recall:
  F1 = 2 · Precision · Recall / (Precision + Recall)
• More generally,
  F_b = (1 + b²) · Precision · Recall / (b² · Precision + Recall)
• b < 1 focuses on precision, while b > 1 emphasizes recall (see the sketch below)
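A small sketch of the F-measure formula, applied to the precision and recall from the diabetes example above; the helper name f_beta is ours, not from the slides:

def f_beta(precision, recall, beta=1.0):
    """F-measure; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.718, 0.571            # diabetes example above
print(f_beta(p, r))            # F1, approximately 0.636
print(f_beta(p, r, beta=0.5))  # b < 1: leans toward precision
print(f_beta(p, r, beta=2.0))  # b > 1: leans toward recall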


MCC (Matthews correlation coefficient)

• It can be calculated from the confusion matrix as:
  MCC = (TP · TN - FP · FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
• MCC ranges from -1 to +1, with 0 corresponding to random guessing
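The formula as a Python function, again with the diabetes counts; scikit-learn's matthews_corrcoef computes the same quantity from label vectors:

from math import sqrt

def mcc(TP, FP, FN, TN):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = TP * TN - FP * FN
    den = sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    return num / den if den else 0.0

print(mcc(TP=153, FP=60, FN=115, TN=440))  # about 0.48 for the diabetes example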

ROC Curves
• ROC is an abbreviation of Receiver Operating Characteristic, a term from
  signal detection theory developed during World War II (for the analysis
  of radar images).
• In the context of classifiers, a ROC plot is a useful tool for studying
  • the behavior of a classifier, or
  • comparing two or more classifiers.
• A ROC plot is a two-dimensional graph where the x-axis represents the
  FP rate (FPR) and the y-axis represents the TP rate (TPR); a sketch of
  how the points are computed follows below.
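A sketch of obtaining the (FPR, TPR) points behind a ROC plot with scikit-learn, using synthetic stand-in data:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)  # stand-in data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Predicted probabilities for the positive class.
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, probs)  # x-axis: FPR, y-axis: TPR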


Comparing classifiers using ROC Plot

• We can use the "area under the curve" (AUC) to compare two or more classifiers
• If a model is perfect, then its AUC = 1
• If a model simply performs random guessing, then its AUC = 0.5
• A model that is strictly better than another has the larger AUC
• Here, C3 is best, and C2 is better than C1, as AUC(C3) > AUC(C2) > AUC(C1)

[Figure: ROC curves for classifiers C1, C2, and C3]
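A sketch of comparing two classifiers by AUC; the models echo the Logistic and KNN-5 entries in the table on the next slide, but the data here is synthetic:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, random_state=1)  # stand-in data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

for name, model in [("Logistic", LogisticRegression(max_iter=1000)),
                    ("KNN-5", KNeighborsClassifier(n_neighbors=5))]:
    p = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    print(name, round(roc_auc_score(y_te, p), 3))  # larger AUC wins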

Comparison of Area under the ROC curve (AUC)

Classifier  Logistic  Discriminant  KNN-5  Naïve Bayes  Decision Tree  Decision Rules
AUC         0.832     0.832         0.766  0.819        0.751          0.739

Amit’s Grades
AUC > 0.9       Excellent
AUC 0.8 to 0.9  Very Good
AUC 0.7 to 0.8  Good
AUC 0.6 to 0.7  Needs Improvement
AUC 0.5 to 0.6  Hopeless


Multiway Classification: The Iris dataset

SepalLength  SepalWidth  PetalLength  PetalWidth  Species
5.1          3.5         1.4          0.2         Iris-setosa
4.9          3.0         1.4          0.2         Iris-setosa
4.7          3.2         1.3          0.2         Iris-setosa
4.6          3.1         1.5          0.2         Iris-setosa
5.0          3.6         1.4          0.2         Iris-setosa
7.0          3.2         4.7          1.4         Iris-versicolor
6.4          3.2         4.5          1.5         Iris-versicolor
6.9          3.1         4.9          1.5         Iris-versicolor
5.5          2.3         4.0          1.3         Iris-versicolor
6.5          2.8         4.6          1.5         Iris-versicolor
6.3          3.3         6.0          2.5         Iris-virginica
5.8          2.7         5.1          1.9         Iris-virginica
7.1          3.0         5.9          2.1         Iris-virginica
6.3          2.9         5.6          1.8         Iris-virginica
6.5          3.0         5.8          2.2         Iris-virginica

Multinomial Logistic Regression

Model Coefficients - Species

Species                        Predictor    Estimate  SE     Z        p      Odds ratio
Iris-versicolor - Iris-setosa  Intercept    18.68     30.3   0.6165   0.538  1.30e+8
                               PetalWidth   -3.09     39.7   -0.0779  0.938  0.04535
                               PetalLength  13.95     52.6   0.2655   0.791  1.15e+6
                               SepalWidth   -8.65     134.2  -0.0645  0.949  1.75e-4
                               SepalLength  -5.32     76.7   -0.0694  0.945  0.00488
Iris-virginica - Iris-setosa   Intercept    -23.70    31.2   -0.7594  0.448  5.10e-11
                               PetalWidth   15.10     40.2   0.3756   0.707  3.61e+6
                               PetalLength  23.34     52.9   0.4415   0.659  1.37e+10
                               SepalWidth   -15.31    134.2  -0.1140  0.909  2.25e-7
                               SepalLength  -7.78     76.7   -0.1015  0.919  4.17e-4
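For comparison, a sketch of fitting a multinomial logistic regression to the iris data in Python; note that scikit-learn regularizes by default and parameterizes coefficients differently from jamovi's Iris-setosa-reference contrasts, so the estimates will not match the table above:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # 150 rows, 4 predictors, 3 species
clf = LogisticRegression(max_iter=1000).fit(X, y)  # multinomial with 3 classes

print(clf.classes_)    # 0 = setosa, 1 = versicolor, 2 = virginica
print(clf.coef_)       # one row of coefficients per class
print(clf.intercept_)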


Multiway classification (Weka)

Logistic Regression with ridge parameter of 1.0E-8

Coefficients...
                 Class
Variable         Iris-setosa  Iris-versicolor
===============================================
SepalLength      21.8065      2.4652
SepalWidth       4.5648       6.6809
PetalLength      -26.3083     -9.4293
PetalWidth       -43.887      -18.2859
Intercept        8.1743       42.637

=== Confusion Matrix ===

  a  b  c   <-- classified as
 50  0  0 |  a = Iris-setosa
  0 46  4 |  b = Iris-versicolor
  0  2 48 |  c = Iris-virginica
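A sketch reproducing a 3x3 training-set confusion matrix like Weka's, again with scikit-learn; the exact counts may differ from the Weka run:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)
pred = LogisticRegression(max_iter=1000).fit(X, y).predict(X)
print(confusion_matrix(y, pred))  # rows: actual species, columns: predicted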

Separability of classes
