Data Preprocessing and Model Evaluation

The document discusses key concepts in data preprocessing and model evaluation, emphasizing the importance of metrics for both supervised and unsupervised learning. It outlines various evaluation metrics for classification (Accuracy, Precision, Recall, F1-Score) and regression models (MAE, MSE), as well as issues like overfitting and underfitting. Additionally, it highlights hyperparameter tuning and ensemble methods as strategies to enhance model performance.


Model Evaluation: A critical step to determine how well your model is performing. This
involves metrics like:
For Supervised Learning: Accuracy, Precision, Recall, F1-Score, Mean Squared Error
(MSE), etc.
For Unsupervised Learning: Metrics like Silhouette Score or Adjusted Rand Index.
Overfitting and Underfitting: Understanding these two common problems is crucial.

Overfitting: When a model learns the training data too well, including its noise, and
performs poorly on new data.
Underfitting: When a model is too simple to capture the underlying patterns in the
data.
Hyperparameter Tuning: The process of adjusting a model's settings (hyperparameters) to
optimize its performance.
Ensemble Methods: Combining multiple models to create a single, more powerful
model. This includes techniques like Bagging (e.g., Random Forest) and Boosting (e.g.,
XGBoost, LightGBM); a short sketch combining tuning and an ensemble follows this list.
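
To make the last two ideas concrete, here is a minimal sketch (assuming scikit-learn; the synthetic dataset and the parameter grid are invented for illustration) that tunes a Random Forest, a Bagging-style ensemble, with a simple grid search:

# Minimal sketch: hyperparameter tuning of an ensemble model (illustrative values only).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Try a few hyperparameter settings and keep the best by cross-validated accuracy.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best hyperparameters:", search.best_params_)
print("Held-out accuracy:", search.best_estimator_.score(X_test, y_test))

In a setup like this, a large gap between training accuracy and held-out accuracy is a typical sign of overfitting, while a low score on both suggests underfitting.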

Do I need to learn all of the above topics?

Of course. Here are notes on the key evaluation metrics for supervised learning.

Evaluation Metrics for Supervised Learning


Evaluation metrics are crucial for quantifying the performance of a machine learning
model. They tell us how well the model is learning and how accurately it can make
predictions on new, unseen data. The choice of metric depends on the type of problem:
classification or regression.

Metrics for Classification Models


Classification models predict a category or class (e.g., "Spam" vs. "Not Spam", "Cat" vs.
"Dog").

1. Accuracy

What it is: The simplest metric. It measures the percentage of predictions that the
model got right.
Formula: (Number of Correct Predictions) / (Total Number of Predictions)
When to use it: Good for balanced datasets, where each class has a similar number of
examples.
Limitation: Can be misleading on imbalanced datasets. For example, if 95% of emails
are not spam, a model that always predicts "not spam" will be 95% accurate but is
completely useless.
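
To see this limitation in numbers, here is a small sketch (assuming scikit-learn; the 95/5 class split and the "always not spam" model are invented for illustration):

# Illustrative only: accuracy on an imbalanced dataset (95 "not spam" vs 5 "spam").
from sklearn.metrics import accuracy_score

y_true = [0] * 95 + [1] * 5   # 0 = not spam, 1 = spam
y_pred = [0] * 100            # a useless model that always predicts "not spam"

print(accuracy_score(y_true, y_pred))  # 0.95, even though no spam is ever caught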

2. Precision

What it is: Measures the accuracy of the positive predictions. Of all the times the
model predicted a positive outcome, how many were actually positive?
Formula: (True Positives) / (True Positives + False Positives)
When to use it: When the cost of a false positive is high.
Analogy 📧: In a spam filter, you want high precision. You would rather a spam email
get into your inbox (a false negative) than have an important email go to spam (a false
positive).
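
As a quick made-up illustration (assuming scikit-learn; the labels below are invented):

# Precision = TP / (TP + FP) on toy spam-filter predictions.
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = spam, 0 = not spam
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # contains one false positive and one false negative

# TP = 3, FP = 1, so precision = 3 / 4 = 0.75
print(precision_score(y_true, y_pred))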

3. Recall (Sensitivity)
What it is: Measures how well the model finds all the actual positives. Of all the actual
positive cases, how many did the model correctly identify?
Formula: (True Positives) / (True Positives + False Negatives)
When to use it: When the cost of a false negative is high.
Analogy 🩺: In medical screening for a disease, you want high recall. You would rather
tell a healthy person they might be sick (a false positive) than miss an actual case of
the disease (a false negative).
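
A matching made-up illustration for recall (again assuming scikit-learn; the screening results are invented):

# Recall = TP / (TP + FN) on toy screening results.
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 1 = has the disease
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]   # misses one real case, flags one healthy person

# TP = 3, FN = 1, so recall = 3 / 4 = 0.75
print(recall_score(y_true, y_pred))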

4. F1-Score
What it is: The harmonic mean of Precision and Recall. It provides a single score that
balances both concerns.
Formula: 2 * (Precision * Recall) / (Precision + Recall)
When to use it: When you need a balance between Precision and Recall, especially on
imbalanced datasets where accuracy is not a reliable metric. It's a good all-around
metric.
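
Reusing the toy spam-filter labels from the precision example above (illustrative only), the formula and scikit-learn give the same value:

# F1 is the harmonic mean of precision and recall.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)   # 0.75
r = recall_score(y_true, y_pred)      # 0.75
print(2 * p * r / (p + r))            # 0.75, from the formula
print(f1_score(y_true, y_pred))       # same value from scikit-learn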

Metrics for Regression Models


Regression models predict a continuous, numerical value (e.g., price, temperature).

1. MAE (Mean Absolute Error)


What it is: The average of the absolute differences between the predicted and actual
values.
Formula: (1/n) * Σ|Actual - Predicted|
Key Feature: It's easy to interpret because the error is in the same units as the output
variable (e.g., if you are predicting prices in dollars, an MAE of 5 means the predictions
are off by an average of $5). It is not very sensitive to outliers.
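
A tiny worked example (the prices are invented; assuming scikit-learn):

# MAE stays in the same units as the target (here, dollars).
from sklearn.metrics import mean_absolute_error

actual    = [100, 150, 200, 250]
predicted = [110, 140, 205, 260]

# (10 + 10 + 5 + 10) / 4 = 8.75
print(mean_absolute_error(actual, predicted))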

2. MSE (Mean Squared Error)


What it is: The average of the squared differences between the predicted and actual
values.
Formula: (1/n) * Σ(Actual - Predicted)²
Key Feature: It penalizes larger errors much more heavily than smaller ones. If
your model makes a very bad prediction, the MSE will be much higher. Its units are
"squared," making it less intuitive.
