Machine Learning Basics:
Supervised:
- Requires training data with independent variables and a dependent variable
(labelled data)
- Labelled data is needed to "supervise" the algorithm when learning from the data
- Regression Models
- Classification Models
Unsupervised:
- Requires training data with independent variables only
- No labelled data is needed to "supervise" the algorithm when learning from the data
- Clustering Models
- Outlier Detection Models
Regression:
- Can be used when the response variable to be predicted is a continuous
variable (scalar)
- Used to predict continuous values in prediction tasks
- e.g. price of a house based on location, etc
- For instance: evaluate with mean squared error
- Examples: Linear Regression, Fixed Effects Regression, XGBoost Regression
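A minimal sketch of fitting a simple linear regression with the closed-form least-squares solution (slope = covariance / variance); the house-price data below is made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimizing the residual sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: house size in square metres -> price (continuous target)
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 190.0, 230.0, 270.0]   # exactly price = 2*size + 50
intercept, slope = fit_simple_linear_regression(sizes, prices)
print(intercept, slope)   # 50.0 2.0 for this noise-free data
```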
Classification:
- Can be used when the response variable takes categorical values
- For instance: used for decision-making tasks
- Predicts categorical values: takes an input and assigns it to one of a set of
predetermined categories
- For instance: evaluate with accuracy; classify email as spam or non-spam,
identify the type of animal in an image
- Examples: Logistic Regression, XGBoost Classification, Random Forest
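A sketch of how a classifier maps an input to a predetermined category, using a logistic-style score. The feature weights below are hypothetical, not learned; a real logistic regression would fit them from labelled data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classify_email(num_links, has_suspicious_words, weights, bias):
    """Return 'spam' or 'non-spam' from a logistic-style score."""
    z = weights[0] * num_links + weights[1] * has_suspicious_words + bias
    p_spam = sigmoid(z)          # probability of the positive class
    return "spam" if p_spam >= 0.5 else "non-spam"

# Hypothetical coefficients for illustration only
weights, bias = [0.8, 2.0], -3.0
print(classify_email(5, 1, weights, bias))   # z = 3.0  -> spam
print(classify_email(0, 0, weights, bias))   # z = -3.0 -> non-spam
```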
Regression Performance Metrics:
- Calculate the difference between the predicted and true values => a lower
value means a better fit for the model
- RSS: Residual Sum of Squares:
RSS(Beta) = Sum from i=1 to N of ( Square of ( y(i) - y_hat(i) ) )
y(i) = ith observed value
y_hat(i) = model's predicted value for the ith observation
Beta = coefficients
- MSE: Mean Squared Error - penalizes large errors more than smaller ones
MSE = (1/N) * RSS
- RMSE: Root Mean Squared Error - reports the error in a way that is easier to
understand/explain (same units as the target)
RMSE = Square root of MSE
- MAE: Mean Absolute Error - used to penalize all errors equally
MAE = (1/N) * Sum from i=1 to N of ( abs( y(i) - y_hat(i) ) )
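The regression metrics above can be sketched in plain Python; the y_true/y_pred values are toy numbers for illustration.

```python
import math

def rss(y_true, y_pred):
    """Residual sum of squares."""
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))

def mse(y_true, y_pred):
    """Mean squared error: RSS averaged over N observations."""
    return rss(y_true, y_pred) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: every error weighted equally."""
    return sum(abs(y - yh) for y, yh in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(rss(y_true, y_pred))   # 1 + 0 + 4 = 5.0
print(mae(y_true, y_pred))   # (1 + 0 + 2) / 3 = 1.0
```

Note how the single error of 2 contributes 4 to the RSS but only 2 to the MAE's sum, which is the squared-vs-absolute penalty difference described above.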
Classification Performance Metrics:
- Accuracy: CorrectPrediction / (CorrectPrediction + IncorrectPrediction)
- Precision: TruePositive / (TruePositive + FalsePositive)
TruePositive: where the model correctly predicts the positive outcome
FalsePositive: where the model predicts positive but the actual outcome is
negative
- Recall: TruePositive / (TruePositive + FalseNegative)
- F1Score: 2 * (Recall * Precision) / (Recall + Precision) - Higher value is better
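The classification metrics above, computed from a toy confusion matrix; the counts are made up for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions over all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Of everything predicted positive, how much was truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Of everything truly positive, how much was found."""
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; higher is better."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Toy counts: 8 true positives, 2 false positives, 2 false negatives, 88 true negatives
tp, fp, fn, tn = 8, 2, 2, 88
print(accuracy(tp, tn, fp, fn))   # 96 / 100 = 0.96
print(precision(tp, fp))          # 8 / 10 = 0.8
print(recall(tp, fn))             # 8 / 10 = 0.8
print(f1_score(tp, fp, fn))       # ~0.8 (precision == recall here)
```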
Clustering Performance Metrics:
- Homogeneity - higher means clusters are more homogeneous (each cluster
contains only members of a single class)
Homogeneity(h) = 1 - (conditional entropy of the classes given the cluster
assignments) / (entropy of the classes)
- Silhouette Score - similarity of a data point to its own cluster compared to
other clusters
- Higher means the data point is well matched to its own cluster
- Used with DBSCAN / K-Means
s(o) = (b(o) - a(o)) / max{a(o), b(o)}
o = a data point
a(o) = average distance between o and the other data points in the cluster
that o belongs to
b(o) = minimum average distance from o to the points of each cluster that o
does not belong to
- Completeness:
- The degree to which all data points that belong to a particular class
are assigned to the same cluster
- Higher value indicates a more complete clustering
Completeness(c) = 1 - (conditional entropy of the cluster assignments given
the classes) / (entropy of the cluster assignments)
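The silhouette formula above can be sketched for a single point; the two tiny 1-D clusters below are made up for illustration.

```python
def silhouette(o, own_cluster, other_clusters):
    """s(o) = (b(o) - a(o)) / max(a(o), b(o)) for a 1-D point o."""
    # a(o): average distance to the other points in o's own cluster
    others = [p for p in own_cluster if p != o]
    a = sum(abs(o - p) for p in others) / len(others)
    # b(o): minimum of the average distances to each other cluster
    b = min(sum(abs(o - p) for p in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)

cluster_a = [1.0, 1.2]
cluster_b = [9.0, 9.4]
s = silhouette(1.0, cluster_a, [cluster_b])
print(round(s, 3))   # a = 0.2, b = 8.2 -> (8.2 - 0.2) / 8.2 ~= 0.976
```

The value close to 1 reflects that the point 1.0 is far closer to its own cluster than to the other one.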
ML Model Evaluation Steps:
1. Data Preparation: Split data into train, validation and test.
2. Model Training: Train the model on the training data and save the fitted model
3. Hyper-Parameter Tuning: Use the fitted model and the validation set to find
the optimal set of hyper-parameters where the model performs the best
4. Prediction: Retrain the model on the training data using the optimal
hyper-parameters found in the tuning stage, then use this best fitted model
to make predictions on the test data
5. Test Error Rate: Compute performance metrics for your model using the
predictions and the real values of the target variable from your test data
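The five steps above can be sketched end to end on a toy 1-D problem. The "model" here is a ridge-style estimator, slope = sum(x*y) / (sum(x^2) + lam), with lam as the hyper-parameter to tune; all data below is made up.

```python
def fit(xs, ys, lam):
    """Ridge-style 1-D fit: larger lam shrinks the slope toward 0."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(xs, ys, slope):
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# 1. Data preparation: split into train / validation / test
x_train, y_train = [1.0, 2.0, 3.0], [3.0, 6.0, 9.0]   # y = 3x exactly
x_val,   y_val   = [4.0, 5.0], [12.0, 15.0]
x_test,  y_test  = [6.0], [18.0]

# 2-3. Train on the training set for each candidate lam,
#      pick the one with the lowest validation error
best_lam = min([0.0, 1.0, 10.0],
               key=lambda lam: mse(x_val, y_val, fit(x_train, y_train, lam)))

# 4. Retrain with the best hyper-parameter, predict on the test set
slope = fit(x_train, y_train, best_lam)
predictions = [slope * x for x in x_test]

# 5. Test error rate: compare predictions with the true test targets
print(best_lam, mse(x_test, y_test, slope))   # 0.0 wins on this noise-free data
```

Because the toy data is noise-free, the unregularized model (lam = 0) gives the lowest validation error; with noisy data a larger lam could win instead.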
Pros:
- Simple model
- Low Variance
- Low Bias
- Provides probability estimates
Cons:
- Unable to model non-linear relationships
- Unstable when classes are well separated
- Unstable when there are more than 2 classes
Residual meaning - the difference between the predicted and the true value