Machine Learning Basics

The document outlines the basics of machine learning, focusing on supervised and unsupervised learning, along with regression and classification models. It details performance metrics for regression, classification, and clustering, as well as the steps for evaluating machine learning models. Additionally, it discusses the pros and cons of certain models, emphasizing the importance of understanding residuals in predictions.


Machine Learning Basics:

Supervised:
- Requires training data with independent variables and a dependent variable
(labelled data)
- Needs labelled data to "supervise" the algorithm while it learns from the data
- Regression Models
- Classification Models

Unsupervised:
- Requires training data with independent variables only
- No labelled data is needed to "supervise" the algorithm while it learns from the data
- Clustering Models
- Outlier Detection Models

Regression:
- Used when the response variable to be predicted is a continuous (scalar)
variable
- Predicts continuous values, e.g. the price of a house based on its location
- Typically evaluated with mean squared error
- Examples: Linear Regression, Fixed Effects Regression, XGBoost Regression
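
As a minimal sketch of regression, simple linear regression with one feature can be fitted with the least-squares closed form (the toy data below is purely illustrative):

```python
# Fit slope and intercept of y = slope * x + intercept by least squares.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data lies on y = 2x + 1
```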

Classification:
- Used when the response variable takes categorical values
- Predicts categorical values: takes an input and assigns it to one of a set of
predetermined categories
- Typically evaluated with accuracy; for instance, classifying email as spam or
non-spam, or identifying the type of animal in an image
- Examples: Logistic Regression, XGBoost Classification, Random Forest
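
To make the spam example concrete, here is a toy sketch of a classifier: it maps an input to one of a fixed set of categories. The keyword rule is purely illustrative, not a real model:

```python
# Assign each email to one of two predetermined categories.
SPAM_WORDS = {"winner", "free", "prize"}  # illustrative keyword list

def classify_email(text):
    words = set(text.lower().split())
    return "spam" if words & SPAM_WORDS else "non-spam"
```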

Regression Performance Metrics:


- Measure the difference between the predicted and true values => a lower value
means a better fit for the model

- RSS: Residual Sum of Squares

RSS(Beta) = Sum from i=1 to N of ( y(i) - y_hat(i) )^2

y(i) = ith observed value

y_hat(i) = model's predicted value for the ith observation

Beta = model coefficients

- MSE: Mean Squared Error - penalizes large errors more heavily than small ones

MSE = (1/N) * RSS

- RMSE: Root Mean Squared Error - reports the error in a way that is easier to
understand/explain (same units as the target)

RMSE = Square root of (MSE)

- MAE: Mean Absolute Error - penalizes all errors equally

MAE = (1/N) * Sum from i=1 to N of abs( y(i) - y_hat(i) )
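
The regression metrics above can be sketched in plain Python, with y_true as the observed values y(i) and y_pred as the model's predictions y_hat(i):

```python
import math

def rss(y_true, y_pred):
    # Residual Sum of Squares: sum of squared residuals
    return sum((yi - yh) ** 2 for yi, yh in zip(y_true, y_pred))

def mse(y_true, y_pred):
    # Mean Squared Error: RSS averaged over N observations
    return rss(y_true, y_pred) / len(y_true)

def rmse(y_true, y_pred):
    # Root Mean Squared Error: same units as the target
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    # Mean Absolute Error: penalizes all errors equally
    return sum(abs(yi - yh) for yi, yh in zip(y_true, y_pred)) / len(y_true)
```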


Classification Performance Metrics:

- Accuracy: CorrectPredictions / (CorrectPredictions + IncorrectPredictions)

- Precision: TruePositive / (TruePositive + FalsePositive)

TruePositive: the model correctly predicts the positive class

FalsePositive: the model incorrectly predicts the positive class

- Recall: TruePositive / (TruePositive + FalseNegative)

- F1 Score: 2 * (Recall * Precision) / (Recall + Precision) - higher is better
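
These classification metrics follow directly from the four confusion-matrix counts, as a minimal sketch:

```python
def accuracy(tp, tn, fp, fn):
    # Correct predictions (tp + tn) over all predictions
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    # Of everything predicted positive, how much was truly positive
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything truly positive, how much was found
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    # Harmonic mean of precision and recall
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)
```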

Clustering Performance Metrics:

- Homogeneity - higher means each cluster contains members of a single class

Homogeneity(h) = 1 - (conditional entropy of the true classes given the cluster
assignments) / (entropy of the true classes)

- Silhouette Score - similarity of a data point to its own cluster compared to
other clusters

- Higher means the data point is well matched to its own cluster
- Used with DBSCAN / K-Means

s(o) = (b(o) - a(o)) / max{a(o), b(o)}

o = a single data point

a(o) = average distance between o and the other data points in the cluster o belongs to
b(o) = minimum average distance from o to the points of each cluster o does not belong to
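
A minimal sketch of s(o) for a single point, assuming Euclidean distance and clusters given as lists of points:

```python
import math

def dist(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def silhouette(o, own_cluster, other_clusters):
    # a(o): average distance from o to the other points in its own cluster
    others = [p for p in own_cluster if p != o]
    a = sum(dist(o, p) for p in others) / len(others)
    # b(o): minimum over the other clusters of the average distance from o
    b = min(sum(dist(o, p) for p in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)
```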

- Completeness:
- The degree to which all data points that belong to a particular class
are assigned to the same cluster
- Higher value indicates a more complete clustering

Completeness(c) = 1 - (conditional entropy of the cluster assignments given the
true classes) / (entropy of the cluster assignments)
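
Homogeneity and completeness can be sketched directly from the entropy definitions above, given parallel lists of true class labels and cluster assignments:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def cond_entropy(labels, given):
    # H(labels | given): entropy of labels within each group, weighted by group size
    n = len(labels)
    groups = {}
    for label, g in zip(labels, given):
        groups.setdefault(g, []).append(label)
    return sum(len(ls) / n * entropy(ls) for ls in groups.values())

def homogeneity(classes, clusters):
    h = entropy(classes)
    return 1.0 if h == 0 else 1 - cond_entropy(classes, clusters) / h

def completeness(classes, clusters):
    h = entropy(clusters)
    return 1.0 if h == 0 else 1 - cond_entropy(clusters, classes) / h
```

A clustering that exactly recovers the classes scores 1.0 on both; lumping everything into one cluster is complete (no class is split) but not homogeneous.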

ML Model Evaluation Steps:

1. Data Preparation: Split the data into train, validation and test sets.
2. Model Training: Train the model on the training data and save the fitted model.
3. Hyper-Parameter Tuning: Use the fitted model and the validation set to find the
optimal set of hyper-parameters where the model performs the best.
4. Prediction: Retrain the model on the training data with the optimal
hyper-parameters from the tuning stage, then use this best fitted model to make
predictions on the test data.
5. Test Error Rate: Compute performance metrics for the model using the predictions
and the real values of the target variable from the test data.
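
The steps above can be sketched as a plain train/validation/test workflow. The `fit` "model" here is a deliberately trivial stand-in (it predicts the training mean scaled by a hyper-parameter), purely to show the flow of data through the five steps:

```python
import random

def fit(train, param):
    # Hypothetical model: predict the training-set mean of y, scaled by param
    mean = sum(y for _, y in train) / len(train)
    return lambda x: mean * param

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# 1. Data preparation: split into train / validation / test
random.seed(0)
data = [(x, 2 * x) for x in range(100)]  # toy labelled data
random.shuffle(data)
train, val, test = data[:60], data[60:80], data[80:]

# 2-3. Train on the training set, tune the hyper-parameter on the validation set
best_param = min([0.5, 1.0, 1.5], key=lambda p: mse(fit(train, p), val))

# 4. Refit with the best hyper-parameter, predict on the test set
final_model = fit(train, best_param)

# 5. Test error rate: metric on predictions vs real test targets
test_error = mse(final_model, test)
```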

Pros:
- Simple model
- Low variance
- Low bias
- Provides probability estimates

Cons:
- Unable to model non-linear relationships
- Unstable when the classes are well separated
- Unstable when there are more than 2 classes

Residual meaning - the difference between the predicted and the true value
