Module 3
Machine Learning (ML) differs from Traditional Programming in several fundamental ways. Here’s a
comparison:
1. Approach to Problem-Solving
Traditional Programming:
o Developers write explicit rules (logic) to process input data and produce output.
Machine Learning:
o Instead of writing rules, the system learns patterns from data to make predictions.
2. Handling Complexity
Traditional Programming:
o Struggles with complex, ambiguous tasks (e.g., image recognition, natural language
processing).
Machine Learning:
o Excels at problems where rules are hard to define (e.g., spam detection,
recommendation systems).
3. Adaptability
Traditional Programming:
o Requires manual updates if rules change (e.g., updating tax laws means rewriting
code).
Machine Learning:
o Adapts automatically when retrained on new data (e.g., a spam filter keeps up with new spam patterns without code rewrites).
4. Data Dependency
Traditional Programming:
o Relies on logic written by developers.
Machine Learning:
o Relies on large amounts of quality data; the model is only as good as the data it learns from.
5. Debugging and Interpretability
Traditional Programming:
o Easier to debug (the logic is explicit and traceable).
Machine Learning:
o Harder to debug (errors may come from data, model choice, or hyperparameters).
o Some models (e.g., deep learning) act as "black boxes" (hard to interpret).
6. Use Cases
Summary
Example: Predicting a house's price from features such as:
o Number of bedrooms
o Location
Loss Function:
Measures how well the model performs (the difference between predictions and true values).
Given:
A dataset D = {(x1, y1), (x2, y2), ..., (xn, yn)}
Goal:
Find a function h ∈ H that minimizes the expected loss over new, unseen data:
h* = argmin_{h ∈ H} E_{(x,y)}[L(h(x), y)]
Interpretation:
The model should generalize (perform well on unseen data, not just training data).
Avoid overfitting (memorizing training data) and underfitting (failing to learn patterns).
Problem Statement:
Predict the price of a house from its size.
Formalization:
1. Input (X): House sizes [1000, 1500, 2000, ...].
2. Output (Y): The corresponding house prices.
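To make this concrete, here is a minimal sketch that fits a linear hypothesis h(x) = wx + b to (size, price) pairs by least squares. The sizes and prices are made-up illustrative numbers, not data from this module.
Python code:
import numpy as np

# Hypothetical training data: house sizes (sq ft) and prices (in $1000s)
X = np.array([1000, 1500, 2000, 2500, 3000])
Y = np.array([200, 270, 340, 410, 480])

# Fit h(x) = w*x + b by minimizing squared loss (closed-form least squares)
w, b = np.polyfit(X, Y, deg=1)

print(f"h(x) = {w:.3f}*x + {b:.1f}")
print("Predicted price for 1800 sq ft:", w * 1800 + b)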
1. What is a Model?
A model is a function that maps input features (X) to output predictions (Ŷ).
Examples of Models:
Model               How it predicts                            Example use
Linear Regression   Ŷ = wX + b                                 Predicting house prices
Decision Tree       Splits data based on feature thresholds    Customer churn prediction
2. What are Parameters?
Parameters are the internal settings of a model that are learned from data during training.
The goal of training is to find the best parameters that minimize prediction errors.
Examples of Parameters:
Model            Parameters                        Role
Neural Network   Weights & biases in each neuron   Adjust how signals propagate
How Training Works:
1. Initialize Parameters
o Start with random or default values.
2. Make Predictions
o Compute Ŷ = Model(X).
3. Calculate Loss
o Compare predictions (Ŷ) with true values (Y) using a loss function (e.g., Mean Squared Error).
4. Update Parameters
5. Repeat
o Iterate steps 2-4 until the loss stops improving (see the sketch below).
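A minimal sketch of this loop for a linear model trained with gradient descent on Mean Squared Error; the data, learning rate, and epoch count are illustrative assumptions:
Python code:
import numpy as np

# Illustrative data: roughly Y = 2X + 1 with noise
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])

w, b = 0.0, 0.0   # 1. Initialize parameters
lr = 0.05         # learning rate (a hyperparameter, set before training)

for epoch in range(1000):                 # 5. Repeat
    Y_hat = w * X + b                     # 2. Make predictions
    loss = np.mean((Y - Y_hat) ** 2)      # 3. Calculate loss (MSE)
    dw = -2 * np.mean((Y - Y_hat) * X)    # 4. Update parameters via the
    db = -2 * np.mean(Y - Y_hat)          #    gradient of the MSE
    w -= lr * dw
    b -= lr * db

print(f"Learned parameters: w = {w:.2f}, b = {b:.2f}")  # close to 2 and 1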
Parameters                                         Hyperparameters
Learned from data (e.g., weights in a neural       Set before training (e.g., learning rate, number
network, coefficients in linear regression).       of trees in a random forest).
To build a reliable machine learning model, data is typically split into three distinct sets:
1. Training Data
2. Validation Data
3. Test Data
1. Training Data
Purpose
Used to fit the model's parameters; this is the data the model actually learns from.
Characteristics
The larger the training set, the better the model can learn (but needs to be balanced with
validation/test data).
Example
In image classification, the model sees labeled images (e.g., "cat" or "dog") and adjusts its
weights to minimize prediction errors.
2. Validation Data
Purpose
Used to tune hyperparameters (e.g., learning rate, number of layers in a neural network).
Characteristics
Not used to update the model's parameters directly; it gives feedback for tuning during development.
Example
Trying different learning rates (0.01 vs. 0.001) and picking the one that performs best on the validation set.
3. Test Data
Purpose
Used for the final, unbiased evaluation of the model.
Characteristics
Used only once, after all training and hyperparameter tuning are complete.
Example
After training a spam classifier, you evaluate its accuracy on a held-out test set of emails.
Problem                                                      Solution
Overfitting (model works well on training data but fails    Validation set checks performance
on new data)                                                 during training.
Example: If 20% of data is "spam," each set (train/val/test) keeps ~20% spam.
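A minimal sketch of a stratified 60/20/20 split with scikit-learn; the toy data and proportions are illustrative choices:
Python code:
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples, ~20% positive ("spam") class
X = np.random.rand(100, 5)
y = np.array([1] * 20 + [0] * 80)

# Carve out the test set first (20%), preserving class proportions
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# Split the remainder into train (60%) and validation (20% overall)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, stratify=y_temp, random_state=42)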
Key Functions:
1. Control Decisions
o Parameters control how the model makes decisions (e.g., weights in a neural
network, coefficients in linear regression).
2. Optimize Predictions
3. Capture Patterns
Types of Parameters
Key Tradeoffs
1. Bias-Variance Tradeoff
o More parameters → Lower bias (fits training well) but higher variance (overfitting).
o Linear models (few params) are interpretable; deep learning (many params) is
powerful but opaque.
1. Classification Metrics
(A) Confusion Matrix
A table showing:
              Predicted: Yes   Predicted: No
Actual: Yes   TP               FN
Actual: No    FP               TN
(B) Accuracy
Formula:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
(C) Precision
Formula: Precision = TP / (TP + FP)
Use Case: Important when FP are costly (e.g., falsely flagging legit emails as spam).
(D) Recall
Formula: Recall = TP / (TP + FN)
Use Case: Important when FN are costly (e.g., missing a cancer diagnosis).
(E) F1-Score
Formula: F1 = 2 × (Precision × Recall) / (Precision + Recall)
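The four formulas in one small sketch; TP, FP, and FN match the email-classifier example at the end of this module, while TN = 85 is an assumed value for illustration:
Python code:
TP, FP, FN, TN = 8, 5, 2, 85  # TN is an assumption, not from the example

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy={accuracy:.3f}, Precision={precision:.3f}, "
      f"Recall={recall:.3f}, F1={f1:.3f}")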
2. Regression Metrics
Used when the output is a continuous value (e.g., house price, temperature).
Formula: MAE = (1/n) Σ |yi − ŷi|
Formula: MSE = (1/n) Σ (yi − ŷi)²
3. Clustering Metrics
Silhouette Score: Measures how similar a sample is to its own cluster vs. other clusters.
4. Key Takeaways
1. Accuracy
Best for: Balanced datasets where all classes are equally important
2. Precision
When to use: When false positives are costly (e.g., spam detection)
Example: High precision means when your model says "spam", it's very likely correct
3. Recall (Sensitivity)
When to use: When false negatives are dangerous (e.g., cancer detection)
Example: High recall means your model finds most actual positive cases
4. Confusion Matrix
                   Predicted
                   Positive   Negative
Actual: Positive   TP         FN
Actual: Negative   FP         TN
5. Bias-Variance Tradeoff
o High bias = model is too simple (e.g., linear model for complex data)
o High variance = model is too complex (it fits noise in the training data)
6. Overfitting vs Underfitting
                       Overfitting             Underfitting
Training Performance   Excellent               Poor
Solutions              - Regularization        - More complex model
                       - More training data    - More features
                       - Feature selection     - Longer training
                       - Early stopping
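As a sketch of one overfitting remedy from the table, regularization: ridge regression penalizes large weights, which tames a model that is too flexible. The degree-15 polynomial, noise level, and alpha are illustrative assumptions:
Python code:
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 20)

# A degree-15 polynomial easily overfits 20 noisy points
overfit = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X, y)
# The same model with an L2 penalty on the weights
ridge = make_pipeline(PolynomialFeatures(15), Ridge(alpha=0.01)).fit(X, y)

X_new = np.array([[0.05], [0.5], [0.95]])
print("Unregularized:", overfit.predict(X_new))
print("Ridge:", ridge.predict(X_new))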
Practical Implications
Model selection: Use validation set to find sweet spot in bias-variance tradeoff
Debugging: high training error suggests underfitting; a large gap between training and validation error suggests overfitting.
Real-world Example
Overfitting = Model flags transactions as fraud based on random quirks in training data
Underfitting = Model misses obvious fraud because its rules are too simple
1. Supervised Learning
Task             Output                Examples
Classification   Discrete categories   Spam detection, image recognition
Algorithms:
Linear/Logistic Regression
Pros:
✔ Predictions are interpretable
✔ Well-established techniques
Cons:
❌ Requires labeled data (often expensive)
❌ May not generalize beyond training distribution
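A minimal supervised-learning sketch with scikit-learn; the synthetic dataset stands in for real labeled data:
Python code:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic labeled data (features X, labels y)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))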
2. Unsupervised Learning
Algorithms:
K-Means, DBSCAN
Autoencoders
Pros:
✔ Works with unlabeled data
✔ Reveals hidden insights
Cons:
❌ Harder to evaluate objectively
❌ Results may be ambiguous
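A minimal unsupervised sketch: K-Means finds clusters in unlabeled data, and the silhouette score (see the metrics section above) gauges cluster quality. The two-blob data is illustrative:
Python code:
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Unlabeled toy data: two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster sizes:", np.bincount(km.labels_))
print("Silhouette score:", silhouette_score(X, km.labels_))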
3. Semi-Supervised Learning
Applications:
Medical imaging
Speech recognition
Approaches:
Self-training
Co-training
Pros:
✔ Reduces labeling costs
✔ More robust than pure supervised
Cons:
❌ Complex implementation
❌ Quality depends on initial labeled data
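A minimal self-training sketch with scikit-learn's SelfTrainingClassifier, which follows the library's convention of marking unlabeled samples with -1; keeping only ~10% of the labels is an illustrative choice:
Python code:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Pretend only ~10% of labels are known; the rest are marked -1
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) > 0.1] = -1

# Self-training: iteratively pseudo-labels confident unlabeled samples
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)
print("Accuracy on all data:", model.score(X, y))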
4. Reinforcement Learning
Applications:
Robotics control
Autonomous vehicles
Algorithms:
Q-Learning
Pros:
✔ Can handle complex, dynamic environments
✔ Doesn't require pre-labeled data
Cons:
❌ Computationally expensive
❌ Hard to design proper reward functions
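A minimal tabular Q-learning sketch on a made-up 5-state corridor (move left or right, reward only at the goal); the hyperparameters are illustrative:
Python code:
import numpy as np

n_states, n_actions = 5, 2              # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != n_states - 1:            # the last state is the goal
        # Epsilon-greedy action selection
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(Q[s]))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update rule
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print("Greedy policy (0=left, 1=right):", np.argmax(Q[:-1], axis=1))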
5. Self-Supervised Learning
Definition: The model creates its own labels from the structure of the data (e.g., predicting masked words in a sentence).
Pros:
✔ Eliminates manual labeling
✔ Powerful for pre-training
Cons:
❌ Requires massive data
❌ Task-specific design needed
6. Transfer Learning
Definition: Applies knowledge from one task to another
Approach:
Fine-tune a pre-trained model on the new task, or use it as a fixed feature extractor.
Applications:
All domains.
Pros:
✔ Saves computation time
✔ Works well with limited target data
Cons:
❌ Potential negative transfer if domains mismatch
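A common fine-tuning sketch in PyTorch/torchvision (assuming a recent torchvision with the weights API; the 10-class head is an illustrative choice):
Python code:
import torch.nn as nn
from torchvision import models

# Load a network pre-trained on ImageNet
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pre-trained feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a fresh head for the target task
model.fc = nn.Linear(model.fc.in_features, 10)
# During training, only model.fc's parameters are updated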
7. Memory-Based Learning
Definition: Systems that store and retrieve specific training instances to make predictions.
Key Types:
1. Instance-Based Learning: predictions are made by comparing new inputs directly to stored training examples (e.g., k-NN).
2. Case-Based Reasoning: new problems are solved by retrieving and adapting solutions from similar past cases.
Characteristics:
Pros:
✔ Adapts easily to new data
✔ Handles complex relationships
Cons:
❌ Computationally expensive at runtime
❌ Sensitive to irrelevant features
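A minimal instance-based sketch: k-NN stores the training set and predicts by majority vote among the k nearest stored examples (the Iris dataset and k = 5 are illustrative choices):
Python code:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Training" just stores the instances; all work happens at prediction time
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))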
8. Hebbian Learning
Core Principle: "Neurons that fire together, wire together" (Donald Hebb, 1949)
Mechanism:
If two connected neurons activate together:
o Connection strengthens
If activation is uncorrelated:
o Connection weakens
Mathematical Form:
Δw_ij = η x_i x_j
(where η is the learning rate, and x_i and x_j are the neuron activations)
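A minimal NumPy sketch of this rule; the activations are made-up values. Note that the plain rule lets weights grow without bound, which is why variants such as BCM add normalization or thresholds:
Python code:
import numpy as np

eta = 0.1                        # learning rate
x = np.array([1.0, 0.0, 1.0])    # pre-synaptic activations (illustrative)
y = np.array([1.0, 1.0])         # post-synaptic activations (illustrative)

W = np.zeros((len(y), len(x)))   # weight w_ij from input j to output i
for _ in range(10):
    W += eta * np.outer(y, x)    # Hebbian update: co-active pairs strengthen

print(W)  # weights grow only where both neurons are active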
Applications:
Neuromorphic computing
Modern Variants:
A. Competitive Learning
Used in:
o Self-organizing maps (SOMs)
o Vector quantization
B. Error-Corrective Learning
Includes:
o Perceptron Learning Rule
o Delta Rule
2. Actor-Critic Methods:
D. Meta-Learning
"Learning to learn"
Includes:
E. Neuromodulated Learning
Enables:
o Context-dependent learning
Summary Comparison
Learning Type | Supervision | Mechanism | Strengths | Weaknesses | Best Use Cases | Algorithms
Unsupervised | No labels | Discovers hidden patterns | Works with unlabeled data | Hard to evaluate | Clustering, Dimensionality Reduction | K-Means, PCA, GANs
Semi-Supervised | Partial labels | Uses both labeled/unlabeled data | Reduces labeling costs | Complex implementation | Medical imaging, Speech recognition | Label Propagation, Self-Training
Reinforcement | Reward signals | Maximizes cumulative reward | Handles dynamic environments | Needs careful reward design | Game AI, Robotics | Q-Learning, PPO
Self-Supervised | Auto-generated | Creates labels from data structure | Eliminates manual labeling | Requires massive data | NLP, Computer Vision | BERT, Contrastive Learning
Transfer Learning | Varies | Leverages pre-trained models | Saves computation, works with little data | Domain mismatch risk | All domains | Fine-tuning, Feature Extraction
Memory-Based | Varies | Stores/retrieves instances | Adapts to new data easily | Computationally expensive | Recommendation systems | k-NN, Case-Based Reasoning
Hebbian | None | Strengthens co-active neuron connections | Biologically plausible | Limited to simple tasks | Neuromorphic computing | BCM Theory
Meta-Learning | Multi-task | Learns learning strategies | Fast adaptation to new tasks | Complex, data-hungry | Few-shot learning | MAML, Reptile
                       Predicted: Negative (0)   Predicted: Positive (1)
Actual: Negative (0)   True Negative (TN)        False Positive (FP)
Actual: Positive (1)   False Negative (FN)       True Positive (TP)
False Positives (FP): Negative samples wrongly predicted as positive (Type I error).
False Negatives (FN): Positive samples wrongly predicted as negative (Type II error).
Example:
Model Predictions (email spam classifier; counts used in the calculations below):
             Predicted: 0   Predicted: 1
Actual: 0    TN             FP = 5
Actual: 1    FN = 2         TP = 8
Python code:
from sklearn.metrics import confusion_matrix

# y_true and y_pred are arrays of true and predicted class labels
cm = confusion_matrix(y_true, y_pred)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × (Precision × Recall) / (Precision + Recall)
Example Calculations (from the email classifier):
Precision = 8 / (8 + 5) = 61.5%
Recall = 8 / (8 + 2) = 80%
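The F1-Score then follows from these two values: F1 = 2 × (0.615 × 0.80) / (0.615 + 0.80) ≈ 69.6%.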