MODULE 1
SHORT NOTES
Module-1 (Overview of machine learning) Introduction to Machine Learning, Machine learning
paradigms-supervised, semi-supervised, unsupervised, reinforcement learning. Supervised
learning- Input representation, Hypothesis class, Version space, Vapnik–Chervonenkis (VC)
Dimension, Probably Approximately Correct Learning (PAC), Noise, Learning Multiple classes,
Model Selection and Generalization
1. Introduction to Machine Learning
Machine Learning (ML) is a subfield of Artificial Intelligence that focuses on creating algorithms
that learn patterns from data and make predictions or decisions without being explicitly
programmed.
● The ML process involves collecting data → training a model → testing on new data.
● It improves automatically with more data and better algorithms.
● Example: Spam filter learns from examples of spam and non-spam emails.
2. Machine Learning Paradigms
2.1 Supervised Learning
● Definition: Learning from labeled datasets where both input (X) and output (Y) are
known.
● Goal: Learn a mapping f: X → Y to predict Y for new X.
● Types:
○ Regression → predicts continuous values (e.g., temperature).
○ Classification → predicts categories (e.g., “pass” or “fail”).
● Example: Predicting house prices from size, location, and number of rooms.
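The house-price example can be sketched as a tiny supervised regression. This is a minimal illustration with made-up data and a single feature (size), fitting y = m·x + c by ordinary least squares:

```python
# Minimal supervised-learning sketch: fit y = m*x + c by least squares.
# The sizes and prices below are made-up illustration data, not real figures.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    c = my - m * mx
    return m, c

sizes  = [50, 70, 90, 110]        # input X (square metres)
prices = [100, 140, 180, 220]     # output Y (price, in thousands)
m, c = fit_line(sizes, prices)    # learned mapping f(x) = m*x + c
print(m, c)
print(m * 80 + c)                 # predict the price of an unseen 80 m² house
```

Training "learns the mapping" (here, m and c); prediction applies it to a new X that was never in the training set.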
2.2 Semi-Supervised Learning
● Uses a small labeled dataset + a large unlabeled dataset.
● Useful when labeling data is expensive or time-consuming.
● Labeled data guides the learning, unlabeled data helps improve accuracy.
● Example: Language translation with a few manually translated sentences and many
untranslated ones.
2.3 Unsupervised Learning
● Learns from unlabeled data (no output labels given).
● Goal: Find hidden patterns, clusters, or structures.
● Example:
○ Clustering: Group customers based on buying behavior.
○ Dimensionality Reduction: Reduce features while keeping important info (PCA).
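The clustering idea can be sketched with a toy k-means on one feature (yearly spend per customer). The data, the number of clusters (k = 2), and the starting centroids are all illustrative assumptions:

```python
# Toy k-means sketch (k = 2, one feature): group customers by yearly spend.
# Data, k, and initial centroids are illustrative assumptions.
def kmeans(xs, c1, c2, iters=10):
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        # update step: recompute each centroid as its group's mean
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return c1, c2

spend = [10, 12, 11, 80, 85, 82]   # two natural spending groups, no labels
c1, c2 = kmeans(spend, 10.0, 80.0)
print(c1, c2)
```

No labels were given; the algorithm discovers the low-spend and high-spend groups from the data's structure alone.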
2.4 Reinforcement Learning
● Learn by interacting with an environment and receiving rewards or penalties.
● Key terms:
○ Agent → learner/decision-maker.
○ Environment → system where agent acts.
○ Reward → positive or negative feedback.
● Example: Game-playing AI that learns to win by trial and error.
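The agent/environment/reward loop can be sketched with a two-armed bandit and an epsilon-greedy agent. The payout probabilities below are assumed, and the exploration rate (0.1) is an arbitrary choice:

```python
import random

# Reinforcement-learning sketch: an epsilon-greedy agent learning by trial
# and error which of two slot-machine arms pays more.
# The payout probabilities and exploration rate are assumed for illustration.
random.seed(0)
true_payout = [0.2, 0.8]   # environment: hidden reward probability per arm
q = [0.0, 0.0]             # agent's value estimate per arm
n = [0, 0]                 # pulls per arm

for step in range(2000):
    # explore with probability 0.1, otherwise exploit the best-looking arm
    arm = random.randrange(2) if random.random() < 0.1 else q.index(max(q))
    reward = 1 if random.random() < true_payout[arm] else 0  # env feedback
    n[arm] += 1
    q[arm] += (reward - q[arm]) / n[arm]   # incremental average update

print(q.index(max(q)))     # the agent should come to prefer the better arm
```

The agent is never told which arm is better; it infers it from rewards alone — the same trial-and-error principle behind game-playing AIs, at a much smaller scale.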
3. Supervised Learning – Key Concepts
Prepared by: Prof Merlin Joshi
3.1 Input Representation
● How data is presented to the model (features).
● Good feature selection improves performance.
● Types:
○ Numeric → Age, height, salary.
○ Categorical → Gender, color, city (often converted to numeric using encoding).
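Converting a categorical feature to numeric form is often done with one-hot encoding: one binary column per category. A minimal sketch, with illustrative category names:

```python
# One-hot encoding sketch: turn a categorical value into binary columns,
# one per known category. Category names are illustrative.
def one_hot(value, categories):
    return [1 if value == c else 0 for c in categories]

cities = ["Delhi", "Mumbai", "Chennai"]
print(one_hot("Mumbai", cities))   # exactly one column is "hot"
```

This avoids imposing a false numeric order (e.g., coding cities as 0, 1, 2 would wrongly suggest Chennai > Mumbai > Delhi).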
3.2 Hypothesis Class
● The complete set of functions the learning algorithm can pick from.
● Example: For linear regression, the hypothesis class is the set of all straight-line
equations y = mx + c.
● Bigger hypothesis class → more flexibility but risk of overfitting.
3.3 Version Space
● The subset of hypotheses from the hypothesis class that fit all training examples.
● As we get more training data, wrong hypotheses are removed, and the version space
becomes smaller.
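The shrinking of the version space can be made concrete with a toy finite hypothesis class: thresholds t, where hypothesis h_t predicts positive when x ≥ t. The candidate thresholds and examples below are made up for illustration:

```python
# Version-space sketch: hypotheses are thresholds t meaning "positive iff x >= t".
# Thresholds and training examples are made up for illustration.
def consistent(t, examples):
    return all((x >= t) == label for x, label in examples)

hypotheses = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # finite hypothesis class

data = [(2, False)]                         # one training example
vs = [t for t in hypotheses if consistent(t, data)]
print(len(vs))                              # hypotheses still consistent

data.append((7, True))                      # a second example arrives
vs = [t for t in hypotheses if consistent(t, data)]
print(len(vs))                              # version space has shrunk
```

Each new example eliminates every hypothesis that disagrees with it, so the version space can only shrink (or stay the same) as data accumulates.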
3.4 VC Dimension
● Vapnik–Chervonenkis (VC) dimension measures the capacity of a hypothesis class.
● Higher VC → can fit more complex patterns.
● Example: A straight line in 2D can shatter at most 3 points — it can realize every
possible labelling of 3 points in general position, but not of any 4 — so VC = 3.
3.5 PAC Learning
Probably Approximately Correct learning framework.
States that a learning algorithm should produce a hypothesis that:
● Has error ≤ ϵ (approximately correct).
● With probability ≥ 1 − δ (probably correct).
Example: With 95% confidence (δ = 0.05), the error is at most 5% (ϵ = 0.05).
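For a finite hypothesis class H, a standard PAC bound says that m ≥ (1/ϵ)(ln|H| + ln(1/δ)) training examples suffice (for a consistent learner). A quick computation, with |H| = 1000 as an assumed class size:

```python
import math

# PAC sample-size sketch for a finite hypothesis class, using the standard
# bound m >= (1/eps) * (ln|H| + ln(1/delta)) for a consistent learner.
# |H| = 1000 below is an assumed class size for illustration.
def pac_samples(h_size, eps, delta):
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

print(pac_samples(1000, 0.05, 0.05))   # eps = delta = 0.05
```

Note how the bound grows only logarithmically in |H| and 1/δ, but linearly in 1/ϵ: demanding lower error is much more expensive than demanding higher confidence.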
3.6 Noise
● Unwanted variations in data that don’t represent the true pattern.
● Sources:
○ Wrong labels (label noise).
○ Faulty measurements (attribute noise).
● Noise can mislead the model, reducing accuracy.
3.7 Learning Multiple Classes
● Many problems have more than 2 categories.
● Approaches:
○ One-vs-All (OvA) → One classifier per class vs all others.
○ One-vs-One (OvO) → Classifier for each pair of classes.
○ Direct multi-class algorithms (e.g., decision trees, softmax regression).
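The One-vs-All idea — one scorer per class, predict the class whose scorer is most confident — can be sketched with a deliberately simple per-class scorer (negative distance to the class mean). The three-class 1-D data below is invented for illustration:

```python
# One-vs-All sketch: one scorer per class; prediction takes the argmax.
# The per-class "scorer" here is a toy nearest-mean score on made-up data.
def train_ova(samples):
    # samples: list of (x, class_label); fit one model (a mean) per class
    classes = sorted({c for _, c in samples})
    return {c: sum(x for x, k in samples if k == c) /
               sum(1 for _, k in samples if k == c) for c in classes}

def predict(centroids, x):
    scores = {c: -abs(x - m) for c, m in centroids.items()}  # one score per class
    return max(scores, key=scores.get)                       # most confident wins

data = [(1.0, "low"), (2.0, "low"), (10.0, "mid"), (11.0, "mid"),
        (20.0, "high"), (22.0, "high")]
model = train_ova(data)
print(predict(model, 9.5))
```

Real OvA systems train one binary classifier per class against all the others (e.g., three logistic regressions for three classes); the structure — per-class scores followed by an argmax — is the same as above.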
3.8 Model Selection
● Choosing the best model and hyperparameters for a task.
● Methods:
○ Cross-validation → test model on different data splits.
○ Grid search → try combinations of parameters.
● Goal: Best accuracy while avoiding overfitting.
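Cross-validation can be sketched as: split the data into k folds, hold each fold out in turn as the test set, and average the k scores. The "model" below is a trivial mean predictor on made-up numbers, just to show the splitting logic:

```python
# k-fold cross-validation sketch: score a model on k train/test splits.
# The "model" (a mean predictor) and the data are toy choices for illustration.
def k_fold_scores(data, k, fit, score):
    folds = [data[i::k] for i in range(k)]     # simple round-robin split
    results = []
    for i in range(k):
        test = folds[i]                        # hold fold i out for testing
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        results.append(score(fit(train), test))
    return results

ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fit = lambda train: sum(train) / len(train)                   # mean predictor
score = lambda m, test: sum((y - m) ** 2 for y in test) / len(test)  # MSE
mse = k_fold_scores(ys, 3, fit, score)
print(sum(mse) / len(mse))    # cross-validated error estimate
```

Averaging over splits gives a more reliable error estimate than a single train/test split, which is why model and hyperparameter choices are usually compared on cross-validated scores.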
3.9 Generalization
● The ability of a trained model to work well on new, unseen data.
● Overfitting: Learns noise, works badly on new data.
● Underfitting: Too simple, misses patterns.
● Good generalization comes from balanced model complexity and enough training data.