
MODULE 1

SHORT NOTES
Module-1 (Overview of machine learning) Introduction to Machine Learning, Machine learning
paradigms-supervised, semi-supervised, unsupervised, reinforcement learning. Supervised
learning- Input representation, Hypothesis class, Version space, Vapnik-Chervonenkis (VC)
Dimension, Probably Approximately Correct Learning (PAC), Noise, Learning Multiple classes,
Model Selection and Generalization

1. Introduction to Machine Learning

Machine Learning (ML) is a part of Artificial Intelligence that focuses on creating algorithms
that learn patterns from data and make predictions or decisions without being explicitly
programmed.

●​ The ML process involves collecting data → training a model → testing on new data.​

●​ It improves automatically with more data and better algorithms.​

● Example: A spam filter learns from examples of spam and non-spam emails (see the sketch below).
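
A minimal sketch of this collect → train → test process, assuming scikit-learn and a toy dataset (neither is part of the notes):

```python
# Collect data -> train a model -> test on new data, for a toy spam filter.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Collected data: labeled examples of spam and non-spam ("ham") emails
emails = ["win a free prize now", "meeting at 10 am",
          "free money click here", "project report attached"]
labels = ["spam", "ham", "spam", "ham"]

# Train: learn word patterns from the examples, without explicit rules
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

# Test: predict on an email the model has never seen
print(model.predict(["claim your free prize"]))   # expected: ['spam']
```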

2. Machine Learning Paradigms

2.1 Supervised Learning

●​ Definition: Learning from labeled datasets where both input (X) and output (Y) are
known.​

● Goal: Learn a mapping f: X → Y to predict Y for new X.

●​ Types:​

○​ Regression → predicts continuous values (e.g., temperature).​

○​ Classification → predicts categories (e.g., “pass” or “fail”).​

● Example: Predicting house prices from size, location, and number of rooms (see the sketch below).
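
A minimal sketch of supervised regression, assuming scikit-learn; the house data is invented for illustration (location is left out to keep the features numeric):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Inputs X = (size in sqft, number of rooms); outputs Y = price. Both are
# known during training, which is what makes this supervised learning.
X = np.array([[800, 2], [1200, 3], [1500, 3], [2000, 4]])
Y = np.array([100_000, 150_000, 185_000, 240_000])

f = LinearRegression().fit(X, Y)    # learn the mapping f: X -> Y
print(f.predict([[1000, 2]]))       # predict Y for a new, unseen X
```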

2.2 Semi-Supervised Learning

●​ Uses a small labeled dataset + a large unlabeled dataset.​

●​ Useful when labeling data is expensive or time-consuming.​

● Labeled data guides the learning, while unlabeled data helps improve accuracy.

● Example: Language translation with a few manually translated sentences and many unlabeled ones (see the sketch below).
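
A minimal sketch of semi-supervised self-training, assuming scikit-learn (which marks unlabeled points with -1); the one-feature data is invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.array([[0.0], [0.2], [0.9], [1.1], [0.1], [1.0], [0.3], [0.8]])
y = np.array([0, 0, 1, 1, -1, -1, -1, -1])   # -1 marks unlabeled points

# The small labeled set guides a base model; its confident predictions on
# the unlabeled points are added as pseudo-labels in later training rounds.
model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print(model.predict([[0.15], [0.95]]))       # expected: [0 1]
```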

2.3 Unsupervised Learning

●​ Learns from unlabeled data (no output labels given).​

●​ Goal: Find hidden patterns, clusters, or structures.​

● Examples (see the sketch below):

○​ Clustering: Group customers based on buying behavior.​

○​ Dimensionality Reduction: Reduce features while keeping important info (PCA).​
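
A minimal sketch of both examples, assuming scikit-learn and invented customer data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Customers described by (monthly visits, average spend) -- no labels given
X = np.array([[2, 20], [3, 25], [10, 200], [12, 220], [11, 210], [1, 15]])

# Clustering: group customers by buying behaviour
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)                            # e.g. [0 0 1 1 1 0]

# Dimensionality reduction: keep 1 component that preserves most variance
X_reduced = PCA(n_components=1).fit_transform(X)
print(X_reduced.shape)                     # (6, 1)
```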

2.4 Reinforcement Learning

● Learns by interacting with an environment and receiving rewards or penalties.

●​ Key terms:​

○​ Agent → learner/decision-maker.​

○​ Environment → system where agent acts.​

○​ Reward → positive or negative feedback.​

● Example: A game-playing AI that learns to win by trial and error (see the sketch below).
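
A minimal sketch of the agent-environment-reward loop: a two-armed bandit "game" solved by trial and error. The epsilon-greedy strategy and all numbers are illustrative choices, not from the notes:

```python
import random

true_win_rate = {"left": 0.3, "right": 0.7}   # environment; hidden from agent
value = {"left": 0.0, "right": 0.0}           # agent's reward estimates
counts = {"left": 0, "right": 0}

for step in range(1000):
    # Agent: mostly exploit the best-looking action, sometimes explore
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(value, key=value.get)
    # Environment: gives a reward (positive feedback) or nothing
    reward = 1 if random.random() < true_win_rate[action] else 0
    # Agent: update the running-average estimate from the feedback
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]

print(value)   # estimates should approach {'left': ~0.3, 'right': ~0.7}
```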

3. Supervised Learning – Key Concepts


3.1 Input Representation

●​ How data is presented to the model (features).​

●​ Good feature selection improves performance.​

●​ Types:​

○​ Numeric → Age, height, salary.​

○ Categorical → Gender, color, city (often converted to numeric using encoding; see the sketch below).
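
A minimal sketch of one common encoding (one-hot), assuming pandas; the column names are illustrative:

```python
import pandas as pd

data = pd.DataFrame({
    "age":  [25, 32, 40],                  # numeric -> usable as-is
    "city": ["Kochi", "Delhi", "Kochi"],   # categorical -> must be encoded
})

# One-hot encoding replaces 'city' with one 0/1 column per category
encoded = pd.get_dummies(data, columns=["city"])
print(encoded)
```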

3.2 Hypothesis Class

●​ The complete set of functions the learning algorithm can pick from.​

● Example: For linear regression, the hypothesis class is the set of all straight-line equations y = mx + c.

●​ Bigger hypothesis class → more flexibility but risk of overfitting.​

3.3 Version Space

●​ The subset of hypotheses from the hypothesis class that fit all training examples.​

● As we get more training data, wrong hypotheses are removed and the version space becomes smaller (see the sketch below).
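
A minimal sketch with a tiny hypothesis class of threshold rules; the hypotheses and training examples are invented for illustration:

```python
thresholds = [1, 2, 3, 4, 5, 6]              # hypothesis class: h_t(x) = (x >= t)
examples = [(1.5, 0), (5.5, 1), (4.5, 1)]    # training data as (x, label) pairs

version_space = thresholds
for x, label in examples:
    # Keep only the hypotheses consistent with every example seen so far
    version_space = [t for t in version_space if (x >= t) == bool(label)]
    print((x, label), "->", version_space)
# The version space shrinks: [2,3,4,5,6] -> [2,3,4,5] -> [2,3,4]
```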

3.4 VC Dimension

●​ Vapnik–Chervonenkis (VC) dimension measures the capacity of a hypothesis class.​

●​ Higher VC → can fit more complex patterns.​

● Example: A straight line in 2D can separate 3 points in general position in every possible labeling, but no set of 4 points, so VC = 3 (the sketch below checks all 8 labelings).
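
A minimal sketch verifying the example by brute force: it checks that every one of the 2³ = 8 labelings of 3 non-collinear points is linearly separable, using a linear-program feasibility test (scipy is an assumed dependency):

```python
from itertools import product
import numpy as np
from scipy.optimize import linprog

points = np.array([[0, 0], [1, 0], [0, 1]])   # 3 points in general position

def separable(labels):
    # A labeling is separable iff some (w1, w2, b) satisfies
    # y_i * (w . x_i + b) >= 1 for every point, i.e. this LP is feasible.
    y = np.array(labels, dtype=float)
    A = -(y[:, None] * np.hstack([points, np.ones((3, 1))]))
    res = linprog(c=[0, 0, 0], A_ub=A, b_ub=-np.ones(3),
                  bounds=[(None, None)] * 3, method="highs")
    return res.status == 0   # 0 = a feasible solution was found

print(all(separable(lab) for lab in product([-1, 1], repeat=3)))   # True
```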

3.5 PAC Learning


The Probably Approximately Correct (PAC) learning framework states that a learning algorithm should produce a hypothesis that:

●​ Has error ≤ ϵ (approximately correct).​

● With probability ≥ 1 − δ (probably correct).

Example: with probability at least 95% (δ = 0.05), the error is at most 5% (ϵ = 0.05). A worked sample-size calculation follows.
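
A worked calculation using the standard sample-complexity bound for a finite hypothesis class, m ≥ (1/ϵ)(ln|H| + ln(1/δ)); the bound itself is an addition, not stated in these notes:

```python
import math

# error bound eps, failure probability delta, hypothesis class size |H|
eps, delta, H_size = 0.05, 0.05, 1000
m = (1 / eps) * (math.log(H_size) + math.log(1 / delta))
print(math.ceil(m))   # 199 samples suffice for these eps, delta, |H|
```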

3.6 Noise

●​ Unwanted variations in data that don’t represent the true pattern.​

●​ Sources:​

○​ Wrong labels (label noise).​

○​ Faulty measurements (attribute noise).​

● Noise can mislead the model, reducing accuracy (simulated in the sketch below).
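
A minimal simulation of label noise, assuming scikit-learn and synthetic data; the 20% flip rate is an illustrative choice:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # the true pattern

y_noisy = y.copy()
flip = rng.random(500) < 0.2               # label noise: 20% wrong labels
y_noisy[flip] = 1 - y_noisy[flip]

X_test = rng.normal(size=(500, 2))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)

clean = LogisticRegression().fit(X, y)
noisy = LogisticRegression().fit(X, y_noisy)
# The model trained on noisy labels typically scores lower on clean test data
print(clean.score(X_test, y_test), noisy.score(X_test, y_test))
```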

3.7 Learning Multiple Classes

●​ Many problems have more than 2 categories.​

● Approaches (compared in the sketch after this list):

○​ One-vs-All (OvA) → One classifier per class vs all others.​

○​ One-vs-One (OvO) → Classifier for each pair of classes.​

○​ Direct multi-class algorithms (e.g., decision trees, softmax regression).​
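
A minimal sketch of all three approaches on the same toy 3-class problem, assuming scikit-learn:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_classes=3, random_state=0)

ova = OneVsRestClassifier(LogisticRegression()).fit(X, y)  # one model per class
ovo = OneVsOneClassifier(LogisticRegression()).fit(X, y)   # one model per pair
tree = DecisionTreeClassifier(random_state=0).fit(X, y)    # direct multi-class

print(ova.score(X, y), ovo.score(X, y), tree.score(X, y))  # training accuracy
```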

3.8 Model Selection


●​ Choosing the best model and hyperparameters for a task.​

● Methods (combined in the sketch after this list):

○​ Cross-validation → test model on different data splits.​

○​ Grid search → try combinations of parameters.​

●​ Goal: Best accuracy while avoiding overfitting.​
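
A minimal sketch combining both methods, assuming scikit-learn; the SVC model, parameter grid, and iris data are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Grid search tries every parameter combination; each one is scored by
# 5-fold cross-validation (tested on 5 different train/validation splits)
grid = GridSearchCV(SVC(),
                    param_grid={"C": [0.1, 1, 10],
                                "kernel": ["linear", "rbf"]},
                    cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```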

3.9 Generalization

●​ The ability of a trained model to work well on new, unseen data.​

●​ Overfitting: Learns noise, works badly on new data.​

●​ Underfitting: Too simple, misses patterns.​

● Good generalization comes from balanced model complexity and enough training data (compare the three models in the sketch below).
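
A minimal sketch contrasting under- and overfitting via polynomial degree, with synthetic data (all choices illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=30)   # pattern + noise
X_test = rng.uniform(-3, 3, size=(100, 1))
y_test = np.sin(X_test).ravel()                          # unseen data

for degree in (1, 4, 15):   # too simple / balanced / too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    print(degree, round(model.score(X_test, y_test), 3))  # test R^2
# Degree 1 underfits, degree 15 typically overfits; degree 4 generalizes best
```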

Prepared by: Prof Merlin Joshi
