Introduction to Machine Learning
Definition:
Machine Learning (ML) is a subset of Artificial Intelligence (AI) in which systems learn patterns
from data and improve their performance on a task without being explicitly programmed with rules
for that task.
Example: Email Spam Filter
What happens without ML:
A programmer writes strict rules:
“If email contains the word ‘lottery’ → mark as spam.”
But spammers quickly get around such rules with misspellings like “l0ttery” or different wording
like “you win!”.
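The weakness of a hand-written rule can be seen in a few lines of Python. This is only a sketch: the function name rule_based_is_spam and the sample messages are made up for illustration.

```python
def rule_based_is_spam(email_text):
    """Hand-written rule from the text: flag any email containing 'lottery'."""
    return "lottery" in email_text.lower()

# The rule catches the exact word it was written for...
print(rule_based_is_spam("You won the lottery!"))   # True
# ...but misses a simple misspelling and a different phrasing.
print(rule_based_is_spam("You won the l0ttery!"))   # False
print(rule_based_is_spam("You win!"))               # False
```

Every new trick would need another hand-written rule, which is the maintenance problem ML tries to avoid.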
What happens with ML:
o You feed the computer thousands of emails marked as spam or not spam.
o The computer automatically figures out patterns — like suspicious sender
addresses, strange wording, too many links, etc.
o As the model is retrained on more examples, its predictions improve.
o You never tell it specific rules; it learns them from the data.
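The steps above can be sketched with a very small word-counting classifier. This is one possible approach (naive Bayes with add-one smoothing), not the only way spam filters are built. The six training emails, the label strings, and the function names train and predict are assumptions made for illustration.

```python
import math
from collections import Counter

# Tiny hand-made training set (invented emails, for illustration only).
TRAIN = [
    ("win a free lottery prize now", "spam"),
    ("claim your free prize click this link", "spam"),
    ("you win cash click now", "spam"),
    ("meeting moved to monday morning", "not spam"),
    ("please review the attached project report", "not spam"),
    ("lunch on monday with the project team", "not spam"),
]

def train(examples):
    """Count how often each word appears in each class."""
    word_counts = {}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = set()
    for counts in word_counts.values():
        vocab |= set(counts)
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the class with the higher log-probability (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(predict(model, "free cash prize"))            # spam
print(predict(model, "project meeting on monday"))  # not spam
```

No rule about "free" or "prize" was written by hand; the word statistics come entirely from the labeled examples. Adding more labeled emails and calling train again is how such a model would improve.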
What is Supervised Learning?
It’s a type of machine learning where you train a model on labeled data.
Labeled data means: for every input, you already know the correct output.
The model learns to map input → output, so it can predict outputs for new inputs.
Example:
Teaching a child to identify animals
1. Training phase (with supervision):
o You show the child many pictures of animals.
o For each picture, you tell them:
“This is a cat.”
“This is a dog.”
“This is a rabbit.”
o The child learns what features make a cat (pointy ears, whiskers), a dog (snout, tail),
or a rabbit (long ears).
2. Prediction phase (on new data):
o Now you show them an animal they haven’t seen before.
o If it has whiskers and pointy ears, the child says: “It’s a cat!”
3. Why it’s supervised:
o You supervised the learning by giving the correct label for every example during
training.
o The child didn’t figure it out blindly — they had a teacher providing answers.
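The same training and prediction phases can be sketched in code. The example below uses a 1-nearest-neighbor classifier, chosen here only because it is short; the two features (ear length and snout length) and all measurements are invented for illustration.

```python
import math

# Labeled training data: (ear length in cm, snout length in cm) -> animal.
# All measurements are invented for illustration.
TRAINING_DATA = [
    ((4.0, 2.0), "cat"),
    ((4.5, 2.5), "cat"),
    ((7.0, 9.0), "dog"),
    ((8.0, 10.0), "dog"),
    ((11.0, 3.0), "rabbit"),
    ((12.0, 3.5), "rabbit"),
]

def predict(features, training_data=TRAINING_DATA):
    """Return the label of the closest training example (1-nearest-neighbor)."""
    nearest = min(training_data, key=lambda example: math.dist(features, example[0]))
    return nearest[1]

# Prediction phase: animals that were not in the training data.
print(predict((4.2, 2.2)))    # cat
print(predict((11.5, 3.2)))   # rabbit
```

The labels in TRAINING_DATA play the role of the teacher: without them, predict would have nothing to return.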
What is Unsupervised Learning?
In unsupervised learning, data has no labels.
The model isn’t told the correct answer — it has to find patterns or structure on its own.
It’s like giving the computer a box of mixed puzzle pieces without showing the final picture
— it has to figure out how they fit together.
Key Goals of Unsupervised Learning
1. Clustering → Group similar data points together.
2. Dimensionality Reduction → Compress data into fewer features while keeping important
information.
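As a small illustration of dimensionality reduction, the sketch below compresses two features into one by projecting each point onto its direction of greatest variance, which is the idea behind principal component analysis (PCA). It is restricted to two dimensions so it fits in plain Python. The data and the function names pca_2d_to_1d and reconstruct are assumptions made for this example.

```python
import math

# Two correlated features per point (values invented); the second is twice the first.
POINTS = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def pca_2d_to_1d(points):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    c = sum(y * y for _, y in centered) / n
    # Angle of the principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    axis = (math.cos(theta), math.sin(theta))
    # One number per point: its coordinate along the principal axis.
    reduced = [x * axis[0] + y * axis[1] for x, y in centered]
    return reduced, axis, (mean_x, mean_y)

def reconstruct(reduced, axis, mean):
    """Map the 1-D values back to 2-D to see how much information was kept."""
    return [(mean[0] + t * axis[0], mean[1] + t * axis[1]) for t in reduced]

reduced, axis, mean = pca_2d_to_1d(POINTS)
print(reduced)                           # one number per point instead of two
print(reconstruct(reduced, axis, mean))  # close to the original points
```

Because the second feature here is an exact multiple of the first, one number per point is enough to recover the original data up to rounding. With real, noisier data some information is lost, and the goal is to lose as little as possible.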
Real-life Example: Customer Segmentation
The problem:
A shopping website wants to understand its customers better — but there’s no label like
“type of customer.”
How it works:
o You feed the algorithm data such as: age, purchase history, income, products
browsed.
o The algorithm groups customers into clusters:
Group A: Young customers buying gadgets
Group B: Middle-aged customers buying household items
Group C: Older customers buying health products
Outcome:
The company can then target marketing campaigns to each group without anyone ever
labeling them beforehand.
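The grouping step can be sketched with k-means, one common clustering algorithm (the text does not say which algorithm the website would use). The customer values are invented and use only two features. The starting centroids are fixed by hand so the run is reproducible, whereas real implementations usually pick them randomly.

```python
import math

# Unlabeled customer data: (age, purchases per month). Values are invented.
CUSTOMERS = [
    (22, 9), (25, 8), (24, 10),
    (41, 5), (45, 4), (43, 6),
    (66, 2), (70, 1), (68, 3),
]

def mean_point(points):
    """Average of a list of points, computed coordinate by coordinate."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hand-picked starting centroids (an assumption made for reproducibility).
initial = [CUSTOMERS[0], CUSTOMERS[3], CUSTOMERS[6]]
centroids, clusters = k_means(CUSTOMERS, initial)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The algorithm only returns groups of similar customers. Names such as "young customers buying gadgets" are added afterwards by a person who inspects each cluster. In practice the features would also be scaled first so that a large-valued feature such as age does not dominate the distance.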
Core Concepts in ML
Feature: An individual measurable property of a data point, used as model input (e.g., age, income).
Label: The target/output variable in supervised learning.
Model: Mathematical function learned from data → maps inputs (features) to predicted outputs.
Training: Process of fitting the model to data.
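Testing: Evaluating model performance on held-out data that was not used for training.
Overfitting: Model fits noise and quirks of the training data → low training error, poor generalization to new data.
Underfitting: Model is too simple → fails to learn the patterns, so error is high even on the training data.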
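Training, testing, overfitting, and underfitting can be compared in one small experiment. The sketch below fits three models to the same training set and measures mean squared error (MSE) on both the training set and a separate test set. The data follows y = 2x plus hand-picked noise of +1 or -1 so that the run is deterministic. The data and all function names are assumptions made for illustration.

```python
# Training and test sets follow y = 2x plus hand-picked "noise" of +1 or -1.
# All values are invented so the run is deterministic.
train_x = [0, 1, 2, 3, 4, 5, 6, 7]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]
test_x = [0.4, 1.4, 2.4, 3.4, 4.4, 5.4, 6.4]
test_y = [2 * x + (-1 if i % 2 == 0 else 1) for i, x in enumerate(test_x)]

def fit_mean(xs, ys):
    """Underfitting candidate: ignore x and always predict the average label."""
    avg = sum(ys) / len(ys)
    return lambda x: avg

def fit_line(xs, ys):
    """Least-squares straight line, matching the true shape of the data."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)

def fit_memorizer(xs, ys):
    """Overfitting candidate: return the label of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model's predictions."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)                    # training
    results[name] = (mse(model, train_x, train_y),   # error on seen data
                     mse(model, test_x, test_y))     # testing on unseen data
    print(f"{name:10s} train MSE = {results[name][0]:6.2f}   test MSE = {results[name][1]:6.2f}")
```

On this data the constant model has high error on both sets (underfitting). The memorizer has zero training error but a clearly worse test error than the line, because it copied the noise (overfitting). The straight line has similar, moderate error on both sets.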