🤖 Machine Learning Fundamentals: Student Notes
1. Introduction to Machine Learning
Machine Learning (ML) is a field of Artificial Intelligence (AI) that enables computers to learn
from data and improve performance without being explicitly programmed.
● Traditional Programming: Rules + Data → Output
● Machine Learning: Data + Output → Algorithm learns rules
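To make the contrast concrete, here is a toy sketch in Python. The spam messages, the word "free" as the hand-written rule, and the function names are all made up for illustration; real spam filters are far more sophisticated.

```python
# Traditional programming: a person writes the rule by hand.
def hand_written_rule(message):
    return "free" in message.split()


# Machine learning (toy version): derive the rule from labeled examples.
def learn_spam_words(examples):
    spam_words, ham_words = set(), set()
    for text, is_spam in examples:
        target = spam_words if is_spam else ham_words
        target.update(text.split())
    # Keep only words seen in spam and never in normal mail.
    return spam_words - ham_words


def learned_rule(message, spam_words):
    return any(word in spam_words for word in message.split())


# Hypothetical labeled data: (message, 1 = spam / 0 = not spam).
examples = [
    ("win free prize now", 1),
    ("free money waiting", 1),
    ("meeting moved to noon", 0),
    ("are you free for lunch", 0),
]
spam_words = learn_spam_words(examples)

print(hand_written_rule("are you free for lunch"))         # hand rule flags a normal message
print(learned_rule("are you free for lunch", spam_words))  # learned rule does not
print(learned_rule("claim your prize", spam_words))
```

The hand-written rule wrongly flags a normal message, while the rule derived from the labeled data does not, because "free" also appeared in non-spam examples.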
Real-world examples:
● Netflix recommending movies
● Detecting spam emails
● Predicting house prices
● Self-driving cars
2. Categories of Machine Learning
2.1 Supervised Learning
● Uses labeled data (each input comes with its correct output).
● Goal: Learn a function that maps inputs to outputs.
Examples:
● Predicting house prices (Regression)
● Classifying emails as spam/not spam (Classification)
Algorithms:
● Linear Regression
● Logistic Regression
● Decision Trees
● Support Vector Machines
2.2 Unsupervised Learning
● Uses unlabeled data (inputs only, no target output).
● Goal: Find patterns or structure.
Examples:
● Customer segmentation (Clustering)
● Market basket analysis (Association rules)
Algorithms:
● K-Means Clustering
● Hierarchical Clustering
● PCA (Dimensionality Reduction)
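As a rough illustration of clustering, the sketch below implements a bare-bones K-Means in plain Python. The customer data (visits per month, average spend), the choice of k = 2, and the "first k points" starting centroids are assumptions made for this example; library versions such as scikit-learn's use smarter initialization.

```python
import math


def k_means(points, k, iterations=10):
    # Assumption: start from the first k points (simple and deterministic).
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (8, 80), (2, 15), (9, 90), (1, 12), (10, 85)]
labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

No labels are given to the algorithm; the two groups (occasional low spenders and frequent high spenders) emerge from the distances alone.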
2.3 Reinforcement Learning
● Agent interacts with environment → learns by rewards & penalties.
Examples:
● AlphaGo beating humans at Go
● Robots learning to walk
● Dynamic pricing in e-commerce
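The reward-driven loop can be sketched with a tiny "multi-armed bandit", one of the simplest reinforcement learning setups. Everything here is a simplifying assumption: the three actions, their fixed payouts, and the epsilon-greedy strategy. Real environments have states and noisy rewards.

```python
import random


def run_bandit(payouts, steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(payouts)  # agent's estimated reward per action
    counts = [0] * len(payouts)       # how often each action was tried
    for step in range(steps):
        if step < len(payouts):
            action = step                          # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(len(payouts))   # explore at random
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(payouts)), key=lambda a: estimates[a])
        reward = payouts[action]  # assumption: the environment pays a fixed reward
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical payouts, e.g. average revenue at three price points.
payouts = [1.0, 3.0, 2.0]
estimates, counts = run_bandit(payouts)
best_action = max(range(len(payouts)), key=lambda a: estimates[a])
print(best_action, estimates, counts)
```

The agent is never told which action is best; it finds out by acting and observing the rewards.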
3. Key ML Concepts
3.1 Features and Labels
● Features: Input variables (e.g., number of rooms, area).
● Label: Target variable (e.g., house price).
3.2 Training vs Testing
● Training set: Used to fit (train) the model.
● Testing set: Held back during training and used to evaluate how well the model performs on unseen data.
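A minimal sketch of such a split in plain Python is shown below. The function name, the 20% test fraction, and the seed are illustrative choices; scikit-learn offers a ready-made `train_test_split` for real projects.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    shuffled = list(rows)                    # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)    # shuffle reproducibly
    n_test = round(len(shuffled) * test_fraction)
    # First n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(10))  # stand-in for 10 data rows
train_rows, test_rows = train_test_split(data)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting matters when the data is ordered (for example by date or by class), otherwise the test set may not resemble the training set.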
3.3 Overfitting vs Underfitting
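● Overfitting: Model memorizes the training data (including its noise), so it scores well on training data but poorly on new data.
● Underfitting: Model is too simple to capture the underlying patterns, so it scores poorly even on training data.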
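The sketch below contrasts the two failure modes on a tiny made-up dataset. The "memorizer" (which returns the label of the nearest training point) stands in for an overfit model, and the constant model (which always predicts the average label) stands in for an underfit one; both are assumptions chosen for illustration.

```python
def mse(model, data):
    # Mean squared error of a model over (x, y) pairs.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


# Hypothetical data that roughly follows y = 2x, with a little noise.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
test = [(1.5, 3.0), (2.5, 5.0), (3.5, 7.0)]


def memorizer(x):
    # Overfit stand-in: repeat the label of the nearest training point.
    _, nearest_y = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_y


mean_label = sum(y for _, y in train) / len(train)


def constant_model(x):
    # Underfit stand-in: ignore x and always predict the average label.
    return mean_label


print("memorizer train/test MSE:", mse(memorizer, train), mse(memorizer, test))
print("constant  train/test MSE:", mse(constant_model, train), mse(constant_model, test))
```

The memorizer has zero error on the training set but a clearly higher error on the test set, while the constant model has a large error on both.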
4. Important ML Algorithms
4.1 Linear Regression
● Predicts continuous values.
● Formula: y = mx + c, where m is the slope (weight of the feature) and c is the intercept.
● Example: Predicting salary from years of experience.
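As a minimal sketch, m and c can be estimated with the ordinary least squares formulas. The salary figures (in thousands) are invented for this example and happen to lie exactly on a line.

```python
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x  # the fitted line passes through the means
    return m, c


# Hypothetical data: years of experience vs. salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

m, c = fit_line(years, salary)
predicted = m * 6 + c  # predicted salary for 6 years of experience
print(m, c, predicted)
```

With real data the points will not sit exactly on the line, and the fitted m and c are the values that minimize the squared prediction errors.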
4.2 Logistic Regression
● Used for classification (yes/no, 0/1); outputs a probability between 0 and 1 using the sigmoid function.
● Example: Predicting whether a customer will buy a product.
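The following is a small sketch of logistic regression trained with gradient descent. The single feature (minutes spent on the product page), the data, the learning rate, and the step count are all assumptions for illustration; libraries use more robust optimizers.

```python
import math


def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        # Step against the average gradient of the log loss.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: minutes on the product page, bought (1) or not (0).
minutes = [1, 2, 3, 4, 5, 6]
bought = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(minutes, bought)
print(sigmoid(w * 1 + b))  # probability of buying after 1 minute
print(sigmoid(w * 6 + b))  # probability of buying after 6 minutes
```

A probability above 0.5 is usually read as "yes" and below 0.5 as "no", although the threshold can be changed to suit the problem.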
4.3 Decision Trees
● Split the data step by step using feature-based rules (e.g., feature value above or below a threshold).
● Easy to interpret.
● Example: Loan approval.
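A full tree repeats one basic operation: finding the split that separates the classes best. The sketch below does this once, for a single feature, using Gini impurity as the quality measure. The income values (in thousands) and approval labels are hypothetical.

```python
def gini(labels):
    # Gini impurity for binary labels: 0 means the group is pure.
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best_threshold, best_score = None, float("inf")
    # Candidate thresholds: midpoints between neighboring feature values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        # Weighted impurity of the two groups after the split.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical loan data: income in thousands, approved (1) or rejected (0).
incomes = [20, 25, 30, 50, 60, 80]
approved = [0, 0, 0, 1, 1, 1]

threshold, score = best_split(incomes, approved)
print(threshold, score)
```

The resulting rule ("income above the threshold means approve") is easy to read, which is why decision trees are considered interpretable.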
4.4 Random Forest
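● Collection (ensemble) of decision trees, each trained on a random sample of the data.
● Combining the trees' predictions reduces overfitting and usually improves accuracy over a single tree.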
4.5 K-Nearest Neighbors (KNN)
● Classifies a new point by majority vote among its k closest training points.
● Example: Handwriting recognition.
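A minimal KNN sketch is shown below. Real handwriting recognition works on pixel data; here each character is reduced to two made-up features (number of loops, height-to-width ratio) purely to keep the example small.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    # Majority vote among the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical features: (number of loops, height-to-width ratio).
train = [
    ((0, 3.0), "1"), ((0, 2.8), "1"), ((0, 3.2), "1"),
    ((2, 1.8), "8"), ((2, 1.7), "8"), ((2, 2.0), "8"),
]

print(knn_predict(train, (0, 2.9)))
print(knn_predict(train, (2, 1.9)))
```

KNN has no training phase; all the work happens at prediction time, which can make it slow on large datasets.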
4.6 Neural Networks (Intro)
● Inspired by human brain.
● Layers of neurons learn patterns.
● Basis of Deep Learning.
5. Model Evaluation Metrics
● Accuracy = Correct predictions / Total predictions
● Precision = True Positives / (True Positives + False Positives)
● Recall = True Positives / (True Positives + False Negatives)
● F1-Score = 2 × (Precision × Recall) / (Precision + Recall), the harmonic mean of precision and recall
● RMSE (Root Mean Squared Error) = √(mean of (predicted - actual)²); used for regression
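The formulas above can be computed directly from lists of true and predicted values. This is a sketch with invented labels and numbers; the function names are illustrative, and scikit-learn provides equivalents in `sklearn.metrics`.

```python
import math


def classification_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rmse(actual, predicted):
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical classifier output.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Hypothetical regression output.
print(rmse([10, 20], [11, 13]))
```

Accuracy alone can mislead when one class is rare, which is why precision and recall are usually reported alongside it.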
6. Data Preparation for ML
● Handling Missing Values (mean/median imputation, deletion)
● Scaling Features (Standardization, Normalization)
● Encoding Categorical Data (One-Hot Encoding, Label Encoding)
● Train-Test Split (e.g., 80% training, 20% testing)
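The first three steps can be sketched in plain Python as follows. The helper names and sample values are assumptions for illustration; pandas and scikit-learn provide production-ready versions of each step.

```python
from statistics import mean, pstdev


def impute_mean(values):
    # Replace missing entries (None) with the mean of the known values.
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def standardize(values):
    # Standardization: rescale to mean 0 and standard deviation 1.
    # Note: fails with a division by zero if all values are identical.
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    # Normalization: rescale to the range [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories):
    # One column per distinct category, in sorted order.
    levels = sorted(set(categories))
    return [[1 if c == level else 0 for level in levels] for c in categories]


areas = impute_mean([1.0, None, 3.0])
print(areas)
print(standardize(areas))
print(min_max_normalize(areas))
print(one_hot(["red", "blue", "red"]))
```

In practice the imputation value and the scaling parameters should be computed on the training set only and then reused on the test set, so that no information leaks from the test data.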
7. Case Study: Predicting House Prices
Problem: Predict house prices based on size, location, and number of rooms.
Steps:
1. Collect dataset ([Link])
2. Clean data (remove missing values)
3. Select features: size, location, rooms
4. Train model (Linear Regression)
5. Evaluate with RMSE
Expected Output: A model whose predicted prices fall within about ±10% of the actual values on the test set.
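The steps can be sketched end to end as below. This is a simplified illustration, not the full case study: the dataset is invented, only the size feature is used (location and rooms would need encoding and a multi-feature model), and the split is a plain ordered split instead of a shuffled one.

```python
import math

# Step 1: hypothetical (size_m2, price_in_thousands) rows; None = missing.
rows = [(50, 152), (60, 178), (None, 200), (80, 243),
        (100, 298), (120, None), (70, 211), (90, 269)]

# Step 2: clean data by dropping rows with missing values.
clean = [(s, p) for s, p in rows if s is not None and p is not None]

# Step 3: select features (assumption: size only) and split the data.
train, test = clean[:4], clean[4:]

# Step 4: train a one-feature linear regression (least squares).
n = len(train)
mean_x = sum(s for s, _ in train) / n
mean_y = sum(p for _, p in train) / n
slope = (sum((s - mean_x) * (p - mean_y) for s, p in train)
         / sum((s - mean_x) ** 2 for s, _ in train))
intercept = mean_y - slope * mean_x


def predict(size):
    return slope * size + intercept


# Step 5: evaluate with RMSE on the held-out test rows.
errors = [predict(s) - p for s, p in test]
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Share of test predictions within ±10% of the actual price.
within_10 = sum(abs(predict(s) - p) <= 0.1 * p for s, p in test) / len(test)

print(f"RMSE: {rmse:.2f}, within ±10%: {within_10:.0%}")
```

With only six usable rows the numbers mean little; the point is the order of the steps, which stays the same on a real dataset.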
8. Practice Exercises
1. Classification:
Build a logistic regression model to predict whether a student passes (1) or fails (0)
based on hours studied.
2. Clustering:
Use K-Means to group customers into 3 clusters based on purchase history.
3. Regression:
Predict car prices using regression with features: brand, mileage, year.
4. Evaluation:
Given predictions, calculate accuracy, precision, recall, and F1-score.
9. Common Tools & Libraries
● Python: scikit-learn, pandas, NumPy
● R: caret, randomForest
● Platforms: Databricks, Azure ML, Google Vertex AI
10. Summary
Machine Learning is about making systems learn from data. To succeed, you need:
1. Good data preprocessing
2. Right algorithm selection
3. Proper evaluation metrics
Mastering the basics prepares you for advanced topics like Deep Learning, NLP, and
Computer Vision.