■ Machine Learning Algorithms Handbook
With Graphs, Formulas, Loss Functions, Benefits, Drawbacks, Comparisons & Use Cases
Prepared for Interview Success
■ Linear Regression
Graph Intuition: Straight line fitting data points.
Formula: y = β₀ + β₁x + ε
Loss Function: MSE = (1/n) Σ(y - y_pred)^2
Benefits: Simple, interpretable, good baseline.
Drawbacks: Only captures linear relationships.
Why Next Algorithm? Fails for classification → Logistic Regression.
Use Case: Predict sales, housing prices.
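A minimal sketch with scikit-learn on synthetic data (the true coefficients 2 and 1 are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y ≈ 2x + 1 with Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2 * X.ravel() + 1 + rng.normal(0, 1, size=100)

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)  # recovers β₀ ≈ 1, β₁ ≈ 2
```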
■ Logistic Regression
Graph Intuition: S-shaped sigmoid curve for probability.
Formula: P(y=1|x) = 1 / (1 + e^(-(wᵀx + b)))
Loss Function: Binary Cross-Entropy = -Σ[y log(p) + (1-y) log(1-p)]
Benefits: Probabilistic, interpretable.
Drawbacks: Only linear boundaries.
Why Next Algorithm? Needs non-linear boundaries → Decision Trees.
Use Case: Spam detection, medical diagnosis.
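A quick sketch with scikit-learn on a toy one-feature dataset; `predict_proba` exposes the sigmoid output:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary data: label is 1 when the single feature exceeds 0
rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(200, 1))
y = (X.ravel() > 0).astype(int)

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[0.5]]))  # [P(y=0), P(y=1)] from the sigmoid
```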
■ Decision Tree
Graph Intuition: Splits data based on features into branches.
Formula: Entropy = -Σ p log(p), Gini = 1 - Σ p²
Loss Function: Impurity (Entropy/Gini).
Benefits: Easy to interpret, non-linear.
Drawbacks: Overfits easily.
Why Next Algorithm? Bagging reduces variance → Random Forest.
Use Case: Loan approval, rule-based systems.
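A short sketch on the Iris dataset; `max_depth=3` is an illustrative guard against the overfitting noted above, and `export_text` prints the learned rule-like splits:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
# Gini impurity drives the splits; limiting depth curbs overfitting
tree = DecisionTreeClassifier(max_depth=3, criterion="gini").fit(X, y)
print(export_text(tree))  # the learned if/else branches
```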
■ Random Forest
Graph Intuition: Many trees combined by majority voting.
Formula: Final Prediction = Majority vote/average of trees.
Loss Function: Same as decision tree loss.
Benefits: Reduces overfitting, robust.
Drawbacks: Less interpretable, slower.
Why Next Algorithm? Boosting for more accuracy → Gradient Boosting.
Use Case: Fraud detection, credit scoring.
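A sketch with scikit-learn (the dataset and 100-tree ensemble size are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
# 100 trees trained on bootstrap samples, combined by majority vote
forest = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(forest, X, y, cv=5).mean())
```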
■ Gradient Boosting (XGBoost)
Graph Intuition: Sequential trees fixing errors of previous ones.
Formula: Fₘ(x) = Fₘ₋₁(x) + γₘ hₘ(x)
Loss Function: Customizable (MSE, log loss, etc.).
Benefits: High accuracy, flexible.
Drawbacks: Slower, overfits if not tuned.
Why Next Algorithm? For high-dimensional, sparse data → SVM.
Use Case: Risk scoring, Kaggle competitions.
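A sketch using scikit-learn's GradientBoostingClassifier rather than the XGBoost library itself (their APIs are similar); `learning_rate` plays the role of the shrinkage γ above, and all hyperparameters are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# Each new tree fits the errors of the current ensemble
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1, max_depth=3)
print(gbm.fit(X_tr, y_tr).score(X_te, y_te))
```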
■ Support Vector Machine (SVM)
Graph Intuition: Max margin hyperplane separating classes.
Formula: min ½||w||², subject to yᵢ(wᵀxᵢ + b) ≥ 1 for all i (maximizes the margin).
Loss Function: Hinge Loss = max(0, 1 - y(wᵀx + b))
Benefits: Works well in high-dimensions.
Drawbacks: Slow on very large datasets.
Why Next Algorithm? For a simpler, instance-based method → KNN.
Use Case: Image, text classification.
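A sketch with scikit-learn's SVC on the digits dataset (the RBF kernel and C value are illustrative choices):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# RBF kernel gives non-linear boundaries; C trades margin width vs. violations
svm = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print(svm.score(X_te, y_te))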
■ K-Nearest Neighbors (KNN)
Graph Intuition: Classifies based on majority of nearest neighbors.
Formula: Distance metric (Euclidean, Manhattan).
Loss Function: No explicit training loss.
Benefits: Simple; no explicit training phase (lazy learner).
Drawbacks: Slow at prediction, memory-heavy.
Why Next Algorithm? Needs faster prediction → Naïve Bayes.
Use Case: Recommendation systems, anomaly detection.
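A sketch with scikit-learn; note that `fit` only stores the data, matching the lazy-learner point above, and the distance work happens at prediction time:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# "Training" just memorizes X; neighbors are found per query at predict time
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean").fit(X, y)
print(knn.predict(X[:3]))
```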
■ Naïve Bayes
Graph Intuition: Applies Bayes' theorem with a feature-independence assumption.
Formula: P(y|x) = (P(x|y)P(y)) / P(x)
Loss Function: Negative log likelihood.
Benefits: Very fast, works with small data.
Drawbacks: Strong independence assumption.
Why Next Algorithm? No labels → move to clustering (K-Means).
Use Case: Spam filtering, text classification.
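A sketch of spam filtering with bag-of-words features and MultinomialNB; the four-document corpus and its labels are made up purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus: 1 = spam, 0 = ham
texts = ["win money now", "meeting at noon", "free prize win", "lunch tomorrow?"]
labels = [1, 0, 1, 0]

clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)
print(clf.predict(["free money"]))  # likely [1]
```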
■ K-Means Clustering
Graph Intuition: Groups points into K clusters around centroids.
Formula: J = Σₖ Σᵢ ||xᵢ - µₖ||² (outer sum over clusters k, inner sum over points i assigned to cluster k)
Loss Function: WCSS (within cluster sum of squares).
Benefits: Simple, scalable.
Drawbacks: Needs K, assumes spherical clusters.
Why Next Algorithm? Dimensionality reduction → PCA.
Use Case: Customer segmentation, anomaly detection.
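A sketch with scikit-learn on synthetic blobs; `inertia_` is exactly the WCSS loss J above:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with 3 well-separated blobs
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)  # learned centroids µₖ
print(km.inertia_)          # WCSS, the loss J above
```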
■ Principal Component Analysis (PCA)
Graph Intuition: Projects data to lower dimensions with max variance.
Formula: Eigendecomposition of the covariance matrix, C = VΛVᵀ; project onto the top-k eigenvectors.
Loss Function: Reconstruction error minimization.
Benefits: Reduces noise, improves speed.
Drawbacks: Principal components are hard to interpret.
Why Next Algorithm? For complex tasks → Neural Networks.
Use Case: Feature reduction, visualization.
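A sketch projecting the 64-dimensional digits data down to 2 components for visualization (the component count is illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)   # 64 features per image
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)               # low-dimensional projection
print(pca.explained_variance_ratio_)  # variance captured per component
```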
■ Neural Networks
Graph Intuition: Layers of neurons transforming input to output.
Formula: a⁽ˡ⁾ = f(W⁽ˡ⁾a⁽ˡ⁻¹⁾ + b⁽ˡ⁾)
Loss Function: MSE, Cross-Entropy, etc.
Benefits: Captures complex patterns, very powerful.
Drawbacks: Needs lots of data, black-box.
Why Next Algorithm? Basis for deep learning.
Use Case: Image recognition, NLP, speech.
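A sketch using scikit-learn's MLPClassifier as a small feed-forward network (the single 64-unit hidden layer and iteration cap are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
# One hidden layer of 64 ReLU units, trained with cross-entropy loss
mlp = MLPClassifier(hidden_layer_sizes=(64,), activation="relu",
                    max_iter=500, random_state=0)
print(mlp.fit(X_tr, y_tr).score(X_te, y_te))
```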
■ Interview Tips
- Start with **intuition (graph explanation)**
- State **formula & loss** clearly
- Compare with **previous algorithm** (why better/worse)
- List **benefits & drawbacks** concisely
- Give a **real-world example**
- If possible, sketch the **graph** during the interview