Machine Learning Algorithms - Summary & Use Cases
1. Linear Regression
**Key Points**: Supervised, regression, assumes a linear relationship, sensitive to outliers.
**Use Cases**: Predicting house prices, stock trends (short-term), sales forecasting.
**When to Use**: When you believe the output depends linearly on input features.
**When Not to Use**: Non-linear patterns, or strong multicollinearity among features.
**Example**: Predicting salary based on years of experience.
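A minimal scikit-learn sketch of this example; the experience/salary figures are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative synthetic data: years of experience -> salary.
X = np.array([[1], [2], [3], [5], [8], [10]])
y = np.array([45_000, 50_000, 60_000, 80_000, 105_000, 120_000])

model = LinearRegression()
model.fit(X, y)

# The fitted line is y = coef_ * x + intercept_.
print(model.coef_, model.intercept_)
print(model.predict([[6]]))  # predicted salary at 6 years of experience
```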
2. Logistic Regression
**Key Points**: Supervised, classification, outputs probabilities via the sigmoid function.
**Use Cases**: Spam detection, medical diagnosis, credit default prediction.
**When to Use**: Binary classification with linearly separable data.
**When Not to Use**: Complex boundaries, non-linear separability.
**Example**: Predicting if a tumor is malignant or benign.
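A minimal sketch of this example using scikit-learn's built-in breast cancer dataset (target 1 = benign, 0 = malignant):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Scaling helps the solver converge; the model itself is a sigmoid
# over a linear combination of the features.
clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)

print(clf.predict_proba(X_test[:3]))  # class probabilities, not just labels
print(clf.score(X_test, y_test))      # accuracy on held-out data
```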
3. Decision Trees
**Key Points**: Supervised, Handles both regression/classification, interpretable, prone to overfitting.
**Use Cases**: Loan approval, customer churn, medical treatment.
**When to Use**: When you need an explainable, easy-to-visualize model.
**When Not to Use**: Noisy, high-variance data; a single tree overfits without pruning or depth limits.
**Example**: Classifying whether a student passes based on attendance and study hours.
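A minimal sketch of this example; the attendance/study-hour data and the pass/fail labels are invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative synthetic data: [attendance %, study hours/week] -> pass (1) / fail (0)
X = [[60, 2], [90, 8], [75, 5], [50, 1], [85, 6], [40, 3], [95, 9], [65, 4]]
y = [0, 1, 1, 0, 1, 0, 1, 1]

# max_depth caps tree growth, a simple guard against overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Printing the learned rules is what makes trees interpretable.
print(export_text(tree, feature_names=["attendance", "study_hours"]))
```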
4. Random Forest
**Key Points**: Ensemble of decision trees (bagging), reduces overfitting, typically more accurate than a single tree.
**Use Cases**: Feature selection, fraud detection, credit scoring.
**When to Use**: When you want better generalization than a single tree and can afford to train many trees.
**When Not to Use**: Latency-critical applications (inference across many trees is slower than a single model).
**Example**: Predicting stock market movement using multiple financial indicators.
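A minimal sketch with scikit-learn; synthetic data stands in for real financial indicators here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular dataset of financial indicators.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Averaging many decorrelated trees reduces the variance of a single tree.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

print(forest.score(X_test, y_test))
# feature_importances_ supports the feature-selection use case above.
print(forest.feature_importances_)
```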
5. Support Vector Machine (SVM)
**Key Points**: Supervised, margin maximization, kernel trick, works well on small-to-medium datasets.
**Use Cases**: Face detection, bioinformatics, handwriting recognition.
**When to Use**: Clear margin of separation, high-dimensional space.
**When Not to Use**: Very large datasets (training scales poorly) or heavily noisy, overlapping classes.
**Example**: Email spam vs. non-spam classifier.
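A minimal sketch with scikit-learn; synthetic features stand in for real email features, and the RBF kernel below is one choice of the kernel trick mentioned above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# SVMs are scale-sensitive, so scaling is paired with the classifier.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
svm.fit(X_train, y_train)
print(svm.score(X_test, y_test))
```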
6. K-Nearest Neighbors (KNN)
**Key Points**: Lazy learner with no explicit training phase; sensitive to the choice of K and to feature scaling.
**Use Cases**: Recommendation systems, image recognition.
**When to Use**: Small datasets with well-separated classes.
**When Not to Use**: High-dimensional data or large datasets.
**Example**: Classifying handwritten digits.
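A minimal sketch of this example using scikit-learn's built-in digits dataset:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)  # 8x8 digit images as 64 pixel features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling matters because KNN relies on raw distances; K=5 is a common default.
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X_train, y_train)  # "lazy": fit essentially just stores the training set
print(knn.score(X_test, y_test))
```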
7. Naive Bayes
**Key Points**: Probabilistic (Bayes' theorem), assumes conditional independence between features.
**Use Cases**: Text classification, spam filtering, sentiment analysis.
**When to Use**: High-dimensional text data.
**When Not to Use**: When features are highly correlated.
**Example**: Classifying news articles by topic.
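A minimal sketch of this example; the four-document corpus is invented for illustration, and a real topic classifier would train on thousands of articles:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = [
    "stocks fell as markets reacted to rates",
    "the team won the championship game",
    "central bank raises interest rates again",
    "striker scores twice in the final match",
]
labels = ["finance", "sports", "finance", "sports"]

# Bag-of-words counts feed MultinomialNB, which treats words as independent.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(docs, labels)
print(clf.predict(["bank cuts rates to calm markets"]))
```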
8. K-Means Clustering
**Key Points**: Unsupervised, centroid-based, sensitive to initialization and scale.
**Use Cases**: Customer segmentation, image compression.
**When to Use**: When the number of clusters is known and clusters are roughly spherical.
**When Not to Use**: Clusters with unequal sizes, unequal densities, or non-spherical shapes.
**Example**: Grouping users based on website activity.
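A minimal sketch of this example; the activity numbers are invented for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Illustrative user features: [pages per visit, minutes on site]
X = np.array([[2, 5], [3, 7], [2, 6], [20, 60], [22, 55], [18, 58]])

# Scale first: K-Means uses Euclidean distance, so feature scale matters.
X_scaled = StandardScaler().fit_transform(X)

# n_init reruns with different random centroids to soften the
# initialization sensitivity noted above.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
print(km.fit_predict(X_scaled))  # cluster label per user
```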
9. Principal Component Analysis (PCA)
**Key Points**: Unsupervised, dimensionality reduction, maximizes variance.
**Use Cases**: Preprocessing, visualization, compression.
**When to Use**: High-dimensional data with correlated features.
**When Not to Use**: When interpretability of the original features matters (components are linear mixtures of them).
**Example**: Visualizing high-dimensional gene expression data in 2D.
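A minimal sketch of the idea; scikit-learn's digits dataset stands in for gene expression data as a readily available high-dimensional example:

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)  # 64-dimensional pixel features

# Project onto the 2 directions of maximum variance for 2D plotting.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (1797, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```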
10. Gradient Boosting / XGBoost
**Key Points**: Ensemble, fits weak learners sequentially to the previous model's errors, highly accurate, prone to overfitting if not regularized.
**Use Cases**: Kaggle competitions, fraud detection, sales forecasting.
**When to Use**: Structured/tabular data with nonlinear patterns.
**When Not to Use**: Strict real-time prediction budgets (large ensembles add inference cost) or when there is no time to tune.
**Example**: Predicting product demand using historical sales and promotions.
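A minimal sketch using scikit-learn's GradientBoostingRegressor; synthetic data stands in for historical sales and promotion features, and XGBoost's XGBRegressor exposes a similar fit/predict interface:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for tabular demand data (sales, promotions, etc.).
X, y = make_regression(n_samples=1000, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Shallow trees are fit sequentially to the ensemble's residuals;
# learning_rate and max_depth act as the regularization mentioned above.
gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
gbr.fit(X_train, y_train)
print(gbr.score(X_test, y_test))  # R^2 on held-out data
```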