MACHINE LEARNING ALGORITHMS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. LINEAR REGRESSION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Regression
Output: Continuous value
Use Case:
Predict a numerical value based on relationships between features.
BTS EX: Predict how many **Spotify streams** a song will get based on the length of
Yoongi’s verse and the number of Jungkook’s high notes.
When to Use:
- When the target has a roughly linear relationship with the features
- One dependent variable, one or more independent variables
Best For:
- House price prediction
- Song popularity prediction
- Sales forecasting
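
To make the idea concrete, here is a minimal sketch of fitting a straight line with ordinary least squares, using only the Python standard library. The numbers are made up for illustration (verse length in seconds vs. streams in millions), and `fit_line` is a hypothetical helper name, not part of any library.

```python
# Toy data (made up for illustration): verse length in seconds -> streams in millions.
verse_seconds = [10.0, 20.0, 30.0, 40.0]
streams_millions = [25.0, 45.0, 65.0, 85.0]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(verse_seconds, streams_millions)
predicted = slope * 50.0 + intercept  # predicted streams for a 50-second verse
print(slope, intercept, predicted)
```

The output is a continuous number, which is what makes this a regression task rather than a classification task.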
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2. LOGISTIC REGRESSION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Classification
Output: Binary or multi-class label
Use Case:
Predict the probability of each class and assign the most likely one.
BTS EX: Predict if ARMY will cry during a **concert finale** (Yes/No) based on number
of ballads sung.
When to Use:
- Binary outcomes (e.g., Yes/No, True/False)
- When you need a probability, not just a label
Best For:
- Spam detection
- Disease diagnosis
- Event prediction (e.g., churn/retention)
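
A rough sketch of how the probability comes out: a sigmoid squashes a linear score into the range 0 to 1, and plain gradient descent tunes the weight and bias. The data is made up (number of ballads vs. cried or not), and `fit_logistic`, the learning rate, and the epoch count are illustrative assumptions.

```python
import math

# Toy data (made up): number of ballads in the set -> fan cried (1) or not (0).
ballads = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cried = [0, 0, 0, 1, 1, 1]

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the average log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = fit_logistic(ballads, cried)
p_low = sigmoid(w * 1.0 + b)   # probability of crying after 1 ballad
p_high = sigmoid(w * 4.0 + b)  # probability of crying after 4 ballads
print(round(p_low, 3), round(p_high, 3))
```

To turn the probability into a Yes/No label, compare it with a threshold (0.5 is the usual default).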
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
3. K-NEAREST NEIGHBORS (k-NN)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Classification (and some regression)
Output: Category or continuous value
Use Case:
Classifies a point by majority vote among its ‘k’ closest neighbors in feature space.
BTS EX: Guess Yoongi’s **stage outfit theme** based on how similar his past concert fits
were.
When to Use:
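- When decision boundaries are irregular or hard to describe with a formula
- Small datasets (every prediction scans the stored training points)
- You want predictions you can explain by pointing at similar examples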
Best For:
- Image recognition
- Music genre classification
- Recommender systems
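
The neighbor vote can be sketched in a few lines of standard-library Python. The outfit features (a sparkle score and a darkness score) and the theme labels are invented for illustration, and `knn_predict` is a hypothetical helper.

```python
import math
from collections import Counter

# Toy data (made up): (sparkle score, darkness score) of past outfits -> theme.
past_outfits = [
    ((9.0, 1.0), "glam"),
    ((8.0, 2.0), "glam"),
    ((7.5, 1.5), "glam"),
    ((1.0, 9.0), "streetwear"),
    ((2.0, 8.0), "streetwear"),
    ((1.5, 7.0), "streetwear"),
]

def knn_predict(training, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    # Sort by Euclidean distance to the query and keep the k closest.
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

prediction = knn_predict(past_outfits, (8.5, 1.0))
print(prediction)
```

There is no training step here: the "model" is just the stored examples, which is why k-NN gets slow as the dataset grows.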
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
4. DECISION TREE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Both Classification & Regression
Output: Category or continuous value
Use Case:
Make decisions by learning simple rules (like a flowchart).
BTS EX: Predict who sings the **chorus** based on BPM, genre, and era (e.g., HYYH vs.
Proof).
When to Use:
- Need a visual model
- For interpretable decision-making
- Mixed feature types (categorical + numerical)
Best For:
- Customer segmentation
- Medical diagnosis
- Any scenario with if/else logic
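
Here is a small sketch of how a tree can learn its if/else rules: at each node it tries every threshold on every feature and keeps the split with the lowest Gini impurity. The rows are made up (BPM and an era flag, 0 for an older era and 1 for a newer one), and all function names are illustrative.

```python
from collections import Counter

# Toy data (made up): (BPM, era flag) -> who sings the chorus.
rows = [
    ((70, 0), "Jungkook"), ((80, 1), "Jungkook"), ((90, 0), "Jungkook"),
    ((120, 0), "Jimin"), ((130, 0), "Jimin"),
    ((125, 1), "Tae"), ((140, 1), "Tae"),
]

def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows):
    """Try every (feature, threshold) pair; return the one with lowest impurity."""
    best = None  # (weighted impurity, feature index, threshold)
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] < threshold]
            right = [y for x, y in rows if x[f] >= threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best

def build_tree(rows):
    """Return a leaf label, or (feature, threshold, left_subtree, right_subtree)."""
    labels = [y for _, y in rows]
    split = None if gini(labels) == 0.0 else best_split(rows)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    _, f, threshold = split
    left = [r for r in rows if r[0][f] < threshold]
    right = [r for r in rows if r[0][f] >= threshold]
    return (f, threshold, build_tree(left), build_tree(right))

def predict(tree, x):
    """Walk down the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] < threshold else right
    return tree

tree = build_tree(rows)
print(tree)
print(predict(tree, (135, 1)))
```

The learned tree reads like a flowchart: first check BPM, then check the era flag.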
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
5. RANDOM FOREST
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Both Classification & Regression
Output: Category or continuous value
Use Case:
Ensemble of decision trees for better accuracy and less overfitting.
BTS EX: Predict who **wins most fan polls** using 100 decision trees, each trained on a
random sample of past polls, and letting the trees vote.
When to Use:
- Complex datasets with lots of features
- You want high accuracy
- You want less overfitting than a single decision tree
Best For:
- Product recommendation
- Stock prediction
- Loan approval systems
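
The ensemble idea can be sketched like this: train many small trees on bootstrap samples (rows drawn with replacement), give each a randomly chosen feature, and take a majority vote. This is a deliberate simplification, assuming one-split trees ("stumps") to keep the code short; a real random forest grows full trees and samples features at every split. The poll data and all names are made up.

```python
import random
from collections import Counter

# Toy data (made up): (dance-focus score, vocal-focus score) of a poll -> winner.
rows = [
    ((8.0, 1.0), "Jimin"), ((9.0, 2.0), "Jimin"),
    ((7.0, 3.0), "Jimin"), ((8.5, 1.5), "Jimin"),
    ((1.0, 8.0), "Jungkook"), ((2.0, 9.0), "Jungkook"),
    ((3.0, 7.0), "Jungkook"), ((1.5, 8.5), "Jungkook"),
]

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(sample, feature):
    """One-split tree on one feature: keep the threshold with the fewest errors."""
    best = None  # (errors, threshold, left_label, right_label)
    for threshold in sorted({x[feature] for x, _ in sample}):
        left = [y for x, y in sample if x[feature] < threshold]
        right = [y for x, y in sample if x[feature] >= threshold]
        if not left or not right:
            continue
        left_label, right_label = majority(left), majority(right)
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # no valid split: always predict the sample's majority label
        label = majority([y for _, y in sample])
        return (feature, float("inf"), label, label)
    return (feature,) + best[1:]

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap sample
        feature = rng.randrange(n_features)        # random feature for this tree
        forest.append(fit_stump(sample, feature))
    return forest

def predict(forest, x):
    """Majority vote across all trees."""
    votes = Counter(
        left if x[feature] < threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]

forest = fit_forest(rows)
print(predict(forest, (9.5, 0.5)), predict(forest, (0.5, 9.5)))
```

Because each tree sees slightly different data, their individual mistakes tend to cancel out in the vote, which is where the reduced overfitting comes from.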
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
6. NAIVE BAYES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Supervised Learning
Task: Classification
Output: Category
Use Case:
Classifies using Bayes’ Theorem with the “naive” assumption that features are
independent of each other given the class.
BTS EX: Predict whether a fan tweet is about **Yoongi, Tae, or Jimin** based on word
patterns.
When to Use:
- Text classification (spam, sentiment)
- Fast and lightweight models
- Categorical data
Best For:
- Email spam detection
- Sentiment analysis
- Language detection
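
A minimal sketch of the word-count version (multinomial Naive Bayes) with add-one smoothing, in standard-library Python. The tweets are invented for illustration, and `train` and `predict` are hypothetical helper names.

```python
import math
from collections import Counter, defaultdict

# Toy tweets (made up) labeled with the member they are about.
tweets = [
    ("yoongi rap verse fire", "Yoongi"),
    ("yoongi piano producer genius", "Yoongi"),
    ("tae deep voice painting", "Tae"),
    ("tae winter bear voice", "Tae"),
    ("jimin dance fancam grace", "Jimin"),
    ("jimin filter dance stage", "Jimin"),
]

def train(tweets):
    """Count how often each class and each (class, word) pair occurs."""
    class_counts = Counter(label for _, label in tweets)
    word_counts = defaultdict(Counter)
    for text, label in tweets:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict(model, text):
    """Pick the class with the highest log prior plus summed log word likelihoods."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        n_words = sum(word_counts[label].values())
        score = math.log(count / total)  # log prior
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(tweets)
print(predict(model, "that rap verse"))
print(predict(model, "dance stage tonight"))
```

Adding log probabilities word by word is exactly where the independence assumption shows up: each word contributes on its own, regardless of the words around it.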
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
7. K-MEANS CLUSTERING
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: Unsupervised Learning
Task: Clustering
Output: Cluster groups
Use Case:
Groups similar data points into clusters based on proximity.
BTS EX: Group ARMY into **vibe clusters**: soft stans, performance stans, lyric-lovers.
When to Use:
- No labels in your data
- You want to find hidden patterns
- Grouping based on similarity
Best For:
- Customer segmentation
- Image compression
- Behavioral clustering
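
The grouping loop can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The fan coordinates are made up, and starting from the first k points is an assumption chosen to keep the sketch deterministic (real implementations usually use smarter or random initialization).

```python
import math

# Toy data (made up): each fan as a 2-D point of listening-habit scores.
fans = [
    (1.0, 1.0), (9.0, 1.0), (5.0, 9.0),
    (1.5, 2.0), (8.0, 2.0), (4.5, 8.0),
    (2.0, 1.5), (8.5, 1.5), (5.5, 8.5),
]

def kmeans(points, k, max_iters=100):
    """Return (cluster index per point, final centroids)."""
    centroids = list(points[:k])  # naive init: the first k points (assumption)
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

labels, centroids = kmeans(fans, k=3)
print(labels)
print(centroids)
```

Note that no labels were given: the cluster indices only say which fans are similar, and naming the groups ("soft stans" and so on) is up to you.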
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Bonus BTS Tip:
If you’re ever stuck, just ask yourself:
*“Is this about predicting something (supervised)? Or discovering patterns
(unsupervised)?”*
And then imagine Jungkook trying to use it for styling decisions.