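- StandardScaler scales features by standardization: z = (x - mean) / standard deviation, so each feature ends up with mean 0 and standard deviation 1
- scaling is important so that features measured on very different ranges contribute comparably, and the relationship between them is not dominated by whichever feature has the largest values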
https://youtu.be/wZOFkLlRa8c?si=oyCHFQ1P7BsxUMAm
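A minimal sketch of the standardization formula from the notes above, in plain Python. The `standardize` helper and the `ages` / `incomes` values are made up for illustration. It uses the population standard deviation, which is assumed here to match what scikit-learn's StandardScaler does.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / std, using the population std."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant feature has no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]

# Two hypothetical features on very different scales.
ages = [20, 30, 40, 50, 60]
incomes = [20_000, 30_000, 40_000, 50_000, 60_000]

scaled_ages = standardize(ages)
scaled_incomes = standardize(incomes)

print(scaled_ages)
print(scaled_incomes)
```

After scaling, both features have mean 0 and standard deviation 1, so the 1000x difference in raw magnitude no longer matters.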
1. Unbalanced data:-
- Oversampling (duplicate minority class samples)
- Undersampling (reduce the majority class, i.e. majority class size = minority
class size)
- SMOTE: Synthetic Minority Oversampling Technique:-
Oversampling without duplication: it generates new synthetic data points
Uses interpolation
a. Fit a KNN (k-nearest neighbours) on the minority class
b. Select examples from the minority class at random
c. Select a neighbour of each example at random (for interpolation)
d. Draw a random number between 0 and 1 (the factor)
e. Calculate the new examples as
original sample - factor*(original sample - neighbour)
f. The final dataset consists of the original dataset + the newly created
examples
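
A rough sketch of the three resampling options above, in plain Python. All function names, the toy `majority` / `minority` points, and `k=2` are made up for illustration. The neighbour search is brute force instead of a fitted KNN model, and it assumes at least two minority samples. This is not the imbalanced-learn implementation, just steps a to f written out.

```python
import math
import random

def random_oversample(minority, target_size, rng):
    """Oversampling: duplicate randomly chosen minority samples up to target_size."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra

def random_undersample(majority, target_size, rng):
    """Undersampling: keep a random subset of the majority class."""
    return rng.sample(majority, target_size)

def smote(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        # b. select a minority example at random
        idx = rng.randrange(len(minority))
        sample = minority[idx]
        # a. k nearest minority neighbours of that example (brute force, excluding itself)
        others = [p for j, p in enumerate(minority) if j != idx]
        neighbours = sorted(others, key=lambda p: math.dist(sample, p))[:k]
        # c. select one neighbour at random
        neighbour = rng.choice(neighbours)
        # d. random factor in [0, 1)
        factor = rng.random()
        # e. new example = original sample - factor * (original sample - neighbour)
        new_point = tuple(s - factor * (s - n) for s, n in zip(sample, neighbour))
        synthetic.append(new_point)
    return synthetic

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical 2-D data: 10 majority points, 4 minority points.
majority = [(float(i), float(i % 3)) for i in range(10)]
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]

oversampled = random_oversample(minority, len(majority), rng)
undersampled = random_undersample(majority, len(minority), rng)

new_points = smote(minority, n_new=len(majority) - len(minority), k=2, rng=rng)
# f. final minority set = original examples + newly created examples
balanced_minority = minority + new_points

print(len(oversampled), len(undersampled), len(balanced_minority))
```

Each synthetic point lies on the line segment between a minority sample and one of its neighbours, which is why SMOTE adds new points without repeating existing ones exactly.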