Notes

Uploaded by

Saptarshi Das

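- StandardScaler scales by standardization: z = (x - mean) / std, so each feature ends up with mean 0 and standard deviation 1

- scaling is important so that features measured on very different ranges become comparable, and relationships can be found without one large-valued feature dominating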
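
The standardization step can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not scikit-learn's implementation; the function name `standardize` and the sample columns `ages` and `incomes` are made up for the example. It assumes the population standard deviation (divide by n), which is what scikit-learn's StandardScaler uses.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Return z-scores (x - mean) / std for one feature column.

    Uses the population standard deviation (divide by n).
    A constant column has std 0, so it is mapped to all zeros.
    """
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

# Two hypothetical features on very different ranges.
ages = [20, 30, 40, 50, 60]
incomes = [20000, 30000, 40000, 50000, 60000]

z_ages = standardize(ages)
z_incomes = standardize(incomes)

print(z_ages)
print(z_incomes)
```

After scaling, both columns have mean 0 and standard deviation 1, so neither dominates just because its raw values are larger.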

https://youtu.be/wZOFkLlRa8c?si=oyCHFQ1P7BsxUMAm

1. Unbalanced data:-

- Oversampling (Duplicate minority class examples)

- Undersampling (Reduce the majority class, i.e. majority class size = minority class size)

- SMOTE (Synthetic Minority Oversampling Technique):-

Oversampling without duplication: it generates new data points
by interpolating between existing minority examples
a. Fit a KNN on the minority class
b. Select examples from the minority class at random
c. For each example, select one of its nearest neighbours at random (for interpolation)
d. Draw a random number between 0 and 1 (the factor)
e. Calculate each new example as

new sample = original sample - factor*(original sample - neighbour)

f. The final dataset consists of the original dataset + the newly created examples
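
Steps a to f can be sketched as below. This is a rough standard-library illustration, not the imbalanced-learn implementation: the function name `smote`, its parameters `n_new`, `k` and `seed`, and the sample points are all assumptions made for the example, and the nearest neighbours are found by brute force instead of a fitted KNN model.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # b. select a minority example at random
        i = rng.randrange(len(minority))
        sample = minority[i]

        # a. find its k nearest neighbours within the minority class
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(sample, p))

        # c. select one of those neighbours at random
        neighbour = rng.choice(others[:k])

        # d. draw a random factor between 0 and 1
        factor = rng.random()

        # e. interpolate: original - factor * (original - neighbour)
        new_point = [s - factor * (s - n) for s, n in zip(sample, neighbour)]
        synthetic.append(new_point)
    return synthetic

# Hypothetical minority-class points with two features each.
minority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5], [3.0, 3.0]]
new_points = smote(minority, n_new=6)

# f. original minority examples + newly created examples
augmented_minority = minority + new_points
print(len(augmented_minority))
```

Each synthetic point lies on the line segment between a minority example and one of its neighbours, which is why no exact duplicates are needed. In a full pipeline the majority-class rows would be appended unchanged to `augmented_minority` to form the final dataset.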
