Automated estimation of the number of resamplings given the size of the training data #231

@ivan-marroquin

Is your feature request related to a problem? Please describe.
When defining the "cv" splitter with the Subsample class, both "n_resamplings" and "n_samples" must be provided. If "n_resamplings" is too small for the chosen "n_samples", the following warning is raised:

"WARNING: at least one point of training set belongs to every resamplings. Increase the number of resamplings"

Describe the solution you'd like
I think it would be beneficial to have an automated way to estimate "n_resamplings" given "n_samples". For instance, a user could fix "n_samples" as a fraction of the training set: n_samples = int(0.25 * gral_train_inputs.shape[0])

Then "n_resamplings" would be determined automatically from the size of the training data.
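
To make the request concrete, here is a minimal sketch of how such an estimate could be computed. It is not MAPIE code: "estimate_n_resamplings" and "max_fail_prob" are hypothetical names, and the sketch assumes each resampling is drawn independently and uniformly from the training indices. Under those assumptions, a given point falls in one resampling with probability p = 1 - (1 - 1/n)^m when sampling with replacement (p = m/n without), so a union bound gives P(some point is in all K resamplings) <= n * p^K, which can be solved for K:

```python
import math


def estimate_n_resamplings(n_train, n_samples, replace=True, max_fail_prob=0.01):
    """Hypothetical helper, not part of MAPIE.

    Returns the smallest K such that, by a union bound, the probability
    that at least one training point belongs to all K resamplings is at
    most max_fail_prob. Assumes independent, uniform resamplings.
    """
    if n_train < 1 or n_samples < 1:
        raise ValueError("n_train and n_samples must be positive")
    if not 0 < max_fail_prob < 1:
        raise ValueError("max_fail_prob must be in (0, 1)")

    # Probability that a fixed training point appears in one resampling.
    if replace:
        p_in = 1.0 - (1.0 - 1.0 / n_train) ** n_samples
    else:
        p_in = n_samples / n_train

    if p_in >= 1.0:
        # Every point is in every resampling, so no K can work.
        raise ValueError("each resampling covers the whole training set")

    # Solve n_train * p_in**K <= max_fail_prob for K.
    k = math.log(max_fail_prob / n_train) / math.log(p_in)
    return max(1, math.ceil(k))


# Example with the setting from this issue: n_samples is 25% of the data.
n_train = 1000
n_samples = int(0.25 * n_train)
print(estimate_n_resamplings(n_train, n_samples))
```

The returned value could then be passed as "n_resamplings" when building the Subsample splitter. The bound is conservative, so a smaller value may also avoid the warning in practice.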

Describe alternatives you've considered
In my case, I decided to fix "n_samples" as shown above. But now I have to use trial and error to find the minimum "n_resamplings" that avoids the warning message and ensures good statistical results.
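
The trial-and-error search described above can also be automated. The sketch below does not call MAPIE; it only mimics the condition in the warning message (a training index that belongs to every resampling) using Python's standard random module. The function name, its parameters, and the seed are hypothetical, and MAPIE's own random stream will differ, so the result is only indicative:

```python
import random


def min_resamplings_empirical(n_train, n_samples, replace=True,
                              max_resamplings=1000, seed=0):
    """Hypothetical helper, not part of MAPIE.

    Draws resamplings one at a time and returns the number drawn when no
    training index belongs to all of them any more.
    """
    rng = random.Random(seed)
    indices = range(n_train)
    in_all = set(indices)  # indices present in every resampling so far

    for k in range(1, max_resamplings + 1):
        if replace:
            drawn = set(rng.choices(indices, k=n_samples))
        else:
            drawn = set(rng.sample(indices, n_samples))
        in_all &= drawn
        if not in_all:
            return k

    raise RuntimeError("some index is still in every resampling; "
                       "increase max_resamplings or reduce n_samples")


# Example with the setting from this issue: n_samples is 25% of the data.
n_train = 1000
print(min_resamplings_empirical(n_train, int(0.25 * n_train)))
```

Because the answer depends on the random draws, it seems prudent to run this with several seeds and add a margin before settling on "n_resamplings".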

Kind regards,
Ivan

Labels: Good first issue (Easy issue to start to contribute to MAPIE)