Skip to content

Conversation

@thibaultcordier
Copy link
Collaborator

@thibaultcordier thibaultcordier commented Jun 11, 2024

Description

Decoupling infinite interval production and quantile calculation

Fixes #462

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • Test when alpha and the number of effective calibration samples are compatible
  • Test that the production of infinite intervals and the calculation of quantiles are properly decoupled

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully : make lint
  • Typing passes successfully : make type-check
  • Unit tests pass successfully : make tests
  • Coverage is 100% : make coverage
  • Documentation builds successfully : make doc

@thibaultcordier thibaultcordier marked this pull request as ready for review June 13, 2024 15:13
Copy link
Collaborator

@LacombeLouis LacombeLouis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @thibaultcordier,
I think the PR is great! Thank you for including the examples! This PR could be approved as is, but there are some small changes that I think could help make it clearer. Especially in the example, I don't think it's needed to add the prefit_asym part.

Thank you!



##############################################################################
# Experiment 2: Again but without fixing random_state
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only keep the experiment 2 where you fix the random_state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to keep this part, because the reader will no longer understand why in the next plot, the different methods no longer overlap when they did before.

)


def get_effective_calibration_samples(scores: NDArray, sym: bool):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the sym not by default True?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sym is by default True in ConformityScore. In any case, we need to compute the effective calibration sample size to compute the quantile.

By default ``False``.
allow_infinite_bounds: bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we call it allow_infinite_bounds and the unbounded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In MapieRegressor, it is helpful to keep allow_infinite_bounds to understand what it is. In get_bounds, it is not helpful to understand the code because too long.



##############################################################################
# Experiment 4: Extensive experimentation on different delta and n_calib
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would only show the reference, prefit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important to show the asymmetric case to avoid any misunderstanding/publication of an issue that is not an issue.



##############################################################################
# Section 2: Comparison with different MAPIE CP methods
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that this example adds a lot of insights. I would make this example shorter by getting rid of the prefit_asym part.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, this part could be improved by adding more MAPIE methods (like split, CV+, CQR etc). In any case, we let the user test it themselves.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Coverage validity not verified in MapieRegressor when producing infinite intervals

2 participants