ENH add from_cv_results in RocCurveDisplay (single RocCurveDisplay)#30399
ENH add from_cv_results in RocCurveDisplay (single RocCurveDisplay)#30399jeremiedbb merged 77 commits intoscikit-learn:mainfrom
from_cv_results in RocCurveDisplay (single RocCurveDisplay)#30399Conversation
sklearn/utils/_plotting.py
Outdated
| if n_multi is None: | ||
| name = self.estimator_name if name is None else name | ||
| else: | ||
| name = [f"{curve_type} fold {curve_idx}:" for curve_idx in range(n_multi)] |
There was a problem hiding this comment.
If we go with this implementation, I thought this change could be used for other multi cv displays. Not 100% sure on this change though.
|
As discussed in today's meeting, this is my favorite solution because it's the simplest and least surprising one from a user point of view, even though it adds a bit more internal complexity than the others. And I think we can mitigate some of it by extracting parts of the It also looks like a good portion of the added complexity will be exactly the same for other displays like PRCurveDisplay, so there might be a chance that we'll be able to factorize some parts to be used by several displays. |
|
The changes in f0908e1 and 7e77d4c factorizes out common code (compared to #30508), adding helper function to either |
glemaitre
left a comment
There was a problem hiding this comment.
Only a couple of nitpicks. It looks great on my side and almost ready to be merged.
|
I think I have addressed everything, thanks @glemaitre ! |
|
@jeremiedbb gentle ping on this, thanks! |
jeremiedbb
left a comment
There was a problem hiding this comment.
I pushed tiny nitpicks.
LGTM. Thanks !
|
Thanks @jeremiedbb ! |
Co-authored-by: Jérémie du Boisberranger <[email protected]>
Co-authored-by: Jérémie du Boisberranger <[email protected]>
Co-authored-by: Jérémie du Boisberranger <[email protected]>
Reference Issues/PRs
Reference Issues/PRs
Supercedes #25939
This is part of a group of draft PRs to determine best API for adding plots for cv results to our displays.
from_cv_resultsinRocCurveDisplay(Multi-display) #30359)For all 3 options we take the output of
cross_validate, and use the fitted estimator and test indicies. No fitting is done in the display.We do recalculate the predictions (which would have already been done in
cross_validate), which could be avoided if we decided to changecross_validateto optionally return the predictions as well (note this would makecross_val_predictredundant).See more thread: #25939 (comment)). I think should be outside of the scope of this body of work though.
What does this implement/fix? Explain your changes.
Not 100% I've implemented this optimally.
RocCurveDisplayobject may contain data (fpr/tpf etc) for single or multi curvesRocCurveDisplayreturns single mpl Artist object, or list of objects for multi curvesRocCurveDisplay.plothandles both single and multi-curve plotting, this has meant a lot more checking is required (c.f. the other 2 implementations, as this is the only case where you can useplotdirectly to plot multi-curves)More specific concerns detailed in review comments
Plot looks like:
TODO
We should update
visualization.rstafter this PR is in to add a section aboutfrom_cv_results.