MAINT Introduce BaseCriterion as a base abstraction for Criterions#24678
MAINT Introduce BaseCriterion as a base abstraction for Criterions#24678adam2392 wants to merge 49 commits intoscikit-learn:refactor/cython-tree-class-hierarchyfrom
BaseCriterion as a base abstraction for Criterions#24678Conversation
|
I think this PR includes by mistake unrelated Cython files from the |
ogrisel
left a comment
There was a problem hiding this comment.
You might want to check for BaseCriterion instead of Criterion in:
scikit-learn/sklearn/tree/_classes.py
Line 324 in bcff21e
to make it easier to subclass BaseDecisionTree with a custom criterion although other parts of the fit method assume a float64 dtyped y array. The various steps of the fit method should probably be split in private submethod to make it easier to subclass. This can be addressed in subsequent PRs.
|
Cross-linking: #20648 |
|
Hi @jjerphan @ogrisel and @glemaitre just wanted to do a light ping to see if there's any feedback regarding this and if this is good to merge? This is probably the most simple change I proposed in the roadmap for improving trees. |
This comment was marked as outdated.
This comment was marked as outdated.
|
@jshinm has volunteered to try to address the comments you outlined in the code. Thanks! |
|
I re-ran the asv benchmark after closing all open apps on my Mac M1: There are no performance regressions with including this refactoring. However, not sure why we saw an improvement in memory for sparse data. Might be chance? Saw similar results when I ran it yesterday tho (#24678 (comment)) cc: @jjerphan Our next steps are for @jshinm to re-run the asv benchmarks and to also address the code changes. |
|
Just wanted to follow up to get some idea what might be blocking this? We have several downstream PRs that fully flush out the modularity of criterion/splitter in the right directions w/o introducing performance degradation, so don't wanna continue working on those if something is drastically going to change here. Thanks! |
It has not got response nor attention mainly because of maintainers' lack of time, unfortunately. |
There was a problem hiding this comment.
@Vincent-Maladiere and I are currently taking some time to get a better understanding of those implementations.
In the meantime, I think we better use a feature branch for such refactor (this will allow us to integrate this work more safely and confidently). In this regards, I have created refactor/cython-tree-class-hierarchy from main.
@adam2392: could you change the target branch of this PR to refactor/cython-tree-class-hierarchy?
Done! |
|
The only CI issue is the CHANGELOG, but IIRC, that is fine. |
|
Here is an example of how such a refactoring is utilized: https://github.com/neurodata/scikit-tree/blob/main/sktree/tree/_unsup_criterion.pyx |
|
Good job! |
|
What happened here? Why was it closed? |
|
Sorry for the annoyance. @adrinjalali mentioned the I just have repushed the branch since I had a copy on my clone locally. @adam2392: I cannot reopen this PR, can you? |
|
Sure I can re-open this PR. May I ask what is the desire in doing so? Do we think this might move forward? I don't want to get my hopes up :p Do I target the now deleted branch, or do I target |
I can't guarantee this will move forward. We need to have people be involved on the Cython side of the project (I personally am contributing to scikit-learn on my free time solely).
|
Reference Issues/PRs
Fixes #24577
What does this implement/fix? Explain your changes.
Adds the
BaseCriterionclass as an abstract class for just the API relevant changes, while keepingCriterionas the assumed "supervised learning" criterion.This change is backwards compatible.
Any other comments?