@Micky774
Owner

Reference Issues/PRs

Downstream of scikit-learn#22616

What does this implement/fix? Explain your changes.

This is a "live tracking" PR that handles the reconciliation and reintroduction of the Boruvka algorithm. It exists because I find it more convenient to adapt to incremental changes in the upstream branch as they land than to reconcile a massive gap once the upstream branch is actually merged into scikit-learn/main.

Any other comments?

Everything here is volatile and subject to significant change.

- Added support for `n_features_in_`
- Improved validation and added support for `feature_names_in_`
- Renamed `kwargs` to `metric_params` and added safety check
  for an empty dict
- Removed attributes set in init and deferred to properties
- Raised error if tree query is performed with too few samples
- Cleaned up some list/dict comprehension logic
- Removed internal minkowski metric parameter validation in favor
  of `sklearn.metrics` built-in handling
- Removed default argument and presence of `p` in hdbscan functions
- Now users must pass `p` in through `metric_params`, consistent w/
  other metrics such as `wminkowski` and `mahalanobis`
- Removed vestigial estimator check -- now supported via common tests
- Fixed bug where `boruvka_kdtree` algorithm's accepted metrics were
  based off of `BallTree` not `KDTree`
- Cleaned up lines with unused returns by indexing output of `hdbscan`
- Greatly expanded scope of algorithm/metric compatibility tests
- Streamlined some other tests
- Deleted commented-out tests
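
For reviewers, here is a rough sketch of the `metric_params` change listed above: `p` is no longer a standalone argument and instead travels inside `metric_params`, with `None` and an empty dict treated the same way. This is not the code in this PR. The helper names `normalize_metric_params`, `minkowski`, and `pairwise_distance` are hypothetical and only illustrate the calling convention, and the fallback of `p=2.0` when nothing is passed is an assumption made for the sketch.

```python
from typing import Optional, Sequence


def normalize_metric_params(metric_params: Optional[dict]) -> dict:
    """Hypothetical helper: treat None and an empty dict identically."""
    # Copy so the caller's dict is never mutated downstream.
    return dict(metric_params) if metric_params else {}


def minkowski(u: Sequence[float], v: Sequence[float], p: float = 2.0) -> float:
    """Plain Minkowski distance of order p (p=2.0 default is an assumption)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def pairwise_distance(u, v, metric_params=None) -> float:
    """Hypothetical entry point: `p` is only accepted via metric_params."""
    params = normalize_metric_params(metric_params)
    return minkowski(u, v, **params)


if __name__ == "__main__":
    print(pairwise_distance([0, 0], [3, 4]))                          # metric default
    print(pairwise_distance([0, 0], [3, 4], metric_params={"p": 1}))  # p passed explicitly
```

Routing `p` through `metric_params` keeps the function signature the same for every metric, so metric-specific arguments never need their own keyword.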