-
-
Notifications
You must be signed in to change notification settings - Fork 26.5k
ENH _hdbscan HIERARCHY data type introduction
#25826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH _hdbscan HIERARCHY data type introduction
#25826
Conversation
jjerphan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Micky774!
Here is a first pass. I guess conversion of cnp.ndarray to (const-qualify) might be done in another PRs not to pollute the diff of this PR.
Co-authored-by: Julien Jerphanion <[email protected]>
|
@thomasjpfan @jjerphan wondering what you two think of the PR now |
jjerphan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the heads-up, @Micky774.
Co-authored-by: Julien Jerphanion <[email protected]>
Co-authored-by: Julien Jerphanion <[email protected]>
| cluster = cluster_map[cluster_num] | ||
| max_lambda = deaths[cluster] | ||
| if max_lambda == 0.0 or not np.isfinite(lambda_array[n]): | ||
| if max_lambda == 0.0 or isinf(lambda_array[n]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be?
| if max_lambda == 0.0 or isinf(lambda_array[n]): | |
| if max_lambda == 0.0 or not isinf(lambda_array[n]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original checks that it is not finite rather than not infinite, so I think the current isinf is correct -- unless I have it further mixed up somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I'm thinking about:
| if max_lambda == 0.0 or isinf(lambda_array[n]): | |
| if max_lambda == 0.0 or not isfinite(lambda_array[n]): |
Because of nan behavior:
not isfinite(NaN) == 1
isinf(NaN) == 0Is lambda_array[n] never nan by construction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, lambda_array stems from the computed minimum reachability distance, which is never expected to be nan since we separate those points before processing
Co-authored-by: Julien Jerphanion <[email protected]>
Reference Issues/PRs
Towards #24686
What does this implement/fix? Explain your changes.
HIERARCHYdtypeandc-typefloat64_tarrays withHIERARCHYwhere appropriateAny other comments?
This is a blocker for more sweeping algorithm refactors and simplifications which rely on the structure of the dtype