Improve numerical precision of discrete uniform and geometric ICDFs#6671
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #6671 +/- ##
==========================================
- Coverage 91.99% 88.84% -3.16%
==========================================
Files 95 95
Lines 16016 16024 +8
==========================================
- Hits 14734 14236 -498
- Misses 1282 1788 +506
|
329e78e to
b95d06d
Compare
|
|
b95d06d to
2feed3e
Compare
There was a problem hiding this comment.
You have to use a switch here, ifelse will only work for scalar cases
|
Make sure you pass |
2feed3e to
4a40ee6
Compare
|
Thanks, replaced Even with |
There was a problem hiding this comment.
Do the old implementations fail with just 3 9's? I am wary of testing such an extreme value by default because it puts a high burden on future icdfs on being very accurate for extreme values. We know these things must always break at some point, specially for unbounded distributions.
| domain = Domain([0, 0.1, 0.5, 0.75, 0.95, 0.99, 0.9999, 1]) # Values we test the icdf at | |
| domain = Domain([0, 0.1, 0.5, 0.75, 0.95, 0.99, 0.999, 1]) # Values we test the icdf at |
There was a problem hiding this comment.
If you want to cover the changes in this PR, you can add a small test where you just evaluate the icdf at the values that used to fail for the two distributions
There was a problem hiding this comment.
If you want to cover the changes in this PR, you can add a small test where you just evaluate the icdf at the values that used to fail for the two distributions
To clarify my understanding, you mean I add the specific test for those distributions that fail and not something general that applies to all future distributions right?
There was a problem hiding this comment.
Do the old implementations fail with just 3 9's? I am wary of testing such an extreme value by default because it puts a high burden on future icdfs on being very accurate for extreme values. We know these things must always break at some point, specially for unbounded distributions.
I added 0.9999, since as outlined in #6670,
for a Geometric distribution with p=0.99, and CDF value 0.9999 whose ICDF is expected to be 2 but it gives 3 as the result. It works fine where the ICDF of 0.99 is 1. To summarize this in a table:
p=0.99
| Value | CDF | expected ICDF of the CDF | returned ICDF of the CDF | Bug |
|---|---|---|---|---|
| 1 | 0.99 | 1 | 1 | No |
| 2 | 0.9999 | 2 | 3 | Yes |
There was a problem hiding this comment.
To clarify my understanding, you mean I add the specific test for those distributions that fail and not something general that applies to all future distributions right?
Yes that's what I suggest. Also note that instead of tweaking the icdf value we can tweak the domain of the distribution parameters that is tested. Probably testing more extreme Geometric p values would also have shown the error?
Regardless, a separate test is also fine
There was a problem hiding this comment.
Thanks @ricardoV94 .
I have decided to do this, let me know if it makes sense:
With my latest commit, I have removed the value 0.9999 in the test for ICDF to not burden all future ICDF code regarding high accuracy.
p=0.99 is already used, as defined by Unit in testing.py.
The ICDF tests will instead be covered by the self-consistency check for ICDF of discrete distributions, which I will add in this PR #6642 . (Note that the self-consistency check for ICDF of continuous distributions is already added in #6642, and the discrete self consistency check is blocked on the current PR because of this accuracy issue. Thus I will add it to #6642 after this current PR is closed.)
Perfect |
db0b98f to
2348e4a
Compare
|
Is this ready to merge then? |
Yes, in my opinion. |
|
Awesome, great work! |
|
Thank you! :) |
What is this PR about?
Improve numerical precision issues for the ICDF function for the discrete geometric distribution.
Actually, this is an issue in general, wherever the ceiling function is used on a floating point number to get the ICDF value. Thus this fix for the precision issue applies to most discrete ICDF functions, and not only the Geometric distribution.
Details are outlined in this issue #6670 .
Closes #6670
Checklist
Major / Breaking Changes
New features
Bugfixes
Documentation
Maintenance
📚 Documentation preview 📚: https://pymc--6671.org.readthedocs.build/en/6671/