
Relax tolerance of some tests for numpy 1.22#12682

Merged
pllim merged 3 commits into astropy:main from mhvk:numpy-precision-issues
Jan 5, 2022

Conversation

@mhvk (Contributor) commented Jan 1, 2022

Most likely as a consequence of numpy now supporting the faster but slightly less precise Intel Short Vector Math Library (SVML), some of our tests are failing, in ways that are not really interesting. So, up the tolerance slightly.

Note that I cannot reproduce the failures locally, so this may need some trial and error.
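As a sketch of the kind of change this implies (the numbers below are illustrative assumptions, not values from the actual diff):

```python
import numpy as np

# The failures look like SVML-level precision differences: the fast
# vectorized log/exp/sin paths can differ from the scalar implementations
# by a few ULP.
x = np.linspace(0.1, 10.0, 1000)
expected = np.log(x) + np.sin(x)

# Simulate a result computed via a slightly less precise code path.
result = expected * (1 + 3e-7)

# A very tight tolerance (say rtol=1e-8) would reject this even though
# nothing is meaningfully wrong; bumping rtol slightly keeps the test
# sensitive to real errors while tolerating SVML-level noise.
np.testing.assert_allclose(result, expected, rtol=1e-6)
```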

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see "When to rebase and squash commits".
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the Extra CI label.
  • Is a change log needed? If yes, did the change log check pass? If no, add the no-changelog-entry-needed label. If this is a manual backport, use the skip-changelog-checks label unless special changelog handling is necessary.
  • Is a milestone set? Milestone must be set but astropy-bot check might be missing; do not let the green checkmark fool you.
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate backport-X.Y.x label(s) before merge.

@pllim (Member) left a comment

Seems uncontroversial enough. Thanks!


@mhvk (Contributor, Author) commented Jan 1, 2022

The Python 3.9 Windows failure also seems related to the new numpy - and may be a bug...

I'm also unsure about the matplotlib failure - that might well be numpy-related too, and thus need a new baseline image, but I don't know how to investigate that... Maybe, to get things going again, we could pin numpy for those jobs for now?

@mhvk (Contributor, Author) commented Jan 1, 2022

For the win32 failure, I think this really is a numpy problem, so I raised an issue at numpy/numpy#20699.

@pllim (Member) commented Jan 2, 2022

Thanks for looking into this!

Hmm. Should we wait for numpy/numpy#20699 to be resolved before proceeding?


@mhvk (Contributor, Author) commented Jan 2, 2022

On looking more closely, I now also wonder about the test case that failed: there is actually a huge difference in the parameters (EDIT: in the last one!):

E       AssertionError: 
E       Not equal to tolerance rtol=0.15, atol=0
E       
E       Mismatched elements: 1 / 6 (16.7%)
E       Max absolute difference: 1549800.82273902
E       Max relative difference: 0.99999981
E        x: array([9.999454, 4.999725, 5.000437, 4.000786, 3.999523, 0.30056 ])
E        y: array([9.999455e+00, 4.999673e+00, 5.000516e+00, 4.000724e+00,
E              3.999598e+00, 1.549801e+06])

@mhvk (Contributor, Author) commented Jan 2, 2022

Annoyingly, I cannot reproduce that result...

@mhvk force-pushed the numpy-precision-issues branch from 41ae13c to 4f33333 on January 2, 2022 19:23
@mhvk (Contributor, Author) commented Jan 2, 2022

Possibly, this is because Gaussian2D's theta was not actually initialized. 🤞


@pllim (Member) commented Jan 3, 2022

FYI: I rebased to see if some of the failures go away.

Update: I also updated the truths for mpl311 over at astropy-data instead of trying to pin here, which broke both CircleCI jobs, whereas only mpl311 is failing with those 2 images on main. I don't see anything in the diff, but whatever.

p.s. I think the ResourceWarning about test3.fits is because the other test failed and didn't clean up open file pointers properly. It will go away when we skip the affected test.
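The unclosed-file-pointer mechanism described above can be reproduced generically in plain Python (no FITS involved; the scratch file here is just a stand-in for test3.fits):

```python
import gc
import os
import tempfile
import warnings

# Create a scratch file standing in for the FITS file left open by the test.
fd, path = tempfile.mkstemp()
os.write(fd, b"data")
os.close(fd)

def leaky_read(path):
    f = open(path, "rb")  # no context manager: if an assertion fired before
    return f.read()       # an explicit close, f would be left to the GC

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ResourceWarning)
    leaky_read(path)  # abandoned handle is finalized on return (CPython)...
    gc.collect()      # ...or here on implementations without refcounting
leaked = any(issubclass(w.category, ResourceWarning) for w in caught)

# Deterministic cleanup avoids the warning entirely:
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
```

This is why the ResourceWarning should disappear once the failing test is skipped: nothing is left for the garbage collector to complain about.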

@pllim added the "Extra CI" (Run cron CI as part of PR) label on Jan 3, 2022
@pllim (Member) commented Jan 3, 2022

@mhvk , I took the liberty of also trying to fix all the other failures that cropped up over the holiday weekend here. Hope you don't mind!

@mhvk (Contributor, Author) commented Jan 3, 2022

@pllim - of course fine to push further attempts, thanks!

It does look like something in the modelling really changed, allowing theta to wander off (of course, it is modulo 2pi that counts, but it is strange that the angle goes so much further off than before!). Were you able to confirm any of these on your own machine?

Most likely as a consequence of numpy now supporting the faster but
slightly less precise Intel Short Vector Math Library (SVML), some of
our tests are failing, in ways that are not really interesting.  So,
up the tolerance slightly.

See numpy/numpy#19478

Temporarily pin numpy for matplotlib tests

Ensure theta of Gaussian2D is initialized
@pllim force-pushed the numpy-precision-issues branch from 3ce6f0f to 6eecd23 on January 3, 2022 21:39
@pllim (Member) commented Jan 3, 2022

Were you able to confirm any of these on your own machine?

No, I am doing other work while the CI churns away. The most curious thing is that not all jobs are failing, so something specific is triggering it. I wonder if @WilliamJamieson has any idea...

@pllim (Member) commented Jan 3, 2022

Really don't know what is going on with modeling. I don't see this failure on main, and I don't see how this patch could trigger it. I pushed a commit to just skip it on Windows, to see if other jobs even have the same failure or not.

@pllim force-pushed the numpy-precision-issues branch from bb9bc96 to 4210769 on January 3, 2022 22:30
@pllim (Member) commented Jan 4, 2022

Maybe we don't need this patch if #12684 works... 👀

@pllim (Member) commented Jan 4, 2022

@WilliamJamieson is investigating if he can rewrite the offending modeling test to be more robust.

@pllim (Member) commented Jan 4, 2022

I also see this warning in a local install when I do a pip install numpy -U to upgrade to 1.22. Should we be worried about this?

oldest-supported-numpy 0.15 requires numpy==1.17.3; python_version == "3.8"
and (platform_machine != "arm64" or platform_system != "Darwin")
and platform_machine != "aarch64" and platform_machine != "s390x"
and platform_python_implementation != "PyPy", but you have numpy 1.22.0 which is incompatible.
Successfully installed numpy-1.22.0

@pllim (Member) commented Jan 4, 2022

p.s. I am unable to reproduce the modeling failure on Windows 10 locally (after unskipping it, of course).

@mhvk (Contributor, Author) commented Jan 4, 2022

I tried to follow all the experiments, but I fear it just left me super-confused! In particular, not being able to actually reproduce the failures does not help...

@pllim (Member) commented Jan 4, 2022

Hmm. Looks like we don't have to touch most of the tests if we go with #12685 . What do you think?

@mhvk (Contributor, Author) commented Jan 5, 2022

Ideally, we run our tests as users would, so without setting environment variables. I don't mind relaxing the constraints, but the changes in modelling are really worrying.

So, perhaps the best course is to use #12685 for now, so that the tests pass again and we can continue to merge new stuff, but raise an issue to remind us to find out what is causing the modelling problems - we don't want users to get wrong results if they don't have that variable set!!

@WilliamJamieson (Contributor) commented

@mhvk I suspect that the "modeling problem" is just a case of equivalent parameter sets: the only parameter that is "wrong" is the rotation angle of the Gaussian2D model, and that parameter is technically periodic with period 2pi. My theory is that the scipy.optimize.leastsq method is just wandering to a different "equivalent" parameter. At some point we should consider switching over to scipy.optimize.least_squares, which is now the recommended method.

Note that the test changes I suggested in #12684 are passing for the Gaussian2D; they compare the outputs of the two models over many test points, demonstrating that the two models seem to arrive at the same results. This type of test is indifferent to "equivalent" parameter values. The concerning thing to me is that this seems to only happen in a specific environment.
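The "equivalent parameter sets" point can be illustrated without astropy: a rotated 2D Gaussian written out by hand (the parameterization below mirrors the usual Gaussian2D form, but is an independent sketch with made-up parameter values) produces identical outputs when theta shifts by a full period, so an output-based comparison is indifferent to which equivalent angle the fitter lands on.

```python
import numpy as np

def gaussian2d(x, y, amplitude, x0, y0, sx, sy, theta):
    # Standard rotated elliptical Gaussian; written out here so the
    # example does not depend on astropy's Gaussian2D itself.
    a = np.cos(theta)**2 / (2 * sx**2) + np.sin(theta)**2 / (2 * sy**2)
    b = -np.sin(2 * theta) / (4 * sx**2) + np.sin(2 * theta) / (4 * sy**2)
    c = np.sin(theta)**2 / (2 * sx**2) + np.cos(theta)**2 / (2 * sy**2)
    dx, dy = x - x0, y - y0
    return amplitude * np.exp(-(a * dx**2 + 2 * b * dx * dy + c * dy**2))

x, y = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
params = dict(amplitude=10.0, x0=0.5, y0=-0.3, sx=1.0, sy=2.0)

z1 = gaussian2d(x, y, theta=0.3, **params)
# Same model with the angle shifted by a full period: the *parameters*
# look different, but the *outputs* agree to floating-point precision.
z2 = gaussian2d(x, y, theta=0.3 + 2 * np.pi, **params)
np.testing.assert_allclose(z1, z2)
```

A parameter-wise assertion would flag theta here while an output-wise one passes, which is exactly the failure mode seen in the CI log.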

@pllim (Member) commented Jan 5, 2022

we run our tests as users would, so without setting environment variables

That is a good point as well. Now I am torn... We have two options:

  1. If we decide we do not want to rely on a magic env var, we go with this PR, but first I need to cherry-pick @WilliamJamieson 's fixes from TST: Use NPY_DISABLE_CPU_FEATURES for numpy 1.22 #12684 .
  2. If we decide we do not want to adjust tests, we go with TST: Use NPY_DISABLE_CPU_FEATURES for numpy 1.22 (try 2) #12685 .
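For reference, the env-var route amounts to setting the variable before the test run; a rough CI-workflow sketch follows (this is not the actual diff from #12685, and the feature names shown are an assumption that must match features the runner's CPU actually has):

```yaml
# GitHub Actions step (illustrative): tell numpy 1.22+ to skip the
# SIMD-dispatched code paths, so tests see the older, more precise
# scalar implementations instead of SVML.
- name: Run tests without SVML-accelerated paths
  env:
    NPY_DISABLE_CPU_FEATURES: "AVX512F"
  run: python -m pytest
```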

@saimn (Contributor) commented Jan 5, 2022

Not using the env var seems better, since it's hard to ensure that it's always set, e.g. when running pytest directly. So, as it seems we now have a solution for the various failing tests - adjusting the constraints - I would go with that.

@mhvk (Contributor, Author) commented Jan 5, 2022

@WilliamJamieson - thanks, that makes sense. As you say, the weird thing is that it depends on environment. But as long as one gets a fit that actually works, it does not matter too much.

@mhvk (Contributor, Author) commented Jan 5, 2022

I also would tend to go with this PR, i.e., not using the environment variable. I have some hope that improvements will be made over at numpy so that the precision gets better again, but for our purposes the matches are still amply good enough.

pllim and others added 2 commits January 5, 2022 11:47
TST: Pin numpy in pyinstaller job
because I have no idea what is going on there. Undoing the removal of numpy.distutils gives me an error about _private.

Skip failing FITS test due to numpy/numpy#20699
but we need to open follow-up issue to unskip it.

DOC: Replace broken Simbad links.
Added back in original assert, ignoring only for Gaussian2D.
@pllim force-pushed the numpy-precision-issues branch from 4210769 to c33be6f on January 5, 2022 16:50
@pllim (Member) commented Jan 5, 2022

Okay, I cleaned up this PR. Hopefully tests will pass now. I still don't understand why pyinstaller fails with numpy 1.22, but I have pinned it to 1.21.* for now.

@pllim (Member) commented Jan 5, 2022

The "Allowed failure" failure is an unrelated socket timeout.

@pllim merged commit 3d522eb into astropy:main on Jan 5, 2022
@pllim (Member) commented Jan 5, 2022

Thank you, everyone!


pllim added a commit to pllim/astropy that referenced this pull request Jan 5, 2022
Co-authored-by: Marten van Kerkwijk <[email protected]>
Co-authored-by: William Jamieson <[email protected]>
@mhvk deleted the numpy-precision-issues branch on August 26, 2023 10:11