-
-
Notifications
You must be signed in to change notification settings - Fork 12k
WIP, BLD: Test (break) NumPy import from wheel install by obscuring DLL resolution #12994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Travis missing from CI list it seems -- wonder if that's related to the "draft PR" or something else. |
|
same for circleci -- maybe a longer queue, or related to the draft? |
|
I added a commit with the distutils changes that seemed to restore all-green Azure CI in my debug branch, overcoming the DLL path resolution narrowing from the first commit. Looks like Shippable really doesn't like this change at first glance! I don't even see suitable error output there. |
|
Closing/Reopening to restart shippable builds |
|
We should also remove the code that modifies |
|
You mean remove the line that I changed instead of changing it? Pretty sure I've adjusted that line you link to in this PR, right? Travis still not triggered for draft PR it seems, which means codecov isn't either. |
|
If an inappropriate extension is currently being loaded, shouldn't a unit test fail somewhere for linalg? |
|
This chunk of code modifies I am pretty sure this change will cause linalg to fail to load, since |
* OpenBLAS-related DLL is now placed in numpy/core instead of numpy/.libs to alleviate issues with DLL resolution that have cropped up in Anaconda environments * added a test in Azure pipelines workflow to mimic the Anaconda DLL resolution mangling
04b5dff to
fbb93eb
Compare
|
Ok, I removed that chunk of code and everything works just fine for Azure Windows on my branch, so I've pushed the revisions to this branch now & lifted draft status on PR so Travis can run too. Note that there's ongoing debate between @rgommers & @zooba in the linked issue as to whether this approach is really sustainable for the broader scientific Python ecosystem, though I'm not sure if Ralf is implying we can't do this for NumPy for now. Anyway, I'll leave this open for now--I'm expecting the CI to pass based on my debug work on another branch, but I guess we'll see. Not sure if we'd want to make an actual unit test in the test suite instead of a CI stage intervention for "testing" if we do go this route. |
|
|
||
| # Setup directory for storing generated extra DLL files on Windows | ||
| self.extra_dll_dir = os.path.join(self.build_temp, '.libs') | ||
| self.extra_dll_dir = os.path.join(self.build_temp, 'core') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes from a generic solution to something that's specific to numpy itself. I think we can do require each package to add some code in combination with changes here, but we can't just make numpy-specific directory layout assumptions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if other packages are using package root level .libs folder we'd probably need to retain that default and either have a generic way to offer a customized switch for the DLL storage folder, or bake in a switch if building NumPy only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scipy-wheels is still on a commit before 473a6f0, so it still has extra-dll rather than .libs. But it looks like we'll need to upgrade that repo to latest numpy master to get the fix for this PATH issue.
not sure yet - but as in my other comment, this |
|
What if we add a |
I think that falls within the lexicon of my previous suggestion above to:
But want to be cautious here. |
Works in principle, however a practical issue is that by default there is no I do think I have a preference for sticking with |
|
This PR does seem to work for NumPy. I think the
That's an extra release to be done, but we're getting multiple bug reports a day that will mostly dry up then, so probably worth it. |
|
I'm just trying the other solution now, but if that doesn't work out swiftly then we may have to I guess |
@mattip this turned out to not be an issue right? reason: |
|
I marked this for backport just so it doesn't get lost. |
|
The fix seems good, linalg still works even without the code to modify |
|
This PR will likely break SciPy because of the distutils lib change. The alternate won't. Each has its own tradeoffs. I think I like the alternate more to reduce impact on the ecosystem by steering clear of distutils. |
|
Do we want to pursue this test? Anaconda/numpy seems to be working OK now, we have the check in the |
|
We do have the regression test used in Azure CI here already merged to master with the alternative solution. I'm still not clear on exactly why either solution only affects conda. I think it is more about likelihood of being helpful--technically it seems like a user could do what I did with the test here / there to break a source install by bricking access to But this is so obscure beyond the recent Anaconda event that there's some arguments that the fix should be migrated to the wheels repo only. If we do that, I'd probably want to clone in the modified On the other hand, with Anaconda reversing the design decision that caused the issue in the first place, there's even less justification for the fix. Can the alternative fix actually do harm? Is guarding against this obscure scenario worth it only for wheels or not even for wheels anymore? I like being defensive against the issue possibly happening again, but there was certainly a wide-ranging set of opinions & some solutions affecting downstream things of course. Probably want to wait a bit for clear consensus. |
I don't care so much about the test, but we do need to fix the problem before it re-occurs; it will likely be reintroduced for Python 3.8 through the Windows Store even if Anaconda stays as is.
It doesn't only affect conda.
There's no urgency at the moment I think, but let's mark it for the next release to not completely forget about it. |
|
I'll just ping @rgommers to double check on that. I think I wasn't quite getting the full picture, but maybe the "sub-optimal" approach is indeed ok. |
|
I think the current approach that's in master is probably fine. However there are still two problems that needs solving before the next release: #13019 (comment) Could one of you tackle those before closing this (or close this and open a new issue for the todos with the 1.17.0 milestone)? |
|
@mattip is this still needed/wanted? |
|
Closing. This was resolved and is tested differently |
|
actually closing now |
Fixes #12667
The simplest & apparently favored fix for failed NumPy imports on (conda) Windows proposed in #12667 is to ensure that requisite DLL(s) are physically present in the
numpy/corepath.Before trying a fix that moves the DLL path(s) within a built NumPy wheel, I thought it might be sensible to have a "test" that tries to mimic the conda-related import failures described in the above issue.
Not sure if this is too aggressive, by confining DLL search resolution to
system32, but it does seem to nicely generate anImportError: DLL load failed.I guess the answer to "is this too aggressive?" is if the physical placement of the DLL(s) at the core path restores the functional
numpyimport.