Skip to content

Improve build isolation in PythonPipBuilder#40224

Merged
adamjstewart merged 1 commit intospack:developfrom
haampie:python/more-isolation
Oct 4, 2023
Merged

Improve build isolation in PythonPipBuilder#40224
adamjstewart merged 1 commit intospack:developfrom
haampie:python/more-isolation

Conversation

@haampie
Copy link
Copy Markdown
Member

@haampie haampie commented Sep 27, 2023

This PR supersedes and is the more general / proper fix for #31131

Needs #40234 (currently rebased on it).

We run pip with --no-build-isolation because we don't wanna let pip install
build deps. As a consequence, when pip runs hooks, it runs hooks of any
package it can find in sys.path 1

This is problematic both for external and non-external python.

  • For external python: pip may run hooks of any package in system and user
    site-packages dirs
  • For non-external python: spack by design doesn't have system site-packages
    2, or at least that's empty, but user site-packages are still picked up.

An example of the problem is when you e.g. spack install py-setuptools@63 ^[email protected]
when you have e.g. setuptools==44.0.0 installed as a user package in
~/.local.

During the build, pip executes the hook setuptools.finalize_distribution_options,
and looks for entry points through (vendored) importlib.metadata.entry_points(...).
This finds ~/.local/.../setuptools.egg-info/entry_points.txt and executes its
hooks, and that fails badly, cause setuptools that's being built is already
imported, and the old hooks of user setuptools refer to functions that no
longer exist in newer setuptools.

Solution for external Python

Creating an empty venv before invoking pip in Spack's build env ensures that
pip can no longer see user/system site-package, but is limited to standard
libraries and packages in PYTHONPATH.

The downside of this approach is that shebangs generated by pip now will point
to the temporary virtual environment path, instead of the underlying python.
The current workaround for that is to fixup shebangs post install.

Solution for Spack managed Python

We only have to drop user site-packages from sys.path, and for that an
actual environment variable exists: PYTHONNOUSERSITE=1, so pip is invoked
with that.

Alternative solutions

Alternative solution that didn't work well was to try and run pip like python -S -m pip,
which also drops the system / user site packages paths -- problem is that pip
spawns subprocesses and those flags don't propagate.

Using environment variables for external Python is also not an option, because
there's no corresponding env variable to -S, only to -s.

TL;DR: use venv so pip cannot run hooks from system / user packages, only from
stuff in PYTHONPATH, aka our "build environment".

Footnotes

  1. This is through importlib.metadata.entry_points(...); for older Python
    and older setuptools things are even worse, cause it has a buggy old vendored
    importlib.

  2. although users may install py-pip ^python, and then "accidentally"
    pollute the original spack python prefix with pip install system packages.

@haampie
Copy link
Copy Markdown
Member Author

haampie commented Sep 27, 2023

Dunno if somebody can suggest a better solution for ensuring console_script entrypoints/script have shebangs point to the underlying python instead of the temporary build venv path.

@haampie haampie force-pushed the python/more-isolation branch 2 times, most recently from 4eb3976 to 9a184a1 Compare September 27, 2023 19:06
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Sep 28, 2023

@adamjstewart thoughts?

This works in Gitlab. The GHA failure is also correct (setuptools is missing as a dependency in py-isort). Fixed in #40234.

The fact that the GHA failed with this PR only shows that we need this for better isolation from system python packages :)

haampie added a commit to haampie/spack that referenced this pull request Sep 28, 2023
Detected in spack#40224.

In the past, system setuptools could be picked up when using an external
python, so py-isort@4 would install fine. With the linked PR, pip can
only consider packages that Spack controls from PYTHONPATH, so the issue
of missing py-setuptools showed up.
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Sep 28, 2023

Alternative / compromise:

  1. If Python is non-external, it should suffice to run pip with PYTHONNOUSERSITE=1, since system site-package is empty.
  2. If Python is external, create a venv and patch shebangs post install.

@johnwparent
Copy link
Copy Markdown
Contributor

Alternative / compromise:

  1. If Python is non-external, it should suffice to run pip with PYTHONNOUSERSITE=1, since system site-package is empty.
  2. If Python is external, create a venv and patch shebangs post install.

Would using or extending EnvBuilder work for this case? At a glance it looks like this method could be leveraged to do what you want.

@haampie haampie force-pushed the python/more-isolation branch from b3eb42e to ebf9358 Compare September 28, 2023 14:15
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Sep 28, 2023

Alternative / compromise:

  1. If Python is non-external, it should suffice to run pip with PYTHONNOUSERSITE=1, since system site-package is empty.
  2. If Python is external, create a venv and patch shebangs post install.

Would using or extending EnvBuilder work for this case? At a glance it looks like this method could be leveraged to do what you want.

Not immediately sure if that can be used.

Bit more context, here's an example pyproject.toml that lets pip generate two wrapper scripts <pkg>/bin/black and <pkg>bin/blackd.

Here's where pip figures out the shebang line:

https://github.com/pypa/pip/blob/2ba5acc8a4772ca1dc0d62a2dfe8967af28649d6/src/pip/_vendor/distlib/scripts.py#L166

The get_executable returns sys.executable, and this points to the virtual environment's python. Can't and shouldn't change this, cause sys.executable is also used to spawn subprocesses for which we do want to use the venv python s.t. system/user search paths are disabled.

The only way to override this, is if you can somehow pass a custom executable here or here, but that looks impossible. (especially since all we have is the pip command line...)

@haampie haampie changed the title Use a venv in PythonPipBuilder Improve build isolation in PythonPipBuilder Sep 28, 2023
@haampie haampie force-pushed the python/more-isolation branch from ebf9358 to 16c7c03 Compare September 28, 2023 14:50
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Sep 29, 2023

@pradyunsg, do you have opinions about this? The top level comment was updated to describe the issue and solution, the rest of the comments is not relevant.

@pradyunsg
Copy link
Copy Markdown

problem is that pip
spawns subprocesses and those flags don't propagate.

Hmm... we could make pip respect this in a future release, I think. File an issue over at pypa/pip about this?

This might also get resolved once we get to the possible-but-needs-work improvement to actually avoid pip-subprocesses inside pip. :)


This seems reasonable to me, with the caveat that it's... a few layers of detail-dependent bodges/behaviours (which tends toward some fragility), but they're reasonable to solve the problem as long as that fragility isn't a huge concern.

@alalazo alalazo added this to the v0.21.0 milestone Oct 2, 2023
@adamjstewart
Copy link
Copy Markdown
Member

Fantastic work, completely agree with all of the decisions reached here. I like the path forward of:

  1. Use PYTHONNOUSERSITE=1 for internal Python
  2. Use a venv for external Python
  3. Open an issue in pip's repo about subprocesses
  4. Remove the venv stuff in the distant future

The fact that this PR discovered bugs in our recipes is all the proof I need that this is useful. I'll review any separate PRs for bugs found in packages and then we can merge this. Thanks for reviewing @pradyunsg.

TIL about footnotes!1

Footnotes

  1. https://github.blog/changelog/2021-09-30-footnotes-now-supported-in-markdown-fields/!

@haampie
Copy link
Copy Markdown
Member Author

haampie commented Oct 3, 2023

I still have to check if I need to add more patching: pypa/pip#12309 (comment)

Edit: ok, everything seems to be about shebangs really, and I'm recursing the full install prefix, so should be fine.

adamjstewart pushed a commit that referenced this pull request Oct 4, 2023
* py-isort: needs setuptools build dep before v5

Detected in #40224.

In the past, system setuptools could be picked up when using an external
python, so py-isort@4 would install fine. With the linked PR, pip can
only consider packages that Spack controls from PYTHONPATH, so the issue
of missing py-setuptools showed up.

* py-importlib-metadata: fix lowerbounds on python

* review

* py-isort unconditionally add optional setuptools dep to prevent picking up user package at runtime

* style

* drop optional py-setuptools run dep
@haampie haampie force-pushed the python/more-isolation branch from 45b28ba to 103afea Compare October 4, 2023 14:06
@haampie haampie force-pushed the python/more-isolation branch from 103afea to 537a109 Compare October 4, 2023 14:11
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Oct 4, 2023

@adamjstewart should be ready for review now

@haampie haampie force-pushed the python/more-isolation branch from 537a109 to 2432879 Compare October 4, 2023 15:35
We run pip with `--no-build-isolation` because we don't wanna let pip
install build deps.

As a consequence, when pip runs hooks, it runs hooks of *any* package it
can find in `sys.path`.

For Spack-built Python this includes user site packages -- there
shouldn't be any system site packages. So in this case it suffices to
set the environment variable PYTHONNOUSERSITE=1.

For external Python, more needs to be done, cause there is no env
variable that disables both system and user site packages; setting the
`python -S` flag doesn't work because pip runs subprocesses that don't
inherit this flag (and there is no API to know if -S was passed)

So, for external Python, an empty venv is created before invoking pip in
Spack's build env ensures that pip can no longer see anything but
standard libraries and `PYTHONPATH`.

The downside of this is that pip will generate shebangs that point to
the python executable from the venv. So, for external python an extra
step is necessary where we fix up shebangs post install.
@haampie haampie force-pushed the python/more-isolation branch from 2432879 to 58a340d Compare October 4, 2023 15:37
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Oct 4, 2023

And by now I mean now.

Coverage is low, but it's tested in bootstrapping gha w/o coverage anyways.

@adamjstewart adamjstewart merged commit 0f43074 into spack:develop Oct 4, 2023
@haampie haampie deleted the python/more-isolation branch October 5, 2023 06:46
@haampie
Copy link
Copy Markdown
Member Author

haampie commented Oct 5, 2023

Thanks for the review all :)

AdhocMan pushed a commit to AdhocMan/spack that referenced this pull request Oct 10, 2023
* py-isort: needs setuptools build dep before v5

Detected in spack#40224.

In the past, system setuptools could be picked up when using an external
python, so py-isort@4 would install fine. With the linked PR, pip can
only consider packages that Spack controls from PYTHONPATH, so the issue
of missing py-setuptools showed up.

* py-importlib-metadata: fix lowerbounds on python

* review

* py-isort unconditionally add optional setuptools dep to prevent picking up user package at runtime

* style

* drop optional py-setuptools run dep
AdhocMan pushed a commit to AdhocMan/spack that referenced this pull request Oct 10, 2023
We run pip with `--no-build-isolation` because we don't wanna let pip
install build deps.

As a consequence, when pip runs hooks, it runs hooks of *any* package it
can find in `sys.path`.

For Spack-built Python this includes user site packages -- there
shouldn't be any system site packages. So in this case it suffices to
set the environment variable PYTHONNOUSERSITE=1.

For external Python, more needs to be done, cause there is no env
variable that disables both system and user site packages; setting the
`python -S` flag doesn't work because pip runs subprocesses that don't
inherit this flag (and there is no API to know if -S was passed)

So, for external Python, an empty venv is created before invoking pip in
Spack's build env ensures that pip can no longer see anything but
standard libraries and `PYTHONPATH`.

The downside of this is that pip will generate shebangs that point to
the python executable from the venv. So, for external python an extra
step is necessary where we fix up shebangs post install.
haampie added a commit to haampie/spack that referenced this pull request Oct 31, 2023
haampie added a commit that referenced this pull request Nov 1, 2023
* Revert "Improve build isolation in PythonPipBuilder (#40224)"

This reverts commit 0f43074.

* Revert "py-setuptools: sdist + rpath patch backport (#40205)"

This reverts commit 512e41a.
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Nov 2, 2023
* py-isort: needs setuptools build dep before v5

Detected in spack#40224.

In the past, system setuptools could be picked up when using an external
python, so py-isort@4 would install fine. With the linked PR, pip can
only consider packages that Spack controls from PYTHONPATH, so the issue
of missing py-setuptools showed up.

* py-importlib-metadata: fix lowerbounds on python

* review

* py-isort unconditionally add optional setuptools dep to prevent picking up user package at runtime

* style

* drop optional py-setuptools run dep
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Nov 2, 2023
We run pip with `--no-build-isolation` because we don't wanna let pip
install build deps.

As a consequence, when pip runs hooks, it runs hooks of *any* package it
can find in `sys.path`.

For Spack-built Python this includes user site packages -- there
shouldn't be any system site packages. So in this case it suffices to
set the environment variable PYTHONNOUSERSITE=1.

For external Python, more needs to be done, cause there is no env
variable that disables both system and user site packages; setting the
`python -S` flag doesn't work because pip runs subprocesses that don't
inherit this flag (and there is no API to know if -S was passed)

So, for external Python, an empty venv is created before invoking pip in
Spack's build env ensures that pip can no longer see anything but
standard libraries and `PYTHONPATH`.

The downside of this is that pip will generate shebangs that point to
the python executable from the venv. So, for external python an extra
step is necessary where we fix up shebangs post install.
gabrielctn pushed a commit to gabrielctn/spack that referenced this pull request Nov 24, 2023
* Revert "Improve build isolation in PythonPipBuilder (spack#40224)"

This reverts commit 0f43074.

* Revert "py-setuptools: sdist + rpath patch backport (spack#40205)"

This reverts commit 512e41a.
mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Dec 14, 2023
* Revert "Improve build isolation in PythonPipBuilder (spack#40224)"

This reverts commit 0f43074.

* Revert "py-setuptools: sdist + rpath patch backport (spack#40205)"

This reverts commit 512e41a.
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Jan 11, 2024
We run pip with `--no-build-isolation` because we don't wanna let pip
install build deps.

As a consequence, when pip runs hooks, it runs hooks of *any* package it
can find in `sys.path`.

For Spack-built Python this includes user site packages -- there
shouldn't be any system site packages. So in this case it suffices to
set the environment variable PYTHONNOUSERSITE=1.

For external Python, more needs to be done, cause there is no env
variable that disables both system and user site packages; setting the
`python -S` flag doesn't work because pip runs subprocesses that don't
inherit this flag (and there is no API to know if -S was passed)

So, for external Python, an empty venv is created before invoking pip in
Spack's build env ensures that pip can no longer see anything but
standard libraries and `PYTHONPATH`.

The downside of this is that pip will generate shebangs that point to
the python executable from the venv. So, for external python an extra
step is necessary where we fix up shebangs post install.
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Jan 31, 2024
We run pip with `--no-build-isolation` because we don't wanna let pip
install build deps.

As a consequence, when pip runs hooks, it runs hooks of *any* package it
can find in `sys.path`.

For Spack-built Python this includes user site packages -- there
shouldn't be any system site packages. So in this case it suffices to
set the environment variable PYTHONNOUSERSITE=1.

For external Python, more needs to be done, cause there is no env
variable that disables both system and user site packages; setting the
`python -S` flag doesn't work because pip runs subprocesses that don't
inherit this flag (and there is no API to know if -S was passed)

So, for external Python, an empty venv is created before invoking pip in
Spack's build env ensures that pip can no longer see anything but
standard libraries and `PYTHONPATH`.

The downside of this is that pip will generate shebangs that point to
the python executable from the venv. So, for external python an extra
step is necessary where we fix up shebangs post install.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants