Skip to content

ensurepip error during Poetry venv creation when using outdated Python versions #1697

@edmorley

Description

@edmorley

Looking at the buildpack metrics, I found a handful of failing builds from a single app, where the Poetry venv install was failing with:

-----> Installing Poetry 1.8.4
Error: Command '['/tmp/codon/tmp/cache/.heroku/python-poetry/venv/bin/python', '-Im', 'ensurepip', '--upgrade', '--default-pip']' returned non-zero exit status 1.

 !     Internal Error: Unable to create virtual environment for Poetry.
 !     
 !     The 'python -m venv' command to create a virtual environment did
 !     not exit successfully.
 !     
 !     See the log output above for more information.

In theory this should never fail, since the python -m venv command creates a virtual environment (including installing pip) using the wheels bundled in the Python stdlib, so doesn't have to fetch any remote resources - so in general should not be fallible unless there was a disk I/O error or similar.

Unfortunately, the venv module doesn't surface the underlying error message which made this hard to debug, particularly since I initially couldn't get it to reproduce either on another app or locally.

After a fair amount of trial and error, I managed to track this down to an intersection of the following:

  1. In some old versions of Python (3.8.0 to 3.8.13, 3.9.0 to 3.9.13, 3.10.0 to 3.10.5), there was a bug in the ensurepip module where it didn't correctly enable isolated mode (-I) for the child Python process: ensurepip bootstrap breaks out of isolated environment python/cpython#90355. Which means the current working directory is added to sys.path when ensurepip is invoked during venv creation.
  2. As of Python 3.3+, Python treats directories with no __init__.py as a namespace package, rather than ignoring the directory: https://peps.python.org/pep-0420/
  3. Some less common third-party forks of the Nginx buildpack create a directory named brotli in the build directory (example). (Even though they don't add it to PATH or otherwise use it directly.)
  4. Even though pip doesn't use the brotli package, one of its vendored dependencies (urllib3) uses it in an optional feature, and prior to pip 22.1 the brotli import wasn't correctly patched out/replaced during vendoring: Disable brotli import in urllib3 pypa/pip#10993. This meant pip would try and import brotli from sys.path when it was run (compared to all of the other imports, which are imported from pip's vendor directory). And whilst the urllib3 implementation gracefully handles brotli not being installed (since it was an optional feature), if it finds the module, then it (quite reasonably) presumes it has the expected API.

When all of the above conditions are combined, the result is that ensurepip will fail with an AttributeError: module 'brotli' has no attribute 'error' exception (which sadly isn't shown in the logs, due to the venv command eating the output of the ensurepip subprocess).

This can be reproduced without the buildpack in question by creating a directory named brotli in the app source (with any file within it, so Git is able to commit it). The app needs to be using one of the old Python versions listed above, and using Poetry as the package manager (which currently is the only package manager for which we invoke venv, but in the future we'll be using venv in other places too).

The best way for apps to avoid this issue is to update to a newer patch release of their Python version. Not only will this include a fix for the ensurepip bug, but also several other security vulnerabilities. We do emit warnings already for these out of date patch versions, but unfortunately some users ignore them.

From the buildpack side, our options are:

  1. Block these outdated Python patch versions (either just when using venv / Poetry, or entirely).
  2. Open PRs against the affected third-party buildpacks so they don't create a top-level brotli directory.
  3. Run the python -m venv command from a different working directory (eg /tmp), so it's not affected by the contents of the build directory even when isolation mode is not correctly enabled.

I'm inclined to go with (3) given it's a small easy to maintain workaround, that won't require users to change Python patch version (as much as they really should just update for security reasons).

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions