
Generating distributable python environments (e.g. for use in PySpark) #2389

@brendan-morin

Description


Hi! To start off: this is a low-priority issue and more something to think about as you continue to build out functionality and your roadmap. I understand uv uses something like copy-on-write or hard/symbolic linking to create environments, which helps provide its speed.

There is one use case where we have seen a really poor user experience, and that is creating distributable Python environments: in particular, ones that bundle a Python interpreter so that every executor in a distributed cluster can satisfy the same dependencies.

For some context, here is PySpark's documentation on the issue: https://spark.apache.org/docs/latest/api/python/user_guide/python_packaging.html

The gist of the issue is that regular venv environments tend not to be well suited for distribution because they reference the interpreter via symbolic links, which does not guarantee your intended Python version will be available on the executors. Right now, most folks I know who work through this use e.g. miniconda environments, which bundle the interpreter as part of the environment.
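For reference, the common workaround described in the PySpark packaging docs linked above looks roughly like this (a sketch, assuming conda and conda-pack are installed; environment and file names here are illustrative, not from the original):

```shell
# Create a conda environment that bundles its own interpreter
conda create -y -n pyspark_env python=3.10 pandas pyarrow
conda activate pyspark_env

# Pack the environment, interpreter included, into a relocatable archive
conda pack -f -o pyspark_env.tar.gz

# Ship the archive to all executors and point PySpark at the bundled interpreter
export PYSPARK_PYTHON=./environment/bin/python
spark-submit --archives pyspark_env.tar.gz#environment app.py
```

A uv-native equivalent would ideally produce a similarly self-contained, relocatable archive without the conda tooling.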

It would be great if uv were eventually capable of handling this with a nicer user experience, especially if you get into the business of managing the Python version for the environment as well.

Metadata

Labels: enhancement (New feature or improvement to existing functionality)