
Pipstar extras not matched due to hyphen vs underscore normalization #3587

@kpark-hrp

🐞 bug report

Affected Rule

pip.parse / whl_library (Pipstar extras resolution)

Is this a regression?

Yes, this worked in rules_python 1.6.3 (non-Pipstar path). The extras fix in #3468 correctly passes extras down the call stack, but a normalization mismatch prevents them from being matched against Requires-Dist markers.

Description

PR #3468 fixed the issue of extras not being passed down in Pipstar. However, extras containing hyphens in requirements.txt are not normalized before being matched against the wheel's Requires-Dist metadata, where the same extras are spelled with underscores. PEP 685 treats the two spellings as the same extra (it normalizes runs of -, _ and . to a single hyphen), so they should compare equal.

For example, with sqlalchemy[asyncio,postgresql-asyncpg,postgresql-psycopg2binary]==2.0.36 in requirements.txt, the generated BUILD.bazel for the sqlalchemy whl_library shows:

extras = [
    "asyncio",
    "postgresql-asyncpg",        # hyphens from requirements.txt
    "postgresql-psycopg2binary",
],
requires_dist = [
    # ...
    "asyncpg ; extra == 'postgresql_asyncpg'",             # underscores in wheel metadata
    "psycopg2-binary ; extra == 'postgresql_psycopg2binary'",  # underscores in wheel metadata
    # ...
],

The extras list uses hyphens (postgresql-asyncpg) while Requires-Dist markers use underscores (postgresql_asyncpg). The comparison fails silently, so psycopg2-binary and asyncpg are never added as transitive dependencies of sqlalchemy.

Note that asyncio (no hyphens) works correctly because it doesn't need normalization.
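To illustrate the mismatch outside of Bazel, here is a minimal Python sketch. It is not the rules_python implementation; normalize_extra is a hypothetical helper that applies the normalization rule PEP 685 describes (collapse runs of -, _ and . into a single hyphen, then lowercase).

```python
import re


def normalize_extra(name: str) -> str:
    """Hypothetical helper: normalize an extra name the way PEP 685 describes."""
    # Collapse runs of '-', '_' and '.' into a single '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


requested = "postgresql-asyncpg"  # spelling from requirements.txt
in_marker = "postgresql_asyncpg"  # spelling in the Requires-Dist marker

# A plain string comparison treats the two spellings as different extras.
raw_match = requested == in_marker

# After normalizing both sides, they refer to the same extra.
normalized_match = normalize_extra(requested) == normalize_extra(in_marker)

print(raw_match, normalized_match)
```

An extra without separators, such as asyncio, is unchanged by this normalization, which is consistent with it already matching today.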

Workaround: Use underscores instead of hyphens in the extras specifier in requirements.in:

sqlalchemy[asyncio,postgresql_psycopg2binary,postgresql_asyncpg]==2.0.36

🔬 Minimal Reproduction

requirements.in:

sqlalchemy[asyncio,postgresql-psycopg2binary,postgresql-asyncpg]==2.0.36
psycopg2-binary~=2.9.9
asyncpg~=0.30.0

MODULE.bazel:

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pypi",
    python_version = "3.11",
    requirements_lock = "//:requirements.txt",
)
use_repo(pip, "pypi")

Then query:

bazel query 'kind("py_library", deps(@pypi//sqlalchemy))' | grep -E 'psycopg|asyncpg'
# Returns empty — psycopg2_binary and asyncpg are missing

On rules_python 1.6.3 (non-Pipstar), the same query correctly returns:

@pypi//psycopg2_binary:pkg
@pypi//asyncpg:pkg

🔥 Exception or Error


No error at build time. At runtime, tests fail with:
ModuleNotFoundError: No module named 'psycopg2'

🌍 Your Environment

Operating System:

macOS 15.3.1 (darwin 25.2.0, arm64)

Output of bazel version:

bazel 8.1.1

Rules_python version:

1.8.4

Anything else relevant?

The root cause appears to be that the extras parsed from the requirement string in whl_library.bzl (via requirement(rctx.attr.requirement).extras) preserve the original hyphenated form from requirements.txt, while the Requires-Dist markers in the wheel METADATA spell the same extras with underscores. The matching logic needs to normalize both sides to one form (PEP 685 specifies hyphens) before comparison.
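As a rough sketch of what normalized matching could look like, the Python below selects Requires-Dist entries for a set of requested extras. It is an assumption-laden illustration, not the Starlark code in rules_python: select_deps and _norm are hypothetical names, the marker handling only understands a single extra == '...' clause and ignores every other marker, and the third requires_dist entry is made up to show a non-matching extra.

```python
import re

# Matches a single `extra == '<name>'` clause inside a marker (simplified).
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def _norm(name):
    # Same rule as PEP 685: runs of '-', '_', '.' become '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def select_deps(requires_dist, extras):
    """Return requirements with no extra marker, or whose extra was requested."""
    wanted = {_norm(e) for e in extras}
    selected = []
    for entry in requires_dist:
        req, _, marker = entry.partition(";")
        match = _EXTRA_MARKER.search(marker)
        # Normalize the marker's extra as well, so both sides use one spelling.
        if match is None or _norm(match.group(1)) in wanted:
            selected.append(req.strip())
    return selected


requires_dist = [
    "asyncpg ; extra == 'postgresql_asyncpg'",
    "psycopg2-binary ; extra == 'postgresql_psycopg2binary'",
    "some-other-dep ; extra == 'unrequested_extra'",  # hypothetical entry
]
extras = ["asyncio", "postgresql-asyncpg", "postgresql-psycopg2binary"]

selected = select_deps(requires_dist, extras)
print(selected)  # ['asyncpg', 'psycopg2-binary']
```

Normalizing both the requested extras and the marker value means the result does not depend on which spelling the wheel's build backend happened to emit.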

Related: #3352, #3468
