Skip to content

get sdist requires from egg-info#88

Merged
srittau merged 10 commits intotypeshed-internal:mainfrom
Avasam:get-sdist-requires-from-egg-info
May 4, 2023
Merged

get sdist requires from egg-info#88
srittau merged 10 commits intotypeshed-internal:mainfrom
Avasam:get-sdist-requires-from-egg-info

Conversation

@Avasam
Copy link
Copy Markdown
Contributor

@Avasam Avasam commented Jan 15, 2023

CC @AlexWaygood This was the idea I mentioned.

When the requirements can't be found in requires_dist, fallback to searching in *.egg-info. All without executing any arbitrary code.

I move the f"Expected dependency {req} to be present in the allowlist" error higher so it can fail earlier and leave the network requests last.

It's my first time playing around with downloading and manipulating archives in python. Let me know what I can do better here.

I tested the idea with pycocotools (best case scenario) click-default-group (on my windows machine with 7zip there's an extra folder between the gz and tar, still worked), numpy (no *.egg-info folder), mypy (has empty lines and [extras]) and pywin32 (no Source Distribution)

@Avasam
Copy link
Copy Markdown
Contributor Author

Avasam commented Jan 15, 2023

Tests found at least one case where this errors out. I'll look into it.
Edit: empty lines and [extras] definitions. Now accounted for and fixed.

Comment thread stub_uploader/metadata.py Outdated
@Avasam
Copy link
Copy Markdown
Contributor Author

Avasam commented Jan 15, 2023

Should I delete the files afterward or does it not matter as code meant for the CI? (which I assume just cleans itself up)

Comment thread stub_uploader/metadata.py Outdated
Copy link
Copy Markdown
Member

@srittau srittau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, remarks below.

Comment thread stub_uploader/metadata.py Outdated
Comment on lines +199 to +202
for info in file_in:
folder_name = info.name
# Assume a single folder
break
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a check here, whether folder_name was actually set, or maybe even check that there really is exactly one folder to account for broken eggs. But for the former case, the following should be simpler and have that required error check:

Suggested change
for info in file_in:
folder_name = info.name
# Assume a single folder
break
# Assume a single folder
folder_name = next(iter(file_in)).name

You can then also drop the initialization of folder_name.

Or just use * instead of folder_name in the glob below.

Copy link
Copy Markdown
Contributor Author

@Avasam Avasam May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just use * instead of folder_name in the glob below.

New code looks much cleaner like this, and makes no assumption. Especially since it's extracting to a temp folder now.

Comment thread stub_uploader/metadata.py Outdated
)


def get_sdist_requires(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "get" makes this function sound simpler than what it actually does. Maybe "extract"?

Also I think it would be safer to extract the sdist into an empty temporary directory than to where ever we currently are.

use glob to not have to assume match count
@Avasam Avasam requested a review from srittau May 3, 2023 16:22
@AlexWaygood
Copy link
Copy Markdown
Contributor

At least some of the test failures will be fixed by #95

@Avasam
Copy link
Copy Markdown
Contributor Author

Avasam commented May 3, 2023

I was thinking integration tests might be too fragile, but I can just pin the version I'm testing against. I'm sure you'd like me to add tests for this?

Quick testing script I've been using:

from __future__ import annotations

import requests
from packaging.requirements import Requirement

from stub_uploader.metadata import extract_sdist_requires, validate_response

for req_name in ("pycocotools", "click-default-group", "numpy", "mypy", "pywin32"):  # Pin these
    req = Requirement(req_name)

    resp = requests.get(f"https://pypi.org/pypi/{req.name}/json")
    validate_response(resp, req)
    data = resp.json()

    sdist_data = next(
        (
            url_data
            for url_data in reversed(data["urls"])
            if url_data["packagetype"] == "sdist"
        ),
        None,
    )
    if sdist_data is None:
        print(f"{req_name}: sdist_data is None!")
        continue

    requirements = extract_sdist_requires(sdist_data, req)
    print(f"{req_name}: {list(requirements)}")

Results:

pycocotools: [<Requirement('matplotlib>=2.1.0')>, <Requirement('numpy')>]
click-default-group: [<Requirement('click')>]
numpy: []
mypy: [<Requirement('typing_extensions>=3.10')>, <Requirement('mypy_extensions>=1.0.0')>, <Requirement('tomli>=1.1.0')>, <Requirement('typed_ast<2,>=1.4.0')>, <Requirement('psutil>=4.0')>, <Requirement('pip')>, <Requirement('typed_ast<2,>=1.4.0')>, <Requirement('lxml')>]
pywin32: sdist_data is None!

@srittau srittau merged commit 518ddfe into typeshed-internal:main May 4, 2023
@Avasam Avasam deleted the get-sdist-requires-from-egg-info branch May 4, 2023 06:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants