[ty] Provide docstrings for stdlib APIs when hovering over them in an IDE#19311
AlexWaygood merged 4 commits into main
Conversation
dhruvmanila
left a comment
Thanks! This is great
Is there a reason that this is in a separate repository? As it's a script, can it be added / moved to the Ruff repository mainly so that it's easier to maintain?
Do we know if the script (or something else) that Pylance is using to fetch these docstrings is open-sourced and available? If so, can it be used by us? |
The published ty artifact (built with uv build in the ty repository) increases from 3187088 bytes (3.18MB) to 3218280 bytes (3.21MB). I think that's negligible.
.github/workflows/sync_typeshed.yaml
Outdated
uvx --python=3.12 --from=git+https://github.com/AlexWaygood/docstring-adder.git add-docstrings --stdlib-path ./typeshed/stdlib
uvx --python=3.11 --from=git+https://github.com/AlexWaygood/docstring-adder.git add-docstrings --stdlib-path ./typeshed/stdlib
uvx --python=3.10 --from=git+https://github.com/AlexWaygood/docstring-adder.git add-docstrings --stdlib-path ./typeshed/stdlib
uvx --python=3.9 --from=git+https://github.com/AlexWaygood/docstring-adder.git add-docstrings --stdlib-path ./typeshed/stdlib
Does typeshed not support Python 3.8? If that's the case, it probably doesn't make sense for ty to still support Python 3.8 😆
That's correct, it only supports 3.9+ these days
It's good to know that our parser is fast enough that the increase in the amount of code to parse doesn't impact runtime performance in a meaningful way. Do you know why this isn't something that has been done before?
I believe it's closed-source. An early open-source prototype is available at https://github.com/gramster/stubsplit, but note that the README for that project says:
which is basically what I've done ;) |
One reason why I'd be interested in keeping it separate is that I'd like to explore running this script at build time in https://github.com/typeshed-internal/stub_uploader when uploading typeshed's third-party stubs packages to PyPI. This would also be really beneficial to ty users, as well as users of other type checkers. I'd be interested in moving the repo to the |
I think that would be great. You should have the necessary permissions to create a new repository |
I'm not sure we actually know that from this PR, because this PR itself doesn't actually add docstrings (it just adds the workflow that means they'll be auto-added in the next typeshed sync). I was going to trigger the workflow manually immediately after landing this, but I can open a draft PR now with all the docstrings added to check performance doesn't degrade on Codspeed. |
Whoops, I didn't realize this. That also means that my binary size measurement is off, because what I did was check out this PR. Can you measure the binary size increase of the released ty artifact (use |
For the latest release of ty, I have these numbers:
Updating the submodule to #19327, I have:
The Codspeed report on #19327 reports some regressions, but these appear most pronounced on the microbenchmarks (which makes sense, as parsing the vendored typeshed stubs takes up a much higher percentage of the total execution time for smaller projects). There are regressions of up to 4% on the microbenchmarks, a 2% regression on the cold tomllib benchmark, and regressions of 1% or lower for all other benchmarks. |
Thanks @AlexWaygood for getting all those numbers. The binary size increase makes way more sense than the numbers I shared. I think those regressions are fine, considering the value the docstrings provide in an IDE context. In-stub documentation also has much better ergonomics than an external JSON file when reading a builtin stub file in the IDE. I also don't think that it justifies shipping typeshed twice. The long-term solution here is to pre-process typeshed so that we don't need to parse the files in the first place.
Okay, the repo is now at https://github.com/astral-sh/docstring-adder ! |
I manually triggered the workflow, and it created #19334 -- everything seems to be working as expected 🥳 |

Summary
I made a codemod that will auto-add docstrings to stub files by dynamically inspecting the value of the docstrings at runtime. This PR adds a step to our typeshed-sync workflow that applies the codemod, so that we always have docstrings for the stdlib checked into our vendored stubs for the standard library. This will allow us to display the docstrings when users hover over stdlib symbols in their IDE.
The source code for the codemod is here. The changes the codemod makes can be viewed here.
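As a rough illustration of what "dynamically inspecting the value of the docstrings at runtime" can look like, here is a minimal sketch. It is not taken from the codemod's source; the helper name `runtime_docstring` and its parameters are assumptions made for this example, and only standard-library calls (`importlib.import_module`, `inspect.getdoc`) are used.

```python
import importlib
import inspect
from typing import Optional


def runtime_docstring(module_name: str, qualified_name: str) -> Optional[str]:
    """Return the runtime docstring for `module_name.qualified_name`, if any.

    Hypothetical helper for illustration only; the real codemod may work
    quite differently.
    """
    # Import the module in the interpreter that is running this script.
    obj = importlib.import_module(module_name)
    # Walk dotted names such as "str.upper" one attribute at a time.
    for part in qualified_name.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return None
    # inspect.getdoc() returns the cleaned-up docstring, or None if absent.
    return inspect.getdoc(obj)


if __name__ == "__main__":
    print(runtime_docstring("builtins", "len"))
    print(runtime_docstring("builtins", "str.upper"))
```

Because the lookup happens in a running interpreter, the result depends on which Python version executes the script, which is presumably why the workflow step runs the codemod once per supported version.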
The only issue I know of is that if you have version-dependent method definitions (a method defined once in each branch of a sys.version_info check), then the codemod will only add a docstring to the definition in the first sys.version_info branch, not the second. This issue only exists for nested scopes, however; version-dependent class or function definitions in the global scope should have docstrings added to all definitions without issue.
EDIT: I fixed this issue.
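For readers unfamiliar with the pattern, a version-dependent method definition in a stub looks roughly like the following. This is a hypothetical fragment: the class `Example`, the method `method`, and the chosen version boundary are invented for illustration and are not taken from typeshed.

```python
import sys


class Example:  # hypothetical class, not from typeshed
    # The same method is defined twice inside the class body (a nested
    # scope), once per version branch. A docstring needs to be added to
    # both definitions, not only to the first.
    if sys.version_info >= (3, 12):
        def method(self, value: int, *, strict: bool = False) -> int: ...
    else:
        def method(self, value: int) -> int: ...
```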
Codemodding docstrings into the stubs increases the size of the vendored-typeshed zipfile that we include as part of the ty binary. Locally, a release build of the ty binary increases in size from 38.7MB to 39.9MB. I think that's probably worth it, given that there's no other way to provide docstrings for C-extension modules in the stdlib. Even modules that are nominally written in Python, such as the typing module, often have several classes in them that are actually written in C (typing.TypeVar, for example); it would be impossible for ty to obtain docstrings for these classes by inspecting the runtime source code of the stdlib, so codemodding the docstrings into the stub seems to be a more resilient strategy here.
Codemodding docstrings into the stubs at typeshed-sync time is preferable to attempting to maintain these docstrings upstream in typeshed, because docstrings are constantly changing upstream in CPython, and it would be extremely difficult to keep the copies of these docstrings in typeshed up to date. An automated codemod solves this issue.
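The point about C-implemented objects can be checked with a small sketch. The helper name `has_python_source` is an assumption made for this example; `len` is used as the C-implemented object because whether typing.TypeVar is written in C depends on the Python version.

```python
import inspect


def has_python_source(obj: object) -> bool:
    """Report whether Python source code can be retrieved for `obj`.

    Hypothetical helper for illustration only.
    """
    try:
        inspect.getsource(obj)
    except (TypeError, OSError):
        # inspect.getsource() raises for objects implemented in C, since
        # there is no Python source file to read a docstring from.
        return False
    return True


if __name__ == "__main__":
    # `len` is implemented in C: no source is available, but the docstring
    # is still reachable by inspecting the object at runtime.
    print(has_python_source(len))
    print(inspect.getdoc(len))
```

A tool that only reads the stdlib's source files would therefore find nothing for such objects, whereas a runtime lookup still succeeds.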
Test Plan