Skip to content

[ty] Remove redundant re-exports that share the same top-most module#22581

Merged
BurntSushi merged 5 commits intomainfrom
ag/remove-redundant-auto-import-symbols
Jan 15, 2026
Merged

[ty] Remove redundant re-exports that share the same top-most module#22581
BurntSushi merged 5 commits intomainfrom
ag/remove-redundant-auto-import-symbols

Conversation

@BurntSushi
Copy link
Member

The idea here is to avoid flooding completions with a bunch of
redundant symbols when those symbols are just re-exports of one
another. And in particular, to avoid suggesting symbols "deeper"
in the module graph.

The implementation here is (to me) surprisingly complicated. The main
constraints leading to some of that complexity are:

  1. Trying to limit the redundant detection to as few of the symbols
    we extract as possible. In particular, while I haven't done benchmarking
    on this, I perceive the redundancy detection to be somewhat expensive
    and auto-import can return lots of symbols. So we're careful to only do
    this extra checking on (typically) small groups of symbols that could
    possibly be merged.

  2. Even by restricting our work, this merging process could still be
    called quite a bit. (Thousands of times in my "standard data scienc-y
    test environment.") So I went out of my way to amortize allocs.

  3. Re-exports can form a chain and we want to find all of them.

  4. We (probably) don't want to remove redundant re-exports unless
    they share the same top-level module. Otherwise, e.g., a library
    that re-exports another library's symbols could have all of its
    re-exports dropped.

  5. We want to only keep the top-most re-exports, and there may be
    multiple such re-exports. So we keep all of them.

  6. We can't assume anything about the relationship of re-exports
    and the original definition. A re-export could be deeper in the
    module hierarchy than the original definition or above it.

Reviewers are encouraged to read this PR commit by commit.

Fixes astral-sh/ty#1857

@BurntSushi
Copy link
Member Author

Demo:

2026-01-14T14.17.47-05.00.mp4

Here's what the status quo looks like:

2026-01-14T14.19.40-05.00.mp4

@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 14, 2026

Typing conformance results

No changes detected ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 14, 2026

mypy_primer results

Changes were detected when running on open source projects
prefect (https://github.com/PrefectHQ/prefect)
- src/integrations/prefect-dbt/prefect_dbt/core/settings.py:94:28: error[invalid-assignment] Object of type `T@resolve_block_document_references | int | dict[str, Any] | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/integrations/prefect-dbt/prefect_dbt/core/settings.py:94:28: error[invalid-assignment] Object of type `T@resolve_block_document_references | dict[str, Any]` is not assignable to `dict[str, Any]`
- src/integrations/prefect-dbt/prefect_dbt/core/settings.py:99:28: error[invalid-assignment] Object of type `int | T@resolve_variables | float | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/integrations/prefect-dbt/prefect_dbt/core/settings.py:99:28: error[invalid-assignment] Object of type `T@resolve_variables | dict[str, Any]` is not assignable to `dict[str, Any]`
- src/prefect/cli/deploy/_core.py:86:21: error[invalid-assignment] Object of type `T@resolve_block_document_references | int | dict[str, Any] | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/prefect/cli/deploy/_core.py:86:21: error[invalid-assignment] Object of type `T@resolve_block_document_references | dict[str, Any]` is not assignable to `dict[str, Any]`
- src/prefect/cli/deploy/_core.py:87:21: error[invalid-assignment] Object of type `int | T@resolve_variables | float | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/prefect/cli/deploy/_core.py:87:21: error[invalid-assignment] Object of type `T@resolve_variables` is not assignable to `dict[str, Any]`
- src/prefect/deployments/steps/core.py:137:38: error[invalid-argument-type] Argument is incorrect: Expected `T@resolve_variables`, found `T@resolve_block_document_references | int | dict[str, Any] | ... omitted 4 union elements`
+ src/prefect/deployments/steps/core.py:137:38: error[invalid-argument-type] Argument is incorrect: Expected `T@resolve_variables`, found `T@resolve_block_document_references | dict[str, Any]`
- src/prefect/utilities/templating.py:320:13: error[invalid-assignment] Invalid subscript assignment with key of type `object` and value of type `T@resolve_block_document_references | int | dict[str, Any] | ... omitted 4 union elements` on object of type `dict[str, Any]`
+ src/prefect/utilities/templating.py:320:13: error[invalid-assignment] Invalid subscript assignment with key of type `object` and value of type `T@resolve_block_document_references | dict[str, Any]` on object of type `dict[str, Any]`
- src/prefect/utilities/templating.py:323:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_block_document_references | dict[str, Any]`, found `list[T@resolve_block_document_references | int | dict[str, Any] | ... omitted 5 union elements]`
+ src/prefect/utilities/templating.py:323:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_block_document_references | dict[str, Any]`, found `list[T@resolve_block_document_references | dict[str, Any] | Unknown]`
- src/prefect/utilities/templating.py:437:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `dict[object, int | T@resolve_variables | float | ... omitted 5 union elements]`
+ src/prefect/utilities/templating.py:437:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `dict[object, T@resolve_variables | Unknown]`
- src/prefect/utilities/templating.py:442:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `list[int | T@resolve_variables | float | ... omitted 5 union elements]`
+ src/prefect/utilities/templating.py:442:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `list[T@resolve_variables | Unknown]`
- src/prefect/workers/base.py:232:13: error[invalid-argument-type] Argument is incorrect: Expected `T@resolve_variables`, found `T@resolve_block_document_references | int | dict[str, Any] | ... omitted 4 union elements`
+ src/prefect/workers/base.py:232:13: error[invalid-argument-type] Argument is incorrect: Expected `T@resolve_variables`, found `T@resolve_block_document_references | dict[str, Any]`
- src/prefect/workers/base.py:234:20: error[invalid-argument-type] Argument expression after ** must be a mapping type: Found `int | T@resolve_variables | float | ... omitted 4 union elements`
+ src/prefect/workers/base.py:234:20: error[invalid-argument-type] Argument expression after ** must be a mapping type: Found `T@resolve_variables`

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/build/wheel.py:99:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 47 diagnostics
+ Found 48 diagnostics

static-frame (https://github.com/static-frame/static-frame)
- static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Bus[Any] | Bottom[Series[Any, Any]] | ndarray[Never, Never] | ... omitted 6 union elements, object_]`
- static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Bus[Any] | Bottom[Index[Any]] | TypeBlocks | ... omitted 6 union elements, object_ | Self@iloc]`
+ static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Self@iloc | Bus[Any], object_ | Self@iloc]`
- static_frame/core/node_selector.py:526:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[TVContainer_co@InterfaceSelectQuartet, Any]`, found `InterGetItemLocReduces[Unknown | Bottom[Series[Any, Any]], Any]`
+ static_frame/core/node_selector.py:526:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[TVContainer_co@InterfaceSelectQuartet, Any]`, found `InterGetItemLocReduces[Bottom[Series[Any, Any]] | Unknown, Any]`
- static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Series[Any, Any] | ndarray[Never, Never] | TypeBlocks | ... omitted 6 union elements, TVDtype@Series]`
+ static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Series[Any, Any] | Bottom[Index[Any]] | TypeBlocks | ... omitted 6 union elements, TVDtype@Series]`
- static_frame/core/yarn.py:418:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Yarn[Any], object_]`, found `InterGetItemILocReduces[Yarn[Any] | Bottom[Index[Any]] | TypeBlocks | ... omitted 6 union elements, object_]`
- Found 1825 diagnostics
+ Found 1823 diagnostics

core (https://github.com/home-assistant/core)
- homeassistant/util/variance.py:47:12: error[invalid-return-type] Return type does not match returned value: expected `(**_P@ignore_variance) -> _R@ignore_variance`, found `_Wrapped[_P@ignore_variance, _R@ignore_variance | int | float | datetime, _P@ignore_variance, _R@ignore_variance | int | float | datetime]`
- Found 14509 diagnostics
+ Found 14508 diagnostics

No memory usage changes detected ✅

@BurntSushi BurntSushi force-pushed the ag/remove-redundant-auto-import-symbols branch from d259075 to 987d25d Compare January 14, 2026 19:23
@BurntSushi BurntSushi added server Related to the LSP server ty Multi-file analysis & type inference labels Jan 14, 2026
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. There's certainly a fair amount of complexity in merge (functional but also to avoid allocations) that requires a fair amount of concentration to understand what's going on.

I have one design question on whether filtering out re-export must be sensitive to the current module.

/// assert_eq!(ModuleName::new_static("foo.bar.baz").unwrap().top(), "foo");
/// ```
#[must_use]
pub fn top(&self) -> &str {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'm leaning towards using root here, which IMO goes better with ancestors and parent than top, but I can't specifically say why :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure whether "root" was really appropriate since I'm not sure, e.g., a.b.c has a "root" at a per se.

I could just make this more verbose and call it first_component. Which I think I kind of like better since it's less ambiguous and it isn't frequently used either, so the verbosity doesn't bother me.

Comment on lines 234 to 239
/// * Another thing we might consider doing differently here is to not
/// completely remove symbols deeper in the package hierarchy, and
/// instead just expose the fact that they are redundant somehow on
/// `AllSymbolInfo`. Then callers (like for completions) might just
/// choose to rank the redundant symbols much lower. But for now, we
/// just remove them entirely.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One concern that I have with removing it without considering the outer context is if you're working on said library. Let's say I work on pandas and are writing a module in pandas.io.parsers.readers or a sibling package of it. It's probably desired to then import the symbol from the closest module rather than always preferring the outer most module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maaaaybe. I think that's very plausible, but I'm not sure.

The other similar thing in this space is that we might only want to do this merging in non-first party code, under the reasoning that in first party code, you probably want to see all your re-exports. That would also I think address this concern without coming up with a notion of what "closest module" means.

(We talked about this in our 1-on-1. The reason I didn't go down this route is that it's pretty annoying to set up the test infrastructure for it.)

If you're okay with it, I'd like to move forward with this as-is and prioritize future work here based on user feedback. I filed astral-sh/ty#2519 to track this.

Comment on lines 435 to 436
merged.extend(group.merge(db));
group.clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this semantically the same as "finishing" a group? If so, should we add a finish_into(&mut self, target: &mut Vec) method (or finish(&mut self) -> impl Iterator<...> ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that kind of API is what I started with. But the iterator returned borrows from the Group and that makes it difficult to clear the other state.

This could be fixed with a dedicated iterator type and a Drop impl that does the clearing, but it just seemed like too much ceremony for a one-time-use API. If this were a "real" API with broader scope, yeah absolutely, it'd be worth some strong encapsulation and coupling reduction.

Comment on lines +305 to +308
// Merge the kind from the original symbol into the
// reexported symbol. We do this unconditionally since
// it is applicable in all cases, even when the symbols
// are in different package hierarchies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are cases where the kind between two symbols with the same fully qualified name is different?

Would two different kinds suggest that these are different symbols?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good question! I added this comment to the source clarifying why:

                    // The reason why we do this unconditionally is that
                    // reexport symbols usually do not have a very specific
                    // kind, and are typically just assigned a kind of
                    // "variable." This is because the symbol extraction
                    // reports these symbols as-is by inspecting import
                    // statements and doesn't follow them to their original
                    // definition to discover their "true" kind. But that's
                    // exactly what we're doing here! We have the original
                    // definition. So assign the kind from the original,
                    // even when it crosses package hierarchies.

Does that help?

Previously, we were constructing this at a higher level layer. But this
commit pushes it down a layer of abstraction. This shouldn't result in
constructing the fully qualified name any more frequently than we
previously did. Namely, we're still only doing it for symbols that match
the caller provided search query.

The motivation here is that we want to do some de-duplication based on
module name, and having the fully qualified name of a symbol seems like
the most straight-forward way to go about this. (We'll need one more
piece of information that we add in a subsequent commit.)
This information should let us filter out (or rather, merge)
re-exported symbols across a package hierarchy for the purposes
of auto-completions.
This roughly mimics how pandas defines and re-exports
`read_csv` at the top-level module.
@BurntSushi BurntSushi force-pushed the ag/remove-redundant-auto-import-symbols branch from 987d25d to bfc1573 Compare January 15, 2026 15:43
The invariants established by the constructors for `ModuleName`
guarantee that this is always available. It's useful for
determining the "top level" module for where a symbol lives.
The implementation here is (to me) surprisingly complicated. The main
complications are:

1. Trying to limit the redundant detection to as few of the symbols
we extract as possible. In particular, while I haven't done benchmarking
on this, I perceive the redundancy detection to be somewhat expensive
and auto-import can return lots of symbols. So we're careful to only do
this extra checking on (typically) small groups of symbols that could
possibly be merged.

2. Even by restricting our work, this merging process could still be
called quite a bit. (Thousands of times in my "standard data scienc-y
test environment.") So I went out of my way to amortize allocs.

3. Re-exports can form a chain and we want to find all of them.

4. We (probably) don't want to remove redundant re-exports unless
they share the same top-level module. Otherwise, e.g., a library
that re-exports another library's symbols could have all of its
re-exports dropped.

5. We want to only keep the top-most re-exports, and there may be
multiple such re-exports. So we keep all of them.

6. We can't assume anything about the relationship of re-exports
and the original definition. A re-export could be deeper in the
module hierarchy than the original definition or above it.
@BurntSushi BurntSushi force-pushed the ag/remove-redundant-auto-import-symbols branch from bfc1573 to e092bd5 Compare January 15, 2026 15:59
@BurntSushi BurntSushi merged commit 048f02f into main Jan 15, 2026
49 checks passed
@BurntSushi BurntSushi deleted the ag/remove-redundant-auto-import-symbols branch January 15, 2026 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

server Related to the LSP server ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto-import can show redundant completions when the same item is exposed from multiple modules

2 participants