[red-knot] Add support for re-export conventions for imports#15848
[red-knot] Add support for re-export conventions for imports#15848dhruvmanila wants to merge 3 commits intomainfrom
Conversation
CodSpeed Performance ReportMerging #15848 will not alter performanceComparing Summary
|
250c6ea to
6b3a600
Compare
6b3a600 to
0fbe7db
Compare
## Summary Related to #15848, this PR adds the imports explicitly as we'll now flag these symbols as undefined. --------- Co-authored-by: Alex Waygood <[email protected]>
0abf09b to
66aa74c
Compare
|
Hmm, it's not obvious to me why we see a 4% regression in the incremental benchmark. It either suggests that we now create more ingredients (which I didn't see) or that some queries are invalidated more often (which they shouldn't, because the benchmark only inserts a comment). Did you look into where we're spending more time? |
Oh, I missed that completely as I didn't see the comment being updated and I didn't expect this change to regress. Let me look into it. Thanks for catching that. Edit: I think I've a hunch on where it might be but need to confirm. The |
carljm
left a comment
There was a problem hiding this comment.
Thank you for adjusting here as we figured out on the fly what the right semantics should be! The difference in behavior between the stub and non-stub case definitely made this more complex than I'd initially realized. In hindsight we could have postponed the non-stub opt-in rule entirely, but I think we would have wanted it eventually, so this way it forces us to consider it in the design.
This first review pass is mostly just about the tests and the semantics; will do a more detailed review of the code changes once we have that stuff nailed down.
crates/red_knot_python_semantic/resources/mdtest/import/conventions.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/import/conventions.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/import/conventions.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/import/conventions.md
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/resources/mdtest/import/conventions.md
Outdated
Show resolved
Hide resolved
|
Todo:
|
d757fc0 to
17245b2
Compare
| qualifiers: TypeQualifiers, | ||
| re_export: ReExport, |
There was a problem hiding this comment.
did you consider encoding the information on whether or not the type is a re-export in the qualifiers field? It feels like "metadata carried along with the type" in a similar way to the way that whether a symbol is a class variable or not is additional metadata that we carry around with the type of the symbol. We could possibly even rename TypeQualifiers to TypeMetadata, and add documentation saying that the metadata we record includes (but is not limited to) type qualifiers such as ClassVar?
I suppose the ReExport::Maybe variant means that we'd have to have two separate flags on the TypeQualifiers struct: MAYBE_REEXPORT and DEFINITE_REEXPORT.
Curious what @carljm thinks of this idea
There was a problem hiding this comment.
I like that idea and it seems more prominent now because for the binding, the Type would also need to contain the re-export metadata. My only concern with this would be that the metadata structure would be shared for both declared and binding type so there's a chance that we could mistakenly set metadata for binding type while it's only relevant for a declared type. For example, setting the Final qualifier. Any thoughts on this? If it's not a big concern then I might prefer to go this route.
There was a problem hiding this comment.
The feeling I've had in looking at this PR is that this should be metadata on a Definition (possibly actually just on the Import and ImportFrom variants of DefinitionKind, which might save some space if those are not currently the largest variants). Then it would be handled in symbol_from_bindings and symbol_from_declarations (where we'd ignore non-re-exported definitions if querying a public global symbol from a stub file). In general, I think we should be looking for ways for this metadata to spread less widely, not more widely, because the cases where it is relevant are limited. I don't think it needs to travel with Types in general, and I think ideally code in TypeInferenceBuilder would be totally unaffected by this change.
If we were just implementing the stub-file semantics this approach would be pretty simple, I think. The trickier part is the implicit-reexport rule, where we do want the type from the definition, plus some metadata that allows us to emit a diagnostic. To do this, we need symbol_from_bindings, symbol_from_declarations, and symbol to be able to pass back the metadata via Symbol that "this type came from, or maybe came from, a not-re-exported definition", and then we do at least need the import handling code in TypeInferenceBuilder to inspect that metadata and emit a diagnostic.
I'm very tempted to just consider the opt-in implicit-reexport rule out of scope for now, because it's adding a lot of complexity here, and I think the much more important thing right now is to fix the builtins.pyi problem. The ability to also opt-in to this rule in diagnostic form for non-stub files is nice to have, but not high priority right now.
There was a problem hiding this comment.
Happy to go with Carl's suggestion here; he's looked at this more closely than I have and it sounds very reasonable!
There was a problem hiding this comment.
The feeling I've had in looking at this PR is that this should be metadata on a
Definition(possibly actually just on theImportandImportFromvariants ofDefinitionKind, which might save some space if those are not currently the largest variants). Then it would be handled insymbol_from_bindingsandsymbol_from_declarations(where we'd ignore non-re-exported definitions if querying a public global symbol from a stub file). In general, I think we should be looking for ways for this metadata to spread less widely, not more widely, because the cases where it is relevant are limited. I don't think it needs to travel with Types in general, and I think ideally code inTypeInferenceBuilderwould be totally unaffected by this change.
I looked into implementing based on this comment, I originally thought of this as well but I didn’t pursue this because of the requirement of inferring the types.
To give context, the plan is to define a is_reexported field on both ImportDefinitionKind and ImportFromDefinitionKind and use that as Definition::is_reeexported which would default to true for non-import definitions.
This will be then used in symbol_from_declarations and symbol_from_bindings to filter out the definitions. This creates a problem as both functions does not know whether the symbol is coming from the current module or from another module. This means that for the following code (file other.pyi):
from typing import Any
# error: Name `Any` used when not defined
reveal_type(Any)The reason being that while looking up the Any symbol in the reaveal_type call, we’ll first encounter the import definition which doesn’t re-export the symbol but the symbol_from_* does not know that this symbol is coming from the current module itself. We’ll need to pass additional info for the symbol_from_* functions to only consider the re-exports from another module.
Implementation diff: https://github.com/astral-sh/ruff/compare/dhruv/re-export-2?expand=1
Given the above context, I took a stab at addressing some of the short comings for conditional re-exports via the existing implementation and I think I've addressed Carl's feedback from #15848 (comment).
I've added four more examples to test conditional re-exports.
There was a problem hiding this comment.
We’ll need to pass additional info for the symbol_from_* functions to only consider the re-exports from another module.
Yes, I meant to imply that with "if querying a public global symbol from a stub file", but I should have been more explicit. Adding this extra context to symbol and friends doesn't bother me because I think we'll need it anyway in order to resolve #15777 -- our current conflation of "type as seen by import" and "type as seen by use from nested scope" is not going to last. We probably need a function specifically for getting an imported symbol from a different file.
I still think it would be better to switch to this approach and add this additional context. I don't like how the re-export metadata in the current version of the diff infects everywhere we use Symbol, even though it is relevant in so few places, and I suspect this is also related to the regression.
I think it would actually be better to separate the stub-file unconditional behavior from the implicit-reexport rule as separate PRs. One small reason is just that it would be nice to add the implicit-reexport rule once we already have full support for making it opt-in, to avoid TODO churn in other tests. It is worth considering how we can implement the opt-in rule on top of the second approach, but I think the plan would be fairly clear: the function that implements getting a symbol's type for import would return a wrapper around Symbol that also includes the re-exported metadata, so the importing code can issue the diagnostic if necessary.
90b0e80 to
85f5fd0
Compare
85f5fd0 to
dcfd595
Compare
|
|
||
| if coinflip(): | ||
| from c import f | ||
| from c import f as f |
There was a problem hiding this comment.
can we add a TODO here? once we are able to properly make implicit-reexport rule disabled by default, I would rather this test not use this explicit re-export syntax, it distracts from the point of the test
|
|
||
| if coinflip(): | ||
| from c import x | ||
| from c import x as x |
| qualifiers: TypeQualifiers, | ||
| re_export: ReExport, |
There was a problem hiding this comment.
We’ll need to pass additional info for the symbol_from_* functions to only consider the re-exports from another module.
Yes, I meant to imply that with "if querying a public global symbol from a stub file", but I should have been more explicit. Adding this extra context to symbol and friends doesn't bother me because I think we'll need it anyway in order to resolve #15777 -- our current conflation of "type as seen by import" and "type as seen by use from nested scope" is not going to last. We probably need a function specifically for getting an imported symbol from a different file.
I still think it would be better to switch to this approach and add this additional context. I don't like how the re-export metadata in the current version of the diff infects everywhere we use Symbol, even though it is relevant in so few places, and I suspect this is also related to the regression.
I think it would actually be better to separate the stub-file unconditional behavior from the implicit-reexport rule as separate PRs. One small reason is just that it would be nice to add the implicit-reexport rule once we already have full support for making it opt-in, to avoid TODO churn in other tests. It is worth considering how we can implement the opt-in rule on top of the second approach, but I think the plan would be fairly clear: the function that implements getting a symbol's type for import would return a wrapper around Symbol that also includes the re-exported metadata, so the importing code can issue the diagnostic if necessary.
| # error: [possibly-unbound-import] "Member `Foo` of module `a` is possibly unbound" | ||
| from a import Foo | ||
|
|
||
| reveal_type(Foo) # revealed: Literal[Foo] | str |
There was a problem hiding this comment.
This should be just str; the Foo class type should not be visible here at all. I think this behavior change will naturally fall out of the alternate implementation strategy.
| ```pyi | ||
| ``` | ||
|
|
||
| ## Conditional re-export in stub file |
There was a problem hiding this comment.
These new tests look great! Modulo one comment below
|
Closing in favor of #16073 |
This is an alternative implementation to #15848. ## Summary This PR adds support for re-export conventions for imports for stub files. **How does this work?** * Add a new flag on the `Import` and `ImportFrom` definitions to indicate whether they're being exported or not * Add a new enum to indicate whether the symbol lookup is happening within the same file or is being queried from another file (e.g., an import statement) * When a `Symbol` is being queried, we'll skip the definitions that are (a) coming from a stub file (b) external lookup and (c) check the re-export flag on the definition This implementation does not yet support `__all__` and `*` imports as both are features that needs to be implemented independently. closes: #14099 closes: #15476 ## Test Plan Add test cases, update existing ones if required.
Summary
This PR adds support for re-export conventions for imports.
The implementation differs for stub files and runtime files as follows:
unresolved-importand the inferred type isUnknownby way of making the symbol is unboundimplicit-reexportand the type is inferred as usualThe current implementation updates the
Symbol::Typevariant to include a newReExportenum which specifies whether the symbol is defined in an import statement and if so if it's being implicitly or explicitly re-exported. This is then used downstream to perform the check.This implementation does not yet support
__all__and*imports as both are features that needs to be implemented independently.Other potential solution
While implementing this, another potential solution would be to use the visibility qualifier for a symbol to specify whether this is an implicitly exported symbol which translates to:
This would then mean that this enum could be created for the
Definitionwhile building the semantic index and it could also consider the__all__values. One issue that I saw with this is that the values in__all__could come from another module which isn't possible to do in the semantic index builder because that does not cross the file boundary.closes: #14099
closes: #15476
Test Plan
Add test cases, update existing ones if required.