[ty] support implicit recursive union type aliases#22238
[ty] support implicit recursive union type aliases#22238mtshiba wants to merge 32 commits intoastral-sh:mainfrom
Conversation
Typing conformance results improved 🎉The percentage of diagnostics emitted that were expected errors increased from 81.39% to 81.52%. The percentage of expected errors that received a diagnostic increased from 71.60% to 72.25%. Summary
True positives addedDetails
|
|
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
invalid-argument-type |
92 | 27 | 2,152 |
invalid-key |
0 | 305 | 0 |
invalid-assignment |
70 | 5 | 117 |
type-assertion-failure |
10 | 0 | 96 |
invalid-return-type |
26 | 0 | 62 |
possibly-missing-attribute |
15 | 0 | 62 |
unresolved-attribute |
0 | 1 | 49 |
unused-type-ignore-comment |
8 | 28 | 0 |
unsupported-operator |
10 | 0 | 16 |
not-iterable |
0 | 3 | 18 |
no-matching-overload |
19 | 0 | 0 |
invalid-context-manager |
1 | 0 | 10 |
invalid-parameter-default |
0 | 0 | 9 |
invalid-type-form |
0 | 0 | 9 |
not-subscriptable |
0 | 0 | 9 |
assert-type-unspellable-subtype |
0 | 0 | 4 |
redundant-cast |
0 | 2 | 2 |
invalid-await |
2 | 0 | 1 |
invalid-type-arguments |
0 | 0 | 3 |
missing-typed-dict-key |
0 | 2 | 0 |
call-non-callable |
0 | 1 | 0 |
| Total | 253 | 374 | 2,619 |
CodSpeed Performance ReportMerging this PR will degrade performance by 11.29%Comparing Summary
Performance Changes
|
|
The codspeed (walltime) results seem to have improved in some cases and worsened in others. |
|
mypy and pyright seem to expand type aliases as much as possible, regardless of whether they are implicit or PEP-695 style (in the case of recursive type aliases, mypy seems to omit the recursive part and display it such as |
… of mypy_primer diffs" This reverts commit 1b16d89.
|
I see this was marked draft -- I'm assuming that means you don't feel it is ready for review/merge yet? |
|
It appears that the interaction of recent changes with the changes in this PR has caused |
8cc4c89 to
a100e6c
Compare
|
I opened #22786. Please check that out first. |
Everything else seems basically fine, so I'll mark it as ready for review. |
carljm
left a comment
There was a problem hiding this comment.
Reviewed the test changes here and most of the code changes. Overall this looks really good! Need to look more closely at the union-type changes, and the ecosystem impact. This will likely uncover more cases where we aren't handling Type::TypeAlias correctly, since implicit type aliases are much more used than PEP 695 ones today.
carljm
left a comment
There was a problem hiding this comment.
Thanks again for working on this, and sorry about the delay to complete my review. Lots of good stuff here, but I think we can do better than all the repeat inference we do in this PR.
Looking at the ecosystem report, the new aioredis diagnostics are false positives and a regression. It looks like we aren't resolving up the new implicit type alias type there in the generics constraint solver, so the constraint solver can't match to the formal parameter and infer typevars. I'm guessing this is a simple change to fix? (It may even be a pre-existing bug that would also show up if the same code used PEP 695 type alias, not sure.)
| # error: [invalid-argument-type] | ||
| elif isinstance(x, Any | NamedTuple | list[int]): | ||
| reveal_type(x) # revealed: int | list[int] | bytes | ||
| # TODO: This should be an error |
There was a problem hiding this comment.
What is the limitation that prevents an error in these cases?
| /// The type of the full union, which can be used when this `UnionType` instance | ||
| /// is used in a type expression context. For `int | str`, this would contain |
There was a problem hiding this comment.
But we don't use this method when the UnionType instance is used in a type expression context, we use custom logic there to create a TypeAliasType for the lazy variant. So this is maybe not the best way to describe the function.
| /// `<class 'str'>`. For `Union[int, str]`, this field is `None`, as we infer | ||
| /// the elements as type expressions. Use `value_expression_types` to get the | ||
| /// corresponding value expression types. | ||
| #[returns(ref)] |
There was a problem hiding this comment.
Why did we remove returns-ref here?
| UnionTypeInstanceInner::Lazy(_) => Err(InvalidTypeExpressionError { | ||
| fallback_type: Type::unknown(), | ||
| invalid_expressions: smallvec::smallvec_inline![ | ||
| InvalidTypeExpression::InvalidType(ty, self.definition.scope(db)) | ||
| ], | ||
| }), |
There was a problem hiding this comment.
This should be unreachable, right? Should we mark it as such instead of emitting a confusing error?
| other => Err(InvalidTypeExpressionError { | ||
| fallback_type: Type::unknown(), | ||
| invalid_expressions: smallvec::smallvec_inline![ | ||
| InvalidTypeExpression::InvalidType(other, self.definition.scope(db)) | ||
| ], | ||
| }), |
There was a problem hiding this comment.
Shouldn't this also be unreachable?
| // Convert eager union instances to lazy for assignments | ||
| // This enables lazy evaluation for recursive type aliases like `Foo = int | list["Foo"]` | ||
| let value_ty = if let Some(instance) = value_ty.as_union_type_instance() | ||
| && let Some(eager) = instance.as_eager(self.db()) |
There was a problem hiding this comment.
So at this point we have already created an eager UnionTypeInstance, which means we've already inferred all its elements as both value and type expressions. Then, if it's convertible to lazy, we throw all of that away and store just the definition.
I wonder a) why we need to be so eager about creating the union_type that we do it before we check if we can convert to lazy, and b) why we throw away the value types rather than storing them in the lazy variant too.
I realize that typing.Union would need to be handled differently since there we never infer elements as value expressions. But in that case it seems like we have more control (we already know it's a union before inferring any element types) -- I think we could have special handling in the assignment inference for a top-level typing.Union subscripting, where we create a lazy union instance and don't infer any of the elements at all until we need them (using the existing infer_deferred_types mechanism that is already used for elements of TypeVar etc.)
| DefinitionKind::AnnotatedAssignment(assignment) => assignment.value(&module).unwrap(), | ||
| _ => unreachable!(), | ||
| }; | ||
| let value_expression = Expression::new( |
There was a problem hiding this comment.
This means we re-infer the expression type again, potentially twice. Why do we need to do this, when we already inferred its value types in type inference? Why can't the lazy UnionTypeInstance variant, and the implicit TypeAliasType variant, both store the value types we originally inferred?
(I realize this means typing.Union would need a different path; discussed that in a different comment.)
| cycle_initial=lazy_union_value_type_cycle_initial, | ||
| heap_size=ruff_memory_usage::heap_size | ||
| )] | ||
| fn lazy_union_value_type<'db>( |
There was a problem hiding this comment.
This method seems basically like a copy of ImplicitTypeAliasType::value_type, including the cycle handling. Could one call the other, or extract a shared helper?
Summary
Part of astral-sh/ty#1738
This PR allows implicit (or PEP-613) union type aliases to be self-referential.
To achieve this, two new Rust structs are added:
LazyUnionTypeInstanceandImplicitTypeAliasType.As a result, the previous
UnionTypeInstanceis now calledEagerUnionTypeInstance. Also, anImplicit(ImplicitTypeAliasType)variant is added toTypeAliasType.First, a union type instance is created as
EagerUnionTypeInstance. Then, when registering a binding, it is checked whether it can be promoted toLazyUnionTypeInstance. The promotion criteria include whether it is a valid union type and whether it is a non-generic type (this should be relaxed in the future, but is marked as TODO for this PR).LazyUnionTypeInstanceis converted toImplicitTypeAliasTypein a type context.ImplicitTypeAliasTypebehaves almost identically to PEP-695 type alias (except that it doesn't have a generic context yet).There are other
KnownInstances that can be converted toImplicitTypeAliasTypein a type context, but in this PR we will focus on union type instances.Also, this change means that implicit type aliases are now shown with their name rather than their expanded type.
This is preferable, assuming there is reasonable intent behind the naming of type aliases.
Fixes astral-sh/ty#1883 (with #22241)
Fixes astral-sh/ty#1998
Performance analysis
#22238 (comment)
Ecosystem analysis
The new false positive errors appears to be due to astral-sh/ty#2015. Type aliases should basically behave the same as value types, but as in astral-sh/ty#2015, inconsistencies in behavior have been observed in implementations, which may also affect performance.
Test Plan
New corpus test
mdtest updated