[ty] Reimplement equivalence as mutual redundancy#23428
Conversation
Typing conformance resultsNo changes detected ✅ |
|
Memory usage reportSummary
Significant changesClick to expand detailed breakdownflake8
trio
sphinx
prefect
|
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
unresolved-attribute |
1 | 0 | 17 |
type-assertion-failure |
0 | 0 | 10 |
invalid-argument-type |
0 | 0 | 7 |
redundant-cast |
0 | 5 | 0 |
unsupported-operator |
0 | 0 | 4 |
invalid-return-type |
0 | 1 | 1 |
invalid-assignment |
0 | 1 | 0 |
invalid-await |
1 | 0 | 0 |
missing-argument |
0 | 1 | 0 |
| Total | 2 | 8 | 39 |
6b1a08e to
04aa5cc
Compare
|
Overall the ecosystem results look great:
|
Move the detailed docstring from `Redundancy` to `PureRedundancy` (the "pure" mathematical version used for type equivalence), reorder so `PureRedundancy` comes first, and add a new docstring to `Redundancy` explaining how it differs as a practical adaptation for union simplification. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
There was a problem hiding this comment.
We deleted the tests from here because it's harder to construct the synthesized variants directly without .normalize(). And it doesn't seem important to directly test something if the behaviour can't be captured in an mdtest.
099517d to
65e5cc2
Compare
This comment was marked as resolved.
This comment was marked as resolved.
| // `Generator` special case: Prior to 3.13, the `_ReturnT_co` type didn't appear in any | ||
| // methods (except `__iter__`, but that returns the self type recursively, so it can't rule | ||
| // out equivalence). We don't want generators with different return types to be equivalent | ||
| // to each other. In this case we compare the `ClassType`s nominally. | ||
| if let Protocol::FromClass(self_class) = self.inner | ||
| && let Protocol::FromClass(other_class) = other.inner | ||
| && self_class.known(db) == Some(KnownClass::Generator) | ||
| && other_class.known(db) == Some(KnownClass::Generator) | ||
| && Program::get(db).python_version(db) < PythonVersion::PY313 | ||
| { | ||
| return (*self_class).is_equivalent_to_impl(db, *other_class, inferable, visitor); | ||
| } |
There was a problem hiding this comment.
Some of the code from #23386 can now be removed, but the fix that PR also applied to Type::satisfies_protocol is untouched by this PR
| pub(crate) fn normalized(self, db: &'db dyn Db) -> Self { | ||
| self.normalized_impl(db, &NormalizedVisitor::default()) | ||
| } | ||
|
|
| self, | ||
| db: &'db dyn Db, | ||
| other: Type<'db>, | ||
| inferable: InferableTypeVars<'_, 'db>, |
There was a problem hiding this comment.
Removing this parameter is probably supported by the ecosystem results, but I think it is a subtle change in semantics and we might want to keep it for now, since the caller might have a particular "inferable typevar" set in mind that the equivalence check should take into account.
(Eventually the goal is that all of the type relations will use constraint sets to record information about typevars. has_relation_to_impl still currently performs some checks directly, and in different ways depending on whether a typevar is inferable or not. The goal is for has_relation_to_impl to always return a constraint set, which doesn't care about whether a typevar is inferable or not; and then for things like is_assignable_to to call satisfied_by_all_typevars instead of is_always_satisfied. And at that point, all of these methods will drop this parameter. But right now, I think it's best for them all to be consistent with each other in terms of still taking the parameter in.)
There was a problem hiding this comment.
Removing this parameter was necessary to avoid a big regression in our inference of urllib.parse.urlunparse() in this PR, which showed up in the ecosystem results on early versions of this PR. All of the new regression tests in crates/ty_python_semantic/resources/mdtest/call/overloads.md on this PR branch only pass because this parameter has been removed.
There was a problem hiding this comment.
I can restore the parameter and just pass in InferrableTypeVars::None to all callsites, if you prefer. But that felt a little silly.
There was a problem hiding this comment.
It seems to me, based on my analysis of the modified callsites, that Alex is right that we currently never check type equivalence in a context in which we should be inferring typevars at all (modulo the AlwaysTruthy/AlwaysFalsy case I commented on, where I don't think we should be checking equivalence anyway).
And this intuitively makes sense to me as well -- generally typevar inference should be happening in assignability checks, not equivalence checks. So it seems reasonable to me to just eliminate this parameter. But definitely interested if you think I'm missing something, @dcreager .
There was a problem hiding this comment.
That convinces me! And this formulation will still be correct in the future world where we're not passing this parameter around at all.
| .when_equivalent_to( | ||
| db, | ||
| current_parameter_type, | ||
| overload.inferable_typevars, |
There was a problem hiding this comment.
Here, for instance — with this change we're now ignoring that some of the typevars in the overload are inferable and some aren't, and are now treating all typevars as non-inferable. Is that correct? (I'm not sure!)
There was a problem hiding this comment.
I'm convinced by the added test ("Unknown argument with TypeVar overload") that this is a correct change. This equivalence check is not part of typevar inference from arguments (which is where we need to treat some typevars as inferable), it's an "internal" part of the overload algorithm, comparing overload parameter types for equivalence with other overload parameter types, in order to determine which overloads are capable of distinguishing based on an argument. This check should treat all parameter types (including typevars) as non-inferable and equivalent only to themselves.
| .when_equivalent_to( | ||
| db, | ||
| current_parameter_type, | ||
| overload.inferable_typevars, |
There was a problem hiding this comment.
I'm convinced by the added test ("Unknown argument with TypeVar overload") that this is a correct change. This equivalence check is not part of typevar inference from arguments (which is where we need to treat some typevars as inferable), it's an "internal" part of the overload algorithm, comparing overload parameter types for equivalence with other overload parameter types, in order to determine which overloads are capable of distinguishing based on an argument. This check should treat all parameter types (including typevars) as non-inferable and equivalent only to themselves.
| overload.inferable_typevars, | ||
| ) | ||
| .is_always_satisfied(db) | ||
| .is_equivalent_to(db, first_overload_return_type) |
There was a problem hiding this comment.
Rationale here is the same as above. This equivalence check is an "internal" check simply to decide whether two overloads return the same type or not, it's not part of generic inference from arguments. So we should not be inferring types for any typevars as part of this check, we should consider typevar returns equivalent only to themselves.
| self, | ||
| db: &'db dyn Db, | ||
| other: Type<'db>, | ||
| inferable: InferableTypeVars<'_, 'db>, |
There was a problem hiding this comment.
It seems to me, based on my analysis of the modified callsites, that Alex is right that we currently never check type equivalence in a context in which we should be inferring typevars at all (modulo the AlwaysTruthy/AlwaysFalsy case I commented on, where I don't think we should be checking equivalence anyway).
And this intuitively makes sense to me as well -- generally typevar inference should be happening in assignability checks, not equivalence checks. So it seems reasonable to me to just eliminate this parameter. But definitely interested if you think I'm missing something, @dcreager .
|
After rebasing on top of 1e33c4e, more changes unexpectedly started showing up in the ecosystem report: 5 ruff/crates/ty_vendored/vendor/typeshed/stdlib/typing.pyi Lines 1461 to 1464 in a83276d That meant that this assertion failed on from ty_extensions import static_assert, is_equivalent_to
from typing import Awaitable
static_assert(not is_equivalent_to(Awaitable[str], Awaitable[int]))which then meant that we were inaccurately reporting a bunch of But the fix that 1e33c4e applied for A regression test has been added in fa0a563! |
|
Re-requesting review just to be safe! Ready for another look. |
carljm
left a comment
There was a problem hiding this comment.
I love this PR so much. Great work tracking down the ecosystem changes and adding appropriate regression tests!
This looks good to me. I'd prefer if @dcreager could take a look and see if our conclusions about the inferrable typevars thing make sense to him, before we land.
dcreager
left a comment
There was a problem hiding this comment.
👍 to the inferable typevar parameter changes, thank you for looking into that and coming up with nice tests exercising it!
| self, | ||
| db: &'db dyn Db, | ||
| other: Type<'db>, | ||
| inferable: InferableTypeVars<'_, 'db>, |
There was a problem hiding this comment.
That convinces me! And this formulation will still be correct in the future world where we're not passing this parameter around at all.
| relation_visitor, | ||
| disjointness_visitor, | ||
| ) | ||
| }) |
There was a problem hiding this comment.
And this formulation works even in the presence of typevars. It will return the typevar solutions that cause the redundancy check to succeed in both directions.
|
Thanks both for the reviews!! Let's go 😃 🚀 |
|
Let's go |
Summary
Equivalence is now implemented via "mutual redundancy"* rather than a custom deeply recursive query. This change:
self.is_equivalent_to(db, other)the same performance characteristics asself.is_subtype_of(db, other) && other.is_subtype_of(db, self), removing a major footgun from the codebase(*Well, in actual fact, we introduce yet another
TypeRelationvariant to achieve this rather than reusing the existingRedundancyvariant. I've called it "PureRedundancy" currently, but the name can be bikeshedded. It works exactly the same way asRedundancy, except that it makes sure thatLiteral[False]is always equivalent toLiteral[False], whereasLiteral[False]is not always redundant withLiteral[False]if the firstLiteral[False]is promotable and the second is not.)Test Plan
QUICKCHECK_TESTS=1000000 cargo test --profile=profiling -p ty_python_semantic -- --ignored types::property_tests::stablelocallyCo-authored-by: Carl Meyer [email protected]