[ty] Add support for narrowing on tuple match cases #22134

charliermarsh wants to merge 5 commits into main from …
Conversation
Typing conformance results: No changes detected ✅
Given the motivating example:

```python
def func(subj: tuple[int | str, int | str]):
    match subj:
        case x, str():
            reveal_type(subj)  # tuple[int | str, str]
        case y:
            reveal_type(subj)  # tuple[int | str, int]
```

This PR doesn't fully solve this case, because …
@charliermarsh why did you close this PR? Was it just that it wasn't getting a review, and didn't seem like it fully solved the case? I haven't looked at the code in detail here, but it seems to me that getting the right narrowing intersections is step one; the remaining part is astral-sh/ty#493 -- it might be fine to land those separately? So I think we could reopen this and take a look at it?
@carljm -- just that it wasn't getting a review and we had a big backlog, so I figured I'd need to revisit regardless. I'll re-open and look at rebasing the changes.
Oh, I think we probably weren't reviewing it because it was always in draft before!
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
| invalid-argument-type | 0 | 1 | 31 |
| unsupported-operator | 0 | 0 | 2 |
| unused-type-ignore-comment | 1 | 1 | 0 |
| not-iterable | 0 | 0 | 1 |
| type-assertion-failure | 0 | 1 | 0 |
| unresolved-attribute | 0 | 0 | 1 |
| **Total** | 1 | 3 | 35 |
Memory usage report: Memory usage unchanged ✅
carljm left a comment:
Thanks for doing this! I do think this is going to take some more work. But it's an important feature, so I think it's worth working on.
```rust
};

let Some(tuple_spec) = tuple_spec else {
    // Subject is not a tuple type; can't determine if it matches.
```
There are a number of cases where we could conclude AlwaysFalse here: final types that can never be sequences (some literals are the easy case, but also final nominal types that aren't sequences), or sequence types whose contained type is disjoint with one or more of the sequence patterns. Seems out of scope for this PR (and lower priority than handling unions and intersections) but worth a TODO comment here.
crates/ty_python_semantic/src/semantic_index/reachability_constraints.rs
```rust
    result
}
TupleSpec::Variable(_) => {
    // Variable-length tuples could match patterns of various lengths.
```
Similarly here, we could return AlwaysFalse if the variable element type doesn't match at least one element type.
```rust
    return None;
}

// Negative narrowing for sequences is not supported. It would produce types like
```
I think this should be a TODO. Both pyright and mypy support negative narrowing (pyrefly doesn't).
```python
match subj:
    case ((str(), _), _):
        # The inner tuple is narrowed by intersecting with the pattern's constraint
        reveal_type(subj)  # revealed: tuple[tuple[int | str, int] & tuple[str, object], int | str]
```
The problem with this isn't just that it doesn't simplify, it's that it's the wrong type (too narrow). The sub-pattern `(str(), _)` cannot be represented as an intersection with `tuple[str, object]`, because it doesn't require that the subject is a tuple at all! It could be any other kind of two-element sequence. It could be a `list[str]` that happens to have two elements. But `list[str] & tuple[str, object]` would simplify to `Never`.

So I think in these nested sequence pattern cases, we either need to not narrow at all, or we need to recursively descend with the full narrowing logic, not just fall back to an intersection.
```rust
PatternPredicateKind::Value(expr) => Some(infer_same_file_expression_type(
    self.db,
    *expr,
    TypeContext::default(),
    self.module,
)),
```
Value patterns are compared by equality, and equality behavior is very customizable in Python, so it's not correct to just say that whatever the type of the value pattern is, it constrains the subject to that type. Should add a test for this.
```rust
.iter()
.map(|p| self.pattern_to_type_constraint(p).unwrap_or(Type::object()))
.collect();
Some(Type::heterogeneous_tuple(self.db, elements))
```
This is wrong for the reason discussed above -- a sequence pattern does not constrain the subject to be a tuple.
```rust
.iter()
.zip(element_patterns.iter())
.map(|(element_ty, pattern)| {
    if let Some(constraint_ty) = self.pattern_to_type_constraint(pattern) {
```
I think we can't fall back to an intersection with some type computed in isolation by `pattern_to_type_constraint` here -- we have to figure out how to recursively use the full logic of `evaluate_match_*`.
```rust
};

// Apply pattern constraint if present.
if let Some(constraint_ty) = self.pattern_to_type_constraint(pattern) {
```
Same as above -- I think the logic to find the right element here is good, but we can't fall back to just intersecting with `pattern_to_type_constraint`.
This PR sort-of feels like it's trying to run before we can walk -- we don't even do any narrowing for this basic case yet:

```python
from typing import Sequence

def f(x: Sequence[int] | None):
    match x:
        case [*_]:
            reveal_type(x)
```

Whatever the length implied by a sequence pattern, and whatever the narrowing constraints implied by any sub-patterns inside the sequence pattern, we should always be able to intersect the type with …

We also haven't implemented astral-sh/ty#560 yet. There's some detailed discussion in that issue describing possible ways we could implement length narrowing for unions of tuples, which look different to the way that this PR currently approaches the problem -- but we should probably share a common implementation for the two different kinds of tuple length narrowing.

While I agree with @carljm that this is an important feature, to me it would make sense to do the more foundational features first and then build tuple narrowing on top of those.
I'll have to defer to @carljm on that. I just don't have that broader context, though I'm happy to look at the linked issues and start smaller if that's the decision. It would be nice to get some consensus on that before I do yet more cycles here, though, since it seems like something we could've figured out before working on it at all :)
It makes sense to handle this case, but it also seems like an uncommon edge case that will likely be of very limited ecosystem impact on its own? Maybe we should implement it by just intersecting with that type, but maybe not, if that doesn't integrate well with how we need to handle the more advanced forms of narrowing done in this PR.

We definitely want the forms of narrowing implemented in this PR, and they are not possible to implement by "just intersecting with some type", unless we would add brand-new, very complex non-spellable types for …

I think it would be great to tackle tuple length narrowing soon/also, but I'm not seeing a strong argument here for why it should be done first. If anything, both of the above-mentioned cases make me think that it's really good we're looking at the full problem up front, so we don't design one thing for the easier cases and then have to throw it all out.

(@charliermarsh feel free to hold off on this if you want until @AlexWaygood and I reach consensus here, in case I've missed something important!)
It doesn't feel like that much of an edge case to me (as one who has written a lot of code spelunking through Python ASTs 😄), and I can certainly find some projects that would benefit from this kind of narrowing. But it's true that most projects that use sequence patterns appear to already know that they have a sequence of some kind.
Whether it's the best way (I suspect it may not be) I'm not sure, but conceptually I think it's certainly possible, given this pattern:

```python
def f(x: object):
    match x:
        case [int(), str()]:
            reveal_type(x)
```

to apply a narrowing constraint of:

```python
class P(Protocol):
    def __len__(self, /) -> Literal[2]: ...
    @overload
    def __getitem__(self, value: Literal[0], /) -> int: ...
    @overload
    def __getitem__(self, value: Literal[1], /) -> str: ...
```

To an extent, the synthesized …
Anyway, while I would still be inclined to build up our narrowing capabilities incrementally here, I don't feel that strongly about the order of work and it sounds like you robustly disagree :-) so feel free to proceed...
I think you're right that synthesized protocols with overloaded …

From discussion in Discord: one concern with implementing narrowing by intersecting with structural types is that the interaction of union/intersection simplification with structural types and nominal types breaking Liskov is our leading theory for the main cause of non-determinism. So it might be best to hold off here until we have clearer conclusions from the non-determinism investigation, and then evaluate whether we want to pursue intersecting with structural types or an approach more like what this PR does.

Thanks @charliermarsh -- I think either way this PR will be valuable in showing what the non-protocol version of this narrowing would look like.
Summary
Closes astral-sh/ty#561.