
[red-knot] Rewrite `Type::try_iterate()` to improve type inference and diagnostic messages #16321

Merged
AlexWaygood merged 13 commits into astral-sh:main from AlexWaygood:alex/bad-iter-msg-2
Feb 25, 2025

Conversation

@AlexWaygood
Member

@AlexWaygood AlexWaygood commented Feb 22, 2025

Summary

This PR rewrites `Type::try_iterate()` to greatly improve the quality of our type inference for edge cases involving iterables, and the quality of our diagnostics. `IterateError` is renamed to `IterationError`, and is expanded to have 10(!) inner variants. (Following code review, I've managed to reduce this to 4 inner variants.)

Fixes #16272. Fixes #16123. Helps a lot with #13989.

Note: some of the diagnostic messages are now somewhat verbose. In the long run, I think we could probably change a lot of them to be diagnostics with concise messages but with several notes attached to them.
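As a rough illustration of the shape this refactor describes (all variant names, fields, and messages below are my own invention, not the PR's actual definitions), an iteration-error enum can carry the underlying failure in each variant so that the diagnostic path can later render a precise message:

```rust
// Hypothetical sketch: variant names, fields, and messages are illustrative
// and do not reproduce the PR's actual `IterationError` definition.
pub enum IterationError {
    /// Calling `__iter__` on the iterable failed.
    IterCallError(String),
    /// `__iter__` succeeded, but calling `__next__` on the returned iterator failed.
    NextCallError(String),
    /// No `__iter__`; the old-style `__getitem__` protocol was tried and failed.
    GetitemCallError(String),
    /// The object has neither `__iter__` nor `__getitem__`.
    NotIterable,
}

impl IterationError {
    /// All message-building work happens here, on the path where a
    /// diagnostic is definitely going to be reported.
    pub fn report_diagnostic(&self, object: &str) -> String {
        match self {
            Self::IterCallError(cause) => format!(
                "Object of type `{object}` is not iterable because calling its `__iter__` method failed: {cause}"
            ),
            Self::NextCallError(cause) => format!(
                "Object of type `{object}` may not be iterable because calling `__next__` on its iterator may fail: {cause}"
            ),
            Self::GetitemCallError(cause) => format!(
                "Object of type `{object}` is not iterable because calling its `__getitem__` method failed: {cause}"
            ),
            Self::NotIterable => format!(
                "Object of type `{object}` is not iterable because it has no `__iter__` or `__getitem__` method"
            ),
        }
    }
}

fn main() {
    let err = IterationError::NotIterable;
    let message = err.report_diagnostic("int");
    assert!(message.contains("is not iterable"));
    println!("{message}");
}
```

The point of stashing the underlying cause inside each variant is that the (potentially expensive) message construction is deferred until a diagnostic is actually emitted.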

Test Plan

Several new mdtests, and lots of diagnostic snapshots

@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Feb 22, 2025
@AlexWaygood AlexWaygood force-pushed the alex/bad-iter-msg-2 branch 2 times, most recently from 85abfd5 to 2c8bab5 on February 22, 2025 22:16
Member

@MichaReiser MichaReiser left a comment


The error messages look great. Although they're all rather long now. It makes me wonder if now is the right time to improve them or if we should just wait for when we have notes (or another mechanism to explain why a certain diagnostic was created) to make the improvement.

Either way. I'm concerned about having 10 `IterateError` variants. It doesn't seem to scale well, and I'm somewhat convinced that we'll have to retain even more information for diagnostics in the future (e.g. the ranges where the `__getitem__` and `__iter__` methods are defined).

This makes me believe that we should not distinguish between all those variants in `IterateError`, but instead redo some of the iteration logic in `IterateError::report_diagnostic`.

@AlexWaygood AlexWaygood force-pushed the alex/bad-iter-msg-2 branch 2 times, most recently from 1bce75e to c08746f on February 23, 2025 14:08
@AlexWaygood
Member Author

AlexWaygood commented Feb 23, 2025

The error messages look great.

Thanks!

Although they're all rather long now. It makes me wonder if now is the right time to improve them or if we should just wait for when we have notes (or another mechanism to explain why a certain diagnostic was created) to make the improvement.

Yeah. The main reasons why I wanted to do something now were:

  • We have some error messages on main that are just flat-out incorrect and pretty confusing ([red-knot] Incorrect error message when iterating over an object that has an `__iter__` method that might not return an object with a `__next__` method #16272)
  • We can finally have both type inference that is more precise and diagnostics that are more informative, following your great work this week refactoring `Type::call()` and `Type::iterate()`. I wanted to try it out 😄
  • I think we will want all this information in our diagnostics. I agree that we probably won't want it all presented in one long sentence in the final version of our diagnostics, but the changes I'm making here mean that all the information will be available to us to produce those kinds of diagnostics-with-notes when the time comes.

Either way. I'm concerned about having 10 IterateError variants.

I looked again following your comments, and managed to reduce it to 4 variants without sacrificing the precision of our type inference or our diagnostic messages.


## With non-callable iterator

<!-- snapshot-diagnostics -->
Member


I'm curious to hear your take on this and how you decided to use snapshotting, because it's not clear to me when we want to use snapshot tests. I found updating the messages very painful when working on `unsupported-bool-conversion`, but having the assertions (and messages) inline is significantly more readable and avoids bugs slipping through when snapshot tests are blindly accepted.

I think it's a failure if we start using snapshot tests everywhere (or for a large majority of tests), because the experience is then very close to what we have with Ruff. Instead, we should use them mainly to validate that the diagnostic ranges are correct.

Member Author


It's a great question.

Overall I feel like mdtest (as it is currently -- obviously we can keep improving it!) is fantastic in a lot of ways:

  • It allows us to write tests very concisely and declaratively
  • It's very easy to make lots of assertions on which types are revealed in which cases
  • It's very easy to make lots of assertions that "this Python code leads us to emit an error on that line with error code X".

What I think mdtest isn't very well suited to right now is making assertions on the full error message:

  • It's very tedious to have to go through all locations and update the asserted error message if you make a change to the error message
  • Even if you do make an assertion on the full error message, you don't see the full details of how the diagnostic will be rendered to the user. And as we've been discussing, that seems pretty relevant here, since cramming all the information into one sentence obviously isn't the ideal way of presenting these diagnostics.
  • [This one is very fixable] I was pretty surprised that my changes didn't initially lead to any tests failing, even though I had changed a bunch of error messages. That's because if the actual error message is "Object X is not iterable because foo bar baz" and the assertion is `error: [not-iterable] "Object X is not iterable"`, the assertion will pass, since mdtest just checks whether the actual error message contains the asserted error message (it doesn't check that the asserted error message is a full match).
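To make that last point concrete, here is a minimal standalone sketch (not mdtest's actual implementation) of the difference between the current "contains" semantics and a hypothetical full-match rule:

```rust
// Illustrative only: mdtest's real matching logic lives elsewhere.
// Under "contains" semantics, a shortened assertion still passes even
// after the real message grows a longer explanation.
fn matches_contains(actual: &str, asserted: &str) -> bool {
    actual.contains(asserted)
}

// Under full-match semantics, any change to the message forces the
// assertion to be updated.
fn matches_full(actual: &str, asserted: &str) -> bool {
    actual == asserted
}

fn main() {
    let actual = "Object `X` is not iterable because foo bar baz";
    let asserted = "Object `X` is not iterable";
    // The shortened assertion passes under the current "contains" rule...
    assert!(matches_contains(actual, asserted));
    // ...but a full-match rule would reject it.
    assert!(!matches_full(actual, asserted));
}
```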

So using snapshots felt like a good fit for a lot of these tests, where what we really want to test is how the information is reported to the user, and where we expect that the presentation of the diagnostics will continue to change in the future. However, I do agree that it's not ideal:

  • It would be so much nicer with inline snapshots; having them far away from the Python snippets makes it much harder to tell whether the snapshot is correct or incorrect
  • It's much easier to just blindly accept the snapshot without checking that it's correct
  • Our current snapshotting setup for mdtest isn't ideal. `cargo test -p red_knot_python_semantic` stops testing a markdown file as soon as it finds a failing snapshot in that markdown file. That meant I had to run `cargo test -p red_knot_python_semantic; cargo insta review` several times before all snapshots for `for.md` were updated, which was pretty frustrating.

All told, I'm also not sure that using this much snapshotting is a great idea right now. I'm happy to switch it to standard mdtest assertions on full diagnostic messages if you'd prefer it :-)

Member


Thanks for the detailed explanation. I'd probably create one snapshot test for each unique diagnostic rendering variant. Tests that only exercise type inference behavior should not use snapshot testing. I leave it up to you to assess whether that's the case.

I suspect that @carljm has opinions on this as well ;)

Member Author


I'd probably create one snapshot test for each unique diagnostic rendering variant.

That makes sense to me. I did a quickish audit and removed a few hundred lines of snapshots. We're pretty much there on this PR now, I think.

Contributor


I don't think I have a lot to add here. As far as the current state, I think I basically agree with what Micha suggests above.

This thread did prompt me to add a few thoughts over on https://github.com/astral-sh/ruff/issues/16255

Regarding messages in # error: comments being a "contains" check rather than a "full match", that was intentional, the idea being to reduce update pain by only asserting the critical information that a certain test wants to ensure is present in the message. I'm not sure this works out in practice, though; we almost always use the full message, and it reads pretty oddly if you don't. So I'd be OK changing this to full match.

Member Author


Regarding messages in # error: comments being a "contains" check rather than a "full match", that was intentional, the idea being to reduce update pain by only asserting the critical information that a certain test wants to ensure is present in the message. I'm not sure this works out in practice, though; we almost always use the full message, and it reads pretty oddly if you don't. So I'd be OK changing this to full match.

Yeah, I remember the reasons for the decision! But I also agree that it hasn't proven useful in practice. And as I say, it actually confused me briefly while working on this PR -- so I'd also favour changing it now to require a full match.

Member Author


Hmm, I tried out making this change and a surprising number of tests failed. I guess we use this feature more than I thought.

It's probably worth revisiting this once we have support for subdiagnostics/notes. Once those are implemented, we'll generally have shorter summary lines in our diagnostics, so it'll be less cumbersome to enforce a full match on the summary line as part of mdtest.

@AlexWaygood AlexWaygood force-pushed the alex/bad-iter-msg-2 branch 2 times, most recently from 4d26391 to a9b984c on February 24, 2025 18:23
Contributor

@carljm carljm left a comment


This looks great to me, thank you! I think the number of variants of `IterationErrorKind` is reasonable, and the current PR does a good job of deferring any extra work that is solely for diagnostic purposes into the path where there definitely is a diagnostic to report (mostly by wrapping the underlying call error in the iteration error, which seems like a good pattern to me).



@AlexWaygood AlexWaygood enabled auto-merge (squash) February 25, 2025 14:01
@AlexWaygood AlexWaygood merged commit 5c007db into astral-sh:main Feb 25, 2025
20 checks passed
dcreager added a commit that referenced this pull request Feb 25, 2025
* main: (38 commits)
  [red-knot] Use arena-allocated association lists for narrowing constraints (#16306)
  [red-knot] Rewrite `Type::try_iterate()` to improve type inference and diagnostic messages (#16321)
  Add issue templates (#16213)
  Normalize inconsistent markdown headings in docstrings (#16364)
  [red-knot] Better diagnostics for method calls (#16362)
  [red-knot] Add argfile and windows glob path support (#16353)
  [red-knot] Handle pipe-errors gracefully (#16354)
  Rename `venv-path` to `python` (#16347)
  [red-knot] Fixup some formatting in `infer.rs` (#16348)
  [red-knot] Restrict visibility of more things in `class.rs` (#16346)
  [red-knot] Add diagnostic for class-object access to pure instance variables (#16036)
  Add `per-file-target-version` option (#16257)
  [PLW1507] Mark fix unsafe (#16343)
  [red-knot] Add a test to ensure that `KnownClass::try_from_file_and_name()` is kept up to date (#16326)
  Extract class and instance types (#16337)
  Re-order changelog entries for 0.9.7 (#16344)
  [red-knot] Add support for `@classmethod`s (#16305)
  Update Salsa (#16338)
  Update Salsa part 1 (#16340)
  Upgrade Rust toolchain to 1.85.0 (#16339)
  ...
@AlexWaygood AlexWaygood deleted the alex/bad-iter-msg-2 branch July 25, 2025 18:32
