[red-knot] Add control flow for try/except blocks#13338
[red-knot] Add control flow for try/except blocks#13338AlexWaygood wants to merge 12 commits intomainfrom
try/except blocks#13338Conversation
|
|
The def foo():
try:
x = 42
except TypeError:
return
# x is guaranteed to be defined here
def bar():
try:
x = 42
except TypeError:
raise RuntimeError
# x is guaranteed to be defined here
def baz():
for i in range(5):
try:
x = 42
except TypeError:
continue
# x is guaranteed to be defined here
def eggs():
for i in range(5):
try:
x = 42
except TypeError:
break
# x is guaranteed to be defined here |
I discussed this in person with @carljm -- for now, we'll defer fixing this, since it's a general issue that also applies to other kinds of control flow such as |
To be more specific here: the buggy thing I was doing in an early version of the PR was this: --- a/crates/red_knot_python_semantic/src/semantic_index/builder.rs
+++ b/crates/red_knot_python_semantic/src/semantic_index/builder.rs
@@ -725,7 +725,6 @@ where
// states during the `try` block before visiting the `except` blocks.
let pre_try_block_state = self.flow_snapshot();
let saved_definition_states = std::mem::take(&mut self.try_block_definition_states);
- let visiting_nested_try_block = self.visiting_try_block;
// Visit the `try` block!
//
@@ -733,7 +732,7 @@ where
// in `self.try_block_definition_states` after each new definition is recorded.
self.visiting_try_block = true;
self.visit_body(body);
- self.visiting_try_block = visiting_nested_try_block;
+ self.visiting_try_block = false;I'm pretty sure that this was incorrect (I don't think it handles nested |
MichaReiser
left a comment
There was a problem hiding this comment.
Uff, excepts are hard haha. Great work!
carljm
left a comment
There was a problem hiding this comment.
This is fantastic!! Probably the most complex control flow we'll need to handle, due to the "control flow can depart anywhere in the middle of a block" wrinkle.
|
|
||
| assert_public_ty(&db, "src/a.py", "e", "NameError"); | ||
| assert_public_ty(&db, "src/a.py", "f", "error"); | ||
| assert_public_ty(&db, "src/a.py", "e", "Unbound | NameError"); |
There was a problem hiding this comment.
In principle I think these tests would be better if they asserted the inferred type of the name inside the except block, rather than at end of scope.
But that's quite a major pain to do in our current test framework, so I would just leave the tests as-is for now and wait for the new test framework with reveal_type :)
| } | ||
|
|
||
| #[test] | ||
| fn except_handler_multiple_definitions_in_try_block() -> anyhow::Result<()> { |
There was a problem hiding this comment.
I think a test that is missing here is something verifying that within the else block, names bound in the try block are definitely bound. (This is along the same lines as the above comment; we're generally weak on tests demonstrating inferred types inside a particular block, rather than at end of scope.)
But as mentioned above, this is difficult to do with current test infra. (You can assign the name to another name, but then that other name is still potentially unbound, so the test isn't super clear.) But it would be nice to have a TODO to add this test, once we have the new test framework.
6d1607b to
8c9e02b
Compare
|
Noting it here for myself because I'm context-switching to something else: even not considering nested try:
x = 1
except TypeError:
x = 2
else:
x = 3
finally:
passThis means that the inferred type of |
c25592d to
297ff65
Compare
|
Re-requesting review as I had to change the approach quite radically to fix issues with nested |
carljm
left a comment
There was a problem hiding this comment.
Awesome work tracking down all these different test cases and sorting out how to make them all pass!
| // If `visiting_try_block == true`, this informs `.add_definition()` to record the state | ||
| // in `self.try_block_definition_states` after each new definition is recorded. |
There was a problem hiding this comment.
This comment looks a bit outdated relative to the implementation?
| // That's actually an irrelevant detail for us here, though, because | ||
| // a nonexistent `else` or `finally` block is just represented | ||
| // as an empty `Vec` in our AST, and visiting an empty `Vec` will not | ||
| // change the flow state at all. |
There was a problem hiding this comment.
It could be relevant to perf. Snapshotting and restoring flow states is sufficiently expensive (if there are lots of symbols) that it might be worth explicitly checking for empty else and finally, if that would allow us to skip any snapshots or restores/merges. And it looks like we could skip a fair amount of those, if there is no finally body and no parent try block?
| self.flow_merge(state); | ||
| } | ||
| if !exhaustive_except_branches { | ||
| for state in visiting_try_block_states.snapshots { |
There was a problem hiding this comment.
Do we also need to merge the pre-try-block state here? (We could jump straight from before completing any statement in the try block, to executing the finally block and/or into an outer try block.)
| // Account for the fact that we could be visiting a nested `try` block, | ||
| // in which case the outer `try` block must be kept informed about all the possible | ||
| // between-definition states we could have encountered in the inner `try` block(!) | ||
| if let Some(current_try_block) = |
There was a problem hiding this comment.
I find it slightly confusing that within the same try block visit, above we refer to an outer try block as parent_try_block and here we refer to the same thing as current_try_block. Let's be consistent about what is "current" and what is "parent"?
| &db, | ||
| "src/a.py", | ||
| "x", | ||
| "Literal[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]", |
There was a problem hiding this comment.
I think x can be Unbound here; it's never set in an exhaustive except, or in a finally.
I think this might be the issue I pointed out above, where we don't include the pre-try-block state as a pre-finally possibility.
| )?; | ||
|
|
||
| assert_file_diagnostics(&db, "src/a.py", &[]); | ||
| assert_public_ty(&db, "src/a.py", "x", "Unbound | Literal[1, 2, 101]"); |
There was a problem hiding this comment.
I don't see how x could actually be unbound in this scenario, but I'm also not sure if it's worth fixing. We assume that we could exit the outer try with an exception before executing the inner try at all, but that's not actually true in this example; nothing happens in the outer try before we enter the inner try, so in actual fact we are guaranteed to either complete the assignment x = 1 in the inner try, or have an exception caught by the exhaustive inner handler, which also binds x.
I think we would handle this correctly if, instead of saving a pre-try state, and then pushing a state post each definition, we instead pushed a state pre each definition, and then saved a post-try state.
There was a problem hiding this comment.
I think this comment is wrong; given that we don't currently try to distinguish non-raising statements, we have to assume that x = 2 in the inner exhaustive except could also raise, so it is correct for x to be possibly unbound.
| // would not necessarily have been overridden by the otherwise-exhaustive | ||
| // `except`/`else` branches in the inner block. |
There was a problem hiding this comment.
"otherwise-exhausive except/else branches in the inner block" doesn't seem right -- those aren't exhaustive
| // block due to another exception, in which case the assignments in the inner `try` block | ||
| // would not necessarily have been overridden by the otherwise-exhaustive | ||
| // `except`/`else` branches in the inner block. | ||
| assert_public_ty(&db, "src/a.py", "x", "Unbound | Literal[1, 2, 3, 4]"); |
There was a problem hiding this comment.
How is this assertion different from if the outer try/except weren't there?
| )?; | ||
|
|
||
| assert_file_diagnostics(&db, "src/a.py", &[]); | ||
| assert_public_ty(&db, "src/a.py", "x", "Unbound | Literal[4, 5]"); |
There was a problem hiding this comment.
How is this assertion different from if the outer try/except weren't there?
297ff65 to
e9311ed
Compare
e9311ed to
11a5499
Compare
MichaReiser
left a comment
There was a problem hiding this comment.
To me it's unclear why we need the extra except handler stack vs just using the function stack (all you need is the current context and the parent's). Keeping an extra Vec around isn't expensive but I think it's slightly more complicated than just making use of the stack.
| "/src/tomllib/_parser.py:579:12: Name 'char' used when possibly not defined.", | ||
| "/src/tomllib/_parser.py:580:63: Name 'char' used when possibly not defined.", | ||
| "/src/tomllib/_parser.py:629:38: Name 'datetime_obj' used when possibly not defined.", | ||
| ]; |
There was a problem hiding this comment.
I recommend moving this comment next to the relevant errors. It increases the likelihood that the comment gets deleted when the errors are resolved (I would miss the comment up there)
| ]; | |
| static EXPECTED_DIAGNOSTICS: &[&str] = &[ | |
| "/src/tomllib/_parser.py:5:24: Module '__future__' has no member 'annotations'", | |
| // The failed import from 'collections.abc' is due to lack of support for 'import *'. | |
| "/src/tomllib/_parser.py:7:29: Module 'collections.abc' has no member 'Iterable'", | |
| "Line 69 is too long (89 characters)", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| "Use double quotes for strings", | |
| // The "possibly not defined" errors in `tomllib/_parser.py` are because | |
| // we don't yet understand terminal statements (`return`/`raise`/etc.) in our control-flow analysis | |
| "/src/tomllib/_parser.py:66:18: Name 's' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:98:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:101:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:104:14: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:104:14: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:115:14: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:115:14: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:126:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:348:20: Name 'nest' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:353:5: Name 'nest' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:453:24: Name 'nest' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:455:9: Name 'nest' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:482:16: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:566:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:573:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:579:12: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:580:63: Name 'char' used when possibly not defined.", | |
| "/src/tomllib/_parser.py:629:38: Name 'datetime_obj' used when possibly not defined.", | |
| ]; |
crates/red_knot_python_semantic/src/semantic_index/except_handlers.rs
Outdated
Show resolved
Hide resolved
| if let Some(parent_try_block) = | ||
| self.try_block_contexts().borrow_mut().current_try_block() | ||
| { | ||
| parent_try_block.record_visiting_nested_try_stmt(); |
There was a problem hiding this comment.
The method name here seems misleading to me because it is called after we visit the try body. It seems it's about visiting the except handlers, finaly , and or else blocks of a try.
Related: What's the reason that we don't record definitions in a nested try? E.g. what if
try:
try:
x = 10:
maybe_panic()
finally:
y = panic()
finally:
yshouldn't the use of y be an error?
but
try:
try:
x = 10:
maybe_panic()
finally:
y = 10
finally:
yshould be okay?
There was a problem hiding this comment.
I commented elsewhere that it seems we currently don't have any tests that fail if we stop paying attention to visiting_nested_try_stmt entirely.
And it seems to me that Micha is right here -- outside the inner try block body itself, it does seem like intermediate states in the except/else/finally should all get registered with the outer try block, because we could have an exception happen in any of those places. (And such an exception would not cause us to still enter an inner finally block, which only applies to the inner try body; it would jump straight to being handled by the outer try block.) So it does seem to me like visiting_nested_try_stmt might be inherently semantically not right?
There was a problem hiding this comment.
(Cases to consider for tests here are e.g. x = 1; x = 2 inside an except, else, or finally block in a nested try statement -- the x = 1 possibility should be visible to the outer try block.
There was a problem hiding this comment.
(And such an exception would not cause us to still enter an inner
finallyblock, which only applies to the innertrybody; it would jump straight to being handled by the outertryblock.)
What do you mean? The inner finally block is indeed always executed, even if the exception is caught by the outer try block:
>>> try:
... try:
... raise TypeError()
... except NameError:
... print(1)
... finally:
... print(2)
... except TypeError:
... print(3)
... finally:
... print(4)
...
2
3
4There was a problem hiding this comment.
What I mean is that if an exception is raised inside an except/else/finally block in the nested try (not inside the try body), in that case the inner finally is not executed, the exception unwinds directly to the outer try statement handling.
There was a problem hiding this comment.
... except this is just wrong 😆 apparently finally does still apply, even if the except is raised in an except handler or else block! So never mind.
There was a problem hiding this comment.
I guess the one thing to consider still is the case where an exception is raised within the finally block itself? I.e. is it important that we register those intermediate states within the nested finally block with the outer try statement?
There was a problem hiding this comment.
I still can't find any cases where the inner finally is not executed, though it is of course true that if an exception is raised in an except handler then we might switch to the inner finally block before the except handler has finished, and indeed if an exception is raised in the inner finally block then that finally block might not run to completion. I might not handle that properly right now...
>>> try:
... try:
... raise TypeError
... except TypeError:
... raise RuntimeError
... print(1)
... finally:
... print(2)
... except RuntimeError:
... print(3)
... finally:
... print(4)
...
2
3
4| // in which case the outer `try` block must be kept informed about all the possible | ||
| // between-definition states we could have encountered in the inner `try` block(!) | ||
| if let Some(current_try_block) = | ||
| self.try_block_contexts().borrow_mut().current_try_block() |
There was a problem hiding this comment.
Can't we call self.try_block_contexts_mut() here to avoid the RefCell in the contexts (and all the borrow_mut stuff)?
|
|
||
| impl TryBlockContext { | ||
| pub(super) fn record_definition(&mut self, builder: &SemanticIndexBuilder) { | ||
| if !self.visiting_nested_try_stmt { |
There was a problem hiding this comment.
If I remove this check in the current version of this PR, all tests pass. So either we don't need this, or we need more tests.
There was a problem hiding this comment.
A test fails if you make this change:
diff --git a/crates/red_knot_python_semantic/src/semantic_index/builder.rs b/crates/red_knot_python_semantic/src/semantic_index/builder.rs
index c7e8a6428..74cbd6b2d 100644
--- a/crates/red_knot_python_semantic/src/semantic_index/builder.rs
+++ b/crates/red_knot_python_semantic/src/semantic_index/builder.rs
@@ -873,16 +873,6 @@ where
}
self.visit_body(finalbody);
-
- // Account for the fact that we could be visiting a nested `try` block,
- // in which case the outer `try` block must be kept informed about all the possible
- // between-definition states we could have encountered in the inner `try` block(!)
- if let Some(current_try_block) =
- self.try_block_contexts().borrow_mut().current_try_block()
- {
- current_try_block.record_exiting_nested_try_stmt();
- current_try_block.record_definition(self);
- }
}
_ => {
walk_stmt(self, stmt);
diff --git a/crates/red_knot_python_semantic/src/semantic_index/except_handlers.rs b/crates/red_knot_python_semantic/src/semantic_index/except_handlers.rs
index 2eded0089..ce4e76ca0 100644
--- a/crates/red_knot_python_semantic/src/semantic_index/except_handlers.rs
+++ b/crates/red_knot_python_semantic/src/semantic_index/except_handlers.rs
@@ -30,9 +30,7 @@ pub(super) struct TryBlockContext {
impl TryBlockContext {
pub(super) fn record_definition(&mut self, builder: &SemanticIndexBuilder) {
- if !self.visiting_nested_try_stmt {
- self.snapshots.push(builder.flow_snapshot());
- }
+ self.snapshots.push(builder.flow_snapshot());
}The if !self.visiting_nested_try_stmt check isn't necessary for correctness. But it's unnecessary to push all those snapshots to be recorded by the outer TryBlockContexts, since it's more efficient -- and more correct -- to simply record the state once at the completion of the inner try/except/else/finally block. I.e., the if !self.visiting_nested_try_stmt check is simply an optimisation to avoid unnecessary snapshotting.
I see that at the moment this is quite unclear; I'll add some more comments.
81b9bd2 to
4b385c7
Compare
4b385c7 to
89edfc9
Compare
|
Closing in favour of #13633 |
Summary
This PR adds control flow for
try/except/else/finallyblocks to red-knot.The semantics of
try/exceptblocks are as follows:exceptbranch is taken, this indicates that an exception in thetryblock caused that block to terminate early. This means that potentially 0 of the definitions in thetryblock were executed; or potentially all of them were; or potentially some number in between.exceptandelsebranches are mutually exclusive: if a singleexceptorelsebranch is taken, it means none of the others were taken.elsebranch was taken, it means that thetryblock was fully executed without any exceptions having occurred; we can in this branch count on all definitions in thetryblock having been executed.finallybranch, this branch is always executed last, before completion of the block, regardless of whether the block has anyexceptorelsebranches and (if so) which one was taken.In order to model (1), two new fields are added to the
SemanticIndexBuilder:visiting_try_blockflag is added, to keep track of whether we are visiting thetrybranch of atry/except/else/finallyblock.try_block_definition_statesfield is added, which is aVec<FlowSnapshot>. If thevisiting_try_blockflag is set totrue, theadd_definition()method pushes a snapshot to this vec after each new definition is added.Test Plan
cargo test.I believe I still may be missing some test coverage for nested
tryblocks. An early version of this PR looked like it had some subtle bugs there (where the visitor would no longer remember it was in atryblock if it had finished visiting atryblock inside atryblock). But I couldn't find a test that would fail as a result of the apparent bug (which I've now fixed in this PR branch, anyway).