Skip to content

Comments

[syntax-errors] Document behavior of global declarations in try nodes before 3.13#17285

Merged
ntBre merged 8 commits intomainfrom
brent/syn-global-in-try
Apr 9, 2025
Merged

[syntax-errors] Document behavior of global declarations in try nodes before 3.13#17285
ntBre merged 8 commits intomainfrom
brent/syn-global-in-try

Conversation

@ntBre
Copy link
Contributor

@ntBre ntBre commented Apr 7, 2025

Summary

This PR extends the documentation of the LoadBeforeGlobalDeclaration check to specify the behavior on versions of Python before 3.13. Namely, on Python 3.12, the else clause of a try statement is visited before the except handlers:

Python 3.12.9 (main, Feb 12 2025, 14:50:50) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
>>> def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
  File "<stdin>", line 5
SyntaxError: name 'a' is used prior to global declaration

The order is swapped on 3.13 (see CPython#111123):

Python 3.13.2 (main, Feb  5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
... def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
  File "<python-input-0>", line 8
    global a
    ^^^^^^^^
SyntaxError: name 'a' is assigned to before global declaration
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
>>>

The current implementation of PLE0118 is correct for 3.13 but not 3.12: playground (it flags the first case regardless of Python version).

We decided to maintain this incorrect diagnostic for Python versions before 3.13 because the pre-3.13 behavior is very unintuitive and confirmed to be a bug, although the bug fix was not backported to earlier versions. This can lead to false positives and false negatives for pre-3.13 code, but we also expect that to be very rare, as demonstrated by the ecosystem check (before the version-dependent check was reverted here).

Test Plan

N/a

@ntBre ntBre added the rule Implementing or modifying a lint rule label Apr 7, 2025
@ntBre
Copy link
Contributor Author

ntBre commented Apr 7, 2025

I just realized that this might need to be gated behind preview since it modifies an existing, stable lint rule. Alternatively, I could just use a different SemanticSyntaxErrorKind here, which seems a little easier. On the other hand, this could probably be considered a bug in the current rule implementation.

@github-actions
Copy link
Contributor

github-actions bot commented Apr 7, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@ntBre
Copy link
Contributor Author

ntBre commented Apr 7, 2025

Oh, there's a false positive anyway, moving back to draft.

@ntBre ntBre marked this pull request as draft April 7, 2025 20:04
@ntBre
Copy link
Contributor Author

ntBre commented Apr 8, 2025

The false positive should now be handled by tracking a scope depth in the source-order visitor to avoid flagging more-deeply-nested references. I think functions and classes are the only ways to introduce this kind of scope, in which global statements are also allowed (I don't think you can put a global statement in a comprehension or lambda).

I think u32 should be huge overkill for the scope depth tracking, so I'm more than happy to drop the size of that to a more reasonable level, but I wasn't sure how low to go. Is u8 enough?

@ntBre ntBre marked this pull request as ready for review April 8, 2025 16:18
@ntBre ntBre force-pushed the brent/syn-global-in-try branch from 86347a5 to d158e91 Compare April 8, 2025 16:56
@dhruvmanila
Copy link
Member

I think u32 should be huge overkill for the scope depth tracking, so I'm more than happy to drop the size of that to a more reasonable level, but I wasn't sure how low to go. Is u8 enough?

u32 should be fine, we use the same for ScopeId:

/// Id uniquely identifying a scope in a program.
///
/// Using a `u32` is sufficient because Ruff only supports parsing documents with a size of max `u32::max`
/// and it is impossible to have more scopes than characters in the file (because defining a function or class
/// requires more than one character).
#[newtype_index]
pub struct ScopeId;

@dhruvmanila
Copy link
Member

On Python 3.12, the else clause is visited before the except handlers:

This is confusing, do we have a source like a changelog entry where the related change is mentioned for context? It might also be useful to add that as comment in the source code.

Comment on lines 140 to 142
for stmt in body {
SourceOrderVisitor::visit_stmt(&mut visitor, stmt);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the visit_body method instead in here, orelse, handlers, and finalbody.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except for the "handlers" loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the easiest here is to call walk_stmt(&visitor, stmt) because this is an exact reimplementation of visiting a try statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

walk_stmt would only match one of these orders, right? We need to swap the order of the orelse and handlers visits depending on the version. I could still use it in one of the branches and then inline the whole modified version in the other branch.

Copy link
Member

@MichaReiser MichaReiser Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we traverse into function statements but we increase the scope counter. I have to take another look to understand what the scope counter is even used for but I suspect that different rules apply for names and global statements inside nested functions, classes and it's unclear to me if that also applies to lambdas (can't have any global statements but it may have identifiers).

So what I'm looking for is a test that demonstrates the behavior when lambdas are involved.

@ntBre
Copy link
Contributor Author

ntBre commented Apr 8, 2025

On Python 3.12, the else clause is visited before the except handlers:

This is confusing, do we have a source like a changelog entry where the related change is mentioned for context? It might also be useful to add that as comment in the source code.

This is the bug report Alex linked to in #11934. I'll look for anything else and at least link to this in the enum variant docs!

@dhruvmanila
Copy link
Member

This is the bug report Alex linked to in #11934. I'll look for anything else and at least link to this in the enum variant docs!

I think just linking to the CPython issue should be enough.

@dhruvmanila
Copy link
Member

(I need to head out now, I'll continue with the review later in the evening as I need to look at it a bit more closely to understand.)

@MichaReiser
Copy link
Member

I just realized that this might need to be gated behind preview since it modifies an existing, stable lint rule. Alternatively, I could just use a different SemanticSyntaxErrorKind here, which seems a little easier. On the other hand, this could probably be considered a bug in the current rule implementation.

Can you tell me more in how the new implementation diverges from the existing one?

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a few more tests and explore if there's a way to avoid duplicating the globals handling between the semantic syntax checker and the caller.

finalbody,
is_star: _,
}) => {
let mut visitor = TryExceptVisitor::new(ctx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a custom visitor here is probably fine because try statements aren't too deeply nested (they rarely contain nested functions) but this is now a case where the visitor ends up visiting entire suites multiple times:

  1. In the TryExcept visitor
  2. The outer visitor

I've a weak preference to avoid the double visitation unless it increases complexity by a lot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give this a try (no pun intended). The nice thing about the visitor is that it let me control the order in which things are seen, and I could just swap the order of the visits on different Python versions. I think that order-dependence might be hard to capture otherwise, but I'll play with it a bit.

The easiest thing to do would be to move that reordering into the parent visitor, but that seems too intrusive.

Comment on lines 140 to 142
for stmt in body {
SourceOrderVisitor::visit_stmt(&mut visitor, stmt);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the easiest here is to call walk_stmt(&visitor, stmt) because this is an exact reimplementation of visiting a try statement.


fn visit_stmt(&mut self, stmt: &Stmt) {
match stmt {
Stmt::FunctionDef(_) | Stmt::ClassDef(_) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about nested try statements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think try introduces a new scope:

>>> try:
...     x = 1
... except:
...     pass
...
>>> x
1
>>>

I think only functions and classes do. Is that what you were asking?

}

fn visit_expr(&mut self, expr: &Expr) {
if let Expr::Name(ast::ExprName { range, id, ctx: _ }) = expr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about lambda expressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on that? I thought the SourceOrderVisitor would handle descending into lambda expressions for me. Or do they have another way of accessing an identifier that's not covered here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the visitor traverses into lambdas. The question is whether it should (e.g. we don't traverse into nested functions or classes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be getting confused here. Why don't we traverse into nested functions or classes? I thought the source_order::walk_stmt call in visit_stmt below would handle that.

The idea here was just to collect all identifiers and then compare them to the identifiers in any following global statements. If I missed traversing any kind of node that can contain identifiers, then I think it's a mistake, not my intention.

It sounds like I should add more tests with nested structures in any case.


fn visit_expr(&mut self, expr: &Expr) {
if let Expr::Name(ast::ExprName { range, id, ctx: _ }) = expr {
self.add_identifier(id.clone(), *range);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it unfortunate that we're collecting all identifiers here, even if there isn't a single global statement in the program (which should be the vast majority of programs).

It also has the downside that globals handling is now handled here but also something that we expect the caller to handle (context.global(name)). Could we instead make use that context already provides a global method or extend the context interface to avoid doing this work twice?

struct TryExceptVisitor<'a, Ctx> {
/// Context used for emitting errors.
ctx: &'a Ctx,
identifiers: FxHashMap<ast::name::Name, Ident>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should store name references here (you have the 'a lifetime anyway)

@ntBre
Copy link
Contributor Author

ntBre commented Apr 9, 2025

Can you tell me more in how the new implementation diverges from the existing one?

The divergence is just the version-dependence of the branch visiting order. From the summary:

The current implementation of PLE0118 is correct for 3.13 but not 3.12: playground (it flags the first case regardless of Python version).

Phrasing it that way, it clearly sounds like a bug, but I think the old behavior was so unintuitive that probably nobody runs into it

ntBre added 7 commits April 9, 2025 11:53
Summary
--

This PR extends the `LoadBeforeGlobalDeclaration` check, which corresponds to
PLE0118 in ruff, to handle Python versions before 3.13 properly.

On Python 3.12, the `else` clause is visited before the `except` handlers:

```pycon
Python 3.12.9 (main, Feb 12 2025, 14:50:50) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
>>> def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
  File "<stdin>", line 5
SyntaxError: name 'a' is used prior to global declaration

```

The order is swapped on 3.13:

```pycon
Python 3.13.2 (main, Feb  5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
... def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
  File "<python-input-0>", line 8
    global a
    ^^^^^^^^
SyntaxError: name 'a' is assigned to before global declaration
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
>>>
```

The current implementation of PLE0118 is correct for 3.13 but not 3.12:
[playground](https://play.ruff.rs/d7467ea6-f546-4a76-828f-8e6b800694c9) (it
flags the first case regardless of Python version).

Test Plan
--

New inline tests that @AlexWaygood helped me write in our pairing session!
@ntBre ntBre force-pushed the brent/syn-global-in-try branch from d158e91 to 4f739f7 Compare April 9, 2025 16:03
@ntBre ntBre added the documentation Improvements or additions to documentation label Apr 9, 2025
@ntBre ntBre removed the rule Implementing or modifying a lint rule label Apr 9, 2025
@ntBre
Copy link
Contributor Author

ntBre commented Apr 9, 2025

Based on our discussion on Discord, I reverted all of the code changes here and simply extended the rule variant documentation to reflect our decision to enforce the 3.13 clause ordering on all Python versions.

I initially planned to keep the inline tests to demonstrate this, but they would now rely on a proper implementation of SemanticSyntaxContext::global for the test context to work, so I just deleted them too. We could add a linter test for PLE0118 showing this, though, if we wanted.

@ntBre
Copy link
Contributor Author

ntBre commented Apr 9, 2025

The cargo-shear failure was an issue in downloading/building it, not an extra dependency. I'll try to rerun it

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks and sorry for not raising this sooner

@ntBre
Copy link
Contributor Author

ntBre commented Apr 9, 2025

No problem at all, I thought it was a useful discussion!

@ntBre ntBre changed the title [syntax-errors] global declarations in try nodes before 3.13 [syntax-errors] Document behavior of global declarations in try nodes before 3.13 Apr 9, 2025
@ntBre ntBre merged commit 2fbc4d5 into main Apr 9, 2025
22 checks passed
@ntBre ntBre deleted the brent/syn-global-in-try branch April 9, 2025 16:54
Glyphack pushed a commit to Glyphack/ruff that referenced this pull request Apr 9, 2025
…odes before 3.13 (astral-sh#17285)

Summary
--

This PR extends the documentation of the `LoadBeforeGlobalDeclaration`
check to specify the behavior on versions of Python before 3.13. Namely,
on Python 3.12, the `else` clause of a `try` statement is visited before
the `except` handlers:

```pycon
Python 3.12.9 (main, Feb 12 2025, 14:50:50) [Clang 19.1.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
>>> def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
  File "<stdin>", line 5
SyntaxError: name 'a' is used prior to global declaration

```

The order is swapped on 3.13 (see
[CPython#111123](python/cpython#111123)):

```pycon
Python 3.13.2 (main, Feb  5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 10
... def g():
...     try:
...         1 / 0
...     except:
...         a = 1
...     else:
...         global a
...
  File "<python-input-0>", line 8
    global a
    ^^^^^^^^
SyntaxError: name 'a' is assigned to before global declaration
>>> def f():
...     try:
...         pass
...     except:
...         global a
...     else:
...         print(a)
...
>>>
```

The current implementation of PLE0118 is correct for 3.13 but not 3.12:
[playground](https://play.ruff.rs/d7467ea6-f546-4a76-828f-8e6b800694c9)
(it flags the first case regardless of Python version).

We decided to maintain this incorrect diagnostic for Python versions
before 3.13 because the pre-3.13 behavior is very unintuitive and
confirmed to be a bug, although the bug fix was not backported to
earlier versions. This can lead to false positives and false negatives
for pre-3.13 code, but we also expect that to be very rare, as
demonstrated by the ecosystem check (before the version-dependent check
was reverted here).

Test Plan
--

N/a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants