Skip over trivia tokens after re-lexing #21895
Conversation
```
29 | if call(foo, [a, b
   |                   ^ Syntax Error: Expected `]`, found NonLogicalNewline
30 | def bar():
   | ^^^ Syntax Error: Expected `]`, found `def`
```
I find those error messages more accurate. We only expect a `]` because `def` is a keyword that starts a new statement. It's otherwise perfectly fine to write a list over multiple lines.
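For context, newlines inside brackets are non-logical, so a list written over several lines is perfectly valid Python, for example:

```python
# Perfectly valid: the newlines inside `[` ... `]` are non-logical and do not
# terminate the statement.
items = [
    1,
    2,
    3,
]
print(items)
```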
Hmm, but then the following error message points at this location: "expecting `)`, found Newline". I think this is still an improvement, but I just found this a bit inconsistent. No need to change anything here.
Diagnostic diff on typing conformance tests: No changes detected when running ty on typing conformance tests ✅
```
1 | if call(foo, [a, b def bar(): pass
  |                                    ^ Syntax Error: Expected `)`, found newline
```
I think this is bad but it's inherent to how we do recovery. The issue is that nesting is still greater than 1 when we're recovering in `[`, so the `NonLogicalNewline` isn't re-lexed as a `Newline`. However, it is later re-lexed when we recover from the unclosed `(`, which feels a bit inconsistent.
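A rough sketch of the behaviour described above, written in plain Python as an illustration only (ruff's real logic lives in the Rust lexer and uses different names): the trailing non-logical newline is only re-lexed into a logical newline once recovery happens with no enclosing bracket still open.

```python
# Illustrative only; not ruff's actual implementation.
def relex_trailing_newline(nesting: int) -> bool:
    """Decide whether the trailing NonLogicalNewline becomes a logical Newline."""
    if nesting > 0:
        # Recovering from the inner `[` while the outer `(` is still open:
        # the NonLogicalNewline is kept as-is, hence the first error message.
        return False
    # Recovering from the outermost `(`: the newline is re-lexed as a logical
    # Newline, hence the later "Expected `)`, found newline" error.
    return True
```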
```
  |
4 | class A:
5 |     pass
  |         ^ Syntax Error: unexpected EOF while parsing
```
This error message is just confusing because the class itself is fine; it's a consequence of the previous error.
```
  |  _______^
4 | | class A:
5 | |     pass
  | |_________^ Syntax Error: f-string: unterminated triple-quoted string
```
This is also just funny, but it's the same problem: the recovery in the f-string happens at a nesting level > 0, so it doesn't actually do anything.
Force-pushed from 1b77b58 to 05eded6
Linter (preview)
ℹ️ ecosystem check detected linter changes. (+2 -1 violations, +0 -0 fixes in 1 projects; 54 projects unchanged)
openai/openai-cookbook (+2 -1 violations, +0 -0 fixes)
```
ruff check --no-cache --exit-zero --no-fix --output-format concise --preview --select A,E703,F704,B015,B018,D100

+ examples/mcp/databricks_mcp_cookbook.ipynb:cell 30:11:49: invalid-syntax: Expected `)`, found newline
- examples/mcp/databricks_mcp_cookbook.ipynb:cell 30:11:49: invalid-syntax: Expected `}`, found NonLogicalNewline
+ examples/mcp/databricks_mcp_cookbook.ipynb:cell 30:12:5: invalid-syntax: Expected `}`, found `return`
```
Changes by rule (1 rules affected)
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| invalid-syntax: | 3 | 2 | 1 | 0 | 0 |
Formatter (stable)
✅ ecosystem check detected no format changes.
Formatter (preview)
✅ ecosystem check detected no format changes.
Force-pushed from 05eded6 to 7d47f71
```rust
pub(crate) fn re_lex_logical_token(&mut self) {
    let mut non_logical_newline_start = None;
    for token in self.tokens.iter().rev() {
        let mut non_logical_line = None;
```
```rust
// Ensure `current` is positioned at a non-trivia token.
if self.current_kind().is_trivia() {
    self.bump(self.current_kind());
}
```
Is this the main part of the fix based on the PR description?
Yes, the other changes are mainly added assertions and some safeguarding so that we never remove any tokens that came before the non-logical newline.
* origin/main: (29 commits)
  Document range suppressions, reorganize suppression docs (#21884)
  Ignore ruff:isort like ruff:noqa in new suppressions (#21922)
  [ty] Handle `Definition`s in `SemanticModel::scope` (#21919)
  [ty] Attach salsa db when running ide tests for easier debugging (#21917)
  [ty] Don't show hover for expressions with no inferred type (#21924)
  [ty] avoid unions of generic aliases of the same class in fixpoint (#21909)
  [ty] Squash false positive logs for failing to find `builtins` as a real module
  [ty] Uniformly use "not supported" in diagnostics (#21916)
  [ty] Reduce size of ty-ide snapshots (#21915)
  [ty] Adjust scope completions to use all reachable symbols
  [ty] Rename `all_members_of_scope` to `all_end_of_scope_members`
  [ty] Remove `all_` prefix from some routines on UseDefMap
  Enable `--document-private-items` for `ruff_python_formatter` (#21903)
  Remove `BackwardsTokenizer` based `parenthesized_range` references in `ruff_linter` (#21836)
  [ty] Revert "Do not infer types for invalid binary expressions in annotations" (#21914)
  Skip over trivia tokens after re-lexing (#21895)
  [ty] Avoid inferring types for invalid binary expressions in string annotations (#21911)
  [ty] Improve overload call resolution tracing (#21913)
  [ty] fix missing heap_size on Salsa query (#21912)
  [ty] Support implicit type of `cls` in signatures (#21771)
  ...
* origin/main: (36 commits)
  [ty] Defer all parameter and return type annotations (#21906)
  [ty] Fix workspace symbols to return members too (#21926)
  Document range suppressions, reorganize suppression docs (#21884)
  Ignore ruff:isort like ruff:noqa in new suppressions (#21922)
  [ty] Handle `Definition`s in `SemanticModel::scope` (#21919)
  [ty] Attach salsa db when running ide tests for easier debugging (#21917)
  [ty] Don't show hover for expressions with no inferred type (#21924)
  [ty] avoid unions of generic aliases of the same class in fixpoint (#21909)
  [ty] Squash false positive logs for failing to find `builtins` as a real module
  [ty] Uniformly use "not supported" in diagnostics (#21916)
  [ty] Reduce size of ty-ide snapshots (#21915)
  [ty] Adjust scope completions to use all reachable symbols
  [ty] Rename `all_members_of_scope` to `all_end_of_scope_members`
  [ty] Remove `all_` prefix from some routines on UseDefMap
  Enable `--document-private-items` for `ruff_python_formatter` (#21903)
  Remove `BackwardsTokenizer` based `parenthesized_range` references in `ruff_linter` (#21836)
  [ty] Revert "Do not infer types for invalid binary expressions in annotations" (#21914)
  Skip over trivia tokens after re-lexing (#21895)
  [ty] Avoid inferring types for invalid binary expressions in string annotations (#21911)
  [ty] Improve overload call resolution tracing (#21913)
  ...
Do we need to change anything in `crates/ruff_python_ast/src/token/tokens.rs` (lines 300 to 305 in 0138cd2), which tries to "mimic" this recovery behavior? I think no, but I just wanted to check in with you.
I don't think so.
Summary
Unlike the `Lexer`, the `TokenSource` skips over trivia tokens because they aren't semantically meaningful during parsing. However, this wasn't the case after re-lexing, where we forgot to skip over any trivia; this PR adds that.
Fixes astral-sh/ty#1828
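To make the invariant concrete, here is a minimal sketch in plain Python (not ruff's actual Rust implementation; all names are made up): after any advance, including the one performed by re-lexing, the token source should sit on a non-trivia token.

```python
# Hypothetical model of a token source that always skips trivia.
TRIVIA = {"Comment", "NonLogicalNewline"}


class TokenSource:
    def __init__(self, tokens):
        self.tokens = tokens
        self.index = 0
        self._skip_trivia()

    def _skip_trivia(self):
        # The parser should only ever see semantically meaningful tokens.
        while self.index < len(self.tokens) and self.tokens[self.index] in TRIVIA:
            self.index += 1

    def bump(self):
        self.index += 1
        self._skip_trivia()

    def re_lex_logical_token(self):
        # Re-lexing may leave `index` pointing at a trivia token; the fix
        # described above is to skip trivia again afterwards so the
        # invariant still holds.
        self._skip_trivia()

    def current(self):
        return self.tokens[self.index] if self.index < len(self.tokens) else "EndOfFile"
```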
Test Plan
This PR extends our fixture tests to assert that no node start or end falls within a token range (because that would be weird).
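A hypothetical sketch of what such a check could look like (the function and parameter names are made up for illustration, not the fixture's actual code):

```python
def assert_node_offsets_not_inside_tokens(node_offsets, token_ranges):
    """Every node must start/end at a token boundary, never inside a token."""
    for offset in node_offsets:
        for start, end in token_ranges:
            assert not (start < offset < end), (
                f"node offset {offset} falls inside token range {start}..{end}"
            )
```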
I added regression tests for the two fixed bugs.
I'm not super happy with how the parser recovers but I'll create a separate PR with some discussion around that (I don't have a solution yet).