Improved error recovery for unclosed strings (including f- and t-strings)#20848
Improved error recovery for unclosed strings (including f- and t-strings)#20848MichaReiser merged 6 commits intomainfrom
Conversation
| # This is also true for | ||
| # unterminated f-strings. |
There was a problem hiding this comment.
I know this looks silly, but keeping the comment over two lines reduces the snapshot changes.
| bitflags! { | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub(crate) struct TokenFlags: u8 { | ||
| pub(crate) struct TokenFlags: u16 { |
There was a problem hiding this comment.
While this increases the size of TokenFlags, it doesn't increase the size of Token. Which is why I didn't bother with any fancy encoding (e.g. it's unclosed if RAW_STRING_UPPERCASE and RAW_STRING_LOWERCASE are set)
|
698f74c to
a012e2b
Compare
| self.current_flags |= TokenFlags::UNCLOSED_STRING; | ||
|
|
||
| self.push_error(LexicalError::new( | ||
| LexicalErrorType::StringError, |
There was a problem hiding this comment.
This error was just wrong. a = "string<EOF reported an unexpected string error rather than unclosed string literal
ntBre
left a comment
There was a problem hiding this comment.
Nice! This looks great to me!
…rable * origin/main: [ty] Add (unused) `inferable` parameter to type property methods (#20865) Run macos tests on macos (#20889) Remove `release` CI job (#20887) [ty] CI: Faster ecosystem analysis (#20886) Remove `strip` from release profile (#20885) [ty] Sync vendored typeshed stubs (#20876) [ty] Add some completion ranking improvements (#20807) Improved error recovery for unclosed strings (including f- and t-strings) (#20848) Enable lto=fat (#20863) [`pyupgrade`] Extend `UP019` to detect `typing_extensions.Text` (`UP019`) (#20825) [`flake8-bugbear`] Omit annotation in preview fix for `B006` (#20877) fix(docs): Fix typo in `RUF015` description (#20873) [ty] Improve and extend tests for instance attributes redeclared in subclasses (#20866) [ty] Ignore slow seeds as a temporary measure (#20870) Remove parentheses around multiple exception types on Python 3.14+ (#20768) Update Black tests (#20794)
…nt-sets * dcreager/non-non-inferable: (174 commits) [ty] Add (unused) `inferable` parameter to type property methods (#20865) Run macos tests on macos (#20889) Remove `release` CI job (#20887) [ty] CI: Faster ecosystem analysis (#20886) Remove `strip` from release profile (#20885) [ty] Sync vendored typeshed stubs (#20876) [ty] Add some completion ranking improvements (#20807) Improved error recovery for unclosed strings (including f- and t-strings) (#20848) Enable lto=fat (#20863) [`pyupgrade`] Extend `UP019` to detect `typing_extensions.Text` (`UP019`) (#20825) [`flake8-bugbear`] Omit annotation in preview fix for `B006` (#20877) fix(docs): Fix typo in `RUF015` description (#20873) [ty] Improve and extend tests for instance attributes redeclared in subclasses (#20866) [ty] Ignore slow seeds as a temporary measure (#20870) use existing method Remove parentheses around multiple exception types on Python 3.14+ (#20768) Update Black tests (#20794) just the api parts [ty] Fix further issues in `super()` inference logic (#20843) [ty] Document when a rule was added (#20859) ...
Summary
This PR improves our lexer to preserve
STRINGtokens instead of converting them toUnknownif a string literal misses its closing quotes.Instead of converting to
UNKNOWN, it sets a flag on the string literal that allows upstream tools to check if it's an unclosed string literal.The benefit of preserving string literals is that it gives us much better error recovery because the parser now recognizes those literals.
That means, ty will correctly infer the literal type for
a = "unclosedto beLiteral["unclosed"].Unfortunately, preserving the kind for unclosed string literals regressed the f-string's and t-string's recovery mechanism. So, I went ahead and improved that too.
There are a few improvements:
f"unclosed) instead of parsing this asf""}. E.g., the parser now matches the quotes forf"{ab"instead of assuming that the closing quotes start a new stringrformat specifiers if the}is missing:f"{ab:r"now parses theras the raw conversion flag rather thanr"the start of a raw string literalFixes #19751
Fixes #20849
Review
You probably want to skip the first commit :) It updates all snapshots to now include the
unclosed: <UNCLOSED>flag.Test Plan
Reviewed and updated the snapshot tests. I also reviewed all usages of
TokenKind::Stringto find cases where the missing closing quote could now cause issues.This change should have no impact on AST-based lint rules or the formatter because they both only run when there are no parse errors.