[ty] Implement lsp support for string annotations by Gankra · Pull Request #21577 · astral-sh/ruff

Gankra · 2025-11-22T13:12:23Z

Fixes astral-sh/ty#1009

Summary

This adds support for:

semantic-tokens (syntax highlighting)
goto-type (partially implemented, but want to land as-is)
goto-declaration
goto-definition (falls out of goto-declaration)
hover (limited by goto-type)
find-references
rename-references (falls out of find-references)

There are 3 major things being introduced here:

TypeInferenceBuilder::string_annotations is a FxHashSet of exprs which were determined to be string annotations during inference. It's bubbled up in extras to hopefully minimize the overhead as in most contexts it's empty.
- Very happy to hear if this is too hacky and if I should do something better, but it's IMO important that we get an authoritative answer on whether something is a string annotation or not.
SemanticModel::enter_string_annotation checks if the expr was marked by TypeInferenceBuilder::string_annotations and then parses the subast and produces a sub-SemanticModel that sets SemanticModel::in_string_annotation_expr. This expr will be used by the model whenever we need to query e.g. the scope of the current expression (otherwise the code will constantly panic as the subast nodes are not in the current File's AST)
- This hazard consequently encouraged me to refactor a bunch of code to replace uses of file/db with SemanticModel to minimize hazards (it is no longer as safe to randomly materialize a SemanticModel in the middle of analysis, you need to thread through the one you have in case it has in_string_annotation_expr set).
GotoTarget::StringAnnotationSubexpr (and a semantic-tokens impl) which involves invoking SemanticModel::enter_string_annotation before invoking the same kind of subroutine a normal expression would.
- goto-type (and consequently displaying the type in hover) is the main hole here, because we can only get the type iff the string annotation is the entire subexpression (i.e. we can get the type of "int" but not the parts of "int | str"). This is shippable IMO.

Test Plan

Messed around in IDE, wrote a ton of tests.

astral-sh-bot · 2025-11-22T13:14:18Z

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot · 2025-11-22T13:16:03Z

`mypy_primer` results

Changes were detected when running on open source projects

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 44 diagnostics
+ Found 45 diagnostics

No memory usage changes detected ✅

Gankra · 2025-11-22T13:17:25Z

crates/ty_python_semantic/src/semantic_model.rs

+        // to look up a node in the AST to prevent panics, because these sub-AST nodes
+        // are not in the File's AST!
+        let source = source_text(self.db, self.file);
+        let string_literal = string_expr.as_single_part_string()?;


Pre-empting Micha: yes this does fail for various things but the type checker also does this so it's fine

Gankra · 2025-11-22T18:39:04Z

crates/ty_ide/src/goto.rs

+                // The type checker knows the type of the full annotation but nothing else
+                if AnyNodeRef::from(&*submod.body) == subnode {
+                    string_expr.inferred_type(model)
+                } else {
+                    // TODO: force the typechecker to tell us its secrets
+                    // (it computes but then immediately discards these types)
+                    return None;
+                }


This is like, the one caveat of this design. Fixing this would require awful hacks or more non-trivial changes to inference.

(awful hacks would be like... looking up types by-name)

Okay, maybe it doesn't do yet what I thought it would.

Could we change ExpressionNodeKey to be an enum:

enum ExpressionNodeKey { Primary(NodeKey), StringifiedAnnotation(Box<{ parent: NodeKey, inner: NodeKey }>) }

(It might even ahve to be a ThinVec if stringified annotationos can be nested multiple levels.

We could then start storing the inferred expressions in type inference and look them up here

Happy to leave this as a follow up

They cannot be nested, thankfully.

I agree this would be great, are NodeKeys this stable?

(Definitely followup material since it could have more non-trivial perf impact)

NodeKey's are stable for as long as the source text doesn't change. We simply iterate over the AST and assign an index to each node (see parsed_module)

Gankra · 2025-11-22T18:43:31Z

I want to make it clear that this is a "high risk" change in that any bugs in this could easily lead to LSP panics. However we could potentially have this functionality toggled behind a feature flag (literally just make enter_string_annotation always return None). IMO it should be safe to get into this week's release, if there's issues they should be found pretty fast by people kicking the tires, and then we can trivially disable it if so.

carljm · 2025-11-24T21:20:01Z

The new string-annotations tracking in inference looks good to me! Didn't review the rest.

Gankra · 2025-11-24T21:30:19Z

I'm reasonably confident with the rest of the code. If we wanna get this into the release tomorrow this would be fine to merge without further review (but I'll wait for review for now).

MichaReiser

This is a huge lift and is great. I think it might even be possible (without too much work), to start storing types for the individual expression parts, see my inline comment. But let's do this in a separate PR

MichaReiser · 2025-11-25T12:27:14Z

crates/ty_python_semantic/src/semantic_model.rs

+    pub fn node_in_ast<'a>(&'a self, node: ast::AnyNodeRef<'a>) -> ast::AnyNodeRef<'a> {
+        if let Some(string_annotation) = &self.in_string_annotation_expr {
+            (&**string_annotation).into()
+        } else {
+            node
+        }
+    }
+


I'm not sure I fully understand those methods and when I'm supposed to use them.

Like in what sense are they safe where the in-stringified-annotation nodes aren't? When should I use this method over using the ast node directly?

The rough heuristic in my mind is "if the node might be handed to the inference subsystem (i.e. trying to query it's scope or type)" (which SemanticModel gatekeeps). If you ever hand these sub-AST nodes to the inference subsystem it will immediately panic because the node isn't part of the file's AST.

MichaReiser · 2025-11-25T12:31:21Z

crates/ty_python_semantic/src/semantic_model.rs

+        // to look up a node in the AST to prevent panics, because these sub-AST nodes
+        // are not in the File's AST!
+        let source = source_text(self.db, self.file);
+        let string_literal = string_expr.as_single_part_string()?;


MichaReiser · 2025-11-25T12:34:16Z

crates/ty_python_semantic/src/semantic_model.rs

+        let file_scope = index.expression_scope_id(&model.expr_ref_in_ast(*self));
        let scope = file_scope.to_scope_id(model.db, model.file);

        infer_scope_types(model.db, scope).expression_type(*self)


So just for my understanding. We now store the expression types of each expression in a stringified literal within the string-literal's scope?

This is cool!

MichaReiser · 2025-11-25T12:39:51Z

crates/ty_ide/src/goto.rs

+                // The type checker knows the type of the full annotation but nothing else
+                if AnyNodeRef::from(&*submod.body) == subnode {
+                    string_expr.inferred_type(model)
+                } else {
+                    // TODO: force the typechecker to tell us its secrets
+                    // (it computes but then immediately discards these types)
+                    return None;
+                }


Okay, maybe it doesn't do yet what I thought it would.

Could we change ExpressionNodeKey to be an enum:

enum ExpressionNodeKey { Primary(NodeKey), StringifiedAnnotation(Box<{ parent: NodeKey, inner: NodeKey }>) }

(It might even ahve to be a ThinVec if stringified annotationos can be nested multiple levels.

We could then start storing the inferred expressions in type inference and look them up here

Happy to leave this as a follow up

* main: [ty] Implement `typing.override` (astral-sh#21627) [ty] Avoid expression reinference for diagnostics (astral-sh#21267) [ty] Improve autocomplete suppressions of keywords in variable bindings [ty] Only suggest completions based on text before the cursor Implement goto-definition and find-references for global/nonlocal statements (astral-sh#21616) [ty] Inlay Hint edit follow up (astral-sh#21621) [ty] Implement lsp support for string annotations (astral-sh#21577) [ty] Add 'remove unused ignore comment' code action (astral-sh#21582) [ty] Refactor `CheckSuppressionContext` to use `DiagnosticGuard` (astral-sh#21587) [ty] Improve several "Did you mean?" suggestions (astral-sh#21597) [ty] Add more and update existing projects in `ty_benchmark` (astral-sh#21536) [ty] fix ty playground initialization and vite optimization issues (astral-sh#21471)

Gankra requested review from AlexWaygood, MichaReiser, carljm, dcreager and sharkdp as code owners November 22, 2025 13:12

Gankra added server Related to the LSP server ty Multi-file analysis & type inference labels Nov 22, 2025

This comment was marked as resolved.

Sign in to view

Gankra commented Nov 22, 2025

View reviewed changes

Base automatically changed from gankra/modty to main November 22, 2025 16:06

Gankra force-pushed the gankra/strattr branch 2 times, most recently from e2862e6 to 0ab5389 Compare November 22, 2025 18:30

Gankra changed the title ~~[ty] implement lsp support for highlighting/hovering/goto-ing string annotations~~ [ty] implement lsp support for highlighting/renaming/hovering/goto-ing string annotations Nov 22, 2025

Gankra changed the title ~~[ty] implement lsp support for highlighting/renaming/hovering/goto-ing string annotations~~ [ty] implement lsp support for string annotations Nov 22, 2025

Gankra changed the title ~~[ty] implement lsp support for string annotations~~ [ty] Implement lsp support for string annotations Nov 22, 2025

Gankra commented Nov 22, 2025

View reviewed changes

carljm removed their request for review November 24, 2025 21:19

Gankra added 4 commits November 24, 2025 18:37

lsp string annotations

d9c1485

Add a lot of tests

a04d486

implement find-references

f5270a9

normalize naming

108ab4b

Gankra force-pushed the gankra/strattr branch from 0ab5389 to 108ab4b Compare November 24, 2025 23:37

MichaReiser approved these changes Nov 25, 2025

View reviewed changes

fixup

916a287

Gankra enabled auto-merge (squash) November 25, 2025 13:28

Gankra merged commit 66d2331 into main Nov 25, 2025
40 checks passed

Gankra deleted the gankra/strattr branch November 25, 2025 13:31

Comments

Conversation

Gankra commented Nov 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Uh oh!

This comment was marked as resolved.

astral-sh-bot bot commented Nov 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Diagnostic diff on typing conformance tests

Uh oh!

This comment was marked as resolved.

astral-sh-bot bot commented Nov 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

mypy_primer results

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Gankra commented Nov 22, 2025

Uh oh!

carljm commented Nov 24, 2025

Uh oh!

Gankra commented Nov 24, 2025

Uh oh!

MichaReiser left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Gankra Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Gankra commented Nov 22, 2025 •

edited

Loading

astral-sh-bot bot commented Nov 22, 2025 •

edited

Loading

astral-sh-bot bot commented Nov 22, 2025 •

edited

Loading

`mypy_primer` results

Gankra Nov 25, 2025 •

edited

Loading