Skip to content

Comments

[ty] Implement lsp support for string annotations#21577

Merged
Gankra merged 5 commits intomainfrom
gankra/strattr
Nov 25, 2025
Merged

[ty] Implement lsp support for string annotations#21577
Gankra merged 5 commits intomainfrom
gankra/strattr

Conversation

@Gankra
Copy link
Contributor

@Gankra Gankra commented Nov 22, 2025

Fixes astral-sh/ty#1009

Summary

This adds support for:

  • semantic-tokens (syntax highlighting)
  • goto-type (partially implemented, but want to land as-is)
  • goto-declaration
  • goto-definition (falls out of goto-declaration)
  • hover (limited by goto-type)
  • find-references
  • rename-references (falls out of find-references)

There are 3 major things being introduced here:

  • TypeInferenceBuilder::string_annotations is a FxHashSet of exprs which were determined to be string annotations during inference. It's bubbled up in extras to hopefully minimize the overhead as in most contexts it's empty.
    • Very happy to hear if this is too hacky and if I should do something better, but it's IMO important that we get an authoritative answer on whether something is a string annotation or not.
  • SemanticModel::enter_string_annotation checks if the expr was marked by TypeInferenceBuilder::string_annotations and then parses the subast and produces a sub-SemanticModel that sets SemanticModel::in_string_annotation_expr. This expr will be used by the model whenever we need to query e.g. the scope of the current expression (otherwise the code will constantly panic as the subast nodes are not in the current File's AST)
    • This hazard consequently encouraged me to refactor a bunch of code to replace uses of file/db with SemanticModel to minimize hazards (it is no longer as safe to randomly materialize a SemanticModel in the middle of analysis, you need to thread through the one you have in case it has in_string_annotation_expr set).
  • GotoTarget::StringAnnotationSubexpr (and a semantic-tokens impl) which involves invoking SemanticModel::enter_string_annotation before invoking the same kind of subroutine a normal expression would.
    • goto-type (and consequently displaying the type in hover) is the main hole here, because we can only get the type iff the string annotation is the entire subexpression (i.e. we can get the type of "int" but not the parts of "int | str"). This is shippable IMO.

Test Plan

Messed around in IDE, wrote a ton of tests.

@Gankra Gankra added server Related to the LSP server ty Multi-file analysis & type inference labels Nov 22, 2025
@Gankra

This comment was marked as resolved.

@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 22, 2025

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

@Gankra

This comment was marked as resolved.

@astral-sh-bot
Copy link

astral-sh-bot bot commented Nov 22, 2025

mypy_primer results

Changes were detected when running on open source projects
scikit-build-core (https://github.com/scikit-build/scikit-build-core)
+ src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 44 diagnostics
+ Found 45 diagnostics

No memory usage changes detected ✅

// to look up a node in the AST to prevent panics, because these sub-AST nodes
// are not in the File's AST!
let source = source_text(self.db, self.file);
let string_literal = string_expr.as_single_part_string()?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-empting Micha: yes this does fail for various things but the type checker also does this so it's fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol

Base automatically changed from gankra/modty to main November 22, 2025 16:06
@Gankra Gankra force-pushed the gankra/strattr branch 2 times, most recently from e2862e6 to 0ab5389 Compare November 22, 2025 18:30
@Gankra Gankra changed the title [ty] implement lsp support for highlighting/hovering/goto-ing string annotations [ty] implement lsp support for highlighting/renaming/hovering/goto-ing string annotations Nov 22, 2025
@Gankra Gankra changed the title [ty] implement lsp support for highlighting/renaming/hovering/goto-ing string annotations [ty] implement lsp support for string annotations Nov 22, 2025
@Gankra Gankra changed the title [ty] implement lsp support for string annotations [ty] Implement lsp support for string annotations Nov 22, 2025
Comment on lines +352 to +359
// The type checker knows the type of the full annotation but nothing else
if AnyNodeRef::from(&*submod.body) == subnode {
string_expr.inferred_type(model)
} else {
// TODO: force the typechecker to tell us its secrets
// (it computes but then immediately discards these types)
return None;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is like, the one caveat of this design. Fixing this would require awful hacks or more non-trivial changes to inference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(awful hacks would be like... looking up types by-name)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, maybe it doesn't do yet what I thought it would.

Could we change ExpressionNodeKey to be an enum:

enum ExpressionNodeKey {
    Primary(NodeKey),
    StringifiedAnnotation(Box<{ parent: NodeKey, inner: NodeKey }>)
}

(It might even ahve to be a ThinVec if stringified annotationos can be nested multiple levels.

We could then start storing the inferred expressions in type inference and look them up here

Happy to leave this as a follow up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They cannot be nested, thankfully.

I agree this would be great, are NodeKeys this stable?

(Definitely followup material since it could have more non-trivial perf impact)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NodeKey's are stable for as long as the source text doesn't change. We simply iterate over the AST and assign an index to each node (see parsed_module)

@Gankra
Copy link
Contributor Author

Gankra commented Nov 22, 2025

I want to make it clear that this is a "high risk" change in that any bugs in this could easily lead to LSP panics. However we could potentially have this functionality toggled behind a feature flag (literally just make enter_string_annotation always return None). IMO it should be safe to get into this week's release, if there's issues they should be found pretty fast by people kicking the tires, and then we can trivially disable it if so.

@carljm carljm removed their request for review November 24, 2025 21:19
@carljm
Copy link
Contributor

carljm commented Nov 24, 2025

The new string-annotations tracking in inference looks good to me! Didn't review the rest.

@Gankra
Copy link
Contributor Author

Gankra commented Nov 24, 2025

I'm reasonably confident with the rest of the code. If we wanna get this into the release tomorrow this would be fine to merge without further review (but I'll wait for review for now).

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge lift and is great. I think it might even be possible (without too much work), to start storing types for the individual expression parts, see my inline comment. But let's do this in a separate PR

Comment on lines +252 to +259
pub fn node_in_ast<'a>(&'a self, node: ast::AnyNodeRef<'a>) -> ast::AnyNodeRef<'a> {
if let Some(string_annotation) = &self.in_string_annotation_expr {
(&**string_annotation).into()
} else {
node
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand those methods and when I'm supposed to use them.

Like in what sense are they safe where the in-stringified-annotation nodes aren't? When should I use this method over using the ast node directly?

Copy link
Contributor Author

@Gankra Gankra Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rough heuristic in my mind is "if the node might be handed to the inference subsystem (i.e. trying to query it's scope or type)" (which SemanticModel gatekeeps). If you ever hand these sub-AST nodes to the inference subsystem it will immediately panic because the node isn't part of the file's AST.

// to look up a node in the AST to prevent panics, because these sub-AST nodes
// are not in the File's AST!
let source = source_text(self.db, self.file);
let string_literal = string_expr.as_single_part_string()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol

let file_scope = index.expression_scope_id(&model.expr_ref_in_ast(*self));
let scope = file_scope.to_scope_id(model.db, model.file);

infer_scope_types(model.db, scope).expression_type(*self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just for my understanding. We now store the expression types of each expression in a stringified literal within the string-literal's scope?

This is cool!

Comment on lines +352 to +359
// The type checker knows the type of the full annotation but nothing else
if AnyNodeRef::from(&*submod.body) == subnode {
string_expr.inferred_type(model)
} else {
// TODO: force the typechecker to tell us its secrets
// (it computes but then immediately discards these types)
return None;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, maybe it doesn't do yet what I thought it would.

Could we change ExpressionNodeKey to be an enum:

enum ExpressionNodeKey {
    Primary(NodeKey),
    StringifiedAnnotation(Box<{ parent: NodeKey, inner: NodeKey }>)
}

(It might even ahve to be a ThinVec if stringified annotationos can be nested multiple levels.

We could then start storing the inferred expressions in type inference and look them up here

Happy to leave this as a follow up

@Gankra Gankra enabled auto-merge (squash) November 25, 2025 13:28
@Gankra Gankra merged commit 66d2331 into main Nov 25, 2025
40 checks passed
@Gankra Gankra deleted the gankra/strattr branch November 25, 2025 13:31
carljm added a commit to mtshiba/ruff that referenced this pull request Nov 25, 2025
* main:
  [ty] Implement `typing.override` (astral-sh#21627)
  [ty] Avoid expression reinference for diagnostics (astral-sh#21267)
  [ty] Improve autocomplete suppressions of keywords in variable bindings
  [ty] Only suggest completions based on text before the cursor
  Implement goto-definition and find-references for global/nonlocal statements (astral-sh#21616)
  [ty] Inlay Hint edit follow up (astral-sh#21621)
  [ty] Implement lsp support for string annotations (astral-sh#21577)
  [ty] Add 'remove unused ignore comment' code action (astral-sh#21582)
  [ty] Refactor `CheckSuppressionContext` to use `DiagnosticGuard` (astral-sh#21587)
  [ty] Improve several "Did you mean?" suggestions (astral-sh#21597)
  [ty] Add more and update existing projects in `ty_benchmark` (astral-sh#21536)
  [ty] fix ty playground initialization and vite optimization issues (astral-sh#21471)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

server Related to the LSP server ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support (substrings of) string annotations as GotoTargets

3 participants