Skip to content

Comments

[ty] Fix missing registration of identifiers during semantic index construction#18441

Closed
BurntSushi wants to merge 1 commit intomainfrom
ag/fix-completion-panic
Closed

[ty] Fix missing registration of identifiers during semantic index construction#18441
BurntSushi wants to merge 1 commit intomainfrom
ag/fix-completion-panic

Conversation

@BurntSushi
Copy link
Member

@BurntSushi BurntSushi commented Jun 3, 2025

@BurntSushi
Copy link
Member Author

(This is kind of a wild guess on my part.)

@github-actions
Copy link
Contributor

github-actions bot commented Jun 3, 2025

mypy_primer results

No ecosystem changes detected ✅

@AlexWaygood AlexWaygood added the ty Multi-file analysis & type inference label Jun 3, 2025
@BurntSushi BurntSushi force-pushed the ag/fix-completion-panic branch from cb04b73 to cc466be Compare June 3, 2025 14:04
@AlexWaygood
Copy link
Member

Could you say a bit more about what the cause of the panic is? Where are you using the scopes_by_expression map in the autocompletion logic?

@BurntSushi
Copy link
Member Author

It's this:

pub fn completions(&self, node: ast::AnyNodeRef<'_>) -> Vec<Name> {
let index = semantic_index(self.db, self.file);
let file_scope = match node {
ast::AnyNodeRef::Identifier(identifier) => index.expression_scope_id(identifier),
node => match node.as_expr_ref() {
// If we couldn't identify a specific
// expression that we're in, then just
// fall back to the global scope.
None => FileScopeId::global(),
Some(expr) => index.expression_scope_id(expr),
},
};
let mut symbols = vec![];
for (file_scope, _) in index.ancestor_scopes(file_scope) {
for symbol in index.symbol_table(file_scope).symbols() {
symbols.push(symbol.name().clone());
}
}
symbols
}

Where the node given to that function is identified via:

fn find_target(parsed: &ParsedModule, offset: TextSize) -> Option<CoveringNode> {
let offset = match parsed.tokens().at_offset(offset) {
TokenAt::None => {
return Some(covering_node(
parsed.syntax().into(),
TextRange::empty(offset),
));
}
TokenAt::Single(tok) => tok.end(),
TokenAt::Between(_, tok) => tok.start(),
};
let before = parsed.tokens().before(offset);
let last = before.last()?;
let covering_node = covering_node(parsed.syntax().into(), last.range());
Some(covering_node)
}

Basically, we're looking for the most constraining AST node, looking up that node's scope and then listing out the identifiers in that scope (and its parent scopes).

@AlexWaygood
Copy link
Member

You also need to recurse into match statements -- this also causes the playground to panic for me:

match object():
    case fo<CURSOR>

@AlexWaygood
Copy link
Member

This still causes a panic on your branch if I run the playground locally:

def f(x):<CURSOR>

Comment on lines +1043 to +1046
fn record_scope_for_identifier(&mut self, name: &ast::Identifier) {
self.scopes_by_expression
.insert(name.into(), self.current_scope());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit weird to be inserting Identifiers into this map, since Identifiers aren't really expressions. I wondered if we should be using a separate map entirely for completions logic, but that would probably be overkill. It could be worth adding a comment about why we do this, though.

I'm also slightly concerned that it's easy to forget to insert into this map if we're just manually adding the self.record_scope_for_identifier() calls into various branches. But I don't necessarily have a better suggestion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't have a good sense of what is "right" to do here. I went this direction because it seems like the surrounding code already acknowledges an Identifier as something that can be addressed like an expression because of the impl From<&ast::Identifier> for ExpressionNodeKey.

@BurntSushi
Copy link
Member Author

This still causes a panic on your branch if I run the playground locally:

def f(x):<CURSOR>

Interesting! I can't reproduce this locally on this branch. It doesn't seem to panic on main either (when I run the LSP in vim).

match object():
    case fo<CURSOR>

Ah yeah I see the problem. There are some identifiers I missed in the match cases.

@MichaReiser
Copy link
Member

I've to think this through a bit more. The initial design of ScopedExpressionId was that they're indeed only for expressions and not for any other node. I need to look back at why we allowed some identifiers (that was probably me) and if it makes sense to allow all identifiers or if we need a different approach here.

@AlexWaygood
Copy link
Member

Interesting! I can't reproduce this locally on this branch. It doesn't seem to panic on main either (when I run the LSP in vim).

This reproduces it very consistently for me on your branch:

Screen.Recording.2025-06-03.at.15.59.34.mov

@AlexWaygood
Copy link
Member

AlexWaygood commented Jun 3, 2025

This reproduces it very consistently for me on your branch

Do parameter nodes have identifier nodes inside them? It only repros for me if it's a function with parameters

@MichaReiser
Copy link
Member

MichaReiser commented Jun 3, 2025

Looking at the change history, enabling scoped_expression_id for Identifier was probably an unintentional change. It was made 2 months ago to allow recording uses on Identifiers.

Recording the scopes for more nodes does seem reasonable to me if we have use cases beyond just expressions. But we should then rename ExpressionNodeKey, ScopedExpressionId, and the relevant methods. Although it's not clear ot me what a good name would be for this.

The alternative is that we change the completions code to use a different mechanism to get the scope. E.g. to walk further up the tree until it finds an ExprName or ClassDef or FunctionDef, similar to what's done in hover. This is more work but has the advantage that we don't need to track more data (especially because most identifiers are redundant to ExprName).

match covering_node.node() {
AnyNodeRef::Identifier(identifier) => match covering_node.parent() {
Some(AnyNodeRef::StmtFunctionDef(function)) => Some(GotoTarget::FunctionDef(function)),
Some(AnyNodeRef::StmtClassDef(class)) => Some(GotoTarget::ClassDef(class)),
Some(AnyNodeRef::Parameter(parameter)) => Some(GotoTarget::Parameter(parameter)),
Some(AnyNodeRef::Alias(alias)) => Some(GotoTarget::Alias(alias)),
Some(AnyNodeRef::StmtImportFrom(from)) => Some(GotoTarget::ImportedModule(from)),
Some(AnyNodeRef::ExceptHandlerExceptHandler(handler)) => {
Some(GotoTarget::ExceptVariable(handler))
}
Some(AnyNodeRef::Keyword(keyword)) => Some(GotoTarget::KeywordArgument(keyword)),
Some(AnyNodeRef::PatternMatchMapping(mapping)) => {
Some(GotoTarget::PatternMatchRest(mapping))
}
Some(AnyNodeRef::PatternKeyword(keyword)) => {
Some(GotoTarget::PatternKeywordArgument(keyword))
}
Some(AnyNodeRef::PatternMatchStar(star)) => {
Some(GotoTarget::PatternMatchStarName(star))
}
Some(AnyNodeRef::PatternMatchAs(as_pattern)) => {
Some(GotoTarget::PatternMatchAsName(as_pattern))
}
Some(AnyNodeRef::TypeParamTypeVar(var)) => Some(GotoTarget::TypeParamTypeVarName(var)),
Some(AnyNodeRef::TypeParamParamSpec(bound)) => {
Some(GotoTarget::TypeParamParamSpecName(bound))
}
Some(AnyNodeRef::TypeParamTypeVarTuple(var_tuple)) => {
Some(GotoTarget::TypeParamTypeVarTupleName(var_tuple))
}
Some(AnyNodeRef::ExprAttribute(attribute)) => {
Some(GotoTarget::Expression(attribute.into()))
}
Some(AnyNodeRef::StmtNonlocal(_)) => Some(GotoTarget::NonLocal { identifier }),
Some(AnyNodeRef::StmtGlobal(_)) => Some(GotoTarget::Globals { identifier }),
None => None,
Some(parent) => {
tracing::debug!(
"Missing `GoToTarget` for identifier with parent {:?}",
parent.kind()
);
None
}
},
node => node.as_expr_ref().map(GotoTarget::Expression),
}

Related: astral-sh/ty#144

@BurntSushi
Copy link
Member Author

The alternative is that we change the completions code to use a different mechanism to get the scope. E.g. to walk further up the tree until it finds an ExprName or ClassDef or FunctionDef, similar to what's done in hover. This is more work but has the advantage that we don't need to track more data (especially because most identifiers are redundant to ExprName).

All righty. I'll put in a pin in this PR for now until I have a chance to work on improving target detection. I guess if we can side-step this issue entirely, then it probably makes sense to avoid storing identifiers.

If we end up abdicating this PR, then I can submit another one beefing up the docs on the existing methods. Particularly their panic conditions. :-) When I used this method, I falsely assumed that any expression in the File given to build the semantic index would be a valid key.

@dhruvmanila
Copy link
Member

dhruvmanila commented Jun 3, 2025

Looking at the change history, enabling scoped_expression_id for Identifier was probably an unintentional change. It was made 2 months ago to allow recording uses on Identifiers.

Yes, this was done by me (in #17366) to allow recording a use on identifiers which is how the overload implementation is being done. The main reason is that the function name is stored as Identifier and that is used to find the previous definition of the same function to create the overload "link".

@MichaReiser
Copy link
Member

Sounds good to me and thanks for jumping on this. It might be good to have a "fix" for now because it seems very easy to trigger this bug in the LSP (which results in a crash).

A short term fix could be to add a try_scoped_expression_id method that returns Option. That gives us some time to explore the next steps without regressing the completion feature.

@AlexWaygood AlexWaygood added the server Related to the LSP server label Jun 3, 2025
sharkdp added a commit that referenced this pull request Jun 4, 2025
## Summary

Implement a hotfix for the playground/LSP crashes related to missing
`expression_scope_id`s.

relates to: astral-sh/ty#572

## Test Plan

* Regression tests from #18441
* Ran the playground locally to check if panics occur / completions
still work.

---------

Co-authored-by: Andrew Gallant <[email protected]>
@BurntSushi BurntSushi closed this Jan 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

server Related to the LSP server ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Panic during completions: no entry found for key

4 participants