[ty] Use an interval map for scopes by expression#19025
Merged
MichaReiser merged 4 commits intomainfrom Jul 14, 2025
Merged
Conversation
Contributor
|
Contributor
|
8ac1ed4 to
d4ef8d6
Compare
a55036c to
e021a77
Compare
e021a77 to
8d0af39
Compare
8d0af39 to
7bc278e
Compare
sharkdp
approved these changes
Jul 14, 2025
| /// Builds an interval-map that matches expressions (by their node index) to their enclosing scopes. | ||
| /// | ||
| /// The interval map is built in a two-step process because the expression ids are assigned in source order, | ||
| /// but we visit the expressions in semantic order. Few expressions are registered out of order. |
Contributor
There was a problem hiding this comment.
Is this something that would change with the proposal in #19271?
Member
Author
There was a problem hiding this comment.
Yes, but I don't think it would invalidate the entire approach. Instead, we would have to use a regular sort call in build before building the interval map (and Rust's sorting claims to be pretty good at sorting mostly sorted data)
dcreager
added a commit
that referenced
this pull request
Jul 14, 2025
* dcreager/merge-arguments: (223 commits) fix docs Combine CallArguments and CallArgumentTypes [ty] Sync vendored typeshed stubs (#19334) [`refurb`] Make example error out-of-the-box (`FURB122`) (#19297) [refurb] Make example error out-of-the-box (FURB177) (#19309) [ty] ignore errors when reformatting codemodded typeshed (#19332) [ty] Provide docstrings for stdlib APIs when hovering over them in an IDE (#19311) [ty] Add virtual files to the only project database (#19322) Add t-string fixtures for rules that do not need to be modified (#19146) [ty] Remove `FileLookupError` (#19323) [ty] Fix handling of metaclasses in `object.<CURSOR>` completions [ty] Use an interval map for scopes by expression (#19025) [ty] List all `enum` members (#19283) [ty] Handle configuration errors in LSP more gracefully (#19262) [ty] Use python version and path from Python extension (#19012) [`pep8_naming`] Avoid false positives on standard library functions with uppercase names (`N802`) (#18907) Update Rust crate toml to 0.9.0 (#19320) [ty] Fix server version (#19284) Update NPM Development dependencies (#19319) Update taiki-e/install-action action to v2.56.13 (#19317) ...
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The semantic index stores a map from expression to scope because we need to know in which
TypeInference(scope) to look up the expression's type. Today, we use a hash map to store the expression-to-scope mapping.This PR replaces the hash map with an interval map (vector-based) that maps a range of node IDs (expressions) to their corresponding scope. The advantage of an interval map over a hash map is that it reduces memory consumption from
O(expressions)toO(~scopes).The main downside (other than increased complexity) is that the lookup complexity increases from
O(1)toO(log(~scopes)). Looking at the benchmark results, the fact that we need to write less data outweighs the slightly slower lookup times.The instrumented benchmarks show a 1-2% performance improvement. I measured memory consumption on a large project and the overall memory consumption of all semantic indices decreased by about 5%,
Test plan
cargo test