[ty] Use an interval map for scopes by expression by MichaReiser · Pull Request #19025 · astral-sh/ruff

MichaReiser · 2025-06-29T14:45:09Z

Summary

The semantic index stores a map from expression to scope because we need to know in which TypeInference (scope) to look up the expression's type. Today, we use a hash map to store the expression-to-scope mapping.

This PR replaces the hash map with an interval map (vector-based) that maps a range of node IDs (expressions) to their corresponding scope. The advantage of an interval map over a hash map is that it reduces memory consumption from O(expressions) to O(~scopes).
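To illustrate why the entry count tracks scopes rather than expressions, here is a minimal sketch of a vector-based interval map. `ScopeIntervals`, `NodeIndex`, `ScopeId`, and `push` are hypothetical names for this example, not the types used in the PR, and the sketch assumes nodes are pushed in ascending order.

```rust
use std::ops::RangeInclusive;

// Hypothetical stand-ins for the node index and scope id types.
type NodeIndex = u32;
type ScopeId = u32;

/// Vector-based interval map: each entry covers a run of node indices
/// that all belong to the same scope.
#[derive(Debug, Default)]
struct ScopeIntervals {
    entries: Vec<(RangeInclusive<NodeIndex>, ScopeId)>,
}

impl ScopeIntervals {
    /// Records that `node` belongs to `scope`.
    /// Assumes nodes arrive in ascending order (an assumption of this sketch).
    fn push(&mut self, node: NodeIndex, scope: ScopeId) {
        if let Some((range, last_scope)) = self.entries.last_mut() {
            if *last_scope == scope {
                // Same scope as the previous node: widen the last interval
                // instead of storing a new entry.
                *range = *range.start()..=node;
                return;
            }
        }
        // The scope changed: start a new interval.
        self.entries.push((node..=node, scope));
    }
}
```

Ten expressions split across a module scope, a nested function scope, and the module scope again produce three entries here, where a hash map would hold ten.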

The main downside (other than increased complexity) is that the lookup complexity increases from O(1) to O(log(~scopes)). Looking at the benchmark results, the fact that we need to write less data outweighs the slightly slower lookup times.
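The logarithmic lookup can be sketched as a binary search over the sorted, non-overlapping intervals. This is only an illustration: `scope_of` and the type aliases are hypothetical, and the actual implementation may organize the search differently.

```rust
use std::cmp::Ordering;
use std::ops::RangeInclusive;

// Hypothetical stand-ins for the node index and scope id types.
type NodeIndex = u32;
type ScopeId = u32;

/// Finds the scope whose interval contains `node`.
/// `entries` must be sorted by range start and must not overlap.
fn scope_of(
    entries: &[(RangeInclusive<NodeIndex>, ScopeId)],
    node: NodeIndex,
) -> Option<ScopeId> {
    entries
        .binary_search_by(|(range, _)| {
            if *range.end() < node {
                // The interval lies entirely before `node`.
                Ordering::Less
            } else if *range.start() > node {
                // The interval lies entirely after `node`.
                Ordering::Greater
            } else {
                // `node` falls inside this interval.
                Ordering::Equal
            }
        })
        .ok()
        .map(|index| entries[index].1)
}
```

Each lookup touches O(log n) entries, where n is the number of intervals, and returns `None` for a node index that no interval covers.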

The instrumented benchmarks show a 1-2% performance improvement. I measured memory consumption on a large project, and the overall memory consumption of all semantic indices decreased by about 5%.

Test plan

cargo test

github-actions · 2025-06-29T14:48:27Z

`mypy_primer` results

No ecosystem changes detected ✅

Memory usage changes were detected when running on open source projects

flake8 (https://github.com/pycqa/flake8)
-     memo fields = ~66MB
+     memo fields = ~63MB

prefect (https://github.com/PrefectHQ/prefect)
-     memo fields = ~568MB
+     memo fields = ~541MB

github-actions · 2025-06-29T14:55:17Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

sharkdp

Very nice — thank you!

sharkdp · 2025-07-14T07:28:38Z

crates/ty_python_semantic/src/semantic_index/builder.rs

+/// Builds an interval-map that matches expressions (by their node index) to their enclosing scopes.
+///
+/// The interval map is built in a two-step process because the expression ids are assigned in source order,
+/// but we visit the expressions in semantic order. Few expressions are registered out of order.


Is this something that would change with the proposal in #19271?

MichaReiser · 2025-07-14

Yes, but I don't think it would invalidate the entire approach. Instead, we would have to use a regular sort call in `build` before building the interval map (and Rust's sorting claims to be pretty good at sorting mostly sorted data).
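For readers following the thread, the sort-then-build idea could look roughly like the sketch below. `build_intervals` and the type aliases are hypothetical; the sketch assumes every node index is registered at most once, so an unstable sort gives the same result as a stable one.

```rust
use std::ops::RangeInclusive;

// Hypothetical stand-ins for the node index and scope id types.
type NodeIndex = u32;
type ScopeId = u32;

/// Builds the interval list from `(node, scope)` pairs collected in visit
/// order, which may differ slightly from source order.
fn build_intervals(
    mut pairs: Vec<(NodeIndex, ScopeId)>,
) -> Vec<(RangeInclusive<NodeIndex>, ScopeId)> {
    // Step 1: restore source order. Node indices are assumed unique.
    pairs.sort_unstable_by_key(|&(node, _)| node);

    // Step 2: merge neighbouring nodes that share a scope into one interval.
    let mut intervals: Vec<(RangeInclusive<NodeIndex>, ScopeId)> = Vec::new();
    for (node, scope) in pairs {
        if let Some((range, last_scope)) = intervals.last_mut() {
            if *last_scope == scope {
                *range = *range.start()..=node;
                continue;
            }
        }
        intervals.push((node..=node, scope));
    }
    intervals
}
```

Sorting the whole list is the simplest way to handle out-of-order registrations. The two-step process described in the doc comment presumably avoids a full sort when only a few expressions are out of order.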

* dcreager/merge-arguments: (223 commits)
  fix docs
  Combine CallArguments and CallArgumentTypes
  [ty] Sync vendored typeshed stubs (#19334)
  [`refurb`] Make example error out-of-the-box (`FURB122`) (#19297)
  [refurb] Make example error out-of-the-box (FURB177) (#19309)
  [ty] ignore errors when reformatting codemodded typeshed (#19332)
  [ty] Provide docstrings for stdlib APIs when hovering over them in an IDE (#19311)
  [ty] Add virtual files to the only project database (#19322)
  Add t-string fixtures for rules that do not need to be modified (#19146)
  [ty] Remove `FileLookupError` (#19323)
  [ty] Fix handling of metaclasses in `object.<CURSOR>` completions
  [ty] Use an interval map for scopes by expression (#19025)
  [ty] List all `enum` members (#19283)
  [ty] Handle configuration errors in LSP more gracefully (#19262)
  [ty] Use python version and path from Python extension (#19012)
  [`pep8_naming`] Avoid false positives on standard library functions with uppercase names (`N802`) (#18907)
  Update Rust crate toml to 0.9.0 (#19320)
  [ty] Fix server version (#19284)
  Update NPM Development dependencies (#19319)
  Update taiki-e/install-action action to v2.56.13 (#19317)
  ...

MichaReiser added the internal (An internal refactor or improvement) and ty (Multi-file analysis & type inference) labels Jun 29, 2025

MichaReiser force-pushed the micha/scopes-by-expression-interval-map branch from 8ac1ed4 to d4ef8d6 June 29, 2025 15:16

Base automatically changed from micha/ast-ids to main July 2, 2025 15:57

MichaReiser force-pushed the micha/scopes-by-expression-interval-map branch from a55036c to e021a77 July 11, 2025 17:02

[ty] Use an interval map for scopes by expression

809c034

MichaReiser force-pushed the micha/scopes-by-expression-interval-map branch from e021a77 to 8d0af39 July 11, 2025 17:05

Use arrays of structs

7bc278e

MichaReiser force-pushed the micha/scopes-by-expression-interval-map branch from 8d0af39 to 7bc278e July 11, 2025 17:36

MichaReiser marked this pull request as ready for review July 12, 2025 16:26

MichaReiser requested review from AlexWaygood, carljm, dcreager and sharkdp as code owners July 12, 2025 16:26

sharkdp approved these changes Jul 14, 2025

MichaReiser added 2 commits July 14, 2025 13:25

Use range inclusive

ae3c910

Use unstable_sort

9bc8bf6

MichaReiser merged commit 3560f86 into main Jul 14, 2025
37 checks passed

MichaReiser deleted the micha/scopes-by-expression-interval-map branch July 14, 2025 11:51

MichaReiser merged 4 commits into main from micha/scopes-by-expression-interval-map
