[ty] Memoize binary operator return types #24700

Merged
charliermarsh merged 1 commit into main from charlie/bind-i
Apr 18, 2026

@charliermarsh charliermarsh commented Apr 17, 2026

Summary

Especially for cases like astral-sh/ty#3039, we were running binary operator inference over and over, and throwing away everything except the return type. This PR adds a cached query for just the return type, which is more lightweight than storing the entire Bindings but seemingly still very effective.

For:

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
df["d"] = df["a"] + df["b"] + df["c"] + 1 + (df["a"] ** 2 + df["b"] ** 2 + df["c"] ** 2)

Codex reports a 3.32x speedup. Repeating that expression 20 times, Codex reports a 50.79x speedup (from 52.471s down to 1.033s).

Closes astral-sh/ty#3039.
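
As a rough illustration of the idea in the summary, the sketch below caches only the return type of a binary operation, keyed on the left type, the operator, and the right type. It is a toy model in Python, not ty's implementation: `infer_binary_op`, `bin_op_return_type`, the string type names, and the rule that any operation involving a `Series` returns a `Series` are all assumptions made for the example. The lookup counts mirror the example expression (five `Series + Series`, three `Series ** int`, and one `Series + int`, the last counted by hand from the expression).

```python
from functools import lru_cache

# Counts how often the (hypothetical) expensive inference actually runs.
inference_runs = 0


def infer_binary_op(left: str, op: str, right: str) -> str:
    """Stand-in for the uncached operator inference (hypothetical helper)."""
    global inference_runs
    inference_runs += 1
    # Toy rule (assumption): anything involving a Series yields a Series.
    return "Series" if "Series" in (left, right) else left


@lru_cache(maxsize=None)
def bin_op_return_type(left: str, op: str, right: str) -> str:
    """Cache only the return type, keyed on (left type, operator, right type)."""
    return infer_binary_op(left, op, right)


# The nine operator lookups from the example expression, repeated 20 times.
lookups = (
    [("Series", "Add", "Series")] * 5
    + [("Series", "Pow", "int")] * 3
    + [("Series", "Add", "int")]
) * 20

results = {bin_op_return_type(*key) for key in lookups}
cache_hits = bin_op_return_type.cache_info().hits
print(inference_runs, cache_hits)  # 3 177
```

In this toy model, 180 lookups collapse to 3 real inference runs, because the number of distinct operand-type combinations stays fixed no matter how often the expression is repeated.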

@astral-sh-bot astral-sh-bot Bot added the ty Multi-file analysis & type inference label Apr 17, 2026

astral-sh-bot Bot commented Apr 17, 2026

Typing conformance results

No changes detected ✅

Current numbers
The percentage of diagnostics emitted that were expected errors held steady at 87.94%. The percentage of expected errors that received a diagnostic held steady at 83.36%. The number of fully passing files held steady at 79/133.

astral-sh-bot Bot commented Apr 17, 2026

Memory usage report

Summary

| Project | Old | New | Diff | Outcome |
|---|---|---|---|---|
| flake8 | 47.94MB | 47.94MB | +0.00% (1.36kB) | |
| trio | 117.66MB | 117.64MB | -0.02% (20.72kB) | ⬇️ |
| sphinx | 262.81MB | 262.78MB | -0.01% (27.09kB) | ⬇️ |
| prefect | 717.12MB | 716.76MB | -0.05% (366.13kB) | ⬇️ |

Significant changes

flake8

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| `try_call_bin_op_return_type_impl` | 0.00B | 6.31kB | +6.31kB (new) | |
| `infer_scope_types_impl` | 989.65kB | 986.17kB | -0.35% (3.48kB) | |
| `infer_definition_types` | 1.87MB | 1.86MB | -0.18% (3.45kB) | |
| `try_call_bin_op_return_type_impl::interned_arguments` | 0.00B | 2.15kB | +2.15kB (new) | |
| `infer_expression_types_impl` | 1.04MB | 1.04MB | -0.02% (180.00B) | |

trio

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| `try_call_bin_op_return_type_impl` | 0.00B | 50.12kB | +50.12kB (new) | ⬇️ |
| `infer_definition_types` | 7.61MB | 7.58MB | -0.46% (36.21kB) | ⬇️ |
| `infer_expression_types_impl` | 7.03MB | 7.00MB | -0.37% (26.96kB) | ⬇️ |
| `try_call_bin_op_return_type_impl::interned_arguments` | 0.00B | 14.35kB | +14.35kB (new) | ⬇️ |
| `infer_expression_type_impl` | 1.31MB | 1.30MB | -0.73% (9.84kB) | ⬇️ |
| `infer_scope_types_impl` | 4.75MB | 4.75MB | -0.15% (7.50kB) | ⬇️ |
| `loop_header_reachability` | 131.79kB | 129.49kB | -1.74% (2.30kB) | ⬇️ |
| `all_narrowing_constraints_for_expression` | 592.48kB | 590.41kB | -0.35% (2.07kB) | ⬇️ |
| `all_negative_narrowing_constraints_for_expression` | 184.51kB | 184.36kB | -0.08% (156.00B) | ⬇️ |
| `infer_deferred_types` | 2.34MB | 2.34MB | -0.01% (132.00B) | ⬇️ |
| `infer_unpack_types` | 143.41kB | 143.38kB | -0.02% (24.00B) | ⬇️ |

sphinx

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| `try_call_bin_op_return_type_impl` | 0.00B | 207.61kB | +207.61kB (new) | ⬇️ |
| `infer_definition_types` | 23.76MB | 23.63MB | -0.55% (133.98kB) | ⬇️ |
| `infer_expression_types_impl` | 20.86MB | 20.75MB | -0.50% (107.47kB) | ⬇️ |
| `try_call_bin_op_return_type_impl::interned_arguments` | 0.00B | 57.58kB | +57.58kB (new) | ⬇️ |
| `infer_scope_types_impl` | 15.45MB | 15.43MB | -0.18% (28.82kB) | ⬇️ |
| `infer_expression_type_impl` | 2.91MB | 2.90MB | -0.34% (10.04kB) | ⬇️ |
| `loop_header_reachability` | 369.07kB | 364.36kB | -1.28% (4.71kB) | ⬇️ |
| `all_narrowing_constraints_for_expression` | 2.34MB | 2.34MB | -0.15% (3.53kB) | ⬇️ |
| `all_negative_narrowing_constraints_for_expression` | 1.00MB | 1.00MB | -0.18% (1.85kB) | ⬇️ |
| `StaticClassLiteral<'db>::implicit_attribute_inner_` | 2.36MB | 2.36MB | -0.03% (780.00B) | ⬇️ |
| `Type<'db>::member_lookup_with_policy_` | 6.86MB | 6.86MB | -0.01% (720.00B) | ⬇️ |
| `infer_unpack_types` | 437.45kB | 437.04kB | -0.09% (420.00B) | ⬇️ |

prefect

| Name | Old | New | Diff | Outcome |
|---|---|---|---|---|
| `infer_definition_types` | 90.69MB | 90.38MB | -0.34% (314.63kB) | ⬇️ |
| `try_call_bin_op_return_type_impl` | 0.00B | 257.16kB | +257.16kB (new) | ⬇️ |
| `infer_expression_types_impl` | 63.25MB | 63.01MB | -0.39% (250.23kB) | ⬇️ |
| `try_call_bin_op_return_type_impl::interned_arguments` | 0.00B | 58.01kB | +58.01kB (new) | ⬇️ |
| `infer_scope_types_impl` | 54.83MB | 54.79MB | -0.07% (41.91kB) | ⬇️ |
| `infer_expression_type_impl` | 13.44MB | 13.41MB | -0.22% (30.11kB) | ⬇️ |
| `StaticClassLiteral<'db>::implicit_attribute_inner_` | 10.07MB | 10.06MB | -0.10% (10.01kB) | ⬇️ |
| `all_narrowing_constraints_for_expression` | 7.21MB | 7.20MB | -0.11% (8.26kB) | ⬇️ |
| `Type<'db>::member_lookup_with_policy_` | 17.25MB | 17.24MB | -0.04% (6.80kB) | ⬇️ |
| `loop_header_reachability` | 438.07kB | 432.49kB | -1.27% (5.58kB) | ⬇️ |
| `Type<'db>::class_member_with_policy_` | 17.62MB | 17.62MB | -0.02% (3.80kB) | ⬇️ |
| `all_negative_narrowing_constraints_for_expression` | 2.63MB | 2.63MB | -0.14% (3.74kB) | ⬇️ |
| `function_known_decorators` | 8.27MB | 8.26MB | -0.04% (3.28kB) | ⬇️ |
| `infer_deferred_types` | 14.60MB | 14.59MB | -0.02% (3.20kB) | ⬇️ |
| `infer_unpack_types` | 899.39kB | 899.07kB | -0.04% (336.00B) | ⬇️ |
... 7 more

astral-sh-bot Bot commented Apr 17, 2026

ecosystem-analyzer results

No diagnostic changes detected ✅

Full report with detailed diff (timing results)

codspeed-hq Bot commented Apr 17, 2026

Merging this PR will improve performance by 16.33%

⚡ 1 improved benchmark
✅ 46 untouched benchmarks
⏩ 60 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | colour_science | 62 s | 53.3 s | +16.33% |

Comparing charlie/bind-i (91a87cd) with main (65d768e)

Footnotes

  1. 60 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@charliermarsh charliermarsh added the performance Potential performance improvement label Apr 17, 2026
@charliermarsh charliermarsh marked this pull request as ready for review April 17, 2026 20:50
@astral-sh-bot astral-sh-bot Bot requested a review from oconnor663 April 17, 2026 20:50
@ibraheemdev
Member

Do you know why we're running binary operator inference multiple times? Are we performing multi-inference somewhere? It seems like it would be more effective to memoize at an outer layer rather than specifically binary operators, if this is a more general problem.

@carljm carljm removed their request for review April 17, 2026 23:25
@charliermarsh
Member Author

Hmm, I don't think multi-inference is involved. Given:

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
df["d"] = df["a"] + df["b"] + df["c"] + 1 + (df["a"] ** 2 + df["b"] ** 2 + df["c"] ** 2)

We run try_call_bin_op(Series, Add, Series) 5 times, try_call_bin_op(Series, Pow, int) 3 times, etc. And the expression-level cache doesn't help.
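
To make those counts concrete, the expression can be walked with Python's `ast` module under a toy type rule. This is only a rough sketch: `toy_type` is a hypothetical helper, and the assumptions that `df[...]` is always a `Series` and that arithmetic involving a `Series` returns a `Series` are simplifications for illustration, not how ty infers types.

```python
import ast
from collections import Counter

EXPR = (
    'df["a"] + df["b"] + df["c"] + 1 '
    '+ (df["a"] ** 2 + df["b"] ** 2 + df["c"] ** 2)'
)


def toy_type(node: ast.AST, calls: Counter) -> str:
    """Return a toy type name for `node`, recording each binary-operator lookup."""
    if isinstance(node, ast.BinOp):
        left = toy_type(node.left, calls)
        right = toy_type(node.right, calls)
        calls[(left, type(node.op).__name__, right)] += 1
        # Assumption: any arithmetic involving a Series yields a Series.
        return "Series" if "Series" in (left, right) else "int"
    if isinstance(node, ast.Subscript):
        return "Series"  # assumption: df[...] is always a Series
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return "int"
    raise NotImplementedError(ast.dump(node))


calls = Counter()
toy_type(ast.parse(EXPR, mode="eval").body, calls)

for key, count in calls.most_common():
    print(key, count)
```

Under these assumptions the walk visits nine binary operators but only three distinct (left, operator, right) combinations, which matches the 5 and 3 quoted above plus a single `Series + int`.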

@ibraheemdev
Copy link
Copy Markdown
Member

Ah I see, that seems reasonable.

@charliermarsh
Copy link
Copy Markdown
Member Author

I do wonder if there’s a more general idea here though. I’ll explore it separately.

@charliermarsh charliermarsh merged commit 67296f0 into main Apr 18, 2026
56 checks passed
@charliermarsh charliermarsh deleted the charlie/bind-i branch April 18, 2026 00:04

Development

Successfully merging this pull request may close these issues.

Extremely slow type checking of code involving pandas arithmetic