Speed up make_simplified_union, fix recursive tuple crash#15128
Speed up make_simplified_union, fix recursive tuple crash#15128hauntsaninja merged 8 commits intopython:masterfrom
Conversation
The following code optimises make_simplified_union in the common case that there are exact duplicates in the union. In this regard, this is similar to python#15104 To get this to work, I needed to use partial tuple fallbacks in a couple places (these maybe had the potential to be latent crashes anyway?) There were some interesting things going on with recursive type aliases and type state assumptions This is about a 25% speedup on the pydantic codebase and about a 2% speedup on self check (measured with uncompiled mypy)
This comment has been minimized.
This comment has been minimized.
|
Looks like there's a major performance regression in |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Diff from mypy_primer, showing the effect of this PR on open source code: pandas-stubs (https://github.com/pandas-dev/pandas-stubs) got 1.14 faster (68.0s -> 59.5s)
speedrun.com_global_scoreboard_webapp (https://github.com/Avasam/speedrun.com_global_scoreboard_webapp) got 1.24 faster (43.8s -> 35.2s)
|
|
(note to self: once this is merged, rename |
|
I did a measurement using |
JukkaL
left a comment
There was a problem hiding this comment.
Looks good! I'm happy to see additional performance wins. I finally got around to reviewing this now that my jet lag is better.
Left a few additional ideas (optional, since this already seems like a nice improvement).
| # If deleted subtypes had more general truthiness, use that | ||
| orig_item = new_items[duplicate_index] | ||
| if not orig_item.can_be_true and ti.can_be_true: | ||
| new_items[duplicate_index] = true_or_false(orig_item) |
There was a problem hiding this comment.
Should we only change can_be_true, or is it ok to only adjust can_be_false?
| if not orig_item.can_be_true and ti.can_be_true: | ||
| new_items[duplicate_index] = true_or_false(orig_item) | ||
| elif not orig_item.can_be_false and ti.can_be_false: | ||
| new_items[duplicate_index] = true_or_false(orig_item) |
|
This fixes a crash that just got reported #15192, so I added a regression test for it. I added the two micro optimisations suggested. |
|
Diff from mypy_primer, showing the effect of this PR on open source code: pandas-stubs (https://github.com/pandas-dev/pandas-stubs) got 1.15x faster (66.6s -> 58.0s)
speedrun.com_global_scoreboard_webapp (https://github.com/Avasam/speedrun.com_global_scoreboard_webapp) got 1.27x faster (33.3s -> 26.2s)
|
|
Nice, looks like the primer timings are pretty consistent between: |
|
This PR actually introduced a recursive tuple crash I will check maybe there is a simple fix for it. |
|
It looks like there is (a bug is quite obvious). I will submit a PR shortly. |
|
See #15216 |
Fixes #15192
The following code optimises make_simplified_union in the common case that there are exact duplicates in the union. In this regard, this is similar to #15104
There's a behaviour change in one unit test. I think it's good? We'll see what mypy_primer has to say.
To get this to work, I needed to use partial tuple fallbacks in a couple places (these maybe had the potential to be latent crashes anyway?) There were some interesting things going on with recursive type aliases and type state assumptions
This is about a 25% speedup on the pydantic codebase and about a 2% speedup on self check (measured with uncompiled mypy)