[ty] Optimize union building for unions with many enum-literal members#22363
[ty] Optimize union building for unions with many enum-literal members#22363AlexWaygood merged 6 commits intomainfrom
Conversation
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
24a387e to
fc51a96
Compare
fc51a96 to
c62c6c8
Compare
Merging this PR will improve performance by ×2.2Summary
Performance Changes
Comparing Footnotes
|
fa29e3f to
a2d8493
Compare
c62c6c8 to
1500752
Compare
|
Nice, this is a huge improvement for However, it's also a substantial regression for pytorch. Do you think this is because of the now higher limit? |
Hmm, we currently have no limit at all for enum literals on |
fade96f to
910630d
Compare
|
@MichaReiser, I can reproduce the speedup on discord.py locally but I can't reproduce the slowdown on pytorch. When I run ty_benchmark, it reports a 3% speedup for this PR on pytorch: |
|
Hmm. I did ran the benchmark twice before, but I'm now unable to reproduce. So maybe just a flake on my machine. |
carljm
left a comment
There was a problem hiding this comment.
This is great, thank you!!
| replace_with.push(KnownClass::Bytes.to_instance(self.db)); | ||
| } | ||
| UnionElement::EnumLiterals { enum_class, .. } => { | ||
| replace_with.push(Type::instance(self.db, ClassType::NonGeneric(*enum_class))); |
There was a problem hiding this comment.
I'm not aware of anything that prevents enum classes from being generic, so I expect this should not directly construct a ClassType::NonGeneric, but should use a constructor that inspects the class instead. (IIRC we aim to maintain an invariant that we never represent an unspecialized generic class with ClassType::NonGeneric)
There was a problem hiding this comment.
The concept of a generic enum class is inherently incoherent.
Apart from that, attempting to create a generic enum class usually fails at runtime:
>>> from enum import Enum
>>> class Foo[T](Enum):
... X: T
...
Traceback (most recent call last):
File "<python-input-1>", line 1, in <module>
class Foo[T](Enum):
X: T
File "<python-input-1>", line 1, in <generic parameters of Foo>
class Foo[T](Enum):
X: T
File "/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/enum.py", line 491, in __prepare__
member_type, first_enum = metacls._get_mixins_(cls, bases)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
File "/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/enum.py", line 956, in _get_mixins_
raise TypeError("new enumerations should be created as "
"`EnumName([mixin_type, ...] [data_type,] enum_type)`")
TypeError: new enumerations should be created as `EnumName([mixin_type, ...] [data_type,] enum_type)`
>>> from typing import Generic, TypeVar
>>> T = TypeVar("T")
>>> class Foo(Enum, Generic[T]):
... X: T
...
Traceback (most recent call last):
File "<python-input-4>", line 1, in <module>
class Foo(Enum, Generic[T]):
X: T
File "/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/enum.py", line 491, in __prepare__
member_type, first_enum = metacls._get_mixins_(cls, bases)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
File "/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/enum.py", line 956, in _get_mixins_
raise TypeError("new enumerations should be created as "
"`EnumName([mixin_type, ...] [data_type,] enum_type)`")
TypeError: new enumerations should be created as `EnumName([mixin_type, ...] [data_type,] enum_type)`even when the class creation does not immediately fail at runtime, it does not work in the way you'd expect, because of the fact that Generic.__class_getitem__ is clobbered by EnumMeta.__getitem__:
>>> from enum import Enum
>>> from typing import Generic, TypeVar
>>> T = TypeVar("T")
>>> class Foo(Generic[T], Enum):
... X: T
...
>>> Foo[int]
Traceback (most recent call last):
File "<python-input-6>", line 1, in <module>
Foo[int]
~~~^^^^^
File "/Users/alexw/.pyenv/versions/3.13.1/lib/python3.13/enum.py", line 789, in __getitem__
return cls._member_map_[name]
~~~~~~~~~~~~~~~~^^^^^^
KeyError: <class 'int'>Generic enums are explicitly forbidden by mypy and pyrefly:
- https://mypy-play.net/?mypy=latest&python=3.12&gist=35ef91038c460dd48cd557bb8704076d
- https://pyrefly.org/sandbox/?project=N4IgZglgNgpgziAXKOBDAdgEwEYHsAeAdAA4CeS4ATrgLYAEALqcROgOZ0Q3G6UN0BxGOhiUIAYwA0dACrMYANVSUAOujDV6wgK70uPPnQCi6XWrUy6AXlnyllABQqQM5wEpz6cVFRw4dADFcXAchETFxAG0ZAF1pE103RDU6VLp8RFk1EEkQbQZoOBJyRBAAYjoAVQKoCCY6MG0vAtx0OE9MGDAG3hpUBgB9UxpsUQcMznQGNzoAWgA%2BOjgGSmT0NLpKGAZtSnWwZwA5XVHVumB8AF9nbNyyLbAoUkIGWigKCoAFUgenpYwcAQ6OJWpA2Lt%2BhBWoQ1BUAMowGB0AAWDAYxDgiAA9Fj7l0noReGwscIsZhcOI4FiQeoIODKJDWliepQ6KgAG6oaCobCwYGgukQlrrXDEYVFNRkBjI1qzdmiOBQ9Y2ZwAZkIAEYAEw3dAgS65VDiArygLQGAUNBYPBEMj6oA
and while we do not yet explicitly forbid them, we also make it impossible for generic enums to exist in our model due to the way we represent enum-literal types -- EnumLiteralType wraps a ClassLiteral rather than a ClassType:
ruff/crates/ty_python_semantic/src/types.rs
Lines 12696 to 12704 in c02d164
So I do not believe that generic enums are a concept that we can or should support, and I would instead view this as a missing lint that we should implement in a separate PR (where we explicitly forbid enum classes to multiple-inherit from Generic[] or have PEP-695 type parameters)
There was a problem hiding this comment.
Fair enough, that works for me! Looks like I just picked the wrong type checker to test in (pyright supports generic enums to at least some extent.) The one part of an enum that could be coherently generic is methods, maybe? But I'm happy to just say we don't support generic enums and never will.
| // for why we avoid using the `should_widen` closure here. | ||
| let enum_literals_limit = | ||
| if self.recursively_defined.is_yes() && cycle_recovery { | ||
| MAX_RECURSIVE_UNION_LITERALS |
There was a problem hiding this comment.
Do we have any tests involving a recursively-defined union with more than 10 enum elements in it? (Or one with less than 10, for that matter). Could probably look at the PR that introduced this constant to find similar tests for other kinds of literals, to work from.
There was a problem hiding this comment.
It doesn't look to me like we have any tests for recursively defined unions with lots of Literal elements in them. I looked at c7b5067, which originally introduced the size limit on the number of Literal elements we allow to coexist in a union before widening the Literals, and f3e5713, which increased the limit for non-recursive unions -- neither commit appeared to include tests of the kind you're asking about here.
Happy to look into this more as a followup if you think it's important, but for now I'm going to move on!
| }; | ||
| if literals.len() >= enum_literals_limit { | ||
| let replace_with = | ||
| Type::instance(self.db, ClassType::NonGeneric(enum_class)); |
There was a problem hiding this comment.
Same comment here about ClassType::NonGeneric.
It seems like this (make an instance type out of an enum class and add it in place) is something we do a lot of different places in this diff. I'm not sure it can be abstracted smaller than the two lines to do those two things, but it would be nice if this particular repeated line didn't repeat non-germane assumptions about genericness in any way (whether they are correct assumptions or not.)
There was a problem hiding this comment.
yeah, I can extract this out into a method on EnumLiteralType
There was a problem hiding this comment.
Oh, we actually already have the helper method you're asking for, I just didn't know about it/forgot about it:
ruff/crates/ty_python_semantic/src/types.rs
Lines 12709 to 12713 in 7319c37
910630d to
da75161
Compare
Summary
When building unions, we have fast paths for int-literals, bytes-literals and string-literals so that we avoid repeated expensive subtype-of checks. However, we currently have no equivalent fast path for enum literals. This PR adds that fast path.
This PR leads to a dramatic performance improvement for the snippet in astral-sh/ty#1588 (comment). (But I would still say our performance is too slow even with this PR, so I wouldn't say it closes that issue.)
Test Plan
QUICKCHECK_TESTS=1000000 cargo test --release -p ty_python_semantic -- --ignored types::property_tests::stablestill passes