Skip to content

gh-138912: Add fast path for match class patterns without sub-patterns#144820

Open
cdce8p wants to merge 1 commit intopython:mainfrom
cdce8p:match-class-opcode-isinstance
Open

gh-138912: Add fast path for match class patterns without sub-patterns#144820
cdce8p wants to merge 1 commit intopython:mainfrom
cdce8p:match-class-opcode-isinstance

Conversation

@cdce8p
Copy link
Contributor

@cdce8p cdce8p commented Feb 14, 2026

Extracted from #139080. This builds on the work merged with 75d4839.

Add a fast path for match class patterns without any sub-patterns. This avoids creating a new (empty) tuple, pushing it to the stack and subsequently unpacking the tuple result from the MATCH_CLASS opcode again.

__
Micro benchmarks with all specializations enabled.

Micro benchmark (bare class pattern)
from timeit import timeit
setup = """
class A:
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = A(1, 2)

def f1(a):
    if isinstance(a, A):
        return True
    return False

def f2(a):
    match a:
        case A():
            return True
    return False
"""
print(f"If\t{timeit("f1(a)", setup, number=10**7)}")
print(f"Match\t{timeit("f2(a)", setup, number=10**7)}")
# Before
If      0.32616550009697676
Match   0.33599245804362

# After
If      0.3079932499676943
Match   0.2534141249489039 -24.58%

If specialization disabled, the performance improvement is even better with -40.5%.


📚 Documentation preview 📚: https://cpython-previews--144820.org.readthedocs.build/

@cdce8p cdce8p force-pushed the match-class-opcode-isinstance branch from a515a54 to 7f8b4b2 Compare February 22, 2026 11:17
@markshannon
Copy link
Member

Opcodes are a very limited resource, so we need to justify it carefully when adding new ones.

Is pattern matching widely used enough yet to justify this? I don't know.
Is the performance increase enough to justify adding a new instruction?


An approach to optimizing pattern matching that I would be more enthusiastic about is optimizing the decision tree.
This will still need new instructions, but they would be smaller and more easily optimized.

For example, case A(a, b) does both an instance check, and unpacks into a and b, so

match m:
    case A(a, b):
        X
    case A():
       Y

can be transformed into the decision tree, like this:

if m isa A:
    if tmp := unpack_attributes(m, 2):
        a, b = tmp
        X
    else:
        Y

which can be converted into code that has simpler operations and is more amenable to further specialization/jitting.

@eendebakpt
Copy link
Contributor

Inspired by Mark's comments I looked at the generated bytecode:

def g(m):
    match m:
        case A():
            return 100
    return None
import dis
print(dis.dis(g))

Output:

  2           RESUME                   0

  3           LOAD_FAST                0 (m)

  4           LOAD_GLOBAL              0 (A)
              LOAD_CONST               1 (())
              MATCH_CLASS              0
              COPY                     1
              POP_JUMP_IF_NONE         3 (to L1)
              UNPACK_SEQUENCE          0

  5           RETURN_CONST             2 (100)

  4   L1:     POP_TOP

  6           RETURN_CONST             0 (None)
None

This can be simplified to (diff main...eendebakpt:pattern_match_class_v0)

  8           RESUME                   0

  9           LOAD_FAST_BORROW         0 (m)

 10           LOAD_GLOBAL              0 (A)
              LOAD_CONST               0 (())
              MATCH_CLASS              0
              POP_JUMP_IF_FALSE        3 (to L1)
              NOT_TAKEN

 11           LOAD_SMALL_INT         100
              RETURN_VALUE

 12   L1:     LOAD_CONST               1 (None)
              RETURN_VALUE
None

A quick benchmarks shows this is a bit faster.

I tried changing the bytecode generation to use an isinstance check instead of MATCH_CLASS, but that turns out to be slower (even when adding isinstance to the LOAD_COMMON_CONSTANT). Maybe this can be made faster (also for the multi argument case), but that would require some more work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting review performance Performance or resource usage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants