
Commit 8e0c73e

builder: BCOffset/InstrIndex named conversions + 3-class fix in build_inline_except_opcode_array_c
Fixes THREE boundary-domain bugs in build_inline_except_opcode_array_c
(introduced W27c #2a 7135d94), found across two HIR-diff Phase 0 cycles
of the W-2B-RECONVERT investigation.

CONVENTION (Python/jit/bytecode.cpp:8-14, builder.cpp:1235,
phx_frame_state.h cur_instr_offs semantics):
- jit_bc_instr_init expects INSTRUCTION INDEX (codeUnit[])
- jit_bc_instr_get_jump_target / next_offset / base_offset return
  whatever was stored at init (now INDEX after Class A fix)
- phx_block_map keys are BYTE OFFSETS
- OpcodeArrayEntry.base_offset is consumed downstream as BYTE OFFSET
  (cur_instr_offs assignment per builder_emit_c.c:3320)
- BCOffset.value() (caller-passed except_body_offset) is BYTE OFFSET
- BYTES = INDEX * sizeof(_Py_CODEUNIT) (= 2 in 3.12)

NAMED CONVERSIONS (Python/jit/bytecode_c.h, gated by _Py_OPCODE):
  static inline int phx_bc_offset_to_instr_index(int byte_off);
  static inline int phx_bc_instr_index_to_offset(int instr_idx);
Codifies the boundary-domain rule by example (per pythia python#137
python#2 + supervisor 19:01:19Z + theologian 19:01:17Z).

THREE FIXES:

CLASS A (line 3241): jit_bc_instr_init was passed except_body_offset
(BCOffset.value() byte offset) where INSTRUCTION INDEX was expected.
codeUnit(code)[byte_offset] read PAST end of co_code → garbage opcode →
switch-default → Deopt with corrupt frame state. Found by Phase 0
HIR-diff (test_exc_raise_catch bb 12: correct Return -1 vs corrupt
LoadConst NoneType + Deopt at offset 58).

CLASS B (line 3273-3275, theologian class-of-bug audit 18:42:53Z):
target = jit_bc_instr_get_jump_target returns INDEX, but
phx_block_map_lookup_or_panic expects BYTE OFFSET. Without conversion,
JUMP_BACKWARD-in-except-body lookup fails: JIT_CHECK_C panic OR silent
wrong-block. Dormant pre-fix because no test had
backward-jump-in-except-body.

CLASS C (line 3260, exposed by Phase 0' HIR-diff after Class A+B fix):
After Class A fix corrected the init to INDEX, jit_bc_instr_base_offset
returns INDEX. But entry->base_offset is consumed downstream (line 3320)
as BYTE OFFSET via match_tc.frame.cur_instr_offs assignment. Pre-fix
'correct by accident': Class A's BYTES-as-INDEX init wrote BYTES into
bci->base_offset, so jit_bc_instr_base_offset returned BYTES, matching
downstream. Correct Class A exposed Class C: cur_instr_offs got INDEX
(half the correct BYTE value) → interpreter Deopt resumed at wrong
bytecode position → SIGSEGV in test_multiple_exceptions_in_loop
(deterministic 0/20 post Class A+B fix, vs 20/20 PASS pre-W27c).

DIAGNOSIS: HIR-diff for test_multiple_exceptions_in_loop revealed Deopt
CurInstrOffset 124 (correct, BYTES) → 62 (wrong, INDEX = 124/2). Direct
evidence of the domain mismatch.

LATENT in pushed W27c #2a (e4e7507 on SonicField/cpython): all three
classes present. Class A, B dormant (no test exercises
emitInlineExceptionMatch or JUMP_BACKWARD-in-except-body). Class C
compensated by Class A: both broken in opposite directions, canceling
out for downstream consumers of entry->base_offset. ALL three must fix
together.

INVESTIGATION CHAIN:
- testkeeper bisect 17:52:30Z localized #2b regression → W27c #2b sole
- pythia python#136 python#1 18:23:34Z flagged HEAP/RACE rode on
  absence-of-evidence
- generalist 18:24Z proposed HIR-diff Phase 0 falsifier
- generalist 18:32+18:34Z captured HIR_2a + HIR_2b dumps; found Class A
- theologian 18:42:53Z class-of-bug audit found Class B
- supervisor 18:43:17Z directed dual-fix
- pythia python#137 python#2 19:00:29Z flagged inline-arithmetic
  violates boundary-domain rule
- supervisor 19:01:19Z + theologian 19:01:17Z directed amend to named
  conversions
- testkeeper 19:06:55Z full Phoenix gate caught NEW regression
  (test_multiple_exceptions_in_loop deterministic 0/20)
- generalist 19:14Z HIR-diff Phase 0' on
  test_multiple_exceptions_in_loop revealed Class C
  (cur_instr_offs 124→62)
- supervisor 19:16:42Z authorized Class C fix + extended audit

OTHER OpcodeArrayEntry FIELDS AUDITED (per supervisor 19:16:42Z
extended class-of-bug discipline):
- entry->opcode: written from jit_bc_instr_opcode (no domain; opcode
  value); consumed in dispatch loop switch. CLEAN.
- entry->oparg: written from jit_bc_instr_oparg (no domain; oparg
  value); consumed in dispatch loop emit calls. CLEAN.
- entry->base_offset: Class C above; FIXED.
- entry->const_obj: written from PyTuple_GET_ITEM (PyObject*); consumed
  in dispatch loop hir_type_from_object. CLEAN (no domain conversion).
- entry->jump_target_block: Class B above; FIXED.

VERIFICATION pending (testkeeper 4-suite extended verify):
1. 30x test_exc_raise_catch (Class A regression)
2. 30x test_exc_binary_subscr_dict_in_try (Class A latent activation)
3. 30x test_exc_continue_in_loop (Class B latent activation)
4. 30x multi-except-in-loop sentinel (Class C latent activation)
5. Full Phoenix suite (which originally caught Class C)
1 parent e4e7507 commit 8e0c73e
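
The way Class A and Class C cancel out, as described in the commit message, can be replayed with a small stand-alone model. This is a toy sketch under stated assumptions, not the Phoenix code: `ToyInstr`, `toy_init`, `toy_base_offset`, and `toy_entry_base_offset` are hypothetical names, and the code-unit size is hard-coded to the 2 bytes the message gives for 3.12. The toy accessor returns whatever was stored at init, which is the behaviour the CONVENTION list attributes to the real accessors.

```c
#include <assert.h>

/* Assumption: stand-in for sizeof(_Py_CODEUNIT), 2 bytes per the commit message. */
enum { TOY_CODEUNIT_SIZE = 2 };

/* Hypothetical cursor that only remembers the value it was initialized with. */
typedef struct { int base; } ToyInstr;

static void toy_init(ToyInstr *bci, int idx) { bci->base = idx; }
static int toy_base_offset(const ToyInstr *bci) { return bci->base; }

static int toy_offset_to_index(int byte_off) { return byte_off / TOY_CODEUNIT_SIZE; }
static int toy_index_to_offset(int idx) { return idx * TOY_CODEUNIT_SIZE; }

/* Returns the value that would be written to entry->base_offset for a
 * given except-body BYTE offset, with each fix switched on or off.
 * fix_a: convert bytes -> index before init (Class A).
 * fix_c: convert index -> bytes at the write site (Class C). */
static int toy_entry_base_offset(int except_body_byte_off, int fix_a, int fix_c) {
    ToyInstr bci;
    toy_init(&bci, fix_a ? toy_offset_to_index(except_body_byte_off)
                         : except_body_byte_off);
    int base = toy_base_offset(&bci);
    return fix_c ? toy_index_to_offset(base) : base;
}
```

For a byte offset of 124, `toy_entry_base_offset(124, 0, 0)` gives 124 (neither fix, correct by accident), `toy_entry_base_offset(124, 1, 0)` gives 62 (Class A fixed alone, the regression reported in DIAGNOSIS), and `toy_entry_base_offset(124, 1, 1)` gives 124 again. The toy models only the base_offset path; it does not reproduce the out-of-bounds opcode read that Class A also caused.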

2 files changed

Lines changed: 46 additions & 4 deletions


Python/jit/bytecode_c.h

Lines changed: 20 additions & 0 deletions
@@ -55,6 +55,26 @@ int jit_bc_instr_next_offset(JitBytecodeInstr *bci);
 /* Raw word access (requires _Py_CODEUNIT from cpython/code.h) */
 #ifdef _Py_OPCODE
 _Py_CODEUNIT jit_bc_instr_word(JitBytecodeInstr *bci);
+
+/* Boundary-domain conversions between BCOffset (byte offsets, used by
+ * C++ BCOffset.value() and phx_block_map keys per builder.cpp:1235) and
+ * instruction indices (codeUnit[] indexing, used by all jit_bc_instr_*
+ * accessors and phx_block_map_lookup_or_panic argument). See
+ * Python/jit/bytecode.cpp:8-14 for the boundary convention.
+ *
+ * Use these named helpers at every C/C++ seam that crosses byte ↔ index
+ * units, per the boundary-domain rule (W-PROTOCOL-CODIFY, supervisor
+ * 2026-04-25 19:01:19Z + theologian 19:01:17Z). Inline arithmetic at the
+ * seam is the bug class: BCOffset/InstrIndex Class A + B + C mismatches
+ * in build_inline_except_opcode_array_c (W-2B-RECONVERT investigation
+ * found three boundary-domain bugs in a single helper across two HIR-diff
+ * Phase 0 cycles). */
+static inline int phx_bc_offset_to_instr_index(int byte_off) {
+    return byte_off / (int)sizeof(_Py_CODEUNIT);
+}
+static inline int phx_bc_instr_index_to_offset(int instr_idx) {
+    return instr_idx * (int)sizeof(_Py_CODEUNIT);
+}
 #endif
 
 /* Bytecode block iteration */
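
To illustrate why the byte-to-index conversion matters at an array subscript, here is a minimal sketch that mirrors the two helpers outside of CPython. Everything prefixed `toy_` is hypothetical: `toy_codeunit` is an assumed 2-byte stand-in for `_Py_CODEUNIT`, and the array length of 40 units used in the note below is an arbitrary illustrative figure, not a value taken from the failing test.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for CPython's _Py_CODEUNIT, modelled as a 2-byte unit. */
typedef uint16_t toy_codeunit;

/* Mirrors of the named helpers, written against the stand-in type. */
static inline int toy_offset_to_instr_index(int byte_off) {
    return byte_off / (int)sizeof(toy_codeunit);
}
static inline int toy_instr_index_to_offset(int instr_idx) {
    return instr_idx * (int)sizeof(toy_codeunit);
}

/* True when idx is a valid subscript into an array of n_units code units. */
static int toy_index_in_bounds(int n_units, int idx) {
    return idx >= 0 && idx < n_units;
}
```

With a 40-unit (80-byte) code array, byte offset 58 converts to index 29, which is in bounds, whereas using 58 directly as a subscript is not. That is the shape of the Class A out-of-bounds read. The conversions round-trip exactly for even byte offsets, which is the only kind a sequence of 2-byte units can produce.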

Python/jit/hir/builder_emit_c.c

Lines changed: 26 additions & 4 deletions
@@ -3237,8 +3237,16 @@ static void build_inline_except_opcode_array_c(
         return;
     }
 
+    /* except_body_offset is a BCOffset.value() (BYTE OFFSET, see
+     * Python/jit/bytecode.cpp:8-14 for the C/C++ boundary convention).
+     * jit_bc_instr_init expects an INSTRUCTION INDEX. Convert at the
+     * boundary via the named helper. Subsequent re-inits inside the loop
+     * use jit_bc_instr_next_offset which already returns an instruction
+     * index. (Class A fix: was passing BYTES → OOB read past end of
+     * bytecode → garbage opcode → corrupt deopt.) */
     JitBytecodeInstr ebc;
-    jit_bc_instr_init(&ebc, code, except_body_offset);
+    jit_bc_instr_init(&ebc, code,
+                      phx_bc_offset_to_instr_index(except_body_offset));
     int emitted_terminator = 0;
     while (!emitted_terminator) {
         if (n == cap) {
@@ -3257,7 +3265,15 @@ static void build_inline_except_opcode_array_c(
 
         int op = jit_bc_instr_opcode(&ebc);
         int oparg = jit_bc_instr_oparg(&ebc);
-        int base_off = jit_bc_instr_base_offset(&ebc);
+        /* (Class C fix) jit_bc_instr_base_offset returns whatever was
+         * stored at init — INDEX after our Class A fix. But entry->base_offset
+         * is consumed downstream as BYTE OFFSET (cur_instr_offs assignment
+         * at line 3320 → BCOffset domain per phx_frame_state.h cur_instr_offs
+         * semantics, builder.cpp:1710/1774/4392). Convert at the WRITE site
+         * to preserve the original BYTES semantic. (Pre-Class-A fix this
+         * compensated by Class A's BYTES-as-INDEX init giving back BYTES
+         * here; Class C was dormant. Correct Class A exposed Class C.) */
+        int base_off = phx_bc_instr_index_to_offset(jit_bc_instr_base_offset(&ebc));
 
         OpcodeArrayEntry *entry = &arr[n++];
         entry->opcode = op;
@@ -3270,9 +3286,15 @@ static void build_inline_except_opcode_array_c(
             entry->const_obj = (void*)PyTuple_GET_ITEM(code->co_consts, oparg);
         }
         if (op == JUMP_BACKWARD || op == JUMP_BACKWARD_NO_INTERRUPT) {
-            int target = jit_bc_instr_get_jump_target(&ebc);
+            /* (Class B fix) jit_bc_instr_get_jump_target returns INSTRUCTION
+             * INDEX (codeUnit[]); phx_block_map keys are BYTE OFFSETS
+             * (per builder.cpp:1235 phx_block_map_insert with
+             * BCOffset.value()). Convert at the boundary via the named
+             * helper. */
+            int target_idx = jit_bc_instr_get_jump_target(&ebc);
+            int target_off = phx_bc_instr_index_to_offset(target_idx);
             entry->jump_target_block = phx_block_map_lookup_or_panic(
-                &phx_hir_builder_state(builder)->block_map_phx, target);
+                &phx_hir_builder_state(builder)->block_map_phx, target_off);
         }
 
         if (op == RETURN_VALUE || op == RETURN_CONST
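
The two Class B failure modes named in the commit message (a lookup miss, or a silent hit on the wrong block) can be seen with a toy block map keyed by byte offset. This is an illustrative sketch only: `ToyBlockEntry`, `toy_block_lookup`, `toy_map`, and the block ids are hypothetical, and the toy lookup returns -1 on a miss where the real `phx_block_map_lookup_or_panic` would, by its name, panic.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 2-byte code units, as stated for 3.12 in the commit message. */
enum { TOY_CODEUNIT_SIZE = 2 };

/* Hypothetical map entry: block start (BYTE offset) -> block id. */
typedef struct { int byte_off; int block_id; } ToyBlockEntry;

/* Linear lookup keyed by BYTE offset; returns -1 when no block starts there. */
static int toy_block_lookup(const ToyBlockEntry *map, size_t n, int byte_off) {
    for (size_t i = 0; i < n; i++) {
        if (map[i].byte_off == byte_off) {
            return map[i].block_id;
        }
    }
    return -1;
}

static int toy_index_to_offset(int idx) { return idx * TOY_CODEUNIT_SIZE; }

/* Toy map: blocks start at byte offsets 0, 10, and 20. */
static const ToyBlockEntry toy_map[] = { {0, 100}, {10, 101}, {20, 102} };
static const size_t toy_map_len = sizeof(toy_map) / sizeof(toy_map[0]);
```

A jump target at instruction index 10 lives at byte offset 20, so the converted lookup returns block 102. Passing the raw index 10 returns block 101 instead, with no error raised. A target at index 5 (byte offset 10) shows the other mode: the raw value 5 matches no key and the lookup misses, while the converted value finds block 101.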
