| 1 | +# W25 §5.3 Mutation Test — Step B Falsification Baseline |
| 2 | + |
| 3 | +**Status:** Pre-Step-B baseline procedure. Run by testkeeper at HEAD |
| 4 | +3404a81192 (post-Step-A, pre-Step-B) to capture the empirical "drift |
| 5 | +silently links" baseline. Re-run after Step B lands to confirm the |
| 6 | +"drift caught at compile time" post-condition. |
| 7 | + |
| 8 | +**Purpose:** Per W25 spec docs/w25-hbb-canonicalization.md §5.3, validate |
| 9 | +that Step B's local-extern cleanup actually closes the signature-drift |
| 10 | +surface. Without this empirical test, the "lint gate catches drift" |
| 11 | +claim is theoretical. |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## 1. Mutation choice |
| 16 | + |
| 17 | +Modify `Python/jit/hir/hir_c_api.h` to ADD an extra parameter to a |
| 18 | +widely-used API function. Choice: `hir_block_id`. |
| 19 | + |
| 20 | +**Original signature** (post-Step-A at 3404a81192): |
| 21 | + |
| 22 | +```c |
| 23 | +int hir_block_id(struct HirBasicBlock *block); |
| 24 | +``` |
| 25 | + |
| 26 | +**Mutated signature** (W25 §5.3 baseline mutation): |
| 27 | + |
| 28 | +```c |
| 29 | +int hir_block_id(struct HirBasicBlock *block, int unused_drift_param); |
| 30 | +``` |
| 31 | + |
| 32 | +The mutation introduces a parameter-count mismatch. Local extern decls |
| 33 | +in §1b TUs do NOT include the new parameter, so every consuming TU is |
| 34 | +left with a stale signature. |
| 35 | + |
| 36 | +Why `hir_block_id`: |
| 37 | +- Called from at least 1 §1b TU (per `count_w25_b1b_tus.sh` survey). |
| 38 | +- Simple int parameter — no implicit-conversion masking. |
| 39 | +- ABI-incompatible mutation (extra arg): a caller built against a |
| 40 | + stale one-argument declaration passes the wrong argument count. |
| 41 | +- BUT: the linker matches C symbols by name only, so a declaration |
| 42 | + that disagrees with the definition is undefined behavior, NOT a |
| 43 | + hard link error. The question §5.3 answers empirically is whether |
| 44 | + THIS mutation is caught by the compiler, the linker, or at runtime. |
| 45 | + |
| 46 | +## 2. Baseline procedure (pre-Step-B, run NOW at HEAD 3404a81192) |
| 47 | + |
| 48 | +```bash |
| 49 | +# Step 1: enter the repo and capture baseline state |
| 50 | +cd /data/users/alexturner/phoenix/cpython |
| 51 | +git rev-parse HEAD # expect: 3404a811927d1fb1fce61ce7f1ab8763f988711e |
| 52 | + |
| 53 | +# Step 2: apply mutation |
| 54 | +stash_before=$(git stash list | wc -l); git stash --include-untracked # stash working-tree changes, if any |
| 55 | +sed -i 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' Python/jit/hir/hir_c_api.h |
| 56 | + |
| 57 | +# Verify mutation applied |
| 58 | +grep -A1 "hir_block_id" Python/jit/hir/hir_c_api.h | head -5 |
| 59 | + |
| 60 | +# Step 3: try to build |
| 61 | +scripts/build_phoenix.sh > /tmp/w25-mutation-baseline-stdout.log 2>&1 |
| 62 | +echo "BUILD_EXIT=$?" >> /tmp/w25-mutation-baseline-stdout.log |
| 63 | + |
| 64 | +# Step 4: capture findings |
| 65 | +echo "=== BASELINE (pre-Step-B at 3404a81192) ===" > docs/w25-step-b-mutation-baseline.txt |
| 66 | +grep -E "error:|warning:|BUILD_EXIT" /tmp/w25-mutation-baseline-stdout.log | tail -30 >> docs/w25-step-b-mutation-baseline.txt |
| 67 | + |
| 68 | +# Step 5: revert |
| 69 | +git checkout -- Python/jit/hir/hir_c_api.h |
| 70 | +[ "$(git stash list | wc -l)" -gt "$stash_before" ] && git stash pop || true # pop only what Step 2 stashed |
| 71 | + |
| 72 | +# Step 6: verify revert succeeded |
| 73 | +git diff Python/jit/hir/hir_c_api.h # expect: empty |
| 74 | +``` |
| 75 | + |
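The `grep | head` check under "Verify mutation applied" prints whatever it finds but does not fail when the `sed` pattern matches nothing, for example if the declaration in the header is wrapped or spaced differently. A minimal sketch of a stricter check follows. The helper name `apply_mutation` is hypothetical (not part of the repo), and it is exercised on a scratch file here rather than on the real header.

```shell
# Hypothetical helper, not part of the repo: apply the §1 mutation to a
# header and succeed only if the mutated declaration is present afterwards.
apply_mutation() {
    header="$1"
    tmp="$(mktemp)" || return 1
    # Same substitution as in the procedure, written to a temp file first.
    sed 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' \
        "$header" > "$tmp"
    if grep -qF 'int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);' "$tmp"; then
        cat "$tmp" > "$header"   # overwrite in place, keeping file permissions
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Exercise the helper on a scratch copy instead of the real header.
scratch="$(mktemp)"
printf '%s\n' 'int hir_block_id(struct HirBasicBlock *block);' > "$scratch"
if apply_mutation "$scratch"; then
    echo "mutation applied"
else
    echo "mutation NOT applied: check the declaration's exact spelling" >&2
fi
```

If the helper reports failure on the real header, the build in Step 3 would run against an unmutated tree and the baseline would be meaningless, so it seems worth stopping at that point.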
| 76 | +## 3. Expected baseline outcome |
| 77 | + |
| 78 | +**Hypothesis (theoretical):** the mutation will be silently linkable |
| 79 | +because §1b TUs use local extern decls: each TU type-checks its calls |
| 80 | +against its own stale declaration, and the linker sees only the function name. |
| 81 | + |
| 82 | +**To-verify:** the build either: |
| 83 | +- (a) Succeeds at link time — confirms drift surface exists. Record |
| 84 | + the finding as PRE-STEP-B BASELINE. |
| 85 | +- (b) Fails at compile time. Surprising; it would mean that TUs which |
| 86 | + include hir_c_api.h (the §1a TUs) see the mismatch. Record which TUs |
| 87 | + fail and what those compile errors look like. |
| 88 | +- (c) Compiles and links but fails at runtime. Also possible if some |
| 89 | + caller ends up with corrupted-stack behavior. |
| 90 | + |
| 91 | +The actual outcome is the empirical baseline. Document it. |
| 92 | + |
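The outcomes above can be told apart, at least roughly, from the captured log. The sketch below assumes the log layout produced by §2 (compiler output followed by a `BUILD_EXIT=<code>` line). The function name `classify_outcome` and its labels are illustrative, not part of the repo. Outcome (c) cannot be distinguished from (a) using a build log alone, and a failed build still needs a human look to confirm the errors are compile-time ones.

```shell
# Illustrative classifier for a §2-style build log (hypothetical helper).
classify_outcome() {
    log="$1"
    # Take the last BUILD_EXIT=<n> line; empty if the log has none.
    code="$(sed -n 's/^BUILD_EXIT=\([0-9][0-9]*\)$/\1/p' "$log" | tail -n 1)"
    if [ -z "$code" ]; then
        echo "unknown: no BUILD_EXIT line in log"
    elif [ "$code" -eq 0 ]; then
        echo "a-or-c: build succeeded, drift not caught by the build"
    elif grep -q 'error:' "$log"; then
        echo "b-candidate: build failed with error: lines, inspect them"
    else
        echo "other: build failed without any error: line"
    fi
}

# Synthetic logs for illustration only; caller.c is a made-up file name.
ok_log="$(mktemp)"
printf '%s\n' 'BUILD_EXIT=0' > "$ok_log"
bad_log="$(mktemp)"
printf '%s\n' "caller.c:12:5: error: too few arguments to function 'hir_block_id'" 'BUILD_EXIT=2' > "$bad_log"

classify_outcome "$ok_log"
classify_outcome "$bad_log"
```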
| 93 | +## 4. Post-Step-B procedure |
| 94 | + |
| 95 | +After Step B lands (local extern decls deleted, all §1b TUs include |
| 96 | +hir_c_api.h): |
| 97 | + |
| 98 | +```bash |
| 99 | +# At post-Step-B HEAD |
| 100 | +git rev-parse HEAD # expect: <Step B's last commit hash> |
| 101 | +sed -i 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' Python/jit/hir/hir_c_api.h |
| 102 | +scripts/build_phoenix.sh > /tmp/w25-mutation-poststepb-stdout.log 2>&1 |
| 103 | +echo "BUILD_EXIT=$?" >> /tmp/w25-mutation-poststepb-stdout.log |
| 104 | +git checkout -- Python/jit/hir/hir_c_api.h |
| 105 | +``` |
| 106 | + |
| 107 | +**Expected post-Step-B outcome:** compile FAILS in every consuming TU |
| 108 | +that calls `hir_block_id`. The linker never gets a chance: the compiler |
| 109 | +rejects each call during type checking because all consumers see the |
| 110 | +canonical signature from hir_c_api.h. |
| 111 | + |
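To check the "every consuming TU" part of this expectation, the failing source files can be pulled out of the post-Step-B log and compared by hand against the `count_w25_b1b_tus.sh` inventory. The sketch below assumes GCC/Clang-style diagnostics of the form `path:line:col: error: ...`. `failing_tus` is a hypothetical helper, and the sample paths and messages are invented for illustration.

```shell
# Hypothetical helper: print each file with at least one error diagnostic, once.
failing_tus() {
    grep -E '^[^: ]+:[0-9]+:[0-9]+: (fatal )?error:' "$1" | cut -d: -f1 | sort -u
}

# Synthetic log for illustration only; these paths do not come from the repo.
log="$(mktemp)"
cat > "$log" <<'EOF'
Python/jit/example_a.c:10:12: error: too few arguments to function 'hir_block_id'
Python/jit/example_a.c:31:9: error: too few arguments to function 'hir_block_id'
Python/jit/example_b.c:7:3: error: too few arguments to function 'hir_block_id'
Python/jit/example_b.c:2:1: warning: unused variable 'x'
BUILD_EXIT=2
EOF

failing_tus "$log"
```

A TU that calls `hir_block_id` but is missing from this list would suggest it still carries a local extern declaration.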
| 112 | +## 5. Acceptance criterion |
| 113 | + |
| 114 | +§5.3 falsification PASSES when the pre-Step-B baseline shows DRIFT GOES |
| 115 | +UNDETECTED (or weakly detected) AND the post-Step-B run shows DRIFT |
| 116 | +CAUGHT AT COMPILE TIME with clear errors at every call site. |
| 117 | + |
| 118 | +If pre-Step-B baseline shows the drift is ALREADY caught at compile time |
| 119 | +(unexpected outcome (b) above), then either: |
| 120 | +- The drift surface was always smaller than Step B's framing assumed |
| 121 | + (and Step B's value is reduced — not invalidated, just reframed). |
| 122 | +- Or my mutation choice doesn't actually exercise the §1b drift surface |
| 123 | + (need a different mutation). |
| 124 | + |
| 125 | +Either way, the empirical finding informs Step B's framing. |
| 126 | + |
| 127 | +--- |
| 128 | + |
| 129 | +## 6. Sequencing |
| 130 | + |
| 131 | +1. **Pre-Step-B baseline run** (NOW): testkeeper executes §2 procedure |
| 132 | + at HEAD 3404a81192. Outcome captured in |
| 133 | + docs/w25-step-b-mutation-baseline.txt (NEW file). |
| 134 | +2. **Step B implementation** (NEXT): generalist deletes local externs + |
| 135 | + adds canonical includes in 7 cleanup-target TUs per |
| 136 | + count_w25_b1b_tus.sh inventory. |
| 137 | +3. **Post-Step-B re-run** (POST-STEP-B): testkeeper executes §4 procedure. |
| 138 | + Outcome appended to docs/w25-step-b-mutation-baseline.txt as POST-STEP-B |
| 139 | + section. |
| 140 | +4. **§5.3 closure**: if post-Step-B drift is caught at compile time per §5 |
| 141 | + acceptance, mutation test confirms Step B closed the drift surface. |
| 142 | + |
| 143 | +--- |
| 144 | + |
| 145 | +## 7. Cross-references |
| 146 | + |
| 147 | +- W25 spec: docs/w25-hbb-canonicalization.md §5.3 |
| 148 | +- §1b TU inventory: scripts/count_w25_b1b_tus.sh (17 total / 7 cleanup |
| 149 | + targets / 10 type-only at HEAD e6a8a2d0fb) |
| 150 | +- Step A: e6a8a2d0fb (canonical struct-pointer typedef landed) |
| 151 | +- §5.1 dual-include compile check: 3404a81192 (canonicalization |
| 152 | + validated structurally) |
| 153 | +- §5.2 lint gate: deferred to Step C (post-Step-B) |