Commit e8b8c14

docs: W25 §5.3 mutation test pre-Step-B baseline procedure
Per W25 spec §5.3 falsification + supervisor [chat L2129] sequencing
post-push-66: documents the mutation procedure for empirically validating
that Step B's local-extern cleanup closes the signature-drift surface.

docs/w25-step-b-mutation-test.md:
  §1 Mutation choice (hir_block_id + extra unused parameter)
  §2 Baseline procedure for pre-Step-B (testkeeper runs at HEAD 3404a81)
  §3 Expected baseline outcomes (a/b/c; empirical, not predicted)
  §4 Post-Step-B re-run procedure
  §5 Acceptance criterion (drift undetected pre-Step-B → caught
     compile-time post-Step-B)
  §6 Sequencing (baseline NOW → Step B → post-Step-B re-run)
  §7 Cross-references

WHY DOC NOT SCRIPT:
  Mutation is one-line sed; scripting it adds maintenance overhead
  without value. testkeeper executes the documented procedure; outputs
  land in docs/w25-step-b-mutation-baseline.txt (created by the
  procedure run, not by this commit).

PURPOSE:
  Without §5.3 baseline, the "lint gate catches drift" framing is
  theoretical. Empirical evidence (silent-link pre-Step-B vs compile-fail
  post-Step-B) is the falsification test the spec calls for. The actual
  baseline outcome may surprise; that surprise is the value (e.g., if
  drift is ALREADY caught at compile time, Step B's framing is reframed,
  not invalidated).

NEXT IN PIPELINE:
  testkeeper executes §2 at HEAD 3404a81 → captures
  docs/w25-step-b-mutation-baseline.txt → my Step B implementation begins
  (7 cleanup-target TUs).

Authorization chain:
- W25 spec §5.3: theologian L2017
- Supervisor sequencing post-push-66: chat L2129
- Generalist commitment to pre-Step-B baseline: chat L2065
1 parent 3404a81 commit e8b8c14

1 file changed

Lines changed: 153 additions & 0 deletions

docs/w25-step-b-mutation-test.md

@@ -0,0 +1,153 @@
# W25 §5.3 Mutation Test: Step B Falsification Baseline

**Status:** Pre-Step-B baseline procedure. Run by testkeeper at HEAD
3404a81192 (post-Step-A, pre-Step-B) to capture the empirical "drift
silently links" baseline. Re-run after Step B lands to confirm the
"drift caught at compile time" post-condition.

**Purpose:** Per W25 spec docs/w25-hbb-canonicalization.md §5.3, validate
that Step B's local-extern cleanup actually closes the signature-drift
surface. Without this empirical test, the "lint gate catches drift"
claim is theoretical.

---

## 1. Mutation choice

Modify `Python/jit/hir/hir_c_api.h` to add an extra parameter to a
widely-used API function. Choice: `hir_block_id`.

**Original signature** (post-Step-A at 3404a81192):

```c
int hir_block_id(struct HirBasicBlock *block);
```

**Mutated signature** (W25 §5.3 baseline mutation):

```c
int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);
```

The mutation introduces a parameter-count mismatch. The local extern
declarations in the §1b TUs do NOT include the new parameter, so every
§1b TU that calls the function through such a declaration keeps a stale
signature.

Why `hir_block_id`:
- Called from at least 1 §1b TU (per `count_w25_b1b_tus.sh` survey).
- Simple `int` parameter, so no implicit conversion can mask the change.
- ABI-incompatible mutation (extra arg): a caller compiled against the
  stale declaration passes one argument fewer than the mutated
  declaration requires.
- BUT: in C, calling a function through a declaration that does not
  match its definition is undefined behavior, NOT a hard link error,
  because the linker resolves symbols by name only. The question §5.3
  answers empirically is whether THIS specific mutation is caught by
  the compiler, the linker, or only at runtime.

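The distinction in the last bullet can be reproduced outside the Phoenix tree. The sketch below is a hypothetical two-caller reduction: the files `api.h`, `callee.c`, `stale_caller.c`, and `header_caller.c` are made up for illustration and do not exist in the repository, and it assumes a C compiler is installed as `cc`. Unlike the §2 procedure, which mutates only the header, the sketch gives the definition the two-parameter signature as well, so that the two callers differ only in where their declaration comes from. It compiles and links but never runs the binaries, since the mismatched call is undefined behavior.

```bash
# Hypothetical reduction of the drift surface; none of these files exist in the repo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Canonical header carrying the two-parameter ("mutated") declaration.
cat > api.h <<'EOF'
struct HirBasicBlock;
int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);
EOF

# Definition that matches the header.
cat > callee.c <<'EOF'
#include "api.h"
int hir_block_id(struct HirBasicBlock *block, int unused_drift_param) {
    (void)block;
    (void)unused_drift_param;
    return 0;
}
EOF

# Pre-Step-B style caller: local extern with the stale one-parameter signature.
cat > stale_caller.c <<'EOF'
struct HirBasicBlock;
extern int hir_block_id(struct HirBasicBlock *block);
int main(void) { return hir_block_id(0); }
EOF

# Post-Step-B style caller: includes the header but still passes one argument.
cat > header_caller.c <<'EOF'
#include "api.h"
int main(void) { return hir_block_id(0); }
EOF

if command -v cc >/dev/null 2>&1; then
    stale_status=0
    cc stale_caller.c callee.c -o stale_demo 2>stale.log || stale_status=$?
    header_status=0
    cc header_caller.c callee.c -o header_demo 2>header.log || header_status=$?
    echo "stale extern:   cc exit=$stale_status"
    echo "header include: cc exit=$header_status"
else
    echo "cc not found; skipping compile step"
fi
```

With GCC or Clang at default settings one would expect the stale-extern build to link and the header-include build to be rejected, but whether the real tree behaves the same way is what the baseline run is meant to establish.
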
## 2. Baseline procedure (pre-Step-B, run NOW at HEAD 3404a81192)

```bash
# Step 1: capture baseline state (enter the repo first so git sees the right tree)
cd /data/users/alexturner/phoenix/cpython
git rev-parse HEAD   # expect: 3404a811927d1fb1fce61ce7f1ab8763f988711e

# Step 2: apply mutation
# Stash only if the working tree is dirty, so Step 5 cannot pop an unrelated stash.
STASHED=0
if [ -n "$(git status --porcelain)" ]; then
    git stash --include-untracked
    STASHED=1
fi
sed -i 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' Python/jit/hir/hir_c_api.h

# Verify mutation applied (expect the two-parameter declaration)
grep -A1 "hir_block_id" Python/jit/hir/hir_c_api.h | head -5

# Step 3: try to build
scripts/build_phoenix.sh > /tmp/w25-mutation-baseline-stdout.log 2>&1
echo "BUILD_EXIT=$?" >> /tmp/w25-mutation-baseline-stdout.log

# Step 4: capture findings
echo "=== BASELINE (pre-Step-B at 3404a81192) ===" > docs/w25-step-b-mutation-baseline.txt
grep -E "error:|warning:|BUILD_EXIT" /tmp/w25-mutation-baseline-stdout.log | tail -30 >> docs/w25-step-b-mutation-baseline.txt

# Step 5: revert
git checkout -- Python/jit/hir/hir_c_api.h
if [ "$STASHED" -eq 1 ]; then
    git stash pop   # restore stashed working tree
fi

# Step 6: verify revert succeeded
git diff Python/jit/hir/hir_c_api.h   # expect: empty
```

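One fragile point in the procedure is that `sed -i` exits 0 even when its pattern matches nothing, so a whitespace difference in the header would leave the file untouched and the build would pass for the wrong reason. A minimal sketch of a guard follows, demonstrated on a scratch file rather than the real header. The helper name `apply_mutation` is hypothetical, and the sketch assumes GNU `sed`, as the procedure's own `sed -i` invocation already does.

```bash
# Hypothetical helper: apply the §1 mutation to the header given as $1 and
# fail loudly if the file content did not change.
apply_mutation() {
    local header=$1
    local before after
    before=$(cksum < "$header")
    sed -i 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' "$header"
    after=$(cksum < "$header")
    if [ "$before" = "$after" ]; then
        echo "mutation NOT applied to $header" >&2
        return 1
    fi
    echo "mutation applied to $header"
}

# Demonstration on a scratch file standing in for hir_c_api.h.
scratch=$(mktemp)
printf 'int hir_block_id(struct HirBasicBlock *block);\n' > "$scratch"
apply_mutation "$scratch"
grep -c 'unused_drift_param' "$scratch"
```

Comparing checksums before and after avoids depending on the exact text of the mutated line, so the same guard would work if a different mutation were chosen later.
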
## 3. Expected baseline outcome

**Hypothesis (theoretical):** the mutation will link silently because
§1b TUs use local extern decls: each stale declaration is consistent
with the calls in its own TU, and the linker sees only the function name.

**To-verify:** the build either:
- (a) Succeeds through link time, which confirms the drift surface
  exists. Record the finding as PRE-STEP-B BASELINE.
- (b) Fails at compile time. This would be surprising; the hir_c_api.h
  consumers (the §1a TUs) would catch the mismatch. Record what those
  compile errors look like.
- (c) Compiles and links but fails at runtime, which is also possible if
  some caller ends up with corrupted-stack behavior.

The actual outcome is the empirical baseline. Document it.

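The outcomes listed above can be told apart mechanically from the log that §2 writes, up to a point. The sketch below assumes the log contains the `BUILD_EXIT=` line appended in Step 3 and compiler diagnostics in the usual `error:` form; the function name `classify_outcome` and the labels it prints are hypothetical. A zero exit status cannot distinguish (a) from (c), because a runtime failure only shows up when the built binary is exercised.

```bash
# Hypothetical classifier for a log produced by the §2 procedure.
classify_outcome() {
    local log=$1
    local code
    # Read the numeric status from the last "BUILD_EXIT=<n>" line in the log.
    code=$(sed -n 's/^BUILD_EXIT=\([0-9][0-9]*\)$/\1/p' "$log" | tail -n 1)
    if [ -z "$code" ]; then
        echo "unknown: no BUILD_EXIT line"
    elif [ "$code" -eq 0 ]; then
        echo "a-or-c: build and link succeeded; runtime check still needed"
    elif grep -q 'error:' "$log"; then
        echo "b: build failed with compiler errors"
    else
        echo "other: build failed without an error: line"
    fi
}

# Demonstration on synthetic logs (not real build output).
ok_log=$(mktemp)
printf 'CC foo.o\nBUILD_EXIT=0\n' > "$ok_log"
bad_log=$(mktemp)
printf 'x.c:3: error: too few arguments to function\nBUILD_EXIT=2\n' > "$bad_log"
classify_outcome "$ok_log"
classify_outcome "$bad_log"
```
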
## 4. Post-Step-B procedure

After Step B lands (local extern decls deleted, all §1b TUs include
hir_c_api.h):

```bash
# At post-Step-B HEAD
cd /data/users/alexturner/phoenix/cpython
git rev-parse HEAD   # expect: <Step B's last commit hash>
sed -i 's|int hir_block_id(struct HirBasicBlock \*block);|int hir_block_id(struct HirBasicBlock *block, int unused_drift_param);|' Python/jit/hir/hir_c_api.h
grep -A1 "hir_block_id" Python/jit/hir/hir_c_api.h | head -5   # verify mutation applied
scripts/build_phoenix.sh > /tmp/w25-mutation-poststepb-stdout.log 2>&1
echo "BUILD_EXIT=$?" >> /tmp/w25-mutation-poststepb-stdout.log
# Append findings as the POST-STEP-B section (see §6)
echo "=== POST-STEP-B ===" >> docs/w25-step-b-mutation-baseline.txt
grep -E "error:|warning:|BUILD_EXIT" /tmp/w25-mutation-poststepb-stdout.log | tail -30 >> docs/w25-step-b-mutation-baseline.txt
git checkout -- Python/jit/hir/hir_c_api.h
```

**Expected post-Step-B outcome:** compilation FAILS in every consuming TU
that calls `hir_block_id`. The linker never gets a chance: the compiler
rejects each one-argument call because all consumers now see the
canonical signature from hir_c_api.h.

## 5. Acceptance criterion

§5.3 falsification PASSES when the pre-Step-B baseline shows that DRIFT
GOES UNDETECTED (or is only weakly detected) AND the post-Step-B run
shows DRIFT CAUGHT AT COMPILE TIME with clear errors at every call site.

If the pre-Step-B baseline shows the drift is ALREADY caught at compile
time (unexpected outcome (b) above), then either:
- The drift surface was always smaller than Step B's framing assumed
  (Step B's value is reduced: reframed, not invalidated).
- Or my mutation choice doesn't actually exercise the §1b drift surface
  (a different mutation is needed).

Either way, the empirical finding informs Step B's framing.

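The decision rule above reduces to a comparison of the two build exit statuses. A minimal sketch follows, assuming each run's `BUILD_EXIT` value has already been read from its log. The function name `verdict` and the labels it prints are hypothetical, and a nonzero status is treated as "caught at build time" without checking that the errors appear at every call site, which still needs a manual look at the log.

```bash
# Hypothetical verdict helper: $1 = pre-Step-B BUILD_EXIT, $2 = post-Step-B BUILD_EXIT.
verdict() {
    local pre=$1 post=$2
    if [ "$pre" -eq 0 ] && [ "$post" -ne 0 ]; then
        # Drift linked silently before Step B and is rejected after it.
        echo "PASS"
    elif [ "$pre" -ne 0 ]; then
        # Drift was already caught before Step B: outcome (b), framing needs revisiting.
        echo "REFRAME"
    else
        # Drift still builds cleanly after Step B.
        echo "FAIL"
    fi
}

verdict 0 2   # prints PASS
verdict 2 2   # prints REFRAME
verdict 0 0   # prints FAIL
```
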

---

## 6. Sequencing

1. **Pre-Step-B baseline run** (NOW): testkeeper executes §2 procedure
at HEAD 3404a81192. Outcome captured in
docs/w25-step-b-mutation-baseline.txt (NEW file).
2. **Step B implementation** (NEXT): generalist deletes local externs +
adds canonical includes in 7 cleanup-target TUs per
count_w25_b1b_tus.sh inventory.
3. **Post-Step-B re-run** (POST-STEP-B): testkeeper repeats §4 procedure.
Outcome appended to docs/w25-step-b-mutation-baseline.txt as POST-STEP-B
section.
4. **§5.3 closure**: if post-Step-B drift is caught at compile time per §5
acceptance, mutation test confirms Step B closed the drift surface.


---

## 7. Cross-references

- W25 spec: docs/w25-hbb-canonicalization.md §5.3
- §1b TU inventory: scripts/count_w25_b1b_tus.sh (17 total / 7 cleanup
  targets / 10 type-only at HEAD e6a8a2d0fb)
- Step A: e6a8a2d0fb (canonical struct-pointer typedef landed)
- §5.1 dual-include compile check: 3404a81192 (canonicalization
  validated structurally)
- §5.2 lint gate: deferred to Step C (post-Step-B)
