chore(corpus-origin): tag merged evidence by tier + pin confidence-source contract by igorls · Pull Request #1223 · MemPalace/mempalace

igorls · 2026-04-26T21:38:02Z

Follow-up to #1221 addressing both points from the Copilot review on that PR.

Changes

1. Tier provenance in merged evidence (mempalace/cli.py).

The pre-#1221 code prefixed heuristic evidence as Tier-1 heuristic: … when concatenated with LLM evidence. The merge-fields refactor in #1221 flattened that, so the on-disk origin.json could no longer tell which tier produced which signal line. This restores the tier prefix on both sides — Tier-1 heuristic: … and Tier-2 LLM: … — applied idempotently so already-tagged entries pass through unchanged.
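The idempotent tagging described above can be sketched as follows. This is a minimal illustration, not the code in `mempalace/cli.py`: the helper names `tag_evidence` and `merge_evidence` are hypothetical, and only the two prefix strings are taken from the PR text.

```python
from typing import Any, Iterable, List

# Prefix strings as quoted in the PR description.
TIER1_PREFIX = "Tier-1 heuristic: "
TIER2_PREFIX = "Tier-2 LLM: "


def tag_evidence(entries: Iterable[Any], prefix: str) -> List[str]:
    """Prefix each entry with its tier label; already-tagged entries pass through."""
    tagged = []
    for entry in entries:
        text = str(entry)  # stringify once, reuse for both the check and the output
        tagged.append(text if text.startswith(prefix) else prefix + text)
    return tagged


def merge_evidence(heuristic: Iterable[Any], llm: Iterable[Any]) -> List[str]:
    """Hypothetical merge: heuristic lines first, then LLM lines, each tier-tagged."""
    return tag_evidence(heuristic, TIER1_PREFIX) + tag_evidence(llm, TIER2_PREFIX)
```

Feeding the merged output back through `merge_evidence` leaves it unchanged, which is the property that lets a re-run over an existing `origin.json` avoid double prefixes.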

2. Pin the confidence-source contract (tests/test_corpus_origin_integration.py).

#1221's contract says heuristic owns likely_ai_dialogue AND confidence. The bool was asserted; the confidence wasn't. A future change letting Tier 2's confidence leak through could pass all four existing tests silently.

Two inline assertions plus one dedicated test:

test_merge_tier_fields_heuristic_yes_llm_no_keeps_heuristic_bool — asserts merged confidence is not the mocked LLM's 0.90
test_merge_tier_fields_heuristic_no_no_personas_leak — asserts merged confidence is the heuristic's narrative-branch 0.9, not the mocked LLM's 0.95
New test_merge_tier_fields_confidence_matches_heuristic_call — calls detect_origin_heuristic directly on the same samples and pins equality, so any source other than the heuristic produces a visible mismatch

Plus an evidence-prefix assertion in test_merge_tier_fields_heuristic_yes_llm_yes_combines_evidence covering the new tagging behavior.
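The shape of the confidence-pinning check can be sketched as below. Everything here is a stand-in under stated assumptions: `OriginResult`, `fake_detect_origin_heuristic`, and `run_merge` are hypothetical and do not reproduce the real `detect_origin_heuristic` or merge code; the 0.8 dialogue-branch value is invented for illustration, while the 0.9 narrative-branch value follows the PR text.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class OriginResult:
    likely_ai_dialogue: bool
    confidence: float
    platform: Optional[str] = None


def fake_detect_origin_heuristic(samples: List[str]) -> OriginResult:
    """Hypothetical Tier-1 stand-in: flags samples containing 'assistant:' turns."""
    text = "\n\n".join(samples)
    hit = "assistant:" in text.lower()
    # 0.9 for the narrative branch mirrors the PR text; 0.8 is an assumption.
    return OriginResult(likely_ai_dialogue=hit, confidence=0.8 if hit else 0.9)


def run_merge(samples: List[str], llm: OriginResult) -> OriginResult:
    """Hypothetical merge: heuristic owns bool and confidence; LLM only fills platform."""
    heuristic = fake_detect_origin_heuristic(samples)
    return replace(heuristic, platform=heuristic.platform or llm.platform)


def check_confidence_contract(samples: List[str], llm: OriginResult) -> OriginResult:
    """Pin merged confidence to an independent heuristic call on the same samples."""
    merged = run_merge(samples, llm)
    expected = fake_detect_origin_heuristic(samples)
    assert merged.confidence == expected.confidence
    assert merged.likely_ai_dialogue == expected.likely_ai_dialogue
    return merged
```

The equality form stays valid even when the heuristic and the mocked LLM happen to agree on a value: with narrative samples and a mocked LLM confidence of 0.90, a `confidence != 0.90` assertion would fail although the contract holds, whereas the comparison against the heuristic call still passes.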

Verification

1378 tests pass locally (5/5 of the merge-tier suite)
ruff check . clean
ruff format --check . clean (CI-pinned 0.4.x)
No behavior change beyond the evidence-string format (the bool/confidence/persona/user/platform contract from #1221 is unchanged)

Checklist

Tests pass
No hardcoded paths
Linter passes

Copilot

Pull request overview

Restores provenance/auditability for corpus-origin merged evidence by tier and strengthens the “confidence comes from heuristic” contract so future refactors can’t silently reintroduce Tier-2 confidence leakage.

Changes:

Prefix merged evidence lines with tier provenance (Tier-1 heuristic: … / Tier-2 LLM: …) in _run_pass_zero, idempotently.
Add/extend integration test assertions to pin that merged confidence matches the heuristic (and that merged evidence is fully tier-prefixed).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `mempalace/cli.py` | Prefixes Tier-1/Tier-2 evidence lines during merge to preserve provenance in `origin.json`. |
| `tests/test_corpus_origin_integration.py` | Adds assertions and a new test to ensure merged confidence always matches heuristic output and evidence lines are tier-tagged. |

- cli.py: stringify each evidence entry exactly once before the startswith check (was calling str(e) twice per element).
- tests: replace brittle `confidence != 0.90` assertion with an equality check against detect_origin_heuristic on the same samples. The original would have spuriously fired if the heuristic ever legitimately produced 0.90 for these samples; the new form pins the contract directly.

…urce contract

Two follow-ups to PR #1221's merge-fields behavior, both raised by the Copilot review on that PR:

- Evidence merge now prefixes each entry with `Tier-1 heuristic: ` or `Tier-2 LLM: ` so the on-disk `origin.json` audit record retains tier provenance. The pre-#1221 code labeled heuristic evidence; the merge-fields refactor flattened that. Re-prefixing is idempotent.
- Tests now assert that the merged `confidence` is the heuristic's, not the LLM's. Added inline assertions to the two existing contradiction/disagreement tests, plus a dedicated `test_merge_tier_fields_confidence_matches_heuristic_call` that compares to `detect_origin_heuristic` directly so a future regression letting Tier 2 confidence leak through cannot pass silently.

Tests: 1378 pass. Ruff check + format both clean (CI-pinned 0.4.x).

Brings in upstream's corpus-origin + privacy-warning track (PRs MemPalace#1211 MemPalace#1221 MemPalace#1223 MemPalace#1224 MemPalace#1225) plus the canonical merged versions of our four PRs that landed today (21:22-21:41 UTC):

- MemPalace#1173 quarantine_stale_hnsw on make_client + cold-start gate + integrity sniff-test (PROTO/STOP byte check, no deserialization)
- MemPalace#1177 .blob_seq_ids_migrated marker guard, closes MemPalace#1090
- MemPalace#1198 _tokenize None-document guard in BM25 reranker
- MemPalace#1201 palace_graph.build_graph skips None metadata

Conflict resolution:

* mempalace/backends/chroma.py: took upstream as base (it has the igorls-review pickle-protocol docstring, thread-safety paragraph, and Path(marker).touch() style nit), then re-applied MemPalace#1094's _coerce_none_metas in query()/get() since MemPalace#1094 is still open and not yet in develop.
* mempalace/mcp_server.py: took upstream's clean form. Dropped the fork-only `palace_path=` kwarg from four ChromaCollection() call sites: the kwarg was load-bearing for MemPalace#1171's per-collection write lock, but MemPalace#1171 closed in favor of MemPalace#976's mine_global_lock + daemon-strict, so the kwarg has no remaining consumer. ChromaCollection.__init__ in upstream/develop is back to (self, collection); calling it with palace_path= raised TypeError → silently swallowed by the broad except in _get_collection() → returned None → tool_status() returned _no_palace(). 41 mcp_server tests went from failing-with-KeyError to passing.
* mempalace/cli.py: dropped fork-only `workers=args.workers` from the cmd_mine -> miner.mine() call. Pre-existing fork-side bug: the `--workers` argparse arg landed in 5cd14bd but miner.mine() never accepted a workers param, so production `mempalace mine` TypeError'd on every invocation. Removed the broken plumbing; tests/test_cli.py updated to match.
* CHANGELOG.md: took upstream verbatim. Fork-specific changelog lives in FORK_CHANGELOG.md (canonical: docs/fork-changes.yaml).
* CLAUDE.md: kept ours. Fork's CLAUDE.md is operational; upstream's added a "Design Principles / Contributing" charter, which lives in README.md on the fork.
* tests/test_backends.py: took upstream's ruff-formatted line widths.

docs/fork-changes.yaml flips the two MemPalace#1173 entries (hnsw-integrity-gate, hnsw-cold-start-gate) and the MemPalace#1201 entry (palace-graph-none-guard) from OPEN to MERGED 2026-04-26. MemPalace#1173 MemPalace#1177 MemPalace#1198 MemPalace#1201 added to the merged_upstream archive at the bottom. FORK_CHANGELOG.md regenerated.

scripts/check-docs.sh: 4/4 clean. Test suite: 1460/1460.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
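The failure chain in the mcp_server.py item (stale keyword argument, TypeError, broad except, None) can be reproduced in miniature. This is a sketch with hypothetical names (`Collection`, `get_collection_broad`, `get_collection_narrow`); it does not mirror the real `ChromaCollection` or `_get_collection()` beyond the behavior the commit message describes.

```python
class Collection:
    """Hypothetical wrapper whose constructor takes only the collection."""

    def __init__(self, collection):
        self.collection = collection


def get_collection_broad():
    """Broad except: the TypeError from the stale kwarg is hidden and None is returned."""
    try:
        return Collection("raw", palace_path="palace")  # unexpected keyword argument
    except Exception:
        return None


def get_collection_narrow():
    """Narrow except: only expected runtime failures are caught, so the bug surfaces."""
    try:
        return Collection("raw", palace_path="palace")
    except (OSError, ValueError):
        return None
```

With the broad handler, callers only ever see `None` and report a missing palace; with the narrow handler, the same call raises `TypeError` at the call site, which points directly at the stale keyword argument.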

cmd_init now invokes ``_run_pass_zero`` unconditionally (#1221, #1223 landed on develop after this PR's branch point). The pass reads sample content via ``builtins.open``; with that mocked to MagicMock, the downstream ``"\\n\\n".join(samples)`` in ``corpus_origin.detect_origin_heuristic`` raises ``TypeError: expected str instance, MagicMock found``. This test only cares about the wing-slug write to the registry, so stub the pass-zero call directly rather than try to satisfy its full sample-gathering contract.
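A reduced reproduction of that failure and of the stubbing approach is sketched below. The three functions are hypothetical stand-ins that share only their names with the real code, and the stub is installed with `patch.dict(globals(), ...)` so the example stays self-contained; in the actual test suite the patch target would presumably be the module attribute in `mempalace/cli.py`, which is an assumption here.

```python
from unittest.mock import MagicMock, patch


def detect_origin_heuristic(samples):
    """Stand-in: joins samples, as the commit message describes."""
    return {"chars": len("\n\n".join(samples))}


def _run_pass_zero(paths):
    """Stand-in for the pass that reads sample content via open()."""
    samples = []
    for path in paths:
        with open(path) as handle:
            samples.append(handle.read())
    return detect_origin_heuristic(samples)


def cmd_init(paths, registry):
    """Stand-in: runs pass zero, then writes the wing slug to the registry."""
    _run_pass_zero(paths)
    registry["wing"] = "my-wing"
    return registry


def init_with_only_open_mocked(paths):
    # handle.read() returns a MagicMock, so str.join raises TypeError.
    with patch("builtins.open", MagicMock()):
        return cmd_init(paths, {})


def init_with_pass_zero_stubbed(paths):
    # Replace the whole pass, so its sample-gathering contract never runs.
    with patch("builtins.open", MagicMock()):
        with patch.dict(globals(), {"_run_pass_zero": MagicMock(return_value=None)}):
            return cmd_init(paths, {})
```

Stubbing the pass keeps the test focused on the registry write; satisfying the mocked `open` well enough for the join to succeed would couple the test to details it does not care about.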

Copilot AI review requested due to automatic review settings April 26, 2026 21:38

igorls requested review from bensig and milla-jovovich as code owners April 26, 2026 21:38

Copilot started reviewing on behalf of igorls April 26, 2026 21:38

Copilot AI reviewed Apr 26, 2026

Comment thread mempalace/cli.py Outdated

Comment thread tests/test_corpus_origin_integration.py Outdated

igorls added 2 commits April 26, 2026 19:18

igorls force-pushed the chore/corpus-origin-merge-followup branch from fad81aa to 5e33592 Compare April 26, 2026 22:20

igorls merged commit 22f3d9b into develop Apr 26, 2026
6 checks passed
