Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Restores provenance/auditability for corpus-origin merged evidence by tier and strengthens the “confidence comes from heuristic” contract so future refactors can’t silently reintroduce Tier-2 confidence leakage.
Changes:
- Prefix merged evidence lines with tier provenance (
Tier-1 heuristic: …/Tier-2 LLM: …) in_run_pass_zero, idempotently. - Add/extend integration test assertions to pin that merged
confidencematches the heuristic (and that merged evidence is fully tier-prefixed).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
mempalace/cli.py |
Prefixes Tier-1/Tier-2 evidence lines during merge to preserve provenance in origin.json. |
tests/test_corpus_origin_integration.py |
Adds assertions and a new test to ensure merged confidence always matches heuristic output and evidence lines are tier-tagged. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
igorls
added a commit
that referenced
this pull request
Apr 26, 2026
- cli.py: stringify each evidence entry exactly once before the startswith check (was calling str(e) twice per element). - tests: replace brittle `confidence != 0.90` assertion with an equality check against detect_origin_heuristic on the same samples. The original would have spuriously fired if the heuristic ever legitimately produced 0.90 for these samples; the new form pins the contract directly.
…urce contract Two follow-ups to PR #1221's merge-fields behavior, both raised by the Copilot review on that PR: - Evidence merge now prefixes each entry with `Tier-1 heuristic: ` or `Tier-2 LLM: ` so the on-disk `origin.json` audit record retains tier provenance. The pre-#1221 code labeled heuristic evidence; the merge-fields refactor flattened that. Re-prefixing is idempotent. - Tests now assert that the merged `confidence` is the heuristic's, not the LLM's. Added inline assertions to the two existing contradiction/disagreement tests, plus a dedicated `test_merge_tier_fields_confidence_matches_heuristic_call` that compares to `detect_origin_heuristic` directly so a future regression letting Tier 2 confidence leak through cannot pass silently. Tests: 1378 pass. Ruff check + format both clean (CI-pinned 0.4.x).
- cli.py: stringify each evidence entry exactly once before the startswith check (was calling str(e) twice per element). - tests: replace brittle `confidence != 0.90` assertion with an equality check against detect_origin_heuristic on the same samples. The original would have spuriously fired if the heuristic ever legitimately produced 0.90 for these samples; the new form pins the contract directly.
fad81aa to
5e33592
Compare
jphein
added a commit
to jphein/mempalace
that referenced
this pull request
Apr 26, 2026
Brings in upstream's corpus-origin + privacy-warning track (PRs MemPalace#1211 MemPalace#1221 MemPalace#1223 MemPalace#1224 MemPalace#1225) plus the canonical merged versions of our four PRs that landed today (21:22-21:41 UTC): MemPalace#1173 quarantine_stale_hnsw on make_client + cold-start gate + integrity sniff-test (PROTO/STOP byte check, no deserialization) MemPalace#1177 .blob_seq_ids_migrated marker guard, closes MemPalace#1090 MemPalace#1198 _tokenize None-document guard in BM25 reranker MemPalace#1201 palace_graph.build_graph skips None metadata Conflict resolution: * mempalace/backends/chroma.py — took upstream as base (it has the igorls-review pickle-protocol docstring, thread-safety paragraph, and Path(marker).touch() style nit), then re-applied MemPalace#1094's _coerce_none_metas in query()/get() since MemPalace#1094 is still open and not yet in develop. * mempalace/mcp_server.py — took upstream's clean form. Dropped the fork-only `palace_path=` kwarg from four ChromaCollection() call sites: the kwarg was load-bearing for MemPalace#1171's per-collection write lock, but MemPalace#1171 closed in favor of MemPalace#976's mine_global_lock + daemon-strict, so the kwarg has no remaining consumer. ChromaCollection.__init__ in upstream/develop is back to (self, collection); calling it with palace_path= raised TypeError → silently swallowed by the broad except in _get_collection() → returned None → tool_status() returned _no_palace(). 41 mcp_server tests went from failing-with-KeyError to passing. * mempalace/cli.py — dropped fork-only `workers=args.workers` from the cmd_mine -> miner.mine() call. Pre-existing fork-side bug: the `--workers` argparse arg landed in 5cd14bd but miner.mine() never accepted a workers param, so production `mempalace mine` TypeError'd on every invocation. Removed the broken plumbing; tests/test_cli.py updated to match. * CHANGELOG.md — took upstream verbatim. Fork-specific changelog lives in FORK_CHANGELOG.md (canonical: docs/fork-changes.yaml). * CLAUDE.md — kept ours. Fork's CLAUDE.md is operational; upstream's added a "Design Principles / Contributing" charter, which lives in README.md on the fork. * tests/test_backends.py — took upstream's ruff-formatted line widths. docs/fork-changes.yaml flips the two MemPalace#1173 entries (hnsw-integrity-gate, hnsw-cold-start-gate) and the MemPalace#1201 entry (palace-graph-none-guard) from OPEN to MERGED 2026-04-26. MemPalace#1173 MemPalace#1177 MemPalace#1198 MemPalace#1201 added to the merged_upstream archive at the bottom. FORK_CHANGELOG.md regenerated. scripts/check-docs.sh: 4/4 clean. Test suite: 1460/1460. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
igorls
added a commit
that referenced
this pull request
Apr 27, 2026
cmd_init now invokes ``_run_pass_zero`` unconditionally (#1221, #1223 landed on develop after this PR's branch point). The pass reads sample content via ``builtins.open``; with that mocked to MagicMock, the downstream ``"\\n\\n".join(samples)`` in ``corpus_origin.detect_origin_heuristic`` raises ``TypeError: expected str instance, MagicMock found``. This test only cares about the wing-slug write to the registry, so stub the pass-zero call directly rather than try to satisfy its full sample-gathering contract.
lealvona
pushed a commit
to lealvona/mempalace
that referenced
this pull request
Apr 29, 2026
- cli.py: stringify each evidence entry exactly once before the startswith check (was calling str(e) twice per element). - tests: replace brittle `confidence != 0.90` assertion with an equality check against detect_origin_heuristic on the same samples. The original would have spuriously fired if the heuristic ever legitimately produced 0.90 for these samples; the new form pins the contract directly.
lealvona
pushed a commit
to lealvona/mempalace
that referenced
this pull request
Apr 29, 2026
cmd_init now invokes ``_run_pass_zero`` unconditionally (MemPalace#1221, MemPalace#1223 landed on develop after this PR's branch point). The pass reads sample content via ``builtins.open``; with that mocked to MagicMock, the downstream ``"\\n\\n".join(samples)`` in ``corpus_origin.detect_origin_heuristic`` raises ``TypeError: expected str instance, MagicMock found``. This test only cares about the wing-slug write to the registry, so stub the pass-zero call directly rather than try to satisfy its full sample-gathering contract.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up to #1221 addressing both points from the Copilot review on that PR.
Changes
1. Tier provenance in merged evidence (
mempalace/cli.py).The pre-#1221 code prefixed heuristic evidence as
Tier-1 heuristic: …when concatenated with LLM evidence. The merge-fields refactor in #1221 flattened that, so the on-diskorigin.jsoncould no longer tell which tier produced which signal line. This restores the tier prefix on both sides —Tier-1 heuristic: …andTier-2 LLM: …— applied idempotently so already-tagged entries pass through unchanged.2. Pin the confidence-source contract (
tests/test_corpus_origin_integration.py).#1221's contract says heuristic owns
likely_ai_dialogueANDconfidence. The bool was asserted; the confidence wasn't. A future change letting Tier 2's confidence leak through could pass all four existing tests silently.Two inline assertions plus one dedicated test:
test_merge_tier_fields_heuristic_yes_llm_no_keeps_heuristic_bool— asserts merged confidence is not the mocked LLM's0.90test_merge_tier_fields_heuristic_no_no_personas_leak— asserts merged confidence is the heuristic's narrative-branch0.9, not the mocked LLM's0.95test_merge_tier_fields_confidence_matches_heuristic_call— callsdetect_origin_heuristicdirectly on the same samples and pins equality, so any source other than the heuristic produces a visible mismatchPlus an evidence-prefix assertion in
test_merge_tier_fields_heuristic_yes_llm_yes_combines_evidencecovering the new tagging behavior.Verification
ruff check .cleanruff format --check .clean (CI-pinned 0.4.x)Checklist