feat: offline fact checker against entity registry + knowledge graph by igorls · Pull Request #792 · MemPalace/mempalace

igorls · 2026-04-13T10:48:45Z

Summary

New offline module mempalace/fact_checker.py — verifies arbitrary text against the local entity registry and knowledge graph for contradictions. Authored by Milla (MSL); extracted from a bundled commit.

Stacked on #791 (full chain: #784 → #788 → #789 → #790 → #791).

What this adds

mempalace/fact_checker.py — single new module, 177 lines:

from mempalace.fact_checker import check_text
issues = check_text("Bob is Alice's brother", palace_path)
# → [{"type": "relationship_mismatch", "detail": "KG says Bob is Alice's husband"}]

Detects three issue classes:

Entity-name confusion: similar names mentioned in text that match different entities in the registry (edit distance 1–2)
Relationship mismatches: text claims A→relation→B, but KG has A→different-relation→B
Stale facts: KG has valid_from / valid_to — facts outside that window are flagged
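
As a rough illustration of the stale-fact class, the sketch below checks whether a date falls inside a fact's validity window. Only the `valid_from` / `valid_to` field names come from this PR. The `Fact` container, the `is_stale` helper, and the use of ISO date strings are assumptions made for the example, not the module's actual code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    # Hypothetical container; only valid_from / valid_to are named in the PR.
    subject: str
    predicate: str
    object: str
    valid_from: Optional[str] = None  # ISO date string; None means open start
    valid_to: Optional[str] = None    # ISO date string; None means still valid


def is_stale(fact: Fact, today: date) -> bool:
    """Return True when `today` falls outside the fact's validity window."""
    if fact.valid_from and date.fromisoformat(fact.valid_from) > today:
        return True  # window has not opened yet
    if fact.valid_to and date.fromisoformat(fact.valid_to) < today:
        return True  # window already closed
    return False
```

A fact with neither bound set is treated as always valid in this sketch.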

Also runnable as a CLI: python -m mempalace.fact_checker "text" --palace ~/.mempalace/palace

Why offline-only — verified

Aligned with CLAUDE.md's "local-first, zero API" principle. Verified by grep:

$ grep -iE "anthropic|openai|request\.|urllib|http|api_key" fact_checker.py
(no matches)

Only I/O is:

Read ~/.mempalace/known_entities.json (local)
Query the KnowledgeGraph (SQLite, local)
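
The grep check above matches on text, so it can also hit comments or miss aliased imports. A complementary check, sketched here under assumptions, walks the module's AST and reports imports from a deny-list. The `NETWORK_MODULES` set and the `network_imports` helper are hypothetical and not part of this PR.

```python
import ast

# Assumed deny-list for the example; extend as needed.
NETWORK_MODULES = {"urllib", "http", "socket", "requests", "httpx", "openai", "anthropic"}


def network_imports(source: str) -> set[str]:
    """Return the top-level names of imported modules that are on the deny-list."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative "from . import x" has no module
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in NETWORK_MODULES:
                found.add(root)
    return found
```

Running it over the contents of fact_checker.py would be expected to return an empty set if the grep result holds.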

Relationship to bundled commit

Original commit 935f657 bundled three features:

drawer-grep search → feat(search): drawer-grep returns best-matching chunk + neighbors #791
closet_llm.py → deferred pending LLM_ENDPOINT refactor
fact_checker.py → this PR

Authorship preserved via --author="MSL <...>".

Test plan

Module imports cleanly (from mempalace.fact_checker import check_text)
Smoke-tested on a trivial input — returns [] as expected for empty palace
Full suite: 708/708 pass (2 version-consistency tests deselected — pre-existing develop bug)
No unit tests yet for this module — the original branch didn't include them. Reviewer may want to land minimum coverage as a follow-up.

Callouts for reviewers

No tests in this PR — open question whether to block on them. 177 lines of new code warrants a few unit tests for the three detection paths.
Depends on known_entities.json — format not yet documented. Check entity_registry.py for the schema.
No CLI help text — python -m mempalace.fact_checker has a docstring but no argparse --help. Minor polish.
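
For the missing help text, one possible shape is a small argparse parser. The positional text argument and the `--palace` flag mirror the usage line in this PR. The help strings, the default path, and the `build_parser` name are illustrative assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Argument names follow the usage line in this PR; everything else
    # (help strings, default path) is an assumption for illustration.
    parser = argparse.ArgumentParser(
        prog="python -m mempalace.fact_checker",
        description="Check text against the local entity registry and knowledge graph.",
    )
    parser.add_argument("text", help="text to check for contradictions")
    parser.add_argument(
        "--palace",
        default="~/.mempalace/palace",
        help="path to the palace directory (default: %(default)s)",
    )
    return parser
```

argparse adds `-h` / `--help` automatically, so this alone would cover the polish item.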

fact_checker.py verifies text for contradictions against locally stored entities and KG facts. Catches similar-name confusion (Bob vs Bobby), relationship mismatches (KG says husband, text says brother), and stale facts (KG valid_from/valid_to). No hardcoded facts. No network calls.

Reads:
- ~/.mempalace/known_entities.json
- KnowledgeGraph SQLite

Usage:
    from mempalace.fact_checker import check_text
    issues = check_text("Bob is Alice's brother", palace_path)

    # CLI
    python -m mempalace.fact_checker "text" --palace ~/.mempalace/palace

Extracted from Milla's commit 935f657 which bundled this with closet_llm (deferred) and drawer-grep (PR #791). Ported only fact_checker.py; verified no network / API imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…act-checker

Merges the full hardened stack (up through #791 drawer-grep) and turns fact_checker from "dead code hidden behind bare except" into an actually-working offline contradiction detector with tests.

## Dead paths the PR body advertised but the code never executed

Both buried by a single outer ``except Exception: pass``:

* ``kg.query(subject)``: ``KnowledgeGraph`` has no ``query()`` method; it has ``query_entity()``. The attribute error was silently swallowed and the entire KG branch always returned ``[]``. Now using ``kg.query_entity(subject, direction="outgoing")`` with proper handling of the ``predicate``/``object``/``current``/``valid_to`` fields the real API returns.
* ``KnowledgeGraph(palace_path=palace_path)``: the constructor's only kwarg is ``db_path``. Passing ``palace_path`` raised TypeError, silently swallowed. Now computing the db_path correctly from ``<palace>/knowledge_graph.sqlite3``, matching the convention the MCP server already uses.

## Contradiction logic rewritten

The previous ``if kg_pred in claim and fact.object not in claim`` only fired when text used the SAME predicate word as the KG fact, the exact opposite of the stated use case ("Bob is Alice's brother" when the KG says husband would NOT have fired). Replaced with a proper parse → lookup → compare pipeline:

* ``_extract_claims`` parses two surface forms ("X is Y's Z" and "X's Z is Y") into ``(subject, predicate, object)`` triples.
* ``_check_kg_contradictions`` pulls the subject's outgoing facts and flags two classes:
  - ``relationship_mismatch`` when a current KG fact matches the same ``(subject, object)`` pair but with a different predicate.
  - ``stale_fact`` when the exact triple exists but is ``valid_to``-closed in the past.
* Stale-fact detection is now implemented (the PR body claimed it; the old code silently didn't implement it).
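
The parse → lookup → compare pipeline described above can be sketched as follows. This is a minimal illustration, not the module's code: the regexes, the function names `extract_claims` and `relationship_mismatches`, the plain-dict fact shape, and the choice to normalize "X's Z is Y" to `(Y, Z, X)` are all assumptions. Only the two surface forms, the triple shape, and the `predicate` / `object` / `current` field names come from the commit message.

```python
import re

# Two surface forms named in the commit message; these exact patterns are
# an assumption, not the module's real regexes.
_FORM_A = re.compile(r"\b([A-Z]\w+) is ([A-Z]\w+)'s (\w+)")   # "X is Y's Z"
_FORM_B = re.compile(r"\b([A-Z]\w+)'s (\w+) is ([A-Z]\w+)")   # "X's Z is Y"


def extract_claims(text):
    """Parse text into (subject, predicate, object) triples."""
    claims = []
    for x, y, z in _FORM_A.findall(text):
        claims.append((x, z.lower(), y))
    for x, z, y in _FORM_B.findall(text):
        # Assumed normalization: "Alice's brother is Bob" -> (Bob, brother, Alice)
        claims.append((y, z.lower(), x))
    return claims


def relationship_mismatches(claims, facts_by_subject):
    """Flag claims whose (subject, object) pair exists under a different predicate.

    facts_by_subject maps a subject to a list of dicts with the keys
    "predicate", "object" and "current" (a stand-in for a KG lookup).
    """
    issues = []
    for subject, predicate, obj in claims:
        for fact in facts_by_subject.get(subject, []):
            if not fact.get("current", True):
                continue  # only current facts count as contradictions here
            if fact["object"] == obj and fact["predicate"] != predicate:
                issues.append({
                    "type": "relationship_mismatch",
                    "detail": f"KG says {subject} is {obj}'s {fact['predicate']}",
                })
    return issues
```

Comparing on the `(subject, object)` pair rather than on the predicate word is what lets a "brother" claim collide with a "husband" fact.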
## Performance fix: O(n²) → O(mentioned × n)

``_check_entity_confusion`` previously computed Levenshtein for every pair of registered names on every ``check_text`` call. For 1,000 registered names that's ~500K edit-distance calls per hook invocation. Now we first identify which registry names actually appear in the text (single regex scan), then only compute edit distance between mentioned and unmentioned names. Pinned by a test that asserts <200ms on a 500-name registry with zero mentions.

Also: when *both* similar names are mentioned in the text, we no longer flag them; the user clearly knows they're different people.

## Shared entity-registry loader

``mempalace/miner.py`` already had an mtime-cached loader for ``~/.mempalace/known_entities.json``. fact_checker had a duplicate implementation that leaked file handles and ignored caching. Extended miner's cache to expose both the flat set (``_load_known_entities``) and the raw category dict (``_load_known_entities_raw``); fact_checker now imports the latter. No more double disk reads, no more handle leak.

## Tests: 24 cases in tests/test_fact_checker.py

All three detection paths + both dead-code regressions:

* ``test_kg_init_uses_db_path_not_palace_path_kwarg``: pins the correct KG constructor signature so the ``palace_path=`` bug can't come back.
* ``test_relationship_mismatch_detected``: the headline example from the PR body now actually fires.
* ``test_stale_fact_detected``: valid_to-closed triple is flagged.
* ``test_current_fact_same_triple_is_not_flagged``: no false positive on a still-valid match.
* ``test_performance_bounded_by_mentioned_names``: 500-name registry, zero mentions, <200ms. Regression for the O(n²) blowup.
* ``test_no_false_positive_when_both_names_mentioned``: Mila and Milla in the same text is fine.
* Plus claim extraction, flatten_names shapes, CLI exit code, empty text handling, missing-palace graceful fallback, registry-dict shape support.

785/785 suite pass. ruff + format clean on CI-pinned 0.4.x.
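
The mentioned-first strategy from the performance fix could look roughly like the sketch below: one regex scan finds which registry names occur in the text, and edit distance is computed only between mentioned and unmentioned names. The function names and the 1 to 2 distance window (taken from the PR body) are the only anchors; the rest is an assumed implementation.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def confusable_names(text: str, registry: list[str], max_distance: int = 2):
    """Return (mentioned, similar_unmentioned) pairs within max_distance edits."""
    if not registry:
        return []
    # Single scan: which registry names actually occur in the text?
    alternation = "|".join(re.escape(n) for n in sorted(registry, key=len, reverse=True))
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    mentioned = set(pattern.findall(text))
    # Names that are both mentioned are never compared with each other.
    unmentioned = [name for name in registry if name not in mentioned]
    pairs = []
    for m in sorted(mentioned):
        for other in unmentioned:
            if 1 <= levenshtein(m, other) <= max_distance:
                pairs.append((m, other))
    return pairs
```

With zero mentions the inner loops never run, which is the case the 500-name regression test is described as pinning.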

Brings in PR #793 (optional LLM-based closet regeneration via user-configured OpenAI-compatible endpoint) and PR #795 (hybrid closet+drawer search: closets boost, never gate).

Stack: #784 → #788 → #789 → #790 → #791 → #792 → #793 (+ #795).

Findings hardened on our side
─────────────────────────────

1) closet_llm.regenerate_closets didn't use the blessed palace helpers.

Before:

* manual closets_col.get(where=...) + .delete(ids=...) with a silent ``except Exception: pass`` around both. If the purge failed, pre-existing regex closets survived alongside fresh LLM closets, giving the searcher double hits for the same source.
* ``source.split('/')[-1][:30]`` to build the closet_id: quietly wrong on Windows paths (``C:\\proj\\a.md`` has no ``/``, so the whole string ends up in the ID).
* no mine_lock around purge+upsert. A concurrent regex rebuild of the same source could interleave with our purge and leave a mix of regex and LLM pointers.
* no ``normalize_version`` stamp on the LLM closets. The miner's stale-version gate would treat them as leftovers from an older schema and rebuild over them on the next mine.

After: routes through ``purge_file_closets`` + ``mine_lock`` + ``os.path.basename`` + ``NORMALIZE_VERSION`` stamp. Regression tests cover each.

2) searcher.search_memories was still closet-first.

PR #795 merged into #793's head to fix the recall regression documented in that PR (R@1 0.25 on narrative content vs. 0.42 baseline). The hybrid design makes closets a ranking boost rather than a gate: drawers are always queried at the floor, and matching closet hits (rank 0-4 within CLOSET_DISTANCE_CAP=1.5) add a boost of 0.40/0.25/0.15/0.08/0.04 to the effective distance.
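
One way to read the hybrid ranking described above is sketched below. The cap value, the five boost values, and the ``"drawer+closet"`` label come from this commit message. Everything else is an assumption for illustration: the `rerank` name, the `(source, distance)` tuple shape, the ``"drawer"`` label for unboosted hits, and in particular the choice to subtract the boost from the drawer distance so that lower stays better.

```python
CLOSET_DISTANCE_CAP = 1.5
RANK_BOOSTS = (0.40, 0.25, 0.15, 0.08, 0.04)  # closet ranks 0..4


def rerank(drawer_hits, closet_hits):
    """Re-rank drawer hits using closet agreement as a boost, never as a gate.

    drawer_hits / closet_hits: lists of (source, distance), closets best first.
    Assumption: the boost is subtracted from the drawer distance.
    """
    boost_by_source = {}
    for rank, (source, distance) in enumerate(closet_hits[: len(RANK_BOOSTS)]):
        if distance <= CLOSET_DISTANCE_CAP:
            # Keep the best (earliest-rank) boost per source.
            boost_by_source.setdefault(source, RANK_BOOSTS[rank])
    results = []
    for source, distance in drawer_hits:  # every drawer hit survives
        boost = boost_by_source.get(source, 0.0)
        results.append({
            "source": source,
            "effective_distance": distance - boost,
            "matched_via": "drawer+closet" if boost else "drawer",
        })
    return sorted(results, key=lambda r: r["effective_distance"])
```

Because the loop iterates over drawer hits only, a source with no closet match still appears in the output, which is the "never gate" property.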
Merged to take the incoming hybrid design, with three cleanups:

* kept the ``_expand_with_neighbors`` / ``_extract_drawer_ids_from_closet`` helpers as separately-tested utilities (still imported by tests and future callers);
* replaced the fragile ``source_file.endswith(basename)`` reverse-lookup in the enrichment step with internal ``_source_file_full`` / ``_chunk_index`` fields stripped before return, so enrichment doesn't silently pick the wrong path when two sources share a basename across directories;
* drawer-grep enrichment now sorts by ``chunk_index`` before neighbor expansion, so ``best_idx ± 1`` corresponds to actual document order rather than whatever order Chroma returned.

3) Closet-first tests in test_closets.py (``TestSearchMemoriesClosetFirst``, end-to-end ``test_closet_first_search_includes_drawer_index_and_total``) pinned contracts that the hybrid path now violates (``matched_via`` went from ``"closet"`` to ``"drawer+closet"``). Rewrote them around the new invariant: direct drawers are always the floor, closet agreement flips the hit's matched_via and exposes closet_preview.

Verification
────────────

* 805/805 pass under ``uv run pytest tests/ -v --ignore=tests/benchmarks`` (13 new tests from PR #793 + 5 from PR #795 + 2 new regressions for the closet_llm hardening + the rewritten hybrid assertions in test_closets.py).
* CI-pinned ruff 0.4.x clean on ``mempalace/`` + ``tests/`` (check + format both pass).
* No new deps: closet_llm.py still uses stdlib ``urllib.request`` per the PR's "zero new dependencies" promise.

Co-Authored-By: MSL <[email protected]>
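
The chunk_index sort mentioned in the commit message above matters because a vector store may return chunks in any order. A minimal sketch of order-aware neighbor expansion follows; the `neighbors_in_document_order` name and the dict shape are hypothetical, and this is not the project's `_expand_with_neighbors` helper.

```python
def neighbors_in_document_order(chunks, best_chunk_index, radius=1):
    """Return the best chunk plus up to `radius` neighbors on each side.

    chunks: list of dicts with "chunk_index" and "text", in arbitrary order.
    """
    # Sort first so that position +/- 1 means "adjacent in the document".
    ordered = sorted(chunks, key=lambda c: c["chunk_index"])
    positions = {c["chunk_index"]: pos for pos, c in enumerate(ordered)}
    pos = positions[best_chunk_index]
    lo = max(0, pos - radius)
    hi = min(len(ordered), pos + radius + 1)
    return ordered[lo:hi]
```

Without the sort, slicing around the best hit would return whichever chunks happened to sit next to it in the result list.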

igorls requested review from bensig and milla-jovovich as code owners April 13, 2026 10:48

igorls mentioned this pull request Apr 13, 2026

feat: optional LLM-based closet regeneration — bring-your-own endpoint #793

Merged

3 tasks

igorls added the P0 critical label Apr 13, 2026

igorls assigned bensig and milla-jovovich Apr 13, 2026

igorls mentioned this pull request Apr 13, 2026

fix(search): hybrid closet+drawer retrieval — closets boost, never gate #795

Merged

3 tasks

igorls added 2 commits April 13, 2026 18:13

Merge remote-tracking branch 'origin/pr/drawer-grep-search' into pr/f…

339f96a

…act-checker

igorls merged commit 8833d4c into pr/drawer-grep-search Apr 13, 2026

igorls mentioned this pull request Apr 13, 2026

feat(v3.3): land Milla's stacked closet/BM25/KG/LLM chain (#784-#795) on develop #829

Merged

4 tasks

igorls merged 3 commits into pr/drawer-grep-search from pr/fact-checker
