feat: palace doctor for health diagnostics #436
Nitrogonza9 wants to merge 2 commits into MemPalace:develop
New doctor.py module with read-only diagnostic checks:
- Palace existence and ChromaDB collection health
- Metadata completeness (missing wing, room, source_file)
- Wing/room distribution and tiny-room detection
- Drawer size warnings (very small or very large content)
- Knowledge graph health (empty KG, orphaned entities)

Reports actionable recommendations with OK/WARN/ERROR severity. Never modifies data, so it is safe to run anytime.

New CLI command: mempalace doctor
New MCP tool: mempalace_doctor

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
web3guru888 left a comment
Nice companion to your #433 and #434 — building out the MemPalace tooling suite. The doctor command fills a real gap; we've built similar ad-hoc checks in our integration and having them standardized is valuable.
Feedback on the checks:
- Good coverage: palace existence, collection health, metadata completeness, wing/room distribution, drawer sizes, KG health + orphan detection. This covers the main failure modes we've seen.
- Pagination in metadata checks: I notice the doctor walks the collection with `limit=500, offset=...` for metadata and drawer size checks separately. That means two full scans of the collection. For large palaces (we have 710+ entities, some users have 39k+ drawers per #427), consider a single-pass approach that collects all stats at once, or use the `iter_all_metadatas` helper from #427 if that lands first.
- Tiny room threshold: Flagging rooms with 1 drawer as fragmentation is a good heuristic. In our integration we intentionally create single-drawer rooms for high-value discoveries, so this would be a false positive. Consider making the threshold configurable or adding a note that single-drawer rooms can be intentional.
- Missing check (duplicate content): The PR description mentions "duplicate content detection" but I don't see it in the actual checks. Our tiered dedup (hard threshold 0.86, soft 0.55) catches 3x more duplicates than a single threshold. Might be a good addition for a follow-up.
- MCP tool exposure: Exposing doctor via MCP (`mempalace_doctor`) is a smart move. AI agents can self-diagnose palace issues before search quality degrades.
14 tests with good coverage of edge cases. Clean implementation.
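To make the tiny-room suggestion concrete, here is a minimal sketch of a configurable threshold. The function name `find_tiny_rooms`, the `max_drawers` parameter, and the sample metadata are hypothetical and are not taken from the PR; the sketch only assumes that each drawer carries `wing` and `room` metadata keys, as the checks above imply.

```python
from collections import Counter

def find_tiny_rooms(metadatas, max_drawers=1):
    """Return {(wing, room): drawer_count} for rooms with at most max_drawers drawers.

    Hypothetical helper: the name and parameter are illustrative, not the PR's API.
    """
    counts = Counter(
        (meta.get("wing", "?"), meta.get("room", "?")) for meta in metadatas
    )
    return {key: n for key, n in counts.items() if n <= max_drawers}

# Made-up drawers: two in one room, one in an intentionally small room.
metadatas = [
    {"wing": "research", "room": "papers"},
    {"wing": "research", "room": "papers"},
    {"wing": "research", "room": "breakthrough"},
]

tiny = find_tiny_rooms(metadatas)                   # default: flag single-drawer rooms
none_flagged = find_tiny_rooms(metadatas, max_drawers=0)  # effectively disables the check
```

Setting `max_drawers=0` turns the check off for integrations that create single-drawer rooms on purpose, while the default keeps the current heuristic.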
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
Thanks @web3guru888, third solid review, appreciate the consistency.

Single-pass collection scan: fixing now. You're right, two full scans is wasteful. I'll merge the metadata and drawer size checks into one pass that collects all stats simultaneously.

Tiny room threshold: making it a note, not a hard warning. Good point about intentional single-drawer rooms for high-value discoveries. I'll change the message to clarify it's informational, not necessarily a problem.

Duplicate content detection: noted for follow-up. Honest oversight: the PR description mentioned it but I didn't implement it. I'll remove it from the description to avoid confusion. Your tiered dedup approach (0.86 hard / 0.55 soft) would be the right way to do this, but it needs embedding similarity, which is a separate effort.

Pushing fixes now.

Gonzalo
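For the follow-up, a two-tier duplicate check might be sketched as below. This is an illustration under stated assumptions, not the reviewer's implementation: it assumes the 0.86 and 0.55 figures are cosine-similarity cutoffs on embeddings, that scores at or above the hard threshold mean "duplicate", and that scores between the two mean "possible duplicate". The function names and the toy vectors are hypothetical.

```python
import math

# Threshold values quoted in the review; their interpretation as
# cosine-similarity cutoffs is an assumption of this sketch.
HARD_THRESHOLD = 0.86
SOFT_THRESHOLD = 0.55

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify_pair(a, b, hard=HARD_THRESHOLD, soft=SOFT_THRESHOLD):
    """Label a pair of embeddings as duplicate, possible_duplicate, or distinct."""
    sim = cosine_similarity(a, b)
    if sim >= hard:
        return "duplicate"
    if sim >= soft:
        return "possible_duplicate"
    return "distinct"

# Toy 2-D vectors standing in for real embeddings.
same = classify_pair([1.0, 0.0], [1.0, 0.0])
close = classify_pair([1.0, 0.0], [1.0, 1.0])
far = classify_pair([1.0, 0.0], [0.0, 1.0])
```

In a doctor check, the soft tier would more naturally map to an informational or WARN result than to an ERROR, since those pairs still need a human or agent to confirm.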
Address review feedback from @web3guru888:
- Merge three separate collection scans (metadata, wings/rooms, sizes) into a single _check_drawers() pass. Reads documents + metadatas once instead of walking the collection three times.
- Change tiny_rooms from WARN to OK with informational message; single-drawer rooms can be intentional for high-value discoveries.
- Remove "duplicate content detection" from scope (not implemented, needs embedding similarity; separate PR).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
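The single-pass idea in this commit can be sketched roughly as follows. This is not the PR's `_check_drawers()`; it is an illustrative version that assumes a collection object whose `get(limit, offset, include)` call returns a dict with `documents` and `metadatas` lists, which is the shape ChromaDB's `Collection.get` uses. `FakeCollection`, the size cutoffs `small` and `large`, and the sample data are hypothetical, included so the example runs without a database.

```python
from collections import Counter

class FakeCollection:
    """In-memory stand-in shaped like a ChromaDB collection's get() call (hypothetical)."""
    def __init__(self, docs, metas):
        self._docs, self._metas = docs, metas

    def get(self, limit, offset, include):
        end = offset + limit
        return {
            "ids": [str(i) for i in range(offset, min(end, len(self._docs)))],
            "documents": self._docs[offset:end],
            "metadatas": self._metas[offset:end],
        }

def check_drawers(collection, batch_size=500, small=20, large=10_000):
    """Walk the collection once, gathering metadata, room, and size stats together."""
    stats = {"total": 0, "missing": Counter(), "rooms": Counter(), "small": 0, "large": 0}
    offset = 0
    while True:
        batch = collection.get(
            limit=batch_size, offset=offset, include=["documents", "metadatas"]
        )
        docs = batch["documents"]
        if not docs:
            break  # past the last page
        for doc, meta in zip(docs, batch["metadatas"]):
            meta = meta or {}
            stats["total"] += 1
            for field in ("wing", "room", "source_file"):
                if not meta.get(field):
                    stats["missing"][field] += 1
            stats["rooms"][(meta.get("wing"), meta.get("room"))] += 1
            size = len(doc or "")
            if size < small:
                stats["small"] += 1
            elif size > large:
                stats["large"] += 1
        offset += len(docs)
    return stats

collection = FakeCollection(
    ["x" * 100, "tiny", "y" * 50],
    [
        {"wing": "a", "room": "r1", "source_file": "f.md"},
        {"wing": "a", "room": "r1"},
        {"wing": "b"},
    ],
)
stats = check_drawers(collection, batch_size=2)
```

Each page is fetched once and every check reads from the same batch, so the cost stays at one scan no matter how many per-drawer checks are added later.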
Summary
New `mempalace doctor` command that runs read-only diagnostic checks on palace health. Reports actionable issues with OK/WARN/ERROR severity and never modifies data.

Checks performed
- `palace_exists`
- `collection`
- `metadata_wing`
- `metadata_room`
- `metadata_source`
- `wings`
- `tiny_rooms`
- `small_drawers`
- `large_drawers`
- `kg_exists`
- `kg_entities`
- `kg_orphans`

Integration
- CLI command: `mempalace doctor`
- MCP tool: `mempalace_doctor` (AI agents can self-diagnose palace issues)

Example output

Test plan
- `pytest tests/test_doctor.py -v`: 14 tests pass
- `pytest tests/ -v`: full suite, 548 passed, 0 failed
- `ruff check`: no lint errors

🤖 Generated with Claude Code