feat: palace doctor for health diagnostics #436

Open
Nitrogonza9 wants to merge 2 commits into MemPalace:develop from Nitrogonza9:feat/palace-doctor

@Nitrogonza9

Summary

New mempalace doctor command that runs read-only diagnostic checks on palace health. Reports actionable issues with OK/WARN/ERROR severity — never modifies data.

Checks performed

Check             What it detects
palace_exists     Palace directory missing or unreadable
collection        ChromaDB collection empty or corrupted
metadata_wing     Drawers missing wing metadata
metadata_room     Drawers missing room metadata
metadata_source   Drawers missing source_file metadata
wings             Wing/room distribution stats
tiny_rooms        Rooms with only 1 drawer (fragmentation)
small_drawers     Drawers < 20 chars (likely noise)
large_drawers     Drawers > 50K chars (may hurt search)
kg_exists         KG missing (suggests extract-kg)
kg_entities       Empty KG or KG stats
kg_orphans        Entities with no relationships
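
As an illustration only, one of the metadata checks above could be shaped roughly like this. The names `Severity`, `CheckResult`, and `check_metadata_field` are hypothetical and are not taken from the PR's doctor.py; the message wording follows the example output in this description.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """Severity levels named in the PR summary."""
    OK = "ok"
    WARN = "warn"
    ERROR = "error"


@dataclass(frozen=True)
class CheckResult:
    """Hypothetical record for one diagnostic check."""
    name: str
    severity: Severity
    message: str


def check_metadata_field(metadatas, field, check_name):
    """Count drawers lacking `field`; reads the input and never mutates it."""
    total = len(metadatas)
    # A drawer counts as missing the field if its metadata is None,
    # lacks the key, or holds an empty value.
    missing = sum(1 for meta in metadatas if not (meta or {}).get(field))
    if missing == 0:
        return CheckResult(check_name, Severity.OK,
                           f"All drawers have {field} metadata")
    return CheckResult(check_name, Severity.WARN,
                       f"{missing}/{total} drawers missing '{field}' metadata")
```

Returning a plain record per check keeps the checks independent of how the CLI or the MCP tool chooses to present them.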

Integration

  • CLI: mempalace doctor
  • MCP tool: mempalace_doctor — AI agents can self-diagnose palace issues

Example output

  [+] palace_exists: Palace found at /home/user/.mempalace/palace
  [+] collection: Collection healthy: 150 drawers
  [+] metadata_wing: All drawers have wing metadata
  [!] metadata_source: 3/150 drawers missing 'source_file' metadata
  [+] wings: 3 wings found
      project(85), notes(42), personal(23)
  [!] kg_exists: No knowledge graph found — run extract-kg to auto-populate

  Summary: Healthy with 2 warnings
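
A rough sketch of how results could be rendered into the format shown above. The `[+]` and `[!]` markers and the "Healthy with N warnings" wording come from the example output; the `[x]` marker, the error summary wording, and the `render_report` name are assumptions made for illustration.

```python
# "[x]" for errors is an assumption; the example output shows only [+] and [!].
MARKERS = {"ok": "[+]", "warn": "[!]", "error": "[x]"}


def render_report(results):
    """Format (name, severity, message) tuples as a doctor-style report."""
    lines = [f"  {MARKERS[sev]} {name}: {msg}" for name, sev, msg in results]
    warns = sum(1 for _, sev, _ in results if sev == "warn")
    errors = sum(1 for _, sev, _ in results if sev == "error")
    if errors:
        # Assumed wording: the PR does not show an error summary.
        summary = f"{errors} error(s), {warns} warning(s)"
    elif warns:
        summary = f"Healthy with {warns} warning" + ("" if warns == 1 else "s")
    else:
        summary = "Healthy"
    return "\n".join(lines + ["", f"  Summary: {summary}"])


if __name__ == "__main__":
    print(render_report([
        ("palace_exists", "ok", "Palace found at /home/user/.mempalace/palace"),
        ("metadata_source", "warn", "3/150 drawers missing 'source_file' metadata"),
        ("kg_exists", "warn", "No knowledge graph found"),
    ]))
```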

Test plan

  • pytest tests/test_doctor.py -v — 14 tests pass
  • pytest tests/ -v — full suite 548 passed, 0 failed
  • ruff check — no lint errors
  • No new dependencies
  • Read-only verified (never modifies palace data)

🤖 Generated with Claude Code

New doctor.py module with read-only diagnostic checks:
- Palace existence and ChromaDB collection health
- Metadata completeness (missing wing, room, source_file)
- Wing/room distribution and tiny-room detection
- Drawer size warnings (very small or very large content)
- Knowledge graph health (empty KG, orphaned entities)

Reports actionable recommendations with OK/WARN/ERROR severity.
Never modifies data — safe to run anytime.

New CLI command: mempalace doctor
New MCP tool: mempalace_doctor

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

@web3guru888 left a comment

Nice companion to your #433 and #434 — building out the MemPalace tooling suite. The doctor command fills a real gap; we've built similar ad-hoc checks in our integration and having them standardized is valuable.

Feedback on the checks:

  1. Good coverage: palace existence, collection health, metadata completeness, wing/room distribution, drawer sizes, KG health + orphan detection. This covers the main failure modes we've seen.

  2. Pagination in metadata checks — I notice the doctor walks the collection with limit=500, offset=... for metadata and drawer size checks separately. That means two full scans of the collection. For large palaces (we have 710+ entities, some users have 39k+ drawers per #427), consider a single-pass approach that collects all stats at once, or use the iter_all_metadatas helper from #427 if that lands first.

  3. Tiny room threshold — Flagging rooms with 1 drawer as fragmentation is a good heuristic. In our integration we intentionally create single-drawer rooms for high-value discoveries, so this would be a false positive. Consider making the threshold configurable or adding a note that single-drawer rooms can be intentional.

  4. Missing check: duplicate content — The PR description mentions "duplicate content detection" but I don't see it in the actual checks. Our tiered dedup (hard threshold 0.86, soft 0.55) catches 3x more duplicates than a single threshold. Might be a good addition for a follow-up.

  5. MCP tool exposure — Exposing doctor via MCP (mempalace_doctor) is a smart move. AI agents can self-diagnose palace issues before search quality degrades.

14 tests with good coverage of edge cases. Clean implementation.

🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.

@Nitrogonza9
Author

Thanks @web3guru888 — third solid review, appreciate the consistency.

Single-pass collection scan — fixing now. You're right, two full scans is wasteful. I'll merge the metadata and drawer size checks into one pass that collects all stats simultaneously.
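
For readers following along, a single-pass scan could look roughly like the sketch below. It assumes a collection object whose `get(limit=..., offset=..., include=[...])` returns a dict with `ids`, `documents`, and `metadatas`, as ChromaDB's `Collection.get` does. The function name `scan_drawers` and the shape of the stats dict are hypothetical; the thresholds (500 per page, 20 chars, 50K chars) are the ones mentioned in this thread.

```python
from collections import Counter

PAGE_SIZE = 500        # page size mentioned in the review
SMALL_CHARS = 20       # drawers shorter than this are likely noise
LARGE_CHARS = 50_000   # drawers longer than this may hurt search


def scan_drawers(collection, page_size=PAGE_SIZE):
    """Walk the collection once and gather all drawer stats (read-only)."""
    stats = {"total": 0, "missing": Counter(), "rooms": Counter(),
             "small": 0, "large": 0}
    offset = 0
    while True:
        page = collection.get(limit=page_size, offset=offset,
                              include=["documents", "metadatas"])
        ids = page["ids"]
        if not ids:
            break
        docs = page.get("documents") or [None] * len(ids)
        metas = page.get("metadatas") or [None] * len(ids)
        for doc, meta in zip(docs, metas):
            meta = meta or {}
            stats["total"] += 1
            # Metadata completeness, counted per field.
            for field in ("wing", "room", "source_file"):
                if not meta.get(field):
                    stats["missing"][field] += 1
            # Wing/room distribution, keyed by (wing, room).
            if meta.get("wing") and meta.get("room"):
                stats["rooms"][(meta["wing"], meta["room"])] += 1
            # Drawer size buckets.
            size = len(doc or "")
            if size < SMALL_CHARS:
                stats["small"] += 1
            elif size > LARGE_CHARS:
                stats["large"] += 1
        offset += len(ids)
    return stats
```

Each page is fetched once and feeds every counter, so the cost is one walk of the collection regardless of how many checks consume the stats.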

Tiny room threshold — making it a note, not a hard warning. Good point about intentional single-drawer rooms for high-value discoveries. I'll change the message to clarify it's informational, not necessarily a problem.
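
One way the informational variant could be sketched, with the threshold exposed as a parameter as the review suggested. `tiny_rooms_note` and `max_drawers` are hypothetical names, and the message wording is an assumption rather than the PR's actual text.

```python
def tiny_rooms_note(room_counts, max_drawers=1):
    """Return an informational (severity, message) pair about small rooms.

    room_counts maps a room name to its drawer count. Severity stays "ok"
    because single-drawer rooms can be intentional.
    """
    tiny = sorted(room for room, count in room_counts.items()
                  if count <= max_drawers)
    if not tiny:
        return ("ok", "No tiny rooms found")
    return ("ok",
            f"{len(tiny)} room(s) with <= {max_drawers} drawer(s), "
            f"which may be intentional: {', '.join(tiny)}")
```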

Duplicate content detection — noted for follow-up. Honest oversight — the PR description mentioned it but I didn't implement it. I'll remove it from the description to avoid confusion. Your tiered dedup approach (0.86 hard / 0.55 soft) would be the right way to do this, but it needs embedding similarity which is a separate effort.
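
To make the tiered idea concrete, here is a minimal sketch that classifies a pair of embedding vectors against the two thresholds quoted above. How the reviewer's integration actually applies them is not described in this thread, so the three labels, the use of cosine similarity, and the function names are all assumptions.

```python
import math

HARD_THRESHOLD = 0.86  # at or above: treated as a duplicate
SOFT_THRESHOLD = 0.55  # at or above: flagged for a closer look


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_pair(a, b):
    """Label a pair of embeddings using the two-tier thresholds."""
    similarity = cosine_similarity(a, b)
    if similarity >= HARD_THRESHOLD:
        return "duplicate"
    if similarity >= SOFT_THRESHOLD:
        return "possible_duplicate"
    return "distinct"
```

The soft tier only flags candidates; deciding what to do with them (a second check, manual review) is left open here.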

Pushing fixes now.

— Gonzalo

Address review feedback from @web3guru888:

- Merge three separate collection scans (metadata, wings/rooms, sizes)
  into a single _check_drawers() pass. Reads documents + metadatas once
  instead of walking the collection three times.
- Change tiny_rooms from WARN to OK with informational message — single-
  drawer rooms can be intentional for high-value discoveries.
- Remove "duplicate content detection" from scope (not implemented,
  needs embedding similarity — separate PR).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@bensig changed the base branch from main to develop on April 11, 2026, 22:22
@igorls added the area/cli (CLI commands), area/mcp (MCP server and tools), and enhancement (New feature or request) labels on Apr 14, 2026
3 participants