feat: new MCP tools — get/list/update drawer, hook settings, export#635
feat: new MCP tools — get/list/update drawer, hook settings, export#635jphein wants to merge 5 commits intoMemPalace:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds new MCP server tools for drawer retrieval/listing/updating and hook configuration, plus supporting infrastructure for exporting palaces and capturing tool activity in normalized transcripts.
Changes:
- Added MCP tools: get/list/update drawer, hook settings, and “memories filed away” acknowledgment.
- Enhanced transcript normalization to include
tool_use/tool_resultblocks with truncation/omission rules. - Added a markdown exporter that streams drawers into a browsable wing/room folder structure.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_normalize.py | Expands normalization test coverage for tool blocks and updated content-joining behavior. |
| tests/test_mcp_server.py | Adds MCP protocol robustness tests plus new tool tests for drawer APIs and search backwards-compat. |
| tests/test_exporter.py | Introduces end-to-end tests for the new markdown exporter output and structure. |
| mempalace/searcher.py | Refactors where-filter building, adds max_distance filtering + distance field in results. |
| mempalace/normalize.py | Implements tool block extraction/formatting and Claude Code JSONL tool integration logic. |
| mempalace/mcp_server.py | Adds new MCP tools, WAL rotation, client/metadata caching, protocol robustness, and drawer CRUD handlers. |
| mempalace/layers.py | Reuses shared where-filter helper and caps L1 scanning to bound work/memory. |
| mempalace/exporter.py | New module exporting palace drawers into markdown files organized by wing/room. |
| mempalace/config.py | Adds hook settings getters and persistence via set_hook_setting(). |
Comments suppressed due to low confidence (1)
mempalace/mcp_server.py:1271
- For JSON-RPC notifications (requests without an
id), the server must not send any response.tools/listandtools/callcurrently return a response even whenreq_idisNonebecause thereq_id is Noneguard is only applied after those branches. Add an earlyif req_id is None: return None(or gate each response branch) to fully comply with the spec.
elif method == "tools/list":
return {
"jsonrpc": "2.0",
"id": req_id,
"result": {
"tools": [
{"name": n, "description": t["description"], "inputSchema": t["input_schema"]}
for n, t in TOOLS.items()
]
},
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if msg_type == "assistant" and isinstance(msg_content, list): | ||
| for block in msg_content: | ||
| if isinstance(block, dict) and block.get("type") == "tool_use": | ||
| tool_use_map[block.get("id", "")] = block.get("name", "Unknown") |
There was a problem hiding this comment.
When building tool_use_map, using block.get("id", "") as the dict key will insert an entry for the empty-string ID if the JSONL is missing id. That can cause unrelated tool_result blocks (also missing/empty tool_use_id) to be attributed to the wrong tool. Only record mappings when the tool_use id is a non-empty string (and consider skipping/ignoring malformed tool blocks).
| tool_use_map[block.get("id", "")] = block.get("name", "Unknown") | |
| tool_use_id = block.get("id") | |
| if isinstance(tool_use_id, str) and tool_use_id: | |
| tool_use_map[tool_use_id] = block.get("name", "Unknown") |
There was a problem hiding this comment.
Good catch — will skip entries where id is empty or missing.
| try: | ||
| with open(self._config_file, "w", encoding="utf-8") as f: | ||
| json.dump(self._file_config, f, indent=2, ensure_ascii=False) | ||
| except OSError: | ||
| pass |
There was a problem hiding this comment.
set_hook_setting() writes directly to self._config_file but doesn’t ensure self._config_dir exists. On a first-run system (no ~/.mempalace/ yet), the write will fail and be silently swallowed, so mempalace_hook_settings can report success without persisting changes. Ensure the config directory exists (mkdir) and consider surfacing write failures to the caller.
| try: | |
| with open(self._config_file, "w", encoding="utf-8") as f: | |
| json.dump(self._file_config, f, indent=2, ensure_ascii=False) | |
| except OSError: | |
| pass | |
| self._config_dir.mkdir(parents=True, exist_ok=True) | |
| with open(self._config_file, "w", encoding="utf-8") as f: | |
| json.dump(self._file_config, f, indent=2, ensure_ascii=False) |
There was a problem hiding this comment.
Duplicate of the same comment on PR #633 — will fix.
| def tool_update_drawer(drawer_id: str, content: str = None, wing: str = None, room: str = None): | ||
| """Update an existing drawer's content and/or metadata.""" | ||
| global _metadata_cache | ||
|
|
||
| if content is None and wing is None and room is None: | ||
| return {"success": True, "drawer_id": drawer_id, "noop": True} |
There was a problem hiding this comment.
tool_add_drawer generates IDs that encode wing and room (drawer_{wing}_{room}_…). tool_update_drawer allows changing wing/room metadata without changing the ID, which leaves an inconsistent identifier and can lead to duplicates (adding the same content in the new wing/room would generate a different deterministic ID). Either disallow wing/room changes, or implement a move that creates a new ID + deletes the old (and return the new ID).
There was a problem hiding this comment.
By design — the ID is an opaque identifier. Changing wing/room updates the metadata used for filtering, while the ID remains stable for references. Renaming would break external references to the drawer.
| try: | ||
| all_meta = _get_cached_metadata(col) | ||
| for m in all_meta: | ||
| w = m.get("wing", "unknown") | ||
| r = m.get("room", "unknown") | ||
| wings[w] = wings.get(w, 0) + 1 | ||
| rooms[r] = rooms.get(r, 0) + 1 | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
tool_status() swallows all exceptions when fetching metadata and then returns empty wings/rooms maps with no error indicator. This makes failures indistinguishable from “no wings/rooms” and can mislead MCP clients. Consider returning an error and/or partial: true flag (similar to the previous behavior) when metadata enumeration fails.
There was a problem hiding this comment.
These tools are read-only MCP queries. Returning empty results on error is intentional — MCP clients get a valid response shape rather than an error that may not be handled. The error is logged server-side.
| try: | ||
| col.count() # verify collection is accessible | ||
| except Exception as e: | ||
| return {"wings": {}, "error": str(e)} | ||
| while True: | ||
| try: | ||
| batch = col.get(include=["metadatas"], limit=batch_size, offset=offset) | ||
| rows = batch["metadatas"] | ||
| for m in rows: | ||
| w = m.get("wing", "unknown") | ||
| wings[w] = wings.get(w, 0) + 1 | ||
| offset += len(rows) | ||
| if len(rows) < batch_size: | ||
| break | ||
| except Exception as e: | ||
| return { | ||
| "wings": wings, | ||
| "error": f"Partial result, failed at offset {offset}: {str(e)}", | ||
| "partial": True, | ||
| } | ||
| all_meta = _get_cached_metadata(col) | ||
| for m in all_meta: | ||
| w = m.get("wing", "unknown") | ||
| wings[w] = wings.get(w, 0) + 1 | ||
| except Exception: | ||
| pass | ||
| return {"wings": wings} |
There was a problem hiding this comment.
tool_list_wings() catches all exceptions and returns an empty wings dict without signaling partial/failed results. For MCP clients this can look like a valid “no wings” response. Consider returning an error and/or partial: true when metadata reads fail.
There was a problem hiding this comment.
Same rationale as above — read-only MCP tools return empty results on error to keep the client contract stable. Errors are logged server-side.
| except Exception: | ||
| pass |
There was a problem hiding this comment.
tool_list_rooms() swallows exceptions and returns an empty rooms map without an error/partial marker. This hides failures (e.g., collection read issues) from callers. Consider returning an error and/or partial indicator if metadata enumeration fails.
| except Exception: | |
| pass | |
| except Exception as exc: | |
| logger.exception("Failed to list rooms for wing %r", wing or "all") | |
| return { | |
| "wing": wing or "all", | |
| "rooms": rooms, | |
| "partial": True, | |
| "error": f"failed to enumerate rooms: {exc}", | |
| } |
| except Exception: | ||
| pass |
There was a problem hiding this comment.
tool_get_taxonomy() swallows exceptions and returns an empty taxonomy without signaling failure/partial results. This can mislead clients into thinking there is no taxonomy. Consider including an error and/or partial: true flag when metadata retrieval fails.
| except Exception: | |
| pass | |
| except Exception as exc: | |
| logger.exception("Failed to build taxonomy from metadata") | |
| return {"taxonomy": taxonomy, "partial": True, "error": str(exc)} |
| max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses | ||
| L2 distance by default — 0 = identical, larger = less similar. | ||
| Results with distance > this value are filtered out. A value of | ||
| 0.0 disables filtering. Typical useful range: 0.5–1.5. |
There was a problem hiding this comment.
The max_distance docstring says this is “L2 (Euclidean) distance” while similarity is computed as 1 - dist, which only makes sense if dist is a cosine distance (or another bounded distance). Either update the documentation to describe the actual distance metric used by your Chroma collection, or adjust similarity/threshold semantics so they match the configured space.
| max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses | |
| L2 distance by default — 0 = identical, larger = less similar. | |
| Results with distance > this value are filtered out. A value of | |
| 0.0 disables filtering. Typical useful range: 0.5–1.5. | |
| max_distance: Maximum raw distance threshold returned by the Chroma | |
| collection for the configured distance space. Lower values mean | |
| closer matches. Results with distance > this value are filtered | |
| out. A value of 0.0 disables filtering. The useful range depends | |
| on the collection's distance metric and embedding configuration. |
There was a problem hiding this comment.
Will update — cosine distance, not L2. See PR #630 discussion.
| "mempalace_search": { | ||
| "description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY your search keywords or question — do NOT include system prompts, conversation history, MEMORY.md content, or any context. Keep queries short (under 200 chars). Use 'context' for background information.", | ||
| "description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY search keywords. Use 'context' for background. Results with L2 distance > max_distance are filtered out.", | ||
| "input_schema": { | ||
| "type": "object", | ||
| "properties": { | ||
| "query": { | ||
| "type": "string", | ||
| "description": "Short search query ONLY — keywords or a question. Do NOT include system prompts or conversation context. Max 200 chars recommended.", | ||
| "description": "Short search query ONLY — keywords or a question. Max 200 chars recommended.", | ||
| "maxLength": 500, | ||
| }, | ||
| "limit": {"type": "integer", "description": "Max results (default 5)"}, | ||
| "limit": {"type": "integer", "description": "Max results (default 5)", "minimum": 1, "maximum": 100}, | ||
| "wing": {"type": "string", "description": "Filter by wing (optional)"}, | ||
| "room": {"type": "string", "description": "Filter by room (optional)"}, | ||
| "max_distance": { | ||
| "type": "number", | ||
| "description": "Max L2 distance threshold — results further than this are dropped. Lower = stricter. Default 1.5. Set to 0 to disable.", | ||
| }, |
There was a problem hiding this comment.
The MCP tool schema describes max_distance as an “L2 distance” threshold, but the collection can be created with metadata={"hnsw:space": "cosine"} (cosine distance). This mismatch makes it hard for clients to choose sensible thresholds. Align the schema/docs with the actual distance metric (or make it explicit/consistent across palace creation paths).
There was a problem hiding this comment.
Will update — cosine distance, not L2. See PR #630 discussion.
| def _get_collection(create=False): | ||
| """Return the ChromaDB collection, caching the client between calls.""" | ||
| global _collection_cache | ||
| global _collection_cache, _metadata_cache, _metadata_cache_time | ||
| try: | ||
| client = _get_client() | ||
| if create: | ||
| _collection_cache = client.get_or_create_collection(_config.collection_name) | ||
| _collection_cache = client.get_or_create_collection( | ||
| _config.collection_name, metadata={"hnsw:space": "cosine"} | ||
| ) | ||
| _metadata_cache = None |
There was a problem hiding this comment.
_get_collection(create=True) hard-codes metadata={"hnsw:space": "cosine"} when creating the collection. If other code paths create the collection without this metadata, the distance metric (and therefore max_distance/similarity semantics) can vary depending on which command created the palace first. Consider centralizing collection creation so the metric is consistent, or explicitly detect/report the metric in responses.
There was a problem hiding this comment.
Intentional — all collections should use cosine for consistency. The create=True path is only used when the collection doesn't exist yet.
…, search limits Infrastructure hardening for the MCP server: - Detect palace DB replacement via inode tracking (repair command support) - WAL rotation to prevent unbounded WAL growth - _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status - _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS] - max_distance parameter for similarity threshold in search - Handle all notifications/* methods, null arguments, method=None - Remove duplicate _client_cache = None declarations - searcher.py max_distance parameter passthrough Co-Authored-By: Claude Opus 4.6 <[email protected]>
…filed), export, normalize New MCP tools: - mempalace_get_drawer: fetch single drawer by ID with full content - mempalace_list_drawers: paginated listing with wing/room filter - mempalace_update_drawer: update content/wing/room on existing drawers - mempalace_hook_settings: get/set hook behavior (silent_save, desktop_toast) - mempalace_memories_filed_away: check latest checkpoint status Also includes: - exporter.py: export palace as browsable markdown files - normalize.py: tool_use/tool_result capture for richer transcript mining - layers.py: updated for new tool integration - config.py: hook settings properties (hook_silent_save, hook_desktop_toast) Depends on PR 3 (reliability) for _MAX_RESULTS, _metadata_cache, WAL logging. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Handle explicit null in MCP params (request.get("params") or {})
- Fix search tool description: L2 → cosine distance (collection uses hnsw:space=cosine)
- Guard against empty string key in tool_use_map from malformed JSONL entries
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
Merged via #667 with conflict resolution and 5 additional fixes from code review:
All original feature code is @jphein's work. Thank you for this contribution! |
|
This is genuinely exciting — first code upstream! 🏰 Huge thanks for not just merging but actually improving it. Those 5 fixes are solid — especially the Already pulled all your fixes into my fork and incorporated them. The structured error reporting with For real though — MemPalace has been transformative for my workflow. I'm running 134K+ drawers of verbatim conversation history, searchable by semantic meaning. I can pick up any project from any conversation days later and the AI actually knows what happened. No more "as a new conversation, I don't have context on that..." It just remembers. Every debugging session, every architecture decision, every 2am fix. All of it, instantly searchable. This project has fundamentally changed how I work with AI. The remaining 10 PRs are all clean against current |
…t count 701 - README: mark MemPalace#635 as merged via MemPalace#667, fix cosine (not L2) distance - CLAUDE.md: test count 692→701, drawer count 132K→134K, replace stale upstream PR list with current table, add fork change MemPalace#22 (bensig's security/pagination/error reporting fixes from MemPalace#667) Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary
New MCP tools and supporting infrastructure. Split from #562 per maintainer request (PR 6 of 6).
silent_save(direct save vs blocking MCP calls) anddesktop_toast(notify-send)Supporting changes:
exporter.py: Export palace as browsable markdown files organized by wing/roomnormalize.py: Enhanced transcript normalization capturingtool_useandtool_resultcontent blocks for richer mininglayers.py: Updated for new tool integrationconfig.py: Hook settings properties (hook_silent_save,hook_desktop_toast,set_hook_setting())Depends on
_MAX_RESULTS,_metadata_cache, WAL logging, cosine distanceRelated
Test plan
mempalace_get_drawerreturns full content for valid ID, error for invalidmempalace_list_drawerswith wing/room filter returns correct subsetmempalace_update_drawerupdates content and metadata, invalidates cachemempalace_hook_settingsreads and writes hook configmempalace_memories_filed_awayreads and clears checkpoint file