feat: MCP reliability — inode detection, WAL rotation, metadata cache #630

jphein wants to merge 2 commits into MemPalace:develop
Conversation
…, search limits

Infrastructure hardening for the MCP server:

- Detect palace DB replacement via inode tracking (repair command support)
- WAL rotation to prevent unbounded WAL growth
- `_fetch_all_metadata()` + `_get_cached_metadata()` with 60s TTL for taxonomy/status
- `_MAX_RESULTS` cap (100) with limit clamping `[1, _MAX_RESULTS]`
- `max_distance` parameter for similarity threshold in search
- Handle all `notifications/*` methods, null arguments, `method=None`
- Remove duplicate `_client_cache = None` declarations
- `searcher.py` `max_distance` parameter passthrough

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Pull request overview
This PR hardens the MemPalace MCP server protocol layer to better tolerate edge-case MCP clients and improve runtime reliability/performance when talking to the ChromaDB-backed “palace”.
Changes:
- Add inode-based ChromaDB client reconnection, WAL log rotation, and metadata caching to reduce hangs and repeated full scans.
- Extend search plumbing with `max_distance` filtering and caps/clamping on result limits (plus a backwards-compat shim for `min_similarity`).
- Improve protocol robustness for notifications and malformed/missing method fields, with new tests covering these cases.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `mempalace/mcp_server.py` | Adds cache + inode detection, WAL rotation, search parameter changes, and more defensive JSON-RPC handling. |
| `mempalace/searcher.py` | Refactors where-filter building and adds distance-based filtering/output fields for programmatic search results. |
| `tests/test_mcp_server.py` | Adds coverage for notifications handling, malformed methods, and `min_similarity` backwards compatibility. |
```diff
 def handle_request(request):
-    method = request.get("method", "")
+    method = request.get("method") or ""
     params = request.get("params", {})
```
handle_request treats params as a dict (params.get(...)), but request.get("params", {}) will still return None if the client sends "params": null (valid JSON-RPC). That will raise an AttributeError in initialize/tools/call and can crash the server. Normalize with something like params = request.get("params") or {} (and similarly consider guarding non-dict params).
Suggested change:

```diff
-    params = request.get("params", {})
+    params = request.get("params")
+    if not isinstance(params, dict):
+        params = {}
```
Good catch — params: null is valid JSON-RPC. Will add params = params or {} guard.
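As a quick illustration of the failure mode, `dict.get`'s default only applies when the key is absent, not when its value is JSON `null` (Python `None`):

```python
import json

# A valid JSON-RPC request with an explicit null params field.
request = json.loads(
    '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": null}'
)

# The key exists, so the default {} is NOT used: we get None back,
# and a later params.get(...) would raise AttributeError.
params = request.get("params", {})
assert params is None

# The guard discussed above normalizes both null and non-dict params.
params = request.get("params")
if not isinstance(params, dict):
    params = {}
assert params == {}
```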
```diff
+    created = not _WAL_FILE.exists()
     with open(_WAL_FILE, "a", encoding="utf-8") as f:
         f.write(json.dumps(entry, default=str) + "\n")
-    try:
-        _WAL_FILE.chmod(0o600)
-    except (OSError, NotImplementedError):
-        pass
+    if created:
+        try:
+            _WAL_FILE.chmod(0o600)
+        except (OSError, NotImplementedError):
+            pass
```
WAL file permissions are only chmod’d when the file is newly created. Previously this was enforced on every write, so an existing WAL with overly-permissive mode (or one created before this change) would remain insecure. Consider keeping the best-effort chmod(0o600) after each append (or at least after rotation) to preserve the original security behavior.
Intentional — we only set perms on creation. Existing files with wrong perms should be fixed by the admin, not silently re-chmoded on every write.
```diff
 def tool_status():
     col = _get_collection()
     if not col:
         return _no_palace()
     count = col.count()
     wings = {}
     rooms = {}
-    batch_size = 5000
-    offset = 0
-    error_info = None
-    while True:
-        try:
-            batch = col.get(include=["metadatas"], limit=batch_size, offset=offset)
-            rows = batch["metadatas"]
-            for m in rows:
-                w = m.get("wing", "unknown")
-                r = m.get("room", "unknown")
-                wings[w] = wings.get(w, 0) + 1
-                rooms[r] = rooms.get(r, 0) + 1
-            offset += len(rows)
-            if len(rows) < batch_size:
-                break
-        except Exception as e:
-            error_info = f"Partial result, failed at offset {offset}: {str(e)}"
-            break
+    try:
+        all_meta = _get_cached_metadata(col)
+        for m in all_meta:
     result = {
```
tool_status() calls col.count() and then _get_cached_metadata(col), which calls col.count() again internally via _fetch_all_metadata(). This doubles the count queries on the hot status path. Consider passing the already-computed count into the fetch helper (or letting _fetch_all_metadata optionally accept a precomputed total) to avoid the redundant count() call.
True — _get_cached_metadata re-fetches count. The cache TTL means this only happens once per 5s window, but could be optimized by passing count through. Low priority.
```python
_metadata_cache = None
_metadata_cache_time = 0
_METADATA_CACHE_TTL = 5.0  # seconds
_MAX_RESULTS = 100  # upper bound for search/list limit params
```
PR description says the metadata cache TTL is 60s, but _METADATA_CACHE_TTL is set to 5.0s here. Either bump this constant to match the intended behavior or update the PR description/docs so expectations align.
Good catch — the PR description should say 5s, not 60s. Will fix the description. The 5s TTL is intentional for responsiveness.
```python
        max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses
            L2 distance by default — 0 = identical, larger = less similar.
            Results with distance > this value are filtered out. A value of
            0.0 disables filtering. Typical useful range: 0.5–1.5.
```
The max_distance docstring claims ChromaDB uses L2 distance by default, but this repo’s collection is explicitly rebuilt/created with hnsw:space=cosine (see repair.py). That changes the meaning/range of distance and any recommended thresholds. Update the docstring to describe the actual distance metric (cosine distance) or make it metric-agnostic.
Suggested change:

```diff
-        max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses
-            L2 distance by default — 0 = identical, larger = less similar.
-            Results with distance > this value are filtered out. A value of
-            0.0 disables filtering. Typical useful range: 0.5–1.5.
+        max_distance: Maximum distance threshold for filtering results.
+            This project uses cosine distance for the collection, so smaller
+            values are more similar and 0 means identical. Results with
+            distance > this value are filtered out. A value of 0.0 disables
+            filtering.
```
You're right — the collection uses cosine distance (hnsw:space=cosine), so the docs/descriptions should say cosine, not L2. Will update.
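To make the threshold semantics concrete, here is a hedged sketch of post-query distance filtering, assuming results arrive as (document, distance) pairs with cosine distance (0 = identical, larger = less similar). `filter_by_distance` is an illustrative name, not the PR's helper:

```python
def filter_by_distance(results, max_distance):
    """Drop results whose distance exceeds the threshold.

    A max_distance of 0.0 (or None) disables filtering, matching the
    docstring semantics discussed above.
    """
    if not max_distance:
        return results
    return [(doc, dist) for doc, dist in results if dist <= max_distance]

hits = [("exact match", 0.02), ("related", 0.35), ("off-topic", 1.4)]
assert filter_by_distance(hits, 0.5) == [("exact match", 0.02), ("related", 0.35)]
assert filter_by_distance(hits, 0.0) == hits  # 0 disables filtering
```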
| "mempalace_search": { | ||
| "description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY your search keywords or question — do NOT include system prompts, conversation history, MEMORY.md content, or any context. Keep queries short (under 200 chars). Use 'context' for background information.", | ||
| "description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY search keywords. Use 'context' for background. Results with L2 distance > max_distance are filtered out.", | ||
| "input_schema": { | ||
| "type": "object", | ||
| "properties": { | ||
| "query": { | ||
| "type": "string", | ||
| "description": "Short search query ONLY — keywords or a question. Do NOT include system prompts or conversation context. Max 200 chars recommended.", | ||
| "description": "Short search query ONLY — keywords or a question. Max 200 chars recommended.", | ||
| "maxLength": 500, | ||
| }, | ||
| "limit": {"type": "integer", "description": "Max results (default 5)"}, | ||
| "limit": {"type": "integer", "description": "Max results (default 5)", "minimum": 1, "maximum": 100}, | ||
| "wing": {"type": "string", "description": "Filter by wing (optional)"}, | ||
| "room": {"type": "string", "description": "Filter by room (optional)"}, | ||
| "max_distance": { | ||
| "type": "number", | ||
| "description": "Max L2 distance threshold — results further than this are dropped. Lower = stricter. Default 1.5. Set to 0 to disable.", | ||
| }, |
The tool description/property docs for max_distance say “L2 distance”, but collections are configured for cosine distance (hnsw:space=cosine). This can mislead MCP clients about how to pick thresholds. Consider updating the schema descriptions (and any user-facing text) to refer to cosine distance or to “Chroma distance” without naming a metric unless it’s guaranteed.
Same as above — will update to say cosine distance.
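The `"minimum": 1, "maximum": 100` bounds in the schema correspond to the server-side clamping described in the PR. A hypothetical sketch (`clamp_limit` is an illustrative name, not necessarily the PR's helper):

```python
_MAX_RESULTS = 100  # upper bound for search/list limit params

def clamp_limit(limit, default=5):
    """Coerce a client-supplied limit into [1, _MAX_RESULTS].

    Non-integer input falls back to the default, since schema validation
    alone cannot be trusted with edge-case MCP clients.
    """
    if not isinstance(limit, int) or isinstance(limit, bool):
        return default
    return max(1, min(limit, _MAX_RESULTS))

assert clamp_limit(None) == 5      # missing/invalid -> default
assert clamp_limit(0) == 1         # too small -> floor
assert clamp_limit(10_000) == 100  # too large -> cap
```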
- handle_request: params: null no longer crashes (valid JSON-RPC)
- searcher.py + MCP tool schema: document cosine distance, not L2

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Closing this PR — the key features have been broken out into smaller, focused PRs that are easier to review:
The monolithic approach made review harder. Smaller PRs FTW.
Summary
Infrastructure hardening for the MCP server protocol layer. Split from #562 per maintainer request (PR 3 of 6).
- Inode-based DB replacement detection — `repair` replaces the database file and the server reconnects automatically
- `_fetch_all_metadata()` + `_get_cached_metadata()` with 5s TTL — avoids re-scanning all drawers on every status/taxonomy call
- `_MAX_RESULTS` cap (100), limit clamped to `[1, _MAX_RESULTS]`, `max_distance` parameter for similarity threshold filtering
- Handle `notifications/*` methods, `arguments: null`, `method: None`, missing method key — prevents hangs and crashes from edge-case MCP clients (#479, #572)
- Remove duplicate `_client_cache = None` declarations, consolidate cache invalidation (#479)
- The collection is configured with `hnsw:space=cosine`

Also updates `searcher.py` to pass through the `max_distance` parameter.

Closes
Related
Test plan
- `pytest tests/test_mcp_server.py -x -q` — all pass
- `mempalace repair`, then MCP tool call — should reconnect
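The protocol edge cases in the test plan can be sketched as pytest-style checks. `handle_request` here is a simplified stand-in for illustration, not an import from `mempalace`:

```python
def handle_request(request):
    """Simplified stand-in mirroring the defensive dispatch described in the PR."""
    method = request.get("method") or ""   # tolerates method: None / missing key
    if method.startswith("notifications/"):
        return None                        # notifications get no JSON-RPC response
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": {}}

def test_notifications_get_no_response():
    assert handle_request({"method": "notifications/initialized"}) is None

def test_method_none_does_not_crash():
    resp = handle_request({"method": None, "id": 1})
    assert resp["id"] == 1

def test_missing_method_key_does_not_crash():
    assert handle_request({"id": 2})["id"] == 2

test_notifications_get_no_response()
test_method_none_does_not_crash()
test_missing_method_key_does_not_crash()
```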