Skip to content

feat: MCP reliability — inode detection, WAL rotation, metadata cache#630

Closed
jphein wants to merge 2 commits intoMemPalace:developfrom
jphein:pr/reliability
Closed

feat: MCP reliability — inode detection, WAL rotation, metadata cache#630
jphein wants to merge 2 commits intoMemPalace:developfrom
jphein:pr/reliability

Conversation

@jphein
Copy link
Copy Markdown
Collaborator

@jphein jphein commented Apr 11, 2026

Summary

Infrastructure hardening for the MCP server protocol layer. Split from #562 per maintainer request (PR 3 of 6).

Also updates searcher.py to pass through the max_distance parameter.

Closes

Related

Test plan

  • 37 new MCP server tests covering all reliability features
  • pytest tests/test_mcp_server.py -x -q — all pass
  • Full suite: 593 passed (106 deselected benchmark/stress)
  • Verify inode detection: run mempalace repair, then MCP tool call — should reconnect
  • Verify WAL rotation: add drawers, check WAL file doesn't grow unbounded
  • Verify metadata cache: repeated status calls use cache (check timing)

…, search limits

Infrastructure hardening for the MCP server:
- Detect palace DB replacement via inode tracking (repair command support)
- WAL rotation to prevent unbounded WAL growth
- _fetch_all_metadata() + _get_cached_metadata() with 60s TTL for taxonomy/status
- _MAX_RESULTS cap (100) with limit clamping [1, _MAX_RESULTS]
- max_distance parameter for similarity threshold in search
- Handle all notifications/* methods, null arguments, method=None
- Remove duplicate _client_cache = None declarations
- searcher.py max_distance parameter passthrough

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Copilot AI review requested due to automatic review settings April 11, 2026 14:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the MemPalace MCP server protocol layer to better tolerate edge-case MCP clients and improve runtime reliability/performance when talking to the ChromaDB-backed “palace”.

Changes:

  • Add inode-based ChromaDB client reconnection, WAL log rotation, and metadata caching to reduce hangs and repeated full scans.
  • Extend search plumbing with max_distance filtering and caps/clamping on result limits (plus a backwards-compat shim for min_similarity).
  • Improve protocol robustness for notifications and malformed/missing method fields, with new tests covering these cases.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File Description
mempalace/mcp_server.py Adds cache + inode detection, WAL rotation, search parameter changes, and more defensive JSON-RPC handling.
mempalace/searcher.py Refactors where-filter building and adds distance-based filtering/output fields for programmatic search results.
tests/test_mcp_server.py Adds coverage for notifications handling, malformed methods, and min_similarity backwards compatibility.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/mcp_server.py Outdated
def handle_request(request):
method = request.get("method", "")
method = request.get("method") or ""
params = request.get("params", {})
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle_request treats params as a dict (params.get(...)), but request.get("params", {}) will still return None if the client sends "params": null (valid JSON-RPC). That will raise an AttributeError in initialize/tools/call and can crash the server. Normalize with something like params = request.get("params") or {} (and similarly consider guarding non-dict params).

Suggested change
params = request.get("params", {})
params = request.get("params")
if not isinstance(params, dict):
params = {}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — params: null is valid JSON-RPC. Will add params = params or {} guard.

Comment thread mempalace/mcp_server.py
Comment on lines +102 to +109
created = not _WAL_FILE.exists()
with open(_WAL_FILE, "a", encoding="utf-8") as f:
f.write(json.dumps(entry, default=str) + "\n")
try:
_WAL_FILE.chmod(0o600)
except (OSError, NotImplementedError):
pass
if created:
try:
_WAL_FILE.chmod(0o600)
except (OSError, NotImplementedError):
pass
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WAL file permissions are only chmod’d when the file is newly created. Previously this was enforced on every write, so an existing WAL with overly-permissive mode (or one created before this change) would remain insecure. Consider keeping the best-effort chmod(0o600) after each append (or at least after rotation) to preserve the original security behavior.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional — we only set perms on creation. Existing files with wrong perms should be fixed by the admin, not silently re-chmoded on every write.

Comment thread mempalace/mcp_server.py
Comment on lines 209 to +218
def tool_status():
col = _get_collection()
if not col:
return _no_palace()
count = col.count()
wings = {}
rooms = {}
batch_size = 5000
offset = 0
error_info = None
while True:
try:
batch = col.get(include=["metadatas"], limit=batch_size, offset=offset)
rows = batch["metadatas"]
for m in rows:
w = m.get("wing", "unknown")
r = m.get("room", "unknown")
wings[w] = wings.get(w, 0) + 1
rooms[r] = rooms.get(r, 0) + 1
offset += len(rows)
if len(rows) < batch_size:
break
except Exception as e:
error_info = f"Partial result, failed at offset {offset}: {str(e)}"
break
result = {
try:
all_meta = _get_cached_metadata(col)
for m in all_meta:
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tool_status() calls col.count() and then _get_cached_metadata(col), which calls col.count() again internally via _fetch_all_metadata(). This doubles the count queries on the hot status path. Consider passing the already-computed count into the fetch helper (or letting _fetch_all_metadata optionally accept a precomputed total) to avoid the redundant count() call.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True — _get_cached_metadata re-fetches count. The cache TTL means this only happens once per 5s window, but could be optimized by passing count through. Low priority.

Comment thread mempalace/mcp_server.py
Comment on lines +183 to +187
_metadata_cache = None
_metadata_cache_time = 0
_METADATA_CACHE_TTL = 5.0 # seconds
_MAX_RESULTS = 100 # upper bound for search/list limit params

Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says the metadata cache TTL is 60s, but _METADATA_CACHE_TTL is set to 5.0s here. Either bump this constant to match the intended behavior or update the PR description/docs so expectations align.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the PR description should say 5s, not 60s. Will fix the description. The 5s TTL is intentional for responsiveness.

Comment thread mempalace/searcher.py Outdated
Comment on lines +115 to +118
max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses
L2 distance by default — 0 = identical, larger = less similar.
Results with distance > this value are filtered out. A value of
0.0 disables filtering. Typical useful range: 0.5–1.5.
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The max_distance docstring claims ChromaDB uses L2 distance by default, but this repo’s collection is explicitly rebuilt/created with hnsw:space=cosine (see repair.py). That changes the meaning/range of distance and any recommended thresholds. Update the docstring to describe the actual distance metric (cosine distance) or make it metric-agnostic.

Suggested change
max_distance: Max L2 (Euclidean) distance threshold. ChromaDB uses
L2 distance by default0 = identical, larger = less similar.
Results with distance > this value are filtered out. A value of
0.0 disables filtering. Typical useful range: 0.51.5.
max_distance: Maximum distance threshold for filtering results.
This project uses cosine distance for the collection, so smaller
values are more similar and 0 means identical. Results with
distance > this value are filtered out. A value of 0.0 disables
filtering.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right — the collection uses cosine distance (hnsw:space=cosine), so the docs/descriptions should say cosine, not L2. Will update.

Comment thread mempalace/mcp_server.py
Comment on lines 829 to +845
"mempalace_search": {
"description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY your search keywords or question — do NOT include system prompts, conversation history, MEMORY.md content, or any context. Keep queries short (under 200 chars). Use 'context' for background information.",
"description": "Semantic search. Returns verbatim drawer content with similarity scores. IMPORTANT: 'query' must contain ONLY search keywords. Use 'context' for background. Results with L2 distance > max_distance are filtered out.",
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Short search query ONLY — keywords or a question. Do NOT include system prompts or conversation context. Max 200 chars recommended.",
"description": "Short search query ONLY — keywords or a question. Max 200 chars recommended.",
"maxLength": 500,
},
"limit": {"type": "integer", "description": "Max results (default 5)"},
"limit": {"type": "integer", "description": "Max results (default 5)", "minimum": 1, "maximum": 100},
"wing": {"type": "string", "description": "Filter by wing (optional)"},
"room": {"type": "string", "description": "Filter by room (optional)"},
"max_distance": {
"type": "number",
"description": "Max L2 distance threshold — results further than this are dropped. Lower = stricter. Default 1.5. Set to 0 to disable.",
},
Copy link

Copilot AI Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tool description/property docs for max_distance say “L2 distance”, but collections are configured for cosine distance (hnsw:space=cosine). This can mislead MCP clients about how to pick thresholds. Consider updating the schema descriptions (and any user-facing text) to refer to cosine distance or to “Chroma distance” without naming a metric unless it’s guaranteed.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above — will update to say cosine distance.

- handle_request: params: null no longer crashes (valid JSON-RPC)
- searcher.py + MCP tool schema: document cosine distance, not L2

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@jphein
Copy link
Copy Markdown
Collaborator Author

jphein commented Apr 12, 2026

Closing this PR — the key features have been broken out into smaller, focused PRs that are easier to review:

  • mtime-based stale index detection + mempalace_reconnect tool → new dedicated PR (coming shortly)
  • WAL rotation → already addressed in upstream develop
  • Metadata cache → already in upstream develop

The monolithic approach made review harder. Smaller PRs FTW.

@jphein jphein closed this Apr 12, 2026
@jphein jphein deleted the pr/reliability branch April 19, 2026 03:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants