fix: cap search limit, paginate status tools, remove duplicate cache decls #484
jphein wants to merge 2 commits into MemPalace:main
Conversation
Float equality on mtime fails due to JSON round-trip precision loss, causing every file to be re-mined on each run. Use epsilon < 0.01. Also adds bulk_check_mined() for fetching all source_file/mtime pairs in paginated batches — turns 25K individual DB queries into ~5 fetches. Fixes MemPalace#475 Co-Authored-By: Claude Opus 4.6 <[email protected]>
…decls
- Clamp tool_search limit to [1, 100] to prevent memory exhaustion
- Replace hardcoded limit=10000 in status/taxonomy tools with paginated _fetch_all_metadata() helper (matches palace_graph.py pattern)
- Remove duplicate _client_cache/_collection_cache declarations

Fixes MemPalace#477, MemPalace#478, MemPalace#479 Co-Authored-By: Claude Opus 4.6 <[email protected]>
Pull request overview
Hardens MCP server read/search tooling to behave correctly and safely at scale (large palaces), while cleaning up initialization state.
Changes:
- Clamp `tool_search` result limits to prevent excessive memory usage.
- Replace hardcoded `limit=10000` metadata reads with paginated metadata fetching for status/taxonomy tools.
- Remove duplicate `_client_cache`/`_collection_cache` declarations and adjust mined-file mtime comparison behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| mempalace/palace.py | Adjusts mined-file mtime comparison tolerance; adds a bulk metadata prefetch helper for mined-file checks. |
| mempalace/mcp_server.py | Removes duplicate cache re-declarations; adds paginated metadata helper for status/taxonomy tools; extends search tool API and clamps search limit. |
```python
def tool_search(
    query: str, limit: int = 5, wing: str = None, room: str = None, min_similarity: float = 1.5
):
    limit = max(1, min(limit, 100))
    return search_memories(
        query,
        palace_path=_config.palace_path,
        wing=wing,
        room=room,
        n_results=limit,
        min_similarity=min_similarity,
    )
```
tool_search() passes min_similarity through to search_memories(), but search_memories (mempalace/searcher.py) does not accept that keyword argument. This will raise TypeError: search_memories() got an unexpected keyword argument 'min_similarity' on every call. Either implement min_similarity support in search_memories (and apply the distance filtering there), or stop passing the argument from tool_search.
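
If the fix lands in searcher.py, the filtering could look roughly like the sketch below. This assumes a ChromaDB-backed query that returns L2 distances; the `_get_collection` helper, the result dict shape, and the omitted wing/room handling are illustrative, not the project's actual API.

```python
# Hypothetical sketch of min_similarity support inside search_memories().
# _get_collection and the result shape are assumptions for illustration.
def search_memories(query, palace_path=None, wing=None, room=None,
                    n_results=5, min_similarity=1.5):
    collection = _get_collection(palace_path)  # assumed existing helper
    raw = collection.query(query_texts=[query], n_results=n_results)
    results = []
    for doc, meta, dist in zip(raw["documents"][0],
                               raw["metadatas"][0],
                               raw["distances"][0]):
        # min_similarity acts as an L2-distance ceiling; 0 disables filtering.
        if min_similarity and dist > min_similarity:
            continue
        results.append({"content": doc, "metadata": meta, "distance": dist})
    return results  # wing/room where-filtering omitted for brevity
```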
```python
def tool_search(
    query: str, limit: int = 5, wing: str = None, room: str = None, min_similarity: float = 1.5
):
    limit = max(1, min(limit, 100))
```
The parameter name min_similarity is inconsistent with the semantics described in the tool docs ("Max L2 distance threshold" where lower is stricter). Consider renaming the argument/schema field to something like max_distance/max_l2_distance (or, if you intended a true similarity score, invert the comparison and document a 0–1 range).
```python
def _fetch_all_metadata(col, where=None):
    """Paginate col.get() to avoid the 10K silent truncation limit."""
    total = col.count()
    all_meta = []
    offset = 0
    while offset < total:
        kwargs = {"include": ["metadatas"], "limit": 1000, "offset": offset}
        if where:
            kwargs["where"] = where
        batch = col.get(**kwargs)
        all_meta.extend(batch["metadatas"])
        offset += len(batch["metadatas"])
        if not batch["metadatas"]:
            break
    return all_meta
```
_fetch_all_metadata() materializes all metadatas into a list, which can be very large on big palaces and spike memory usage even though the callers only need aggregated counts. Consider changing this helper to yield/iterate batches (or accept a callback) so tool_status/tool_list_* can aggregate per batch without storing everything; this would also avoid the extra col.count() call when tool_status() already fetched the count.
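
A sketch of the generator-based variant that comment describes, so status/taxonomy tools can aggregate one batch at a time instead of materializing everything; the helper and the example aggregation names are illustrative and simply mirror the code above.

```python
from collections import Counter

def _iter_metadata_batches(col, where=None, batch_size=1000):
    """Yield metadata batches instead of holding the whole collection in memory."""
    offset = 0
    while True:
        kwargs = {"include": ["metadatas"], "limit": batch_size, "offset": offset}
        if where:
            kwargs["where"] = where
        batch = col.get(**kwargs)["metadatas"]
        if not batch:
            break
        yield batch
        offset += len(batch)

def _count_by_wing(col):
    """Example aggregation: drawer counts per wing, one batch at a time."""
    counts = Counter()
    for batch in _iter_metadata_batches(col):
        counts.update(m.get("wing", "unknown") for m in batch)
    return dict(counts)
```

This variant also drops the up-front `col.count()` call, since the loop terminates on the first empty batch.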
| "description": "Semantic search. Returns verbatim drawer content with similarity scores. Results with distance > min_similarity are filtered out (L2 distance: lower = more similar).", | ||
| "input_schema": { | ||
| "type": "object", | ||
| "properties": { | ||
| "query": {"type": "string", "description": "What to search for"}, | ||
| "limit": {"type": "integer", "description": "Max results (default 5)"}, | ||
| "wing": {"type": "string", "description": "Filter by wing (optional)"}, | ||
| "room": {"type": "string", "description": "Filter by room (optional)"}, | ||
| "min_similarity": { | ||
| "type": "number", | ||
| "description": "Max L2 distance threshold — results further than this are dropped. Lower = stricter. Default 1.5 filters clearly irrelevant results. Set to 0 to disable filtering.", | ||
| }, |
The mempalace_search input schema doesn't reflect the new runtime constraints: limit is clamped to 1–100 in tool_search(), but the schema has no minimum/maximum. Adding these (and a minimum: 0 for min_similarity if kept) helps clients validate requests and avoids surprises when large limits are silently reduced.
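
For illustration, the constrained schema fields might look like the sketch below; the bounds mirror the runtime clamp, while the descriptions are placeholders rather than the merged schema.

```python
# Hypothetical constrained properties for the mempalace_search input schema.
constrained_properties = {
    "limit": {
        "type": "integer",
        "minimum": 1,
        "maximum": 100,
        "description": "Max results (default 5, clamped to 1-100)",
    },
    "min_similarity": {
        "type": "number",
        "minimum": 0,
        "description": "Max L2 distance threshold; 0 disables filtering (default 1.5)",
    },
}
```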
```python
def bulk_check_mined(collection, filepaths: list[str]) -> dict[str, float]:
    """Pre-fetch source_file/source_mtime pairs for all documents in the collection.

    Returns a dict mapping source_file -> source_mtime (as float) for every
    document that has both fields. Callers can check membership and compare
    mtimes locally instead of issuing one ChromaDB query per file.

    The *filepaths* argument is accepted for API symmetry but the function
    fetches the full collection in paginated batches (like palace_graph.py)
    since a WHERE-IN filter on thousands of paths is not supported by ChromaDB.
    """
```
bulk_check_mined() accepts filepaths but does not use it (it always scans the full collection). Even with the docstring note, this API is easy to misuse by callers who expect it to only check the provided paths. Consider either removing the parameter, or renaming it to _filepaths/unused_filepaths to make the intent explicit at the callsite.
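
One way to signal that intent at the call site, per the suggestion; the signature change is shown only as a sketch.

```python
# Illustrative rename: the underscore prefix flags that the argument is unused
# and the full collection is scanned regardless of which paths are passed in.
def bulk_check_mined(collection, _filepaths: list[str] | None = None) -> dict[str, float]:
    ...
```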
web3guru888 left a comment
This is a meaty PR — three distinct fixes. Let me review each:
1. Search limit cap (max(1, min(limit, 100))) ✅
Good security hygiene. Without this, an MCP client could pass limit=999999 and exhaust memory. The [1, 100] range is reasonable.
One note: The min_similarity parameter addition (default 1.5) is a nice bonus but changes the tool_search signature. Since search_memories() needs to accept this parameter too, make sure the downstream function handles it. Looking at the diff, it's passed through to search_memories() — is min_similarity already an accepted parameter there, or does this need a corresponding change in searcher.py?
2. Paginated status tools via _fetch_all_metadata() ✅
Clean refactor — extracting the pagination into a shared helper eliminates the code duplication that exists in the current codebase (and in #371's approach). The offset += len(batch["metadatas"]) advancement with if not batch["metadatas"]: break sentinel is good.
Note: PR #371 also fixes this same pagination issue with inline loops. These two PRs conflict — worth coordinating with @RhettOP. This PR's helper approach is DRYer.
Batch size difference: This uses limit=1000 vs #371's batch_size=5000. The smaller batch size is more conservative on memory but means more round-trips. Both are fine for metadata-only fetches; I'd lean toward the smaller batches.
3. Duplicate cache declarations removed ✅
Good catch. The second _client_cache = None / _collection_cache = None after the WAL setup block would silently reset the singleton state. This is a real bug — if any code path initialized the client before the WAL block, the second declaration would wipe it.
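
To make the failure mode concrete, here is a minimal reconstruction of the removed pattern; the module contents and the WAL block are assumed for illustration, not copied from the file.

```python
# Illustrative reconstruction of the pattern removed by this PR.
_client_cache = None        # intended singletons, declared once at module top
_collection_cache = None

# ... WAL / connection setup block; any code path that initializes the client
# before this point would populate _client_cache ...

_client_cache = None        # duplicate declarations: silently wipe whatever
_collection_cache = None    # was initialized above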
4. bulk_check_mined() in palace.py (bonus)
The paginated mined-file lookup is a nice performance improvement over per-file queries. The float comparison change (abs(float(stored_mtime) - current_mtime) < 0.01 vs exact ==) is smart — float precision issues in mtime serialization are a real problem.
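
A quick illustration of why the epsilon matters; the precision loss is simulated here, since the exact serialization path for the stored mtime is not shown in this diff.

```python
# Simulated precision loss on a stored mtime, then exact vs. epsilon comparison.
current_mtime = 1717171717.1234567            # os.path.getmtime()-style float
stored_mtime = float(f"{current_mtime:.3f}")  # stored copy with truncated precision

print(stored_mtime == current_mtime)             # False: exact match would re-mine the file
print(abs(stored_mtime - current_mtime) < 0.01)  # True: epsilon treats it as unchanged
```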
Overall: This is three good fixes in one PR. The only risk is the conflict with #371 on pagination. Consider splitting the search-limit cap and cache-dedup fix into a separate PR to make merging easier.
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
Superseded by #493 which includes these fixes along with other improvements.
Addressing Copilot's review points:
Note: #562 was split into 6 focused PRs per maintainer request.
Summary
- `tool_search` limit uncapped — added `max(1, min(limit, 100))` clamp to prevent memory exhaustion via large limit values
- Hardcoded `limit=10000` in `tool_status`, `tool_list_wings`, `tool_list_rooms`, `tool_get_taxonomy` replaced with paginated `_fetch_all_metadata()` helper (matches existing `palace_graph.py` pattern)
- Duplicate `_client_cache`/`_collection_cache` declarations — removed second pair that silently reset state after WAL setup block

Fixes #477, #478, #479
Test plan
🤖 Generated with Claude Code