fix: paginate large collection reads and surface errors in MCP tools (#339, #338)#371
Conversation
|
@RhettOP pls fix lint |
bensig
left a comment
There was a problem hiding this comment.
fix lint, ready to merge
web3guru888
left a comment
There was a problem hiding this comment.
Big +1 on this PR — we filed #338 and #339 originally, so this directly addresses bugs we hit in production. Our integration runs 208 discoveries across 5 domains with 710+ KG entities, and the silent except Exception: pass in status tools made debugging near-impossible when collections grew past 10K.
Pagination approach looks solid. The while offset < count loop with batch_size=5000 is clean. One edge case to consider: if drawers are being added concurrently (e.g. an active mining session while querying status), count might increase between the initial call and later batches. This is probably fine for status queries (off-by-a-few is acceptable), but worth a comment noting the race.
Error surfacing is exactly right. Returning {"error": ..., "partial": True} is much better than swallowing exceptions — MCP clients can now distinguish between "empty palace" and "something broke mid-read." This matches how we handle partial results in our own integration.
Minor observations:
- In
tool_list_rooms,total = col.count()returns the unfiltered count even whenwhere={"wing": wing}is set. This means the loop may iterate more pages than needed when filtering by wing (extra empty batches). Not a correctness issue, just suboptimal. Could use a sentinel check:if not batch["metadatas"]: break. - Consider extracting the pagination loop into a shared helper (e.g.
_fetch_all_metadata(col, where=None)) to reduce duplication across the 4 functions. PR #484 takes this approach — worth coordinating. batch_size=5000is a reasonable default. In our integration we use similar batch sizes with ChromaDB ≥0.6.0.
Overall: this directly unblocks better diagnostics for everyone. Thank you for picking these up.
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
Code reviewFound 2 issues:
Fix: drive the loop off batch size instead of total count — break when
Both issues can be fixed by replacing the while True:
batch = col.get(include=["metadatas"], limit=batch_size, offset=offset)
rows = batch["metadatas"]
# process rows
offset += len(rows)
if len(rows) < batch_size:
breakGenerated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
…p bound - Replace 'while offset < count/total' with 'while True' + break on short batch - Fixes tool_list_rooms iterating over unfiltered col.count() when wing filter active - Fixes all 4 paginated functions (tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy) missing early-exit when batch smaller than batch_size - Remove unused 'total' variable in tool_list_wings, tool_list_rooms, tool_get_taxonomy (replaced col.count() with accessibility check only) Per bensig review comments on PR MemPalace#371
|
Fixed in the latest commit. All 4 paginated functions (tool_status, tool_list_wings, tool_list_rooms, tool_get_taxonomy) now use |
Summary
Fixes #339 and #338 - both issues share the same root cause in
mempalace/mcp_server.py.Root cause
Four functions (
tool_status,tool_list_wings,tool_list_rooms,tool_get_taxonomy) used:col.get(include=["metadatas"], limit=10000)- a single unbounded fetch that OOMs or hits ChromaDB HNSW errors on large collections (100k+ drawers).except Exception: pass- silently swallowed every error, making the caller receive an empty{}that looked like a valid but empty palace.Fix
Pagination (fixes #338): Replace the single
col.get()call with awhile offset < totalloop usingbatch_size=5000. Each iteration fetches one page and accumulates results.tool_list_wings: paginated count-by-wing aggregationtool_list_rooms: paginated count-by-room, with optionalwherefilter passed to every batch calltool_get_taxonomy: paginated nested wing->room->count treetool_status: paginated wings/rooms aggregation blockError surfacing (fixes #339): Replace
except Exception: passwithexcept Exception as e: return {..., "error": str(e), "partial": True}. Callers now receive the partial result collected so far plus a descriptive error message, instead of silent empty output.Changes
mempalace/mcp_server.py: 69 lines added / 34 removed across the four affected functions. No other files changed.