fix: guard against empty ChromaDB query results in search paths #305
dr-robert-li wants to merge 1 commit into MemPalace:develop
ChromaDB 1.x can return {documents: []} (empty outer list) for queries against a fresh collection or with a filter that excludes everything. The pre-fix code did results['documents'][0] without guarding the outer list, raising IndexError. ChromaDB 0.6.x returns {documents: [[]]} (empty inner list), which the existing 'if not docs' check already handled, so the crash manifests only on 1.x.
Fixed in searcher.search, searcher.search_memories, layers.Layer3.search, and layers.Layer3.search_raw. Regression tests inject the empty-outer shape via mock to exercise the guard regardless of installed ChromaDB version.
Closes MemPalace#195
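The two result shapes and the guard described above can be sketched as follows. This is a minimal illustration rather than the code from the PR: `first_documents` is a hypothetical helper name, and the plain dictionaries stand in for what a ChromaDB query would return.

```python
from typing import Any


def first_documents(results: dict[str, Any]) -> list[str]:
    """Return the documents for the first query, or [] if there are none.

    Hypothetical helper: covers both the empty-outer shape
    ({"documents": []}) and the empty-inner shape ({"documents": [[]]}).
    """
    documents = results.get("documents")
    # Check the outer list before indexing so [0] can never raise IndexError.
    if not documents or not documents[0]:
        return []
    return list(documents[0])


empty_outer = {"documents": []}      # shape reported for ChromaDB 1.x
empty_inner = {"documents": [[]]}    # shape reported for ChromaDB 0.6.x
populated = {"documents": [["note a", "note b"]]}

print(first_documents(empty_outer))
print(first_documents(empty_inner))
print(first_documents(populated))
```

Because `or` short-circuits, the `[0]` index is evaluated only when the outer list is non-empty, which is why a single condition is enough for both shapes.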
PR Review: fix: guard against empty ChromaDB query results in search paths
Executive Summary
Affected Areas:
Business Impact: Prevents user-facing crashes when searching an empty or freshly initialized palace, or when ChromaDB filters exclude all results.
Flow Changes: None for the success path. The empty-result path now returns gracefully instead of crashing with
Ratings
PR Health
The Bug
ChromaDB 0.6.x returns
ChromaDB 1.x returns
The Fix
Guard pattern applied consistently across all 4 paths:
if not results.get("documents") or not results["documents"][0]:
    return <appropriate empty response>
This guard handles:
The old
Medium Priority Issues
🎨 #1: Missing test for
- pyproject.toml: widen chromadb to <2.0 for Python 3.14 compat (MemPalace#302)
- config.py + miner/convo_miner/mcp_server: add hnsw:space=cosine so similarity = 1 - distance stays in [0,1] instead of negative L2 (MemPalace#304)
- searcher.py + layers.py: guard against ChromaDB 1.x empty-outer query results (IndexError on fresh collections) (MemPalace#305)
- mcp_server.py: redirect stdout→stderr at import to protect JSON-RPC wire from chromadb/posthog chatter (MemPalace#306)
- mcp_server.py: replace 10k-limited col.get with paginating _iter_all_metadatas helper; stop swallowing errors silently (MemPalace#307)
- mcp_server.py: drop undocumented wait_for_previous arg injected by Gemini MCP clients (MemPalace#322)
- searcher.py + hybrid_searcher.py + mcp_server.py: add min_similarity threshold filter so callers get a clean "no results" signal (MemPalace#350)
- entity_detector.py: add CODE_KEYWORDS blocklist (~80 terms) to stop Rust types / React / framework names being detected as entities (MemPalace#349)
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
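The pagination item in the commit message above can be illustrated with a short sketch. The real _iter_all_metadatas helper is not shown in this conversation, so the function below is a hypothetical stand-in: it assumes the collection's get() accepts limit, offset, and include, as ChromaDB's Collection.get does, and the page size is an arbitrary choice.

```python
from typing import Any, Iterator


def iter_all_metadatas_sketch(collection: Any, page_size: int = 1000) -> Iterator[dict]:
    """Yield every metadata dict by paging through collection.get().

    Hypothetical sketch, not the project's helper. Errors from get()
    propagate to the caller instead of being swallowed.
    """
    offset = 0
    while True:
        page = collection.get(limit=page_size, offset=offset, include=["metadatas"])
        metadatas = page.get("metadatas") or []
        if not metadatas:
            return  # nothing left to read
        yield from metadatas
        if len(metadatas) < page_size:
            return  # a short page means the collection is exhausted
        offset += len(metadatas)
```

Paging this way avoids relying on a single large get() call, so the result is not capped by whatever limit that one call uses.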
Hey @dr-robert-li, I've taken a look and ran it through the CLI. Thanks for the fix! There are three open PRs addressing #195 (empty ChromaDB query results), and we're going with #197 as the most focused diff. Closing this one as superseded: the bug is still getting fixed, just via the other PR. Really appreciate the contribution. 💜
Reopening: this was closed too quickly and your work deserves proper review. Standing by for merge once we clear the current security baseline. Thank you for the contribution.
web3guru888 left a comment
Good defensive fix. The guard against {documents: []} (empty outer list) is necessary for ChromaDB 1.x compatibility, and the approach of checking before [0] indexing is correct.
Coverage is comprehensive — all 4 search paths are guarded:
searcher.search(): CLI search
searcher.search_memories(): library/MCP search
layers.Layer3.search(): memory stack search
layers.Layer3.search_raw(): raw result search
The guard logic is consistent across all paths:
if not results.get("documents") or not results["documents"][0]:
This handles both {documents: []} (1.x) and {documents: [[]]} (0.6.x) in one check, which is nice.
One subtle note: The PR moves the guard before the docs = results["documents"][0] assignment and removes the old if not docs check that came after the assignment. This is correct — the old check was dead code for 1.x since it would crash before reaching it. The new placement prevents the crash entirely.
Test approach is good — using mocks to inject the empty-outer shape means the tests work regardless of installed ChromaDB version. This is the right way to test version-dependent behavior.
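A mock-based regression test of the kind described here might look like the sketch below. It is not the PR's actual test: `search` is a simplified stand-in for a guarded search path, and the mock replaces a real ChromaDB collection so the installed version never influences the shape under test.

```python
from unittest.mock import MagicMock


def search(collection, query: str) -> list[str]:
    """Stand-in for a guarded search path (not the project's real function)."""
    results = collection.query(query_texts=[query], n_results=5)
    if not results.get("documents") or not results["documents"][0]:
        return []
    return results["documents"][0]


def test_search_handles_empty_outer_list():
    # Inject the empty-outer shape directly instead of relying on ChromaDB 1.x.
    collection = MagicMock()
    collection.query.return_value = {"documents": [], "metadatas": [], "distances": []}

    assert search(collection, "anything") == []
    collection.query.assert_called_once_with(query_texts=["anything"], n_results=5)
```

Without the guard, the same test would fail with `IndexError: list index out of range`, which is what makes it a useful regression check.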
We pin chromadb >=0.6.0,<0.7.0 in our integration so we haven't hit this specific shape, but the guard is forward-compatible and low-risk. Clean fix.
🔭 Reviewed as part of the MemPalace-AGI integration project — autonomous research with perfect memory. Community interaction updates are posted regularly on the dashboard.
What does this PR do?
Guards searcher.search, searcher.search_memories, layers.Layer3.search, and layers.Layer3.search_raw against empty ChromaDB query results. ChromaDB 1.x can return {documents: []} (empty outer list) for queries against a fresh collection or filter that excludes everything; the pre-fix code did results['documents'][0] and crashed with IndexError.
ChromaDB 0.6.x returns {documents: [[]]} (empty inner) which the existing if not docs check already handled, so this only manifests on 1.x, but the guard hardens both shapes for free.
Fixes #195.
How to test
Tests inject the empty-outer shape via mock so they exercise the guard regardless of installed ChromaDB version. Without the fix they raise IndexError: list index out of range.
Checklist
(python -m pytest tests/ -v): 119 passed, 0 failed
(ruff check .)