fix: guard searcher loop against None doc/meta from ChromaDB (#1007) #1009
ChromaDB query() can return None in the documents and metadatas arrays
when a drawer's HNSW vector entry exists but its metadata/document rows
haven't been materialized yet — partial-flush states, mid-delete, or
schema upgrade boundaries. The search reader currently assumes meta is
always a dict and doc always a str, so a None entry raises:
AttributeError: 'NoneType' object has no attribute 'get'
Two-line defensive coercion: meta = meta or {}, doc = doc or "". The
hit still appears with its real distance and index; source/wing/room
fall back to "?" where content is missing. That is strictly more useful
than crashing the whole search.
Hit frequently today on chromadb 1.5.x (see #1006 root cause). Even
after that pin lands, partial-state results remain possible on
interrupted mines and schema boundaries, so the defensive guard is
worth having on its own.
See #1007 for traceback and reproduction.
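The failure and the coercion can be reproduced without ChromaDB at all. The following is a minimal sketch, not the code in `searcher.py`: `describe_hit` is a hypothetical helper introduced only for illustration, and the `"?"` fallbacks for `wing` and `room` are assumed from the description above.

```python
def describe_hit(doc, meta):
    """Return (source_file, wing, room, text) for one result row.

    Hypothetical helper: it only illustrates the two-line coercion.
    """
    meta = meta or {}  # None metadata row becomes an empty dict, so .get() works
    doc = doc or ""    # None document row becomes an empty string
    return (
        meta.get("source_file", "?"),
        meta.get("wing", "?"),
        meta.get("room", "?"),
        doc,
    )


# Without the guard, a None metadata entry fails as in the report.
meta = None
try:
    meta.get("source_file", "?")
except AttributeError as exc:
    unguarded_error = str(exc)

# With the guard, the same row yields fallback values instead of raising.
guarded = describe_hit(None, None)
```

Because `or` also replaces an empty dict or empty string with an equivalent empty value, rows that are present but empty behave the same as before the change.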
Code review
Found 1 issue:
mempalace/mempalace/searcher.py Lines 283 to 291 in 80ba1ea
🤖 Generated with Claude Code
Closing in favor of #999 (opened ~72 min earlier and more comprehensive). #999 covers the same Since
Same class of bug as #1007: ChromaDB's query() can return None in the documents and metadatas arrays when a drawer's HNSW vector entry exists but its metadata/document rows haven't been materialized. The code in Layer3.search_raw (mempalace/layers.py) calls meta.get("wing", ...), meta.get("room", ...), meta.get("source_file", ...) directly without null safety, so it raises:
AttributeError: 'NoneType' object has no attribute 'get'
Two-line defensive coercion matching the pattern in #1009 / PR #999 for searcher.py: meta = meta or {}, doc = doc or "". The hit still appears with its real distance; source/wing/room use their default values where the metadata row is missing.
Frequently hit on chromadb 1.5.x (root cause #1006). Even after the chromadb floor lands (#1010), partial-state results remain possible during interrupted mines and schema upgrade boundaries, so the guard is worth having on its own.
Fixes #1011.
Summary
Fixes #1007.
`mempalace search` crashes with `AttributeError: 'NoneType' object has no attribute 'get'` when any result has `None` metadata. Happens routinely in partial-flush states (very common on chromadb 1.5.x per #1006, but also possible during interrupted mines or schema upgrades even on fixed versions).
Change
Two-line defensive coercion at the top of the results loop in `mempalace/searcher.py`:
```diff
 for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1):
+    meta = meta or {}
+    doc = doc or ""
     similarity = round(max(0.0, 1 - dist), 3)
     source = Path(meta.get("source_file", "?")).name
```
Result: the hit still shows with its real distance, source/wing/room fall back to `"?"`, the document body prints empty. User sees that a hit exists even if its content row is missing — strictly more useful than crashing.
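To make the resulting behavior concrete, here is a hedged sketch of the guarded loop run against a hand-built result dict. It assumes the nested-list shape that `query()` returns for a single query (`documents`, `metadatas`, and `distances` each holding one inner list); `format_hits`, the sample paths, and the `wing`/`room` values are hypothetical and not taken from `searcher.py`.

```python
from pathlib import Path


def format_hits(results):
    """Build display rows from a query()-shaped dict (single query assumed)."""
    docs = results["documents"][0]
    metas = results["metadatas"][0]
    dists = results["distances"][0]

    hits = []
    for i, (doc, meta, dist) in enumerate(zip(docs, metas, dists), 1):
        meta = meta or {}  # guard: metadata row not materialized yet
        doc = doc or ""    # guard: document row not materialized yet
        similarity = round(max(0.0, 1 - dist), 3)
        hits.append(
            {
                "index": i,
                "similarity": similarity,
                "source": Path(meta.get("source_file", "?")).name,
                "wing": meta.get("wing", "?"),
                "room": meta.get("room", "?"),
                "text": doc,
            }
        )
    return hits


# Simulated partial-flush state: the second hit has a distance but no rows.
results = {
    "documents": [["notes on HNSW", None]],
    "metadatas": [
        [
            {"source_file": "/palace/notes/hnsw.md", "wing": "research", "room": "indexing"},
            None,
        ]
    ],
    "distances": [[0.25, 0.5]],
}
hits = format_hits(results)
```

The second entry keeps its index and similarity while every metadata-derived field shows `"?"` and the text is empty, which matches the behavior described above.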
Test plan
`pip install mempalace` silently broken on chromadb 1.5.x - writes queued but never flush, search returns ghosts #1006)
Related
#1006 (`pip install mempalace` silently broken on chromadb 1.5.x - writes queued but never flush, search returns ghosts): root cause for why `None` metadata appears routinely today