bug: fresh pip install mempalace silently broken on chromadb 1.5.x — writes queued but never flush, search returns ghosts #1006

@bensig

Severity: critical

Every new user who runs `pip install mempalace` today hits this bug, and nothing reports an error at write time. The mine output claims success ("Drawers filed: 231"), but `mempalace status` shows 0 user drawers and `mempalace search` either returns no results or crashes on `NoneType` metadata (separate issue #IS2).

Root cause: `pyproject.toml` declares `chromadb>=0.5.0` with no upper bound, so fresh installs pull `chromadb==1.5.8` (latest). That release has a regression: the processor that moves rows from `embeddings_queue` into `embeddings` + `embedding_metadata` never runs.

Reproduction

python3.11 -m venv .venv
source .venv/bin/activate
pip install mempalace                          # pulls chromadb 1.5.8 (latest)

mempalace mine ~/.claude/projects/<session-dir> --mode convos --wing test
#   ✓ [   1/6] ...jsonl  +191
#   Done.
#   Drawers filed: 231

mempalace status
#   MemPalace Status — 0 drawers   (or only ancient test fixtures if present)

mempalace search "anything"
#   AttributeError: 'NoneType' object has no attribute 'get'
#   (see sibling issue for searcher.py defensive fix)

Reproduced on Python 3.11.11 and 3.14.2 — same result. Not Python-version-dependent.

Evidence

Direct SQLite probe of `~/.mempalace/palace/chroma.sqlite3` after a mine that reported 231 drawers filed:

| Table | Row count |
| --- | --- |
| `embeddings` | 8 |
| `embeddings_queue` | 652 |
| `embedding_metadata` | 56 |
| `col.count()` | 4 |

Every upsert added rows to `embeddings_queue` (op=2) but `embeddings` / `embedding_metadata` never grew. The segment materialization step didn't run — not on `PersistentClient` open, not after 10+ seconds of dwell time, not after a probe `col.modify()` call, not after an additional upsert.
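The probe above can be repeated with a few lines of standard-library Python. This is a minimal sketch, not part of mempalace: `count_rows` is a hypothetical helper, and the table names are simply the ones listed in this issue. It opens the database read-only so it cannot disturb a live palace.

```python
import sqlite3
from contextlib import closing
from pathlib import Path

# Table names as listed in this issue; other chromadb versions may differ.
TABLES = ("embeddings", "embeddings_queue", "embedding_metadata")


def count_rows(db_path, tables=TABLES):
    """Return {table: row_count}; a table that does not exist maps to None."""
    # Read-only URI so the probe cannot create or modify the database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    counts = {}
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        existing = {
            name
            for (name,) in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        for table in tables:
            if table not in existing:
                counts[table] = None
                continue
            # Interpolation is safe here: the name was matched against sqlite_master.
            (n,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            counts[table] = n
    return counts
```

Calling `count_rows` on the expanded path of `~/.mempalace/palace/chroma.sqlite3` after a mine should show `embeddings_queue` growing while the other two tables stay flat if the stall is present.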

The HNSW vector index does get populated directly (sync path) — so `col.query()` returns the correct IDs and distances, but with `None` for every `documents` and `metadatas` entry:

res = col.query(query_texts=['RFC 002 source adapter'], n_results=5,
                include=['documents','metadatas','distances'])

res['ids']        # ['drawer_mempalace_technical_15fc11...', ...]  ← real queued ids
res['distances']  # [0.79, 0.89, 0.92, 1.01, 1.03]                  ← real scores
res['documents']  # [None, None, None, None, None]                  ← never flushed
res['metadatas']  # [None, None, None, None, None]                  ← never flushed

`col.get(ids=)` for the same ids returns empty, so the records are unreachable via the normal read path.
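To check an existing palace for this symptom, the query result can be scanned for hits that have an id but neither a document nor metadata. The sketch below is illustrative only: `find_ghost_ids` is a hypothetical helper, and it assumes the nested result shape that `col.query()` returns (one inner list per query text).

```python
def find_ghost_ids(res):
    """Return ids whose document AND metadata both came back as None.

    `res` is a query result with 'ids', 'documents' and 'metadatas' keys,
    each holding one inner list per query text.
    """
    ghosts = []
    ids_per_query = res.get("ids") or []
    docs_per_query = res.get("documents") or []
    metas_per_query = res.get("metadatas") or []
    for ids, docs, metas in zip(ids_per_query, docs_per_query, metas_per_query):
        for id_, doc, meta in zip(ids, docs, metas):
            # Matches the symptom in this issue: real id, nothing behind it.
            if doc is None and meta is None:
                ghosts.append(id_)
    return ghosts
```

The query must be made with `include=['documents', 'metadatas']`, as in the snippet above, or the helper has nothing to compare and returns an empty list.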

Tested versions

| `chromadb` | Result |
| --- | --- |
| 1.5.8 (current default on fresh install) | Broken: queue stalls |
| 1.5.6 | Broken: same stall |
| 0.6.3 | Works immediately: 231 drawers visible, search returns real content |

Verified by installing `chromadb==0.6.3`, wiping `~/.mempalace/palace`, re-running the same mine. Status instantly shows 231 drawers, search returns our conversation content with correct similarity scores.

Hypothesis on root cause

The 1.x architecture introduced `segments` + an async `embeddings_queue` as a WAL-like write path. The segment materialization step (queue → embeddings + embedding_metadata tables) isn't running from `PersistentClient` alone. Possibly:

  • A background-thread startup regressed in the 1.5.x series
  • An initialization/compaction call that `PersistentClient` no longer triggers
  • A separate server/worker process that `HttpClient` would start but `PersistentClient` doesn't

Not investigated upstream — ChromaDB issue search left for the fix PR.

Proposed fix

One-line change in `pyproject.toml`:

- "chromadb>=0.5.0",
+ "chromadb>=0.5.0,<1.0",

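This caps chromadb below 1.0, keeping installs on the last known-good major series (0.6.3 tested above). Existing installs that pulled 1.5.x can recover by running `pip install "chromadb<1.0"` and re-mining; drawer IDs are deterministic, so repeating the upsert is safe.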
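Until the pin ships, an affected environment can be detected at runtime. The following is a rough sketch, not existing mempalace code: `chromadb_version_ok` and `check_installed_chromadb` are hypothetical names, and the range simply mirrors the proposed `>=0.5.0,<1.0` specifier using a simplified numeric comparison rather than full version-specifier semantics.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def chromadb_version_ok(ver):
    """True if `ver` falls inside the proposed range: >=0.5.0 and <1.0."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", ver)
    if m is None:
        return False  # unparseable version string: treat as unsupported
    parts = tuple(int(g or 0) for g in m.groups())
    return (0, 5, 0) <= parts < (1, 0, 0)


def check_installed_chromadb():
    """Return (installed_version_or_None, ok) for the current environment."""
    try:
        ver = version("chromadb")
    except PackageNotFoundError:
        return None, False
    return ver, chromadb_version_ok(ver)
```

A startup check along these lines would turn the silent failure into an explicit warning for users who already have 1.5.x installed.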

Long-term: investigate upstream, contribute a fix to ChromaDB, or raise the ceiling once a 1.x version ships with the queue processor restored.

Related

Companion issue

The `searcher.py` crash on `None` metadata is a separate defensive-coding gap that should be fixed regardless — partial-state results can occur during interrupted mines or schema boundaries even on known-good versions.
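One way to close that gap is to normalize each hit before any code calls `.get` on its metadata. This is a sketch under assumptions, not the actual `searcher.py`: `iter_hits` is a hypothetical helper, and it assumes the nested `col.query()` result shape with one inner list per query text.

```python
def _column(res, key, n):
    """First inner list under `key`, padded with None up to length n."""
    rows = res.get(key)
    col = list(rows[0]) if rows else []
    return col + [None] * (n - len(col))


def iter_hits(res):
    """Yield (id, document, metadata, distance) for the first query in `res`.

    A None document becomes "" and None metadata becomes {}, so callers can
    use meta.get(...) without raising AttributeError on partial results.
    """
    ids = (res.get("ids") or [[]])[0]
    n = len(ids)
    docs = _column(res, "documents", n)
    metas = _column(res, "metadatas", n)
    dists = _column(res, "distances", n)
    for id_, doc, meta, dist in zip(ids, docs, metas, dists):
        yield id_, doc or "", meta or {}, dist
```

Whether a hit with empty content should be shown, skipped, or reported as a warning is a separate design decision; the sketch only removes the crash.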
