
_fix_blob_seq_ids misreads chromadb 1.5.x max_seq_id BLOBs, suppressing new writes #1134

@sha2fiddy

What's broken

mempalace/backends/chroma.py::_fix_blob_seq_ids (added in #664) runs int.from_bytes(blob, 'big') on every BLOB row in max_seq_id. But chromadb 1.5.x writes its own BLOB format there — b'\x11\x11' followed by six ASCII digits — and the shim reads that as a big-endian u64. The result is a ~1.23×10^18 integer sitting in the column.
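For concreteness, here is a minimal sketch of the misread, assuming the BLOB layout described above (two 0x11 bytes, then six ASCII digits). It only reproduces the arithmetic; it is not the shim's actual code.

```python
# Assumed layout from this report: b'\x11\x11' prefix + six ASCII digits (8 bytes total).
blob = b"\x11\x11" + b"481194"

# What the shim effectively computes for that row.
misread = int.from_bytes(blob, "big")

print(misread)            # 1229821589196978484
print(f"{misread:.3e}")   # 1.230e+18
```

The result matches the poisoned value recorded for the first segment in the repro table.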

chromadb's embeddings_queue filter is seq_id > start, and start is read from max_seq_id.seq_id at segment open. New rows' seq_ids are small (<1e10 even on large palaces), so nothing ever crosses the poisoned threshold. Writes queue and never flush. Old drawers still read fine, which is why the bug is silent until you try to fetch something written after the poisoning.
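A toy model of that filter shows why nothing flushes. The function name `pending` and the sample seq_ids are made up for illustration; this is not chromadb's code, only the seq_id > start comparison described above.

```python
def pending(queue_seq_ids, start):
    """Toy version of the `seq_id > start` filter (hypothetical helper)."""
    return [s for s in queue_seq_ids if s > start]

# Illustrative seq_ids for writes made just after the clean value 502607.
new_writes = [502608, 502609, 502610]

print(pending(new_writes, 502607))               # [502608, 502609, 502610]
print(pending(new_writes, 1229822654365970487))  # []
```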

Repro

Any palace with BLOB-typed max_seq_id.seq_id rows is poisoned the first time it opens under chromadb 1.5.x with the #664 shim active. Mine was hit on 2026-04-20:

| segment | scope | clean seq_id | poisoned seq_id |
|---|---|---|---|
| b8e762ef-92b8-4b67-8bd0-ae8ba4622826 | closets VECTOR | 481194 | 1229821589196978484 |
| c9bf2ac4-2557-4540-b5f8-76f0f75c71be | closets METADATA | 501418 | 1229822654349062456 |
| 5ef0f58c-dea2-4247-9a0a-fa3c256f0a4c | drawers VECTOR | 502498 | 1229822654365841720 |
| b7b6cc41-a631-4d0c-803e-5256793be770 | drawers METADATA | 502607 | 1229822654365970487 |

Clean values came from a chroma.sqlite3.pre-sysdb10fix-2026-04-20 backup. 465k+ existing drawers still read; every write after 2026-04-20 silently disappeared.
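As a cross-check of the diagnosis, the four poisoned integers above can be turned back into bytes and compared with the backup values. This is a sketch under the same layout assumption (0x11 0x11 prefix plus ASCII digits); `decode_poisoned` is a hypothetical helper, not part of mempalace.

```python
# (clean, poisoned) pairs copied from the repro data above.
ROWS = [
    (481194, 1229821589196978484),
    (501418, 1229822654349062456),
    (502498, 1229822654365841720),
    (502607, 1229822654365970487),
]

def decode_poisoned(value):
    """Undo the big-endian misread: int -> 8 bytes -> strip prefix -> ASCII digits."""
    raw = value.to_bytes(8, "big")
    if raw[:2] != b"\x11\x11":
        raise ValueError("not a misread chromadb 1.5.x BLOB")
    return int(raw[2:].decode("ascii"))

for clean, poisoned in ROWS:
    print(clean, decode_poisoned(poisoned) == clean)
```

All four rows round-trip to the clean values from the backup, which is consistent with the misread being the only transformation applied.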

Fix

Shim: drop max_seq_id from the loop — chromadb owns that format. The embeddings branch stays for real 0.6.x palaces but skips any BLOB starting with b'\x11\x11' as belt-and-braces.
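The belt-and-braces guard could look roughly like this. It is a sketch only: `convert_legacy_seq_id` and `CHROMA_1_5_PREFIX` are hypothetical names, and the eventual PR may structure the check differently.

```python
CHROMA_1_5_PREFIX = b"\x11\x11"  # prefix observed in this report

def convert_legacy_seq_id(value):
    """Convert a 0.6.x-style big-endian BLOB to int; leave everything else alone."""
    if not isinstance(value, (bytes, bytearray)):
        return value  # already an INTEGER, nothing to do
    if bytes(value[:2]) == CHROMA_1_5_PREFIX:
        return value  # chromadb 1.5.x format: not ours to reinterpret
    return int.from_bytes(value, "big")

print(convert_legacy_seq_id((502607).to_bytes(8, "big")))  # 502607
print(convert_legacy_seq_id(b"\x11\x11481194"))            # b'\x11\x11481194'
```

A genuine big-endian value starting with 0x1111 would be above 1.2×10^18, far beyond any plausible seq_id, so the prefix check should not reject real legacy rows.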

New subcommand mempalace repair --mode max-seq-id restores poisoned rows (detected as seq_id > 2^53). There are two sources of clean values: --from-sidecar <path> copies exact values out of a pre-corruption chroma.sqlite3 backup, and the default heuristic uses MAX(embeddings.seq_id) over the owning collection. The heuristic matches the METADATA max exactly. VECTOR segments land a few seq_ids ahead of their true value, which the queue treats as already consumed, so the only effect is skipping a tiny window of already-indexed embeddings.
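A rough sketch of the default heuristic, run against a deliberately simplified in-memory schema. The table and column layout below is an assumption for illustration, not chromadb's real schema, and `repair_max_seq_id` is a hypothetical function name. The --from-sidecar path is not covered.

```python
import sqlite3

POISON_THRESHOLD = 2 ** 53  # detection cutoff named above

def repair_max_seq_id(conn):
    """Reset poisoned rows to MAX(embeddings.seq_id) of the owning collection."""
    poisoned = conn.execute(
        "SELECT segment_id FROM max_seq_id WHERE seq_id > ? ORDER BY segment_id",
        (POISON_THRESHOLD,),
    ).fetchall()
    fixed = {}
    for (segment_id,) in poisoned:
        (best,) = conn.execute(
            """
            SELECT MAX(e.seq_id)
            FROM embeddings e
            JOIN segments owner ON owner.id = e.segment_id
            JOIN segments target ON target.collection = owner.collection
            WHERE target.id = ?
            """,
            (segment_id,),
        ).fetchone()
        if best is None:
            continue  # nothing to infer from; leave the row untouched
        conn.execute(
            "UPDATE max_seq_id SET seq_id = ? WHERE segment_id = ?",
            (best, segment_id),
        )
        fixed[segment_id] = best
    conn.commit()
    return fixed

# Simplified stand-in schema and data (segment ids are placeholders).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE segments (id TEXT PRIMARY KEY, scope TEXT, collection TEXT);
    CREATE TABLE embeddings (segment_id TEXT, seq_id INTEGER);
    CREATE TABLE max_seq_id (segment_id TEXT PRIMARY KEY, seq_id INTEGER);
    INSERT INTO segments VALUES
        ('seg-meta', 'METADATA', 'drawers'), ('seg-vec', 'VECTOR', 'drawers');
    INSERT INTO embeddings VALUES ('seg-meta', 502606), ('seg-meta', 502607);
    INSERT INTO max_seq_id VALUES
        ('seg-meta', 1229822654365970487), ('seg-vec', 1229822654365841720);
""")

print(repair_max_seq_id(conn))  # {'seg-meta': 502607, 'seg-vec': 502607}
```

In this toy data the VECTOR row is restored to 502607 rather than its true 502498, which mirrors the "few seq_ids ahead" behaviour described above.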

PR to follow.

Possibly related

Out of scope

  • chromadb's choice of max_seq_id BLOB format.
  • A status --health command that would have caught this earlier.


Labels: bug, storage
