Summary
mempalace repair silently caps recovery at 10,000 drawers, discarding everything beyond that. On a palace with 67,580 drawers, repair completed successfully with no warning and left the palace with exactly
10,000 — a 57,580-drawer loss (~85%).
Environment
- MemPalace: 3.3.3 (also reproducible on 3.3.2, see context)
- Python: 3.x via pipx (~/.local/pipx/venvs/mempalace)
- ChromaDB: 1.5.8
- OS: macOS 15 (Darwin 25.3.0), Apple Silicon
- Palace size: ~619 MB, 67,580 drawers in mempalace_drawers, 499 in mempalace_closets
Steps to Reproduce
- Palace with > 10,000 drawers; HNSW for mempalace_drawers corrupted (see "Trigger" below). mempalace status segfaults (exit 139), a known HNSW/sqlite drift symptom.
- Quarantine the drawers HNSW files (data_level0.bin, header.bin, length.bin, link_lists.bin, index_metadata.pickle) so repair itself doesn't segfault.
- Run mempalace repair --yes.
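The quarantine step can be sketched in Python as below. This is only an illustration: the five file names come from this report, while the function name `quarantine_hnsw` and both directory arguments are hypothetical, and the segment directory that holds the drawers index has to be located by hand.

```python
from pathlib import Path
import shutil

# HNSW index files named in this report.
HNSW_FILES = (
    "data_level0.bin",
    "header.bin",
    "length.bin",
    "link_lists.bin",
    "index_metadata.pickle",
)


def quarantine_hnsw(segment_dir, quarantine_dir):
    """Move HNSW index files out of segment_dir and return the names moved."""
    segment_dir = Path(segment_dir)
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in HNSW_FILES:
        src = segment_dir / name
        if src.exists():
            # Move rather than delete, so the files can be restored later.
            shutil.move(str(src), str(quarantine_dir / name))
            moved.append(name)
    return moved
```

Moving instead of deleting keeps the corrupt index available for later inspection.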
Observed Output
=======================================================
MemPalace Repair
Palace: /Users/.../.mempalace/palace
Drawers found: 10000
...
Repair complete. 10000 drawers rebuilt.
Backup saved at /Users/.../.mempalace/palace.backup
Drawers found: 10000 — but the underlying ChromaDB metadata segment held 67,580 embeddings (verified directly in chroma.sqlite3):
SELECT c.name, COUNT(*)
FROM embeddings e
JOIN segments s ON e.segment_id = s.id
JOIN collections c ON s.collection = c.id
GROUP BY c.name;
-- mempalace_closets : 499
-- mempalace_drawers : 67580 <-- before repair
-- mempalace_drawers : 10000 <-- after repair
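For anyone who wants to run the same check before and after a repair, here is a minimal sketch using Python's standard sqlite3 module. It reuses the query above unchanged; the function name `count_embeddings` is made up for illustration, and the table and column names are assumed to match the ChromaDB 1.5.8 schema seen in this report.

```python
import sqlite3
from pathlib import Path

COUNT_SQL = """
SELECT c.name, COUNT(*)
FROM embeddings e
JOIN segments s ON e.segment_id = s.id
JOIN collections c ON s.collection = c.id
GROUP BY c.name
"""


def count_embeddings(db_path):
    """Return {collection_name: embedding_count} from a chroma.sqlite3 file."""
    # Open read-only so the check itself cannot modify the palace.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return dict(conn.execute(COUNT_SQL).fetchall())
    finally:
        conn.close()
```

Because this reads SQLite directly, it does not load the HNSW index and so should not hit the segfault that Collection.count() triggers.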
Expected Behavior
Either:
- Repair must process all drawers (paginated collection.get(limit=N, offset=K) loop), or
- Repair must fail loudly when the source set exceeds the extraction cap, refusing to overwrite.
Silently dropping 85% of memories with a "Repair complete" success message is the worst possible outcome for a tool whose entire job is data preservation.
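The "fail loudly" option could look roughly like the guard below. It is a sketch, not MemPalace code: `IncompleteRepairError` and `ensure_complete` are hypothetical names, and it assumes the caller already has a trustworthy source count (for example from a direct SQLite query) and the count of the rebuilt collection.

```python
class IncompleteRepairError(RuntimeError):
    """Raised when a rebuild holds a different number of drawers than the source."""


def ensure_complete(source_count, rebuilt_count):
    """Raise instead of letting a short rebuild replace the live palace."""
    if rebuilt_count != source_count:
        missing = source_count - rebuilt_count
        raise IncompleteRepairError(
            f"rebuilt {rebuilt_count} of {source_count} drawers "
            f"({missing} missing); refusing to overwrite the live palace"
        )
```

With the numbers from this report, `ensure_complete(67580, 10000)` raises and reports 57580 missing drawers rather than printing "Repair complete".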
Root Cause (suspected)
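collection.get() appears to return at most 10,000 rows here, either through a ChromaDB default or through an explicit limit passed by repair (not confirmed). The repair extraction path likely calls .get() once without paginating. The same issue would affect any palace with more than 10,000 drawers.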
Trigger of the Underlying HNSW Corruption
Pre-existing palace state — exact cause unknown. mempalace status, mempalace repair, and chromadb.Collection.count() on mempalace_drawers all segfault when the corrupt HNSW is loaded. Closets collection
unaffected. The 3.3.2 #1000 "Quarantine stale HNSW" fix did not auto-trigger here; manual quarantine of HNSW files was required just to make repair runnable — at which point the 10K cap surfaced.
Mitigation Used
Restored the pre-repair backup (~/.mempalace.bak-<date>) and recovered the lost 57,580 drawers manually by extracting embedding vectors directly from chroma.sqlite3 and rebuilding the HNSW index without going
through mempalace repair.
Suggested Fix
In the repair extraction loop, paginate:
BATCH = 5000
offset = 0
while True:
    batch = src.get(limit=BATCH, offset=offset, include=["documents", "metadatas", "embeddings"])
    if not batch["ids"]:
        break
    dst.add(ids=batch["ids"], documents=batch["documents"],
            metadatas=batch["metadatas"], embeddings=batch["embeddings"])
    offset += BATCH
Plus a sanity-check assertion: assert dst.count() == src_total_count (Chroma collections expose count(), not len()) before declaring success and before overwriting the live palace.
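To show that an offset-based loop does walk past the 10,000 mark, here is a self-contained sketch that runs without ChromaDB. `FakeCollection` is a hypothetical in-memory stand-in that mimics only the get/add/count calls used in this report, and `copy_all` is an illustrative name, not the repair function itself. It assumes get() returns rows in a stable order across calls.

```python
class FakeCollection:
    """In-memory stand-in exposing only get/add/count, for illustration."""

    def __init__(self, ids=()):
        self._rows = {i: f"doc for {i}" for i in ids}

    def get(self, limit=None, offset=0, include=None):
        ids = list(self._rows)
        end = None if limit is None else offset + limit
        page = ids[offset:end]
        return {"ids": page, "documents": [self._rows[i] for i in page]}

    def add(self, ids, documents):
        self._rows.update(zip(ids, documents))

    def count(self):
        return len(self._rows)


def copy_all(src, dst, batch_size=5000):
    """Copy every row from src to dst in pages and return dst's final count."""
    offset = 0
    while True:
        batch = src.get(limit=batch_size, offset=offset, include=["documents"])
        if not batch["ids"]:
            break  # an empty page means the source is exhausted
        dst.add(ids=batch["ids"], documents=batch["documents"])
        # Advance by what was actually returned, so a short page is handled.
        offset += len(batch["ids"])
    return dst.count()
```

Usage: `copy_all(FakeCollection(f"drawer-{n}" for n in range(12345)), FakeCollection())` copies all 12,345 rows in three pages. Against a real collection, the returned count should still be compared with an independently obtained source count before anything is overwritten.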
Severity
High. Silent data loss in a tool sold as "verbatim memory, 96.6% R@5". The user's only signal something went wrong was the post-repair status showing a number that looked too round.