Commit fdfacdf
feat: quarantine_stale_hnsw() helper for HNSW/sqlite drift crashes

Adds a utility that renames HNSW segment directories whose data_level0.bin
is more than an hour older (by mtime) than chroma.sqlite3. Drift between
the on-disk HNSW graph and the live embeddings table is the root cause of
the intermittent count()/query() SIGSEGV in chromadb_rust_bindings: when
the Rust graph walk hits a dangling neighbor pointer for an entry that
no longer exists in sqlite, the process crashes in a background thread
at a fixed offset (a3ee57 in the .abi3.so).

Confirmed empirically on a 135K-drawer palace: HNSW claimed 137,813
elements while sqlite had 135,464 (2,349 fewer), and the crash rate was
85%. After renaming the segment dir (keeping it recoverable), 15/15
fresh-process opens and queries succeeded. The same failure mode is
tracked publicly at neo-cortex-mcp#2 and claude-mem#1110;
chroma-core#2594 acknowledges the drift is by-design and unbounded.
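
A crash rate like the one quoted above can be measured by running the same open-and-query snippet in many fresh interpreters and counting exit codes. The following is a minimal sketch of such a harness, not the procedure used for this commit; `run_in_fresh_processes`, `summarize`, the palace path, and the collection name are all hypothetical, and the segfault check assumes a POSIX system where death by signal N is reported as return code -N.

```python
import signal
import subprocess
import sys

# Hypothetical child snippet: the path and collection name are placeholders.
EXAMPLE_SNIPPET = """
import chromadb
client = chromadb.PersistentClient(path="/path/to/palace")
print(client.get_collection("drawers").count())
"""


def run_in_fresh_processes(snippet: str, runs: int = 15) -> list[int]:
    """Execute `snippet` in `runs` separate interpreters and collect exit codes."""
    codes = []
    for _ in range(runs):
        # A new interpreter per run, so no in-memory index state carries over.
        proc = subprocess.run([sys.executable, "-c", snippet], capture_output=True)
        codes.append(proc.returncode)
    return codes


def summarize(codes: list[int]) -> dict[str, int]:
    """Count clean exits and segfaults (POSIX: killed by signal N gives -N)."""
    return {
        "runs": len(codes),
        "ok": sum(1 for c in codes if c == 0),
        "segv": sum(1 for c in codes if c == -signal.SIGSEGV),
    }
```

Calling `summarize(run_in_fresh_processes(EXAMPLE_SNIPPET))` before and after quarantining gives directly comparable counts.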

The heuristic uses mtime (not pickle) because linting rules keep pickle
out of the module. The 1-hour threshold is conservative: normal writes
flush HNSW within seconds, so a legitimate segment shouldn't be
quarantined. The helper is not wired into _client() yet; callers invoke
it explicitly when drift is suspected.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

1 parent d3a2d22 commit fdfacdf
1 file changed
Lines changed: 64 additions & 0 deletions
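
The mtime heuristic described in the commit message can be sketched roughly as below. This is an illustrative reconstruction under stated assumptions, not the committed code: the parameter names `palace_path` and `stale_seconds`, the `.drift-<timestamp>` rename suffix, and the return value are all hypothetical. Only the file names (`data_level0.bin`, `chroma.sqlite3`), the rename-instead-of-delete behavior, and the 1-hour threshold come from the message.

```python
import time
from pathlib import Path

QUARANTINE_MARKER = ".drift-"  # hypothetical suffix marking renamed segments


def quarantine_stale_hnsw(palace_path: str, stale_seconds: float = 3600.0) -> list[str]:
    """Rename HNSW segment dirs whose data_level0.bin lags chroma.sqlite3.

    Returns the new paths of the directories that were renamed.
    """
    root = Path(palace_path)
    db = root / "chroma.sqlite3"
    if not db.is_file():
        return []  # nothing to compare against
    db_mtime = db.stat().st_mtime

    moved = []
    # sorted() materializes the listing before any rename happens.
    for seg in sorted(root.iterdir()):
        hnsw = seg / "data_level0.bin"
        if not seg.is_dir() or not hnsw.is_file():
            continue
        if QUARANTINE_MARKER in seg.name:
            continue  # already quarantined on an earlier call
        if db_mtime - hnsw.stat().st_mtime <= stale_seconds:
            continue  # flushed recently enough to be considered live
        stamp = time.strftime("%Y%m%d-%H%M%S")
        target = seg.with_name(f"{seg.name}{QUARANTINE_MARKER}{stamp}")
        seg.rename(target)  # rename rather than delete, so it stays recoverable
        moved.append(str(target))
    return moved
```

Renaming keeps the old graph on disk for inspection or rollback, and skipping names that already carry the marker makes repeated calls harmless.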