feat: mempalace migrate — recover palaces from different ChromaDB versions by bensig · Pull Request #502 · MemPalace/mempalace

bensig · 2026-04-10T07:08:45Z

Summary

Adds a mempalace migrate command that recovers palaces created with a different ChromaDB version. It reads documents and metadata directly from SQLite (bypassing the ChromaDB API, which fails on version-mismatched databases), then reimports them into a fresh palace.

Fixes the 3.0.0 → 3.1.0 upgrade path where chromadb was silently downgraded from 1.5.x to 0.6.x, breaking the on-disk format.

How it works

1. Detects the chromadb version from the SQLite schema (presence of the schema_str column indicates 1.x)
2. Tries the ChromaDB API first; if it works, no migration is needed
3. Falls back to raw SQL extraction of all documents + metadata
4. Builds a fresh palace in a temp directory using the currently installed chromadb
5. Swaps atomically (backup old → move new into place)
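
The version-detection step can be sketched roughly as follows. This is not the PR's detect_chromadb_version; the function names here are hypothetical, and because the description does not say which table holds schema_str, the sketch assumes it is enough to look for that column name in any table.

```python
import sqlite3
from contextlib import closing


def has_column(conn: sqlite3.Connection, column: str) -> bool:
    """Return True if any table in the database has a column with this name."""
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        quoted = table.replace('"', '""')
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')]
        if column in columns:
            return True
    return False


def guess_chromadb_generation(db_path: str) -> str:
    """Assumed heuristic: a schema_str column means 1.x, otherwise 0.6.x."""
    with closing(sqlite3.connect(db_path)) as conn:
        return "1.x" if has_column(conn, "schema_str") else "0.6.x"
```

Scanning every table keeps the sketch independent of the exact ChromaDB table layout, at the cost of a few extra PRAGMA calls.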

Usage

mempalace migrate                          # migrate default palace
mempalace migrate --dry-run                # preview what would be migrated
mempalace --palace /path/to/palace migrate  # migrate specific palace

Tested

Created a chromadb 1.5.7 palace, confirmed 0.6.3 can't read it
Ran mempalace migrate — all drawers recovered with full metadata
Verified status and search work on migrated palace

Fixes #457

Test plan

ruff check . / ruff format --check . — clean
pytest tests/ -v — 534 passed
Manual: 1.x → 0.6.x migration with 3 test drawers, all recovered

…abels
- Add AGENTS.md with build commands, project structure, conventions
- Add .github/dependabot.yml for automated pip + actions updates
- Add .github/CODEOWNERS for review routing
- Expand .gitignore (.env, .DS_Store, IDE configs, coverage, venvs)
- Add C901 complexity rule to ruff (max-complexity=25, benchmarks excluded)
- Add --durations=10 to pytest CI for test performance tracking
- Add docs/schema.sql for knowledge graph schema documentation
- Created P0-P3 priority + area/* + security/performance/docs labels

…sions

Reads documents and metadata directly from ChromaDB's SQLite (bypassing the API that fails on version-mismatched databases), then reimports into a fresh palace using the currently installed ChromaDB.

Fixes the 3.0.0 → 3.1.0 upgrade path where chromadb was downgraded from 1.5.x to 0.6.x, breaking the on-disk storage format.

- Detects chromadb version from SQLite schema (0.6.x vs 1.x)
- Extracts all drawers with full metadata via raw SQL
- Builds fresh palace in temp dir, swaps atomically
- Backs up original palace before any changes
- Supports --dry-run to preview without modifying

Fixes #457

web3guru888

The SQLite-bypass approach here is exactly right — it's the only reliable path when the ChromaDB API itself is broken. A few observations from our integration:

migrate.py looks solid. The raw SQL extraction handles the 1.x → 0.6.x schema gap cleanly, the backup-then-swap pattern is safe, and the dry-run flag is genuinely useful (we would have used this during our own upgrade pain). The version detection via schema_str column presence is the right heuristic.

One gap worth addressing: extract_drawers_from_sqlite queries on embedding_id but the inner metadata query uses e.id (embeddings.id) not embeddings.embedding_id. In most cases these map correctly through the JOIN, but the two-query structure leaves a subtle mismatch possibility: if rows returned for the same embedding_id have different e.id values (e.g. duplicate entries from a failed partial write), the metadata merge could silently drop keys. Consider joining on embedding_id throughout or at minimum asserting uniqueness.
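
A uniqueness check of the kind suggested here might look like the sketch below. The table and column names (embeddings.id, embeddings.embedding_id) come from this thread; the function name and the two-column schema in the demo are assumptions made only so the example runs on its own.

```python
import sqlite3


def find_duplicate_embedding_ids(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (embedding_id, count) for each embedding_id tied to more than one embeddings.id."""
    rows = conn.execute(
        """
        SELECT embedding_id, COUNT(DISTINCT id)
        FROM embeddings
        GROUP BY embedding_id
        HAVING COUNT(DISTINCT id) > 1
        ORDER BY embedding_id
        """
    )
    return [(row[0], row[1]) for row in rows]


# Demo against an in-memory database with an assumed minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, embedding_id TEXT)")
conn.executemany(
    "INSERT INTO embeddings (id, embedding_id) VALUES (?, ?)",
    [(1, "drawer-a"), (2, "drawer-b"), (3, "drawer-a")],  # drawer-a appears twice
)
duplicates = find_duplicate_embedding_ids(conn)
conn.close()
```

A migration could refuse to proceed, or at least warn, when this list is non-empty.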

Embeddings not migrated — explicitly noted in the PR, but worth adding a user-visible line in the migrate output: something like 'Embeddings will be regenerated on first search — initial queries after migration may be slow.' Users who haven't read the code will wonder why the first search takes 10x longer.

CODEOWNERS concentration — same note as in #497: bensig now owns .claude-plugin/, .codex-plugin/, integrations/, .github/ and is co-owner of mempalace/. That's a single-point bottleneck if unavailable. Fine for now but worth a comment in CODEOWNERS.

The --durations=10 CI addition is useful — we've caught slow tests that way. The C901 complexity rule is a good addition too.

534 tests passing on a PR with a new migrate.py is a bit surprising — are there dedicated tests for extract_drawers_from_sqlite and detect_chromadb_version? The coverage count may be coming from the other files in this batch rather than the new migration logic. Worth adding at minimum a unit test with a mock SQLite db that matches the 1.x and 0.6.x schema patterns.

web3guru888

This is a solid approach. Reading directly from SQLite to bypass a broken ChromaDB API is exactly the right call — it's the only reliable way to recover data across major schema changes, and it's robust because SQLite's own format hasn't changed.

A few observations from reading the implementation:

The raw SQL query works but has a gap:

SELECT e.embedding_id,
       MAX(CASE WHEN em.key = 'chroma:document' THEN em.string_value END) as document
FROM embeddings e
JOIN embedding_metadata em ON em.id = e.id
GROUP BY e.embedding_id

This assumes ChromaDB 1.x stores the document text as chroma:document in embedding_metadata. If a palace was created with an older 0.4.x schema (some users upgraded from 3.0.0 which shipped with >=0.4.0), the document storage key may differ. Worth testing against a real 0.4.x-created palace before merging, or adding a fallback query path.
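
One way to surface such a schema difference instead of silently losing text is to run the query quoted above and report every row whose document comes back NULL. The sketch below does that; the helper name and the minimal three-column metadata schema in the demo are assumptions, and no alternative document key is guessed.

```python
import sqlite3

DOCUMENT_QUERY = """
SELECT e.embedding_id,
       MAX(CASE WHEN em.key = 'chroma:document' THEN em.string_value END) AS document
FROM embeddings e
JOIN embedding_metadata em ON em.id = e.id
GROUP BY e.embedding_id
ORDER BY e.embedding_id
"""


def split_by_document_presence(conn: sqlite3.Connection):
    """Run the pivot query; return ({embedding_id: document}, [ids with no document])."""
    found, missing = {}, []
    for embedding_id, document in conn.execute(DOCUMENT_QUERY):
        if document is None:
            missing.append(embedding_id)  # no 'chroma:document' key for this row
        else:
            found[embedding_id] = document
    return found, missing


# Demo with an assumed minimal schema; drawer-b has metadata but no document key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, embedding_id TEXT)")
conn.execute("CREATE TABLE embedding_metadata (id INTEGER, key TEXT, string_value TEXT)")
conn.executemany("INSERT INTO embeddings VALUES (?, ?)", [(1, "drawer-a"), (2, "drawer-b")])
conn.executemany(
    "INSERT INTO embedding_metadata VALUES (?, ?, ?)",
    [(1, "chroma:document", "hello"), (1, "wing", "work"), (2, "wing", "home")],
)
found, missing = split_by_document_presence(conn)
conn.close()
```

Because the query uses an inner JOIN, an embedding with no metadata rows at all would not appear in either result; a LEFT JOIN would be needed to catch that case.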

Batch import uses col.add(), not col.upsert(): If the migration is re-run (e.g., interrupted and restarted), the second run will crash on the first batch containing an ID that's already in the fresh palace. Consider col.upsert() or a check-before-import guard to make re-runs safe. This matters especially because a power failure between swap and completion would leave users stranded.
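
A re-run-safe import loop could be sketched as below. The drawer dictionary shape, the function name, and the batch size are hypothetical; the only ChromaDB call assumed is Collection.upsert with ids, documents, and metadatas keyword arguments. The InMemoryCollection class is a stand-in so the sketch runs without chromadb installed.

```python
def import_drawers(col, drawers, batch_size=500):
    """Write drawers in batches with upsert so a repeated ID is overwritten."""
    total = 0
    for start in range(0, len(drawers), batch_size):
        batch = drawers[start:start + batch_size]
        col.upsert(
            ids=[d["id"] for d in batch],
            documents=[d["document"] for d in batch],
            metadatas=[d["metadata"] for d in batch],
        )
        total += len(batch)
    return total


class InMemoryCollection:
    """Stand-in for a ChromaDB collection; only mimics upsert semantics."""

    def __init__(self):
        self.rows = {}

    def upsert(self, ids, documents, metadatas):
        for id_, doc, meta in zip(ids, documents, metadatas):
            self.rows[id_] = (doc, meta)


drawers = [
    {"id": f"drawer-{i}", "document": f"note {i}", "metadata": {"wing": "demo"}}
    for i in range(3)
]
col = InMemoryCollection()
import_drawers(col, drawers, batch_size=2)
import_drawers(col, drawers, batch_size=2)  # simulated re-run after an interruption
```

After the second call the collection still holds three rows, which is the property a restarted migration needs.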

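Connection leak in extract_drawers_from_sqlite: The conn.close() call at the end is correct, but if either SQL query raises (malformed DB, unexpected schema), conn leaks. Note that with sqlite3.connect(db_path) as conn would not fix this on its own: sqlite3's connection context manager commits or rolls back the transaction but does not close the connection. Wrapping the connection in contextlib.closing(...), or using try/finally, would close it on error too. The detect_chromadb_version function already uses finally: conn.close() correctly, so the same pattern works here.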
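
A minimal sketch of closing the connection on every path, using contextlib.closing from the standard library. The function name and the query are illustrative, not the PR's code.

```python
import sqlite3
from contextlib import closing


def read_embedding_ids(db_path: str) -> list[str]:
    """Read all embedding_id values; the connection is closed even if the query raises."""
    # closing() calls conn.close() on exit. The connection's own context manager
    # only commits or rolls back the transaction and leaves the connection open.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute("SELECT embedding_id FROM embeddings ORDER BY embedding_id")
        return [row[0] for row in rows]


# Demo: a failing query still leaves the connection closed.
conn = sqlite3.connect(":memory:")
try:
    with closing(conn):
        conn.execute("SELECT * FROM missing_table")
except sqlite3.OperationalError:
    pass  # expected: the table does not exist
```

After the with block exits, any further use of conn raises sqlite3.ProgrammingError, which confirms it was closed despite the error.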

Missing: KG migration. The SQLite knowledge graph at ~/.mempalace/knowledge_graph.db is not ChromaDB-dependent and doesn't need version migration, but it's worth noting in the output that KG data is preserved separately. Users who've been burned by the 3.0→3.1 upgrade might not realize their KG is safe.

# Could add after migration completes:
import os

kg_path = os.path.expanduser("~/.mempalace/knowledge_graph.db")
if os.path.exists(kg_path):
    print(f"  Note: Knowledge graph at {kg_path} is unaffected.")

The dry-run, atomic backup-then-swap, and version detection logic are all well-designed. This fills a real gap — we've seen multiple reports in #457 and #469 of users losing palace data on upgrade. With the upsert fix and connection guard, this is close to ready.

R0uter · 2026-04-10T07:54:09Z

confirmed from #469 it works for my old palace data.

web3guru888 · 2026-04-10T10:16:43Z

This is exactly the tool the community has been missing. The ChromaDB version break has been one of the most painful issues we've dealt with in our integration — we ended up pinning chromadb >=0.6.0,<0.7.0 specifically to avoid it, and even with that guard we had to do a one-off manual SQLite extraction when a dev machine had the wrong version installed. Having mempalace migrate as a proper first-class command is a big quality-of-life improvement.

The implementation looks solid. A few things worth testing if you haven't already:

Idempotency: if someone runs migrate on an already-migrated palace (e.g. they forget they already ran it), does it exit cleanly without corrupting anything? Step 2 (try API first) should handle this, but worth confirming it's a clean no-op.

Interruption recovery: if the process is killed mid-swap (between backup and move), what's the recovery path? The atomic swap approach is good, but it'd be reassuring to document what state the user is left in and how to resume — even just a note in the --dry-run output or error message pointing to the backup dir.
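
To make the recovery question concrete, here is a rough sketch of a backup-then-swap with rollback. It is not the PR's implementation: the function name and the timestamped backup naming are hypothetical, and it assumes the palace and its backup live on the same filesystem so that os.rename works.

```python
import os
import shutil
import time


def swap_into_place(palace_dir: str, fresh_dir: str) -> str:
    """Move the old palace to a timestamped backup, then move the fresh one in.

    Returns the backup path. If the second move fails, the backup is restored.
    """
    palace_dir = palace_dir.rstrip(os.sep)
    backup_dir = f"{palace_dir}.backup-{time.strftime('%Y%m%d-%H%M%S')}"
    os.rename(palace_dir, backup_dir)  # step 1: old palace becomes the backup
    try:
        shutil.move(fresh_dir, palace_dir)  # step 2: fresh palace takes its place
    except Exception:
        os.rename(backup_dir, palace_dir)  # roll back so a palace still exists
        raise
    return backup_dir
```

A hard kill or power loss between the two steps cannot be caught by the except clause. In that state the palace directory is missing and the backup directory is intact, so recovery is a manual rename of the backup to the original path; printing the backup path before step 1 would give the user what they need to do that.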

Neither of these feels like a blocker given R0uter's confirmation and the manual test coverage. Just good edge cases to have in the test plan before merge. Nice work on this one.

milla-jovovich

Looks good, merge it

milla-jovovich

Looks good

bensig added 2 commits April 9, 2026 23:29

bensig mentioned this pull request Apr 10, 2026

3.1.0 upgrade silently breaks running MCP servers due to ChromaDB version downgrade (1.5.6 → 0.6.3) #457

Closed

bensig requested a review from milla-jovovich April 10, 2026 07:10

web3guru888 reviewed Apr 10, 2026

milla-jovovich approved these changes Apr 10, 2026

web3guru888 reviewed Apr 10, 2026

web3guru888 mentioned this pull request Apr 10, 2026

Feedback from production use: the weight of memory itself #514

Closed

web3guru888 mentioned this pull request Apr 10, 2026

Hook scripts call "mempalace hook run" command that doesn't exist in PyPI package v3.0.0 #377

Open

milla-jovovich self-assigned this Apr 10, 2026

milla-jovovich approved these changes Apr 10, 2026

bensig merged commit 559e43b into main Apr 10, 2026
6 checks passed

web3guru888 mentioned this pull request Apr 10, 2026

chore(deps): update chromadb requirement from <0.7,>=0.5.0 to >=0.5.0,<1.6 #546

Closed

bensig deleted the fix/chromadb-version-migration branch April 10, 2026 16:26

milla-jovovich approved these changes Apr 10, 2026

bensig mentioned this pull request Apr 12, 2026

palace data gone after upgrade to v3.1.0 #469

Closed

sha2fiddy mentioned this pull request Apr 12, 2026

fix: mempalace migrate produces incomplete ChromaDB 0.6.x schema #722

Open

igorls mentioned this pull request Apr 13, 2026

release: v3.2.0 #762

Merged

4 tasks

bensig merged 2 commits into main from fix/chromadb-version-migration
