Commit 034023c

jphein and claude committed
feat(lifespan): auto-migrate Stop-hook checkpoints to recovery collection (Phase E)
Calls mempalace.migrate.migrate_checkpoints_to_recovery() during the daemon's lifespan startup, after the HNSW quarantine pass and before the ChromaDB warmup ping. Idempotent: re-runs return 0 once the canonical palace has reorganized, so the only one-time cost is the first restart after deploy.

Closes Phase E of the checkpoint-collection-split spec (docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md on jphein/mempalace). Phases A–D shipped on the mempalace fork on 2026-04-25/26: the migrate function exists and is wired into `mempalace repair --mode reorganize` for explicit operator runs. This commit makes the daemon do it automatically on first startup post-upgrade, so operators don't have to know about the manual command.

Gated behind PALACE_AUTO_MIGRATE_CHECKPOINTS=1 (default). Set =0 to skip auto-migrate in environments where the one-time cost is unwanted; the manual `mempalace repair --mode reorganize` path remains.

Defensive: ImportError is caught and the step skipped if the mempalace fork-side function isn't present (upstream-shaped installs); other exceptions are logged as non-fatal warnings so the daemon still starts.

After this lands and the canonical 151K-drawer palace migrates, the predicted Cat 9 A/B convergence (kind=all ≈ kind=content tokens-per-question) becomes testable; that convergence would be the empirical evidence that the structural fix worked.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
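The gate compares the variable against the literal string "0", so any other value (including an empty string or "false") leaves auto-migrate enabled. A minimal sketch of that semantics follows; `auto_migrate_enabled` is a hypothetical helper name used only for illustration, since the commit performs the comparison inline.

```python
import os


def auto_migrate_enabled(environ=None):
    """Return True unless PALACE_AUTO_MIGRATE_CHECKPOINTS is exactly "0".

    Hypothetical helper for illustration; the commit does this check inline.
    """
    env = os.environ if environ is None else environ
    # Same comparison as the diff: default "1", only "0" disables.
    return env.get("PALACE_AUTO_MIGRATE_CHECKPOINTS", "1") != "0"


# Unset variable: falls back to the default and stays enabled.
print(auto_migrate_enabled({}))
# Explicit "0": the only value that disables the step.
print(auto_migrate_enabled({"PALACE_AUTO_MIGRATE_CHECKPOINTS": "0"}))
# "false" is not "0", so the step still runs.
print(auto_migrate_enabled({"PALACE_AUTO_MIGRATE_CHECKPOINTS": "false"}))
```

Operators who expect "false" or "no" to disable the step should note that only "0" does.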
1 parent ed3a892 commit 034023c

1 file changed

Lines changed: 29 additions & 0 deletions

main.py

@@ -321,6 +321,35 @@ async def lifespan(app: FastAPI):
             len(moved), moved,
         )
 
+    # Migrate Stop-hook auto-save checkpoints from the main searchable
+    # collection into the dedicated mempalace_session_recovery collection
+    # so they don't dominate vector top-N. Idempotent: re-runs return 0
+    # once the canonical palace has reorganized. Gated behind
+    # PALACE_AUTO_MIGRATE_CHECKPOINTS so operators can disable in
+    # environments where the one-time migration cost is unwanted.
+    # See mempalace docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md.
+    if os.environ.get("PALACE_AUTO_MIGRATE_CHECKPOINTS", "1") != "0":
+        try:
+            from mempalace.migrate import migrate_checkpoints_to_recovery
+
+            loop = asyncio.get_running_loop()
+            moved_checkpoints = await loop.run_in_executor(
+                None, migrate_checkpoints_to_recovery, _mp._config.palace_path
+            )
+            if moved_checkpoints:
+                logger.info(
+                    "Migrated %d checkpoint drawer(s) from main → mempalace_session_recovery; "
+                    "mempalace_search now queries content-only.",
+                    moved_checkpoints,
+                )
+        except ImportError:
+            # mempalace.migrate.migrate_checkpoints_to_recovery is fork-side
+            # only at the moment; on upstream-shaped installs without it the
+            # daemon should still start cleanly.
+            logger.debug("migrate_checkpoints_to_recovery not available; skipping auto-migrate.")
+        except Exception as e:
+            logger.warning("Auto-migrate of checkpoints failed (non-fatal): %s", e)
+
     # Warm the ChromaDB client before accepting traffic. The Rust HNSW binding
     # occasionally segfaults on the very first request if opened cold; opening
     # it here (before yield) ensures the PersistentClient is fully initialized.
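The added hunk runs a blocking function on the default executor and treats any failure as non-fatal. The sketch below shows that pattern in isolation, assuming stand-in functions: `fake_migrate`, `failing_migrate`, and `run_migration` are hypothetical names, not part of mempalace or the daemon. Unlike the commit, the sketch returns 0 on failure so the outcome is easy to inspect.

```python
import asyncio
import logging

logger = logging.getLogger("lifespan_sketch")


def fake_migrate(palace_path):
    """Hypothetical stand-in for a blocking migrate call; returns a count."""
    return 3


def failing_migrate(palace_path):
    """Hypothetical stand-in that simulates a migration error."""
    raise RuntimeError("collection locked")


async def run_migration(migrate, palace_path):
    """Run a blocking migrate function off the event loop; never raise."""
    try:
        loop = asyncio.get_running_loop()
        # None selects the loop's default thread pool executor.
        moved = await loop.run_in_executor(None, migrate, palace_path)
    except Exception as exc:
        # Startup must continue even if the migration fails.
        logger.warning("Auto-migrate failed (non-fatal): %s", exc)
        return 0
    if moved:
        logger.info("Migrated %d checkpoint drawer(s)", moved)
    return moved


print(asyncio.run(run_migration(fake_migrate, "/tmp/palace")))     # 3
print(asyncio.run(run_migration(failing_migrate, "/tmp/palace")))  # 0
```

Offloading to the executor keeps the event loop free while the blocking call runs. An exception raised in the worker thread is re-raised at the `await`, which is why a single `try` around it is enough.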
