feat(checkpoint-split): phase D migration + PreCompact recovery write
Closes the structural fix from phases A-C: the verbatim store
(mempalace_drawers) is now actually verbatim — derivative writes
(Stop-hook + PreCompact checkpoints) live in the dedicated
mempalace_session_recovery collection.
Phase D — migration of existing checkpoint drawers:
- migrate_checkpoints_to_recovery(palace_path, batch_size=1000) in
mempalace/migrate.py walks the main collection in pages, filters
drawers with topic in _CHECKPOINT_TOPICS in Python (avoids the
chromadb 1.5.x $in/$nin filter-planner bug), copies them to the
recovery collection (preserving IDs + metadata), then deletes from
main. Idempotent — re-running on a fully-reorganized palace
returns 0.
- Add-then-delete order: a crash mid-migration leaves a duplicate,
not a loss. Recoverable.
- 6 new tests in test_migrate.py::TestMigrateCheckpointsToRecovery
(moves checkpoints, idempotent, preserves IDs/metadata, handles
auto-save synonym, no-checkpoints returns 0, no-palace returns 0).
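The walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in mempalace/migrate.py: FakeCollection is a hypothetical in-memory stand-in for a ChromaDB collection, migrate_sketch and CHECKPOINT_TOPICS are made-up names, and collecting all matches before moving them is a simplification chosen here so that deletes cannot shift the paging offset. Only the paged read, the Python-side topic filter, and the add-then-delete order come from the description.

```python
from typing import Dict, List, Tuple

# Assumed topic names: the commit mentions "checkpoint" and an "auto-save" synonym.
CHECKPOINT_TOPICS = {"checkpoint", "auto-save"}


class FakeCollection:
    """Hypothetical in-memory stand-in for a vector-store collection."""

    def __init__(self) -> None:
        self.rows: Dict[str, Tuple[str, dict]] = {}

    def upsert(self, ids, documents, metadatas) -> None:
        for i, doc, meta in zip(ids, documents, metadatas):
            self.rows[i] = (doc, dict(meta))

    def page(self, offset: int, limit: int) -> List[Tuple[str, str, dict]]:
        ids = sorted(self.rows)[offset:offset + limit]
        return [(i, *self.rows[i]) for i in ids]

    def delete(self, ids) -> None:
        for i in ids:
            self.rows.pop(i, None)


def migrate_sketch(main: FakeCollection, recovery: FakeCollection,
                   batch_size: int = 1000) -> int:
    """Move checkpoint-topic rows from main to recovery; return the count moved."""
    # Pass 1: read page by page and filter on topic in Python,
    # so no $in/$nin where-clause is ever sent to the store.
    matches = []
    offset = 0
    while True:
        page = main.page(offset, batch_size)
        if not page:
            break
        matches.extend(r for r in page
                       if (r[2] or {}).get("topic") in CHECKPOINT_TOPICS)
        offset += len(page)

    # Pass 2: copy first, delete second. A crash between the two calls
    # leaves a duplicate in main rather than a lost row.
    moved = 0
    for start in range(0, len(matches), batch_size):
        batch = matches[start:start + batch_size]
        ids = [r[0] for r in batch]
        recovery.upsert(ids, [r[1] for r in batch], [r[2] for r in batch])
        main.delete(ids)
        moved += len(ids)
    return moved
```

A second call on the same pair of collections finds no checkpoint-topic rows left in main and returns 0, which is the idempotence property the tests above check for.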
CLI:
- mempalace repair --mode {rebuild,reorganize} (default rebuild).
The new "reorganize" mode invokes migrate_checkpoints_to_recovery
on the resolved palace and prints a one-line result. Designed for
operators upgrading a palace post-A-C and for palace-daemon's
eventual lifespan auto-migrate (phase E, deferred).
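For readers unfamiliar with the flag shape, a minimal argparse sketch of a --mode option with two choices and a default is shown below. It is not the dispatch in cli.py: build_parser, dispatch, and the two handler callables are hypothetical names, and the real command takes additional arguments (such as the palace path) that are omitted here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a `repair` subcommand and a two-valued --mode flag."""
    parser = argparse.ArgumentParser(prog="mempalace")
    sub = parser.add_subparsers(dest="command", required=True)
    repair = sub.add_parser("repair")
    repair.add_argument(
        "--mode",
        choices=["rebuild", "reorganize"],
        default="rebuild",  # existing behaviour stays the default
    )
    return parser


def dispatch(argv, rebuild, reorganize):
    """Parse argv and call the handler matching --mode (handlers are hypothetical)."""
    args = build_parser().parse_args(argv)
    handler = reorganize if args.mode == "reorganize" else rebuild
    return handler()
```

Keeping rebuild as the default means existing invocations of repair behave as before, and argparse rejects any mode outside the two listed choices.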
PreCompact incorporation:
- hook_precompact now calls _save_diary_direct mirroring hook_stop,
leaving a recovery-collection marker before transcript mining +
compaction. Per JP's framing — "the palace is just verbatim
chats + tool calls" — a context-compaction event is a derivative
write that belongs in the recovery store, not the searchable
corpus. Failures here are non-fatal (logged, mining still runs).
- This addresses the gap noted earlier: today's split was
Stop-only; PreCompact entries are now incorporated.
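The non-fatal behaviour can be pictured with the sketch below. It is an assumption-laden illustration, not hook_precompact itself: precompact_sketch, save_marker, and mine_transcript are hypothetical names standing in for the hook, the _save_diary_direct call, and the transcript-mining step.

```python
import logging

logger = logging.getLogger("precompact_sketch")


def precompact_sketch(save_marker, mine_transcript) -> bool:
    """Try to write the recovery marker, then mine regardless of the outcome.

    Returns True if the marker write succeeded, False if it failed and was logged.
    """
    marker_saved = True
    try:
        save_marker()
    except Exception:
        # Non-fatal: record the failure and carry on so mining is never skipped.
        logger.exception("recovery marker write failed; continuing with mining")
        marker_saved = False
    mine_transcript()
    return marker_saved
```

The marker write comes first so the timestamp reflects the moment before compaction, and the broad except is deliberate: whatever goes wrong in the marker write, mining still runs.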
Deploy script:
- scripts/deploy.sh post-restart import check now also imports
_segment_appears_healthy, migrate_checkpoints_to_recovery,
add_drawers, _build_drawer — proves all of today's fork-ahead
surface is loaded after a restart, not just the gate fixture.
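The idea of an import check can be sketched in Python as below. scripts/deploy.sh is a shell script and its actual check is not shown here, so this is only an approximation: missing_names is a hypothetical helper, and because the commit does not say which module each of the four names lives in (other than migrate_checkpoints_to_recovery in mempalace/migrate.py), the usage note uses standard-library modules instead.

```python
import importlib


def missing_names(required):
    """Return 'module.name' strings for every required name that cannot be found.

    `required` maps a module name to the attribute names expected on it.
    """
    missing = []
    for module_name, names in required.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The whole module failed to import, so every name on it is missing.
            missing.extend(f"{module_name}.{n}" for n in names)
            continue
        missing.extend(
            f"{module_name}.{n}" for n in names if not hasattr(module, n)
        )
    return missing
```

For example, missing_names({"json": ["dumps", "nope"]}) reports only "json.nope". A deploy step could fail the restart whenever the returned list is non-empty.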
Phase E (palace-daemon lifespan auto-migrate) is still deferred —
cross-repo, requires separate go-ahead.
Suite 1372/1372 pass. Test count bumped 1366→1372 in README.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
CLAUDE.md: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ Ruff for linting (`ruff check`), line length 100, target Python 3.9.
21. **feat: `kind` filter on `search_memories` excludes Stop-hook checkpoints by default** (commits `8d02835` → `3d85739` → `398f42f` → `f9f5cc4`, 2026-04-25) — Stop-hook auto-save diary entries (topic=checkpoint, text starting `"CHECKPOINT:"`) were dominating MCP search results because they're short, word-dense, and outrank substantive content under cosine similarity. New `kind` parameter on `search_memories` and `mempalace_search` MCP tool: `"content"` (default, excludes checkpoints), `"checkpoint"` (only checkpoints, recovery/audit), `"all"` (no filter, pre-2026-04-25 behavior). **Two architecture corrections during the same day:** (a) the where-clause filter (`topic $nin [...]`) tripped a ChromaDB 1.5.x filter-planner bug — `Internal error: Error finding id` on every kind=content vector query — so the exclusion moved to post-filter only (`398f42f`); (b) vector top-N is dominated by checkpoints on this palace (top-10 hits all CHECKPOINT entries on probe queries), so post-filter alone empties the result set without aggressive over-fetch — pull size raised to `max(n*20, 100)` for kind != "all" (`f9f5cc4`). Post-filter checks both `topic` metadata and text-prefix shape; coverage equivalent to the original belt-and-suspenders without the chromadb bug. Result dicts now surface `topic`. 9 tests in `TestCheckpointFilter`. Companion fix in [`jphein/palace-daemon`](https://github.com/jphein/palace-daemon) commit `dd8894c` standardizes all hook clients on `topic="checkpoint"` (was `topic="auto-save"` in `clients/hook.py`). Structural fix still pending: stop indexing checkpoints as searchable drawers (separate session-recovery table). Upstream PR pending.
22.**fix: `palace_graph.build_graph` skips None metadata** (commit `5fd15db`, 2026-04-25) — `palace_graph.py:95` called `meta.get("room", "")` unconditionally; ChromaDB returns `None` for legacy/partial-write drawers, AttributeError took out every consumer of `build_graph` (graph_stats, find_tunnels, traverse, daemon's `/stats`). Caught by `palace-daemon/scripts/verify-routes.sh` smoke-test on 2026-04-25 — `/stats` was 500-ing on a single None drawer. Adds `if meta is None: continue` guard. Closes the same gap as upstream's #999 None-metadata audit (which covered searcher/mcp_server/miner.status) plus our PR #1094 (which coerces at backend boundary for new writes), in a different read path the audit didn't reach. Filed as #1201 on 2026-04-25.
-
23. **feat: checkpoint collection split — phases A–C** (commit `e266365`, 2026-04-25) — Promoted from "future work" to "necessary" by 2026-04-25 Cat 9 A/B (`kind=all` 632 tokens/Q vs `kind=content` 3 tokens/Q on the canonical 151K palace; over-fetch=100 inadequate, structural fix non-optional). **Phase A:** new `_SESSION_RECOVERY_COLLECTION` constant + `get_session_recovery_collection()` in `palace.py` (mirrors `get_collection`'s shape — cosine, num_threads=1). **Phase B:** `tool_diary_write` routes `topic in _CHECKPOINT_TOPICS` to the dedicated `mempalace_session_recovery` collection, everything else stays in `mempalace_drawers`; new `_get_session_recovery_collection()` in `mcp_server.py` with parallel cache. **Phase C:** new `tool_session_recovery_read` MCP handler reads recovery collection only with optional filters `session_id`, `agent`, `since`, `until`, `wing`, `limit`; `session_id` added as optional metadata field on `tool_diary_write` so the new tool can filter by Claude Code session. Registered in `TOOLS` dict, documented in `website/reference/mcp-tools.md`. 12 new tests across `tests/test_session_recovery.py` + `TestCheckpointRouting` + `TestSessionRecoveryRead`. Design + plan at `docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md` and `docs/superpowers/plans/2026-04-25-checkpoint-collection-split-impl.md`. **Phases D (data migration of ~640 existing checkpoints out of main collection) and E (palace-daemon `lifespan` auto-migrate + `mempalace repair --mode reorganize`) deferred** — multi-day work, gated on a separate go-ahead. Once D lands and the canonical-palace re-run shows the predicted `kind=all` ≈ `kind=content` token convergence, the `kind=` post-filter and over-fetch hack become deletable.
+
23. **feat: checkpoint collection split — phases A–C** (commit `e266365`, 2026-04-25) — Promoted from "future work" to "necessary" by 2026-04-25 Cat 9 A/B (`kind=all` 632 tokens/Q vs `kind=content` 3 tokens/Q on the canonical 151K palace; over-fetch=100 inadequate, structural fix non-optional). **Phase A:** new `_SESSION_RECOVERY_COLLECTION` constant + `get_session_recovery_collection()` in `palace.py` (mirrors `get_collection`'s shape — cosine, num_threads=1). **Phase B:** `tool_diary_write` routes `topic in _CHECKPOINT_TOPICS` to the dedicated `mempalace_session_recovery` collection, everything else stays in `mempalace_drawers`; new `_get_session_recovery_collection()` in `mcp_server.py` with parallel cache. **Phase C:** new `tool_session_recovery_read` MCP handler reads recovery collection only with optional filters `session_id`, `agent`, `since`, `until`, `wing`, `limit`; `session_id` added as optional metadata field on `tool_diary_write` so the new tool can filter by Claude Code session. Registered in `TOOLS` dict, documented in `website/reference/mcp-tools.md`. 12 new tests across `tests/test_session_recovery.py` + `TestCheckpointRouting` + `TestSessionRecoveryRead`. Design + plan at `docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md` and `docs/superpowers/plans/2026-04-25-checkpoint-collection-split-impl.md`. **Phases D (data migration of ~640 existing checkpoints out of main collection) and E (palace-daemon `lifespan` auto-migrate + `mempalace repair --mode reorganize`) deferred** — multi-day work, gated on a separate go-ahead. Once D lands and the canonical-palace re-run shows the predicted `kind=all` ≈ `kind=content` token convergence, the `kind=` post-filter and over-fetch hack become deletable. **Update 2026-04-26:** phase D shipped — `migrate_checkpoints_to_recovery()` in `mempalace/migrate.py`, idempotent walk that moves topic in `_CHECKPOINT_TOPICS` drawers from main → recovery while preserving IDs and metadata. 
Wired into `mempalace repair --mode reorganize` (CLI dispatch in `cli.py` chooses between `rebuild` (HNSW from sqlite) and `reorganize` (this new path)). PreCompact hook also incorporated — `hook_precompact` now writes a recovery marker via `_save_diary_direct` mirroring Stop, so a context-compaction event leaves a queryable timestamp in the recovery collection. 6 new migration tests in `test_migrate.py::TestMigrateCheckpointsToRecovery`. Phase E (palace-daemon `lifespan` auto-migrate) still pending — cross-repo, separate go-ahead.
27. **perf: batch ChromaDB inserts in miner (cherry-pick of upstream #1085)** (commit `6be6fff`, 2026-04-26) — Cherry-picked @midweste's [#1085](https://github.com/MemPalace/mempalace/pull/1085) "batch ChromaDB inserts in miner — 10-30x faster mining". Upstream PR #1085 is still **OPEN** as of 2026-04-26 (created 2026-04-21, base=develop, not yet merged) — verified via `gh pr view 1085 --repo MemPalace/mempalace`. We cherry-picked the commit ahead of merge so the fork can use it now; this row clears when #1085 merges into develop and we next sync. We don't file a competing fork-side PR — the proposal is @midweste's. New `_build_drawer()` helper builds id+document+metadata in one shot; new `add_drawers()` batch-insert function takes the full chunk list and sub-batches at `DRAWER_UPSERT_BATCH_SIZE` (one chromadb upsert + one ONNX embedding forward-pass per sub-batch instead of per-chunk). `process_file` now calls `add_drawers` directly. Hoists `datetime.now()` and `os.path.getmtime()` to file-level (2 syscalls per file instead of 2N). **Conflict resolution:** fork already had a fork-only `_build_drawer_metadata` + an outer batch loop in `process_file`; upstream's clean structure supersedes both. Kept fork's `DRAWER_UPSERT_BATCH_SIZE=1000` (more conservative than upstream's 5000 for embedding-pass memory headroom); aliased upstream's `CHROMA_BATCH_LIMIT` to point at it so any code/test referencing either name sees the same value. 74/74 miner+convo_miner tests pass; full suite 1366/1366. Becomes a no-op when #1085 merges into upstream develop and we next sync develop→main.
README.md: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Fork of [MemPalace](https://github.com/milla-jovovich/mempalace), tracking `upst
What this fork adds that you won't get from upstream yet: a **deterministic silent-save hook architecture** (zero data loss, `systemMessage` notification, daemon-strict mode that skips local writes when `PALACE_DAEMON_URL` is set), **ChromaDB 1.5.x hardening** (`quarantine_stale_hnsw` drift recovery, segfault-trigger guards, 8-site `None`-metadata safety), **search that never silently misses** (`search_memories` returns warnings + sqlite BM25 top-up + `available_in_scope` so callers can see what they aren't getting), and **`kind=`-filtered search** that excludes Stop-hook auto-save checkpoints by default — discovered via the 2026-04-25 RLM smoke test, which surfaced that checkpoint diary entries (high word-density session summaries) were dominating retrieval and producing confident-but-misleading answers. Full list below.
-
1366 tests pass on `main` · [Discussion #1017](https://github.com/MemPalace/mempalace/discussions/1017) introduces the fork upstream · [Issues on this repo](https://github.com/jphein/mempalace/issues) for fork-specific feedback.
+
1372 tests pass on `main` · [Discussion #1017](https://github.com/MemPalace/mempalace/discussions/1017) introduces the fork upstream · [Issues on this repo](https://github.com/jphein/mempalace/issues) for fork-specific feedback.