fix(hooks): drop checkpoint write path (verbatim-only, part 1)#3
Closed
jphein wants to merge 3 commits intofix/restore-transcript-ingest-via-daemonfrom
Closed
fix(hooks): drop checkpoint write path (verbatim-only, part 1)#3jphein wants to merge 3 commits intofix/restore-transcript-ingest-via-daemonfrom
jphein wants to merge 3 commits intofix/restore-transcript-ingest-via-daemonfrom
Conversation
The Stop hook used to do two things on each fire:
(a) write a 1-KB checkpoint summary diary entry into the
mempalace_session_recovery collection
(b) auto-mine the verbatim transcript into mempalace_drawers
The summary in (a) is redundant once (b) is searchable, AND the
recovery collection had no semantic-search MCP surface — so
checkpoints were structurally invisible to mempalace_search. Net
effect from a user's seat: agents (and JP) couldn't find recent
session content via search even though everything was on disk.
Drop the summary half. Verbatim transcript chunks already contain
every word a checkpoint summary would have surfaced. Leaving only
the transcript-ingest path matches the verbatim-only architecture
described in docs/superpowers/specs/2026-05-05-verbatim-only-design.md.
Changes:
* hook_stop silent path: remove _save_diary_direct call. Save marker
advances unconditionally (fire-and-forget mining doesn't have a
"confirmed save" signal to gate on; failure detection moves to
daemon-side observability — hook.log + systemd journal).
* hook_precompact: remove the recovery-marker write. Mine + compaction
proceed as before.
* systemMessage shape changes from "✦ N memories woven into the
palace — themes" to "✦ Transcript ingest triggered (wing=...)".
Lighter, accurate, no false count.
* Delete the now-unused _save_diary_direct function (~120 LOC) and
its dependencies _extract_themes + _THEME_STOPWORDS (~30 LOC).
* Update 4 hook tests + 1 OSError test to mock _ingest_transcript
rather than _save_diary_direct, and to expect the new
systemMessage shape.
Scope deliberately limited: this PR removes hook-side WRITES.
mempalace_session_recovery collection still exists; the
mempalace_session_recovery_read MCP tool still works against the
existing 763 archived entries; mempalace.migrate is untouched.
A follow-up PR will dispose of the collection itself once this
change has run in production for a verification window.
Net diff: −230 / +53 LOC. 1558 tests pass, 1 skipped, lint clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Captures the architectural shift driving this PR family — drop checkpoint summaries, expose only the verbatim transcripts that mempalace_search can already reach. Documents: * Why: the recovery collection had no semantic-search MCP surface, so checkpoints in it were structurally invisible to search * What: hook write-path deletion (this PR) + collection disposal (follow-up PR) * How: code-deletion checklist for both repos, migration plan with tarball safety net, test plan * Open questions queued for review at merge time Spec gates the destructive parts on the answers to those open questions; the write-path removal in this PR is scoped narrow enough to land without them. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
There was a problem hiding this comment.
Pull request overview
Removes the hook-side checkpoint-summary diary write path in favor of verbatim transcript ingestion as the sole “silent save” mechanism, updates hook tests accordingly, and adds a design spec documenting the verbatim-only direction.
Changes:
- Deleted
_save_diary_direct/_extract_themeshelpers and switchedhook_stopsilent mode to trigger_ingest_transcriptonly (with a newsystemMessageformat). - Removed the PreCompact recovery-marker write; PreCompact now just ingests transcript (then sync-mines project data).
- Updated hook tests to mock
_ingest_transcriptand assert the newsystemMessageshape; added a verbatim-only architecture spec.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
mempalace/hooks_cli.py |
Drops checkpoint diary writes in Stop (silent) and PreCompact; emits new “Transcript ingest triggered (wing=…)” system message. |
tests/test_hooks_cli.py |
Updates tests to mock _ingest_transcript instead of _save_diary_direct and to assert new messaging/wing surfacing. |
docs/superpowers/specs/2026-05-05-verbatim-only-design.md |
Adds design document describing the verbatim-only architecture and planned follow-on deletions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+665
to
+672
| """Precompact hook: mine the transcript synchronously, then allow compaction. | ||
|
|
||
| The recovery-marker write was removed (it used to land a | ||
| "where we were" diary entry in the now-deleted | ||
| mempalace_session_recovery collection). The verbatim transcript | ||
| chunks captured by _ingest_transcript already cover the recovery | ||
| use case — searching for any phrase from the last few messages | ||
| locates the session. |
This was referenced May 6, 2026
Copilot review on #3 caught that the docstring claimed the recovery collection was "now-deleted," but this PR only removes the write call site — collection retirement is the follow-up PR #5. Reword to match actual scope: the recovery-marker write is removed here; the collection still exists and gets disposed in #5. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Owner
Author
|
Copilot review addressed in commit
Stacked on #2 — no CI here because base isn't main, but full suite passes locally: 1558 tests pass, 1 skipped, ruff clean. |
jphein
added a commit
that referenced
this pull request
May 6, 2026
Copilot review on #3 caught that the docstring claimed the recovery collection was "now-deleted," but this PR only removes the write call site — collection retirement is the follow-up PR #5. Reword to match actual scope: the recovery-marker write is removed here; the collection still exists and gets disposed in #5. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein
added a commit
that referenced
this pull request
May 6, 2026
* fix(hooks): drop checkpoint write path from Stop + PreCompact
The Stop hook used to do two things on each fire:
(a) write a 1-KB checkpoint summary diary entry into the
mempalace_session_recovery collection
(b) auto-mine the verbatim transcript into mempalace_drawers
The summary in (a) is redundant once (b) is searchable, AND the
recovery collection had no semantic-search MCP surface — so
checkpoints were structurally invisible to mempalace_search. Net
effect from a user's seat: agents (and JP) couldn't find recent
session content via search even though everything was on disk.
Drop the summary half. Verbatim transcript chunks already contain
every word a checkpoint summary would have surfaced. Leaving only
the transcript-ingest path matches the verbatim-only architecture
described in docs/superpowers/specs/2026-05-05-verbatim-only-design.md.
Changes:
* hook_stop silent path: remove _save_diary_direct call. Save marker
advances unconditionally (fire-and-forget mining doesn't have a
"confirmed save" signal to gate on; failure detection moves to
daemon-side observability — hook.log + systemd journal).
* hook_precompact: remove the recovery-marker write. Mine + compaction
proceed as before.
* systemMessage shape changes from "✦ N memories woven into the
palace — themes" to "✦ Transcript ingest triggered (wing=...)".
Lighter, accurate, no false count.
* Delete the now-unused _save_diary_direct function (~120 LOC) and
its dependencies _extract_themes + _THEME_STOPWORDS (~30 LOC).
* Update 4 hook tests + 1 OSError test to mock _ingest_transcript
rather than _save_diary_direct, and to expect the new
systemMessage shape.
Scope deliberately limited: this PR removes hook-side WRITES.
mempalace_session_recovery collection still exists; the
mempalace_session_recovery_read MCP tool still works against the
existing 763 archived entries; mempalace.migrate is untouched.
A follow-up PR will dispose of the collection itself once this
change has run in production for a verification window.
Net diff: −230 / +53 LOC. 1558 tests pass, 1 skipped, lint clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
* docs(spec): verbatim-only architecture design
Captures the architectural shift driving this PR family — drop
checkpoint summaries, expose only the verbatim transcripts that
mempalace_search can already reach. Documents:
* Why: the recovery collection had no semantic-search MCP surface,
so checkpoints in it were structurally invisible to search
* What: hook write-path deletion (this PR) + collection disposal
(follow-up PR)
* How: code-deletion checklist for both repos, migration plan with
tarball safety net, test plan
* Open questions queued for review at merge time
Spec gates the destructive parts on the answers to those open
questions; the write-path removal in this PR is scoped narrow
enough to land without them.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
* docs(hooks): clarify hook_precompact docstring scope
Copilot review on #3 caught that the docstring claimed
the recovery collection was "now-deleted," but this PR only removes
the write call site — collection retirement is the follow-up PR #5.
Reword to match actual scope: the recovery-marker write is removed
here; the collection still exists and gets disposed in #5.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
jphein
added a commit
that referenced
this pull request
May 6, 2026
Follow-up to fix/drop-checkpoint-write-path. With nothing writing to the recovery collection anymore (the hooks moved to verbatim-only on the parent branch), the read paths and the migration that fed the collection are dead code. Delete them. Removed in mempalace/: - palace.py: _SESSION_RECOVERY_COLLECTION constant, get_session_recovery_collection(), _CHECKPOINT_TOPICS tuple - mcp_server.py: _get_session_recovery_collection(), _recovery_collection_cache global, the topic-routing branch in tool_diary_write (now always lands in mempalace_drawers), tool_session_recovery_read() and its TOOLS dict registration - migrate.py: migrate_checkpoints_to_recovery() and the dependent _CHECKPOINT_TOPICS import - cli.py: cmd_repair --mode reorganize handler + the choice flag Removed in tests/: - tests/test_session_recovery.py — entire file (recovery-collection module test suite) - tests/test_migrate.py: TestMigrateCheckpointsToRecovery class - tests/test_mcp_server.py: TestCheckpointRouting and TestSessionRecoveryRead classes Removed in docs/: - website/reference/mcp-tools.md: mempalace_session_recovery_read section (caught by the readme-claims meta-test before deploy) Production data on disks still has the mempalace_session_recovery collection with its 763 archived entries — this PR's code change makes the collection unreachable through any MCP/CLI path, but does NOT delete the on-disk data. A separate one-shot script (scripts/phase2_purge_recovery.py per the spec) handles the collection-level delete after deploy. JP signed off on hard-delete in the spec ack. Net diff: 12 files, −1300 lines / +30 lines. 1540 tests pass, 1 skipped. Ruff lint + format clean. Stacked on #3 (which stops new writes); will auto-rebase against main when that merges. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein
added a commit
that referenced
this pull request
May 6, 2026
…ol (#8) * refactor(mcp): retire mempalace_session_recovery collection + read tool Follow-up to fix/drop-checkpoint-write-path. With nothing writing to the recovery collection anymore (the hooks moved to verbatim-only on the parent branch), the read paths and the migration that fed the collection are dead code. Delete them. Removed in mempalace/: - palace.py: _SESSION_RECOVERY_COLLECTION constant, get_session_recovery_collection(), _CHECKPOINT_TOPICS tuple - mcp_server.py: _get_session_recovery_collection(), _recovery_collection_cache global, the topic-routing branch in tool_diary_write (now always lands in mempalace_drawers), tool_session_recovery_read() and its TOOLS dict registration - migrate.py: migrate_checkpoints_to_recovery() and the dependent _CHECKPOINT_TOPICS import - cli.py: cmd_repair --mode reorganize handler + the choice flag Removed in tests/: - tests/test_session_recovery.py — entire file (recovery-collection module test suite) - tests/test_migrate.py: TestMigrateCheckpointsToRecovery class - tests/test_mcp_server.py: TestCheckpointRouting and TestSessionRecoveryRead classes Removed in docs/: - website/reference/mcp-tools.md: mempalace_session_recovery_read section (caught by the readme-claims meta-test before deploy) Production data on disks still has the mempalace_session_recovery collection with its 763 archived entries — this PR's code change makes the collection unreachable through any MCP/CLI path, but does NOT delete the on-disk data. A separate one-shot script (scripts/phase2_purge_recovery.py per the spec) handles the collection-level delete after deploy. JP signed off on hard-delete in the spec ack. Net diff: 12 files, −1300 lines / +30 lines. 1540 tests pass, 1 skipped. Ruff lint + format clean. Stacked on #3 (which stops new writes); will auto-rebase against main when that merges. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * docs(README): align with verbatim-only retirement Copilot review on #5 caught that the README still documented session_recovery_read as current, which would leave users calling a tool that returns "Unknown tool" after this PR ships. Updates the four user-facing sections that described the recovery collection as a current shipping feature: * "What just shipped" — adds 2026-05-05 update note explaining the split was retired, with the generalizable lesson preserved (a side collection without a semantic-search MCP read tool is invisible). * "Verbatim vs derivative" principle — reframes the recovery collection as the failed first attempt rather than a current example; pattern stays valid but each future sibling needs its own read surface. * "Structural retrieval fixes" bullet — describes verbatim-only end-state, links to the new spec, retains historical context for Apr 25 → May 5. * "Deterministic hook saves" bullet — updates to the new ingest-only save path, removes "recovery-collection marker" mention. * "What it looks like in production" code block — updates the systemMessage example to the new "Transcript ingest triggered (wing=...)" shape. * "P8" architectural principle — keeps it but reframes as on-hold with a precondition (read-surface parity) drawn from the cycle. * Fork-ahead tracking table — replaces the old multi-collection-split row with a verbatim-only retirement row pointing at the four PRs; also retires references to session_recovery_read in the drawer_id surfacing row; adds new row for the mining management CLI (#4). 42 README meta-tests pass (test_readme_claims.py). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> --------- Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_ingest_transcriptpath) cover everything a summary would surface._save_diary_direct(~120 LOC),_extract_themes, and_THEME_STOPWORDShelpers — no remaining callers in source or tests._ingest_transcriptrather than_save_diary_direct, and to expect the newsystemMessageshape (✦ Transcript ingest triggered (wing=...)).docs/superpowers/specs/2026-05-05-verbatim-only-design.md.Why
The Stop hook used to do two things on each fire:
mempalace_session_recoverycollectionmempalace_drawers(a) is redundant once (b) is searchable. Worse, the recovery collection had no semantic-search MCP surface — only filter-based read via
mempalace_session_recovery_read(session_id, agent, since, until, wing). So checkpoints in it were structurally invisible tomempalace_search. Net effect from a user's seat: agents (and JP) couldn't find recent session content via search even though everything was on disk.This PR drops (a). Verbatim transcripts in
mempalace_drawerscarry every word a checkpoint summary would have surfaced — searching is the recovery query.Stacked on #2
Base branch is
fix/restore-transcript-ingest-via-daemon(the Phase 1 PR that restores transcript ingest). When that merges, this PR's base auto-updates tomain. Reviewers see only the new commits in this PR.Behavior change
hook_stopsilent path:result = _save_diary_direct(...); advance save marker only ifcount > 0; systemMessage was✦ N memories woven into the palace — themes._ingest_transcript(...)only; advance save marker unconditionally (fire-and-forget mining doesn't have a "confirmed save" signal); systemMessage is✦ Transcript ingest triggered (wing=...). Failure detection moves to daemon-side observability (hook.log + systemd journal).hook_precompact:Scope deliberately limited
This PR removes hook-side writes only. Read paths and existing data are untouched:
mempalace_session_recoverycollection still exists with its 763 archived entriesmempalace_session_recovery_readMCP tool still works for those entriesmempalace.migratemodule still existslifespanauto-migrate still runsA follow-up PR (Phase 2b, see spec) disposes of the collection and its read paths once this change has run in production for a verification window. That PR is the destructive one and gates on JP's explicit ack of the open questions in the spec.
Test plan
ruff checkcleanruff format --checkclean_save_diary_direct,_extract_themes, or_THEME_STOPWORDSin source or tests (confirmed via grep)tests/test_corpus_origin_integration.py::test_no_internal_coordination_jargon_in_source_or_testspasses; original draft used "Phase 2" comments which it correctly flagged, swapped to feature-language)Risk
_ingest_transcriptsilently fails for many fires in a row, transcripts go un-mined without retry. Mitigation: the daemon's/mineendpoint logs every accept/reject, and hook.log records every_post_daemon_minecall (success or fail). Detection moves from "hook self-reports" to "daemon-side log inspection" — appropriate for the verbatim-only model where the daemon owns writes._save_diary_directand_extract_themesare removed. Anyone who imported these frommempalace.hooks_cliexternally would break. Searched repo — no callers remain. Not an upstream API.🤖 Generated with Claude Code