Skip to content

fix(hooks): route mining through daemon when PALACE_DAEMON_URL is set#2

Merged
jphein merged 2 commits intomainfrom
fix/restore-transcript-ingest-via-daemon
May 6, 2026
Merged

fix(hooks): route mining through daemon when PALACE_DAEMON_URL is set#2
jphein merged 2 commits intomainfrom
fix/restore-transcript-ingest-via-daemon

Conversation

@jphein
Copy link
Copy Markdown
Owner

@jphein jphein commented May 5, 2026

Summary

  • Replaces three hook skip-and-bail paths (_maybe_auto_ingest, _mine_sync, _ingest_transcript) with POSTs to the daemon's existing /mine endpoint via a new _post_daemon_mine() helper
  • Restores transcript auto-ingest, which has been silently disabled since 2026-04-24 in any deploy with PALACE_DAEMON_URL set
  • Aligns transcript wing routing with the diary checkpoint behavior (_wing_from_transcript_path()), replacing hardcoded "sessions" with the per-project derivation already used at hook_stop:740

Why

Daemon-strict mode (introduced 2026-04-24 in commits 8c90c0f + 0e97b19 to fix the HNSW drift incident) stops local writes when PALACE_DAEMON_URL is set, on the assumption a daemon-side writer takes over. The diary half got that writer via /silent-save, but the transcript-ingest half did not. So for ~11 days every Claude Code Stop hook left a checkpoint summary in the recovery collection and zero verbatim transcript drawers in mempalace_drawers. mempalace_search lost visibility into recent sessions even though MCP, daemon, and HNSW were all healthy.

Hook log evidence:

[15:36:04] Skipping transcript ingest: PALACE_DAEMON_URL set, daemon owns writes
[15:36:04] Skipping auto-ingest: PALACE_DAEMON_URL set, daemon owns writes

This PR makes the hook actually call the daemon instead of bailing.

The daemon already has /mine; the only daemon-side change needed is path translation (so a remote daemon can find client-side paths at its own mount points). That's covered in the companion PR jphein/palace-daemon#1 (feat/transcript-path-translation).

This PR degrades gracefully if Branch B is not yet deployed: hook POSTs, daemon 400s on path-not-found, hook logs and continues — same end-state as today, just with telemetry on the failure mode.

Behavior change

_ingest_transcript now uses _wing_from_transcript_path(transcript_path) for wing routing (e.g. wing_memorypalace, wing_realmwatch) instead of hardcoded "sessions". This matches what the diary checkpoint write at hook_stop:740 has always done. Transcripts that don't match the standard Claude Code project layout still get a fallback wing (wing_sessions from _wing_from_transcript_path).

Test plan

  • 5 existing tests now use clear=True on patch.dict so PALACE_DAEMON_URL in the developer env stops leaking into local-spawn assertions (pre-existing test isolation bug; unrelated to this fix in spirit but caught by it)
  • 6 new tests cover _post_daemon_mine (URL/body/api-key/error paths) and the daemon-routed branches in _maybe_auto_ingest, _mine_sync, _ingest_transcript
  • Full suite green: 1552 passed, 1 skipped on Python 3.12 / Ubuntu
source venv/bin/activate
python -m pytest tests/ -q

🤖 Generated with Claude Code

Daemon-strict mode (introduced 2026-04-24 in commits 8c90c0f + 0e97b19
to fix the HNSW drift incident) skipped all three local mining paths
when PALACE_DAEMON_URL was set, on the assumption a daemon-side writer
would do the work instead. The diary checkpoint half got that writer
via /silent-save, but the transcript-ingest half did not — so for ~11
days every Claude Code session left a checkpoint summary in the
recovery collection and zero verbatim transcript drawers in
mempalace_drawers. mempalace_search lost visibility into recent
sessions even though MCP, daemon, and HNSW were all healthy.

Replace the three skip-and-bail branches with POSTs to the daemon's
existing /mine endpoint via a new _post_daemon_mine() helper:

* _maybe_auto_ingest (background project mine on Stop)
* _mine_sync (synchronous project mine on PreCompact)
* _ingest_transcript (transcript convo mine on Stop / PreCompact)

The daemon already has /mine; this PR just wires the hook to call it.
Path translation (so the daemon can find client-side paths in its own
filesystem) is handled daemon-side via PALACE_DAEMON_PATH_MAP — see
the companion palace-daemon PR.

Behavior change: transcript ingest now routes to the project wing
derived via _wing_from_transcript_path(), matching the diary
checkpoint behavior at hook_stop:740. Previously hardcoded "sessions";
now produces e.g. "wing_memorypalace" / "wing_realmwatch" per
transcript. The "sessions" wing remains as the fallback for paths
that don't match the Claude Code project layout (now spelled
"wing_sessions" by the existing helper).

Tests:
* 5 existing tests now use clear=True on patch.dict so PALACE_DAEMON_URL
  in the developer env stops leaking into local-spawn assertions.
* 6 new tests cover _post_daemon_mine (URL/body/api-key/error-paths)
  and the daemon-routed branches in all three mining functions.
* Full suite green: 1552 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copilot AI review requested due to automatic review settings May 5, 2026 23:43
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes daemon-strict hook behavior by routing mining operations through the palace-daemon when PALACE_DAEMON_URL is set, instead of skipping local mining. It restores transcript ingest in daemon deployments and aligns transcript wing routing with the existing per-project wing derivation used elsewhere in the hooks.

Changes:

  • Added _post_daemon_mine() helper to POST mining jobs to the daemon /mine endpoint (with optional API key header).
  • Updated _maybe_auto_ingest, _mine_sync, and _ingest_transcript to route mining via daemon in strict mode rather than bailing.
  • Updated/expanded hook tests, including env isolation fixes (patch.dict(..., clear=True)) and new coverage for daemon-routed branches.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
mempalace/hooks_cli.py Adds daemon /mine POST helper and routes project + transcript mining through daemon in strict mode; updates transcript wing derivation.
tests/test_hooks_cli.py Fixes env leakage in existing tests and adds new tests validating _post_daemon_mine and daemon-routed hook behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/hooks_cli.py
Comment on lines +327 to +330
if _daemon_strict():
for mine_dir, mode in targets:
_post_daemon_mine(mine_dir, wing="general", mode=mode)
return
Comment thread mempalace/hooks_cli.py
Comment on lines +351 to +354
if _daemon_strict():
for mine_dir, mode in targets:
_post_daemon_mine(mine_dir, wing="general", mode=mode)
return
Comment thread mempalace/hooks_cli.py Outdated
Comment on lines +589 to +597
path = Path(transcript_path).expanduser()
if not path.is_file() or path.stat().st_size < 100:
return

project_wing = _wing_from_transcript_path(transcript_path)

if _daemon_strict():
_post_daemon_mine(str(path.parent), wing=project_wing, mode="convos")
return
Three findings from Copilot's review pass plus one CI lint fix:

1. **Wing derivation parity** (Copilot L330, L354). _maybe_auto_ingest
   and _mine_sync hardcoded wing="general" in the daemon-routed
   branch, but the local-spawn fallback omits --wing so convo_miner
   derives from Path(mine_dir).name. The two paths produced
   different wings for identical input. Add _wing_from_mine_dir()
   that mirrors normalize_wing_name(Path(mine_dir).name) and use it
   in both daemon-routed branches.

2. **Reuse transcript validator** (Copilot L597). _ingest_transcript
   bypassed _validate_transcript_path, accepting any harness-supplied
   .jsonl path including ".." traversal sequences. _count_human_messages
   already validates; mirror that here so the same traversal/extension
   guards apply on the ingest path.

3. **Timeout adequacy**. Daemon /mine awaits proc.communicate() (blocks
   until subprocess finishes). 10s was too short for typical mines;
   bump to 30s. The hook-side timeout is cosmetic — even on timeout
   the daemon-side mine completes since the subprocess is independent
   of the HTTP connection — but a misleading log entry still reads
   like a failure.

4. **Ruff format**. CI's `ruff format --check .` flagged blank-line
   gaps inside the new test class _FakeResp. Auto-formatted.

New tests:
- test_ingest_transcript_rejects_traversal — closes the security finding
- test_ingest_transcript_rejects_wrong_extension — same guard family
- test_wing_from_mine_dir_normalizes — covers the new helper
- test_maybe_auto_ingest_routes_through_daemon updated to assert wing
  derivation (project dirname "my-project" → "my_project")
- test_mine_sync_routes_through_daemon updated likewise

82 passed, 1 skipped. Ruff format + lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@jphein jphein merged commit 09d2ca6 into main May 6, 2026
6 checks passed
@jphein jphein deleted the fix/restore-transcript-ingest-via-daemon branch May 6, 2026 01:16
jphein added a commit that referenced this pull request May 6, 2026
Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
* feat(cli): mempalace mined + purge --source-file

Adds two ergonomic gaps in the mining-management surface JP asked for:

1. **mempalace purge --source-file <path>** — extends the existing
   purge command with a third filter alongside --wing and --room.
   Single-filter or combined ($and) usage works the same way wing/room
   do; the filter accepts a metadata.source_file exact-match. End-to-
   end test exercises a real palace and confirms only the matching
   drawers are deleted, leaving siblings intact.

2. **mempalace mined** — companion to `mempalace status`, but groups
   by wing × source_file rather than wing × room. Answers "which
   files have I mined into this wing?" so an operator can pick
   targets for the new --source-file purge. Skips drawers without a
   source_file metadata key (diary entries, kg drawers, etc.).
   Honors --wing and --limit (default 50 sources per wing, --limit 0
   to show all). Uses the same paginated col.get() pattern as
   miner.status() to handle 100K+ drawer palaces without tripping
   sqlite's max-variable limit.

Together these close the "removing manually mined data" half of JP's
ask. The "automining when files change" half is queued for a separate
file-watcher daemon PR.

Pre-existing `mempalace mine <dir>` already covers "adding" (and
re-mines on mtime change via bulk_check_mined dedup), so no command
needed for refresh.

Tests: 8 new test cases (3 for purge --source-file with mock + real
palace; 3 for cmd_mined including end-to-end groupBy + wing filter +
no-palace error path; 2 updated to add the new arg to the test
namespace builder). Full suite: 1564 passed, 1 skipped, lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(cli): address Copilot review on #4

Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants