Skip to content

feat(cli): mempalace mined + purge --source-file#4

Closed
jphein wants to merge 2 commits intofix/restore-transcript-ingest-via-daemonfrom
feat/mine-by-source-file
Closed

feat(cli): mempalace mined + purge --source-file#4
jphein wants to merge 2 commits intofix/restore-transcript-ingest-via-daemonfrom
feat/mine-by-source-file

Conversation

@jphein
Copy link
Copy Markdown
Owner

@jphein jphein commented May 6, 2026

Summary

Closes the removing half of JP's ask: "add a way of adding and removing the manually mined data." Adding is already covered by the existing `mempalace mine

`; this PR adds the symmetric removal + listing surface.

  • `mempalace purge --source-file ` — extends the existing purge command with a third filter alongside `--wing` and `--room`. Composes with the others (single filter or `$and`).
  • `mempalace mined` — companion to `mempalace status` that groups by wing × source_file rather than wing × room. Answers "which files have I mined into this wing?" so an operator can pick targets for `--source-file` purge.

Stacked on #2

Base branch is `fix/restore-transcript-ingest-via-daemon` (Phase 1 PR). When that merges, this PR auto-rebases against `main`. Reviewers see only the new commits in this PR.

Why

The earlier work landed checkpoint deletion (PR #3) and transcript-ingest restore (PR #2), both write-side. JP also asked for a CLI "way of adding and removing the manually mined data." The existing `mempalace mine

` covers add. The existing `mempalace purge` covers remove-by-wing/-room but not by source_file — and there was no listing view that grouped by file. This PR plugs both gaps.

Behavior

```
$ mempalace mined --wing wing_realmwatch --limit 5

MemPalace Mined — sources by wing

WING: wing_realmwatch (147 sources, 12,195 drawers)
312 /home/jp/Projects/realmwatch/realm-map.js
198 /home/jp/Projects/realmwatch/map_server.py
176 /home/jp/Projects/realmwatch/voice/oracle.py
143 /home/jp/Projects/realmwatch/CLAUDE.md
121 /home/jp/Projects/realmwatch/desktop-themes/deploy.sh
... 142 more (use --limit 0 to show all)

$ mempalace purge --source-file /home/jp/Projects/realmwatch/old-script.py
Found 47 drawers matching source-file=/home/jp/Projects/realmwatch/old-script.py
... [confirmation prompt unless --yes]
Deleting matching drawers...
Purged 47 drawers. Remaining: 150,847
```

Test plan

  • 1564 passed, 1 skipped — full suite green on Linux 3.12 locally
  • `ruff check` clean
  • `ruff format --check` clean
  • 8 new test cases:
    • `test_cmd_purge_source_file_only` — mock-backed `where` clause
    • `test_cmd_purge_source_file_with_wing_uses_and_filter` — `$and` composition
    • `test_cmd_purge_source_file_end_to_end` — real palace; siblings preserved
    • `test_cmd_mined_no_palace` — error path
    • `test_cmd_mined_groups_by_wing_and_source` — end-to-end with real palace, asserts skipped diary-style drawers (no source_file)
    • `test_cmd_mined_filter_by_wing` — `--wing` excludes other wings
    • 2 existing tests updated to add the new arg to the test-namespace builder
  • `test_cmd_purge_requires_filter` updated to expect the new error message

Risk

  • `--source-file` accepts arbitrary strings; chromadb's where-clause does exact-match `$eq`, so there's no glob/prefix risk. The clause is case-sensitive — operator must match exactly the path that was originally mined (which mempalace stores absolute, resolved). `mempalace mined` is the recommended way to discover the right value.
  • `cmd_mined` paginates 5000 drawers at a time (mirrors `miner.status`). On the canonical 151K palace, that's ~30 batches; <2 seconds in informal testing.
  • Diary-style drawers (no source_file metadata) are silently skipped from `mempalace mined` output. They're still visible in `mempalace status`. If an operator expects them to appear here too, that's the place to look — documented in the docstring.

🤖 Generated with Claude Code

Adds two ergonomic gaps in the mining-management surface JP asked for:

1. **mempalace purge --source-file <path>** — extends the existing
   purge command with a third filter alongside --wing and --room.
   Single-filter or combined ($and) usage works the same way wing/room
   do; the filter accepts a metadata.source_file exact-match. End-to-
   end test exercises a real palace and confirms only the matching
   drawers are deleted, leaving siblings intact.

2. **mempalace mined** — companion to `mempalace status`, but groups
   by wing × source_file rather than wing × room. Answers "which
   files have I mined into this wing?" so an operator can pick
   targets for the new --source-file purge. Skips drawers without a
   source_file metadata key (diary entries, kg drawers, etc.).
   Honors --wing and --limit (default 50 sources per wing, --limit 0
   to show all). Uses the same paginated col.get() pattern as
   miner.status() to handle 100K+ drawer palaces without tripping
   sqlite's max-variable limit.

Together these close the "removing manually mined data" half of JP's
ask. The "automining when files change" half is queued for a separate
file-watcher daemon PR.

Pre-existing `mempalace mine <dir>` already covers "adding" (and
re-mines on mtime change via bulk_check_mined dedup), so no command
needed for refresh.

Tests: 8 new test cases (3 for purge --source-file with mock + real
palace; 3 for cmd_mined including end-to-end groupBy + wing filter +
no-palace error path; 2 updated to add the new arg to the test
namespace builder). Full suite: 1564 passed, 1 skipped, lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Copilot AI review requested due to automatic review settings May 6, 2026 00:37
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds CLI support for managing mined source-file data in MemPalace: purge --source-file extends the existing delete workflow, and mempalace mined gives operators a source-file inventory by wing to target those purges.

Changes:

  • Extend cmd_purge and CLI parsing to accept --source-file alongside --wing and --room.
  • Add cmd_mined to aggregate drawers by wing and source_file, with optional wing filtering and per-wing limits.
  • Add tests covering source-file purge behavior and basic end-to-end coverage for the new mined command.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
mempalace/cli.py Adds the new source-file purge filter, implements cmd_mined, and registers the command in argparse/dispatch.
tests/test_cli.py Adds unit and end-to-end tests for source-file purge and mined-command output.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/cli.py
Comment on lines +797 to +800
wing = m.get("wing", "?")
if args.wing and wing != args.wing:
continue
wing_sources[wing][src] += 1
Comment thread mempalace/cli.py
Comment on lines +814 to +818
shown = sources if args.limit == 0 else sources[: args.limit]
for src, count in shown:
print(f" {count:5} {src}")
if args.limit and len(sources) > args.limit:
print(f" ... {len(sources) - args.limit} more (use --limit 0 to show all)")
Comment thread mempalace/cli.py
Comment on lines +1492 to +1495
p_mined = sub.add_parser(
"mined",
help="List mined source files grouped by wing (companion to status, which groups by room)",
)
Comment thread mempalace/cli.py
Comment on lines 1540 to +1542
"purge": cmd_purge,
"status": cmd_status,
"mined": cmd_mined,
Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
Copilot review on #5 caught that the README still
documented session_recovery_read as current, which would leave users
calling a tool that returns "Unknown tool" after this PR ships.

Updates the four user-facing sections that described the recovery
collection as a current shipping feature:

* "What just shipped" — adds 2026-05-05 update note explaining the
  split was retired, with the generalizable lesson preserved (a side
  collection without a semantic-search MCP read tool is invisible).
* "Verbatim vs derivative" principle — reframes the recovery
  collection as the failed first attempt rather than a current
  example; pattern stays valid but each future sibling needs its
  own read surface.
* "Structural retrieval fixes" bullet — describes verbatim-only
  end-state, links to the new spec, retains historical context for
  Apr 25 → May 5.
* "Deterministic hook saves" bullet — updates to the new ingest-only
  save path, removes "recovery-collection marker" mention.
* "What it looks like in production" code block — updates the
  systemMessage example to the new "Transcript ingest triggered
  (wing=...)" shape.
* "P8" architectural principle — keeps it but reframes as on-hold
  with a precondition (read-surface parity) drawn from the cycle.
* Fork-ahead tracking table — replaces the old multi-collection-split
  row with a verbatim-only retirement row pointing at the four PRs;
  also retires references to session_recovery_read in the drawer_id
  surfacing row; adds new row for the mining management CLI (#4).

42 README meta-tests pass (test_readme_claims.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@jphein
Copy link
Copy Markdown
Owner Author

jphein commented May 6, 2026

Copilot review addressed in commit 2f26fdb — all four findings fixed:

  1. cli.py:800 — --wing filter scanned full collection. Now pushes the filter into the chromadb where clause: a pre-fetch sizes the loop, and each batch get() carries where=. On the canonical 151K palace, mempalace mined --wing wing_realmwatch (12K drawers) now scans ~12K rather than 151K.
  2. cli.py:818 — argparse accepts negative --limit. Added a typed _nonneg_int validator that raises ArgumentTypeError on negatives. Argparse exits cleanly at parse time instead of producing nonsense "... -2 more" output.
  3. cli.py:1495 — module __doc__ epilog stale. Added mempalace mined and mempalace purge --source-file to the docstring command list so mempalace --help reflects the new commands.
  4. cli.py:1542 — dispatch test missing. Added test_main_mined_dispatches (parallel to sibling commands) and test_main_mined_rejects_negative_limit (covers fix fix(hooks): route mining through daemon when PALACE_DAEMON_URL is set #2 end-to-end).

Stacked on #2 — full suite passes locally: 1566 passed, 1 skipped, ruff clean.

@jphein jphein deleted the branch fix/restore-transcript-ingest-via-daemon May 6, 2026 01:16
@jphein jphein closed this May 6, 2026
jphein added a commit that referenced this pull request May 6, 2026
Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
* feat(cli): mempalace mined + purge --source-file

Adds two ergonomic gaps in the mining-management surface JP asked for:

1. **mempalace purge --source-file <path>** — extends the existing
   purge command with a third filter alongside --wing and --room.
   Single-filter or combined ($and) usage works the same way wing/room
   do; the filter accepts a metadata.source_file exact-match. End-to-
   end test exercises a real palace and confirms only the matching
   drawers are deleted, leaving siblings intact.

2. **mempalace mined** — companion to `mempalace status`, but groups
   by wing × source_file rather than wing × room. Answers "which
   files have I mined into this wing?" so an operator can pick
   targets for the new --source-file purge. Skips drawers without a
   source_file metadata key (diary entries, kg drawers, etc.).
   Honors --wing and --limit (default 50 sources per wing, --limit 0
   to show all). Uses the same paginated col.get() pattern as
   miner.status() to handle 100K+ drawer palaces without tripping
   sqlite's max-variable limit.

Together these close the "removing manually mined data" half of JP's
ask. The "automining when files change" half is queued for a separate
file-watcher daemon PR.

Pre-existing `mempalace mine <dir>` already covers "adding" (and
re-mines on mtime change via bulk_check_mined dedup), so no command
needed for refresh.

Tests: 8 new test cases (3 for purge --source-file with mock + real
palace; 3 for cmd_mined including end-to-end groupBy + wing filter +
no-palace error path; 2 updated to add the new arg to the test
namespace builder). Full suite: 1564 passed, 1 skipped, lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(cli): address Copilot review on #4

Four findings from Copilot's review pass on the mining-management CLI:

1. **--wing filter pushed into the query** (Copilot cli.py:800).
   Previously cmd_mined fetched every drawer in the palace and filtered
   in Python, doing O(total) work even for "mempalace mined --wing X"
   on a 150K-drawer palace where X has 12K. Pre-fetch IDs via
   col.get(where=...) to size the loop, then pass where= on each
   batch get(). On the canonical palace, --wing filters now scan
   ~10x less data.

2. **Reject negative --limit at parse time** (Copilot cli.py:818).
   argparse type=int silently accepted -1, producing nonsensical
   "... -2 more" output. Add a typed _nonneg_int validator that
   raises ArgumentTypeError on negatives. argparse turns that into
   a clean exit code 2 with usage error message.

3. **Update module __doc__** (Copilot cli.py:1495). main() uses the
   module docstring as the help epilog. The hard-coded command list
   stopped at "status" — added "mined" and "purge --source-file"
   so `mempalace --help` reflects the new commands.

4. **Dispatch test for `mempalace mined`** (Copilot cli.py:1542).
   Sibling commands have main()-level dispatch tests; mined didn't.
   A typo or registration regression would have slipped through unit
   coverage. Added test_main_mined_dispatches and a paired
   test_main_mined_rejects_negative_limit (verifies #2).

Tests: 68 cli tests pass; full suite green at 1566 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
Copilot review on #5 caught that the README still
documented session_recovery_read as current, which would leave users
calling a tool that returns "Unknown tool" after this PR ships.

Updates the four user-facing sections that described the recovery
collection as a current shipping feature:

* "What just shipped" — adds 2026-05-05 update note explaining the
  split was retired, with the generalizable lesson preserved (a side
  collection without a semantic-search MCP read tool is invisible).
* "Verbatim vs derivative" principle — reframes the recovery
  collection as the failed first attempt rather than a current
  example; pattern stays valid but each future sibling needs its
  own read surface.
* "Structural retrieval fixes" bullet — describes verbatim-only
  end-state, links to the new spec, retains historical context for
  Apr 25 → May 5.
* "Deterministic hook saves" bullet — updates to the new ingest-only
  save path, removes "recovery-collection marker" mention.
* "What it looks like in production" code block — updates the
  systemMessage example to the new "Transcript ingest triggered
  (wing=...)" shape.
* "P8" architectural principle — keeps it but reframes as on-hold
  with a precondition (read-surface parity) drawn from the cycle.
* Fork-ahead tracking table — replaces the old multi-collection-split
  row with a verbatim-only retirement row pointing at the four PRs;
  also retires references to session_recovery_read in the drawer_id
  surfacing row; adds new row for the mining management CLI (#4).

42 README meta-tests pass (test_readme_claims.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit that referenced this pull request May 6, 2026
…ol (#8)

* refactor(mcp): retire mempalace_session_recovery collection + read tool

Follow-up to fix/drop-checkpoint-write-path. With nothing writing to
the recovery collection anymore (the hooks moved to verbatim-only on
the parent branch), the read paths and the migration that fed the
collection are dead code. Delete them.

Removed in mempalace/:
- palace.py: _SESSION_RECOVERY_COLLECTION constant,
  get_session_recovery_collection(), _CHECKPOINT_TOPICS tuple
- mcp_server.py: _get_session_recovery_collection(),
  _recovery_collection_cache global, the topic-routing branch in
  tool_diary_write (now always lands in mempalace_drawers),
  tool_session_recovery_read() and its TOOLS dict registration
- migrate.py: migrate_checkpoints_to_recovery() and the dependent
  _CHECKPOINT_TOPICS import
- cli.py: cmd_repair --mode reorganize handler + the choice flag

Removed in tests/:
- tests/test_session_recovery.py — entire file (recovery-collection
  module test suite)
- tests/test_migrate.py: TestMigrateCheckpointsToRecovery class
- tests/test_mcp_server.py: TestCheckpointRouting and
  TestSessionRecoveryRead classes

Removed in docs/:
- website/reference/mcp-tools.md: mempalace_session_recovery_read
  section (caught by the readme-claims meta-test before deploy)

Production data on disks still has the mempalace_session_recovery
collection with its 763 archived entries — this PR's code change
makes the collection unreachable through any MCP/CLI path, but does
NOT delete the on-disk data. A separate one-shot script
(scripts/phase2_purge_recovery.py per the spec) handles the
collection-level delete after deploy. JP signed off on hard-delete
in the spec ack.

Net diff: 12 files, −1300 lines / +30 lines. 1540 tests pass, 1
skipped. Ruff lint + format clean.

Stacked on #3 (which stops new writes); will
auto-rebase against main when that merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* docs(README): align with verbatim-only retirement

Copilot review on #5 caught that the README still
documented session_recovery_read as current, which would leave users
calling a tool that returns "Unknown tool" after this PR ships.

Updates the four user-facing sections that described the recovery
collection as a current shipping feature:

* "What just shipped" — adds 2026-05-05 update note explaining the
  split was retired, with the generalizable lesson preserved (a side
  collection without a semantic-search MCP read tool is invisible).
* "Verbatim vs derivative" principle — reframes the recovery
  collection as the failed first attempt rather than a current
  example; pattern stays valid but each future sibling needs its
  own read surface.
* "Structural retrieval fixes" bullet — describes verbatim-only
  end-state, links to the new spec, retains historical context for
  Apr 25 → May 5.
* "Deterministic hook saves" bullet — updates to the new ingest-only
  save path, removes "recovery-collection marker" mention.
* "What it looks like in production" code block — updates the
  systemMessage example to the new "Transcript ingest triggered
  (wing=...)" shape.
* "P8" architectural principle — keeps it but reframes as on-hold
  with a precondition (read-surface parity) drawn from the cycle.
* Fork-ahead tracking table — replaces the old multi-collection-split
  row with a verbatim-only retirement row pointing at the four PRs;
  also retires references to session_recovery_read in the drawer_id
  surfacing row; adds new row for the mining management CLI (#4).

42 README meta-tests pass (test_readme_claims.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit to jphein/palace-daemon that referenced this pull request May 6, 2026
Closes the "automining when files are added or updated" half of JP's
ask. The other half (CLI add/remove of mined data) ships in
jphein/mempalace#4 as `mempalace mined` + `mempalace purge --source-file`.

Adds a watchdog-based file watcher that runs inside the daemon's
lifespan. Configured via `PALACE_WATCH_DIRS` env var:

    PALACE_WATCH_DIRS="/home/jp/Projects/realmwatch=wing_realmwatch,
                       /home/jp/Projects/oracle=wing_oracle"

Each entry is `path` or `path=wing`. Bare path derives wing via
`mempalace.config.normalize_wing_name(basename)` to match the
local-spawn miner default. Non-existent paths are warned-and-skipped,
not fatal — a misconfigured env var doesn't kill startup.

Architecture:

* watcher.py — pure-Python module: parse_watch_dirs(),
  _DebouncedMineHandler (subclass of FileSystemEventHandler with a
  per-target debounce timer), WatcherService (lifecycle wrapper around
  watchdog.Observer).
* main.py lifespan — instantiate WatcherService, schedule one
  recursive watch per target, register stop() on shutdown. Failures
  are non-fatal; the daemon serves traffic regardless.
* main.py — `_internal_mine()` async closure runs the same
  `mempalace mine ... --mode projects --wing X` argv as the existing
  /mine endpoint, gated by the same `_mine_sem`.
* main.py — new `GET /watch` endpoint surfaces the running target
  list. No POST/DELETE — runtime add/remove requires daemon restart
  (operator just edits the systemd unit env and restarts).

Debounce: events on the same target collapse via a 2-second per-target
timer. Catches editor write+rename storms and *.swp / *.lock churn.
Also pre-filters by extension allowlist (29 extensions: code, configs,
markdown). Inotify fires for everything; the handler discards
non-watched extensions before the timer schedules.

Path translation: when a watch path lives in a Syncthing-replicated
directory, the daemon-side path is translated through
`PALACE_DAEMON_PATH_MAP` before the mine subprocess runs (same
mechanism used by /mine).

Tests: 11 new unittest cases in tests/test_watcher.py:
- TestParseWatchDirs (7 cases): empty, single-with-derived-wing,
  explicit-wing, multiple, skips-nonexistent, skips-files, env-fallback
- TestDebouncedMineHandler (4 cases): single-event-fires,
  burst-collapses-to-one, skips-directory-events, skips-bad-extensions

Stacked on #1 (path-translation). When that
merges, this PR auto-rebases against main; reviewers see only the
new commits in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein added a commit to jphein/palace-daemon that referenced this pull request May 6, 2026
* feat(watcher): file-watcher service for auto-mining on file change

Closes the "automining when files are added or updated" half of JP's
ask. The other half (CLI add/remove of mined data) ships in
jphein/mempalace#4 as `mempalace mined` + `mempalace purge --source-file`.

Adds a watchdog-based file watcher that runs inside the daemon's
lifespan. Configured via `PALACE_WATCH_DIRS` env var:

    PALACE_WATCH_DIRS="/home/jp/Projects/realmwatch=wing_realmwatch,
                       /home/jp/Projects/oracle=wing_oracle"

Each entry is `path` or `path=wing`. Bare path derives wing via
`mempalace.config.normalize_wing_name(basename)` to match the
local-spawn miner default. Non-existent paths are warned-and-skipped,
not fatal — a misconfigured env var doesn't kill startup.

Architecture:

* watcher.py — pure-Python module: parse_watch_dirs(),
  _DebouncedMineHandler (subclass of FileSystemEventHandler with a
  per-target debounce timer), WatcherService (lifecycle wrapper around
  watchdog.Observer).
* main.py lifespan — instantiate WatcherService, schedule one
  recursive watch per target, register stop() on shutdown. Failures
  are non-fatal; the daemon serves traffic regardless.
* main.py — `_internal_mine()` async closure runs the same
  `mempalace mine ... --mode projects --wing X` argv as the existing
  /mine endpoint, gated by the same `_mine_sem`.
* main.py — new `GET /watch` endpoint surfaces the running target
  list. No POST/DELETE — runtime add/remove requires daemon restart
  (operator just edits the systemd unit env and restarts).

Debounce: events on the same target collapse via a 2-second per-target
timer. Catches editor write+rename storms and *.swp / *.lock churn.
Also pre-filters by extension allowlist (29 extensions: code, configs,
markdown). Inotify fires for everything; the handler discards
non-watched extensions before the timer schedules.

Path translation: when a watch path lives in a Syncthing-replicated
directory, the daemon-side path is translated through
`PALACE_DAEMON_PATH_MAP` before the mine subprocess runs (same
mechanism used by /mine).

Tests: 11 new unittest cases in tests/test_watcher.py:
- TestParseWatchDirs (7 cases): empty, single-with-derived-wing,
  explicit-wing, multiple, skips-nonexistent, skips-files, env-fallback
- TestDebouncedMineHandler (4 cases): single-event-fires,
  burst-collapses-to-one, skips-directory-events, skips-bad-extensions

Stacked on #1 (path-translation). When that
merges, this PR auto-rebases against main; reviewers see only the
new commits in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(watcher): address Copilot review on #2

Seven findings, all addressed:

1. **Path translation before is_dir check** (Copilot watcher.py:133).
   parse_watch_dirs() validated against the daemon filesystem before
   any path translation, so a PALACE_WATCH_DIRS value written in the
   client namespace was always rejected as "not a directory". Add a
   translator= kwarg; main.py now passes _translate_client_path so
   client-namespace entries reach the daemon's view first.

2. **Drop on_any_event in favor of specific subscriptions** (Copilot
   watcher.py:160). watchdog 3.x emits opened/closed events on plain
   file reads on Linux; routing those through the debounce would
   re-mine the project on every file open. Replaced on_any_event
   with explicit on_created / on_modified / on_moved / on_deleted.

3. **Check dest_path on FileMovedEvent** (Copilot watcher.py:171).
   Editor save-via-rename writes a temp file then renames to the
   real filename. The src_path on the move event is the temp; only
   dest_path has the real extension. Extracted _has_watchable_extension()
   helper; on_moved() checks both sides.

4. **Per-target schedule() failure isolation** (Copilot watcher.py:219).
   A bare for-loop let an exception in observer.schedule() (e.g.
   inotify watch limit on a large repo) abort the entire service.
   Wrap each schedule() in try/except; one failed target no longer
   disables the others. Logged at warning level.

5. **Cancel debounce timers on stop** (Copilot watcher.py:224). stop()
   tore down the observer but left armed threading.Timer objects to
   fire mid-teardown. Add cancel_pending() on the handler; service
   stop() walks all handlers first.

6. **Surface watcher-coroutine errors** (Copilot watcher.py:246).
   asyncio.run_coroutine_threadsafe() returns a Future the caller
   must observe; dropping it swallowed exceptions silently. Add a
   _log_future_exception done-callback.

7. **/watch reports running state accurately** (Copilot main.py:558).
   app.state.watcher was set unconditionally even when start()
   returned early (empty env, missing dep, all targets failed). Now
   only published to app.state when watcher.is_running. GET /watch
   reports running: false for idle/disabled instead of misleading
   running: true.

Tests: 28 pass (was 24). New cases:
- test_rename_to_watched_extension_fires (covers #3)
- test_cancel_pending_drops_armed_timer (covers #5)
- test_one_failed_target_doesnt_disable_others (covers #4)
- test_translator_runs_before_is_dir (covers #1)
Existing handler tests retargeted from on_any_event to on_modified
since on_any_event no longer exists in this revision.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants