feat(watcher): file-watcher service for auto-mining on file change#2
Closed
jphein wants to merge 2 commits intofeat/transcript-path-translationfrom
Closed
feat(watcher): file-watcher service for auto-mining on file change#2jphein wants to merge 2 commits intofeat/transcript-path-translationfrom
jphein wants to merge 2 commits intofeat/transcript-path-translationfrom
Conversation
Closes the "automining when files are added or updated" half of JP's ask. The other half (CLI add/remove of mined data) ships in jphein/mempalace#4 as `mempalace mined` + `mempalace purge --source-file`. Adds a watchdog-based file watcher that runs inside the daemon's lifespan. Configured via `PALACE_WATCH_DIRS` env var: PALACE_WATCH_DIRS="/home/jp/Projects/realmwatch=wing_realmwatch, /home/jp/Projects/oracle=wing_oracle" Each entry is `path` or `path=wing`. Bare path derives wing via `mempalace.config.normalize_wing_name(basename)` to match the local-spawn miner default. Non-existent paths are warned-and-skipped, not fatal — a misconfigured env var doesn't kill startup. Architecture: * watcher.py — pure-Python module: parse_watch_dirs(), _DebouncedMineHandler (subclass of FileSystemEventHandler with a per-target debounce timer), WatcherService (lifecycle wrapper around watchdog.Observer). * main.py lifespan — instantiate WatcherService, schedule one recursive watch per target, register stop() on shutdown. Failures are non-fatal; the daemon serves traffic regardless. * main.py — `_internal_mine()` async closure runs the same `mempalace mine ... --mode projects --wing X` argv as the existing /mine endpoint, gated by the same `_mine_sem`. * main.py — new `GET /watch` endpoint surfaces the running target list. No POST/DELETE — runtime add/remove requires daemon restart (operator just edits the systemd unit env and restarts). Debounce: events on the same target collapse via a 2-second per-target timer. Catches editor write+rename storms and *.swp / *.lock churn. Also pre-filters by extension allowlist (29 extensions: code, configs, markdown). Inotify fires for everything; the handler discards non-watched extensions before the timer schedules. Path translation: when a watch path lives in a Syncthing-replicated directory, the daemon-side path is translated through `PALACE_DAEMON_PATH_MAP` before the mine subprocess runs (same mechanism used by /mine). Tests: 11 new unittest cases in tests/test_watcher.py: - TestParseWatchDirs (7 cases): empty, single-with-derived-wing, explicit-wing, multiple, skips-nonexistent, skips-files, env-fallback - TestDebouncedMineHandler (4 cases): single-event-fires, burst-collapses-to-one, skips-directory-events, skips-bad-extensions Stacked on #1 (path-translation). When that merges, this PR auto-rebases against main; reviewers see only the new commits in this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
There was a problem hiding this comment.
Pull request overview
Adds a watchdog-backed file-watcher to palace-daemon so project directories can be re-mined automatically after file changes, extending the daemon’s existing /mine flow with background monitoring and a status endpoint.
Changes:
- Adds
watcher.pywith env parsing, debounce logic, and aWatcherServicewrapper aroundwatchdog. - Wires watcher startup/shutdown into the FastAPI lifespan and adds
GET /watchto report configured targets. - Adds watcher unit tests and the new
watchdogdependency.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
watcher.py |
New watcher module for parsing watch targets, filtering events, debouncing, and starting/stopping filesystem observers. |
main.py |
Starts the watcher during app lifespan, stops it on shutdown, and adds the /watch endpoint. |
tests/test_watcher.py |
Adds unit tests for watch-dir parsing and debounce handler behavior. |
requirements.txt |
Adds the watchdog package dependency. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+131
to
+133
| path = Path(path_str).expanduser().resolve() | ||
| if not path.is_dir(): | ||
| _log.warning("PALACE_WATCH_DIRS: skipping %r (not a directory)", path_str) |
| self._timer: threading.Timer | None = None | ||
| self._lock = threading.Lock() | ||
|
|
||
| def on_any_event(self, event: FileSystemEvent) -> None: |
Comment on lines
+167
to
+171
| try: | ||
| suffix = Path(event.src_path).suffix.lower() | ||
| except Exception: | ||
| return | ||
| if suffix not in _WATCHABLE_EXTENSIONS: |
Comment on lines
+211
to
+219
| self._targets = targets | ||
| self._observer = Observer() | ||
| for target in targets: | ||
| handler = _DebouncedMineHandler(target, self._mine_fn) | ||
| self._observer.schedule(handler, str(target.path), recursive=True) | ||
| _log.info("watching %s → wing=%s", target.path, target.wing) | ||
| self._observer.start() | ||
| _log.info("WatcherService started with %d target(s)", len(targets)) | ||
|
|
Comment on lines
+220
to
+224
| def stop(self) -> None: | ||
| if self._observer is None: | ||
| return | ||
| try: | ||
| self._observer.stop() |
Comment on lines
+243
to
+246
| def _trigger(path: str, wing: str) -> None: | ||
| _log.info("watcher fired mine: dir=%s wing=%s", path, wing) | ||
| try: | ||
| asyncio.run_coroutine_threadsafe(internal_mine(path, wing), loop) |
Comment on lines
+531
to
+558
| try: | ||
| from watcher import WatcherService, make_async_mine_fn, parse_watch_dirs | ||
|
|
||
| targets = parse_watch_dirs() | ||
| loop = asyncio.get_running_loop() | ||
|
|
||
| async def _internal_mine(path: str, wing: str) -> None: | ||
| translated = _translate_client_path(path) | ||
| dir_path = Path(translated) | ||
| if not dir_path.is_dir(): | ||
| logger.warning("watcher: skipping mine for non-dir path %s", translated) | ||
| return | ||
| mempalace_bin = os.path.join(os.path.dirname(sys.executable), "mempalace") | ||
| argv = [mempalace_bin, "mine", translated, "--mode", "projects", "--wing", wing] | ||
| # Same pattern as /mine endpoint: list-form argv, no shell. | ||
| async with _mine_sem: | ||
| proc = await asyncio.create_subprocess_exec( | ||
| *argv, | ||
| stdout=asyncio.subprocess.PIPE, | ||
| stderr=asyncio.subprocess.PIPE, | ||
| ) | ||
| stdout, stderr = await proc.communicate() | ||
| if proc.returncode != 0: | ||
| logger.warning("watcher mine returned %s for %s", proc.returncode, translated) | ||
|
|
||
| watcher = WatcherService(make_async_mine_fn(loop, _internal_mine)) | ||
| watcher.start(targets) | ||
| app.state.watcher = watcher |
Seven findings, all addressed: 1. **Path translation before is_dir check** (Copilot watcher.py:133). parse_watch_dirs() validated against the daemon filesystem before any path translation, so a PALACE_WATCH_DIRS value written in the client namespace was always rejected as "not a directory". Add a translator= kwarg; main.py now passes _translate_client_path so client-namespace entries reach the daemon's view first. 2. **Drop on_any_event in favor of specific subscriptions** (Copilot watcher.py:160). watchdog 3.x emits opened/closed events on plain file reads on Linux; routing those through the debounce would re-mine the project on every file open. Replaced on_any_event with explicit on_created / on_modified / on_moved / on_deleted. 3. **Check dest_path on FileMovedEvent** (Copilot watcher.py:171). Editor save-via-rename writes a temp file then renames to the real filename. The src_path on the move event is the temp; only dest_path has the real extension. Extracted _has_watchable_extension() helper; on_moved() checks both sides. 4. **Per-target schedule() failure isolation** (Copilot watcher.py:219). A bare for-loop let an exception in observer.schedule() (e.g. inotify watch limit on a large repo) abort the entire service. Wrap each schedule() in try/except; one failed target no longer disables the others. Logged at warning level. 5. **Cancel debounce timers on stop** (Copilot watcher.py:224). stop() tore down the observer but left armed threading.Timer objects to fire mid-teardown. Add cancel_pending() on the handler; service stop() walks all handlers first. 6. **Surface watcher-coroutine errors** (Copilot watcher.py:246). asyncio.run_coroutine_threadsafe() returns a Future the caller must observe; dropping it swallowed exceptions silently. Add a _log_future_exception done-callback. 7. **/watch reports running state accurately** (Copilot main.py:558). app.state.watcher was set unconditionally even when start() returned early (empty env, missing dep, all targets failed). Now only published to app.state when watcher.is_running. GET /watch reports running: false for idle/disabled instead of misleading running: true. Tests: 28 pass (was 24). New cases: - test_rename_to_watched_extension_fires (covers #3) - test_cancel_pending_drops_armed_timer (covers #5) - test_one_failed_target_doesnt_disable_others (covers #4) - test_translator_runs_before_is_dir (covers #1) Existing handler tests retargeted from on_any_event to on_modified since on_any_event no longer exists in this revision. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Owner
Author
|
Copilot review addressed in commit
Stacked on #1 (path-translation). Tests: 28 pass (+4 new), 0 fail. Ran |
jphein
added a commit
that referenced
this pull request
May 6, 2026
Seven findings, all addressed: 1. **Path translation before is_dir check** (Copilot watcher.py:133). parse_watch_dirs() validated against the daemon filesystem before any path translation, so a PALACE_WATCH_DIRS value written in the client namespace was always rejected as "not a directory". Add a translator= kwarg; main.py now passes _translate_client_path so client-namespace entries reach the daemon's view first. 2. **Drop on_any_event in favor of specific subscriptions** (Copilot watcher.py:160). watchdog 3.x emits opened/closed events on plain file reads on Linux; routing those through the debounce would re-mine the project on every file open. Replaced on_any_event with explicit on_created / on_modified / on_moved / on_deleted. 3. **Check dest_path on FileMovedEvent** (Copilot watcher.py:171). Editor save-via-rename writes a temp file then renames to the real filename. The src_path on the move event is the temp; only dest_path has the real extension. Extracted _has_watchable_extension() helper; on_moved() checks both sides. 4. **Per-target schedule() failure isolation** (Copilot watcher.py:219). A bare for-loop let an exception in observer.schedule() (e.g. inotify watch limit on a large repo) abort the entire service. Wrap each schedule() in try/except; one failed target no longer disables the others. Logged at warning level. 5. **Cancel debounce timers on stop** (Copilot watcher.py:224). stop() tore down the observer but left armed threading.Timer objects to fire mid-teardown. Add cancel_pending() on the handler; service stop() walks all handlers first. 6. **Surface watcher-coroutine errors** (Copilot watcher.py:246). asyncio.run_coroutine_threadsafe() returns a Future the caller must observe; dropping it swallowed exceptions silently. Add a _log_future_exception done-callback. 7. **/watch reports running state accurately** (Copilot main.py:558). app.state.watcher was set unconditionally even when start() returned early (empty env, missing dep, all targets failed). Now only published to app.state when watcher.is_running. GET /watch reports running: false for idle/disabled instead of misleading running: true. Tests: 28 pass (was 24). New cases: - test_rename_to_watched_extension_fires (covers #3) - test_cancel_pending_drops_armed_timer (covers #5) - test_one_failed_target_doesnt_disable_others (covers #4) - test_translator_runs_before_is_dir (covers #1) Existing handler tests retargeted from on_any_event to on_modified since on_any_event no longer exists in this revision. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
jphein
added a commit
that referenced
this pull request
May 6, 2026
* feat(watcher): file-watcher service for auto-mining on file change Closes the "automining when files are added or updated" half of JP's ask. The other half (CLI add/remove of mined data) ships in jphein/mempalace#4 as `mempalace mined` + `mempalace purge --source-file`. Adds a watchdog-based file watcher that runs inside the daemon's lifespan. Configured via `PALACE_WATCH_DIRS` env var: PALACE_WATCH_DIRS="/home/jp/Projects/realmwatch=wing_realmwatch, /home/jp/Projects/oracle=wing_oracle" Each entry is `path` or `path=wing`. Bare path derives wing via `mempalace.config.normalize_wing_name(basename)` to match the local-spawn miner default. Non-existent paths are warned-and-skipped, not fatal — a misconfigured env var doesn't kill startup. Architecture: * watcher.py — pure-Python module: parse_watch_dirs(), _DebouncedMineHandler (subclass of FileSystemEventHandler with a per-target debounce timer), WatcherService (lifecycle wrapper around watchdog.Observer). * main.py lifespan — instantiate WatcherService, schedule one recursive watch per target, register stop() on shutdown. Failures are non-fatal; the daemon serves traffic regardless. * main.py — `_internal_mine()` async closure runs the same `mempalace mine ... --mode projects --wing X` argv as the existing /mine endpoint, gated by the same `_mine_sem`. * main.py — new `GET /watch` endpoint surfaces the running target list. No POST/DELETE — runtime add/remove requires daemon restart (operator just edits the systemd unit env and restarts). Debounce: events on the same target collapse via a 2-second per-target timer. Catches editor write+rename storms and *.swp / *.lock churn. Also pre-filters by extension allowlist (29 extensions: code, configs, markdown). Inotify fires for everything; the handler discards non-watched extensions before the timer schedules. Path translation: when a watch path lives in a Syncthing-replicated directory, the daemon-side path is translated through `PALACE_DAEMON_PATH_MAP` before the mine subprocess runs (same mechanism used by /mine). Tests: 11 new unittest cases in tests/test_watcher.py: - TestParseWatchDirs (7 cases): empty, single-with-derived-wing, explicit-wing, multiple, skips-nonexistent, skips-files, env-fallback - TestDebouncedMineHandler (4 cases): single-event-fires, burst-collapses-to-one, skips-directory-events, skips-bad-extensions Stacked on #1 (path-translation). When that merges, this PR auto-rebases against main; reviewers see only the new commits in this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * fix(watcher): address Copilot review on #2 Seven findings, all addressed: 1. **Path translation before is_dir check** (Copilot watcher.py:133). parse_watch_dirs() validated against the daemon filesystem before any path translation, so a PALACE_WATCH_DIRS value written in the client namespace was always rejected as "not a directory". Add a translator= kwarg; main.py now passes _translate_client_path so client-namespace entries reach the daemon's view first. 2. **Drop on_any_event in favor of specific subscriptions** (Copilot watcher.py:160). watchdog 3.x emits opened/closed events on plain file reads on Linux; routing those through the debounce would re-mine the project on every file open. Replaced on_any_event with explicit on_created / on_modified / on_moved / on_deleted. 3. **Check dest_path on FileMovedEvent** (Copilot watcher.py:171). Editor save-via-rename writes a temp file then renames to the real filename. The src_path on the move event is the temp; only dest_path has the real extension. Extracted _has_watchable_extension() helper; on_moved() checks both sides. 4. **Per-target schedule() failure isolation** (Copilot watcher.py:219). A bare for-loop let an exception in observer.schedule() (e.g. inotify watch limit on a large repo) abort the entire service. Wrap each schedule() in try/except; one failed target no longer disables the others. Logged at warning level. 5. **Cancel debounce timers on stop** (Copilot watcher.py:224). stop() tore down the observer but left armed threading.Timer objects to fire mid-teardown. Add cancel_pending() on the handler; service stop() walks all handlers first. 6. **Surface watcher-coroutine errors** (Copilot watcher.py:246). asyncio.run_coroutine_threadsafe() returns a Future the caller must observe; dropping it swallowed exceptions silently. Add a _log_future_exception done-callback. 7. **/watch reports running state accurately** (Copilot main.py:558). app.state.watcher was set unconditionally even when start() returned early (empty env, missing dep, all targets failed). Now only published to app.state when watcher.is_running. GET /watch reports running: false for idle/disabled instead of misleading running: true. Tests: 28 pass (was 24). New cases: - test_rename_to_watched_extension_fires (covers #3) - test_cancel_pending_drops_armed_timer (covers #5) - test_one_failed_target_doesnt_disable_others (covers #4) - test_translator_runs_before_is_dir (covers #1) Existing handler tests retargeted from on_any_event to on_modified since on_any_event no longer exists in this revision. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> --------- Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the automining when files are added or updated half of JP's mining-management ask. (The CLI half ships in jphein/mempalace#4 as `mempalace mined` and `mempalace purge --source-file`.)
Stacked on #1
Base branch is `feat/transcript-path-translation`. The watcher uses `_translate_client_path` from that PR to bridge client-namespace watch paths to the daemon's filesystem view. When PR #1 merges, this PR auto-rebases against `main`.
Why
PR #1 + jphein/mempalace#2 restored hook-driven transcript mining. PR #3 + #5 retired the checkpoint summary path. The remaining gap was project files: a `.md`, `.py`, etc. file that JP edits doesn't get re-mined until the next manual `mempalace mine` invocation. This PR fills that gap so verbatim search stays current automatically.
Behavior
```
$ curl http://disks.jphe.in:8085/watch
{
"running": true,
"targets": [
{"path": "/mnt/raid/projects/realmwatch", "wing": "wing_realmwatch"},
{"path": "/mnt/raid/projects/oracle", "wing": "wing_oracle"}
]
}
```
When a file inside a watched path changes:
Architecture
Test plan
Risk
🤖 Generated with Claude Code