feat(index): mm index --debounce-window + --flush + --status (closes #536 documented gap)#548

Merged
memtomem merged 1 commit into main from feat/index-debounce-window
Apr 29, 2026
@memtomem
Owner

Summary

Closes the documented debounce gap from PR #536. mm index gains three flags backed by an on-disk debounce queue at ~/.memtomem/index_debounce_queue.json (protected by a sidecar lockfile):

  • --debounce-window <SECONDS>: record PATH in the queue and drain entries that have been silent for ≥ SECONDS. Designed for PostToolUse[Write] hook callers; rapid consecutive writes to the same file restart the window, so a codegen burst is indexed once at the end.
  • --flush: synchronously drain every queued entry. Blocks until each file is indexed (or recorded as an error). Worst-case latency ≈ queue depth × per-file index cost. The plugin's Stop hook now chains mm index --flush before mm session end --auto.
  • --status: snapshot of the queue (depth, oldest entry). Race-prone by design (no lock); concurrent hooks may modify the queue between the read and any caller action. Use as telemetry only; --flush is the only correctness primitive for "drain the queue".

The three flags are mutually exclusive with each other and with the plain mm index <path> invocation.
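
The window-restart behaviour can be illustrated with a minimal in-memory sketch. The names enqueue, drain_ready, first_seen and last_seen are assumptions loosely modelled on the test names in this PR, not the actual implementation; the real queue is persisted to disk and lock-protected.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    first_seen: float  # time of the first write in the burst
    last_seen: float   # time of the latest write; this is what restarts the window


def enqueue(queue: dict, path: str, now: float) -> None:
    """Record a write to `path`; a repeat write only refreshes last_seen."""
    entry = queue.get(path)
    if entry is None:
        queue[path] = Entry(first_seen=now, last_seen=now)
    else:
        entry.last_seen = now


def drain_ready(queue: dict, window: float, now: float) -> list:
    """Remove and return the paths that have been silent for >= window seconds."""
    ready = [p for p, e in queue.items() if now - e.last_seen >= window]
    for p in ready:
        del queue[p]
    return ready


queue: dict = {}
enqueue(queue, "a.md", now=0.0)
enqueue(queue, "a.md", now=3.0)                   # burst continues, window restarts
early = drain_ready(queue, window=5.0, now=5.0)   # only 2 s of silence so far
late = drain_ready(queue, window=5.0, now=8.0)    # 5 s of silence, entry qualifies
print(early, late)
```

Passing the clock in as `now` keeps the sketch deterministic; a real hook would presumably read the wall clock instead.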

Concurrency model

The queue file is replaced atomically via os.replace on every mutation. Locking the queue file directly would disconnect the lock from the file mid-mutation (the OS keeps the locked inode alive after replace, but later openers see a different inode). The lock is therefore on a sidecar ~/.memtomem/.index_debounce_queue.json.lock — never replaced, so all writers serialize on the same inode. Verified by test_concurrent_enqueue_does_not_lose_entries (20 parallel threads, all entries persisted).
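
A minimal sketch of the sidecar-lock pattern described above, assuming a JSON object as the queue payload and POSIX fcntl.flock (Unix only). The helper names sidecar_lock, enqueue and demo are hypothetical and the file layout is simplified; this is not the code from the PR.

```python
import fcntl
import json
import os
import tempfile
import threading
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def sidecar_lock(lock_path: Path):
    """Hold an exclusive flock on a lockfile that is never replaced."""
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def enqueue(queue_path: Path, lock_path: Path, path: str) -> None:
    """Read-modify-write the queue while holding the sidecar lock."""
    with sidecar_lock(lock_path):
        queue = json.loads(queue_path.read_text()) if queue_path.exists() else {}
        queue[path] = True
        # Write a temp file in the same directory, then swap it in atomically.
        fd, tmp_name = tempfile.mkstemp(dir=queue_path.parent)
        with os.fdopen(fd, "w") as tmp:
            json.dump(queue, tmp)
        os.replace(tmp_name, queue_path)


def demo(workers: int = 20) -> int:
    """Enqueue from several threads and return how many entries persisted."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        queue_path = Path(tmp_dir) / "queue.json"
        lock_path = Path(tmp_dir) / "queue.json.lock"
        threads = [
            threading.Thread(target=enqueue, args=(queue_path, lock_path, f"file{i}.md"))
            for i in range(workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return len(json.loads(queue_path.read_text()))


print(demo())
```

Each thread opens the lockfile itself, so every caller gets its own open file description and the flock calls genuinely contend. Because the lockfile is never the target of os.replace, all of them lock the same inode.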

Indexer errors keep entries in the queue for retry on the next hook fire — failing files are not silently lost. Last-write-wins for --namespace/--force when the same path is enqueued twice.
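
The retry and last-write-wins rules can be sketched as follows. The queue layout, the drain signature and the flaky_indexer stand-in are illustrative assumptions; only the behaviour (failures stay queued, the latest options win) comes from the description above.

```python
from typing import Callable, Optional

seen: list = []  # records what the stand-in indexer was asked to index


def enqueue(queue: dict, path: str, namespace: Optional[str] = None, force: bool = False) -> None:
    """Last write wins: a repeat enqueue overwrites the stored options."""
    queue[path] = {"namespace": namespace, "force": force}


def drain(queue: dict, index_file: Callable[..., None]) -> dict:
    """Index every entry; remove successes, keep failures for the next drain."""
    errors = {}
    for path, opts in list(queue.items()):
        try:
            index_file(path, **opts)
        except Exception as exc:
            errors[path] = str(exc)  # entry stays queued so a later hook retries it
        else:
            del queue[path]
    return errors


def flaky_indexer(path: str, namespace: Optional[str] = None, force: bool = False) -> None:
    """Hypothetical indexer that fails for one specific file."""
    if path == "bad.md":
        raise OSError("unreadable")
    seen.append((path, namespace, force))


queue: dict = {}
enqueue(queue, "good.md", namespace="notes")
enqueue(queue, "good.md", namespace="docs", force=True)  # overrides the first call
enqueue(queue, "bad.md")
errors = drain(queue, flaky_indexer)
print(errors, list(queue))
```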

RFC-B (PreCompact, deferred) — future-extensibility reserved

When the PreCompact hook contract lands and a checkpoint handler wants to flush only the files Claude Code reports as in-flight, --flush will gain a --paths <list> form. The underlying debounce.drain_all already accepts an optional paths= filter (test_paths_filter_drains_subset_only pins this) so the CLI addition is additive — no second ABI change. CHANGELOG and claude-code.md Caveats both call this out so a future RFC-B reviewer doesn't have to rediscover the extension shape.
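
To show why the CLI addition would be additive, here is a hedged sketch of a drain function with an optional paths= filter. Only the existence of that keyword on debounce.drain_all is stated above; the rest of the signature and the body are assumptions, and error handling is left out for brevity.

```python
from typing import Callable, Iterable, Optional


def drain_all(
    queue: dict,
    index_file: Callable[[str], None],
    paths: Optional[Iterable[str]] = None,
) -> list:
    """Drain every entry, or only the entries named in `paths` when it is given."""
    wanted = None if paths is None else set(paths)
    drained = []
    for path in list(queue):
        if wanted is not None and path not in wanted:
            continue  # not selected, leave it queued
        index_file(path)
        del queue[path]
        drained.append(path)
    return drained


queue = {"a.md": {}, "b.md": {}, "c.md": {}}
indexed: list = []
subset = drain_all(queue, indexed.append, paths=["b.md"])  # selective flush
rest = drain_all(queue, indexed.append)                    # plain flush
print(subset, rest)
```

With paths left at its default the function behaves exactly like an unfiltered flush, which is what lets existing callers stay unchanged.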

Plugin + docs sync

packages/memtomem-claude-plugin/hooks/hooks.json and docs/guides/integrations/claude-code.md Hooks Automation Setup are updated byte-for-byte (TestPluginHooksDocsParity from #536 catches drift):

  • PostToolUse[Write]: mm index --debounce-window 5 "$FP"
  • Stop: mm index --flush; mm session end --auto (timeout bumped 5000 → 10000 ms to accommodate flush worst-case latency)
  • "Important Caveats": the old "Debounce gap" caveat (which pointed at native debounce as future work) is replaced with a "Debounce mechanics" caveat covering the queue file, window-restart semantics, the Stop-hook flush chain, the --status race, and the RFC-B extension shape.

Test plan

  • uv run pytest packages/memtomem/tests/test_index_debounce.py -m "not ollama": 19 new tests pass (enqueue semantics, drain-ready/drain-all, status snapshot, 20-thread concurrency, partial-drain retry).
  • uv run pytest packages/memtomem/tests -m "not ollama": 3147 pass, 0 fail (no regressions; TestPluginHooksDocsParity enforces docs↔plugin parity).
  • uv run ruff check && uv run ruff format --check && uv run mypy: all clean (mypy advisory, no new errors).
  • Manual: mm index --debounce-window 5 /tmp/test.md queues; second call within 5s updates last_seen; third call after 5s drains. mm index --status prints queue depth. mm index --flush drains synchronously.

Open questions for review

  • Reviewer: confirm the 5-second window default in the plugin recipe is sane for typical codegen burst spacing; a lower value (1-2 s) coalesces fewer writes, while a higher one (10 s) delays indexing too long
  • Reviewer: confirm sidecar-lockfile pattern over named-pipe / SQLite-WAL alternatives — the existing codebase uses fcntl.flock for _liveness.py so this is consistent
  • Reviewer: confirm Stop-hook timeout bump 5000 → 10000 ms is enough for flush worst-case (a burst of 20 files at ~250ms each = 5s; the bump leaves headroom but isn't unbounded)
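
The arithmetic behind the last question, as a small worked sketch. The ~250 ms per-file cost is the estimate quoted above, the helper names are hypothetical, and the time mm session end --auto itself needs is ignored here.

```python
def flush_worst_case_ms(queue_depth: int, per_file_ms: int) -> int:
    """Worst-case flush latency: queue depth times per-file index cost."""
    return queue_depth * per_file_ms


def max_depth_within(timeout_ms: int, per_file_ms: int) -> int:
    """Largest queue depth that can still drain inside the timeout."""
    return timeout_ms // per_file_ms


burst_ms = flush_worst_case_ms(20, 250)   # the 20-file burst from the question
limit = max_depth_within(10_000, 250)     # depth at which the new timeout is hit
print(burst_ms, limit)
```

Under these assumptions the 20-file burst takes 5000 ms, and a queue deeper than 40 files would exceed the 10000 ms timeout.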

🤖 Generated with Claude Code

…536 documented gap)

PR #536 shipped the PostToolUse[Write] extension/path filter and
documented but didn't address the rapid-consecutive-writes gap:
codegen loops that touch the same file many times in a few seconds
re-index it on every Write, fanning out embedding cost and producing
noisy index churn. This PR closes the gap with a file-system-backed
debounce queue and three new CLI flags on ``mm index``.

The persistent queue lives at
``~/.memtomem/index_debounce_queue.json``, guarded by ``flock`` on a
sidecar lockfile (the queue file itself is replaced atomically via
``os.replace``, so locking it directly would disconnect the lock
from the file mid-mutation; the sidecar is never replaced and
therefore serializes correctly under concurrent writers — exercised
by ``test_concurrent_enqueue_does_not_lose_entries`` with 20 parallel
threads).

### CLI surface

- ``--debounce-window <SECONDS>``: record PATH in the queue and drain
  entries that have been silent at least SECONDS. Repeated writes to
  the same path restart the window — the *last* hook in a burst (or
  any later hook firing on a different file) indexes the burst's final
  state once.
- ``--flush``: synchronously drain every queued entry. Blocks until
  every queued file is indexed (or recorded as an error). Worst-case
  latency ≈ queue depth × per-file index cost. The plugin's Stop hook
  now chains ``mm index --flush`` before ``mm session end --auto`` so
  the burst at session end isn't left in the queue.
- ``--status``: snapshot of the queue (depth, oldest entry). Read
  without a lock by design — concurrent hooks may add or drain
  entries between the read and any caller action. The docstring and
  ``--help`` text both flag the race so callers don't try
  status-then-flush as a correctness pattern; ``--flush`` is the
  only "drain the queue" primitive.

The three flags are mutually exclusive with each other and with the
plain ``mm index <path>`` invocation.
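
A lock-free status read might look like the sketch below. The function name follows the ``status_snapshot`` tests mentioned in this message, but the JSON layout (a mapping of path to ``first_seen``/``last_seen``) is an assumption. Because the queue file is swapped in atomically, a reader always sees a whole file, yet the result can be stale as soon as it is returned.

```python
import json
import tempfile
from pathlib import Path


def status_snapshot(queue_path: Path) -> dict:
    """Read the queue without taking the lock; telemetry only."""
    try:
        queue = json.loads(queue_path.read_text())
    except FileNotFoundError:
        queue = {}  # no queue file yet means an empty queue
    # Oldest entry is the one whose burst started first.
    oldest = min(queue, key=lambda p: queue[p]["first_seen"], default=None)
    return {"depth": len(queue), "oldest": oldest}


with tempfile.TemporaryDirectory() as tmp_dir:
    queue_path = Path(tmp_dir) / "queue.json"
    empty = status_snapshot(queue_path)
    queue_path.write_text(json.dumps({
        "a.md": {"first_seen": 12.0, "last_seen": 30.0},
        "b.md": {"first_seen": 10.0, "last_seen": 11.0},
    }))
    snapshot = status_snapshot(queue_path)

print(empty, snapshot)
```

Nothing stops another hook from enqueuing or draining between this read and whatever the caller does next, which is why status-then-flush is not a safe pattern.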

### Indexer error semantics

Indexing errors keep the entry in the queue for retry on the next
hook fire — failing files are not silently lost. Last-write-wins for
``--namespace``/``--force`` when the same path is enqueued twice (most
recent caller's intent applies on drain).

### RFC-B (PreCompact, deferred) — future-extensibility reserved

When the PreCompact hook contract lands and a checkpoint handler
wants to flush only the files Claude Code reports as in-flight,
``--flush`` will gain a ``--paths <list>`` form. The underlying
``debounce.drain_all`` already accepts an optional ``paths=`` filter
(verified by ``test_paths_filter_drains_subset_only``) so the CLI
addition is additive — no second ABI change. CHANGELOG note covers
this so a future RFC-B reviewer doesn't have to rediscover the
extension shape.

### Plugin + docs sync

Plugin ``hooks.json`` and ``docs/guides/integrations/claude-code.md``
Hooks Automation Setup are updated byte-for-byte
(``TestPluginHooksDocsParity`` from #536 catches drift). The
PostToolUse[Write] hook calls ``mm index --debounce-window 5``; the
Stop hook chains ``mm index --flush; mm session end --auto`` (Stop
hook timeout bumped from 5000 to 10000 ms to accommodate the flush
worst-case latency).

The "Important Caveats" section is rewritten — the old "Debounce gap"
caveat (which pointed at native debounce as future work) is replaced
with a "Debounce mechanics" caveat that explains the queue file,
window restart semantics, the Stop-hook flush chain, the race-prone
nature of ``--status``, and the RFC-B selective-payload extension
shape.

### Tests

19 new unit tests in ``test_index_debounce.py`` covering enqueue
semantics (last-write-wins, first-seen vs last-seen, distinct paths),
``drain_ready`` (caller's own enqueue doesn't qualify, mixed-readiness
queue, error retry, namespace/force forwarding), ``drain_all`` (full
drain, subset via ``paths=``, empty-queue no-op), ``status_snapshot``
(empty/oldest-by-first-seen), and persistence + concurrency
(round-trip across calls, 20-thread parallel enqueue, partial-drain
write-back). The parity test from #536 keeps the plugin and docs
snippets aligned automatically.

Co-Authored-By: Claude <[email protected]>
@memtomem memtomem merged commit 9246b76 into main Apr 29, 2026
7 checks passed
@memtomem memtomem deleted the feat/index-debounce-window branch April 29, 2026 07:57
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 29, 2026