refactor(stm): decouple surfacing via remote-only MCP#2
Merged
Conversation
STM previously supported two LTM access paths — in-process (direct SearchPipeline/SqliteBackend import) and mcp_client (stdio MCP). The dual path tightly coupled STM to core internals and conflicted with STM's role as an MCP proxy gateway, where memtomem is just another upstream alongside brave-search/github/etc. Collapse to a single remote-only path so STM depends only on the MCP protocol contract, not core internals. This sets up the eventual repo split (langgraph pattern: memtomem-stm becomes a separate package that spawns memtomem-server as a child stdio process). Removed: - SurfacingEngine.search_pipeline / storage constructor args - ltm_mode config field (remote is the only mode now) - in-process bootstrap branch in server.py - _extract_heuristic's import of memtomem.tools.entity_extraction - TestFeedbackBoost (used direct SqliteBackend.increment_access) Added: - _fake_memtomem_server.py + test_remote_ltm_integration.py for an end-to-end check that exercises stdio MCP without needing core installed Follow-ups (separate PRs): - Restore access boost via a new mem_increment_access MCP action - Restore include_session_context via McpClientSearchAdapter.scratch_list - Replace the heuristic stub with a native STM regex extractor Tests: 1588 passing (core 857, STM 731), 0 regressions, ruff clean. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Pre-existing main-branch CI breakage that surfaced when this PR ran the full ruff + pytest pipeline (the last 5 commits on main were all red): - memory_crud.py:113 had an unused `chunk = await ...` assignment — the index_entry return value isn't read, so just await it. - test_indexing_engine.py::TestIndexEntry's four tests call index_engine.index_entry, which triggers a real embedding HTTP call. They need the `ollama` marker so CI's `-m "not ollama"` filter skips them when Ollama isn't running. These two fixes unblock CI for both this PR and main. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
CI's `ruff format --check` was failing because eight existing core files weren't formatted (none of these were touched by this PR's refactor; all were pre-existing main-branch dust). Running `uv run ruff format packages/memtomem/src packages/memtomem-stm/src` reformats them in place. Pure formatting changes only. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
tsdata
added a commit
that referenced
this pull request
Apr 9, 2026
After Phase 1 (#2) made STM communicate with memtomem only over the MCP protocol, the in-process glue that lived inside memtomem core became dead code. Phase 2 removes it so the two packages can ship from separate repositories without leftover cross-references. Backend deletions: - packages/memtomem/src/memtomem/cli/stm_cmd.py — `mm stm init`/`reset` wizard that imported memtomem_stm. Belongs to the STM package now. - packages/memtomem/src/memtomem/web/routes/proxy.py — STM dashboard REST routes. Will be re-added inside the STM repo if needed. - packages/memtomem/src/memtomem/cli/__init__.py — drop the `stm` subcommand registration. - packages/memtomem/src/memtomem/config.py — drop StmProxyConfig and the `stm_proxy` field on Mem2MemConfig. - packages/memtomem/src/memtomem/server/context.py — drop the `stm_proxy_manager` slot on AppContext. - packages/memtomem/src/memtomem/server/lifespan.py — drop the entire `if config.stm_proxy.enabled` bootstrap branch (~80 lines) plus its cleanup section. memtomem core no longer imports memtomem_stm at all. - packages/memtomem/src/memtomem/web/app.py — drop `proxy.router` registration and the matching import. Frontend deletions: - packages/memtomem/src/memtomem/web/static/index.html — STM tab button and the entire STM panel (~70 lines). - packages/memtomem/src/memtomem/web/static/app.js — `_stmPollId` early declaration, the `if (tabName === 'stm')` switch arm, the STM PROXY DASHBOARD section (~320 lines), and the `stm-filter`/`stm-hist-prev`/ `stm-hist-next` data-action handlers from the global delegator. Test updates: - test_cli.py: drop `stm` from registered subcommand list and remove `test_stm_help`. - test_context_window.py: remove `TestSTMFormatter` (it imported memtomem_stm.surfacing.formatter — a cross-package boundary violation that should never have been in core tests). - test_indexing_engine.py: ruff --fix removed unused imports surfaced by the indexing test changes. - assorted other test files: ruff --fix cleaned unused imports. Verification: - 850 core tests pass (`uv run pytest packages/memtomem/tests -m "not ollama"`) - ruff check + format clean for both src trees - final grep across packages/memtomem confirms zero references to stm_proxy / StmProxy / stm_cmd / memtomem_stm in source or tests Follow-up (not blocking): style.css still contains `.stm-*` rules; they are unused but harmless. Drop them in a small later cleanup PR. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
4 tasks
tsdata
added a commit
that referenced
this pull request
Apr 9, 2026
After Phase 1 (#2) made STM communicate with memtomem only over the MCP protocol, the in-process glue that lived inside memtomem core became dead code. Phase 2 removes it so the two packages can ship from separate repositories without leftover cross-references. Backend deletions: - packages/memtomem/src/memtomem/cli/stm_cmd.py — `mm stm init`/`reset` wizard that imported memtomem_stm. Belongs to the STM package now. - packages/memtomem/src/memtomem/web/routes/proxy.py — STM dashboard REST routes. Will be re-added inside the STM repo if needed. - packages/memtomem/src/memtomem/cli/__init__.py — drop the `stm` subcommand registration. - packages/memtomem/src/memtomem/config.py — drop StmProxyConfig and the `stm_proxy` field on Mem2MemConfig. - packages/memtomem/src/memtomem/server/context.py — drop the `stm_proxy_manager` slot on AppContext. - packages/memtomem/src/memtomem/server/lifespan.py — drop the entire `if config.stm_proxy.enabled` bootstrap branch (~80 lines) plus its cleanup section. memtomem core no longer imports memtomem_stm at all. - packages/memtomem/src/memtomem/web/app.py — drop `proxy.router` registration and the matching import. Frontend deletions: - packages/memtomem/src/memtomem/web/static/index.html — STM tab button and the entire STM panel (~70 lines). - packages/memtomem/src/memtomem/web/static/app.js — `_stmPollId` early declaration, the `if (tabName === 'stm')` switch arm, the STM PROXY DASHBOARD section (~320 lines), and the `stm-filter`/`stm-hist-prev`/ `stm-hist-next` data-action handlers from the global delegator. Test updates: - test_cli.py: drop `stm` from registered subcommand list and remove `test_stm_help`. - test_context_window.py: remove `TestSTMFormatter` (it imported memtomem_stm.surfacing.formatter — a cross-package boundary violation that should never have been in core tests). - test_indexing_engine.py: ruff --fix removed unused imports surfaced by the indexing test changes. - assorted other test files: ruff --fix cleaned unused imports. Verification: - 850 core tests pass (`uv run pytest packages/memtomem/tests -m "not ollama"`) - ruff check + format clean for both src trees - final grep across packages/memtomem confirms zero references to stm_proxy / StmProxy / stm_cmd / memtomem_stm in source or tests Follow-up (not blocking): style.css still contains `.stm-*` rules; they are unused but harmless. Drop them in a small later cleanup PR. Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
4 tasks
memtomem
pushed a commit
that referenced
this pull request
Apr 18, 2026
Reflect the Gemini-regenerated security curation (commit 5fbb47f) in the handoff doc so a fresh session knows which Phase 2c steps remain. Updates the phase-progress table, adds a "Security Phase 3a measurements" subsection (21.9% drift -> H1 supported; intra-vocab reclassifications to auth/mtls, auth/rbac, networking/tls; incident subtopic skew), marks next-actions #2-#3 as complete, and points #4 (IDF pre-measure) at the authoritative Gemini batch directory. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
memtomem
pushed a commit
that referenced
this pull request
Apr 18, 2026
11 chunks affected, 13 events / 32 = 40.6% event-count — **Upper- outlier band (32-45%)** realized per pre-registration, above observability's 28.1% by 12.5 pp. Pre-registered cell; not a novel failure mode. Two systematic Gemini patterns drove 6/13 events (47%): - Pattern 1: `kubectl logs` diagnostic conflated with `observability/logging` subtopic (B3 #2, B5 #1, B6 #2) - Pattern 2: postmortem genre conflated with `incident_response/postmortem` subtopic (B7 #1, B7 #2, B8 #2) Both are Phase 3b drift-validator "forbidden pair" candidates (trigger condition k >= 5 topics met at k8s). Category mix: absent-topic 11, missed secondary 2 (paired with cross-topic drops on B1 #4 cost_opt→k8s/scaling and B8 #4 observability/metrics→k8s/scheduling), out-of-vocab 0, intra-vocab 0, over-correction 0. Band-realization sensitivity check: chunk-count drift would be 11/32 = 34.4%, still Upper-outlier. Event-count convention does not flip the band. Baseline observation (n=2 event-count at observability + k8s): drifts span 28.1% to 40.6%. "Prompt-structure drift as constant ~25-30%" hypothesis weakened — topic-level variance appears real. H1/H2/H3 formal reformulation still deferred to kafka per (A)- path; no promotion to working hypothesis at n=2. Post-k8s decision pending divergence (Step 6); handoff Step 3 ✅. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
memtomem
pushed a commit
that referenced
this pull request
Apr 18, 2026
Trigger condition ("≥ 5 topics ledgered") met at k8s — implement the
three-tier drift validator against the accumulated 6-topic curation
ledger (caching baseline + postgres + cost_opt + security +
observability + k8s). Two tiers locked; greenlist deferred until a
false-positive case appears.
**Forbidden** (auto-reject, CLI exit 1):
- Closed-vocab enforcement: unknown topic, unknown subtopic, malformed
tag. Guards against silent closed-vocab drift after the 2026-04-17
freeze.
- `genre-postmortem-vs-ir-postmortem-subtopic`: postmortem-genre chunk
with non-IR primary cannot tag `incident_response/postmortem` as
secondary. k8s Pattern 2 (3 events: B7 #1, B7 #2, B8 #2).
**Manual-review** (warn, no block — curator verifies by body context):
- `kubectl-logs-diagnostic-vs-observability-logging`: k8s/* primary +
obs/logging secondary + body uses `kubectl logs`. k8s Pattern 1
(3 events: B3 #2, B5 #1, B6 #2).
- `security-access-control-primary-with-rbac-body`: security/access_control
primary + body mentions RBAC-specific resources (Role, RoleBinding,
ClusterRoleBinding). Security ledger trouble KO #1, postmortem KO #2.
- `security-encryption-primary-with-transport-body`: security/encryption
primary + transport-layer body (TLS, mTLS, cert-manager, certbot,
PeerAuthentication) AND secondary does not already name
networking/tls or auth/mtls. Security ledger adr EN #1, runbook KO
#2, trouble KO #2. Suppression clause matches "Borderline cases
preserved" precedent (postmortem EN #3, runbook EN #1) where the
curator already captured the split functionally.
Deliverables:
- tools/retrieval-eval/drift_validator.py (~280 lines): parser,
rules, CLI. `uv run python tools/retrieval-eval/drift_validator.py
<corpus_root|fixture.md>`. Corpus walker filters to genre fixtures
only (skips README and other .md).
- packages/memtomem/tests/test_drift_validator.py: 23 tests — parser,
each rule positive + negative, CLI exit codes, corpus-sanity
assertions (zero forbidden, zero manual-review on current corpus).
- tools/retrieval-eval/README.md: tool entry added.
- docs/planning/b2-v2-design.md § "Drift validator": TODO converted
to implemented, inline rule summary with ledger citations.
- docs/planning/b2-v2-phase2b-ledger.md § "Deferred decisions":
checkbox flipped, full rule list recorded.
Validation against current corpus: zero forbidden, zero manual-review.
`ruff check` + `ruff format --check` clean, `mypy` advisory clean,
`pytest packages/memtomem/tests/test_drift_validator.py`: 23 passed.
CI wiring to run this against corpus_v2 on every PR lands at Phase 6.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
This was referenced Apr 18, 2026
memtomem
added a commit
that referenced
this pull request
Apr 25, 2026
…on (#464) * feat(mem_add,mem_edit): canonical writer + mem_edit header preservation Stacks on PR #463 (chunker reader). Two coupled writer-side changes: 1. append_entry emits canonical blockquote tags > tags: ["a", "b"] # explicit `> ` prefix, JSON-quoted instead of the legacy lazy-continuation form tags: ['a', 'b'] # Python repr(), no `> ` prefix Lazy continuation is well-defined CommonMark but easy to break (a single blank line ends the blockquote), and Python repr() is not valid YAML. The chunker's section-leading parser already accepts both shapes, so existing files still parse — the writer just stops producing the fragile form on new entries. New test_memory_writer_tag_format.py pins the canonical output so a future regression that reverts to lazy / Python-repr fails fast. 2. mem_edit preserves the per-entry header Pre-PR, mem_edit called replace_lines(start_line, end_line, new_content), which replaced the entire chunk range — heading, blockquote header, body — with whatever the caller supplied. That was already wide pre-RFC, but post-PR-A the chunker strips the blockquote from chunk content, so a caller working from chunk.content would supply body-only and silently erase the metadata header on disk. New replace_chunk_body helper detects the heading + section- leading blockquote group at the top of the chunk's line range and keeps them intact while replacing only the body. If new_content itself starts with `## `, the call reverts to a full replacement (preserving the pre-RFC mem_edit semantic for users who want to override the heading explicitly). New test_mem_edit_header_preservation.py covers six cases (canonical, legacy lazy, no header, full-replace via `## `, non-first sub-chunk of an oversized section, trailing-newline round-trip). Round-trip and follow-ups: - test_multi_agent_integration test_share_copies_chunk_with_audit_tag now also asserts the forward direction: mem_search(tag_filter="shared-from=<src>") returns the share copy and only the share copy. Closes the e2e gap discovered during PR #462 that drove RFC mem-add-tags-blockquote-promote-rfc.md. - test_chunking_blockquote_tags adds a block-list-shape case (`> tags:\n> - a\n> - b`). The current writer never emits this shape, but the shared _parse_tags_value helper supports it and this locks the support in. Implements RFC §Migration & tests #2 + #3 and the writer-PR scope addition (mem_edit header preservation, ③-1) noted in PR #463 review. Co-Authored-By: Claude <[email protected]> * fix(web): preserve per-entry header in PATCH /api/chunks edit endpoint Reviewer caught that web/routes/chunks.py:edit_chunk still called replace_lines directly, which bypasses the header-preservation logic mem_edit gained in this PR. Same footgun, different surface — the Web UI editor surfaces chunk.content (already header-stripped by the chunker), so a Save from the browser would feed body-only content into replace_lines and silently erase the heading + > created: + > tags: on disk. Swap the call to replace_chunk_body so mem_edit and the Web UI route share the same preservation contract. Add a regression test (test_edit_chunk_preserves_blockquote_header) that drives the route end-to-end with a real tmp-file source and asserts the metadata header survives a body-only PATCH. Also folded in two reviewer nits: - mem_edit MCP-tool docstring now mentions the body-only / `## ` bypass policy (was only documented on the underlying helper). - replace_chunk_body has a one-line comment noting that ``## `` is the full-replace trigger because append_entry always emits H2; other heading levels in user input are body content. grep across src/ confirms replace_lines now has zero non-test callers in the codebase. Whether to deprecate it or keep it as a low-level escape hatch is tracked as a follow-up; not closing that question here. Co-Authored-By: Claude <[email protected]> --------- Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude <[email protected]>
4 tasks
memtomem
pushed a commit
that referenced
this pull request
Apr 25, 2026
…ote header When chunk_file strips a section-leading ``> created:`` / ``> tags:`` blockquote before passing text to _split_section, base_line still points at the heading. line_offset in _split_section started at 0 and counted only body lines, so sub-chunks 2..N reported start_line 3-5 lines earlier than the body they actually covered. Practical impact: mem_edit on a non-first sub-chunk reads file lines [start_line..end_line] via replace_chunk_body. With drift, those lines included real body content from the previous sub-chunk — which _find_body_start_index does not detect as a header (no heading, no leading blockquote at the start of the range), so preserved == [] and the replacement silently dropped that content on save. Fix: _extract_section_blockquote_tags returns a third value, lines_consumed, counting how many input lines the strip consumed. chunk_file adds it to leading_strip_lines (lines stripped by .strip() on the raw section text) and passes the total + 1 (for the heading line itself) as a body_offset kwarg to _split_section, which seeds line_offset with body_offset. Sub-chunks 1..N now align with file lines as well as the pre-strip code did. Sub-chunk 1's start_line still anchors at the heading (line_offset is only consulted on the first emit boundary, not at init), so mem_edit's header-preservation contract for sub-chunk 1 is preserved. Single-chunk sections and sections without any blockquote header are unchanged: body_offset stays 0 in those paths. Two new tests in test_chunking_blockquote_tags.py: - test_blockquote_header_does_not_drag_sub_chunk_start_lines: a controlled oversize section with an explicit blockquote header, sub-chunks 2..N must report start_line >= body-start file line. Fails pre-fix. - test_oversize_section_without_blockquote_unchanged: pre-PR-A path is preserved when there's no header to strip. Closes RFC ``planning/mem-add-tags-blockquote-promote-rfc.md`` §Follow-ups #2. Co-Authored-By: Claude <[email protected]>
memtomem
added a commit
that referenced
this pull request
Apr 25, 2026
…ote header (#466) * fix(chunking): seed _split_section line counter past stripped blockquote header When chunk_file strips a section-leading ``> created:`` / ``> tags:`` blockquote before passing text to _split_section, base_line still points at the heading. line_offset in _split_section started at 0 and counted only body lines, so sub-chunks 2..N reported start_line 3-5 lines earlier than the body they actually covered. Practical impact: mem_edit on a non-first sub-chunk reads file lines [start_line..end_line] via replace_chunk_body. With drift, those lines included real body content from the previous sub-chunk — which _find_body_start_index does not detect as a header (no heading, no leading blockquote at the start of the range), so preserved == [] and the replacement silently dropped that content on save. Fix: _extract_section_blockquote_tags returns a third value, lines_consumed, counting how many input lines the strip consumed. chunk_file adds it to leading_strip_lines (lines stripped by .strip() on the raw section text) and passes the total + 1 (for the heading line itself) as a body_offset kwarg to _split_section, which seeds line_offset with body_offset. Sub-chunks 1..N now align with file lines as well as the pre-strip code did. Sub-chunk 1's start_line still anchors at the heading (line_offset is only consulted on the first emit boundary, not at init), so mem_edit's header-preservation contract for sub-chunk 1 is preserved. Single-chunk sections and sections without any blockquote header are unchanged: body_offset stays 0 in those paths. Two new tests in test_chunking_blockquote_tags.py: - test_blockquote_header_does_not_drag_sub_chunk_start_lines: a controlled oversize section with an explicit blockquote header, sub-chunks 2..N must report start_line >= body-start file line. Fails pre-fix. - test_oversize_section_without_blockquote_unchanged: pre-PR-A path is preserved when there's no header to strip. Closes RFC ``planning/mem-add-tags-blockquote-promote-rfc.md`` §Follow-ups #2. Co-Authored-By: Claude <[email protected]> * test: add no-tags blockquote oversize regression for line-drift fix Reviewer N3: gap in coverage — `_extract_section_blockquote_tags` short-circuits to ([], text, 0) when no `tags:` key is found, so the blockquote stays in content and `blockquote_strip_lines` is 0. The existing tests covered (a) tags present + strip and (b) no blockquote at all, but not (c) blockquote present without tags. The new case verifies for an oversize section with `> created:` only: - sub-chunks inherit no tags (no promotion happened) - first sub-chunk's content still carries `> created:` (mem_edit can preserve it on body-only edits) - sub-chunks 2..N start_line still aligns with the body — no drift from the leading_strip path even when the blockquote isn't consumed by the strip helper Also tracks reviewer N1 (paragraph-gap +2 imprecision in _split_section) in the RFC follow-ups: separate from this PR; cosmetic end_line overshoot, not data loss. Co-Authored-By: Claude <[email protected]> --------- Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude <[email protected]>
4 tasks
This was referenced Apr 28, 2026
memtomem
pushed a commit
that referenced
this pull request
Apr 29, 2026
… in toast, align canonical_root style Follow-up to PR #549 review: - **#1 (low-medium)**: drop the hardcoded `_CTX_EMPTY_HINT_META` JS map. GET `/api/context/{skills,commands,agents}` now also returns `canonical_root: str` and `scanned_dirs: list[str]` (computed once at module import from `SKILL_DIRS` / `AGENT_DIRS` / `COMMAND_DIRS`), so the empty-state hint pulls from the wire instead of duplicating the detector layout client-side. The JS still has a one-line fallback (`.memtomem/${type}` + empty list) for older backends, but it never fires when the cache-bust takes effect. - **#2 (cosmetic)**: skills sync response now uses the `CANONICAL_SKILL_ROOT = ".memtomem/skills"` constant directly, matching the `CANONICAL_AGENT_ROOT` / `CANONICAL_COMMAND_ROOT` pattern in agents/commands routes. Drops the `_safe_rel(canonical_skills_root( project_root), project_root)` round-trip — both forms resolve to the same string today, but a constant won't drift if a future PR adds env / scope resolution to `canonical_skills_root()`. - **#3 (low)**: import-no-runtimes toast renders `basename(project_root)` instead of the absolute path. The wire still carries the absolute string (consistent with `/api/system/*` and useful for debugging / reverse-proxy contexts) but the user-facing toast keeps it short — `scanned_dirs` already gives full orientation. New `_ctxBasename()` helper handles the POSIX path split. Cache-bust bumped to `context-gateway.js?v=3`. Tests: - New assertions on `TestListSkills.test_empty`, `TestListCommands.test_empty`, `TestListAgents.test_empty` verify GET response now includes `canonical_root` + `scanned_dirs`. - `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format --check` clean. Skipped per reviewer guidance: - #4 `Final[str]` → `Literal` / `StrEnum` (mypy is advisory). - #5 additional positive route assertions for `invalid_name` / `already_imported` / `toml_parse_error` (route layer is a trivial dict comprehension; covered at the core layer). - #6 Korean `<이름>` literal angle-bracket (intentional placeholder guide, not an unsubstituted variable). Co-Authored-By: Claude <[email protected]>
4 tasks
memtomem
added a commit
that referenced
this pull request
Apr 29, 2026
* fix(web): surface "why nothing happened" on Skills/Commands/Agents Sync+Import
Users reported "Sync 도 Import 도 동작이 이상함" — both buttons in
mm web → Settings → Skills/Commands/Agents looked successful (a green
toast popped) but nothing actually moved on disk. The root cause was a
shape mismatch in the response handling, not a wiring bug:
- For skills, the sync response carries `skipped` (e.g. "no canonical
skills") but never `dropped`. The client only read `dropped`, so the
skipped reason was silently dropped on the floor.
- The import response carried no metadata at all, so a 0-imported,
0-skipped result (the common case when no `.claude/skills/` exists in
the project) just rendered "Import completed" with no clue that the
scanner found nothing because the directories didn't exist.
This change keeps cwd-bound single-project behavior unchanged and adds
machine-readable plumbing the UI can match against:
- All three context types' `Sync` and `Import` core layers
(`memtomem.context.{skills,commands,agents}`) now record skipped items
as `(name, reason, reason_code)` triples. Reason codes come from a new
`memtomem.context._skip_reasons` module: `no_canonical_root`,
`unknown_runtime`, `parse_error`, `invalid_name`, `already_imported`,
`canonical_exists`, `toml_parse_error`. Human `reason` strings stay
for CLI/log output.
- Web route handlers (`web/routes/context_{skills,commands,agents}.py`)
surface `reason_code` per skipped item, plus `canonical_root` on sync
responses and `project_root` + `scanned_dirs` on import responses.
- The web client (`context-gateway.js`) reads `data.skipped` (in
addition to `data.dropped` for commands/agents field-level drops),
matches on `reason_code === "no_canonical_root"` to show
"No canonical {type} under {canonical}. Create one first." instead of
a generic success, and on a 0/0 import shows "No runtime {type} found
in {project_root}. Scanned: {scan_dirs}." so users see exactly which
paths the detector inspected.
- The empty list state hint now points at the canonical and runtime
paths instead of "Create one or import from existing runtimes." — a
fresh user can drop a `SKILL.md` into the right directory without
guessing.
- Three new placeholder-form i18n keys (`empty_hint`,
`import_no_runtimes`, `sync_empty_canonical`) cover all three types
via `{type}` / `{canonical}` / `{root}` / `{scan_dirs}` substitution
rather than 9 type-specific keys.
- `context-gateway.js?v=2` cache-bust.
All response changes are additive — existing clients that only read
`imported`/`generated`/`skipped[].name|runtime|reason` keep working
unchanged. The CLI (`mm context skills/agents/commands ...`) and the
MCP `context` tool also unpack the new triple shape via `_code` discard
so their human-readable output is byte-identical.
Tests:
- 3 core test suites (`test_context_{skills,commands,agents}.py`)
updated to the 3-tuple shape.
- New web route assertions verify `reason_code`, `canonical_root`,
`project_root`, and `scanned_dirs` are present in sync/import
responses for the empty-canonical and empty-runtime paths.
- Full suite green: `pytest -m "not ollama"` 3152 passed; `ruff check`
+ `ruff format --check` clean.
Co-Authored-By: Claude <[email protected]>
* fix(web/context): address review — surface scan dirs on GET, basename in toast, align canonical_root style
Follow-up to PR #549 review:
- **#1 (low-medium)**: drop the hardcoded `_CTX_EMPTY_HINT_META` JS map.
GET `/api/context/{skills,commands,agents}` now also returns
`canonical_root: str` and `scanned_dirs: list[str]` (computed once at
module import from `SKILL_DIRS` / `AGENT_DIRS` / `COMMAND_DIRS`), so the
empty-state hint pulls from the wire instead of duplicating the
detector layout client-side. The JS still has a one-line fallback
(`.memtomem/${type}` + empty list) for older backends, but it never
fires when the cache-bust takes effect.
- **#2 (cosmetic)**: skills sync response now uses the
`CANONICAL_SKILL_ROOT = ".memtomem/skills"` constant directly,
matching the `CANONICAL_AGENT_ROOT` / `CANONICAL_COMMAND_ROOT` pattern
in agents/commands routes. Drops the `_safe_rel(canonical_skills_root(
project_root), project_root)` round-trip — both forms resolve to the
same string today, but a constant won't drift if a future PR adds env
/ scope resolution to `canonical_skills_root()`.
- **#3 (low)**: import-no-runtimes toast renders
`basename(project_root)` instead of the absolute path. The wire still
carries the absolute string (consistent with `/api/system/*` and
useful for debugging / reverse-proxy contexts) but the user-facing
toast keeps it short — `scanned_dirs` already gives full orientation.
New `_ctxBasename()` helper handles the POSIX path split.
Cache-bust bumped to `context-gateway.js?v=3`.
Tests:
- New assertions on `TestListSkills.test_empty`,
`TestListCommands.test_empty`, `TestListAgents.test_empty` verify
GET response now includes `canonical_root` + `scanned_dirs`.
- `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format
--check` clean.
Skipped per reviewer guidance:
- #4 `Final[str]` → `Literal` / `StrEnum` (mypy is advisory).
- #5 additional positive route assertions for `invalid_name` /
`already_imported` / `toml_parse_error` (route layer is a trivial
dict comprehension; covered at the core layer).
- #6 Korean `<이름>` literal angle-bracket (intentional placeholder
guide, not an unsubstituted variable).
Co-Authored-By: Claude <[email protected]>
* refactor(context): narrow reason_code to Literal SkipCode
Tightens the typing on the third element of `(name, reason, reason_code)`
triples produced by `SkillSyncResult.skipped`, `CommandSyncResult.skipped`,
`AgentSyncResult.skipped`, and `ExtractResult.skipped`. The new
`memtomem.context._skip_reasons.SkipCode` is a `Literal` of the seven
codes already enumerated in that module, so a typo at the construction
site fails type-check instead of slipping through to the wire.
Also strengthens the import-empty web-route tests for skills, commands,
and agents to assert `data["project_root"] == str(tmp_path)` instead of
just key presence — confirms the response actually echoes the
fixture's project root rather than some unrelated path.
No runtime behavior change; addresses follow-up polish flagged in the
review of this PR.
Co-Authored-By: Claude <[email protected]>
---------
Co-authored-by: pandas-studio <[email protected]>
Co-authored-by: Claude <[email protected]>
memtomem
pushed a commit
that referenced
this pull request
Apr 29, 2026
Self-review of PR #553 surfaced five real issues — fixing them keeps the surface honest about what's actually wired up. - Drop dead helpers in ``context/projects.py`` (#1): - ``_is_eaccess`` was never called. - ``is_project_scope_id`` claimed to support route error formatting but no route used it. - ``_new_temp_known_projects_path`` claimed test convenience but no test imported it. - ``errno`` and ``tempfile`` imports follow. - Drop ``ProjectScope.counts`` dataclass field (#2). Discovery never populated it; the route layer (``_scope_to_dict``) always overrides via ``_counts_for(root)``. Forward-design pollution. - ``POST /api/context/known-projects`` now returns ``warning_code: "no_runtime_marker"`` alongside the prose ``warning`` (#4). Matches PR1's (#549) machine-readable ``reason_code`` pattern so client matching is i18n-stable; the prose stays for back-compat. - ``_counts_for`` gains a cost comment (#3): 3 × (canonical scan + N runtime scans) per scope on every projects GET. Acceptable at <30 scopes; cache when discovery growth pushes that ceiling. - JS scope badge no longer carries a redundant ``data-i18n`` attribute (#5). The inline ``t()`` already renders text at construction; the attribute would let the i18n DOM walker re-translate and clobber. - ``ctx-scope-summary`` now carries a ``title="${scope.root}"`` (#8) so same-name scopes (``Edu/inflearn`` vs ``Work/inflearn``) disambiguate on hover without bloating the visible label. Cache-bust: ``context-gateway.js?v=4`` → ``?v=5``. Tests updated: ``test_post_warns_on_missing_marker`` asserts the new ``warning_code``; ``test_post_no_warning_when_marker_present`` asserts both fields are absent. 110/110 green; ruff + format clean. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
memtomem
added a commit
that referenced
this pull request
Apr 29, 2026
…#553) * feat(web): multi-project read-only discovery for context-gateway tabs PR2 of the multi-project context UI series — see ``memtomem-docs/memtomem/planning/multi-project-context-ui-rfc.md``. A user running ``mm web`` from the ``memtomem`` repo could previously only see skills / commands / agents under that one cwd; ``~/Edu/inflearn/ .claude/skills/`` was invisible without restarting the server. PR2 adds discovery for additional project roots (server cwd, user-registered ``known_projects.json``, opt-in ``~/.claude/projects/`` reverse-decode) and renders one collapsible ``<details>`` group per scope inside each tab. Mutating routes stay cwd-only — multi-scope writes ship in PR3. Backend ------- - ``memtomem.context.projects`` is the new discovery module: - ``ProjectScope`` carries ``{scope_id, label, root, tier, sources, missing, experimental, counts}``. - ``compute_scope_id`` derives ``"p-" + sha256(case-normalized Path.resolve())[:12]`` so refresh / restart preserve the id (RFC §Decision 4). On macOS the input is also lowercased: APFS is case-insensitive but ``Path.resolve()`` does not canonicalize case. - ``KnownProjectsStore`` writes ``~/.memtomem/known_projects.json`` via ``tempfile + os.replace`` plus a ``.<name>.lock`` sidecar fcntl lock — locking the data file directly does not survive ``os.replace`` (PR #548 issue). - ``discover_project_scopes`` unions cwd + known-projects + (opt-in) claude-projects, dedupes by resolved path, and clears ``experimental`` on union with a trusted source. - ``ContextGatewayConfig`` (new) carries ``known_projects_path``, ``experimental_claude_projects_scan: bool = False``, and ``user_tier_enabled: bool = False`` (forward-compat for PR3). Routes ------ New ``web/routes/context_projects.py``: - ``GET /api/context/projects`` — full scope list with per-type counts. - ``POST /api/context/known-projects`` — register; absolute + ``is_dir()`` validation; warning (HTTP 200) when no ``.claude``/``.gemini``/``.agents``/ ``.memtomem`` marker is present so users can pre-register an empty checkout. - ``DELETE /api/context/known-projects/{scope_id}`` — drop, including stale entries (matching is path-derived). - ``resolve_scope_root`` is the dependency the existing ``/api/context/{skills,commands,agents}`` GETs now use; without ``?scope_id=`` it falls back to the server cwd so PR1's mutating cwd flow keeps working unchanged. Unknown / stale scope_id → 404. UI -- Each tab gains an "Add Project" button; the list area renders one ``<details class="ctx-scope-group">`` per scope, lazy-fetching items on expand. Server CWD opens by default; non-cwd scope items render as read-only cards (PR2 keeps mutating buttons targeting cwd only). Each non-cwd group has an ✕ button that calls DELETE ``/api/context/known-projects/{scope_id}``. Three new i18n keys (placeholder-form, en + ko parity). Tests ----- PR2 minimum-bar from RFC §Test obligations: - ``tests/test_context_projects.py`` — scope_id stability + collision sanity, trailing-slash invariance, macOS case-insensitivity, symlink dedup, ``experimental`` flag clears on union, both ``experimental_claude_projects_scan`` defaults, ``known_projects.json`` corruption / unknown-version recovery, stale-entry removal, multiprocessing.Process atomic-write race. - ``tests/test_web_routes_context_projects.py`` — HTTP shape, unknown scope_id 404, POST 400 paths (relative / nonexistent / file), marker warning, idempotent registration, DELETE round-trip + 404. Manual smoke ------------ ``HOME=/tmp/.../home uv run mm web --port 8088``: GET projects shows cwd; POST /tmp returns 200 + warning; GET projects shows two scopes (``Server CWD`` + ``tmp``); skills?scope_id=p-bogus → 404; DELETE round-trip works; second DELETE → 404. Refs RFC ``multi-project-context-ui``; gates PR3 (multi-project mutating + Claude user-tier). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * review fixes: drop dead helpers, add warning_code, summary tooltip Self-review of PR #553 surfaced five real issues — fixing them keeps the surface honest about what's actually wired up. - Drop dead helpers in ``context/projects.py`` (#1): - ``_is_eaccess`` was never called. - ``is_project_scope_id`` claimed to support route error formatting but no route used it. - ``_new_temp_known_projects_path`` claimed test convenience but no test imported it. - ``errno`` and ``tempfile`` imports follow. - Drop ``ProjectScope.counts`` dataclass field (#2). Discovery never populated it; the route layer (``_scope_to_dict``) always overrides via ``_counts_for(root)``. Forward-design pollution. - ``POST /api/context/known-projects`` now returns ``warning_code: "no_runtime_marker"`` alongside the prose ``warning`` (#4). Matches PR1's (#549) machine-readable ``reason_code`` pattern so client matching is i18n-stable; the prose stays for back-compat. - ``_counts_for`` gains a cost comment (#3): 3 × (canonical scan + N runtime scans) per scope on every projects GET. Acceptable at <30 scopes; cache when discovery growth pushes that ceiling. - JS scope badge no longer carries a redundant ``data-i18n`` attribute (#5). The inline ``t()`` already renders text at construction; the attribute would let the i18n DOM walker re-translate and clobber. - ``ctx-scope-summary`` now carries a ``title="${scope.root}"`` (#8) so same-name scopes (``Edu/inflearn`` vs ``Work/inflearn``) disambiguate on hover without bloating the visible label. Cache-bust: ``context-gateway.js?v=4`` → ``?v=5``. Tests updated: ``test_post_warns_on_missing_marker`` asserts the new ``warning_code``; ``test_post_no_warning_when_marker_present`` asserts both fields are absent. 110/110 green; ruff + format clean. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> --------- Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
This was referenced Apr 30, 2026
Closed
Merged
memtomem
added a commit
that referenced
this pull request
Apr 30, 2026
The upload dropzone in the Index tab spanned the full panel width (~750px at typical card sizes) with only an icon and two short lines of content, making it look disproportionately empty. Cap `.upload-drop` at `max-width: 520px` with `margin: 0 auto` so the dropzone itself centers inside the panel while sibling controls (file list, upload button, result rows) keep their natural full-width layout. Refs #582 (Prev #2). Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude <[email protected]>
This was referenced Apr 30, 2026
memtomem
pushed a commit
that referenced
this pull request
May 4, 2026
Codex review of #784 flagged that ``enforce_write_guard``'s audit log renders ``audit_context`` values verbatim with ``repr()``. Several callers pass user-controllable strings — ``file`` (mem_add / web_api_add), ``filename`` (web_api_upload), ``key`` (scratch promote), ``chunk_id`` — and a bypass on a request whose path / filename / key embeds the same secret as the body would surface that secret in ``server.log`` even though the matched bytes from ``content`` never do. The "matched bytes must never reach logs" goal must hold for context fields too, not just the body. Add ``_sanitize_audit_value`` between caller and log emit: - non-string values (None, int, bool) pass through untouched - string values are re-scanned with ``DEFAULT_PATTERNS``; any hit replaces the whole value with the fixed marker ``<redacted: secret-shape>`` so operators see the field was scrubbed rather than missing - clean strings longer than 200 chars are truncated to keep an abusive caller from ballooning the audit line Two regression pins: - ``test_secret_in_audit_context_value_is_redacted`` — embeds the same ``sk-…`` token in the body, in ``file=``, and in ``filename=``; asserts the matched bytes never reach the log message and the redaction marker takes the secret-shape value's place. Non-secret context fields (``namespace``, ``item_idx``) still pass through. - ``test_long_audit_context_string_is_truncated`` — caps line length with the ``...(truncated)`` marker. Codex Major #2 (SPA app.js / settings-harness.js cannot pass ``force_unsafe`` on the structured 403 path) is intentionally a follow-up: backend currently fails closed (HTTP 403, no write) which is the secure direction; the SPA confirm-and-retry UX is a separate frontend change with its own scope. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
memtomem
added a commit
that referenced
this pull request
May 5, 2026
…en write surface (#784) * sec(privacy): apply trust-boundary redaction guard to every user-driven write surface The MCP `mem_add` path enforces a secret-class redaction guard before any filesystem write, but the surrounding ingress paths (Web `POST /api/add`, `POST /api/upload`, `PATCH /api/chunks/{id}`, scratch promotion, MCP `mem_edit`, CLI `mm add` / shell `add` / agent share, LangGraph adapter) were writing `req.content` / `new_content` / `entry["value"]` directly. The same SQLite database could end up holding cleartext provider tokens depending on which route the agent or user came in through. Centralise the scan-decide-log shape in `privacy.enforce_write_guard(content, surface=…, force_unsafe=False, audit_context=…)` returning a `WriteGuardResult` (`pass` / `blocked` / `bypassed`) and apply it at every gap surface. Each surface records under its own `by_tool` key so attribution stays observable; audit-log lines render `audit_context` as `key=value` pairs after `surface=` and never echo matched bytes. Web mutating routes return HTTP 403 with `{detail: "redaction_blocked", hits: N, surface: …}` so the SPA can surface a confirm-and-retry UX before re-sending with `force_unsafe=true`. `mem_batch_add` keeps its inline scan + per-entry counter pattern — the helper's single-content shape would obscure the transactional "one hit rejects the whole batch" rule that test_memory_crud_redaction already pins. Out of scope (follow-up): - `indexing/importers.py` and `indexing/url_fetcher.py` write external file/URL content rather than user-driven content; the trust model is different (operator-authorised indexing, not direct user submission) and merits its own decision pass. Tests: - `test_privacy.py::TestEnforceWriteGuard` — helper contract (pass / blocked / bypassed-with-audit / clean+force-unsafe still records pass / matched bytes never reach the log). - `test_redaction_write_surfaces.py` — `mem_edit`, CLI `mm add`, and the LangGraph adapter all run the guard and route the counter to the right `by_tool` key. - `test_web_routes.py` — new `TestAddMemoryRedaction`, `TestEditChunkRedaction`, `TestUploadRedaction`, `TestScratchPromoteRedaction` cover all four web mutating routes (block / bypass / clean-pass) end-to-end via ASGI client. Existing 36-test redaction floor + 4048-test full suite stay green; ruff check + format clean. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * fix(privacy): redact secret-shaped audit-context values before logging Codex review of #784 flagged that ``enforce_write_guard``'s audit log renders ``audit_context`` values verbatim with ``repr()``. Several callers pass user-controllable strings — ``file`` (mem_add / web_api_add), ``filename`` (web_api_upload), ``key`` (scratch promote), ``chunk_id`` — and a bypass on a request whose path / filename / key embeds the same secret as the body would surface that secret in ``server.log`` even though the matched bytes from ``content`` never do. The "matched bytes must never reach logs" goal must hold for context fields too, not just the body. Add ``_sanitize_audit_value`` between caller and log emit: - non-string values (None, int, bool) pass through untouched - string values are re-scanned with ``DEFAULT_PATTERNS``; any hit replaces the whole value with the fixed marker ``<redacted: secret-shape>`` so operators see the field was scrubbed rather than missing - clean strings longer than 200 chars are truncated to keep an abusive caller from ballooning the audit line Two regression pins: - ``test_secret_in_audit_context_value_is_redacted`` — embeds the same ``sk-…`` token in the body, in ``file=``, and in ``filename=``; asserts the matched bytes never reach the log message and the redaction marker takes the secret-shape value's place. Non-secret context fields (``namespace``, ``item_idx``) still pass through. - ``test_long_audit_context_string_is_truncated`` — caps line length with the ``...(truncated)`` marker. Codex Major #2 (SPA app.js / settings-harness.js cannot pass ``force_unsafe`` on the structured 403 path) is intentionally a follow-up: backend currently fails closed (HTTP 403, no write) which is the secure direction; the SPA confirm-and-retry UX is a separate frontend change with its own scope. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * fix(privacy): route mem_batch_add bypass audit through the shared sanitizer Codex re-review of #784 showed the audit-context sanitizer landed in cd27c99 only covered ``enforce_write_guard``'s single-content path. ``mem_batch_add`` keeps its own transactional decision shape and its own ``logger.warning`` line (``memory_crud.py:472``), so a secret-shaped ``file=`` argument still leaked verbatim into the warning log on a bypass — exactly the leak class cd27c99 was meant to close. Extract the audit emit into a public ``privacy.emit_bypass_audit`` seam: - Same line shape as before (``redaction bypass via force_unsafe=True (surface=…, …, content_chars=…, hits=…)``). - Each ``audit_context`` value runs through ``_sanitize_audit_value`` before ``%r`` formatting, so a secret embedded in a user-controlled string field (``namespace`` / ``file`` / ``filename`` / ``key`` / ``chunk_id``) gets replaced with ``<redacted: secret-shape>`` regardless of which entry point logged it. - Counter (``record(...)``) is intentionally NOT bumped by the emitter; ``mem_batch_add`` still owns its per-item bookkeeping contract that ``test_memory_crud_redaction.py`` already pins. ``enforce_write_guard`` now delegates to ``emit_bypass_audit``, and ``mem_batch_add`` calls it directly with the same audit_context shape (``namespace`` / ``file`` / ``item_idx``). New regression ``TestMemBatchAddRedactionGuard.test_secret_shaped_file_argument_is_redacted_in_audit_log`` reproduces Codex's manual repro: feed a secret-matching value plus a ``file=…sk-…`` filename into a force_unsafe batch and assert the warning line never contains the matched-bytes prefix while still carrying the redaction marker. Stale-comment Minor (``app.js:3730`` / ``:3763``) lands separately as #786 on main; out of scope here. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> * fix(privacy): emit real per-item hit count in mem_batch_add bypass audit The shared sanitizer wired up in 18208bd carried ``hits=1`` for every batch bypass, because the pre-scan loop only stored ``hit_indices`` without keeping the per-item ``len(privacy.scan(value))``. The single- content path (``enforce_write_guard``) reports the real count, so operators reading the audit log saw inconsistent ``hits=`` values across surfaces (Codex review of PR #784, Minor #1). Switch the pre-scan accumulator to ``hit_counts: dict[int, int]`` so each entry that matches records its true hit count, then pass that through ``emit_bypass_audit(..., hits=hit_counts[idx])``. The whole-batch reject error message keeps the same shape — the indices list is still rendered (now via ``sorted(hit_counts)``) so the user- facing message and the existing ``test_full_reject_lists_hit_indices`` pin both still hold. New regression ``TestMemBatchAddRedactionGuard.test_audit_hits_count_matches_scan_result`` feeds an entry that matches two patterns (``api_key:`` + ``sk-…``) and asserts the audit line carries ``hits=<real count>``, not the placeholder ``hits=1`` it used to. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]> --------- Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude <[email protected]>
memtomem
pushed a commit
that referenced
this pull request
May 5, 2026
Addresses ultrareview P2 follow-ups on PR-1: P2 #1 — services.tag_management.rename_tag/delete_tag/merge_tags now strip inputs and reject if empty after strip. Web routes passed ``body.new_name`` raw, while MCP pre-stripped — the asymmetry let whitespace-only names persist via PUT /api/tags/{name}. The service is the single source of truth, so the gate moves there; MCP's pre-strip stays as harmless defense-in-depth. P2 #2 — services.tag_management.merge_tags dry-run no longer caps ``affected_chunks`` at the per-tag scan limit. A new ``count_chunks_by_any_tag`` storage helper does a single ``COUNT(*) WHERE EXISTS (… IN (?, …))`` over the source-tag union, so a tag attached to more than 10,000 chunks reports the true count instead of an undercount in the confirmation modal. The sample loop still caps at ``_DRY_RUN_SAMPLE_CAP`` since the UI only renders ten previews. Co-Authored-By: Claude <[email protected]>
memtomem
added a commit
that referenced
this pull request
May 5, 2026
…795) * feat(storage): scaffold tag-management helpers + service skeleton (#688 PR1 prep) Lays the policy-independent groundwork for the tag rename/delete/merge ops requested in #688. No callers wired yet — bodies fill in after the confirm thread on the RFC resolves the updated_at policy and MCP-migration scope. - ``storage.list_chunks_by_tag`` + ``count_chunks_by_tag`` for the dry-run sample / count path the confirmation modal needs. - ``storage.merge_tags`` as the third primitive next to the existing ``rename_tag`` / ``delete_tag``; dedupes the resulting per-chunk tag list and treats the target appearing in ``sources`` as a no-op source. - ``SqliteBackend._tag_write_lock`` (``asyncio.Lock``) so the upcoming service layer and ``auto_tag_storage`` can serialize their read-modify-write of ``chunks.tags`` within a single process. Cross-process safety still falls back to SQLite's WAL file lock — this is an in-process invariant only. - ``services/tag_management.py`` skeleton (``TagOpResult`` / ``TagOpSample`` + ``NotImplementedError`` stubs) so the Web (PR1), MCP (PR1 if migration is approved), and CLI (PR3) surfaces share a single entry point. Service exposes ``bump_updated_at`` as an explicit parameter so the policy choice lives at the route layer rather than buried in the service body. 22 storage tests pass (7 new); lint + format clean. Co-Authored-By: Claude <[email protected]> * fix(auto_tag): preserve full chunk metadata + simplify list helper (#688) Two corrections to the prep commit (8507b54), driven by an audit cross-check on the #688 RFC: 1. ``auto_tag_storage`` rebuilt ``ChunkMetadata`` from an explicit 8-field list, silently dropping ``overlap_before/after``, ``parent_context``, ``file_context``, and the temporal-validity columns (``valid_from_unix`` / ``valid_to_unix``). Same shape bug ``web/routes/chunks.py:update_chunk_tags`` already fixed with a spread pattern; consolidate. Pinned by a regression test that fails on the pre-fix code (mutation-validated: ``valid_from_unix`` reads as ``None`` after the broken rewrite). 2. ``list_chunks_by_tag`` now thin-wraps ``recall_chunks(tag_filter=...)``. The latter already implements the ``json_each`` EXISTS match and ORDER BY shape this helper needs (``sqlite_backend.py:996``); the parallel SQL added in the prep commit was dead duplication. 22 tag/auto-tag tests pass; lint/format clean. Co-Authored-By: Claude <[email protected]> * feat(storage): bump updated_at on tag rename/delete/merge (#688) Per the #688 confirm thread (option (a)): tag-only mutations now bump ``chunks.updated_at`` so downstream consumers see the rewrite as a write event. ``decay.py`` reads ``updated_at`` for chunk age, which means a rename or delete deliberately resets the decay timer for the affected chunks. The trade-off is documented in each helper's docstring; v1 treats tag fix-ups as "this chunk is being curated." Symmetric pin tests cover both halves of the invariant: the mutated chunk's ``updated_at`` moves forward AND an untouched chunk's ``updated_at`` stays anchored. Negative-only would false-pass against a no-op implementation. Mutation-validated (3/3 fail on the unhardened code). Co-Authored-By: Claude <[email protected]> * feat(services): implement tag-management service bodies (#688) Replaces the prep skeleton's NotImplementedError stubs with the real rename / delete / merge implementations. Each entry point: - Acquires ``storage._tag_write_lock`` for the read-modify-write window (in-process serialization with ``auto_tag_storage``). - On dry-run: returns ``count_chunks_by_tag`` plus a sample of up to 10 chunk previews via ``list_chunks_by_tag``; storage stays untouched. - On apply: delegates to the hardened storage helper (which now bumps ``updated_at``) and calls ``search_pipeline.invalidate_cache()`` on a non-zero affected count, mirroring ``web/routes/system.py:418``. The ``bump_updated_at`` parameter from the skeleton is dropped: the storage helpers now bump ``updated_at`` unconditionally per the #688 confirm thread, so the parameter would have been dead. Merge dry-run counts the union of candidate chunks across ``sources`` (excluding ``target``); this set equals what the storage helper will actually mutate because every source-bearing chunk's tag list changes when a source is rewritten to target. Target-only chunks are filtered out of the source set up front, so they are not counted. 13 service tests pass — input validation, dry-run no-write, apply + invalidate, no-match no-invalidate, merge union counting, lock acquisition observable from outside. Co-Authored-By: Claude <[email protected]> * feat(auto_tag): serialize write window via _tag_write_lock + re-read (#688) Closes the race window between ``auto_tag_storage`` and the new tag-management service: previously an auto-tag run could spend N seconds in keyword/LLM extraction, then upsert with the pre-extraction tag snapshot — silently overwriting any rename / delete / merge that landed during that window. The lock is held narrowly. Slow extraction stays outside, only the read-modify-write step (re-read latest chunk → spread metadata → upsert) runs under ``storage._tag_write_lock``. The re-read is what actually preserves the concurrent mutation: spreading on top of the just-loaded ``ChunkMetadata`` carries the latest tag list (already mutated by the service) so auto_tag layers ``new_tags`` on top instead of clobbering it. Lock-narrow + invariant via mutation ordering, per the pattern in ``feedback_lock_narrow_invariant_ordering``. Lock-acquisition test holds the lock externally and verifies auto_tag waits for it (mutation-validated: removing the lock makes the test fail). Existing field-preservation regression still passes against the re-read shape. Co-Authored-By: Claude <[email protected]> * feat(web): add /api/tags rename/delete/merge routes via shared service (#688) Closes the Web-side asymmetry the audit on #688 surfaced: until now the MCP surface had ``tag_management`` tools but the Web tab had no rename / delete / merge UX surface, so any tag fix-up required either editing chunks one by one or re-running ``auto_tag overwrite=true``. - ``PUT /api/tags/{name}`` — rename to ``body.new_name`` - ``DELETE /api/tags/{name}`` — drop the tag from every chunk - ``POST /api/tags/merge`` — fold ``body.sources`` into ``body.target`` All three honour ``?dry_run=true`` and route through ``services.tag_management``, so they pick up the lock acquisition, ``updated_at`` bump, and ``search_pipeline.invalidate_cache()`` calls without re-implementing them. ``ValueError`` from the service (empty new_name / target) maps to a 400. 8 route tests cover apply + dry-run + 400 paths and assert that ``invalidate_cache`` fires on apply but not on dry-run or no-op (``sources`` collapsing to just ``target``). Co-Authored-By: Claude <[email protected]> * feat(mcp): route tag management through shared service + add mem_tag_merge (#688) Closes the MCP/Web asymmetry the audit on #688 surfaced. The MCP tools now route through ``services.tag_management`` so both surfaces share one lock, one ``updated_at`` policy, and one ``invalidate_cache`` contract. ``mem_tag_merge`` is added so MCP picks up the same merge shape Web exposed in the previous commit. - ``mem_tag_rename`` / ``mem_tag_delete`` / ``mem_tag_merge`` all gain ``dry_run: bool = False`` for the preview path. Defaults to False so existing MCP callers (no migration noise) keep applying as before. - Same input validation as the Web routes (empty old/new/target/sources rejected with a clear error string the chat agent can show the user). - Result text formatter renders the dry-run sample chunks below the "would affect N chunks" line so the user sees what the apply would touch. Parity test discriminator: the cache invalidation. ``storage.rename_tag`` / ``delete_tag`` / ``merge_tags`` don't know about ``SearchPipeline``, so only the service-layer path can fire ``invalidate_cache``. The test swaps ``invalidate_cache`` for a counter and asserts it fires exactly once on apply, never on dry-run, and never on no-op (``sources`` collapsing to just ``target``). Mutation-validated: swapping ``mem_tag_rename`` body to call storage directly makes the counter assertion fail with ``0 == 1``. Co-Authored-By: Claude <[email protected]> * test(guards): register new tag-management surfaces in AST invariants (#688) The CSRF / redaction / namespace AST guards keep the security classification registries honest by failing whenever a new mutating surface lands without an explicit row in the registry. PR1 added three Web routes (``tags.rename_tag``, ``tags.delete_tag``, ``tags.merge_tags``) and one MCP tool (``mem_tag_merge``); register them so the suite stays green and the next contributor sees the right classification template. - ``_CSRF_PROTECTED``: all three routes (default for unsafe-method surfaces; the middleware already covers them). - ``_REDACTION_EXEMPT``: all three routes — tags are short labels and these handlers rewrite existing chunk metadata only, so the LTM trust-boundary write guard does not apply (mirrors the existing ``chunks.update_chunk_tags`` and ``tags.run_auto_tag`` rows). - ``DEFERRED_NS_SURFACES``: ``mem_tag_merge.target`` — the parameter-name heuristic catches ``target=`` but the value is a tag name, not a namespace, so no namespace validator applies. Co-Authored-By: Claude <[email protected]> * fix(services): strip tag inputs + accurate merge dry-run count (#688) Addresses ultrareview P2 follow-ups on PR-1: P2 #1 — services.tag_management.rename_tag/delete_tag/merge_tags now strip inputs and reject if empty after strip. Web routes passed ``body.new_name`` raw, while MCP pre-stripped — the asymmetry let whitespace-only names persist via PUT /api/tags/{name}. The service is the single source of truth, so the gate moves there; MCP's pre-strip stays as harmless defense-in-depth. P2 #2 — services.tag_management.merge_tags dry-run no longer caps ``affected_chunks`` at the per-tag scan limit. A new ``count_chunks_by_any_tag`` storage helper does a single ``COUNT(*) WHERE EXISTS (… IN (?, …))`` over the source-tag union, so a tag attached to more than 10,000 chunks reports the true count instead of an undercount in the confirmation modal. The sample loop still caps at ``_DRY_RUN_SAMPLE_CAP`` since the UI only renders ten previews. Co-Authored-By: Claude <[email protected]> * fix(services): absorb replace_chunk_tags + tighten rename same-name gate (#688) Two follow-ups on the cross-surface symmetry pass: - ``services.tag_management.replace_chunk_tags`` now backs ``PUT /api/chunks/{id}/tags``. The route used to ``upsert_chunks`` directly, which (a) skipped the ``_tag_write_lock`` and could race with an in-flight bulk rename/delete/merge, and (b) skipped the search TTL cache flush so a follow-up tag-filtered search could serve stale results. Routing through the service closes both gaps; the per-chunk path now obeys the same invariants as the global ones. ``Chunk`` no-op (deduped tags equal current) short-circuits without an upsert so ``updated_at`` and decay scoring are not disturbed. - ``services.tag_management.rename_tag`` rejects ``old == new`` after strip. The MCP wrapper's raw ``==`` gate was both wrong (``"foo"`` vs ``" foo "`` slipped through) and not honoured by Web — a same-name rename would still rewrite ``chunks.tags``, bumping ``updated_at`` and flushing the result cache as if data had changed. Service-layer is now the single gate; the MCP pre-check is removed. Tests: 30 service tests (incl. 4 ``replace_chunk_tags`` + 4 preserve invariants for content/embedding/hash/created_at), 1 MCP parity test for the same-name reject path, 4 web-route tests for ``update_chunk_tags`` lock + cache invariants. Pin-mutation validated. Co-Authored-By: Claude <[email protected]> --------- Co-authored-by: pandas-studio <[email protected]> Co-authored-by: Claude <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
SearchPipeline,SqliteBackend,extract_entities,create_components,Mem2MemConfig)에 직접 의존하던 부분을 모두 제거 — 이제McpClientSearchAdapter만이 LTM 통신 통로memtomem-stm) 사전 작업Why
STM은 본질적으로 MCP 프록시 게이트웨이입니다 (
ProxyManager가 임의 개수 upstream MCP server를 받음 — brave-search/github/memtomem 등). 이전엔 memtomem만 in-process import로 특별 취급했는데, 이는 STM이 memtomem-server 역할까지 겸하는 architectural mismatch였습니다. remote-only로 단일화하면:git filter-repo로 별도 저장소 분리할 때 cross-repo dependency가 깔끔Changes
Removed:
SurfacingEngine.search_pipeline/storage생성자 인자 (이제mcp_adapterkeyword-only required)SurfacingConfig.ltm_mode필드 (remote가 유일한 모드)server.py의 in-process bootstrap 분기 + try/except fallback 로직_extract_heuristic의memtomem.tools.entity_extractionimport (빈 list stub)TestFeedbackBoost테스트 클래스 3개 (SqliteBackend.increment_access직접 호출에 의존)_boosted_surfacings추적 set (access boost 분기와 함께)Added:
tests/_fake_memtomem_server.py— 가벼운 stdio MCP fake server (32줄)tests/test_remote_ltm_integration.py— end-to-end integration test, fake server에 child process로 연결해서 SurfacingEngine 동작 검증 (0.36s)Updated:
README.md"LTM Connection Modes" → "LTM Connection" (단일 모드 설명)_make_search_pipeline→_make_mcp_adapterrename,search_pipeline=→mcp_adapter=일괄 전환--fix로 정리된 unused importsFollow-ups (별도 PR)
이 PR은 in-process 전용 부수 기능 3개를 임시 비활성화합니다. 0.1.0 알파에서 사용자 거의 없으므로 안전하지만 후속 작업 필요:
mem_increment_accessmem_do action을 추가한 후,McpClientSearchAdapter.increment_access(ids)메서드로 복원McpClientSearchAdapter.scratch_list()메서드 추가 (mem_scratch_get(key=None)호출 + 파싱)Test plan
uv run pytest packages/memtomem-stm/tests/ -m \"not ollama\")uv run ruff check packages/memtomem-stm/src)pip install memtomem-stm후memtomem-stm기동 (Phase 4 PyPI 배포 시)Phase roadmap
이 PR은 4단계 분리 작업의 Phase 1입니다:
cli/stm_cmd.py,StmProxyConfig,web/routes/proxy.py,lifespan.pySTM bootstrapping 제거git filter-repo로packages/memtomem-stm/→ 별도 저장소(https://github.com/memtomem/memtomem-stm.git) 추출, 모노레포에서 디렉터리 제거🤖 Generated with Claude Code