docs: post-STM-extraction cleanup + ground-truth count fixes #5

Merged
tsdata merged 1 commit into main from docs/post-stm-extraction-cleanup
Apr 9, 2026

@tsdata tsdata commented Apr 9, 2026

Summary

Pre-PyPI publish documentation pass. After Phase 1-3 extracted STM into memtomem/memtomem-stm, this repo's docs still referenced the old monorepo layout, dead CLI commands (mm stm init/reset), broken paths (packages/memtomem-stm/...), a removed Web UI tab, and stale tool/test counts. This is the cleanup.

Ground-truth verification

Verified against current source before fixing inconsistent doc claims:

| Claim | Stale values found | Authoritative |
|---|---|---|
| Tests | 1581 / 1519 / 846 / "1500+" | 886 |
| MCP tools (full mode) | 65 / 66 / 63 | 72 |
| mem_do actions | 61 / 62 | 63 |
| Standard mode | "~30" | 32 |

Changes (14 files)

PyPI-facing surfaces

  • README.md — STM section rewritten to point at memtomem-stm + drop broken uv pip install -e "packages/memtomem-stm[ltm]" path and dead mm stm commands; CLI Reference, Features table, Glossary, Documentation, Contributing block all updated; tool counts 65→72 / actions 61→63
  • CHANGELOG.md — Rewrite 0.1.0 to describe what actually ships in this package (drop STM Proxy section now in its own repo, drop stm from CLI list, fix counts, add "Related projects" pointer)

User-facing guides

  • docs/guides/getting-started.md — STM setup section + CLI reference cleanup
  • docs/guides/user-guide.md — Tool/action counts, broken ../../packages/memtomem-stm/README.md links, embedded STM Quick-setup/tools/safety subsection (lived in STM repo's README anyway)
  • docs/guides/mcp-clients.md — Tool mode line, STM Proxy Tools subsection collapsed to a separate-package pointer
  • docs/guides/web-ui.md — Removed entire STM Proxy Dashboard section (~64 lines documenting /api/proxy/* endpoints that no longer exist after Phase 2 removed web/routes/proxy.py); STM tab dropped from the tabs table
  • docs/guides/hooks.md / use-cases.md / integrations/claude-code.md — STM proxy callouts now link to the separate repo
  • docs/guides/stm-guide.md: Deleted (690 lines). Content lives in memtomem-stm's README; nothing in this repo links to it anymore.

Plugin docs

  • packages/memtomem-claude-plugin/README.md — Tool count 63→72, mode line 30→32, STM proxy callout
  • packages/memtomem-claude-plugin/skills/setup/SKILL.md — STM proxy callout
  • packages/memtomem-openclaw-plugin/README.md — Tool count 63→72

Internal docs (development reference, not on PyPI)

  • CONTRIBUTING.md — Drop packages/memtomem-stm/src from ruff invocation, add the two plugin packages to project structure, add separate-repo pointer for STM

Note on gitignored files

CLAUDE.md, GEMINI.md, and docs/marketing/{blog-post-draft,hackernews-draft}.md are gitignored (or untracked, in GEMINI.md's case). They received the same fixes locally for accuracy, but are not part of this commit by design.

Local verification

  • ruff check packages/memtomem/src → All checks passed
  • ruff format --check packages/memtomem/src → 162 files already formatted
  • 14 files changed, 45 insertions(+), 860 deletions(-)

Test plan

  • CI lint job passes
  • CI typecheck job runs
  • CI test job passes (886 tests, no doc-related changes touched code)

🤖 Generated with Claude Code

Pre-PyPI publish documentation pass. After Phase 1-3 extracted STM into
its own repository (memtomem/memtomem-stm), the docs in this repo still
referenced the old monorepo layout, dead CLI commands, broken paths, a
removed Web UI tab, and stale tool/test counts. This PR is the cleanup.

Ground-truth verification first (against current source):

- Tests: 886 (was variously 1581 / 1519 / 846 / "1500+")
- MCP tools (full mode): 72 (was 65 / 66 / 63)
- mem_do actions: 63 (was 61 / 62)
- Standard mode: 32 (was "~30")

PyPI-facing surfaces:

- README.md (top-level / GitHub landing)
  - STM section rewritten to point at memtomem/memtomem-stm + drop the
    broken `uv pip install -e "packages/memtomem-stm[ltm]"` path and
    the dead `mm stm init` / `mm stm reset` commands
  - CLI Reference: drop `mm stm init` / `mm stm reset`
  - Features table: drop "STM monitoring" from Web UI description
    (the STM tab was removed in the Phase 2 cleanup)
  - Tool/action counts updated to 72 / 63
  - Glossary: STM entry now flags it as a separate package
  - Contributing block: pytest count + lint path
- CHANGELOG.md
  - Rewrite 0.1.0 to describe what actually ships in this package:
    drop the STM Proxy section (now its own release in its own repo),
    drop `stm` from the CLI list, fix tool / action / test counts,
    add a "Related projects" pointer to memtomem-stm

User-facing guides:

- getting-started.md: STM setup section now points to PyPI install +
  memtomem-stm README; CLI reference drops `mm stm init/reset`
- user-guide.md: tool count, action count, broken relative links to
  ../../packages/memtomem-stm/README.md, and the embedded STM
  Quick-setup / tools / safety subsection (all of which lived in the
  STM repo's README anyway)
- mcp-clients.md: tool mode line (65 → 72, ~30 → ~32) + STM Proxy
  Tools subsection collapsed to a separate-package pointer
- web-ui.md: removed entire "STM Proxy Dashboard" section (~64 lines)
  documenting `/api/proxy/*` endpoints that no longer exist after
  web/routes/proxy.py was removed in Phase 2; STM tab dropped from
  the tabs table
- hooks.md / use-cases.md / integrations/claude-code.md: STM proxy
  callouts now link to the separate repo
- stm-guide.md: deleted (690 lines). The content lives in
  memtomem-stm's README; nothing in this repo links to it anymore.

Plugin docs:

- packages/memtomem-claude-plugin/README.md: tool count 63 → 72, mode
  line ~30 → ~32, STM proxy callout points at separate repo
- packages/memtomem-claude-plugin/skills/setup/SKILL.md: STM proxy
  callout points at separate repo
- packages/memtomem-openclaw-plugin/README.md: tool count 63 → 72

Internal docs (development reference, not on PyPI):

- CONTRIBUTING.md: drop packages/memtomem-stm/src from ruff invocation,
  add the two plugin packages to project structure, add a separate-
  repo pointer for STM

Note: CLAUDE.md, GEMINI.md, and docs/marketing/{blog-post-draft,
hackernews-draft}.md also received the same fixes locally for accuracy,
but they are gitignored (or, in GEMINI.md's case, untracked) so they
are not part of this commit.

Local verification:
- ruff check packages/memtomem/src   → All checks passed
- ruff format --check                → 162 files already formatted

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@tsdata tsdata merged commit b42bace into main Apr 9, 2026
@tsdata tsdata deleted the docs/post-stm-extraction-cleanup branch April 9, 2026 13:19
tsdata added a commit that referenced this pull request Apr 9, 2026
Both READMEs were doubling as user manuals. The PyPI inner README (377
lines) included a 47-line env var reference and 100 lines of tool
usage examples; the top-level GitHub README (251 lines) duplicated
much of it. New users had to scroll past hundreds of lines of
reference material to figure out whether memtomem was for them.

This PR slims both READMEs to landing-page form, extracts the
reference material into dedicated guides under docs/guides/, and
verifies every documented command, flag, and env var against the
actual source code first.

Sizes:

- packages/memtomem/README.md (PyPI):  377 → 87 lines  (-77%)
- README.md (top-level / GitHub):      251 → 129 lines (-49%)
- docs/guides/configuration.md (NEW):  106 lines
- docs/guides/embeddings.md (NEW):     77 lines

What's in the new READMEs:

- packages/memtomem/README.md: tagline, "Built for:" personas,
  install (MCP server + optional CLI), 3-call quickstart, key
  features list with doc links, documentation tree (absolute GitHub
  URLs because PyPI doesn't resolve relative links to the repo),
  license. That's it.
- README.md: existing "Why memtomem?" table, 3-step quickstart,
  features list (now with optional STM pointer), documentation tree,
  contributing block. Removed the embedded embedding-models table
  and glossary; those moved into docs/guides.

What moved out:

- 47-line "Environment Variables" section (PyPI README)
  → docs/guides/configuration.md
- 30-line "Embedding Providers" section (PyPI README)
  + 6-line "Embedding Models" table (top-level README)
  → docs/guides/embeddings.md
- 100-line "Key Tool Usage Examples" section (PyPI README)
  → already covered by docs/guides/user-guide.md (deleted to avoid
    drift)
- 16-line "CLI Usage" section (PyPI README)
  → already covered by docs/guides/getting-started.md (deleted)
- 12-line MCP tool category tables (PyPI README)
  → already covered by docs/guides/user-guide.md
- "Glossary" (top-level README)
  → already covered by docs/guides/user-guide.md (deleted)

"Docs as tests" — verified before writing:

- mm subcommands listed in CLI reference all exist in `mm --help`:
  add, config, context, embedding-reset, index, init, recall, search,
  shell, watchdog, web. (No `mm stm` — already removed in #5.)
- `mm embedding-reset` --mode flags: `status`, `apply-current`,
  `revert-to-stored` all exist (verified via --help).
- `mm context` subcommands: detect, diff, generate, init, sync.
- All 22 MEMTOMEM_* env vars in configuration.md match pydantic
  config classes (StorageConfig, EmbeddingConfig, SearchConfig,
  IndexingConfig, DecayConfig, MMRConfig, NamespaceConfig,
  ContextWindowConfig). Env prefix `MEMTOMEM_`, nested delimiter
  `__`, both confirmed via Mem2MemConfig.model_config.
- All embedding model dimensions match what each provider actually
  returns (nomic-embed-text 768, bge-m3 1024, text-embedding-3-small
  1536, text-embedding-3-large 3072).
- Tool counts (72 / 9 / ~32 / 63 actions) match the values verified
  in #5 against tool_registry.ACTIONS and full-mode tool count.
- Test count 886 verified via `pytest --co -q`.
- `[all]` extra exists in packages/memtomem/pyproject.toml.
- All guide links in both READMEs point to existing
  docs/guides/*.md files.
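
The prefix and delimiter convention checked above can be illustrated with a
stdlib-only sketch. This is not pydantic-settings and not the project's
config code: `nest_env` is a hypothetical helper that only mimics how a
`MEMTOMEM_` prefix and a `__` nested delimiter would map flat variable
names onto nested sections. `MEMTOMEM_EMBEDDING__MODEL` is an invented
example name, not one confirmed by configuration.md.

```python
from typing import Any, Mapping

ENV_PREFIX = "MEMTOMEM_"   # prefix named in the commit message
NESTED_DELIMITER = "__"    # nested delimiter named in the commit message


def nest_env(environ: Mapping[str, str]) -> dict[str, Any]:
    """Group prefixed variables into nested sections (hypothetical helper)."""
    nested: dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(ENV_PREFIX):
            continue  # unrelated variables are ignored
        # "MEMTOMEM_EMBEDDING__MODEL" -> ["embedding", "model"]
        path = name[len(ENV_PREFIX):].lower().split(NESTED_DELIMITER)
        node = nested
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return nested


# Invented variable name, used only to show the mapping.
example = nest_env({"MEMTOMEM_EMBEDDING__MODEL": "bge-m3", "PATH": "/usr/bin"})
```

Values stay as strings here; type coercion is left to the real settings
classes.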

Local checks:
- ruff check packages/memtomem/src   → All checks passed
- pytest --co                        → 886 tests collected

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
memtomem pushed a commit that referenced this pull request Apr 18, 2026
…e 5d+5e)

Phase 5d (per-type floor calibration) and Phase 5e (genre confusion
matrix) share a single runner: `tools/retrieval-eval/calibrate_portfolio.py`.

**Calibration (5d)**: indexes the 6-topic corpus (192 chunks) with
ONNX paraphrase-multilingual-MiniLM-L12-v2, runs every portfolio
query at `rrf_weights=[1.0, 1.0]` (balanced fusion = canonical
operating point), measures recall@10 / MRR@10 / nDCG@10 per query,
and aggregates per (lang, type) across `--runs` passes (default 3).
Floor = `round(mean × factor, 2)` with factor default 0.9.
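
A minimal sketch of the floor rule, assuming samples are already grouped
per (lang, type). `floor_from_samples` is a hypothetical name, not the
runner's `compute_floors`, and returning `None` for an empty group is an
assumption of this sketch.

```python
from statistics import mean
from typing import Optional, Sequence


def floor_from_samples(
    samples: Sequence[float], factor: float = 0.9
) -> Optional[float]:
    """Return round(mean * factor, 2) for one (lang, type) group."""
    if not samples:
        # Assumption: an empty group yields no floor at all.
        return None
    return round(mean(samples) * factor, 2)


# Three passes of one metric for one group (made-up values).
example_floor = floor_from_samples([0.5, 0.7, 0.9])
```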

Relevance grading matches design doc § "Relevance grading":
primary-in-targets = 1.0, secondary-in-targets = 0.5, else 0.0.
Recall and MRR use primary-only binary relevance; nDCG uses graded.
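
The grading split can be sketched as follows. Function names are
hypothetical, retrieved keys are assumed to be unique, and the DCG form
(linear gain, log2 rank discount) is an assumption; the design doc may
define it differently.

```python
import math
from typing import Sequence, Set


def grade(key: str, primary: Set[str], secondary: Set[str]) -> float:
    """Graded relevance: primary 1.0, secondary 0.5, otherwise 0.0."""
    if key in primary:  # primary takes precedence over secondary
        return 1.0
    if key in secondary:
        return 0.5
    return 0.0


def recall_at_k(retrieved: Sequence[str], primary: Set[str], k: int = 10) -> float:
    """Share of primary targets found in the top k (binary relevance)."""
    if not primary:
        return 0.0
    return len(set(retrieved[:k]) & primary) / len(primary)


def mrr_at_k(retrieved: Sequence[str], primary: Set[str], k: int = 10) -> float:
    """Reciprocal rank of the first primary hit in the top k."""
    for rank, key in enumerate(retrieved[:k], start=1):
        if key in primary:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(
    retrieved: Sequence[str], primary: Set[str], secondary: Set[str], k: int = 10
) -> float:
    """nDCG over graded relevance, normalized by the ideal ordering."""
    dcg = sum(
        grade(key, primary, secondary) / math.log2(rank + 1)
        for rank, key in enumerate(retrieved[:k], start=1)
    )
    ideal = ([1.0] * len(primary) + [0.5] * len(secondary - primary))[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0
```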

Chunk-level matching joins search results back to tagged fixtures
via `(resolved source_file, heading)` keys — `_retrieved_key`
strips the markdown `## ` prefix from `heading_hierarchy[-1]` so
the join lines up with `drift_validator.parse_fixture` output.
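
A rough sketch of such a join key. The signature is assumed rather than
taken from the runner, and mapping a missing hierarchy to an empty
heading is a guess at the behavior the tests cover.

```python
from pathlib import Path
from typing import Sequence, Tuple


def retrieved_key_sketch(
    source_file: str, heading_hierarchy: Sequence[str]
) -> Tuple[str, str]:
    """Build a (resolved source_file, heading) key (hypothetical sketch)."""
    # Assumption: no hierarchy means an empty heading.
    last = heading_hierarchy[-1] if heading_hierarchy else ""
    # Drop the markdown marker ("## ") so the heading matches fixture output.
    heading = last.lstrip("#").strip()
    return str(Path(source_file).resolve()), heading
```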

**Genre confusion matrix (5e)**: added `genre` field to Query
(`Literal["runbook", "postmortem", "adr", "troubleshooting"] | None`).
Populated for all 20 `genre_primary` queries, None elsewhere. Per run
the runner tallies `(lang, expected_genre, observed_top_1_genre)`
across all genre_primary queries and prints the aggregate matrix.
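
The tally can be pictured with a `Counter` keyed by the triple. Helper
names are hypothetical and the runner's printed layout is not reproduced
here.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# (lang, expected_genre, observed_top_1_genre)
Observation = Tuple[str, str, Optional[str]]


def tally_confusion(observations: Iterable[Observation]) -> Counter:
    """Count how often each expected genre was answered by each top-1 genre."""
    return Counter(observations)


def genre_accuracy(matrix: Counter, lang: str, expected: str) -> float:
    """Share of queries for (lang, expected) whose top-1 genre matched."""
    total = sum(
        count
        for (row_lang, row_expected, _observed), count in matrix.items()
        if (row_lang, row_expected) == (lang, expected)
    )
    return matrix[(lang, expected, expected)] / total if total else 0.0
```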

Also quiet an FTS5 tokenizer failure: removed the apostrophe from
KO direct #5 (`cert-manager Let's Encrypt …` → `cert-manager ACME …`).
Apostrophes in the query string break the BM25 branch with a
`fts5: syntax error near "'"`, silently producing zero recall for
that query. The ACME phrasing is closer to current Kubernetes cert-
manager vocabulary anyway.
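
This commit fixes the query text rather than the query path. For
comparison only, here is a sketch of a different approach that it does
not take: wrapping each term in an FTS5 string literal (double quotes,
with embedded double quotes doubled) so punctuation cannot reach the
query parser. `quote_fts5_terms` is a hypothetical helper, and it assumes
the result is passed to `MATCH` as a bound parameter.

```python
def quote_fts5_terms(query: str) -> str:
    """Wrap each whitespace-separated term in an FTS5 string literal."""
    quoted = []
    for term in query.split():
        # Inside an FTS5 string, a double quote is escaped by doubling it.
        quoted.append('"' + term.replace('"', '""') + '"')
    # Quoted terms separated by spaces are combined with implicit AND.
    return " ".join(quoted)


safe_query = quote_fts5_terms("cert-manager Let's Encrypt")
```

Quoting changes how terms are matched (each term becomes a phrase after
tokenization), so it is not a drop-in replacement for the reworded query.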

**Smoke-run output** (3 runs, factor 0.9 — indicative only, not
committed as floors; numbers will be re-measured by whoever wires
CI):

  EN direct: mean recall 0.75 → floor 0.68
  EN paraphrase: 0.43 → 0.39
  EN multi_topic: 0.42 → 0.38
  EN negation: 0.37 → 0.33
  EN underspecified: 0.68 → 0.61
  EN genre_primary: 0.16 → 0.15
  KO direct: 0.79 → 0.71
  KO paraphrase: 0.39 → 0.35
  ...

Genre confusion spot-check (EN, 3 queries per expected × 3 runs):
- adr 9/9 correct, postmortem 6/6, runbook 9/9
- troubleshooting 3/6 correct, 3/6 → postmortem (expected pair
  confusion per design doc § "Genre-pair confusability predictions")
KO shows a separate signal: 0/9 runbook queries top-1 to runbook
(6→troubleshooting, 3→postmortem). Recorded as a Phase 5 finding
for Phase 6 CI gate calibration.

Tests at packages/memtomem/tests/test_calibrate_portfolio.py (12):
- build_relevance primary/secondary/language/precedence
- compute_floors factor, empty samples
- _retrieved_key markdown-prefix stripping + missing/deep hierarchy
- collect_tagged_chunks live-corpus count (192 = 96 KO + 96 EN)
- _genre_of genre-stem detection

End-to-end `calibrate()` covers ONNX indexing and is run via the CLI,
not pytest, to keep the test suite fast.

Validation: ruff check + format clean; mypy advisory on
`load_config_overrides` lambda arg-name mismatch (same issue as
measure_sensitivity.py; mypy CI scope is src/ only, advisory for
tools/). 47 fast tests pass (12 drift + 12 portfolio + 23 validator).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
memtomem added a commit that referenced this pull request Apr 25, 2026
…line (#468)

`_split_section` advances `line_offset` by `count("\n") + 2` per
paragraph part, where `+2` matches the `"\n\n"` join separator that
sits between paragraphs in the source. When closing an accumulated
sub-chunk the formula then read `base_line + line_offset - 1`, which
landed one line past the paragraph — on the blank separator that
belongs to no chunk. `mem_edit`'s `replace_chunk_body` reads
`lines[start..end]` inclusive, so the overstated `end_line` pulled the
next paragraph's blank into the replacement on save, collapsing the
paragraph gap. Cosmetic, not data-loss.

Subtract the separator (`- 2` instead of `- 1`) so `end_line` lands on
the actual last content line. The trailing sub-chunk is unaffected —
it inherits `section["end_line"]` directly. Pre-existing inequality
tests stay green; new exact-alignment tests cover both the no-header
(`body_offset = 1`) and blockquote-header (`body_offset > 1`) paths.
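
The off-by-one can be reproduced in isolation. This is a simplified
model, not `_split_section` itself: it assumes every paragraph closes its
own sub-chunk and applies the formula to the last paragraph too, whereas
the real trailing sub-chunk inherits `section["end_line"]`.

```python
def paragraph_end_lines(body: str, base_line: int, adjust: int) -> list[int]:
    """Return the end_line computed after each paragraph of `body`.

    `adjust` is 1 for the old formula and 2 for the fixed one.
    """
    ends = []
    line_offset = 0
    for part in body.split("\n\n"):
        # Newlines inside the paragraph plus the "\n\n" join separator.
        line_offset += part.count("\n") + 2
        ends.append(base_line + line_offset - adjust)
    return ends


body = "a\nb\n\nc\n\nd"            # lines: a, b, (blank), c, (blank), d
old_ends = paragraph_end_lines(body, base_line=1, adjust=1)
new_ends = paragraph_end_lines(body, base_line=1, adjust=2)
```

With 1-indexed lines, the old formula points at the blank separators and
the fixed one at the last content line of each paragraph.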

Closes RFC `mem-add-tags-blockquote-promote-rfc.md` §Follow-ups #5.

Co-authored-by: pandas-studio <[email protected]>
Co-authored-by: Claude <[email protected]>
memtomem pushed a commit that referenced this pull request Apr 28, 2026
P2 cron Phase A added the ``scheduler`` config section
(Mem2MemConfig.scheduler, BaseSettings-backed). A developer with
MEMTOMEM_SCHEDULER__ENABLED set in their shell would silently mutate
fixture-built configs. Add it to ``isolate_memtomem_env``'s strip list
so the helper covers every env-var-bound section that exists today.
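
A stdlib-only sketch of the idea. The real `isolate_memtomem_env` likely
works through pytest's `monkeypatch`, which is an assumption; only
`MEMTOMEM_SCHEDULER__ENABLED` below comes from this message, and
`stripped_env` is a hypothetical name.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Sequence

# Only this name is taken from the commit message; a real strip list
# would name every env-var-bound setting.
STRIP_LIST: Sequence[str] = ("MEMTOMEM_SCHEDULER__ENABLED",)


@contextmanager
def stripped_env(names: Sequence[str] = STRIP_LIST) -> Iterator[None]:
    """Temporarily remove the named variables from os.environ."""
    saved = {name: os.environ.pop(name) for name in names if name in os.environ}
    try:
        yield
    finally:
        # Restore whatever the developer's shell had set.
        os.environ.update(saved)
```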

Sibling sweep (per #529 review #5): ``_StubCtx`` duplicates also live
in test_sessions, test_validate_namespace, test_validate_namespace_ns_tools,
test_validate_agent_id, test_chunk_links_writer, test_server_degraded_mode
— left for a separate cleanup PR to keep #467's scope to the two files
named in the issue.

Co-Authored-By: Claude <[email protected]>
memtomem added a commit that referenced this pull request Apr 28, 2026
* chore(tests): hoist bm25-only fixture trio to conftest/helpers

PR #465 added ``test_batch_add_tag_isolation.py`` which redefined the
``_isolate_memtomem_env`` + ``_StubCtx`` + ``integration_components``
fixture trio that already lived almost verbatim in
``test_multi_agent_integration.py``. Two callers was tolerable; a third
would push it past the extraction threshold (per
feedback_refactor_axis.md / the #465 review N1).

- conftest.py grows the ``bm25_only_components`` fixture (renamed from
  ``integration_components`` so the BM25-only / hermetic-config scope is
  visible at the call site).
- helpers.py grows ``StubCtx`` and ``isolate_memtomem_env`` — the latter
  is exposed (not underscore-prefixed) because the LangGraph adapter
  cases construct their own ``MemtomemStore`` outside the fixture and
  need the helper directly.
- Both test files drop their local copies and import from the shared
  module. No behavior change; 23 tests in the two files still pass and
  the full suite (2999 tests) is green.

Closes #467.

Co-Authored-By: Claude <[email protected]>

* test: add MEMTOMEM_SCHEDULER__ENABLED to hermeticity strip-list

P2 cron Phase A added the ``scheduler`` config section
(Mem2MemConfig.scheduler, BaseSettings-backed). A developer with
MEMTOMEM_SCHEDULER__ENABLED set in their shell would silently mutate
fixture-built configs. Add it to ``isolate_memtomem_env``'s strip list
so the helper covers every env-var-bound section that exists today.

Sibling sweep (per #529 review #5): ``_StubCtx`` duplicates also live
in test_sessions, test_validate_namespace, test_validate_namespace_ns_tools,
test_validate_agent_id, test_chunk_links_writer, test_server_degraded_mode
— left for a separate cleanup PR to keep #467's scope to the two files
named in the issue.

Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: pandas-studio <[email protected]>
Co-authored-by: Claude <[email protected]>
memtomem pushed a commit that referenced this pull request Apr 29, 2026
… in toast, align canonical_root style

Follow-up to PR #549 review:

- **#1 (low-medium)**: drop the hardcoded `_CTX_EMPTY_HINT_META` JS map.
  GET `/api/context/{skills,commands,agents}` now also returns
  `canonical_root: str` and `scanned_dirs: list[str]` (computed once at
  module import from `SKILL_DIRS` / `AGENT_DIRS` / `COMMAND_DIRS`), so the
  empty-state hint pulls from the wire instead of duplicating the
  detector layout client-side. The JS still has a one-line fallback
  (`.memtomem/${type}` + empty list) for older backends, but it never
  fires when the cache-bust takes effect.
- **#2 (cosmetic)**: skills sync response now uses the
  `CANONICAL_SKILL_ROOT = ".memtomem/skills"` constant directly,
  matching the `CANONICAL_AGENT_ROOT` / `CANONICAL_COMMAND_ROOT` pattern
  in agents/commands routes. Drops the `_safe_rel(canonical_skills_root(
  project_root), project_root)` round-trip — both forms resolve to the
  same string today, but a constant won't drift if a future PR adds env
  / scope resolution to `canonical_skills_root()`.
- **#3 (low)**: import-no-runtimes toast renders
  `basename(project_root)` instead of the absolute path. The wire still
  carries the absolute string (consistent with `/api/system/*` and
  useful for debugging / reverse-proxy contexts) but the user-facing
  toast keeps it short — `scanned_dirs` already gives full orientation.
  New `_ctxBasename()` helper handles the POSIX path split.

Cache-bust bumped to `context-gateway.js?v=3`.

Tests:
- New assertions on `TestListSkills.test_empty`,
  `TestListCommands.test_empty`, `TestListAgents.test_empty` verify
  GET response now includes `canonical_root` + `scanned_dirs`.
- `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format
  --check` clean.

Skipped per reviewer guidance:
- #4 `Final[str]` → `Literal` / `StrEnum` (mypy is advisory).
- #5 additional positive route assertions for `invalid_name` /
  `already_imported` / `toml_parse_error` (route layer is a trivial
  dict comprehension; covered at the core layer).
- #6 Korean `<이름>` literal angle-bracket (intentional placeholder
  guide, not an unsubstituted variable).

Co-Authored-By: Claude <[email protected]>
memtomem added a commit that referenced this pull request Apr 29, 2026
* fix(web): surface "why nothing happened" on Skills/Commands/Agents Sync+Import

Users reported that "both Sync and Import behave strangely": both
buttons in mm web → Settings → Skills/Commands/Agents looked successful
(a green toast popped) but nothing actually moved on disk. The root
cause was a shape mismatch in the response handling, not a wiring bug:

- For skills, the sync response carries `skipped` (e.g. "no canonical
  skills") but never `dropped`. The client only read `dropped`, so the
  skipped reason was silently dropped on the floor.
- The import response carried no metadata at all, so a 0-imported,
  0-skipped result (the common case when no `.claude/skills/` exists in
  the project) just rendered "Import completed" with no clue that the
  scanner found nothing because the directories didn't exist.

This change keeps cwd-bound single-project behavior unchanged and adds
machine-readable plumbing the UI can match against:

- All three context types' `Sync` and `Import` core layers
  (`memtomem.context.{skills,commands,agents}`) now record skipped items
  as `(name, reason, reason_code)` triples. Reason codes come from a new
  `memtomem.context._skip_reasons` module: `no_canonical_root`,
  `unknown_runtime`, `parse_error`, `invalid_name`, `already_imported`,
  `canonical_exists`, `toml_parse_error`. Human `reason` strings stay
  for CLI/log output.
- Web route handlers (`web/routes/context_{skills,commands,agents}.py`)
  surface `reason_code` per skipped item, plus `canonical_root` on sync
  responses and `project_root` + `scanned_dirs` on import responses.
- The web client (`context-gateway.js`) reads `data.skipped` (in
  addition to `data.dropped` for commands/agents field-level drops),
  matches on `reason_code === "no_canonical_root"` to show
  "No canonical {type} under {canonical}. Create one first." instead of
  a generic success, and on a 0/0 import shows "No runtime {type} found
  in {project_root}. Scanned: {scan_dirs}." so users see exactly which
  paths the detector inspected.
- The empty list state hint now points at the canonical and runtime
  paths instead of "Create one or import from existing runtimes." — a
  fresh user can drop a `SKILL.md` into the right directory without
  guessing.
- Three new placeholder-form i18n keys (`empty_hint`,
  `import_no_runtimes`, `sync_empty_canonical`) cover all three types
  via `{type}` / `{canonical}` / `{root}` / `{scan_dirs}` substitution
  rather than 9 type-specific keys.
- `context-gateway.js?v=2` cache-bust.

All response changes are additive — existing clients that only read
`imported`/`generated`/`skipped[].name|runtime|reason` keep working
unchanged. The CLI (`mm context skills/agents/commands ...`) and the
MCP `context` tool also unpack the new triple shape via `_code` discard
so their human-readable output is byte-identical.

Tests:
- 3 core test suites (`test_context_{skills,commands,agents}.py`)
  updated to the 3-tuple shape.
- New web route assertions verify `reason_code`, `canonical_root`,
  `project_root`, and `scanned_dirs` are present in sync/import
  responses for the empty-canonical and empty-runtime paths.
- Full suite green: `pytest -m "not ollama"` 3152 passed; `ruff check`
  + `ruff format --check` clean.

Co-Authored-By: Claude <[email protected]>

* fix(web/context): address review — surface scan dirs on GET, basename in toast, align canonical_root style

Follow-up to PR #549 review:

- **#1 (low-medium)**: drop the hardcoded `_CTX_EMPTY_HINT_META` JS map.
  GET `/api/context/{skills,commands,agents}` now also returns
  `canonical_root: str` and `scanned_dirs: list[str]` (computed once at
  module import from `SKILL_DIRS` / `AGENT_DIRS` / `COMMAND_DIRS`), so the
  empty-state hint pulls from the wire instead of duplicating the
  detector layout client-side. The JS still has a one-line fallback
  (`.memtomem/${type}` + empty list) for older backends, but it never
  fires when the cache-bust takes effect.
- **#2 (cosmetic)**: skills sync response now uses the
  `CANONICAL_SKILL_ROOT = ".memtomem/skills"` constant directly,
  matching the `CANONICAL_AGENT_ROOT` / `CANONICAL_COMMAND_ROOT` pattern
  in agents/commands routes. Drops the `_safe_rel(canonical_skills_root(
  project_root), project_root)` round-trip — both forms resolve to the
  same string today, but a constant won't drift if a future PR adds env
  / scope resolution to `canonical_skills_root()`.
- **#3 (low)**: import-no-runtimes toast renders
  `basename(project_root)` instead of the absolute path. The wire still
  carries the absolute string (consistent with `/api/system/*` and
  useful for debugging / reverse-proxy contexts) but the user-facing
  toast keeps it short — `scanned_dirs` already gives full orientation.
  New `_ctxBasename()` helper handles the POSIX path split.

Cache-bust bumped to `context-gateway.js?v=3`.

Tests:
- New assertions on `TestListSkills.test_empty`,
  `TestListCommands.test_empty`, `TestListAgents.test_empty` verify
  GET response now includes `canonical_root` + `scanned_dirs`.
- `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format
  --check` clean.

Skipped per reviewer guidance:
- #4 `Final[str]` → `Literal` / `StrEnum` (mypy is advisory).
- #5 additional positive route assertions for `invalid_name` /
  `already_imported` / `toml_parse_error` (route layer is a trivial
  dict comprehension; covered at the core layer).
- #6 Korean `<이름>` literal angle-bracket (intentional placeholder
  guide, not an unsubstituted variable).

Co-Authored-By: Claude <[email protected]>

* refactor(context): narrow reason_code to Literal SkipCode

Tightens the typing on the third element of `(name, reason, reason_code)`
triples produced by `SkillSyncResult.skipped`, `CommandSyncResult.skipped`,
`AgentSyncResult.skipped`, and `ExtractResult.skipped`. The new
`memtomem.context._skip_reasons.SkipCode` is a `Literal` of the seven
codes already enumerated in that module, so a typo at the construction
site fails type-check instead of slipping through to the wire.
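
Roughly, the narrowed type could look like this. The seven codes are the
ones listed earlier in this PR; the `Skipped` alias, both helper
functions, and the `name: reason` text format are hypothetical.

```python
from typing import Literal, Tuple, get_args

SkipCode = Literal[
    "no_canonical_root",
    "unknown_runtime",
    "parse_error",
    "invalid_name",
    "already_imported",
    "canonical_exists",
    "toml_parse_error",
]

# (name, reason, reason_code)
Skipped = Tuple[str, str, SkipCode]


def skipped_to_wire(skipped: list[Skipped]) -> list[dict[str, str]]:
    """Hypothetical route-layer shape: one dict per skipped item."""
    return [
        {"name": name, "reason": reason, "reason_code": code}
        for name, reason, code in skipped
    ]


def skipped_to_text(skipped: list[Skipped]) -> list[str]:
    """CLI-style output that discards the code (format is an assumption)."""
    return [f"{name}: {reason}" for name, reason, _code in skipped]
```

A type checker would reject a misspelled code such as `"no_canonical_rot"`
where a `SkipCode` is expected; at runtime the value is still a plain
string.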

Also strengthens the import-empty web-route tests for skills, commands,
and agents to assert `data["project_root"] == str(tmp_path)` instead of
just key presence — confirms the response actually echoes the
fixture's project root rather than some unrelated path.

No runtime behavior change; addresses follow-up polish flagged in the
review of this PR.

Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: pandas-studio <[email protected]>
Co-authored-by: Claude <[email protected]>
memtomem pushed a commit that referenced this pull request Apr 29, 2026
Self-review of PR #553 surfaced five real issues — fixing them keeps
the surface honest about what's actually wired up.

- Drop dead helpers in ``context/projects.py`` (#1):
  - ``_is_eaccess`` was never called.
  - ``is_project_scope_id`` claimed to support route error formatting
    but no route used it.
  - ``_new_temp_known_projects_path`` claimed test convenience but no
    test imported it.
  - ``errno`` and ``tempfile`` imports follow.
- Drop ``ProjectScope.counts`` dataclass field (#2). Discovery never
  populated it; the route layer (``_scope_to_dict``) always overrides
  via ``_counts_for(root)``. Forward-design pollution.
- ``POST /api/context/known-projects`` now returns ``warning_code:
  "no_runtime_marker"`` alongside the prose ``warning`` (#4). Matches
  PR1's (#549) machine-readable ``reason_code`` pattern so client
  matching is i18n-stable; the prose stays for back-compat.
- ``_counts_for`` gains a cost comment (#3): 3 × (canonical scan +
  N runtime scans) per scope on every projects GET. Acceptable at
  <30 scopes; cache when discovery growth pushes that ceiling.
- JS scope badge no longer carries a redundant ``data-i18n`` attribute
  (#5). The inline ``t()`` already renders text at construction; the
  attribute would let the i18n DOM walker re-translate and clobber.
- ``ctx-scope-summary`` now carries a ``title="${scope.root}"`` (#8) so
  same-name scopes (``Edu/inflearn`` vs ``Work/inflearn``) disambiguate
  on hover without bloating the visible label.

Cache-bust: ``context-gateway.js?v=4`` → ``?v=5``.

Tests updated: ``test_post_warns_on_missing_marker`` asserts the new
``warning_code``; ``test_post_no_warning_when_marker_present`` asserts
both fields are absent. 110/110 green; ruff + format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
memtomem added a commit that referenced this pull request Apr 29, 2026
…#553)

* feat(web): multi-project read-only discovery for context-gateway tabs

PR2 of the multi-project context UI series — see
``memtomem-docs/memtomem/planning/multi-project-context-ui-rfc.md``.

A user running ``mm web`` from the ``memtomem`` repo could previously
only see skills / commands / agents under that one cwd; ``~/Edu/inflearn/
.claude/skills/`` was invisible without restarting the server. PR2 adds
discovery for additional project roots (server cwd, user-registered
``known_projects.json``, opt-in ``~/.claude/projects/`` reverse-decode)
and renders one collapsible ``<details>`` group per scope inside each
tab. Mutating routes stay cwd-only — multi-scope writes ship in PR3.

Backend
-------

- ``memtomem.context.projects`` is the new discovery module:
  - ``ProjectScope`` carries ``{scope_id, label, root, tier, sources,
    missing, experimental, counts}``.
  - ``compute_scope_id`` derives ``"p-" + sha256(case-normalized
    Path.resolve())[:12]`` so refresh / restart preserve the id (RFC
    §Decision 4). On macOS the input is also lowercased: APFS is
    case-insensitive but ``Path.resolve()`` does not canonicalize case.
  - ``KnownProjectsStore`` writes ``~/.memtomem/known_projects.json``
    via ``tempfile + os.replace`` plus a ``.<name>.lock`` sidecar
    fcntl lock — locking the data file directly does not survive
    ``os.replace`` (PR #548 issue).
  - ``discover_project_scopes`` unions cwd + known-projects + (opt-in)
    claude-projects, dedupes by resolved path, and clears
    ``experimental`` on union with a trusted source.
- ``ContextGatewayConfig`` (new) carries ``known_projects_path``,
  ``experimental_claude_projects_scan: bool = False``, and
  ``user_tier_enabled: bool = False`` (forward-compat for PR3).
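
The scope id derivation above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: hashing the UTF-8 string form of the resolved path and detecting macOS via ``sys.platform`` are guesses at details the description leaves open.

```python
from __future__ import annotations

import hashlib
import sys
from pathlib import Path


def compute_scope_id(root: str | Path) -> str:
    """Sketch: stable id derived from the resolved, case-normalized path."""
    resolved = str(Path(root).resolve())
    if sys.platform == "darwin":
        # APFS is case-insensitive, but resolve() keeps the caller's casing,
        # so lowercase to make differently cased spellings hash the same.
        resolved = resolved.lower()
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()
    return "p-" + digest[:12]
```

Because the id depends only on the resolved path, the same directory yields the same id across refreshes and restarts, and a trailing slash makes no difference.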

Routes
------

New ``web/routes/context_projects.py``:

- ``GET /api/context/projects`` — full scope list with per-type counts.
- ``POST /api/context/known-projects`` — register; absolute + ``is_dir()``
  validation; warning (HTTP 200) when no ``.claude``/``.gemini``/``.agents``/
  ``.memtomem`` marker is present so users can pre-register an empty
  checkout.
- ``DELETE /api/context/known-projects/{scope_id}`` — drop, including
  stale entries (matching is path-derived).
- ``resolve_scope_root`` is the dependency the existing
  ``/api/context/{skills,commands,agents}`` GETs now use; without
  ``?scope_id=`` it falls back to the server cwd so PR1's mutating
  cwd flow keeps working unchanged. Unknown / stale scope_id → 404.
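
The fallback rule described for ``resolve_scope_root`` can be sketched as a plain function. The name ``pick_scope_root``, its signature, and ``UnknownScopeError`` are hypothetical; the real dependency's interface is not shown in this PR text, and a route layer would be the one to translate the error into a 404.

```python
from __future__ import annotations

from pathlib import Path


class UnknownScopeError(LookupError):
    """Hypothetical error; a route layer would map it to HTTP 404."""


def pick_scope_root(
    scope_id: str | None, scopes: dict[str, Path], cwd: Path
) -> Path:
    """Sketch: no scope_id means the server cwd; unknown ids are rejected."""
    if scope_id is None:
        # Keeps the cwd-only flow working when no ?scope_id= is passed.
        return cwd
    try:
        return scopes[scope_id]
    except KeyError:
        # Covers both never-registered and stale ids.
        raise UnknownScopeError(scope_id) from None
```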

UI
--

Each tab gains an "Add Project" button; the list area renders one
``<details class="ctx-scope-group">`` per scope, lazy-fetching items
on expand. Server CWD opens by default; non-cwd scope items render as
read-only cards (PR2 keeps mutating buttons targeting cwd only). Each
non-cwd group has an ✕ button that calls DELETE
``/api/context/known-projects/{scope_id}``.

Three new i18n keys (placeholder-form, en + ko parity).

Tests
-----

PR2 minimum-bar from RFC §Test obligations:

- ``tests/test_context_projects.py`` — scope_id stability + collision
  sanity, trailing-slash invariance, macOS case-insensitivity,
  symlink dedup, ``experimental`` flag clears on union, both
  ``experimental_claude_projects_scan`` defaults, ``known_projects.json``
  corruption / unknown-version recovery, stale-entry removal,
  multiprocessing.Process atomic-write race.
- ``tests/test_web_routes_context_projects.py`` — HTTP shape, unknown
  scope_id 404, POST 400 paths (relative / nonexistent / file),
  marker warning, idempotent registration, DELETE round-trip + 404.
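
The atomic-write race test listed above targets the ``tempfile`` + ``os.replace`` + sidecar-lock pattern described for ``KnownProjectsStore``. A minimal POSIX-only sketch of that pattern follows; the function name ``atomic_write_json``, the use of ``fcntl.flock`` (rather than another fcntl locking call), and the JSON payload are assumptions, not the store's actual implementation.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Sketch: write JSON atomically while holding a sidecar lock."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lock_path = path.with_name(f".{path.name}.lock")
    with open(lock_path, "w") as lock_file:
        # Lock the sidecar, not the data file: os.replace swaps in a new
        # inode, so a lock taken on the old data file would guard nothing.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    json.dump(payload, tmp)
                    tmp.flush()
                    os.fsync(tmp.fileno())
                # Same-directory rename, so readers see old or new, never partial.
                os.replace(tmp_name, path)
            except BaseException:
                # Leave no stray temp file behind on failure.
                if os.path.exists(tmp_name):
                    os.unlink(tmp_name)
                raise
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Creating the temp file in the target's own directory keeps the final ``os.replace`` on one filesystem, which is what makes the swap atomic.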

Manual smoke
------------

``HOME=/tmp/.../home uv run mm web --port 8088``: GET projects shows
cwd; POST /tmp returns 200 + warning; GET projects shows two scopes
(``Server CWD`` + ``tmp``); skills?scope_id=p-bogus → 404; DELETE
round-trip works; second DELETE → 404.

Refs RFC ``multi-project-context-ui``; gates PR3 (multi-project
mutating + Claude user-tier).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* review fixes: drop dead helpers, add warning_code, summary tooltip


---------

Co-authored-by: pandas-studio <[email protected]>
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>