docs: slim READMEs, add configuration + embeddings guides by tsdata · Pull Request #6 · memtomem/memtomem

tsdata · 2026-04-09T13:38:09Z

Summary

Both READMEs were doubling as user manuals. The PyPI inner README (377 lines) included a 47-line env var reference and 100 lines of tool usage examples; the top-level GitHub README (251 lines) duplicated much of it. New users had to scroll past hundreds of lines of reference material to figure out whether memtomem was for them.

This PR slims both READMEs to landing-page form, extracts the reference material into dedicated guides under docs/guides/, and verifies every documented command, flag, and env var against the actual source code first.

Sizes

| File | Before (lines) | After (lines) | Δ |
|---|---|---|---|
| `packages/memtomem/README.md` (PyPI) | 377 | 87 | -77% |
| `README.md` (top-level / GitHub) | 251 | 129 | -49% |
| `docs/guides/configuration.md` (NEW) | n/a | 106 | new |
| `docs/guides/embeddings.md` (NEW) | n/a | 77 | new |

What's in the new READMEs

`packages/memtomem/README.md` (PyPI landing page)

Tagline → "Built for:" personas → install (MCP server + optional CLI) → 3-call quickstart → key features list with doc links → documentation tree (absolute GitHub URLs because PyPI doesn't resolve relative links to the repo) → license. That's it.

`README.md` (GitHub landing page)

Existing "Why memtomem?" table → 3-step quickstart → features list (now with optional STM pointer) → documentation tree → contributing block. Removed the embedded embedding-models table and glossary; those moved into docs/guides.

What moved out

| Removed from | Moved to |
|---|---|
| 47-line "Environment Variables" section (PyPI README) | `docs/guides/configuration.md` |
| 30-line "Embedding Providers" section (PyPI README) + 6-line "Embedding Models" table (top-level README) | `docs/guides/embeddings.md` |
| 100-line "Key Tool Usage Examples" section (PyPI README) | already covered by `docs/guides/user-guide.md` (deleted to avoid drift) |
| 16-line "CLI Usage" section (PyPI README) | already covered by `docs/guides/getting-started.md` (deleted) |
| 12-line MCP tool category tables (PyPI README) | already covered by `docs/guides/user-guide.md` |
| "Glossary" (top-level README) | already covered by `docs/guides/user-guide.md` (deleted) |

"Docs as tests" — verified before writing

- All `mm` subcommands listed in the CLI references exist in `mm --help`: `add`, `config`, `context`, `embedding-reset`, `index`, `init`, `recall`, `search`, `shell`, `watchdog`, `web`. (No `mm stm`; it was already removed in docs: post-STM-extraction cleanup + ground-truth count fixes #5.)
- `mm embedding-reset` `--mode` flags: `status`, `apply-current`, `revert-to-stored` all exist (verified via `--help`).
- `mm context` subcommands: `detect`, `diff`, `generate`, `init`, `sync`.
- All 22 `MEMTOMEM_*` env vars in `configuration.md` match pydantic config classes (`StorageConfig`, `EmbeddingConfig`, `SearchConfig`, `IndexingConfig`, `DecayConfig`, `MMRConfig`, `NamespaceConfig`, `ContextWindowConfig`). Env prefix `MEMTOMEM_` and nested delimiter `__` were both confirmed via `Mem2MemConfig.model_config`.
- All embedding model dimensions match what each provider actually returns (`nomic-embed-text` 768, `bge-m3` 1024, `text-embedding-3-small` 1536, `text-embedding-3-large` 3072).
- Tool counts (72 / 9 / ~32 / 63 actions) match the values verified in docs: post-STM-extraction cleanup + ground-truth count fixes #5 against `tool_registry.ACTIONS` and the full-mode tool count.
- Test count 886 verified via `pytest --co -q`.
- `[all]` extra exists in `packages/memtomem/pyproject.toml`.
- All guide links in both READMEs point to existing `docs/guides/*.md` files.
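As a rough illustration of the prefix and delimiter check above, the sketch below shows how a `MEMTOMEM_`-prefixed variable with a `__` delimiter maps onto a nested settings path. It is a stdlib-only approximation, not the project's code: the real mapping is done by pydantic settings (which also coerces and validates types), and the variable names in the example are hypothetical.

```python
def parse_env(environ, prefix="MEMTOMEM_", delimiter="__"):
    """Collect prefixed variables into a nested dict of lowercased keys."""
    nested = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated variables are ignored
        parts = key[len(prefix):].lower().split(delimiter)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value  # values stay strings in this sketch
    return nested


example_env = {
    "MEMTOMEM_EMBEDDING__MODEL": "bge-m3",  # hypothetical variable name
    "MEMTOMEM_SEARCH__TOP_K": "10",         # hypothetical variable name
    "PATH": "/usr/bin",
}
settings = parse_env(example_env)
print(settings)
```

Only the double underscore splits a name into sections, so a single underscore inside a field name (`TOP_K`) survives as part of the key.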

Local checks

- `ruff check packages/memtomem/src` → All checks passed
- `pytest --co` → 886 tests collected

Test plan

- CI lint job passes
- CI typecheck job runs
- CI test job passes (886 tests, no code changes)
- PyPI README preview renders (the new layout uses absolute GitHub URLs since PyPI doesn't resolve relative links)
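The absolute-URL requirement in the last item could be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes inline Markdown links of the form `[text](target)` and reports any target that is neither an in-page anchor nor a URL with a scheme. The function name and the sample text are illustrative only.

```python
import re

# Inline Markdown links: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Targets that resolve without repo context: "scheme:..." URLs and "#anchor" links
ABSOLUTE_RE = re.compile(r"^(?:[a-z][a-z0-9+.-]*:|#)", re.IGNORECASE)


def relative_links(markdown: str) -> list[str]:
    """Return link targets that would break on a page rendered outside the repo."""
    return [t for t in LINK_RE.findall(markdown) if not ABSOLUTE_RE.match(t)]


sample = (
    "[Configuration](docs/guides/configuration.md) "
    "[Repo](https://github.com/memtomem/memtomem) "
    "[Install](#install)"
)
print(relative_links(sample))
```

Running this over the PyPI README and failing on a non-empty result would catch a relative link before the package page is published.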

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context)


* fix(web): surface "why nothing happened" on Skills/Commands/Agents Sync+Import

Users reported that "both Sync and Import behave strangely": both buttons in mm web → Settings → Skills/Commands/Agents looked successful (a green toast popped) but nothing actually moved on disk. The root cause was a shape mismatch in the response handling, not a wiring bug:

- For skills, the sync response carries `skipped` (e.g. "no canonical skills") but never `dropped`. The client only read `dropped`, so the skipped reason was silently discarded.
- The import response carried no metadata at all, so a 0-imported, 0-skipped result (the common case when no `.claude/skills/` exists in the project) just rendered "Import completed" with no clue that the scanner found nothing because the directories didn't exist.

This change keeps cwd-bound single-project behavior unchanged and adds machine-readable plumbing the UI can match against:

- All three context types' `Sync` and `Import` core layers (`memtomem.context.{skills,commands,agents}`) now record skipped items as `(name, reason, reason_code)` triples. Reason codes come from a new `memtomem.context._skip_reasons` module: `no_canonical_root`, `unknown_runtime`, `parse_error`, `invalid_name`, `already_imported`, `canonical_exists`, `toml_parse_error`. Human `reason` strings stay for CLI/log output.
- Web route handlers (`web/routes/context_{skills,commands,agents}.py`) surface `reason_code` per skipped item, plus `canonical_root` on sync responses and `project_root` + `scanned_dirs` on import responses.
- The web client (`context-gateway.js`) reads `data.skipped` (in addition to `data.dropped` for commands/agents field-level drops), matches on `reason_code === "no_canonical_root"` to show "No canonical {type} under {canonical}. Create one first." instead of a generic success, and on a 0/0 import shows "No runtime {type} found in {project_root}. Scanned: {scan_dirs}." so users see exactly which paths the detector inspected.
- The empty list state hint now points at the canonical and runtime paths instead of "Create one or import from existing runtimes.", so a fresh user can drop a `SKILL.md` into the right directory without guessing.
- Three new placeholder-form i18n keys (`empty_hint`, `import_no_runtimes`, `sync_empty_canonical`) cover all three types via `{type}` / `{canonical}` / `{root}` / `{scan_dirs}` substitution rather than 9 type-specific keys.
- `context-gateway.js?v=2` cache-bust.

All response changes are additive: existing clients that only read `imported`/`generated`/`skipped[].name|runtime|reason` keep working unchanged. The CLI (`mm context skills/agents/commands ...`) and the MCP `context` tool also unpack the new triple shape via `_code` discard so their human-readable output is byte-identical.

Tests:

- 3 core test suites (`test_context_{skills,commands,agents}.py`) updated to the 3-tuple shape.
- New web route assertions verify `reason_code`, `canonical_root`, `project_root`, and `scanned_dirs` are present in sync/import responses for the empty-canonical and empty-runtime paths.
- Full suite green: `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format --check` clean.

Co-Authored-By: Claude

* fix(web/context): address review: surface scan dirs on GET, basename in toast, align canonical_root style

Follow-up to PR #549 review:

- **#1 (low-medium)**: drop the hardcoded `_CTX_EMPTY_HINT_META` JS map. GET `/api/context/{skills,commands,agents}` now also returns `canonical_root: str` and `scanned_dirs: list[str]` (computed once at module import from `SKILL_DIRS` / `AGENT_DIRS` / `COMMAND_DIRS`), so the empty-state hint pulls from the wire instead of duplicating the detector layout client-side. The JS still has a one-line fallback (`.memtomem/${type}` + empty list) for older backends, but it never fires when the cache-bust takes effect.
- **#2 (cosmetic)**: skills sync response now uses the `CANONICAL_SKILL_ROOT = ".memtomem/skills"` constant directly, matching the `CANONICAL_AGENT_ROOT` / `CANONICAL_COMMAND_ROOT` pattern in agents/commands routes. Drops the `_safe_rel(canonical_skills_root(project_root), project_root)` round-trip. Both forms resolve to the same string today, but a constant won't drift if a future PR adds env / scope resolution to `canonical_skills_root()`.
- **#3 (low)**: import-no-runtimes toast renders `basename(project_root)` instead of the absolute path. The wire still carries the absolute string (consistent with `/api/system/*` and useful for debugging / reverse-proxy contexts) but the user-facing toast keeps it short, since `scanned_dirs` already gives full orientation. New `_ctxBasename()` helper handles the POSIX path split.

Cache-bust bumped to `context-gateway.js?v=3`.

Tests:

- New assertions on `TestListSkills.test_empty`, `TestListCommands.test_empty`, `TestListAgents.test_empty` verify GET response now includes `canonical_root` + `scanned_dirs`.
- `pytest -m "not ollama"` 3152 passed; `ruff check` + `ruff format --check` clean.

Skipped per reviewer guidance:

- #4 `Final[str]` → `Literal` / `StrEnum` (mypy is advisory).
- #5 additional positive route assertions for `invalid_name` / `already_imported` / `toml_parse_error` (route layer is a trivial dict comprehension; covered at the core layer).
- #6 Korean `<이름>` (`<name>`) literal angle-bracket (intentional placeholder guide, not an unsubstituted variable).

Co-Authored-By: Claude

* refactor(context): narrow reason_code to Literal SkipCode

Tightens the typing on the third element of `(name, reason, reason_code)` triples produced by `SkillSyncResult.skipped`, `CommandSyncResult.skipped`, `AgentSyncResult.skipped`, and `ExtractResult.skipped`. The new `memtomem.context._skip_reasons.SkipCode` is a `Literal` of the seven codes already enumerated in that module, so a typo at the construction site fails type-check instead of slipping through to the wire.

Also strengthens the import-empty web-route tests for skills, commands, and agents to assert `data["project_root"] == str(tmp_path)` instead of just key presence. This confirms the response actually echoes the fixture's project root rather than some unrelated path.

No runtime behavior change; addresses follow-up polish flagged in the review of this PR.

Co-Authored-By: Claude

---------

Co-authored-by: pandas-studio
Co-authored-by: Claude
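The skipped-item triple and its narrowed `Literal` type can be pictured roughly as follows. This is a sketch assembled from the commit messages above, not the project's module: the seven codes are the ones listed there, while `Skipped`, `to_wire`, and `to_cli_lines` are hypothetical names introduced for illustration.

```python
from typing import Literal, get_args

# The seven reason codes named in the commit message.
SkipCode = Literal[
    "no_canonical_root",
    "unknown_runtime",
    "parse_error",
    "invalid_name",
    "already_imported",
    "canonical_exists",
    "toml_parse_error",
]

# (name, reason, reason_code): human text plus a machine-readable code.
Skipped = tuple[str, str, SkipCode]


def to_wire(skipped: list[Skipped]) -> list[dict[str, str]]:
    """Hypothetical route-layer shape: expose the code alongside the text."""
    return [{"name": n, "reason": r, "reason_code": c} for n, r, c in skipped]


def to_cli_lines(skipped: list[Skipped]) -> list[str]:
    """Hypothetical CLI shape: discard the code so output stays unchanged."""
    return [f"skipped {name}: {reason}" for name, reason, _code in skipped]


items: list[Skipped] = [("demo", "no canonical skills", "no_canonical_root")]
print(to_wire(items))
print(to_cli_lines(items))
```

With the `Literal` in place, a type checker rejects a misspelled code such as `"no_canonical_rot"` where the triple is built, which is the failure mode the refactor targets.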

``IndexEngine.index_path_stream`` and the ``GET /api/index/stream`` endpoint were missing two contract elements that ``index_path`` / ``POST /api/index`` already had:

- the ``namespace`` parameter (silently dropped, so streamed indexes ignored user namespace selection);
- error aggregation in the ``complete`` event (per-file ``result["errors"]`` was summed only into chunk counters and never emitted, so partial-failure runs appeared successful in the UI).

Engine: thread ``namespace`` through to ``_index_file``, accumulate per-file errors, and include them as ``errors: list[str]`` in the ``complete`` event. This is the same loose shape as ``IndexingStats.errors``, so non-stream UI handlers can be reused verbatim. Path-prefix the stream-level uncaught-exception branch (``f"{fp.name}: {exc}"``) to match the non-stream ``asyncio.gather(return_exceptions=True)`` branch.

Route: forward the new ``namespace`` query param to the engine.

Tests: ``test_index_path_stream_namespace_propagates`` (chunks gain ``metadata.namespace`` for the streamed run) and ``test_index_path_stream_complete_errors_no_silent_drop`` (the binary file triggers a per-file error that surfaces in ``complete.errors``). The test names document intent so future grep finds the regression guards.

Out of scope: per-file streaming of errors (option ii/iii from the planning discussion); normalizing ``_index_file``'s mixed path-prefix convention. Both noted in the issue.

Unblocks #582 PR-B (Prev #1: Index tab two-button collapse) and informs #582 PR #6 (4.11: indexing in-flight visibility).

Refs #590.

Co-authored-by: pandas-studio
Co-authored-by: Claude
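The contract described above (namespace threaded through, per-file errors collected and emitted once in the final event) can be sketched as an async generator. This is an assumption-laden illustration rather than the engine's code: `fake_index_file`, the event field names other than `errors`, and the chunk counts are all hypothetical.

```python
import asyncio
from pathlib import PurePosixPath
from typing import Optional


def fake_index_file(path: str, namespace: Optional[str]) -> dict:
    """Hypothetical stand-in for a per-file indexer."""
    if path.endswith(".bin"):
        return {"chunks": 0, "namespace": namespace, "errors": [f"{path}: binary file"]}
    if "boom" in path:
        raise ValueError("unreadable")
    return {"chunks": 2, "namespace": namespace, "errors": []}


async def index_stream(paths, index_file, namespace=None):
    """Yield a progress event per indexed file, then one complete event."""
    errors: list[str] = []
    total_chunks = 0
    for path in paths:
        try:
            result = index_file(path, namespace)
        except Exception as exc:
            # Prefix uncaught errors with the file name, as the commit describes.
            errors.append(f"{PurePosixPath(path).name}: {exc}")
            continue
        total_chunks += result["chunks"]
        errors.extend(result["errors"])  # aggregate instead of dropping
        yield {"type": "progress", "file": path, "namespace": result["namespace"]}
    yield {"type": "complete", "total_chunks": total_chunks, "errors": errors}


async def collect(paths, namespace=None):
    return [event async for event in index_stream(paths, fake_index_file, namespace)]


events = asyncio.run(collect(["a.md", "b.bin", "dir/boom.md"], namespace="notes"))
print(events[-1])
```

Because the errors travel only in the final event, a consumer that ignores everything except `complete` still sees a partial failure.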

…592)

The folder-mode panel exposed two side-by-side buttons: primary ``Index`` (POST /api/index) and ghost ``Index with Progress`` (SSE stream). The two-button shape made users guess which one to pick and diluted the form's primary action. Drop the non-stream button entirely and route the single primary ``인덱스 / Index`` button through the streaming handler so progress is always visible.

This is option (a) from the umbrella discussion (Prev #1). Option (b), a single button plus a "show progress" toggle, would have kept two indexing paths alive and required a more complex ``STATE.indexing`` shape in #582 PR #6 (4.11). With (a) the indexing path is unified; PR #6 can use a single boolean flag.

Made possible by #591 (closes #590), which gave ``index_path_stream`` the missing ``namespace`` parameter and a ``complete.errors`` field, both consumed here. The streaming handler now matches the full UX of the removed inline POST handler:

- Reads ``namespace`` from the form input and forwards it as a query param when non-empty.
- On ``complete``: renders ``r-errors-row`` with the same 5-cap "+N more" rule that the inline handler used (#354 partial-failure surface), then dispatches either ``toast.index_partial`` (red, with error count + first error) or ``toast.indexed_count`` (green, with the "Register as Source" toast action; folder mode is one-shot, so this nudge mirrors the inline behavior).

Removes ``index.stream_btn``, ``toast.stream_complete``, and the ``.index-btn-row`` CSS rule (it now wraps a single button, so the row container is unnecessary). Updates ``test_i18n.py``'s required key set so the parity guard stays green.

Refs #582 (Prev #1).

Co-authored-by: pandas-studio
Co-authored-by: Claude
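The 5-cap "+N more" rule mentioned for the errors row is simple enough to sketch. The actual handler lives in the web client's JavaScript; the Python below is only an illustration of the rule, and `summarize_errors` is a hypothetical name.

```python
def summarize_errors(errors: list[str], cap: int = 5) -> list[str]:
    """Show at most `cap` errors, then a single "+N more" line for the rest."""
    shown = list(errors[:cap])
    hidden = len(errors) - len(shown)
    if hidden > 0:
        shown.append(f"+{hidden} more")
    return shown


sample_errors = [f"file{i}.bin: binary file" for i in range(7)]
for line in summarize_errors(sample_errors):
    print(line)
```

Capping the list keeps the result row readable on large partial failures while still telling the user how many errors were left out.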

tsdata merged commit 5074ba7 into main Apr 9, 2026

tsdata deleted the docs/restructure-readmes-with-guides branch April 9, 2026 13:40

memtomem mentioned this pull request Apr 28, 2026

feat(scheduler): mm schedule CLI + schedule_* mem_do actions (P2 Phase A.4) #522

Merged

4 tasks

memtomem mentioned this pull request Apr 29, 2026

fix(web): surface "why nothing happened" on context Sync/Import #549

Merged

4 tasks

This was referenced Apr 30, 2026

feat(web): Index tab — surface hard-to-discover behaviors in one bundle (static surfacing) #579

Closed

Streaming indexing endpoint missing namespace + errors parity #590

Closed

fix(indexing): stream endpoint namespace + errors parity (#590) #591

Merged

memtomem mentioned this pull request Apr 30, 2026

feat(web): collapse Index tab's two indexing buttons into one (#582) #592

Merged

4 tasks

This was referenced Apr 30, 2026

feat(web): global indexing-progress state + cross-tab indicator (#582) #602

Merged

RFC: CSRF and browser-origin guard for the local Web UI #787

Open

tsdata merged 1 commit into main from docs/restructure-readmes-with-guides
