feat: opt-in provider memory dirs + scope narrowing per official docs #292
Replace the silent `ensure_auto_discovered_dirs` runtime path with an
explicit `mm init` wizard step (Step 4 of 10: "Provider memory folders")
and a non-interactive `--include-provider {claude-memory,claude-plans,codex}`
flag. Accepted categories land directly in `indexing.memory_dirs`.
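A repeatable opt-in flag like the one described above could be sketched as follows. This is a minimal illustration using `argparse`, not the project's actual CLI code: the three category names and the `-y` flag come from this PR, while `build_parser`, `merge_memory_dirs`, and the parser layout are hypothetical.

```python
import argparse

# Category names are taken from the PR text; the parser layout and the
# helper names below are hypothetical illustrations.
PROVIDER_CATEGORIES = ("claude-memory", "claude-plans", "codex")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mm init")
    parser.add_argument("-y", dest="yes", action="store_true", help="non-interactive mode")
    parser.add_argument(
        "--include-provider",
        action="append",  # the flag may be passed more than once
        choices=PROVIDER_CATEGORIES,
        default=[],
        help="opt in to one provider memory category (repeatable)",
    )
    return parser


def merge_memory_dirs(existing: list[str], accepted: list[str]) -> list[str]:
    """Append accepted dirs to the existing list, keeping order and dropping duplicates."""
    merged = list(existing)
    for path in accepted:
        if path not in merged:
            merged.append(path)
    return merged


args = build_parser().parse_args(
    ["-y", "--include-provider", "claude-memory", "--include-provider", "codex"]
)
```

With no `--include-provider` flag, `args.include_provider` stays an empty list, which matches the "no flag, no provider dirs" behaviour the PR describes.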
Auto-discovery scope is narrowed to each provider's canonical memory
surface (verified against the official documentation):
- Claude Code per-project memory: `~/.claude/projects/<project>/memory/`
only (not the whole projects tree with session JSONL transcripts and
staging/). Subdirs without any *.md files are skipped so empty session
scaffolding doesn't pollute the index.
- Claude Code plans: `~/.claude/plans/`.
- Codex CLI memories: `~/.codex/memories/`.
Gemini CLI is removed from auto-discovery — its memory is the single
file `~/.gemini/GEMINI.md` (incompatible with the directory-based
`memory_dirs` abstraction) and the parent dir contains secrets like
`oauth_creds.json`. `mm ingest gemini-memory` remains as the supported
one-shot path for Gemini users.
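The narrowed detection scope can be sketched roughly as below. The paths and category names come from this PR; the function name `detect_provider_dirs`, its return shape, and the non-recursive `*.md` check are assumptions made for illustration and may differ from the real `_detect_provider_dirs`.

```python
from pathlib import Path


def detect_provider_dirs(home: Path) -> dict[str, list[Path]]:
    """Return the canonical provider dirs that exist under home, keyed by category."""
    found: dict[str, list[Path]] = {}

    # Claude Code per-project memory: keep only memory/ subdirs that hold
    # at least one *.md file (assumed here to be a top-level check).
    projects = home / ".claude" / "projects"
    if projects.is_dir():
        memory_dirs = [
            project / "memory"
            for project in sorted(projects.iterdir())
            if (project / "memory").is_dir() and any((project / "memory").glob("*.md"))
        ]
        if memory_dirs:
            found["claude-memory"] = memory_dirs

    # Single-directory categories.
    for category, rel in (("claude-plans", ".claude/plans"), ("codex", ".codex/memories")):
        candidate = home / rel
        if candidate.is_dir():
            found[category] = [candidate]

    # ~/.gemini is never scanned in this sketch, mirroring its removal
    # from auto-discovery.
    return found
```

Categories whose directories do not exist are simply absent from the result, so a caller can skip them without prompting.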
`indexing.auto_discover` becomes a one-shot migration trigger. For
existing installs with the flag True AND a config.json on disk, the next
startup appends canonical provider dirs to `memory_dirs`, flips the flag
to False, and persists both atomically. Brand-new installs (no
config.json yet) skip migration entirely; the wizard is the only path
that adds provider dirs.
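The one-shot migration described above might look like the following sketch. It assumes a plain JSON `config.json` with an `indexing` object; the function name `migrate_auto_discover_once`, the temp-file-plus-`os.replace` write, and the omission of factory-default `memory_dirs` are illustrative assumptions, not the project's actual implementation.

```python
import json
import os
import tempfile
from pathlib import Path


def migrate_auto_discover_once(config_path: Path, discovered: list[str]) -> bool:
    """Run the one-shot migration; return True if the config file was rewritten."""
    if not config_path.exists():
        return False  # brand-new install: only the wizard adds provider dirs

    config = json.loads(config_path.read_text(encoding="utf-8"))
    indexing = config.setdefault("indexing", {})
    if not indexing.get("auto_discover", True):  # a missing flag counts as default True
        return False  # already migrated, or the user opted out

    # Keep user-set entries and append newly discovered dirs without duplicates.
    memory_dirs = indexing.setdefault("memory_dirs", [])
    for path in discovered:
        if path not in memory_dirs:
            memory_dirs.append(path)
    indexing["auto_discover"] = False

    # Write both changes to a temp file in the same directory, then swap it
    # into place so readers never see a half-written config.
    fd, tmp_name = tempfile.mkstemp(dir=config_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    os.replace(tmp_name, config_path)
    return True
```

Because the flag is flipped in the same write that records the dirs, a second call finds `auto_discover: false` and returns without touching the file.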
Manual verification across four paths (fresh install, skip-wizard,
legacy migration, idempotent re-run) passed; 1837 tests green.
Co-Authored-By: Claude <[email protected]>
Review (Opus 4.7)
Overall, this is a solid design.

🔴 Blocker: CI lint failure
The PR body says "ruff check / ruff format --check clean", but CI reports that 2 files need reformatting:
`uv run ruff format packages/memtomem/tests/test_config_overrides.py packages/memtomem/tests/test_init_cmd.py`
This is a mechanical fix with no logic changes.

Code quality: the good parts

Minor comments (non-blocking)

Test coverage: 1 gap
The 8 new tests cover the wizard branches. Missing: a legacy install where the user has explicitly set `memory_dirs`.

    def test_migration_preserves_existing_user_memory_dirs(...):
        override_path.write_text(
            json.dumps({"indexing": {"memory_dirs": ["/user/explicit/path"]}}),
            encoding="utf-8",
        )
        # assert both /user/explicit/path AND fake provider dir are persisted

Docs
configuration.md / getting-started.md (step 9→10 renumber) / reference.md / README / CHANGELOG are all updated.

Wrap-up
A lint fix plus one migration-preservation test resolves every concern. The rest can be handled in follow-ups. 🤖 Generated with Claude Code
Addresses review feedback on #292:
- ruff format on tests/test_config_overrides.py and tests/test_init_cmd.py (CI ran `ruff format --check` against the whole repo including tests/; the earlier local check was scoped to src/ only and missed 3 trivial json.dumps/loads line-collapse differences in the new tests).
- Add test_migration_preserves_existing_user_memory_dirs, which directly exercises the ``_persist_auto_discover_migration`` preservation invariant for the most common legacy install shape: user-set ``memory_dirs`` alongside the default-True ``auto_discover`` flag. Asserts both in-memory and persisted state keep the user entry while appending the newly-discovered provider dir.
- Replace the vague "Pre-Z behaviour" phrasing in the ``_migrate_auto_discover_once`` docstring with "Releases 0.1.11 and earlier" so future readers don't need to decode the codename.
No behaviour changes: format + test + doc only. 1838 tests green.
Co-Authored-By: Claude <[email protected]>
Thanks for the review. I addressed the Blocker and Medium items in one pass.
- Blocker (ruff format):
- Medium (preservation test):
- Low ("Pre-Z" docstring):
- Non-blocking (mixed path formats): I'll leave this as is intentionally. The user dir is the raw text that the user entered in the wizard.
- Non-blocking (symlink following): comment noted only.
Local re-verification: 1838 tests green; ruff check + format --check + mypy all clean. Please re-check CI.
Summary
Replace silent `ensure_auto_discovered_dirs` runtime indexing with an explicit opt-in wizard step, narrow scope to canonical memory surfaces per each provider's official docs, and migrate legacy `auto_discover=True` installs transparently.

Motivation
The previous default silently indexed three provider home directories on every startup:
- `~/.claude/projects/`: contains huge per-session `.jsonl` transcripts (1MB+ each, 200+ files) plus `staging/`, not just memory
- `~/.gemini/`: contains `oauth_creds.json`, browser profile dirs, `history/`, `tmp/`; most of it isn't memory and some is sensitive
- `~/.codex/memories/`: this one was correctly scoped

Users were never asked. The scope was much wider than intended even though `supported_extensions` filtered most transcript/JSON noise: the directory walks were still happening on every index/watch.

Changes
Wizard (user-facing)
New Step 4 "Provider memory folders": detects canonical paths on disk and prompts per category. Non-existent categories are skipped silently.
Non-interactive: `mm init -y --include-provider claude-memory --include-provider codex` (repeatable flag).

Narrowed scope (verified against official docs)
- `~/.claude/projects/<project>/memory/` (with `*.md` content)
- `~/.claude/plans/`
- `~/.codex/memories/`

Gemini CLI removed. Its memory is a single file, `~/.gemini/GEMINI.md`, that doesn't fit the `memory_dirs` directory abstraction; the parent dir contains `oauth_creds.json`. `mm ingest gemini-memory` (one-shot import) is unchanged for Gemini users.

Migration
`indexing.auto_discover` becomes a one-shot migration trigger. For existing installs with the flag True (default or explicit) AND a `config.json` on disk, the next startup appends the canonical provider dirs to `memory_dirs`, flips the flag to False, and persists both atomically.

Brand-new installs (no `config.json`) skip migration entirely: the wizard is the only path that adds provider dirs. Subsequent startups see `auto_discover: false` → no-op.

Verification
- `ruff check` / `ruff format --check` clean, `mypy` 0 errors across 193 source files.
- `TestProviderDirsStep` (wizard step): no-detect skip, mixed accept/reject, empty-subdir filter, per-project enumeration via single prompt, `state["provider_dirs"]` → `memory_dirs` merge with dedup + `auto_discover: false` pin.
- `TestIncludeProviderFlag` (non-interactive): category-to-dir resolution, silent skip when unavailable, no-flag = no provider dirs.
- `MEMTOMEM_INDEXING__AUTO_DISCOVER=false` short-circuits.
- `_detect_provider_dirs` structural tests: fixed categories, excludes Gemini, filters empty Claude memory subdirs, finds plans + codex.
- Manual verification (`HOME=/tmp/...`):
  - `--include-provider codex` → opt-in works, only selected dirs in `memory_dirs`, `auto_discover: false` persisted ✓
  - (no `config.json`) → factory default only, no auto-indexing despite fake `~/.claude/projects/<proj>/memory/` present ✓
  - `auto_discover: true` → all 3 canonical categories migrated, full list persisted (factory + discovered), flag flipped ✓

Test plan
`mm index --rebuild` after upgrade for cleanest index state)

Breaking change notes
- Installs that previously indexed `~/.gemini/` will lose that surface. Documented in CHANGELOG + configuration.md; `mm ingest gemini-memory` is the replacement.
- `~/.claude/projects/` was previously walked wholesale (transcripts skipped only by `supported_extensions`, not excluded at walk time); now only the `*/memory/` subdirs appear in `memory_dirs`. Index entries for paths no longer in `memory_dirs` won't be auto-pruned, so running `mm index --rebuild` post-upgrade is recommended for the cleanest state.
- The `indexing.auto_discover` field is deprecated; removal is scheduled for a future minor.

🤖 Generated with Claude Code