RFC: make auto-discover of well-known memory dirs overridable (currently unconditional)

## Problem

`ensure_auto_discovered_dirs` (`packages/memtomem/src/memtomem/config.py:876`) runs unconditionally in `packages/memtomem/src/memtomem/server/component_factory.py:48` after `load_config_overrides`, augmenting `memory_dirs` with three hardcoded paths whenever they exist on disk:

- `~/.claude/projects` — Claude Code
- `~/.gemini` — Gemini CLI
- `~/.codex/memories` — Codex CLI

No env var, flag, or config key disables this. The docstring frames it as intentional ("user overrides don't suppress auto-discovery").

For workflows where these directories contain sensitive content (Gemini OAuth tokens, Claude Code session/conversation data, Codex memories), users currently cannot opt out of indexing them without removing the directory itself — which breaks the owning CLI. Removing entries from `memory_dirs` in `config.json` is silently reverted on every server start.
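The revert-on-restart behavior can be illustrated with a minimal sketch. This is not the actual body of `ensure_auto_discovered_dirs`; the helper name `augment_with_discovered` and its `candidates` parameter are hypothetical, and only the three paths come from the list above.

```python
from pathlib import Path

# The three well-known paths listed in this issue.
WELL_KNOWN_DIRS = ("~/.claude/projects", "~/.gemini", "~/.codex/memories")


def augment_with_discovered(memory_dirs, candidates=WELL_KNOWN_DIRS):
    """Hypothetical stand-in: return memory_dirs plus every candidate that exists on disk."""
    result = [Path(d).expanduser() for d in memory_dirs]
    for candidate in candidates:
        path = Path(candidate).expanduser()
        # No opt-out is consulted here: existence on disk is the only condition.
        if path.is_dir() and path not in result:
            result.append(path)
    return result
```

Under this model, a user who deletes `~/.gemini` from `memory_dirs` gets it back on the next start, because the augmentation runs after the overrides are loaded and only checks whether the directory exists.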

## Current defense-in-depth status

The upcoming 0.1.10 release closes the specific leak path that combined auto-discover with an unguarded `IndexEngine.index_file`. PRs #225, #226, and #252 are already on `main`; tag + PyPI publish pending.

Once 0.1.10 ships, the engine refuses to index files matching the exclude filter (built-in denylist + user `indexing.exclude_patterns`, against both absolute paths and memory-dir-relative paths) at the `IndexEngine.index_file` entry point.
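As a rough sketch of what "against both absolute paths and memory-dir-relative paths" means, assuming simple shell-style glob matching: the real filter may use different matching semantics, and `is_excluded` along with the example patterns and file names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(file_path, memory_dir, patterns):
    """Hypothetical check: match patterns against the absolute and the relative path."""
    absolute = PurePosixPath(file_path)
    candidates = [str(absolute)]
    try:
        candidates.append(str(absolute.relative_to(memory_dir)))
    except ValueError:
        pass  # file is not under memory_dir; only the absolute form applies
    return any(fnmatchcase(c, p) for c in candidates for p in patterns)
```

With these assumptions, `is_excluded("/home/u/.gemini/oauth_creds.json", "/home/u/.gemini", ["oauth_creds.json"])` matches through the relative form, while a pattern such as `*/.gemini/oauth*` matches through the absolute form. A file whose name no pattern enumerates passes through, which is the gap an explicit directory-level opt-out would close.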

Auto-discover coupling remains a defense-in-depth concern independent of that fix: any future engine-layer regression immediately re-opens the leak, and the exclude filter only covers what its patterns already enumerate. An explicit opt-out would give users agency over which directories are indexed at all, independent of filename heuristics.

## Three candidate directions

### 1. Allowlist default

`memory_dirs` starts empty; users add directories explicitly. Auto-discover is removed entirely.

- ✅ Cleanest semantics. No magic.
- ❌ Biggest UX break. Every user must configure once before indexing does anything useful.

### 2. JIT consent

First encounter of a well-known directory prompts for approval; the decision is stored in config.

- ✅ Preserves the current first-launch UX for interactive setups.
- ❌ Adds friction for headless / CI environments. Prompt UX needs a non-interactive fallback policy.
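One possible shape for the consent flow, including a non-interactive fallback, is sketched below. Everything here is an assumption for discussion: the function name `resolve_consent`, the in-memory `stored` mapping standing in for persisted config, and the choice of "deny without persisting" as the headless default.

```python
import sys


def resolve_consent(directory, stored, ask=input, interactive=None, headless_default=False):
    """Hypothetical consent check; `stored` stands in for decisions saved in config."""
    if directory in stored:
        return stored[directory]  # decided earlier; never prompt twice
    if interactive is None:
        interactive = sys.stdin.isatty()
    if not interactive:
        # Headless / CI: apply the policy default without persisting it,
        # so a later interactive run can still ask the user.
        return headless_default
    answer = ask(f"Index {directory}? [y/N] ").strip().lower()
    stored[directory] = answer in ("y", "yes")
    return stored[directory]
```

The open design question is the value of `headless_default`: `False` is the conservative choice, but it means CI setups index nothing from well-known directories unless the decision is pre-seeded in config.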

### 3. Memory vs. watch split

Split `memory_dirs` (intentional, user-curated) from `watch_dirs` (auto-discovered, narrower permissions). Auto-discover populates only the latter.

- ✅ Cleanest conceptual boundary between "things the user wants indexed" vs. "things the system was told to notice".
- ⚠️ Moderate UX change. Requires rethinking the indexing-permissions model (e.g., should `watch_dirs` entries have a denylist-by-default?).
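A minimal sketch of how the split might look, assuming a deny-by-default rule for `watch_dirs`. The `DirConfig` class, the helper names, and the `approved_watch` parameter are all hypothetical; only the `memory_dirs` and `watch_dirs` names come from the proposal above.

```python
from dataclasses import dataclass, field


@dataclass
class DirConfig:
    memory_dirs: list = field(default_factory=list)  # user-curated, always indexed
    watch_dirs: list = field(default_factory=list)   # auto-discovered, restricted


def record_discovered(config, discovered):
    """Auto-discover writes only to watch_dirs and never touches memory_dirs."""
    for directory in discovered:
        if directory not in config.memory_dirs and directory not in config.watch_dirs:
            config.watch_dirs.append(directory)
    return config


def indexable_dirs(config, approved_watch=frozenset()):
    """Assumed policy: watch_dirs entries are indexed only once explicitly approved."""
    return config.memory_dirs + [d for d in config.watch_dirs if d in approved_watch]
```

In this sketch a discovered directory is visible to the user (it appears in `watch_dirs`) but contributes nothing to the index until approved, and removing it from `memory_dirs` is never reverted.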

## Trade-off summary

All three options require migration paths. Option 1 offers the cleanest semantics at the cost of first-launch UX. Option 2 preserves first-launch UX but needs a headless/CI prompt policy to ship. Option 3 restructures the permissions model for a cleaner conceptual boundary, with moderate UX change. No fixed maintainer leaning yet — community input on these trade-offs welcome.

## Out of scope for this issue

- The security fix itself (already merged to `main` for the 0.1.10 release; tag and PyPI publish pending)
- Namespace / tagging changes for auto-discovered dirs (separate ingest-pipeline concern)
- Changes to the built-in denylist default contents

## Cross-refs

- PR #252 — entry-point guard (part of the 0.1.10 security fix)
- PR #225 / PR #226 — built-in denylist + `indexing.exclude_patterns` (part of the 0.1.10 security fix)
- Related investigation: #261

