feat(indexing,security): add exclude_patterns + built-in secret denylist#226

Merged
memtomem merged 1 commit into main from feat/exclude-patterns on Apr 18, 2026

@memtomem
Owner

Summary

Implements PR (a) of #225. Adds a two-tier exclude system for indexing:

  • Built-in denylist: _BUILTIN_SECRET_PATTERNS (oauth_creds.json,
    credentials*, id_rsa*, *.pem, *.key, .ssh/**) and
    _BUILTIN_NOISE_PATTERNS (.claude/**/*.meta.json) are always applied;
    user config cannot disable or negate them.
  • User slot — new IndexingConfig.exclude_patterns: list[str]
    (gitignore syntax via pathspec). Registered in MUTABLE_FIELDS +
    FIELD_CONSTRAINTS so CLI (mm config set), MCP (mem_config), and the
    future Web UI share one validation path.
  • Built-in and user specs are separate PathSpec instances — a user
    !negation only affects the user spec, so security invariants hold
    regardless of user config.
  • Case-insensitive matching (lowercase on both pattern and candidate).
  • _EXCLUDED_DIRS extended with .aws, .ssh, .gnupg for directory-
    level secret stores.
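
The two-tier evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it uses the standard library's fnmatch as a stand-in for pathspec so that it runs without dependencies, which means real gitignore semantics (for example `**/` matching zero directories) are only approximated. The pattern lists are copied from this description; the helper names `_matches`, `_user_excluded`, and `is_excluded` are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Pattern lists copied from the PR description; all function names are illustrative.
BUILTIN_SECRET_PATTERNS = [
    "oauth_creds.json", "credentials*", "id_rsa*", "*.pem", "*.key", ".ssh/**",
]
BUILTIN_NOISE_PATTERNS = [".claude/**/*.meta.json"]


def _matches(pattern: str, path: str) -> bool:
    """Case-insensitive glob match: lowercase both pattern and candidate."""
    pattern, path = pattern.lower(), path.lower()
    name = PurePosixPath(path).name
    return (
        fnmatchcase(path, pattern)              # anchored at the root
        or fnmatchcase(path, "*/" + pattern)    # anywhere below the root
        or fnmatchcase(name, pattern)           # bare file name
    )


def _user_excluded(path: str, user_patterns: list[str]) -> bool:
    """User tier: last matching pattern wins, and '!' re-includes."""
    excluded = False
    for pat in user_patterns:
        negated = pat.startswith("!")
        body = pat[1:] if negated else pat
        if _matches(body, path):
            excluded = not negated
    return excluded


def is_excluded(path: str, user_patterns: list[str]) -> bool:
    """Built-in tier is checked first and never sees user negations."""
    builtin = BUILTIN_SECRET_PATTERNS + BUILTIN_NOISE_PATTERNS
    if any(_matches(p, path) for p in builtin):
        return True
    return _user_excluded(path, user_patterns)
```

Because the built-in tier returns before any user pattern is consulted, a call such as `is_excluded(".gemini/oauth_creds.json", ["!**/oauth_creds.json"])` still reports the file as excluded, while a user negation like `!notes/*.md` can re-include files that only the user tier had excluded.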

New CLI

mm purge --matching-excluded            # dry-run
mm purge --matching-excluded --apply    # actually delete

Scans storage.get_all_source_files() against the current exclude set
(built-in + user) and deletes matching chunks via delete_by_source.
Needed because the denylist applies only to new indexing runs, not
retroactively; existing chunks for secret files would otherwise remain.
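
The dry-run versus --apply behavior might look something like the sketch below. Only the method names get_all_source_files and delete_by_source come from this description; their signatures, the synchronous style, the in-memory storage class, and the function purge_matching_excluded are assumptions made for illustration.

```python
from typing import Callable


class InMemoryStorage:
    """Hypothetical stand-in for the real storage backend."""

    def __init__(self, chunks: dict[str, list[str]]) -> None:
        self._chunks = dict(chunks)

    def get_all_source_files(self) -> list[str]:
        return sorted(self._chunks)

    def delete_by_source(self, source: str) -> int:
        # Returns how many chunks were removed for this source file.
        return len(self._chunks.pop(source, []))


def purge_matching_excluded(
    storage: InMemoryStorage,
    is_excluded: Callable[[str], bool],
    apply: bool = False,
) -> list[str]:
    """Return sources matching the exclude set; delete them only if apply=True."""
    matches = [s for s in storage.get_all_source_files() if is_excluded(s)]
    if apply:
        for source in matches:
            storage.delete_by_source(source)
    return matches
```

Keeping deletion behind an explicit flag means the default invocation can only report what would be removed, which matches the dry-run default described above.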

Scope notes

Test plan

  • uv run ruff check packages/memtomem/src packages/memtomem/tests
  • uv run ruff format --check packages/memtomem/src packages/memtomem/tests
  • uv run pytest -m "not ollama" — 1575 passed, 46 deselected
  • 17 new tests covering:
    • Built-in secret & noise patterns block at discovery
    • User patterns honored (positive + negation)
    • Security regression: user !**/oauth_creds.json cannot unset
      built-in secret patterns
    • Case-insensitive matching
    • Directory-level .aws/.ssh/.gnupg excluded
    • mm purge --matching-excluded dry-run does not delete;
      --apply calls delete_by_source
    • No-match branch reports clean
  • Manual: mm purge --matching-excluded against a dev DB with existing
    Claude Code subagent meta chunks — verify sample output and --apply
    removal.

Migration note (in CHANGELOG)

Existing users with auto-discovered AI tool directories in memory_dirs
should run mm purge --matching-excluded to clean up chunks that predate
this change. Without it, credential chunks already stored remain
searchable and exportable.

Closes #225 (PR (a)).

…ist (#225)

Files matching supported_extensions inside auto-discovered memory_dirs
were indexed without filtering, including credential files in well-known
AI tool directories and tool-generated metadata. Add a two-tier exclude
system:

- _BUILTIN_SECRET_PATTERNS and _BUILTIN_NOISE_PATTERNS (module constants,
  always applied; user negation cannot override).
- IndexingConfig.exclude_patterns (user-configurable, gitignore syntax
  via pathspec, evaluated as a separate PathSpec so built-in invariants
  hold regardless of user config).

Case-insensitive matching via lowercase on both pattern and candidate.
Directory-level denylist extended with .aws, .ssh, .gnupg.
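
As a rough illustration of directory-level exclusion, the sketch below prunes denied directories during a walk so their contents are never visited. It is an assumption about how such pruning could work, not the project's code: the set contains only the three names this commit adds, and the function name discover and its extensions parameter are hypothetical.

```python
import os

# Only the three directory names this commit adds; the real _EXCLUDED_DIRS has more.
EXCLUDED_DIRS = {".aws", ".ssh", ".gnupg"}


def discover(root: str, extensions: tuple[str, ...] = (".md", ".json")) -> list[str]:
    """Walk root and return matching files, skipping denied directories entirely."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if name.lower().endswith(extensions):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)
```

Pruning at the directory level avoids reading anything inside a secret store, rather than matching each file under it against a pattern.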

mm purge --matching-excluded scans storage for already-indexed files
matching the current exclude set; dry-run default, --apply to delete.

Registered via MUTABLE_FIELDS + FIELD_CONSTRAINTS so CLI/MCP/future
Web UI share one validation path.

Closes #225 (Web UI editor is a follow-up).

Generated with Claude Code.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
@memtomem added the enhancement and security labels Apr 18, 2026
@memtomem merged commit 8f440c3 into main Apr 18, 2026
7 checks passed
@memtomem deleted the feat/exclude-patterns branch April 18, 2026 05:09
@github-actions (bot) locked and limited conversation to collaborators Apr 18, 2026

Development

Successfully merging this pull request may close these issues.

feat(indexing,security): user-configurable exclude_patterns + built-in secret denylist