feat(namespace): path-based policy rules for namespace auto-assignment #253

Merged
memtomem merged 3 commits into main from feat/namespace-policy-rules
Apr 18, 2026

@memtomem memtomem commented Apr 18, 2026

Summary

Adds NamespacePolicyRule to NamespaceConfig so users can declare path-glob → namespace mappings instead of passing namespace= on every mem_index call. Closes the practical gap in enable_auto_ns, which uses the immediate parent folder name and so produces opaque namespaces like <UUID> or subagents under auto-discovered roots (~/.claude/projects/<UUID>/...).

Resolution priority

explicit param → rules (first valid match) → enable_auto_ns → default_namespace

Rules are evaluated between the explicit-argument check and the enable_auto_ns fallback. Default is [], so existing users see no behavior change until they opt in via config.json or config.d/*.json.
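The priority chain above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `resolve_namespace`, its parameters, and the `(path_glob, template)` tuple shape are hypothetical, and `fnmatch` stands in for `pathspec.GitIgnoreSpec`, so glob semantics differ from the real matcher.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical stand-ins: a rule is a (path_glob, namespace_template) pair, and
# fnmatch replaces pathspec.GitIgnoreSpec, so glob semantics differ from the PR.
Rule = tuple[str, str]


def resolve_namespace(
    path: str,
    explicit: Optional[str],
    rules: list[Rule],
    enable_auto_ns: bool,
    default_namespace: str,
) -> str:
    # 1. An explicit namespace= argument always wins.
    if explicit:
        return explicit
    parent = PurePosixPath(path).parent.name
    # 2. Rules, in declaration order; the first valid match wins.
    for path_glob, template in rules:
        if not fnmatch(path.lower(), path_glob.lower()):
            continue
        if "{parent}" in template and not parent:
            continue  # empty {parent}: skip this rule and keep going
        return template.replace("{parent}", parent)
    # 3. enable_auto_ns falls back to the immediate parent folder name.
    if enable_auto_ns and parent:
        return parent
    # 4. Otherwise use the configured default.
    return default_namespace
```

With an empty rule list the function reduces to the pre-existing explicit, auto_ns, default chain, which is why the `[]` default leaves current behavior unchanged.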

Design decisions

  • Pattern engine: pathspec.GitIgnoreSpec, case-insensitive, matched against the absolute resolved file path with any leading / stripped. Same semantics as IndexingConfig.exclude_patterns so users learn one pattern language.
  • {parent} placeholder: expands to the matched file's immediate parent folder name. Only {parent} is supported in this release; unknown placeholders are rejected at load time so typos fail loudly at startup rather than silently skipping rules at index time.
  • Empty-parent fall-through: if {parent} would expand to an empty string at runtime, the rule is skipped (fall through to the next rule / auto_ns / default_namespace) with a one-time warning log per rule index.
  • APPEND merge: Annotated[list[NamespacePolicyRule], APPEND] so config.d/*.json fragments can contribute rules independently. Fragments concatenate in alphabetical filename order — use numeric prefixes (10-claude.json, 20-gdrive.json) for cross-fragment precedence.
  • Validation: non-empty namespace after strip, ≤ 128 chars, no ASCII control chars; non-empty path_glob with ~/ expanded at load time.
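The load-time checks in the list above might look roughly like this. It is a stdlib-only sketch: the function names and constants are hypothetical, the PR implements these checks as validators on `NamespacePolicyRule`, and returning the stripped namespace is an assumption.

```python
import os
import string

ALLOWED_PLACEHOLDERS = {"parent"}  # only {parent} is supported per the PR
MAX_NAMESPACE_LEN = 128


def validate_namespace(template: str) -> str:
    value = template.strip()  # assumption: the stripped value is what gets kept
    if not value:
        raise ValueError("namespace must be non-empty")
    if len(value) > MAX_NAMESPACE_LEN:
        raise ValueError("namespace exceeds 128 characters")
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        raise ValueError("namespace contains ASCII control characters")
    # string.Formatter().parse yields (literal, field_name, spec, conversion).
    for _, field, _, _ in string.Formatter().parse(value):
        if field is not None and field not in ALLOWED_PLACEHOLDERS:
            raise ValueError(f"unknown placeholder: {{{field}}}")
    return value


def normalize_path_glob(path_glob: str) -> str:
    if not path_glob.strip():
        raise ValueError("path_glob must be non-empty")
    # Expand a leading ~/ once, at load time.
    if path_glob.startswith("~/"):
        return os.path.expanduser(path_glob)
    return path_glob
```

Rejecting `{unknown}` here, rather than at index time, is what makes a typo surface at startup instead of as a rule that silently never fires.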

Incidental fixes to config merge machinery

This PR introduces the first list[BaseSettings] field in the codebase, which surfaced two gaps in load_config_d that are fixed inline. Both fixes are scoped to load time (~/.memtomem/config.json + ~/.memtomem/config.d/*.json merging) and do not touch the mutation paths (config_set CLI, PATCH /api/config, future Web UI editors).

  1. _dedup_key now handles dict / BaseSettings / nested list. Previously it only normalised Path → str and let everything else fall through to raw Python hashing, which raised TypeError: unhashable type: 'dict' the first time a list-of-objects field went through APPEND dedup. The new version recursively converts BaseSettings via model_dump(mode="json") and dicts into sorted tuples, so a native model instance and the raw JSON dict that produced it share a dedup key. Existing list[str] / list[Path] / list[float] fields are untouched.
  2. APPEND merge loop coerces JSON dicts to the declared item type before dedup. A non-validate_assignment BaseSettings lets setattr(section_obj, key, [<dict>]) succeed without coercing, so the merged list would contain raw dicts. The loop now introspects the field annotation, and if the item type is a BaseSettings subclass it calls item_type.model_validate(item) on each incoming dict before the dedup step. Invalid entries are logged and skipped rather than crashing the whole fragment.
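A recursive dedup key of the kind described in item 1 could be sketched like this. `dedup_key` and `append_unique` are illustrative names, not the project's `_dedup_key`. To stay stdlib-only the sketch omits the BaseSettings branch, which per the description would first call `model_dump(mode="json")` and then go through the dict branch.

```python
from pathlib import Path
from typing import Any


def dedup_key(value: Any) -> Any:
    """Return a hashable key so equal-looking items dedupe to one entry."""
    if isinstance(value, Path):
        return str(value)
    if isinstance(value, dict):
        # Sort by key so insertion order does not affect the result.
        return tuple(sorted((k, dedup_key(v)) for k, v in value.items()))
    if isinstance(value, (list, tuple)):
        return tuple(dedup_key(v) for v in value)
    return value  # str, int, float, ... are already hashable


def append_unique(existing: list, incoming: list) -> list:
    """APPEND-style merge: keep order, drop items whose key was seen."""
    seen = {dedup_key(item) for item in existing}
    merged = list(existing)
    for item in incoming:
        key = dedup_key(item)
        if key not in seen:
            seen.add(key)
            merged.append(item)
    return merged
```

Putting a raw dict into a set is what raises the `TypeError` quoted in item 1. Converting to sorted tuples first avoids it and makes key order irrelevant.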

Dedup semantics freeze (commit 3): _dedup_key intentionally hashes post-validator field values, so a rule written as ~/foo/** and the same rule written as the expanded absolute path /Users/X/foo/** collide after construction and dedupe to one entry. A new test (test_config_d_namespace_rules_dedup_after_home_expansion) pins this contract so a future refactor that moved ~/ expansion out of the validator — or moved the dedup key off model_dump — would fail at PR time rather than surfacing as a confused bug report from a user who split their rules across two config.d fragments.
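The contract pinned by that test can be illustrated with a small stand-in: validate first, then key on the validated values. This is a hedged sketch; `load_rule` and `merge_rules` are hypothetical helpers using plain dicts in place of `NamespacePolicyRule` instances.

```python
import os


def load_rule(raw: dict) -> dict:
    # Stand-in for the model validator: expand a leading ~/ at construction.
    glob = raw["path_glob"]
    if glob.startswith("~/"):
        glob = os.path.expanduser(glob)
    return {"path_glob": glob, "namespace": raw["namespace"]}


def merge_rules(fragments: list[list[dict]]) -> list[dict]:
    merged, seen = [], set()
    for fragment in fragments:  # assumed to be in filename order already
        for raw in fragment:
            rule = load_rule(raw)  # validate first...
            key = tuple(sorted(rule.items()))  # ...then key on validated values
            if key not in seen:
                seen.add(key)
                merged.append(rule)
    return merged
```

If the key were computed from `raw` instead of `rule`, the `~/foo/**` and expanded-absolute spellings would produce two entries. That is the regression the test is meant to catch.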

Out of scope: the mutation path still lacks nested-BaseModel support — coerce_and_validate in config.py can't consume a list[BaseSettings] today, so config_set namespace.rules ... is rejected and a Web UI editor can't persist rules via PATCH /api/config yet. Fixing that is a separate PR (see Intentionally deferred below).

Intentionally deferred

  • Web UI rules editor — blocked on the coerce_and_validate mutation-path gap above; attempting the editor before that fix would re-discover the same list[BaseSettings] problem on the mutate side.
  • config_set namespace.rules CLI — same root cause as Web UI editor.
  • {match:N} capture group placeholder: {parent} covers the common cases; add if evidence accrues.
  • Rule priority: int field — alphabetical fragment ordering + in-fragment declaration order cover 99% of cases.
  • Rule debugging CLI (mm namespace test <path>): mm config show is enough for now.

Each is listed in the plan file's "Out of scope" block so the bounds are explicit.

Test plan

  • 12 new TestNamespacePolicyRules cases (core 7 + edge 5): explicit-wins, rule-match, first-match-wins, rules-beat-auto_ns, {parent} substitution, ~/ expansion, empty-parent fall-through (with log-once assertion), case-insensitive matching, literal template, unknown-placeholder rejection, length-limit rejection.
  • 3 test_config_overrides.py cases covering APPEND merge, the alphabetical-order contract, and the home-expansion dedup freeze.
  • Existing TestResolveNamespace (5 tests) — regression-free.
  • Existing test_every_list_field_declares_merge_strategy — passes for the new field.
  • Full uv run pytest -m "not ollama": 1651 passed (including the 15 new tests), 46 deselected.
  • uv run ruff check + uv run ruff format --check.
  • uv run mypy packages/memtomem/src/memtomem/config.py packages/memtomem/src/memtomem/indexing/engine.py — no issues.

Docs in the same PR

  • docs/guides/configuration.md — "Namespace rules (path-based auto-tagging)" subsection under Namespace: JSON example, full semantics, "Verifying your rules" block.
  • docs/guides/user-guide.md — cross-reference from the Auto-namespace section so readers land on the rules doc when the parent folder heuristic isn't enough.

Adds a rule-based layer to NamespaceConfig so users can declare
path-glob -> namespace mappings instead of passing `namespace=` on
every `mem_index` call, and to escape the limitation of `enable_auto_ns`
(which uses the immediate parent folder name and so produces opaque
namespaces like `<UUID>` or `subagents` under auto-discovered roots).

The resolution priority is now:
  explicit param -> rules (first valid match) -> enable_auto_ns -> default

path_glob uses gitignore syntax via `pathspec.GitIgnoreSpec` -- the same
engine as `IndexingConfig.exclude_patterns`, with leading `~/` expanded
at load time and case-insensitive matching against the absolute resolved
file path. The namespace template supports the `{parent}` placeholder
which expands to the matched file's immediate parent folder name.

Validation decisions:

- Unknown placeholders (e.g. `{unknown}`) are rejected at load time so
  typos fail loudly at startup rather than silently skipping rules at
  index time.
- A rule whose `{parent}` would expand to an empty string is skipped at
  runtime (fall through to the next rule / auto_ns / default) with a
  one-time warning log per rule index.
- Non-empty namespace, <= 128 chars, no ASCII control characters.

Rules is an `Annotated[list[NamespacePolicyRule], APPEND]` field so
config.d/*.json fragments can contribute rules independently. Fragments
concatenate in alphabetical filename order -- use numeric prefixes
(`10-claude.json`, `20-gdrive.json`) for cross-fragment precedence.

To make APPEND merge work for a list-of-BaseSettings field, `_dedup_key`
now handles dict / BaseSettings / nested list, and the fragment loader
coerces JSON dicts into the declared item type before dedup so the
resulting list contains fully-validated model instances.

Default is []; backwards compatible for existing users.

Web UI editor and `config_set` CLI support are intentionally deferred
-- list-of-objects editing is a separate architectural concern that
doesn't need to block the data-model change.

Tests: 12 TestNamespacePolicyRules cases (core path + edge cases
including unknown-placeholder rejection, empty-parent fall-through,
case-insensitive matching, literal namespace without placeholder,
length-limit rejection) plus 2 config.d fragment tests covering APPEND
merge and the alphabetical-order contract.

Adds a "Namespace rules (path-based auto-tagging)" subsection to the
configuration guide covering semantics (gitignore syntax, case
sensitivity, first-match-wins, `{parent}` placeholder, APPEND merge in
alphabetical filename order) and a "Verifying your rules" block so
users know how to confirm a rule fired (`mm config show`, force
re-index, `mm session list` or Web UI `/#sources`, namespace labels in
search output).

Cross-references the new section from the user guide's Auto-namespace
section so readers looking at `enable_auto_ns` discover rules as the
preferred path when the immediate parent folder name is opaque.
… path_glob

Pins the contract that ``_dedup_key`` uses ``BaseSettings.model_dump``
post-validator values: a rule written as ``~/foo/**`` and the same rule
written in its expanded absolute form collide after construction and
collapse to one. A future refactor that moved the ``~/`` expansion out
of the validator (or changed the dedup key off ``model_dump``) would
start emitting two rules and fail here -- making the behavior change
visible at PR time rather than via a surprised bug report from a user
who split their rules across two config.d fragments.
@memtomem memtomem merged commit 3e799af into main Apr 18, 2026
7 checks passed
@memtomem memtomem deleted the feat/namespace-policy-rules branch April 18, 2026 12:34
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 18, 2026