fix(config): lock provider category vocabulary (RFC #304 prep) #313
Merged
Until RFC #304 settles the vendor/product hierarchy for memory_dir categorization, extending ``_PROVIDER_CATEGORY_PATTERNS`` silently expands the returned vocabulary and pre-empts the RFC. Mirror the existing ``_VALID_PRESET_PLACEHOLDERS`` lock in ``cli/init_cmd.py`` on the category side so both axes are enforced, not just one:

- ``_VALID_PROVIDER_CATEGORIES`` frozenset enumerating the approved four categories (``user``, ``claude-memory``, ``claude-plans``, ``codex``)
- ``_VOCABULARY_LOCK_MESSAGE`` constant holding the assertion string so the RFC-reference pin test can verify the exact message instead of ``inspect.getsource`` scanning, which would false-positive on unrelated ``#304`` mentions elsewhere in ``config.py``
- Import-time ``assert`` that the derived category set (``cat for cat, _ in _PROVIDER_CATEGORY_PATTERNS`` unioned with the ``"user"`` fallback) matches the declared vocabulary; placed before the ``PROVIDER_DIR_CATEGORIES`` derivation so the failure surfaces at import rather than at a later call site
- Two pin tests in ``test_init_cmd.py`` next to the existing ``TestCategorizeMemoryDir`` / ``TestDetectProviderDirsRoundtrip`` suites (no ``test_config.py`` in this repo; follows PR #301 precedent)

Non-destructive: no API shape change, no new response field, no UI work. RFC #304 itself stays OPEN; its Phase 1 (``provider`` response field) remains blocked on the RFC deciding the wire format.

Smoke (manual, reverted): adding ``("gemini", re.compile(r"/\.gemini/memories/?$"))`` to ``_PROVIDER_CATEGORY_PATTERNS`` without updating ``_VALID_PROVIDER_CATEGORIES`` raises at import:

    AssertionError: Provider category vocabulary changed without updating _VALID_PROVIDER_CATEGORIES. See RFC #304 before adding categories.

Co-Authored-By: Claude <[email protected]>
This was referenced Apr 20, 2026
Summary
- Add a ``_VALID_PROVIDER_CATEGORIES`` frozenset and an import-time ``assert`` in ``config.py`` so new rows in ``_PROVIDER_CATEGORY_PATTERNS`` cannot silently expand the category vocabulary while RFC #304 is still open.
- Hold the assertion message in ``_VOCABULARY_LOCK_MESSAGE`` and pin-test that the constant still contains ``#304``, so a future rewrite that drops the RFC reference fails a test rather than surviving on an unrelated ``#304`` mention elsewhere in the file.
- Mirror the existing ``_VALID_PRESET_PLACEHOLDERS`` lock in ``cli/init_cmd.py:280-285``: both axes (placeholder vocabulary and category vocabulary) are now guarded the same way.

Why this is "prep", not "resolve"
RFC #304 has 7 unresolved design questions (vendor vs source axis, wire format, user-bucket placement, collapse semantics, group-reindex scope, Codex rename, Gemini exclusion) and is staying OPEN. This PR does not answer any of them. It closes a separate defensive gap that would otherwise let an unrelated contributor silently expand the category vocabulary — pre-empting the RFC — by adding a fourth regex row.
Non-goals / explicit follow-ups
These are not in this PR; each is a separate decision:
- ``provider: str`` response field (RFC Phase 1): blocked on the RFC deciding the wire format.
- ``byCategory`` parallel literal in ``sources-memory-dirs.js:249-259``: couples to Phase 2 tree design.
- ``categorize_memory_dir`` return type to ``Literal[...]``: independent mypy-advisory sweep.

Smoke evidence
The pin tests verify the pass condition. To prove the assertion actually fails loudly when violated, I manually added a stray row to ``_PROVIDER_CATEGORY_PATTERNS`` and re-imported, which raised ``AssertionError`` with the vocabulary-lock message. This confirms that (1) the assertion fires at import time (not at a later call site), and (2) the message points at RFC #304, so anyone hitting this traceback lands in the right issue. Stray row reverted.
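The smoke check can be approximated without touching ``config.py``. The sketch below wraps the same comparison in a hypothetical ``check_vocabulary`` helper (in the PR the check is a module-level ``assert``, not a function) and feeds it the stray ``gemini`` row. ``APPROVED``, ``LOCK_MESSAGE``, and the placeholder regexes are stand-ins chosen for the example.

```python
import re

# Stand-ins for _VALID_PROVIDER_CATEGORIES and _VOCABULARY_LOCK_MESSAGE.
APPROVED = frozenset({"user", "claude-memory", "claude-plans", "codex"})
LOCK_MESSAGE = (
    "Provider category vocabulary changed without updating "
    "_VALID_PROVIDER_CATEGORIES. See RFC #304 before adding categories."
)


def check_vocabulary(rows):
    """Raise AssertionError if the rows' categories drift from APPROVED."""
    derived = {category for category, _ in rows} | {"user"}
    assert derived == APPROVED, LOCK_MESSAGE


# Placeholder regexes: only the category names matter for this check.
approved_rows = [
    ("claude-memory", re.compile(r"claude-memory-placeholder")),
    ("claude-plans", re.compile(r"claude-plans-placeholder")),
    ("codex", re.compile(r"codex-placeholder")),
]

# The stray row used in the manual smoke test.
stray_rows = approved_rows + [
    ("gemini", re.compile(r"/\.gemini/memories/?$")),
]

check_vocabulary(approved_rows)  # approved table: no exception

try:
    check_vocabulary(stray_rows)
except AssertionError as exc:
    failure = str(exc)
else:
    failure = None

print(failure)
```

With assertions enabled, ``failure`` holds the lock message, including the ``#304`` reference that the pin test checks for.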
Test plan
- ``uv run ruff check packages/memtomem/src``: pass
- ``uv run ruff format --check packages/memtomem/src packages/memtomem/tests``: pass
- ``uv run pytest packages/memtomem/tests/test_init_cmd.py -k "vocabulary_locked or references_rfc or TestCategorizeMemoryDir or TestDetectProviderDirsRoundtrip"``: 11 pass
- ``uv run pytest -m "not ollama"``: 1902 passed, 46 deselected, 0 failures

Co-Authored-By: Claude [email protected]