fix: namespace defense-in-depth — AppContext setter + architectural guard + message constant#502
Closed
memtomem wants to merge 1 commit intofix/issue-500-validate-mem-ns-toolsfrom
Closed
Conversation
…uard + message constant Builds on PR #501 (issue #500) with three targeted reinforcements that close the *regression class* the multi-agent namespace-gate series (#491 → #494 → #496 → #498 → #499 → #500/#501) kept hitting one PR at a time — a public surface gets added or refactored that takes a namespace-shaped argument, but the new code path forgets to import / call ``validate_namespace``. The series itself is the evidence: every PR closed one more surface that had been silently ungated. Three changes, cheapest-to-most-invasive: ────────────────────────────────────────────────────────────────────── A. ``AppContext.current_namespace`` property setter ────────────────────────────────────────────────────────────────────── Convert the dataclass field to a backing-store + property pair so every write — ``mem_ns_set``, a future tool we forget to gate, a Python adapter, or test code doing ``app.current_namespace = X`` — runs through ``validate_namespace`` before the value lands in app state. The forward-shield contract on the public input surfaces (``mem_ns_set``, ``mem_session_start(namespace=)``, ``mem_agent_share(target=)``, …) still holds — this is defense-in- depth, not a replacement. The bypass this closes is the Python-level direct-mutation path: ``mem_ns_set`` is gated, but anyone with an ``AppContext`` reference can also do ``app.current_namespace = "agent-runtime:foo:bar"`` without going through the tool. ``mem_session_start(agent_id= "default")``'s priority chain (step 3 fallback, ``server/tools/ session.py:109-110``) reads ``current_namespace`` back as the ``sessions``-row namespace, so the same #496 / #500 bypass shape round-trips through this back door. The internally-derived contract from ``constants.py:96-100`` is preserved: re-validation only happens on caller-supplied writes that reach app state. Reads never re-validate; internally-derived namespaces (``f"{AGENT_NAMESPACE_PREFIX}{ agent_id}"`` after ``validate_agent_id`` succeeded, ``"default"``, ``SHARED_NAMESPACE``) skip the gate, same as before. ────────────────────────────────────────────────────────────────────── B. Architectural guard test ────────────────────────────────────────────────────────────────────── ``tests/test_validate_namespace_architectural_guard.py`` AST-scans ``server/tools/*.py`` for any function with a ``namespace=``/``target=``/``old=``/``new=``/``old_namespace=`` parameter and forces every match to be classified into one of two sets: * ``VALIDATED_NS_SURFACES`` — the 9 ``(file, fn, param)`` triples PR #499 + PR #501 gate today. Test asserts each function body actually calls ``validate_namespace(<param>)``; a regression that drops the call (refactor, accidental deletion, copy-paste from a deferred surface) trips the test. * ``DEFERRED_NS_SURFACES`` — the 22 triples the project has *explicitly* left ungated for now (issue #500's "broader UX call covered separately if pursued" line). Each entry carries inline rationale so future readers can tell deferral from oversight. A new tool with a namespace-shaped param that lands in *neither* set fails ``test_no_unclassified_ns_surfaces`` and forces the author to make the validation decision at PR time — gate it (move to VALIDATED) or explicitly defer it (move to DEFERRED with rationale). Mirrors the ``feedback_drift_close_must_derive`` and ``feedback_stub_gap_check`` discipline: every literal that lists "the gated surfaces" eventually drifts unless a guard forces re- classification when the code adds new ones. ────────────────────────────────────────────────────────────────────── E. Constantize ``"invalid namespace"`` message prefix ────────────────────────────────────────────────────────────────────── Per PR #501 review: the test suite's reliance on ``out.startswith( "Error: invalid namespace")`` made the validator's error formatting an *implicit* public API. Centralise it as ``INVALID_NAMESPACE_MESSAGE_PREFIX`` in ``constants.py`` and reference it from both the validator's f-strings and the test assertions, so a future tweak (capitalisation, wording, etc.) stays one change point. No behaviour change — the literal value is unchanged. ────────────────────────────────────────────────────────────────────── Out of scope (deliberate, per the original #500 contract) ────────────────────────────────────────────────────────────────────── * **Storage-layer charset alignment.** Tightening ``sqlite_namespace._NS_NAME_RE`` (``[\w\-.:@ ]{1,255}``) to the strict gate would change behaviour for legacy data — out of scope for the same forward-only reasoning #496 / #499 / #500 used. * **Final-resolution helper inside ``mem_session_start``.** Same pattern as the input gate at the boundary; marginal value over what (A) already delivers. * **Validating ``mem_add`` / ``mem_search`` / ``mem_recall`` ``namespace=``.** Tracked as deferred via the architectural guard's ``DEFERRED_NS_SURFACES``; promoting these is a UX call (do we reject ``mem_add(namespace="foo bar")`` loudly, or accept the legacy charset?) and belongs in a follow-up. Tests: * New ``TestAppContextCurrentNamespaceSetter`` class in ``test_validate_namespace_ns_tools.py`` — pins the property setter contract at the Python attribute level. * New ``test_validate_namespace_architectural_guard.py`` — 11 tests: 9 parametrised ``test_declared_validated_surfaces_call_validate_ namespace`` + ``test_no_unclassified_ns_surfaces`` + ``test_validated_and_deferred_are_disjoint``. * Existing assertions in ``test_validate_namespace_ns_tools.py`` switched to ``_ERROR_PREFIX`` (sourced from the new constant). * Full ``pytest -m "not ollama"`` green (2887 passed, was 2850 before the +37 new tests). Co-Authored-By: Claude <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on top of #501 — set base to that branch. Once #501 merges, this rebases onto
main.Summary
Three targeted reinforcements that close the regression class the multi-agent namespace-gate series (#491 → #494 → #496 → #498 → #499 → #500/#501) kept hitting one PR at a time — every PR closed one more public surface that had silently shipped without
validate_namespace. The series itself is the evidence; this PR makes the next iteration trip a test before review instead of in production.A.
AppContext.current_namespaceproperty setterConvert the dataclass field to a backing-store + property pair so every write —
mem_ns_set, a future tool we forget to gate, a Python adapter, or test code doingapp.current_namespace = X— runs throughvalidate_namespacebefore the value lands in app state.The bypass this closes is the Python-level direct-mutation path:
mem_ns_setis gated, but anyone with anAppContextreference can also doapp.current_namespace = "agent-runtime:foo:bar"without going through the tool.mem_session_start(agent_id="default")'s priority chain (step 3 fallback,server/tools/session.py:109-110) readscurrent_namespaceback as thesessions-row namespace, so the same #496 / #500 bypass shape round-trips through this back door.The forward-shield contract from
constants.py:96-100is preserved: re-validation only happens on caller-supplied writes that reach app state. Internal callers reading the value back never re-validate; internally-derived namespaces (f"{AGENT_NAMESPACE_PREFIX}{agent_id}"aftervalidate_agent_idsucceeded,"default",SHARED_NAMESPACE) skip the gate, same as before.B. Architectural guard test
tests/test_validate_namespace_architectural_guard.pyAST-scansserver/tools/*.pyfor any function with anamespace=/target=/old=/new=/old_namespace=parameter and forces every match to be classified:VALIDATED_NS_SURFACES— the 9(file, fn, param)triples PR fix: validate user-supplied namespace= overrides on session entry points #499 + fix: validate user-supplied namespace strings on mem_ns_* CRUD tools (#500) #501 gate today. Test asserts each function body actually callsvalidate_namespace(<param>); a regression that drops the call (refactor, accidental deletion, copy-paste) trips the test.DEFERRED_NS_SURFACES— the 22 triples the project has explicitly left ungated for now (issue Validate user-supplied namespace strings on mem_ns_* CRUD tools (transitive bypass to session rows) #500's "broader UX call covered separately if pursued"). Each entry carries inline rationale.A new tool with a namespace-shaped param that lands in neither set fails
test_no_unclassified_ns_surfacesand forces the author to make the validation decision at PR time — gate it (move to VALIDATED) or explicitly defer (move to DEFERRED with rationale).E. Constantize
"invalid namespace"message prefixPer #501 review: the test suite's reliance on
out.startswith("Error: invalid namespace")made the validator's error formatting an implicit public API. Centralised asINVALID_NAMESPACE_MESSAGE_PREFIXinconstants.py, referenced from both the validator's f-strings and the test assertions. No behaviour change.Out of scope (deliberate)
sqlite_namespace._NS_NAME_RE) — tightening would break legacy rows; same forward-only reasoning as Validate user-supplied namespace= overrides on agent session entry points #496 / fix: validate user-supplied namespace= overrides on session entry points #499 / Validate user-supplied namespace strings on mem_ns_* CRUD tools (transitive bypass to session rows) #500.mem_session_start— same pattern as the input gate; marginal value over what (A) delivers.mem_add/mem_search/mem_recallnamespace=— tracked viaDEFERRED_NS_SURFACES; UX call belongs in a follow-up.Test plan
TestAppContextCurrentNamespaceSetterclass intest_validate_namespace_ns_tools.py— pins the property setter contract at the Python attribute level (5 tests + parametrised hostile sweep).test_validate_namespace_architectural_guard.py— 11 tests: 9 parametrised +test_no_unclassified_ns_surfaces+test_validated_and_deferred_are_disjoint.test_validate_namespace_ns_tools.pyswitched to_ERROR_PREFIX(sourced from the new constant).uv run ruff check && ruff format --check— clean.uv run mypy packages/memtomem/src/memtomem/server/context.py packages/memtomem/src/memtomem/constants.py— clean.uv run pytest -m "not ollama"— 2887 passed (was 2850 pre-fix: validate user-supplied namespace strings on mem_ns_* CRUD tools (#500) #501; +37 new tests across the three reinforcements).🤖 Generated with Claude Code