fix: validate user-supplied namespace= overrides on session entry points by pandas-studio · Pull Request #499 · memtomem/memtomem

pandas-studio · 2026-04-26T05:44:53Z

Summary

Closes the bypass #496 flagged on PR #495 review (Concern 3): even after #491 / #494 / #498 gated every agent_id concatenation, the explicit namespace= argument on session-start (and target= on mem_agent_share) still landed verbatim in storage. A Python / MCP / CLI caller could write "agent-runtime:foo:bar" even though agent_id itself was clean.

Add validate_namespace in constants.py with structural rules: :-separated segments matching [A-Za-z0-9._-]+ (no leading dash, no ./..), 256-char total cap, and a strict-arity rule for agent-runtime:<seg> overrides that re-routes the trailing segment through validate_agent_id.
Wire it at every entry point that accepts a caller-supplied namespace argument: mem_session_start(namespace=) (MCP), mm session start --namespace (CLI), MemtomemStore.start_agent_session(namespace=) and MemtomemStore.start_session(namespace=) (LangGraph), and mem_agent_share(target=) (MCP).
New test_validate_namespace.py pins the wiring + the regression contract: store.start_agent_session(agent_id="planner", namespace="agent-runtime:foo:bar") cannot land an agent-runtime:foo:bar row, same shape pinned at every other surface.

Why

The kin issue to the mem_agent_share(target=...) gap flagged on PR #494 review: same bypass exists on every surface where a user-supplied namespace string is accepted alongside agent_id. Single boundary, single charset, single error message — Error: invalid namespace 'X': ... mirrors the agent_id gate's UX so error scrapers see one shape regardless of how the request entered the system.

The agent-runtime: prefix is the one place where the override path could widen the contract that the direct agent_id= path enforces. Re-routing the trailing segment through validate_agent_id keeps both paths symmetric: a malformed agent id is rejected whether you smuggled it through agent_id=foo:bar or namespace=agent-runtime:foo:bar. The chained agent_id error is wrapped so the outer fragment still says "invalid namespace" for log-scraper consistency (__cause__ preserves the underlying detail).

Out of scope

mem_ns_* CRUD tools — tracked as #500. Code review surfaced a transitive bypass via app.current_namespace: mem_ns_set(namespace="agent-runtime:foo:bar") followed by mem_session_start(agent_id="default") falls through priority step 3 (the app.current_namespace fallback) and lands the same shape this PR closes elsewhere. mem_ns_set / mem_ns_rename / mem_ns_assign / mem_ns_update / mem_ns_delete all accept user-supplied namespace strings without validation today; Validate user-supplied namespace strings on mem_ns_* CRUD tools (transitive bypass to session rows) #500 covers all five so the gate is consistent across the namespace-CRUD surface.
mm session wrap --namespace — the option doesn't currently exist on the wrap subcommand (only --agent-id and --title). The agent_id gate from fix: validate agent_id before agent-runtime namespace concat (CLI + MCP) #491 covers the wrap surface; adding the option belongs in the same change that wires it up, not this PR.
mem_add(namespace=...) / mem_search(namespace=...) — the issue scopes this to session-start surfaces and the kin mem_agent_share gap. CRUD-tool namespace validation is a broader UX call.
Migrating any namespace rows that may already be malformed in existing storage — this is a forward gate only.

Test plan

uv run pytest -m "not ollama" — 2730 passed, 46 deselected (ollama).
uv run ruff check + ruff format --check on src/ and tests/ — clean.
uv run mypy on the 5 changed source files — clean.
New test_validate_namespace.py (136 tests):
- 22 hostile-input rejections at the validator
- 13 in-tree namespaces still accepted
- MCP boundary: mem_session_start(namespace="agent-runtime:foo:bar") returns "Error: invalid namespace ..." and storage stays empty
- CLI boundary: full CLI_HOSTILE_NAMESPACES parametrize (defense-in-depth) + canonical issue-Validate user-supplied namespace= overrides on agent session entry points #496 shape pinned separately
- LangGraph boundary: both start_agent_session and start_session raise InvalidNameError and never await create_session
- mem_agent_share boundary: rejects before chunk lookup
- Cross-surface error parity: MCP and CLI emit the same invalid namespace 'X' core text
- agent-runtime:<oversized> wraps the agent_id error: outer says invalid namespace, __cause__ preserves the agent_id detail

Closes #496. Follow-up tracked as #500.

🤖 Generated with Claude Code

…nts (#496) PR #491 / #494 / #498 closed the bypass on ``agent_id`` itself, but each session-start surface also accepted an explicit ``namespace=`` argument that landed verbatim in storage — a Python / MCP / CLI caller could write ``"agent-runtime:foo:bar"`` even though ``agent_id`` was clean. Same shape on ``mem_agent_share(target=)``. Add ``validate_namespace`` to ``constants.py`` and wire it at every public entry point that accepts a caller-supplied namespace argument: - ``mem_session_start(namespace=...)`` (MCP) - ``mm session start --namespace`` (CLI) - ``MemtomemStore.start_agent_session(namespace=...)`` (LangGraph) - ``MemtomemStore.start_session(namespace=...)`` (LangGraph escape hatch) - ``mem_agent_share(target=...)`` (MCP) The validator allows ``:``-separated segments matching the agent_id charset (``[A-Za-z0-9._-]+``, no leading dash, not ``"."`` / ``".."``) with a 256-char total cap. When the first segment is the bare ``agent-runtime`` prefix, exactly one trailing segment is required and that segment is re-validated through ``validate_agent_id`` so the override path can't widen the contract that the direct ``agent_id=`` path enforces — this is the bypass shape #496 was filed to close. ``mm session wrap`` doesn't accept ``--namespace`` today, so the agent_id gate from #491 covers the wrap surface; this PR doesn't add the option (out of scope). Internally-derived namespaces (``f"{AGENT_NAMESPACE_PREFIX}{agent_id}"`` after ``validate_agent_id`` succeeded, ``"default"``, ``SHARED_NAMESPACE``) are safe by construction and skip the gate — the shield is on caller input, not internal derivation. New ``test_validate_namespace.py`` pins: - 22 hostile shapes rejected at the validator (including the canonical ``agent-runtime:foo:bar`` plus path traversal, embedded whitespace, comma, leading dash, control chars, zero-width space, etc.) - 13 in-tree namespaces still accepted (``default``, ``shared``, ``archive:summary``, ``claude-memory:project-x``, ``custom:scope``, …) - All 5 entry points reject the canonical bypass shape and never awaited storage on the rejection path - Cross-surface error parity: MCP and CLI emit the same ``invalid namespace 'X'`` core text Closes #496. Co-Authored-By: Claude <[email protected]>

…w-up Code-review feedback from #499: - **Concern 2 (CHANGELOG bare ``agent-runtime`` collision):** add a migration sentence noting the strict-arity rule rejects single-segment ``"agent-runtime"`` (no trailing colon) along with the documented hostile shapes. Anyone holding such a name should rename via ``mem_ns_rename`` before running session-start with the override. - **Concern 3 (error text drift):** wrap ``InvalidNameError`` from the ``agent-runtime:<seg>`` agent_id re-routing so the outer message carries ``"invalid namespace"`` (what the user / log scraper grep for at the namespace boundary). The agent_id detail is preserved on ``__cause__`` for anyone debugging the underlying contract violation. Updated ``test_agent_runtime_prefix_routes_through_validate_agent_id`` to pin both the outer fragment and the chained cause. - **Concern 5 (CLI parametrize defense-in-depth):** widen ``TestCliSessionStartNamespaceOverride`` from 2 hostile shapes to the full ``CLI_HOSTILE_NAMESPACES`` list (HOSTILE_NAMESPACES minus leading-dash variants, which Click parses as missing-option-value before our validator sees them). Kept a separate ``test_rejects_agent_runtime_foo_bar_with_quoted_value`` for the canonical issue-#496 shape so failures surface directly rather than under one of N parametrise IDs. - **Concern 1 (mem_ns_set transitive bypass):** filed as #500 — the same shape #496 closes, smuggled through ``app.current_namespace`` via ``mem_ns_set`` (and unguarded ``mem_ns_rename`` / ``mem_ns_assign`` / ``mem_ns_update`` / ``mem_ns_delete``). PR body follow-up added separately. - **Concern 4 (module-level constant):** skipped — trivial perf, no surface-area win. Tests: 2730 passed, 46 deselected (ollama). Ruff + ruff format clean. Co-Authored-By: Claude <[email protected]>

memtomem mentioned this pull request Apr 26, 2026

Validate user-supplied namespace strings on mem_ns_* CRUD tools (transitive bypass to session rows) #500

Closed

memtomem merged commit 55f8c89 into main Apr 26, 2026
7 checks passed

memtomem deleted the fix/validate-namespace-overrides branch April 26, 2026 05:57

github-actions Bot locked and limited conversation to collaborators Apr 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: validate user-supplied namespace= overrides on session entry points#499

fix: validate user-supplied namespace= overrides on session entry points#499
memtomem merged 2 commits intomainfrom
fix/validate-namespace-overrides

pandas-studio commented Apr 26, 2026 •

edited by memtomem

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pandas-studio commented Apr 26, 2026 • edited by memtomem Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Out of scope

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pandas-studio commented Apr 26, 2026 •

edited by memtomem

Loading