Skip to content

sec(web): CSRF / Origin / Host guard middleware (#787 stage 1, observe-only)#793

Merged
memtomem merged 1 commit intomainfrom
feat/csrf-middleware-log-only
May 5, 2026
Merged

sec(web): CSRF / Origin / Host guard middleware (#787 stage 1, observe-only)#793
memtomem merged 1 commit intomainfrom
feat/csrf-middleware-log-only

Conversation

@memtomem
Copy link
Copy Markdown
Owner

@memtomem memtomem commented May 5, 2026

Summary

PR1 of the two-PR rollout for #787 (RFC: CSRF and browser-origin guard for the local Web UI).

  • Adds CSRFGuardMiddleware checking X-Memtomem-CSRF token + Origin/Referer + Host against loopback / operator-trusted allow-lists.
  • Adds GET /api/session for SPA token bootstrap; threads X-Memtomem-CSRF through api() helper and 10 direct-fetch callsites.
  • Adds AST-walking write-boundary registry that classifies every unsafe-method handler under web/routes/ for both CSRF coverage and privacy.enforce_write_guard coverage — drift fails noisily.
  • Observe-only: emits web.csrf.observe log records with would_block decision but never returns 403. PR2 will flip app.state.csrf_enforce to True plus a MEMTOMEM_WEB__CSRF_ENFORCE env rollback.

Decisions locked in #787 comment: A (token + Origin/Referer) over B-only; Host-header guard included for DNS rebinding; GET /api/session token transport; method-based scope on /api/**; shared registry for both invariants.

What changed

Server

File Purpose
web/middleware/csrf.py New. Observe-mode middleware with token / Origin / Host checks.
web/app.py Generates per-process csrf_token at create_app, wires middleware inside SecurityHeadersMiddleware so the eventual 403 picks up CSP / nosniff.
web/routes/system.py Adds GET /api/session returning {csrf, mode} for SPA bootstrap.

CLI

mm web gains:

  • --allow-remote-ui — acknowledgment flag for non-loopback binds. PR2 will refuse to start without it; PR1 emits a yellow warning.
  • --trusted-origin HOST (repeatable) — populates app.state.csrf_trusted_origins.
  • --trusted-host HOST (repeatable) — populates app.state.csrf_trusted_hosts.

SPA

  • static/app.jsapi(...) helper bootstraps the token from GET /api/session once, caches it, and threads X-Memtomem-CSRF on every unsafe method.
  • 10 direct fetch() callsites bypass api(...) and are cleaned up to call ensureCsrfToken() and attach the header explicitly:
    • app.js:3919, 5451 (POST /api/upload)
    • context-gateway.js:228, 235, 904 (POST /api/context/.../sync)
    • context-gateway.js:483 (DELETE /api/context/known-projects/{id})
    • context-gateway.js:625 (PUT /api/context/{type}/{name})
    • context-gateway.js:670 (DELETE /api/context/{type}/{name})
    • settings-hooks-watchdog.js:164 (POST /api/settings-sync)
    • settings-maintenance.js:303 (POST /api/export/import)
  • Cache-bust per feedback_static_asset_cache_bust.md: app.js v100→v101, context-gateway.js v12→v13, settings-maintenance.js v1→v2, settings-hooks-watchdog.js v2→v3.

Tests

  • tests/test_web_csrf_middleware.py — 12 TestClient cases covering: safe-method short-circuit, no/wrong/right token, loopback-Host pass, attacker-origin block, DELETE coverage, /api/session bootstrap exemption (exact equality, not prefix), enforce-mode 403 wiring (so PR2 is a default flip), --trusted-host allow-list, and non-/api/* pass-through.
  • tests/test_web_invariants_registry.py — AST-walking write-boundary registry. Classifies every @router.{post,patch,put,delete} handler under web/routes/ for both _CSRF_PROTECTED/_CSRF_EXEMPT and _REDACTION_PROTECTED/_REDACTION_EXEMPT. Asserts every _REDACTION_PROTECTED handler actually calls privacy.enforce_write_guard. Adding a new unsafe handler without classifying it fails noisily.
  • tests-js/csrf-token-bootstrap.test.mjs — 4 vitest pins: GET skips /api/session and sends no header; POST bootstraps once and caches across subsequent POST/PATCH/PUT/DELETE; /api/session failure degrades gracefully without attaching an empty header.

Mutation-validated per feedback_pin_test_mutation_validation.md: removing the await ensureCsrfToken() call in app.js makes the JS POST/PATCH/PUT/DELETE branches fail; flipping app.state.csrf_enforce=True makes the enforce test return 403 (already pinned).

Out of scope (PR2)

  • Flip app.state.csrf_enforce default to True + MEMTOMEM_WEB__CSRF_ENFORCE env rollback.
  • Refuse to start when --host is non-loopback without --allow-remote-ui.
  • SECURITY.md threat-model documentation.

Test plan

  • uv run ruff check packages/memtomem/src — clean
  • uv run ruff format --check packages/memtomem/src — 233 files already formatted
  • uv run pytest -m "not ollama" — 4069 passed, 11 skipped
  • npx vitest run — 18 passed (5 prior + 4 new)
  • Web subset: uv run pytest -k web -m "not ollama" — 535 passed
  • CI green (lint / js-unit / typecheck / 3-OS python tests / Playwright / golden-path)
  • Manual smoke (after merge): mm web → DevTools → unsafe action → verify X-Memtomem-CSRF header present, server logs web.csrf.observe ... would_block=False.

References

🤖 Generated with Claude Code

…ode (#787 stage 1)

PR1 of the two-PR rollout for RFC #787. Adds the middleware, the SPA
wiring, and the AST-walking write-boundary registry, but does not yet
enforce — every gated request still passes through. Stage 2 (separate
PR) flips ``app.state.csrf_enforce`` to ``True`` plus a
``MEMTOMEM_WEB__CSRF_ENFORCE`` env rollback toggle and updates
SECURITY.md.

Why now: the local Web UI has a ``CORS allow-origin`` policy but no
write boundary against browser-origin attackers. Drive-by tabs,
DNS rebinding, and ``--host 0.0.0.0`` misconfigs all reach the
unsafe-method handlers today. Per the decision summary on #787:
A (token + Origin/Referer) over B-only, Host-header guard included
to defend rebinding, ``GET /api/session`` token transport, method-based
endpoint scope on ``/api/**``, and a shared write-boundary registry that
also locks down ``privacy.enforce_write_guard`` callsite coverage.

Server side
-----------
* ``web/middleware/csrf.py`` — ``CSRFGuardMiddleware`` checks
  ``X-Memtomem-CSRF`` header against ``app.state.csrf_token``,
  ``Origin`` / ``Referer`` against the loopback + operator-trusted
  origin allow-list, and ``Host`` against the loopback +
  operator-trusted host allow-list. Emits one structured
  ``web.csrf.observe`` log per gated request with the four decision
  flags. Stage 1 always passes through; ``app.state.csrf_enforce`` is
  the toggle PR2 will flip.
* ``web/app.py`` — generates the per-process token at ``create_app``,
  initializes empty allow-lists + ``csrf_enforce=False``, wires the
  middleware *inside* ``SecurityHeadersMiddleware`` so the eventual 403
  picks up nosniff / frame-options / CSP on its way out.
* ``web/routes/system.py`` — adds ``GET /api/session`` returning
  ``{csrf, mode}`` for SPA bootstrap. Not localhost-guarded at the
  route layer; the middleware covers Origin/Host for it uniformly.

CLI
---
* ``mm web`` gains ``--allow-remote-ui`` (acknowledgment flag for
  non-loopback binds), ``--trusted-origin HOST`` and
  ``--trusted-host HOST`` (repeatable allow-list entries that populate
  ``app.state.csrf_trusted_{origins,hosts}``). ``--host`` non-loopback
  without ``--allow-remote-ui`` emits a yellow warning explaining that
  PR2 will refuse to start in that combination.

SPA
---
* ``static/app.js`` — ``api(...)`` helper bootstraps the token from
  ``GET /api/session`` once, caches it, and threads
  ``X-Memtomem-CSRF`` on every unsafe method.
* 10 direct ``fetch()`` callsites that bypass ``api(...)`` are
  cleaned up to call ``ensureCsrfToken()`` and attach the header
  themselves: 2 in ``app.js`` (uploads), 5 in ``context-gateway.js``,
  1 in ``settings-hooks-watchdog.js``, 1 in ``settings-maintenance.js``.
* Cache-bust per ``feedback_static_asset_cache_bust.md``: app.js v100→v101,
  context-gateway.js v12→v13, settings-maintenance.js v1→v2,
  settings-hooks-watchdog.js v2→v3.

Tests
-----
* ``tests/test_web_csrf_middleware.py`` — 12 ``TestClient`` cases
  covering safe-method short-circuit, every negative-token case,
  loopback-Host pass, attacker-origin block, DELETE coverage,
  ``/api/session`` bootstrap exemption (exact path, not prefix),
  enforce-mode 403 wiring (so PR2 is a default flip), trusted-host
  allow-list, and non-``/api/*`` pass-through.
* ``tests/test_web_invariants_registry.py`` — AST-walking write-boundary
  registry per ``feedback_ast_architectural_guard_pattern.md``. Pins
  classification for every ``@router.{post,patch,put,delete}`` handler
  under ``web/routes/`` into ``_CSRF_PROTECTED`` / ``_CSRF_EXEMPT`` and
  separately into ``_REDACTION_PROTECTED`` / ``_REDACTION_EXEMPT``. The
  redaction side asserts each ``_REDACTION_PROTECTED`` handler actually
  calls ``privacy.enforce_write_guard``. Adding a new unsafe handler
  without classifying it fails noisily.
* ``tests-js/csrf-token-bootstrap.test.mjs`` — 4 vitest pins on the
  ``api()`` helper: GET sends no header and skips ``/api/session``,
  POST bootstraps once and caches across subsequent unsafe methods,
  and ``/api/session`` failure degrades gracefully without attaching
  an empty header.

Out of scope (PR2)
------------------
* Flip ``csrf_enforce`` default + add ``MEMTOMEM_WEB__CSRF_ENFORCE``
  rollback toggle.
* Refuse to start when ``--host`` is non-loopback without
  ``--allow-remote-ui``.
* SECURITY.md threat model documentation.

Co-Authored-By: Claude <[email protected]>
@memtomem memtomem merged commit 9275dc7 into main May 5, 2026
11 checks passed
@memtomem memtomem deleted the feat/csrf-middleware-log-only branch May 5, 2026 01:35
@github-actions github-actions Bot locked and limited conversation to collaborators May 5, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants