Background
PR #784 unified the trust-boundary redaction guard across every server-side ingress surface, including the four web mutating routes:
- POST /api/add
- POST /api/upload (query param force_unsafe)
- PATCH /api/chunks/{chunk_id}
- POST /api/scratch/{key}/promote
On a hit they return HTTP 403 with a structured detail body:
{"detail": "redaction_blocked", "hits": <int>, "surface": "<surface_label>"}
…so the SPA can render a confirm-and-retry dialog and resubmit with force_unsafe=true after the operator acknowledges the matched-pattern count. Codex review of PR #784 flagged that the SPA does not yet exercise this flow.
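A minimal sketch of how a client could recognise this response. The helper name parseRedactionBlock and the surface label "web_add" are assumptions for illustration, not existing code. The sketch accepts the structured body either at the top level (as shown above) or nested one level under detail, since the gaps below describe err.detail as an object.

```javascript
// Hypothetical helper: not part of web/static/app.js today.
// Returns {hits, surface} for a redaction block, otherwise null.
function parseRedactionBlock(status, payload) {
  if (status !== 403 || payload === null || typeof payload !== "object") {
    return null;
  }
  // Assumption: the structured body may sit at the top level or be
  // nested one level down under "detail".
  const body =
    payload.detail !== null && typeof payload.detail === "object"
      ? payload.detail
      : payload;
  if (body.detail !== "redaction_blocked") {
    return null;
  }
  return { hits: Number(body.hits), surface: String(body.surface) };
}

// "web_add" is a made-up surface label used only for this example.
const flat = { detail: "redaction_blocked", hits: 2, surface: "web_add" };
console.log(parseRedactionBlock(403, flat));
console.log(parseRedactionBlock(403, { detail: flat }));
console.log(parseRedactionBlock(403, { detail: "forbidden" })); // null
```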
Concrete gaps (from the Codex review)
- web/static/app.js:3789: /api/add POST body does not include force_unsafe.
- web/static/app.js:2151, web/static/app.js:3621: chunk-edit PATCH body does not include force_unsafe.
- web/static/app.js:3875: upload form does not append the force_unsafe query param.
- web/static/settings-harness.js:228: scratch promote POST body does not include force_unsafe.
- web/static/app.js:216: the shared api() helper throws new Error(err.detail) directly. When err.detail is the structured object above, it stringifies as [object Object], which swallows hits / surface and breaks any UX that wants to render them.
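The last gap can be reproduced in isolation. This snippet is only an illustration of the JavaScript behaviour, not a copy of the api() helper; the surface label "web_add" is made up.

```javascript
// Passing an object to the Error constructor coerces it to a string,
// so the structured fields are lost.
const detail = { detail: "redaction_blocked", hits: 2, surface: "web_add" };
const err = new Error(detail);

console.log(err.message); // "[object Object]"
console.log(err.hits);    // undefined
```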
What to add
- api() helper: when a 4xx response has a detail object with detail === "redaction_blocked", throw a typed error (or return a sentinel) carrying {hits, surface} so callers can branch.
- For each of the four mutating surfaces, add a confirm-and-retry path:
  - On a redaction-blocked error, surface a modal / toast that names the matched-pattern count (hits) and the surface label.
  - If the user confirms, re-issue the same request with force_unsafe: true (body field) or ?force_unsafe=true (upload query param).
- Add at least one Playwright MCP smoke that exercises the full SPA flow: paste a sk-… token → expect the confirm dialog → confirm → expect the row to land and mem_add_redaction_stats to show bypassed incremented under the right by_tool key.
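One possible shape for the typed error and the confirm-and-retry path, as a sketch under assumptions: RedactionBlockedError, withRedactionRetry, fakeAdd and the "web_add" label are hypothetical names, and the request and dialog are passed in as callbacks so the example runs without a server or DOM.

```javascript
// Hypothetical typed error; the name and fields are illustrative.
class RedactionBlockedError extends Error {
  constructor(hits, surface) {
    super(`redaction_blocked: ${hits} hit(s) on ${surface}`);
    this.name = "RedactionBlockedError";
    this.hits = hits;
    this.surface = surface;
  }
}

// send(forceUnsafe) performs the request.
// confirm({hits, surface}) resolves to true when the operator accepts.
// The request is retried at most once; any other error is rethrown as-is.
async function withRedactionRetry(send, confirm) {
  try {
    return await send(false);
  } catch (err) {
    if (!(err instanceof RedactionBlockedError)) throw err;
    const accepted = await confirm({ hits: err.hits, surface: err.surface });
    if (!accepted) throw err;
    return send(true);
  }
}

// Stand-in for a real request: blocked unless forced.
async function fakeAdd(forceUnsafe) {
  if (!forceUnsafe) throw new RedactionBlockedError(1, "web_add");
  return { status: 200, forced: true };
}

withRedactionRetry(fakeAdd, async () => true).then((res) => console.log(res));
```

In the SPA, send would set force_unsafe: true in the JSON body for the add, chunk-edit, and scratch-promote calls, and append ?force_unsafe=true to the URL for the upload; confirm would open the modal.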
Out of scope
- […] --force-unsafe flag / force_unsafe=True kwarg) and do not need SPA changes.
- […] privacy.py module docstring.
Verification
- uv run pytest -m "not ollama" stays green.
- mm web localhost smoke: secret payload → 403 dialog → confirm → 200 + counter increment.
References
- .agents-dev/log/1/codex-20260505-085534.log
- docs/reports/security-hardening-plan-2026-05-05.md (High chore: prepare for open-source release #1, original finding)