Motivation
/api/context/overview returns a per-surface summary that collapses
every failure into a single boolean:
    # packages/memtomem/src/memtomem/web/routes/context_gateway.py:42-44
    except Exception:
        logger.exception("diff_skills failed")
        result["skills"] = {"total": 0, "error": True}
A user staring at "error" can't tell whether to:
- fix a malformed .md frontmatter (parse error),
- chmod a directory (permission error),
- install a missing dependency (ImportError on the diff helpers),
- or file an issue (server-side bug nobody caused).
All four end in the same toast. Logs have the traceback, but most
end-users never see logs.
Current state
- Single overview route catches Exception four times (skills,
commands, agents, settings) and returns error: True. No
classification.
- Detail routes (PUT /context/{type}/{name}) have slightly more
granular errors (404, 409, 500) but no machine-readable
sub-category.
- Front-end consumes error as a boolean: it flips a red badge in the
Context Gateway summary card, with no further action.
Proposed change
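Introduce a small taxonomy returned alongside error: True:

    {
      "skills": {
        "total": 0,
        "error": true,
        "error_kind": "parse" | "permission" | "missing" | "internal",
        "error_message": "<short user-facing string, optional>"
      }
    }
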
Mapping rules (catch-then-classify, not classify-then-catch — the
classification is the test surface, the catch is the boundary):
| Exception class | error_kind | UI guidance |
|---|---|---|
| yaml.YAMLError, tomllib.TOMLDecodeError, our ParseError | parse | "Fix the malformed file at <path> and reload." |
| PermissionError, OSError(EACCES) | permission | "Read access denied to <path>." |
| FileNotFoundError, ModuleNotFoundError | missing | "Path <path> not found / dependency missing." |
| anything else | internal | "Server error — see logs / file an issue." |
Keep error: True for backwards compat (front-end and any external
callers can ignore error_kind and the existing red-badge flow keeps
working).
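
The mapping rules and the backwards-compatible envelope could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `classify_exception`, `error_entry`, and the local `ParseError` class are hypothetical names, and the YAML/TOML exception types are only registered when those modules happen to be importable.

```python
import errno
from typing import Optional, Tuple, Type


class ParseError(Exception):
    """Hypothetical stand-in for the project's own ParseError."""


def _parse_error_types() -> Tuple[Type[BaseException], ...]:
    # Collect whichever parser exceptions are importable in this environment.
    types = [ParseError]
    try:
        import yaml  # PyYAML; optional in this sketch
        types.append(yaml.YAMLError)
    except ImportError:
        pass
    try:
        import tomllib  # standard library from Python 3.11 onward
        types.append(tomllib.TOMLDecodeError)
    except ImportError:
        pass
    return tuple(types)


PARSE_ERRORS = _parse_error_types()


def classify_exception(exc: BaseException) -> str:
    """Map an already-caught exception to one of the four error_kind values."""
    if isinstance(exc, PARSE_ERRORS):
        return "parse"
    if isinstance(exc, PermissionError):
        return "permission"
    # Covers OSError subclasses that carry EACCES without being PermissionError.
    if isinstance(exc, OSError) and exc.errno == errno.EACCES:
        return "permission"
    if isinstance(exc, (FileNotFoundError, ModuleNotFoundError)):
        return "missing"
    return "internal"


def error_entry(exc: BaseException, message: Optional[str] = None) -> dict:
    """Build the per-surface entry; "error": True is kept for old callers."""
    entry = {"total": 0, "error": True, "error_kind": classify_exception(exc)}
    if message is not None:
        entry["error_message"] = message
    return entry
```

Because the classifier is a pure function of the exception object, it can be unit-tested without touching the route, which is the point of catch-then-classify.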
Alternatives considered
- HTTP status codes per kind. Rejected: /context/overview returns a
partial-success envelope (skills can fail while agents succeed);
switching to per-surface 4xx breaks the aggregation.
- Stream errors via WebSocket. Out of scope: the current
request/response contract is fine for a low-cardinality summary.
- Classify in the front-end via regex on error_message. Rejected:
classification on the source side is testable; UI regex would
drift the moment we change a Python exception message.
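
To illustrate why per-surface status codes do not fit, here is a rough sketch of a partial-success aggregation. `build_overview` and its `surfaces` / `classify` parameters are assumptions made for this example, not the existing route's signature; the point is only that one surface failing leaves the others intact inside a single response body.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger(__name__)


def build_overview(
    surfaces: Dict[str, Callable[[], dict]],
    classify: Callable[[BaseException], str],
) -> dict:
    """Run every surface's diff; a failure only affects that surface's entry."""
    result = {}
    for name, diff in surfaces.items():
        try:
            result[name] = diff()
        except Exception as exc:  # the catch is the boundary
            logger.exception("diff_%s failed", name)
            result[name] = {
                "total": 0,
                "error": True,
                "error_kind": classify(exc),  # the classification is testable
            }
    return result
```

A caller would pass one callable per surface (skills, commands, agents, settings) and still receive a single envelope, with failed surfaces flagged individually.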
Open questions
- error_message UX: surface verbatim (good debug info, leaks paths)
or redact to a stable phrase (clean UI, less actionable)?
- Should the taxonomy extend to detail routes (PUT /context/...)?
Probably yes for consistency, but the surface there is smaller and
404/409/422/500 already cover most ground.
- Worth adding a dedicated Counter metric per error_kind for
watchdog visibility, or is the log line enough?
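
For the first question, the two options could sit behind one switch. The sketch below is purely illustrative: `STABLE_PHRASES`, `user_message`, and the `verbatim` flag are hypothetical, and the phrases are placeholders rather than agreed UI copy.

```python
# Placeholder phrases keyed by error_kind; deliberately free of paths.
STABLE_PHRASES = {
    "parse": "A context file could not be parsed.",
    "permission": "Read access to a context path was denied.",
    "missing": "A context path or dependency is missing.",
    "internal": "Server error; see logs or file an issue.",
}


def user_message(kind: str, exc: BaseException, verbatim: bool = False) -> str:
    """Return the exception text as-is, or a stable phrase for its kind."""
    if verbatim:
        return str(exc)  # most actionable, but may expose filesystem paths
    return STABLE_PHRASES.get(kind, STABLE_PHRASES["internal"])
```

With a switch like this, the verbatim form could be limited to development builds while production keeps the redacted phrase.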
Out of scope
- Localization of error messages (Python side stays English; UI maps
error_kind → i18n key).
- Front-end "click to expand traceback in dev mode" — separate UX.