Skip to content

feat: canonicalize Stop-hook topic at daemon boundary with warning log#8

Open
jphein wants to merge 2 commits intorboarescu:mainfrom
jphein:fix/canonical-topic-at-daemon-boundary
Open

feat: canonicalize Stop-hook topic at daemon boundary with warning log#8
jphein wants to merge 2 commits intorboarescu:mainfrom
jphein:fix/canonical-topic-at-daemon-boundary

Conversation

@jphein
Copy link
Copy Markdown
Contributor

@jphein jphein commented Apr 27, 2026

Summary

Adds _canonical_topic() between /silent-save's payload parse and tool_diary_write. When a topic comes in matching a known legacy synonym (currently \"auto-save\"), it gets rewritten to the canonical CHECKPOINT_TOPIC (\"checkpoint\") and a warning is logged. Non-checkpoint topics (e.g. \"musings\", \"decisions\") pass through unchanged.

Why

The bundled clients (clients/hook.py, clients/mempal-fast.py) already write the canonical value, so this is defense-in-depth at the daemon boundary. It catches:

  1. Future drift in those clients (someone edits one and forgets the convention)
  2. Third-party callers POSTing to /silent-save directly with the older synonym
  3. Buggy hook installs that haven't been updated

Without this, drift silently leaks into palace metadata. Mempalace's tool_diary_write topic-routing — which sends matching topics to a dedicated session-recovery collection rather than the searchable main collection — misses those drawers, and they end up dominating vector top-N in mempalace_search results.

The warning log makes drift observable so the operator can find the misbehaving client; the rewrite keeps the palace clean in the meantime.

Scope

  • New module-level constants: CHECKPOINT_TOPIC, CHECKPOINT_TOPIC_SYNONYMS
  • New helper: _canonical_topic(topic: str) -> str (10 lines)
  • One call-site change in _do_silent_save_write

+33 / -1. No new dependencies, no behavior change for already-canonical writes, no breaking change for callers using the default.

Test plan

  • Synonym path: POST /silent-save with topic=\"auto-save\" — expect successful diary write with topic stored as \"checkpoint\", plus a warning log line silent-save: rewriting non-canonical checkpoint topic 'auto-save' → 'checkpoint'.
  • Canonical path: POST with topic=\"checkpoint\" — expect no log line, normal save.
  • Pass-through path: POST with topic=\"musings\" — expect no log line, topic stored as \"musings\".

Originally extracted from a larger fork-side commit (b4b39fc) that bundled this with a now-retired kind= filter; sending it standalone per the "small separate PRs" preference noted on PR #4.

Copilot AI review requested due to automatic review settings April 27, 2026 16:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a small normalization layer at the palace-daemon /silent-save boundary to keep Stop-hook checkpoint diary topics canonical and observable, preventing legacy topic drift from impacting MemPalace’s topic-based routing and downstream search behavior.

Changes:

  • Introduces CHECKPOINT_TOPIC and CHECKPOINT_TOPIC_SYNONYMS constants for Stop-hook checkpoint topic conventions.
  • Adds _canonical_topic() to rewrite legacy synonyms (currently "auto-save") to the canonical "checkpoint" while emitting a warning.
  • Updates _do_silent_save_write() to apply topic canonicalization before calling tool_diary_write.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread main.py
Comment on lines +316 to +322
if topic in CHECKPOINT_TOPIC_SYNONYMS:
_log.warning(
"silent-save: rewriting non-canonical checkpoint topic %r → %r",
topic, CHECKPOINT_TOPIC,
)
return CHECKPOINT_TOPIC
return topic
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed at main.py:321-326: if not isinstance(topic, str): _log.warning("silent-save: non-string topic %r (%s); coercing to %r%s", topic, type(topic).__name__, CHECKPOINT_TOPIC, ctx); return CHECKPOINT_TOPIC. Non-string inputs (None, numbers, lists from a malformed JSON payload) collapse to the canonical topic with a warning instead of leaking through to tool_diary_write and turning a bad client request into an internal type error.

Comment thread main.py
Comment on lines +317 to +320
_log.warning(
"silent-save: rewriting non-canonical checkpoint topic %r → %r",
topic, CHECKPOINT_TOPIC,
)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 3527635. _canonical_topic now takes an optional caller={…} dict; _do_silent_save_write plumbs agent_name, wing, and session_id through. Warning now reads e.g. silent-save: rewriting non-canonical checkpoint topic 'auto-save' -> 'CHECKPOINT' (caller: agent_name='session-hook', wing='mempalace', session_id='abc123'). Empty/None values are suppressed so a partial payload doesn't produce noisy logs. Also flipped the unicode to ASCII -> for Windows-console safety, matching the pattern from mempalace #1204/#1382.

jphein added a commit to jphein/palace-daemon that referenced this pull request Apr 27, 2026
…eralization

Mirrors the mempalace fork's README pattern: top-of-document section
listing PRs we've filed and their state, then a "pending" queue for
PR-ready items, then an explicit "needs generalization" table for
items that bake in JP-specific assumptions and are held until they
can be split into a universal shape.

## What's now where

- **Open upstream PRs** (3): rboarescu#7 limit= fix, rboarescu#8 _canonical_topic, rboarescu#9
  verify-routes.sh
- **Recently landed in upstream**: PR #4 (cherry-picked into ef6ac03,
  closed) — the v1.5.0 daemon work
- **Pending PRs — ready to file** (6): GET /graph (folds in
  /graph.tunnels fix), GET /viz, palace-mcp-dispatch.sh, mempal-fast.py,
  event-log-frame.md, graph-endpoint.md
- **Needs generalization before PR** (3): deploy.sh (PALACE_HOST=disks
  + fork-mempalace import-check), palace-mode (Claude Code plugin
  cache layout), auto-repair-if-empty.sh (systemd --user shape)

## Why

JP's note: "we can wait if we need to generalize things but make sure
that is noted in the readme like we do with memorypalace." Filing PRs
that have JP-specific assumptions baked in would either fail in
contributor review or land features other operators couldn't use.
Better to be explicit about which items are PR-ready, which are queued,
and which need generalization work first — and what that work is —
so anyone can pick up a generalization-ready item without re-discovering
the constraint set.

`/graph.tunnels` was previously listed as its own row but it's
structurally a sub-feature of `/graph`; merged into that row to
avoid implying it can ship independently.
Adds `_canonical_topic()` between `/silent-save`'s payload parse and
`tool_diary_write`. When a topic comes in matching a known legacy
synonym (currently `"auto-save"`), it gets rewritten to the canonical
`CHECKPOINT_TOPIC` ("checkpoint") and a warning is logged.

## Why

The bundled clients (`clients/hook.py`, `clients/mempal-fast.py`)
already write the canonical value, so this is defense-in-depth. It
catches:

1. Future drift in those clients (someone edits one and forgets the
   convention)
2. Third-party callers POSTing to `/silent-save` directly with the
   older synonym
3. Buggy hook installs that haven't been updated

Without this, drift would silently leak into palace metadata and
mempalace's `tool_diary_write` topic-routing — which sends matching
topics to the dedicated `mempalace_session_recovery` collection rather
than the searchable main collection — would miss those drawers and
they'd dominate vector top-N in `mempalace_search` results. The
warning makes drift observable; the rewrite keeps the palace clean.

## Scope

- New module-level constants: `CHECKPOINT_TOPIC`, `CHECKPOINT_TOPIC_SYNONYMS`
- New helper: `_canonical_topic(topic: str) -> str` (10 lines)
- One call-site change in `_do_silent_save_write`

Non-checkpoint topics (e.g. `"musings"`, `"decisions"`) pass through
unchanged — the rewrite only fires on synonyms in the explicit list.

No new dependencies, no behavior change for already-canonical writes,
no breaking change for callers using the default.

Originally extracted from a larger fork-side commit (`b4b39fc`) that
bundled this with a now-retired `kind=` filter; sending it standalone
per the "small separate PRs" preference noted on PR #4.
@jphein jphein force-pushed the fix/canonical-topic-at-daemon-boundary branch from f70c136 to 3371153 Compare April 27, 2026 17:51
@jphein
Copy link
Copy Markdown
Contributor Author

jphein commented Apr 27, 2026

Thanks for the careful read — addressed both comments in the latest force-push (3371153):

On the type robustness: added a non-string guard at the top of _canonical_topic that coerces None/numbers/lists to CHECKPOINT_TOPIC with a warning log, so a malformed {"topic": null} payload yields a clean recovery rather than letting the value leak through to tool_diary_write.

On caller context in the synonym-rewrite warning: held off on this one. The warning already includes the offending topic value (e.g. 'auto-save' → 'checkpoint'), and in practice every Stop hook writes with agent_name="session-hook" (constant — not high-cardinality enough to be useful as a discriminator). session_id would help in multi-agent setups but it isn't on the function's current signature, and threading it through would change the API for marginal benefit. If multi-client drift becomes a real diagnostic problem, the right shape is probably to hoist the rewrite into the caller (_do_silent_save_write, where payload is in scope) — happy to do that as a follow-up if you'd prefer.

jphein added a commit to jphein/palace-daemon that referenced this pull request Apr 27, 2026
…boarescu#10/rboarescu#12/rboarescu#13 to main

The amendments to those open upstream PRs caught real bugs that exist
on fork main too — production at disks runs fork main, so these were
shipping live until now.

## main.py — /graph endpoint (from PR rboarescu#13)

- `_read_wings_rooms_direct()`: build `rooms` from union of
  `wings.keys()` + `rooms_by_wing.keys()` rather than just wings.
  Otherwise a partial schema-drift (one query OperationalError-ed
  but the other succeeded) silently drops the half that worked.
  Real data-loss bug for the SME / /viz consumers.
- `/graph` handler: gate the direct-sqlite reads on `_read_sem` via
  a `_direct_under_sem` helper. Coordinates with `_exclusive_palace()`
  used by `/repair mode=rebuild` (so /graph yields rather than racing
  delete-then-create on chroma.sqlite3) and rate-limits unbounded
  `asyncio.to_thread` reads under load (was 2 threads/request × N
  concurrent requests).
- `_canonical_topic()` (from PR rboarescu#8): drop the `topic: str` annotation
  and add a non-string isinstance guard at the top, coerce
  `None`/numbers/lists to `CHECKPOINT_TOPIC` with a warning. A
  malformed `{"topic": null}` JSON payload no longer leaks `None`
  through to `tool_diary_write`.

## scripts/verify-routes.sh (from PR rboarescu#9)

- `set -euo pipefail` (was `set -e`) — pipeline failures and unset
  vars now surface instead of being masked.
- `curl -fsS` instead of `-sS` on every probe so HTTP non-2xx
  (e.g. `/health` 503 when palace is degraded) fails curl directly.
  Without this, a 503 body with `"daemon":"palace-daemon"` slips
  through the body-grep as a pass.
- Limit-check rewrite: split curl exit code from JSON parse so a
  connection error or non-JSON response `?`-flags rather than
  fails-with-traceback (the previous `2>&1` redirect captured
  Python tracebacks into `$COUNT`, which then triggered the fail
  path with a confusing message).

## clients/palace-mcp-dispatch.sh (from PR rboarescu#10)

- Replace `readlink -f` (GNU-specific, fails on macOS BSD readlink)
  with `cd "$(dirname …)" && pwd -P`. POSIX-portable; covers the
  actual common case (parent directory is a symlink under the
  Claude Code plugin cache).
- Add explicit existence check on `$MCP_CLIENT` before exec, with
  a clear error message instead of a generic Python "can't open
  file" trace.

## clients/palace-mode (from PR rboarescu#12)

- Make `PALACE_API_KEY` optional in `set_mode_remote`. Server-side
  `_check_auth` only enforces auth when its own `PALACE_API_KEY` is
  set, so a daemon running without auth accepts unauthenticated
  clients. The previous fail-fast was overly strict for that case.
  Warn-but-proceed instead.

All five fixes are amendments on the corresponding upstream PR
branches (force-pushed earlier today). No behavior change for the
healthy path; the changes harden failure modes that were caught in
review.
…ings

Addresses Copilot review on rboarescu#8: the warning log emitted when
`_canonical_topic` rewrites a synonym (or coerces a non-str topic)
didn't include any caller identifier, making it hard to track down
a misbehaving client in multi-tenant logs.

`_canonical_topic` now takes an optional `caller=` dict; bundled
fields (`agent_name`, `wing`, `session_id`) are appended to the log
line as a compact `(caller: …)` tail, with empty/None values
suppressed so a partial payload doesn't produce noisy logs.
`_do_silent_save_write` plumbs the relevant fields through.

Also opportunistically replaced the unicode `→` arrow in the
synonym-rewrite log with ASCII `->` for Windows-console safety,
matching the pattern from #1204 / #1382 in mempalace.

No behavior change beyond log content.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants