hooks: unify internal + plugin hook registries to fix 5 systemic bugs#30853
hooks: unify internal + plugin hook registries to fix 5 systemic bugs#30853widingmarcus-cyber wants to merge 5 commits intoopenclaw:mainfrom
Conversation
Fixes openclaw#30784 **Bug 1 — Bundler duplication**: Replace module-level singletons in internal-hooks.ts and hook-runner-global.ts with globalThis[Symbol.for()] pattern (already proven in plugins/runtime.ts). Hooks now survive Rolldown chunk splitting. **Bug 2 — clearInternalHooks() wipes plugin hooks**: Add source metadata to every registered hook ("bundled", "workspace", "managed", "config", "plugin"). Replace handlers.clear() with clearInternalHooksBySource() that preserves plugin-registered hooks during hot-reload. **Bug 3 — gateway:startup race condition**: Remove setTimeout(250) from server-startup.ts. Both gateway_start (plugin) and gateway:startup (internal) now fire together via emitGatewayStartup() after sidecars are fully loaded. **Bug 4 — sessions.reset missing before_reset**: Add the plugin before_reset hook call to the gateway RPC handler in sessions.ts, matching the existing behavior in commands-core.ts. **Bug 5 — after_compaction missing sessionKey**: Pass ctx.params.sessionKey to the after_compaction hook context, matching the before_compaction hook. **Dispatch helpers**: New src/hooks/dispatch-unified.ts co-locates dual dispatch for the 3 overlapping events (message_received, message_sent, gateway_start). Each call site now uses one function instead of 15-30 lines of manual dual dispatch. Files: 1 new, 10 modified. ~160 LOC new. Backward compatible — no changes to HookRunner facade or existing hook handler signatures.
🔒 Aisle Security AnalysisWe found 2 potential security issue(s) in this PR:
1. 🟡 Event-loop blocking / DoS via synchronous unbounded transcript read in sessions.reset before_reset hook
DescriptionIn
Vulnerable code: const fileContent = fs.readFileSync(sessionFile, "utf-8");
for (const line of fileContent.split("\n")) {
...
const parsed = JSON.parse(line);
...
}
await hookRunner.runBeforeReset({ sessionFile, messages, reason: commandReason }, ...)Even though RecommendationAvoid synchronous, unbounded reads in request handlers.
Example (stream + caps): import { createReadStream } from "node:fs";
import readline from "node:readline";
const MAX_BYTES = 5 * 1024 * 1024;
const MAX_MESSAGES = 2000;
const stat = await fs.promises.stat(sessionFile);
if (stat.size > MAX_BYTES) {
// either skip, or only read tail/head depending on needs
}
const rl = readline.createInterface({
input: createReadStream(sessionFile, { encoding: "utf-8" }),
crlfDelay: Infinity,
});
const messages: unknown[] = [];
for await (const line of rl) {
if (messages.length >= MAX_MESSAGES) break;
if (!line.trim()) continue;
try {
const parsed = JSON.parse(line);
if (parsed?.type === "message" && parsed?.message) messages.push(parsed.message);
} catch {}
}
await Promise.race([
hookRunner.runBeforeReset({ sessionFile, messages, reason }, ctx),
new Promise((_, reject) => setTimeout(() => reject(new Error("before_reset timeout")), 2000)),
]);2. 🟡 Persistent plugin internal hooks survive in-process restarts due to globalThis hook registry (Symbol.for), bypassing config/plugin disable and causing handler accumulation
DescriptionThe internal hook registry was moved to a
This creates a security-relevant integrity issue (configuration meant to disable hooks/plugins may not actually stop their code from executing) and a potential DoS vector via handler accumulation. Vulnerable code (startup clears everything except plugin hooks): // Clear file-discovered hooks but preserve plugin-registered hooks (#30784)
clearInternalHooksBySource(["bundled", "workspace", "managed", "config"]);
const loadedCount = await loadInternalHooks(params.cfg, params.defaultWorkspaceDir);Because the registry itself persists on RecommendationEnsure plugin internal hooks cannot outlive the lifecycle/configuration that enabled them. Recommended options:
// On startup/restart, clear all sources including plugin
clearInternalHooksBySource(["bundled","workspace","managed","config","plugin"]);
await loadInternalHooks(cfg, workspaceDir);
// plugins load should then register their hooks for this lifecycle
Add regression tests:
Analyzed PR: #30853 at commit Last updated on: 2026-03-01T19:42:24Z |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2e374c453
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
src/hooks/dispatch-unified.ts
Outdated
| from: params.from, | ||
| content: params.content, | ||
| timestamp: params.timestamp, | ||
| metadata: params.metadata ?? {}, |
There was a problem hiding this comment.
Preserve messageId in plugin message_received payloads
emitMessageReceived stops forwarding params.messageId to the plugin hook payload and only passes params.metadata as-is, while the old call path included metadata.messageId. This breaks existing message_received plugins that rely on message IDs for correlation/deduping (internal hooks still get messageId, so behavior is now inconsistent between the two hook systems).
Useful? React with 👍 / 👎.
| await hookRunner.runBeforeReset( | ||
| { sessionFile, messages, reason: commandReason }, | ||
| { | ||
| agentId: (target.canonicalKey ?? key).split(":")[0] ?? "main", |
There was a problem hiding this comment.
Derive before_reset agentId from parsed session keys
The new before_reset context computes agentId with (target.canonicalKey ?? key).split(":")[0], which returns the literal prefix "agent" for canonical keys like agent:ops:main instead of the actual agent (ops). Any plugin using ctx.agentId during gateway sessions.reset will attach state/memory to the wrong agent.
Useful? React with 👍 / 👎.
Greptile Summaryunified internal and plugin hook registries to resolve 5 systemic bugs related to hook lifecycle management Key improvements:
Technical implementation:
Confidence Score: 4/5
Last reviewed commit: 9751562 |
…eset - Forward messageId into plugin metadata in emitMessageReceived to maintain parity with the old dual-dispatch path (openclaw#30784) - Fix agentId derivation for canonical session keys (agent:X:Y) in before_reset hook context — previously returned literal 'agent' instead of the actual agent id
|
@greptile-apps please review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0fb80b0c6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // Clear any previously registered hooks to ensure fresh loading | ||
| clearInternalHooks(); | ||
| // Clear file-discovered hooks but preserve plugin-registered hooks (#30784) | ||
| clearInternalHooksBySource(["bundled", "workspace", "managed", "config"]); |
There was a problem hiding this comment.
Clear plugin-source hooks on gateway sidecar startup
When the gateway uses an in-process restart path (for example OPENCLAW_NO_RESPAWN=1 or respawn fallback), this call preserves all prior source: "plugin" internal hooks from the previous lifecycle. Plugin loading can then register plugin hooks again (registerInternalHook(event, handler, "plugin") in src/plugins/registry.ts), which leaves disabled plugins' handlers active and can double-run unchanged hooks after restart. Clearing plugin-source entries before reloading avoids stale/duplicate hook execution in these restart contexts.
Useful? React with 👍 / 👎.
| // Fire before_reset plugin hook — extract memories before session history is lost (#30784) | ||
| const hookRunner = getGlobalHookRunner(); | ||
| if (hookRunner?.hasHooks("before_reset")) { | ||
| const sessionFile = entry?.sessionFile; | ||
| void (async () => { | ||
| try { | ||
| const messages: unknown[] = []; | ||
| if (sessionFile) { | ||
| const content = fs.readFileSync(sessionFile, "utf-8"); | ||
| for (const line of content.split("\n")) { | ||
| if (!line.trim()) { | ||
| continue; | ||
| } | ||
| try { | ||
| const parsed = JSON.parse(line); | ||
| if (parsed.type === "message" && parsed.message) { | ||
| messages.push(parsed.message); | ||
| } | ||
| } catch { | ||
| // skip malformed lines | ||
| } | ||
| } | ||
| } | ||
| await hookRunner.runBeforeReset( | ||
| { sessionFile, messages, reason: commandReason }, | ||
| { | ||
| agentId: (() => { | ||
| const k = target.canonicalKey ?? key; | ||
| const parts = k.split(":"); | ||
| return parts[0] === "agent" && parts[1] ? parts[1] : (parts[0] ?? "main"); | ||
| })(), | ||
| sessionKey: target.canonicalKey ?? key, | ||
| sessionId: entry?.sessionId, | ||
| }, | ||
| ); | ||
| } catch { | ||
| // Don't block session reset on hook failure | ||
| } | ||
| })(); | ||
| } |
There was a problem hiding this comment.
race condition: file read vs archival. the plugin hook fires in a fire-and-forget async function that reads sessionFile with fs.readFileSync (line 375), but the main thread immediately proceeds to call archiveSessionTranscriptsForSession (line 450) which renames the file with fs.renameSync. since the async function isn't awaited, the file might be renamed before the read completes, causing the hook to fail silently.
fix: move the file read operation outside the async IIFE (before the hook is called) and pass the already-read messages to the hook. this ensures the file is read before archival begins.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/server-methods/sessions.ts
Line: 367-406
Comment:
race condition: file read vs archival. the plugin hook fires in a fire-and-forget async function that reads `sessionFile` with `fs.readFileSync` (line 375), but the main thread immediately proceeds to call `archiveSessionTranscriptsForSession` (line 450) which renames the file with `fs.renameSync`. since the async function isn't awaited, the file might be renamed before the read completes, causing the hook to fail silently.
fix: move the file read operation outside the async IIFE (before the hook is called) and pass the already-read messages to the hook. this ensures the file is read before archival begins.
How can I resolve this? If you propose a fix, please make it concise.Read session file synchronously before cleanup/archive can rename it. Previously the file read was inside a fire-and-forget async block that raced against ensureSessionRuntimeCleanup which renames the session file. Now the file content is captured before any cleanup runs, and only the hook execution itself is async. Addresses Greptile 2/5 score feedback.
|
@greptile-apps please review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d7d638f07d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| agentId: resetAgentId, | ||
| sessionKey: target.canonicalKey ?? key, | ||
| sessionId: entry?.sessionId, | ||
| }, |
There was a problem hiding this comment.
Include workspaceDir in gateway before_reset hook context
sessions.reset now emits before_reset, but the context passed to runBeforeReset omits workspaceDir, unlike the existing /new//reset command path in emitResetCommandHooks (src/auto-reply/reply/commands-core.ts) which includes it. Plugins that resolve memory/db files from ctx.workspaceDir will work for chat-command resets but fail (or silently skip work) when the same reset is triggered through the gateway RPC, creating inconsistent behavior across reset entrypoints.
Useful? React with 👍 / 👎.
| switch (source) { | ||
| case "openclaw-bundled": return "bundled"; | ||
| case "openclaw-managed": return "managed"; | ||
| case "openclaw-workspace": return "workspace"; | ||
| case "openclaw-plugin": return "plugin"; | ||
| default: return "config"; | ||
| } | ||
| } |
There was a problem hiding this comment.
Duplicate hook registrations on hot-reload for "openclaw-plugin" workspace hooks
resolveHookSource maps "openclaw-plugin" entries to "plugin", but clearInternalHooksBySource(["bundled", "workspace", "managed", "config"]) does not include "plugin" in the cleared set. This means workspace-discovered hooks from plugin directories are never cleared before the next call to loadInternalHooks.
The sequence on hot-reload is:
clearInternalHooksBySource(["bundled", "workspace", "managed", "config"])— skips"plugin"hooks, so"openclaw-plugin"workspace hooks remain registered.loadInternalHookscallsloadWorkspaceHookEntriesagain, discovers the same"openclaw-plugin"entries, and re-registers them.- The handler is now registered twice, so each matching event fires the handler twice.
The "plugin" source was intended to protect hooks registered programmatically via src/plugins/registry.ts, not filesystem-discovered plugin hooks that loadInternalHooks re-scans on every reload. Mapping "openclaw-plugin" workspace entries to the same "plugin" source accidentally opts them into the "never clear" bucket while loadInternalHooks still re-registers them.
Consider returning a distinct source (e.g. "workspace") for "openclaw-plugin" entries so they are cleared and re-registered cleanly, or filtering them out of the loadWorkspaceHookEntries scan when they are already registered:
| switch (source) { | |
| case "openclaw-bundled": return "bundled"; | |
| case "openclaw-managed": return "managed"; | |
| case "openclaw-workspace": return "workspace"; | |
| case "openclaw-plugin": return "plugin"; | |
| default: return "config"; | |
| } | |
| } | |
| function resolveHookSource(source: string): HookSource { | |
| switch (source) { | |
| case "openclaw-bundled": return "bundled"; | |
| case "openclaw-managed": return "managed"; | |
| case "openclaw-workspace": return "workspace"; | |
| case "openclaw-plugin": return "workspace"; | |
| default: return "config"; | |
| } | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/hooks/loader.ts
Line: 24-31
Comment:
**Duplicate hook registrations on hot-reload for `"openclaw-plugin"` workspace hooks**
`resolveHookSource` maps `"openclaw-plugin"` entries to `"plugin"`, but `clearInternalHooksBySource(["bundled", "workspace", "managed", "config"])` does not include `"plugin"` in the cleared set. This means workspace-discovered hooks from plugin directories are never cleared before the next call to `loadInternalHooks`.
The sequence on hot-reload is:
1. `clearInternalHooksBySource(["bundled", "workspace", "managed", "config"])` — skips `"plugin"` hooks, so `"openclaw-plugin"` workspace hooks remain registered.
2. `loadInternalHooks` calls `loadWorkspaceHookEntries` again, discovers the same `"openclaw-plugin"` entries, and re-registers them.
3. The handler is now registered **twice**, so each matching event fires the handler twice.
The `"plugin"` source was intended to protect hooks registered programmatically via `src/plugins/registry.ts`, not filesystem-discovered plugin hooks that `loadInternalHooks` re-scans on every reload. Mapping `"openclaw-plugin"` workspace entries to the same `"plugin"` source accidentally opts them into the "never clear" bucket while `loadInternalHooks` still re-registers them.
Consider returning a distinct source (e.g. `"workspace"`) for `"openclaw-plugin"` entries so they are cleared and re-registered cleanly, or filtering them out of the `loadWorkspaceHookEntries` scan when they are already registered:
```suggestion
function resolveHookSource(source: string): HookSource {
switch (source) {
case "openclaw-bundled": return "bundled";
case "openclaw-managed": return "managed";
case "openclaw-workspace": return "workspace";
case "openclaw-plugin": return "workspace";
default: return "config";
}
}
```
How can I resolve this? If you propose a fix, please make it concise.| ) | ||
| .catch(() => {}); | ||
| } |
There was a problem hiding this comment.
Silent plugin hook error swallowing removes observability
The previous call sites logged plugin hook failures via logVerbose:
// old dispatch-from-config.ts
.catch((err) => {
logVerbose(`dispatch-from-config: message_received plugin hook failed: ${String(err)}`);
});The new unified helpers use empty .catch(() => {}) throughout (emitMessageReceived, emitMessageSent, and emitGatewayStartup), so plugin hook errors are completely silent. Note that triggerInternalHook already logs errors internally, so internal hook failures are still visible — but plugin hook failures via hookRunner.run*() are not.
Consider adding at least a logVerbose in the catch to preserve the previous debug signal:
| ) | |
| .catch(() => {}); | |
| } | |
| .catch((err) => { | |
| logVerbose(`dispatch-unified: message_received plugin hook failed: ${String(err)}`); | |
| }); |
(And similarly for the emitMessageSent and emitGatewayStartup plugin hook calls at lines 94 and 129.)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/hooks/dispatch-unified.ts
Line: 44-46
Comment:
**Silent plugin hook error swallowing removes observability**
The previous call sites logged plugin hook failures via `logVerbose`:
```ts
// old dispatch-from-config.ts
.catch((err) => {
logVerbose(`dispatch-from-config: message_received plugin hook failed: ${String(err)}`);
});
```
The new unified helpers use empty `.catch(() => {})` throughout (`emitMessageReceived`, `emitMessageSent`, and `emitGatewayStartup`), so plugin hook errors are completely silent. Note that `triggerInternalHook` already logs errors internally, so internal hook failures are still visible — but plugin hook failures via `hookRunner.run*()` are not.
Consider adding at least a `logVerbose` in the catch to preserve the previous debug signal:
```suggestion
.catch((err) => {
logVerbose(`dispatch-unified: message_received plugin hook failed: ${String(err)}`);
});
```
(And similarly for the `emitMessageSent` and `emitGatewayStartup` plugin hook calls at lines 94 and 129.)
How can I resolve this? If you propose a fix, please make it concise.- Change "openclaw-plugin" → "workspace" in resolveHookSource so
plugin-discovered hooks are properly cleared on hot-reload
- Replace silent .catch(() => {}) with logVerbose in dispatch-unified
to preserve debug observability for plugin hook failures
The before_reset plugin hook was fire-and-forget (void), so archiveSessionTranscriptsForSession could rename the session file before the hook finished executing. Await the hook to guarantee it completes before the file is archived. Also add logVerbose to the catch handler for debug observability. Fixes the race condition identified in openclaw#30784.
|
@greptile review Addressed all three findings from the previous review:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9751562e53
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| await hookRunner | ||
| .runBeforeReset( | ||
| { sessionFile, messages, reason: commandReason }, |
There was a problem hiding this comment.
Run before_reset only after reset preconditions succeed
sessions.reset now executes before_reset before ensureSessionRuntimeCleanup, but that cleanup can still fail (for example when waitForEmbeddedPiRunEnd times out), in which case the RPC returns UNAVAILABLE and no reset happens. In that failure path the plugin hook has already run and may persist memories or trigger external side effects for a reset that never occurred, which can cause duplicated or inconsistent plugin state on retry.
Useful? React with 👍 / 👎.
|
Hey, nice work tackling the root cause here — the 5 bugs all tracing back to the singleton duplication + clearing issue is a good catch. A few things I think would strengthen this before merge: Tests This PR introduces
Registry separation Right now the registry state, source-scoped clearing, and hook lookup are all embedded inside Inline source array The Plugin loader cache key
The
The hook fires before Missing The gateway RPC path omits Metadata mutation isolation In the dispatch helpers, the plugin hook and internal hook paths should receive separate object allocations for metadata/context. Currently Error logging consistency The internal hook AGENTS.md documentation The hook system now has meaningful architectural invariants (globalThis storage, source-scoped clearing semantics, which sources get cleared on hot-reload, why plugin clearing lives in You can use #30860 as reference. |
|
Closing — significant merge conflicts after 12 days. The hooks system has evolved substantially. Will re-evaluate if still needed. |
Summary
Fixes #30784 — The codebase has two independent hook systems with overlapping responsibilities and duplicated module-level singleton state. This PR fixes all 5 systemic bugs from the same root cause.
What changed
internal-hooks.tsandhook-runner-global.tsnow useglobalThis[Symbol.for()]instead of module-level variables — same pattern asplugins/runtime.ts. Hooks survive Rolldown chunk splitting."bundled","workspace","managed","config","plugin").clearInternalHooksBySource()preserves plugin hooks during hot-reload.src/hooks/dispatch-unified.ts—emitMessageReceived(),emitMessageSent(),emitGatewayStartup()— replaces 15-30 lines of manual dual-dispatch per call site.setTimeout(250), both hooks now fire together viaemitGatewayStartup()after sidecars are fully loaded.before_resethook (matchingcommands-core.tsbehavior).sessionKey(matchingbefore_compaction).Files
src/hooks/dispatch-unified.tssrc/hooks/internal-hooks.tssrc/plugins/hook-runner-global.tssrc/hooks/loader.tssrc/plugins/registry.ts"plugin"source to registerInternalHooksrc/gateway/server-startup.tssrc/gateway/server.impl.tssrc/auto-reply/reply/dispatch-from-config.tssrc/infra/outbound/deliver.tssrc/gateway/server-methods/sessions.tssrc/agents/pi-embedded-subscribe.handlers.compaction.ts1 new file, 10 modified. ~160 LOC new. Backward compatible — no changes to HookRunner facade, existing hook handler signatures, or the 24 typed hook definitions.
Test plan
registerInternalHookwith 2 args still works (backward compat)clearInternalHooksBySource(["bundled", "workspace", "managed", "config"])/resetfrom gateway RPC firesbefore_resetplugin hookafter_compactionhook receivessessionKeyin contextsrc/hooks/internal-hooks.test.ts)src/plugins/wired-hooks-compaction.test.ts)