Skip to content

fix: use live plugin registry for HTTP route matching after reload#45606

Open
dubthree wants to merge 1 commit intoopenclaw:mainfrom
dubthree:fix/plugin-registry-mismatch
Open

fix: use live plugin registry for HTTP route matching after reload#45606
dubthree wants to merge 1 commit intoopenclaw:mainfrom
dubthree:fix/plugin-registry-mismatch

Conversation

@dubthree
Copy link
Copy Markdown

@dubthree dubthree commented Mar 14, 2026

Summary

  • Fix BlueBubbles (and other channel webhook) routes returning 404 due to plugin registry singleton divergence across compiled chunk boundaries
  • Introduce a globalThis-backed registry bridge so registerPluginHttpRoute always targets the gateway handler's registry
  • Add regression tests in both http-registry.test.ts and plugins-http.test.ts

Problem

createGatewayPluginRequestHandler captures a registry object by closure at creation time. Channel plugins that register webhook routes later via registerPluginHttpRoute() (without an explicit registry parameter) fall back to requireActivePluginRegistry(). In the source, this is a single module (src/plugins/runtime.ts), so it returns the same object. But in the compiled output, the build system (rolldown) duplicates the registry singleton IIFE across chunk boundaries. Each chunk gets its own state variable. Although all are initialized from the same globalThis[Symbol.for(...)] entry, the entry itself can be replaced during startup, causing the state variables in different chunks to reference different registry objects.

The result: registerPluginHttpRoute adds the BlueBubbles webhook route to a registry object that the gateway handler never checks. POST requests to /bluebubbles-webhook return 404.

Confirmed with instrumentation:

[bb] live=1 captured=1 same=false

Both the handler's captured registry and the global singleton have only 1 route (twilio-whatsapp). The BlueBubbles route was added to a third registry object held by a different chunk's state variable.

Plugins that register routes during plugin loading via api.registerHttpRoute() (e.g., twilio-whatsapp) are unaffected because they write directly to the captured registry.

Fix

Three files changed:

  1. src/plugins/runtime.ts: Add setGatewayPluginRegistry() / getGatewayPluginRegistry() backed by a dedicated globalThis[Symbol.for("openclaw.gatewayPluginRegistry")]. This bypasses chunk-boundary singleton divergence entirely.

  2. src/gateway/server-runtime-state.ts: Call setGatewayPluginRegistry(params.pluginRegistry) before creating the HTTP request handler, stashing the gateway's registry on the global bridge.

  3. src/plugins/http-registry.ts: registerPluginHttpRoute now checks getGatewayPluginRegistry() before falling back to requireActivePluginRegistry(), ensuring routes always land in the registry the handler checks.

Test plan

  • src/plugins/http-registry.test.ts: "prefers the gateway plugin registry over the active singleton" (verifies routes go to the gateway registry, not the singleton)
  • src/gateway/server/plugins-http.test.ts: "sees routes registered via registerPluginHttpRoute after handler creation" (end-to-end: handler finds late-registered routes)
  • All existing tests pass: plugins-http.test.ts (14/14), http-registry.test.ts (11/11), webhook-targets.test.ts (16/16)
  • pnpm build passes (type-check + rolldown)
  • Verified locally on macOS: iMessage inbound via BlueBubbles works after applying equivalent runtime patch

Fixes #45598

🤖 Generated with Claude Code

@openclaw-barnacle openclaw-barnacle bot added gateway Gateway runtime size: S labels Mar 14, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR fixes a stale closure bug in createGatewayPluginRequestHandler where the registry object captured at handler creation time becomes out of date after a config reload replaces the active registry via setActivePluginRegistry. The fix captures the registry version at handler creation and, on every request, switches to the current live registry if the version has changed — correctly covering the case where channel plugins register webhook routes against the new post-reload registry.

  • Fix is correct and well-scoped: The version-based check in plugins-http.ts accurately detects a registry swap and transparently falls through to getActivePluginRegistry(). The ?? registry fallback when the active registry is null is a safe defensive guard.
  • Regression test is effective: The new test in plugins-http.test.ts faithfully reproduces the original failure scenario and confirms the fix.
  • Minor test hygiene issue: The new test calls setActivePluginRegistry twice (incrementing the global version counter) without restoring state in an afterEach or finally block. While no existing tests are broken today, this leaves the global registry in a modified state and could cause subtle ordering-dependent failures if new tests are added later that assume a clean version counter.

Confidence Score: 4/5

  • This PR is safe to merge; the fix is logically correct and well-tested, with only a minor test hygiene concern.
  • The implementation is straightforward and the logic is sound. The version-based registry detection correctly handles single and multiple reloads without any race conditions (safe in Node.js's single-threaded model). The one point docked is for the test leaving global state dirty after execution, which is a latent risk for future test additions.
  • src/gateway/server/plugins-http.test.ts — the new test should restore global registry state after execution.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/server/plugins-http.test.ts
Line: 288-325

Comment:
**Test leaves global registry state dirty**

The test calls `setActivePluginRegistry(registryA)` and `setActivePluginRegistry(registryB)` but never restores the original registry state after the test completes. This also permanently increments the global version counter (each `setActivePluginRegistry` call does `state.version += 1`).

`unregister()` cleans up the late-registered route but leaves `registryB` as the active registry and the version at whatever it was bumped to. While this doesn't break any current tests in the file, it creates a subtle dependency on test execution order. Any future test that creates a handler expecting `creationVersion` to match the `currentVersion` at handler-creation time could silently switch to the live registry if this test has already mutated the global state.

Consider adding an `afterEach` (or a local cleanup via `try/finally`) that restores the original registry reference:

```ts
import { getActivePluginRegistry, setActivePluginRegistry } from "../../plugins/runtime.js";

it("sees routes added to the active registry after a reload replaces it", async () => {
  const originalRegistry = getActivePluginRegistry();
  try {
    // ... test body ...
  } finally {
    if (originalRegistry) setActivePluginRegistry(originalRegistry);
    unregister();
  }
});
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 8c31a08

Comment on lines +288 to +325
it("sees routes added to the active registry after a reload replaces it", async () => {
// Simulate gateway startup: handler is created with registry A.
const registryA = createTestRegistry({
httpRoutes: [createRoute({ path: "/existing", auth: "plugin" })],
});
setActivePluginRegistry(registryA);
const handler = createGatewayPluginRequestHandler({
registry: registryA,
log: createPluginLog(),
});

// Simulate a config reload that replaces the active registry with a new
// object (registry B). This is what loadOpenClawPlugins does on reload.
const registryB = createTestRegistry({
httpRoutes: [createRoute({ path: "/existing", auth: "plugin" })],
});
setActivePluginRegistry(registryB);

// Simulate a channel provider registering a webhook route after reload.
// registerPluginHttpRoute falls back to requireActivePluginRegistry(),
// which now returns registry B. The handler still holds registry A.
const lateHandler = vi.fn(async () => true);
const unregister = registerPluginHttpRoute({
path: "/late-webhook",
handler: lateHandler,
auth: "plugin",
match: "exact",
pluginId: "late-plugin",
source: "test",
});

const { res } = makeMockHttpResponse();
const handled = await handler({ url: "/late-webhook" } as IncomingMessage, res);
expect(handled).toBe(true);
expect(lateHandler).toHaveBeenCalledTimes(1);

unregister();
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test leaves global registry state dirty

The test calls setActivePluginRegistry(registryA) and setActivePluginRegistry(registryB) but never restores the original registry state after the test completes. This also permanently increments the global version counter (each setActivePluginRegistry call does state.version += 1).

unregister() cleans up the late-registered route but leaves registryB as the active registry and the version at whatever it was bumped to. While this doesn't break any current tests in the file, it creates a subtle dependency on test execution order. Any future test that creates a handler expecting creationVersion to match the currentVersion at handler-creation time could silently switch to the live registry if this test has already mutated the global state.

Consider adding an afterEach (or a local cleanup via try/finally) that restores the original registry reference:

import { getActivePluginRegistry, setActivePluginRegistry } from "../../plugins/runtime.js";

it("sees routes added to the active registry after a reload replaces it", async () => {
  const originalRegistry = getActivePluginRegistry();
  try {
    // ... test body ...
  } finally {
    if (originalRegistry) setActivePluginRegistry(originalRegistry);
    unregister();
  }
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/gateway/server/plugins-http.test.ts
Line: 288-325

Comment:
**Test leaves global registry state dirty**

The test calls `setActivePluginRegistry(registryA)` and `setActivePluginRegistry(registryB)` but never restores the original registry state after the test completes. This also permanently increments the global version counter (each `setActivePluginRegistry` call does `state.version += 1`).

`unregister()` cleans up the late-registered route but leaves `registryB` as the active registry and the version at whatever it was bumped to. While this doesn't break any current tests in the file, it creates a subtle dependency on test execution order. Any future test that creates a handler expecting `creationVersion` to match the `currentVersion` at handler-creation time could silently switch to the live registry if this test has already mutated the global state.

Consider adding an `afterEach` (or a local cleanup via `try/finally`) that restores the original registry reference:

```ts
import { getActivePluginRegistry, setActivePluginRegistry } from "../../plugins/runtime.js";

it("sees routes added to the active registry after a reload replaces it", async () => {
  const originalRegistry = getActivePluginRegistry();
  try {
    // ... test body ...
  } finally {
    if (originalRegistry) setActivePluginRegistry(originalRegistry);
    unregister();
  }
});
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8c31a08ff1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +73 to +74
const effectiveRegistry =
currentVersion !== creationVersion ? (getActivePluginRegistry() ?? registry) : registry;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep plugin HTTP handler bound to runtime registry

Switching to getActivePluginRegistry() whenever the global version changes lets unrelated registry activations hijack request routing. setActivePluginRegistry() increments the version on every activation (src/plugins/runtime.ts:25-29), and loadOpenClawPlugins() activates a registry even on cache hits (src/plugins/loader.ts:539-543); config.get/config.schema/config.schema.lookup call that path via loadSchemaWithPlugins() (src/gateway/server-methods/config.ts:245-303). After this change, those read-only RPCs can cause the HTTP handler to route against a different registry than the one it was created with, so plugin routes can change before the gateway runtime is actually reloaded.

Useful? React with 👍 / 👎.

Channel plugins that register webhook routes via registerPluginHttpRoute()
(called from registerWebhookTargetWithPluginRoute) target the active plugin
registry singleton. When the build system duplicates the singleton state across
chunk boundaries, the singleton in the plugin-sdk chunk can diverge from the one
the gateway handler captured, making webhook routes invisible to the HTTP
request dispatcher.

Introduce a dedicated globalThis-backed registry bridge
(setGatewayPluginRegistry / getGatewayPluginRegistry) so
registerPluginHttpRoute always targets the same registry object that the
gateway handler checks, regardless of chunk-boundary singleton divergence.

Fixes openclaw#45598
@xmoxmo
Copy link
Copy Markdown

xmoxmo commented Mar 26, 2026

We are seeing what looks like the WS/gateway-method analogue of this stale-registry pattern.

In our case, a plugin-backed gateway method (bncr.connect) can be visible in openclaw plugins inspect bncr and method listings, while the live gateway WS still returns unknown method: bncr.connect in the broken state.

Relevant code paths on 2026.3.23-2:

  • WS dispatch checks opts.extraHandlers?.[req.method] ?? coreGatewayHandlers[req.method]
  • method listing comes from listChannelPlugins().flatMap((plugin) => plugin.gatewayMethods ?? [])
  • WS extraHandlers is built from { ...pluginRegistry.gatewayHandlers, ... }
  • active plugin registry remains globally replaceable via setActivePluginRegistry(...)

So the shape looks very similar to the HTTP route issue here:

  • declaration / inspection path sees one registry view
  • live runtime dispatch uses an earlier captured/expanded handler table

I opened a dedicated issue for the WS/gateway-method variant here: #54783

If useful, I can also add a minimal repro focused specifically on plugin gateway methods rather than HTTP routes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BlueBubbles webhook route invisible to gateway handler (registry object mismatch)

2 participants