fix(plugins): late-binding subagent runtime for non-gateway load paths #46648
jalehman merged 11 commits into openclaw:main
Conversation
Greptile Summary
This PR introduces a process-global late-binding mechanism so that plugin runtimes loaded via non-gateway code paths (e.g.
Key changes:
Minor observation:
Confidence Score: 4/5
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/plugins/runtime/index.ts
Line: 101-105
Comment:
**Proxy `receiver` may mismatch resolved target**
The `get` trap passes the Proxy itself as `receiver` to `Reflect.get(resolved, prop, receiver)`. If `resolved` is a real subagent whose methods are defined as *accessor properties* (getters) that reference `this`, `this` would be the Proxy rather than the resolved subagent object, potentially breaking those accessors.
This is safe with the current `createGatewaySubagentRuntime()` implementation because all methods are plain closure-based async functions that never use `this`. However, it is a subtle contract: any future subagent implementation that relies on `this` in a getter would silently break. Consider binding `receiver` to `resolved` instead:
```suggestion
get(_target, prop, _receiver) {
  const resolved = gatewaySubagentState.subagent ?? unavailable;
  return Reflect.get(resolved, prop, resolved);
},
```
How can I resolve this? If you propose a fix, please make it concise.
Last reviewed commit: 0dc25e0
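The receiver concern can be reproduced in isolation. The following is a minimal sketch, not code from the PR: `real`, `placeholder`, `forwardsReceiver`, and `forwardsTarget` are hypothetical names, and the `self` getter exists only to expose what `this` is bound to.

```typescript
// Hypothetical stand-in for a resolved subagent whose accessor depends on `this`.
const real = {
  get self(): unknown {
    return this;
  },
};

// Hypothetical stand-in for the "unavailable" placeholder used as the Proxy target.
const placeholder = {
  get self(): unknown {
    return this;
  },
};

// Variant 1: forwards the Proxy as receiver, so the getter runs with `this === proxy`.
const forwardsReceiver: { readonly self: unknown } = new Proxy(placeholder, {
  get(_target, prop, receiver) {
    return Reflect.get(real, prop, receiver);
  },
});

// Variant 2: passes the resolved object as receiver, so the getter runs with `this === real`.
const forwardsTarget: { readonly self: unknown } = new Proxy(placeholder, {
  get(_target, prop) {
    return Reflect.get(real, prop, real);
  },
});

const seenThroughReceiver = forwardsReceiver.self;
const seenThroughTarget = forwardsTarget.self;
```

In the first variant the getter observes the Proxy, in the second it observes the resolved object, which is the behavior the suggestion aims for.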
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0dc25e0276
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
import type { loadConfig } from "../config/config.js";
import { loadOpenClawPlugins } from "../plugins/loader.js";
import { getPluginRuntimeGatewayRequestScope } from "../plugins/runtime/gateway-request-scope.js";
import { setGatewaySubagentRuntime } from "../plugins/runtime/index.js";
Preserve lazy plugin runtime loading in gateway plugin setup
Importing setGatewaySubagentRuntime from plugins/runtime/index here eagerly loads the full plugin runtime module graph at server-plugins module load time. That bypasses the lazy-loading contract documented in src/plugins/loader.ts (the runtime is supposed to initialize only when actually needed), so gateway startup now pays the full runtime/channel import cost even when plugins are disabled or skipped.
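One common way to keep such an import lazy is to defer it behind a memoized loader. The sketch below is an assumption about how that could look, not the change made in the PR: `RuntimeModule`, `loadRuntimeModule`, `memoizeLoader`, and `bindGatewaySubagent` are hypothetical, and the stub loader stands in for a dynamic `import()` of the runtime module.

```typescript
// Hypothetical shape of the module whose import should be deferred.
interface RuntimeModule {
  setGatewaySubagentRuntime(subagent: unknown): void;
}

let loadCount = 0;
const boundSubagents: unknown[] = [];

// Stand-in for a dynamic import; a real call site would return `import("...")` here.
function loadRuntimeModule(): Promise<RuntimeModule> {
  loadCount += 1;
  return Promise.resolve({
    setGatewaySubagentRuntime(subagent: unknown): void {
      boundSubagents.push(subagent);
    },
  });
}

// Wraps a loader so the underlying load runs at most once, on first use.
function memoizeLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load();
    }
    return pending;
  };
}

const getRuntimeModule = memoizeLoader(loadRuntimeModule);

// Nothing is loaded until the gateway actually needs to bind a subagent.
async function bindGatewaySubagent(subagent: unknown): Promise<void> {
  const runtime = await getRuntimeModule();
  runtime.setGatewaySubagentRuntime(subagent);
}
```

With this shape, merely importing the setup module costs nothing; the runtime graph is pulled in only when `bindGatewaySubagent` is first called.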
0dc25e0 to 12443e2
🔒 Aisle Security Analysis
We found 3 potential security issue(s) in this PR:
1. 🔴 Privilege escalation: gateway-bindable plugin subagent uses synthetic admin client when no request scope (reachable via /tools/invoke)
Description
Gateway-bindable plugin runtimes can dispatch subagent operations through the gateway without inheriting the HTTP caller’s authorization context. When a plugin runtime subagent call occurs without an active
This enables a plugin tool invoked by a non-admin HTTP caller to perform admin-only gateway methods via
Vulnerable code paths (newly enabled):
Vulnerable code:
    function createSyntheticOperatorClient(): GatewayRequestOptions["client"] {
      return {
        connect: {
          role: "operator",
          scopes: ["operator.admin", "operator.approvals", "operator.pairing"],
        },
      };
    }
    await handleGatewayRequest({
      ...,
      client: scope?.client ?? createSyntheticOperatorClient(),
      context,
    });
Recommendation
Ensure plugin subagent dispatch never falls back to an admin-scoped synthetic client for non-gateway-request contexts. Suggested fixes:
    const scope = getPluginRuntimeGatewayRequestScope();
    if (!scope?.context || !scope?.client) {
      throw new Error("subagent methods require an active gateway request scope");
    }
    return await handleGatewayRequest({ ..., client: scope.client, context: scope.context });
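The fail-closed recommendation can be illustrated with Node's `AsyncLocalStorage`. This is a sketch under assumptions: the PR does not show how `getPluginRuntimeGatewayRequestScope` is implemented, so `RequestScope`, `withRequestScope`, `getRequestScope`, and `dispatchAsCaller`, along with the method and scope strings, are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical, simplified scope shape; the real scope type is not shown in the PR.
interface RequestScope {
  client: { role: string; scopes: string[] };
}

const requestScopeStorage = new AsyncLocalStorage<RequestScope>();

// Returns the scope of the request currently executing, if any.
function getRequestScope(): RequestScope | undefined {
  return requestScopeStorage.getStore();
}

// Runs `fn` with `scope` visible to everything it calls, including async continuations.
function withRequestScope<T>(scope: RequestScope, fn: () => T): T {
  return requestScopeStorage.run(scope, fn);
}

// Fails closed: there is no synthetic admin fallback when no request scope is active.
function dispatchAsCaller(method: string): string {
  const scope = getRequestScope();
  if (!scope) {
    throw new Error("subagent dispatch requires an active gateway request scope");
  }
  return `${method} as ${scope.client.role} [${scope.client.scopes.join(",")}]`;
}
```

Calling `dispatchAsCaller` outside `withRequestScope` throws, while a call made inside it runs with the caller's own role and scopes rather than an elevated default.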
2. 🟠 Process-global late-bound gateway subagent allows out-of-request privileged gateway dispatch
Description
The plugin runtime now supports process-global late-binding of
This creates an access-control/scope boundary break:
Impact (when plugins are considered less-trusted than core gateway code):
Vulnerable code (late binding to process-global state):
    return new Proxy(unavailable, {
      get(_target, prop, _receiver) {
        const resolved = gatewaySubagentState.subagent ?? unavailable;
        return Reflect.get(resolved, prop, resolved);
      },
    });
Related call stack / data flow:
RecommendationEnforce request scoping and prevent privileged fallback dispatch from being reachable via plugin runtime:
Example (strictly require scope in dispatch):
    import { getPluginRuntimeGatewayRequestScope } from "../plugins/runtime/gateway-request-scope.js";
    async function dispatchGatewayMethod<T>(method: string, params: Record<string, unknown>): Promise<T> {
      const scope = getPluginRuntimeGatewayRequestScope();
      if (!scope?.context || !scope?.client) {
        throw new Error("subagent dispatch requires an active gateway request scope");
      }
      // ... call handleGatewayRequest using scope.client/scope.context ...
    }
3. 🔵 Process-global gateway subagent runtime can be tampered with by any in-process plugin via Symbol.for/globalThis
Description
Because plugins are loaded as ordinary in-process Node modules (see
Impact if plugins are not fully trusted / are expected to be isolated from each other:
Vulnerable code:
    const GATEWAY_SUBAGENT_SYMBOL: unique symbol = Symbol.for(
      "openclaw.plugin.gatewaySubagentRuntime",
    ) as unknown as typeof GATEWAY_SUBAGENT_SYMBOL;
    const gatewaySubagentState: GatewaySubagentState = (() => {
      const g = globalThis as typeof globalThis & {
        [GATEWAY_SUBAGENT_SYMBOL]?: GatewaySubagentState;
      };
      const existing = g[GATEWAY_SUBAGENT_SYMBOL];
      if (existing) return existing;
      const created: GatewaySubagentState = { subagent: undefined };
      g[GATEWAY_SUBAGENT_SYMBOL] = created;
      return created;
    })();
Recommendation
If plugins are intended to be untrusted or at least isolated from each other, avoid exposing mutable gateway state via a predictable global symbol. Recommended fixes (pick based on desired threat model):
Example integrity-hardened pattern:
    // module-private
    let gatewaySubagent: PluginRuntime["subagent"] | undefined;
    export function setGatewaySubagentRuntime(subagent: PluginRuntime["subagent"]) {
      gatewaySubagent = subagent;
    }
    function getGatewaySubagentRuntime() {
      return gatewaySubagent;
    }
    function createLateBindingSubagent(/*...*/) {
      const unavailable = createUnavailableSubagentRuntime();
      return new Proxy(unavailable, {
        get(_t, prop) {
          const resolved = getGatewaySubagentRuntime() ?? unavailable;
          return Reflect.get(resolved, prop, resolved);
        },
      });
    }
Additionally, consider enforcing request-/tenant-scope authorization inside the gateway subagent methods themselves so that swapping references cannot bypass checks.
Analyzed PR: #46648 at commit
Last updated on: 2026-03-16T22:52:50Z
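The tampering concern follows from how the global symbol registry works: `Symbol.for` returns the same symbol to every caller that knows the key string. A minimal sketch, with a hypothetical key and state shape that are not the ones used in the PR:

```typescript
// Hypothetical registry key and state shape, for illustration only.
const STATE_KEY = "example.sharedRuntimeState";

interface SharedState {
  subagent: string | undefined;
}

const globals = globalThis as unknown as Record<symbol, SharedState | undefined>;

// "Core" module: stores mutable state under a registry symbol.
const coreSymbol: symbol = Symbol.for(STATE_KEY);
globals[coreSymbol] = { subagent: "trusted" };

// "Plugin" module: knowing the string is enough to obtain the identical symbol.
const pluginSymbol: symbol = Symbol.for(STATE_KEY);
const reachable = globals[pluginSymbol];
if (reachable) {
  reachable.subagent = "replaced-by-plugin";
}

// By contrast, a symbol created with Symbol() is unique and cannot be re-derived
// from its description, though it is still discoverable if stored on globalThis.
const privateSymbol: symbol = Symbol(STATE_KEY);
```

Here the "plugin" side overwrites state that the "core" side wrote, without importing anything from it.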
Keep plugin runtime subagent access unavailable by default, but allow explicit gateway-bound late binding for gateway plugin registries and embedded reply flows used by the lossless-claw context engine. Split loader cache entries by subagent mode and add targeted tests for the security boundary and the new opt-in propagation. Regeneration-Prompt: | Address the Aisle concern on PR openclaw#46648 without backing out the intent of the PR. The original change made plugin runtime subagent access late-bind to a process-global gateway runtime for every plugin runtime in the process, which widened the default capability boundary. Preserve the fix for cached plugin registries and the lossless-claw context engine, but require explicit opt-in for gateway-bound runtimes instead of changing createPluginRuntime() globally. Thread that opt-in through gateway-owned plugin loads and the embedded reply/compaction paths that power lossless-claw, keep CLI/local/cron paths on the unavailable default unless they explicitly request binding, separate loader cache keys by subagent mode, and update tests to prove the narrower security boundary and the new flag propagation.
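The combination of an explicit opt-in and a cache split by subagent mode can be sketched as follows. This is an illustration under assumptions: only the `allowGatewaySubagentBinding` flag and the `"gateway-bindable"` mode name appear in the PR discussion, while `RuntimeOptions`, `resolveSubagentMode`, `buildCacheKey`, `loadRegistry`, the `"default"` mode label, and the two-mode simplification are hypothetical.

```typescript
// Hypothetical option bag; only the flag name is taken from the PR discussion.
interface RuntimeOptions {
  allowGatewaySubagentBinding?: boolean | undefined;
}

// Simplified to two modes for illustration.
type SubagentMode = "default" | "gateway-bindable";

// Binding is opt-in: anything other than an explicit `true` stays on the default.
function resolveSubagentMode(options?: RuntimeOptions): SubagentMode {
  return options?.allowGatewaySubagentBinding === true ? "gateway-bindable" : "default";
}

// The mode is part of the cache key, so the two kinds of runtime never share an entry.
function buildCacheKey(workspaceDir: string, options?: RuntimeOptions): string {
  return JSON.stringify({ workspaceDir, runtimeSubagentMode: resolveSubagentMode(options) });
}

interface Registry {
  mode: SubagentMode;
}

const registryCache = new Map<string, Registry>();

function loadRegistry(workspaceDir: string, options?: RuntimeOptions): Registry {
  const key = buildCacheKey(workspaceDir, options);
  let entry = registryCache.get(key);
  if (!entry) {
    entry = { mode: resolveSubagentMode(options) };
    registryCache.set(key, entry);
  }
  return entry;
}
```

A load without the flag and a load with it for the same workspace yield distinct registries, so a default caller cannot receive a gateway-bindable runtime from the cache.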
💡 Codex Review
Reviewed commit: acf10e444d
💡 Codex Review
Reviewed commit: 2d5fde5022
💡 Codex Review
Reviewed commit: b74bf833db
Refresh the prep branch after rebasing onto current main by aligning two stale tests with current semantics and adding the required Unreleased changelog entry for the gateway-owned plugin subagent binding fix. The branch behavior is unchanged; this only fixes prep-time expectations that diverged from upstream. Regeneration-Prompt: | Prepare PR openclaw#46648 for merge after the functional fix already landed on the branch. The prep workflow required an Unreleased changelog entry and CI was failing on two rebased tests that no longer matched upstream behavior: configured-channel plugin loading now returns an empty configured set when a channel is explicitly disabled, and ACP session load for openai/gpt-5.4 no longer advertises xhigh in the available thought levels. Update only those stale expectations and add a generic changelog line for the gateway-owned plugin subagent binding fix without introducing plugin-specific naming or changing runtime behavior.
3997084 to 41a58ab
💡 Codex Review
Reviewed commit: 1579f2c5f8
💡 Codex Review
Reviewed commit: b1439a4c7c
Adjust the compact hooks test double for model resolution so TypeScript accepts the mocked resolveModelAsync signature used by the embedded runner tests. Regeneration-Prompt: | CI reported a TypeScript error in src/agents/pi-embedded-runner/compact.hooks.test.ts after adding resolveModelAsync coverage. Keep the test behavior the same, but make the mocked resolveModel/resolveModelAsync signatures match the real embedded runner model resolver shape so tsgo passes without broadening runtime behavior or changing production code.
Plugin runtime gateway opt-in was already reaching runtime-plugin bootstrap, but plugin tools created inside embedded agent runs still loaded through resolvePluginTools() without runtimeOptions. That left tools like lossless-claw's lcm_expand_query with the default blocked subagent even during Discord gateway requests. Thread allowGatewaySubagentBinding through createOpenClawTools/createOpenClawCodingTools and the embedded run + compaction tool builders so gateway-owned runs can late-bind subagent access without reopening the broader process-global default. Regeneration-Prompt: | A follow-up test on the PR branch showed lossless-claw still failing with "Plugin runtime subagent methods are only available during a gateway request" from lcm_expand_query over Discord. The earlier fix only covered runtime plugin bootstrap and context-engine loading. Investigate the actual tool execution path and preserve Aisle's security concern by keeping late-bound gateway subagent access opt-in rather than default. The missing hop is plugin tool creation inside embedded agent runs and compaction: resolvePluginTools, createOpenClawTools, createOpenClawCodingTools, and the run/compact callers need to thread the explicit allowGatewaySubagentBinding flag. Add focused regression tests that prove plugin tool resolution receives runtimeOptions and that the tool-construction layers forward the flag, while noting that the compact.hooks suite may still hang during Vitest worker teardown in this environment.
Keep plugin subagent late binding opt-in by default, but enable it for the remaining gateway-owned direct tool execution surfaces. Gateway HTTP tool invocation and inline tool dispatch both execute plugin tools inside an active gateway request, so they should instantiate plugin runtimes with the same explicit subagent binding as embedded agent runs. Regeneration-Prompt: | A follow-up runtime check still hit "Plugin runtime subagent methods are only available during a gateway request" after the embedded-run tool path was patched. Investigate other gateway-owned plugin tool entry points without changing the default runtime boundary. The important constraint is to preserve Aisle's concern: no process-wide late binding, only explicit gateway request opt-in. Update direct gateway tool execution surfaces that call createOpenClawTools during an active request so plugin tools can use runtime.subagent there too, and add a generic regression test for the HTTP path without mentioning any specific plugin by name.
The embedded runner was receiving allowGatewaySubagentBinding from gateway-owned reply flows, but it dropped that flag before calling runEmbeddedAttempt(). Plugin tools are instantiated inside the attempt layer, so lcm_expand_query still saw the default throwing runtime.subagent stub even after the earlier gateway-binding fixes. Add the missing forward in runEmbeddedPiAgent() and tighten the existing usage-reporting test to assert the flag reaches runEmbeddedAttempt(). Regeneration-Prompt: | The plugin subagent late-binding hardening introduced an explicit allowGatewaySubagentBinding opt-in, and lossless-claw still failed with the "Plugin runtime subagent methods are only available during a gateway request" error even on the PR branch. Compare the current behavior to commit 12443e2, where late binding still worked, and trace the full gateway-owned agent execution path instead of assuming the earlier tool/runtime patches were sufficient. The key requirement is to preserve the narrowed security boundary while restoring the original intent for legitimate gateway-owned tool execution. Check every handoff from the reply runners into runEmbeddedPiAgent() and then into the lower embedded attempt layer, because plugin tools are actually constructed there. If the opt-in flag is dropped before runEmbeddedAttempt(), forward it and add a regression assertion that verifies the embedded runner passes allowGatewaySubagentBinding through to runEmbeddedAttempt().
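The failure mode described above, where an opt-in flag is lost at one handoff between layers, can be shown with a small sketch. The function and type names below are hypothetical stand-ins for the runner and attempt layers, not the actual signatures in the repository; only the flag name is taken from the PR.

```typescript
interface RunOptions {
  prompt: string;
  allowGatewaySubagentBinding?: boolean | undefined;
}

interface AttemptOptions {
  prompt: string;
  attempt: number;
  allowGatewaySubagentBinding?: boolean | undefined;
}

// Hypothetical inner layer: this is where plugin tools would be constructed.
function runAttempt(options: AttemptOptions): "gateway-bindable" | "default" {
  return options.allowGatewaySubagentBinding === true ? "gateway-bindable" : "default";
}

// Buggy outer layer: rebuilds the options by hand and silently drops the flag.
function runAgentDroppingFlag(options: RunOptions): "gateway-bindable" | "default" {
  return runAttempt({ prompt: options.prompt, attempt: 1 });
}

// Fixed outer layer: forwards the flag explicitly to the inner layer.
function runAgentForwardingFlag(options: RunOptions): "gateway-bindable" | "default" {
  return runAttempt({
    prompt: options.prompt,
    attempt: 1,
    allowGatewaySubagentBinding: options.allowGatewaySubagentBinding,
  });
}
```

Because the flag is optional, the compiler does not complain when it is omitted, which is why a regression assertion at the innermost layer is the practical guard.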
Two gateway-owned plugin entry points were still rebuilding plugin registries in default subagent mode after the cache split: tools.catalog metadata loads and the context-engine onSubagentEnded reload path. Opt both into gateway subagent binding and pin the behavior with focused tests. Regeneration-Prompt: | After the main plugin subagent fixes, re-check unresolved PR review comments for gateway-owned paths that still rebuild plugin registries or plugin tools without allowGatewaySubagentBinding. In this branch, two valid gaps remain: tools.catalog builds plugin tool metadata during a gateway request, and subagent-registry reloads runtime plugins before calling context-engine onSubagentEnded. Both should use the same explicit gateway subagent binding as the other gateway-owned paths after the loader cache split. Patch only those two forwards and add minimal tests that assert the flag is passed through.
Reset provider-runtime hook cache in the ACP session-rate-limit suite so provider plugin thinking hooks from earlier tests do not leak xhigh support into loadSession assertions. Update the configured-channel loader fixture to include real channel config instead of relying on enabled:true, which does not count as meaningful configuration in this repo. Also opt cron-owned embedded runs into gateway subagent late binding and assert that handoff in the cron fast-mode test. Regeneration-Prompt: | The PR already fixed most gateway-owned plugin runtime paths after splitting the plugin loader cache by subagent mode, but a remaining codex comment pointed out cron isolated agent runs still called runEmbeddedPiAgent without allowGatewaySubagentBinding. Patch that cron path so plugin tools and context engines triggered by cron still get gateway-bound subagent access, and extend an existing cron test to assert the flag is forwarded. While checking follow-up CI failures, distinguish real regressions from stale or order-dependent tests. The loader test named for configured channel loads was using only enabled:true, but channel configuration detection now requires a meaningful config field, so make the fixture actually configured instead of changing production behavior. The ACP session-rate-limit test was flaky because provider-runtime hook cache could leak mocked xhigh support from provider-runtime.test.ts into the ACP file; reset that cache in the ACP suite so the expectation reflects isolated behavior.
Command-system-prompt flows build plugin tools through createOpenClawCodingTools, but they were still omitting allowGatewaySubagentBinding after the plugin loader cache was partitioned by runtimeSubagentMode. Opt that gateway-owned path into gateway subagent late binding and add a focused unit test that asserts resolveCommandsSystemPromptBundle forwards the flag when constructing command tools. Regeneration-Prompt: | After splitting plugin loader cache entries by runtime subagent mode, every gateway-owned tool/runtime construction path has to opt into allowGatewaySubagentBinding explicitly or it will get a separate default plugin runtime where runtime.subagent methods throw. A new review comment pointed out the command-system-prompt path used for command flows like /context and /export-session still built tools without that flag. Update that path to pass allowGatewaySubagentBinding: true and add a focused test around resolveCommandsSystemPromptBundle so future refactors do not drop the flag again.
b449373 to 61f9acd
💡 Codex Review
Reviewed commit: 61f9acdfbc
Outbound channel bootstrap can cold-load plugins to recover channel adapters when the active registry is missing the requested channel. After the plugin cache was partitioned by runtimeSubagentMode, loading that path without allowGatewaySubagentBinding could activate a separate default registry and overwrite global plugin commands and interactive handlers with runtimes where runtime.subagent throws. Pass the gateway-bindable runtime option through channel bootstrap and pin the behavior with a focused test. Regeneration-Prompt: | A review comment pointed out that adding runtimeSubagentMode to the plugin loader cache key means gateway-bindable and default loads now produce separate cached registries, but plugin commands and interactive handlers are still process-global and are only rebuilt on cache misses. The outbound channel bootstrap path in src/infra/outbound/channel-resolution.ts can cold-load plugins when a requested channel is missing from the active registry. If that bootstrap uses the default runtime mode inside the gateway process, it can replace global plugin handlers with runtimes whose subagent methods throw, and later gateway-mode cache hits will not rebuild them. Update that bootstrap load to pass allowGatewaySubagentBinding: true and add a regression test asserting the loader call includes the runtime option.
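The interaction between a mode-partitioned cache and process-global handlers can be modeled in a few lines. This is a simplified sketch of the scenario described above, not the loader's actual code: `loadPlugins`, `getActiveHandlerMode`, and the `"default"` mode label are hypothetical.

```typescript
type Mode = "default" | "gateway-bindable";

interface Registry {
  mode: Mode;
}

// Process-global handler slot: rebuilt only when a load misses the cache.
let activeHandlerMode: Mode | undefined;
const registryByMode = new Map<Mode, Registry>();

function getActiveHandlerMode(): Mode | undefined {
  return activeHandlerMode;
}

function loadPlugins(mode: Mode): Registry {
  const hit = registryByMode.get(mode);
  if (hit) {
    return hit; // cache hit: global handlers are left as they are
  }
  const registry: Registry = { mode };
  registryByMode.set(mode, registry);
  activeHandlerMode = mode; // cache miss: global handlers are re-registered
  return registry;
}
```

If a gateway-mode load is followed by a cold default-mode load, the global handlers end up bound to the default runtime, and a later gateway-mode cache hit does not restore them. Passing the gateway-bindable option on the bootstrap path avoids the second cache miss entirely.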
Merged via squash.
Thanks @jalehman!
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44742652c9
runtimeSubagentMode:
  options.runtimeOptions?.allowGatewaySubagentBinding === true
    ? "gateway-bindable"
    : options.runtimeOptions?.subagent
Propagate gateway subagent mode to slug-generator runs
Partitioning the plugin cache by runtimeSubagentMode means any gateway path that does not pass allowGatewaySubagentBinding now gets a separate default runtime where runtime.subagent.* throws. One gateway-owned embedded-run path still omits that opt-in: generateSlugViaLLM calls runEmbeddedPiAgent without the flag (src/hooks/llm-slug-generator.ts:53-66), so when the session-memory hook runs in gateway processes, plugin hooks/tools that rely on subagent APIs can fail during slug generation despite gateway startup having initialized subagent support.
openclaw#46648) Merged via squash. Prepared head SHA: 4474265 Co-authored-by: jalehman <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman
lazy-load channel runtime singletons * refactor: tighten plugin sdk channel seams * refactor: route remaining channel imports through plugin sdk * refactor(plugins): finish provider auth boundary cleanup * fix(infra): wire gaxios-fetch-compat shim to prevent node-fetch crash on Node.js 25 * fix(infra): also wire gaxios-fetch-compat shim into src/index.ts (gateway entry) * fix: keep gaxios compat off the package root (openclaw#47914) (thanks @pdd-cli) * refactor: shrink public channel plugin sdk surfaces * refactor: add private channel sdk bridges * fix: restore effective setup wizard lazy import * docs: add frontmatter to parallels discord skill * fix: retry runtime postbuild skill copy races * Gateway: lazily resolve channel runtime * Gateway: cover lazy channel runtime resolution * Plugins: add Claude marketplace registry installs (openclaw#48058) * Changelog: note Claude marketplace plugin support * Plugins: add Claude marketplace installs * E2E: cover marketplace plugin installs in Docker * UI: keep thinking helpers browser-safe * Infra: restore check after gaxios compat * Docs: refresh generated config baseline * docs: reorder unreleased changelog entries * fix: split browser-safe thinking helpers * fix(macos): restore debug build helpers (openclaw#48046) * Docs: add Claude marketplace plugin install guidance * Channels: expand contract suites * Channels: add contract surface coverage * Channels: centralize outbound payload contracts * Channels: centralize group policy contracts * Channels: centralize inbound context contracts * Tests: add contract runner * Tests: harden WhatsApp inbound contract cleanup * Tests: add extension test runner * Runtime: lazy-load Discord channel ops * Docs: use placeholders for marketplace plugin examples * Release: trim generated docs from npm pack * Runtime: lazy-load Telegram and Slack channel ops * Tests: detect changed extensions * Tests: cover changed extension detection * Docs: add extension test workflow * CI: add changed 
extension test lane * BlueBubbles: lazy-load channel runtime paths * Plugin SDK: restore scoped imports for bundled channels * Plugin SDK: consolidate shared channel exports * Channels: fix surface contract plugin lookup * Status: stabilize startup memory probes * Media: avoid slow auth misses in auto-detect * Tests: scope Codex bundle loader fixture * Tests: isolate bundle surface fixtures * feat(telegram): add configurable silent error replies (openclaw#19776) Port and complete openclaw#19776 on top of the current Telegram extension layout. Adds a default-off `channels.telegram.silentErrorReplies` setting. When enabled, Telegram bot replies marked as errors are delivered silently across the regular bot reply flow, native/slash command replies, and fallback sends. Thanks @auspic7 Co-authored-by: Myeongwon Choi <[email protected]> Co-authored-by: ImLukeF <[email protected]> * fix(ui): auto load Usage tab data on navigation * fix(telegram): keep silent error fallback replies quiet * test(gateway): restore agent request route mock * Cron: isolate active-model delivery tests * Tests: align media auth fixture with selection checks * Plugins: preserve lazy runtime provider resolution * Bootstrap: report nested entry import misses * fix(channels): parse bundled targets without plugin registry * test(telegram): cover shared parsing without registry * Plugin SDK: split setup and sandbox subpaths * Providers: centralize setup defaults and helper boundaries * Plugins: decouple bundled web search discovery * Plugin SDK: update entrypoint metadata * Secrets: honor caller env during runtime validation * Tests: align Docker cache checks with non-root images * Plugin SDK: keep root alias reflection lazy * Providers: scope compat resolution to owning plugins * Plugin SDK: add narrow setup subpaths * Plugin SDK: update entrypoint metadata * fix(slack): harden bolt import interop (openclaw#45953) * fix(slack): harden bolt import interop * fix(slack): simplify bolt interop resolver * 
fix(slack): harden startup bolt interop * fix(slack): place changelog entry at section end --------- Co-authored-by: Ubuntu <[email protected]> Co-authored-by: Altay <[email protected]> * Tests: fix green check typing regressions * Plugins: avoid booting bundled providers for catalog hooks * fix: bypass telegram runtime proxy during health checks * fix: align telegram probe test mock * test: remove stale synology zod mock * fix(android): reduce chat recomposition churn * fix(android): preserve chat message identity on refresh * fix(android): shrink chat image attachments * Browser: support non-Chrome existing-session profiles via userDataDir (openclaw#48170) Merged via squash. Prepared head SHA: e490035 Co-authored-by: velvet-shark <[email protected]> Co-authored-by: velvet-shark <[email protected]> Reviewed-by: @velvet-shark * fix(local-storage): improve VITEST environment check for localStorage access * fix: normalize discord commands allowFrom auth * test: update discord subagent hook mocks * test: mock telegram native command reply pipeline * fix(android): lazy-init node runtime after onboarding * docs(config): refresh generated baseline * Plugins: share channel plugin id resolution * Gateway: defer full channel plugins until after listen * Gateway: gate deferred channel startup behind opt-in * Docs: document deferred channel startup opt-in * feat(skills): preserve all skills in prompt via compact fallback before dropping (openclaw#47553) * feat(skills): add compact format fallback for skill catalog truncation When the full-format skill catalog exceeds the character budget, applySkillsPromptLimits now tries a compact format (name + location only, no description) before binary-searching for the largest fitting prefix. This preserves full model awareness of registered skills in the common overflow case. Three-tier strategy: 1. Full format fits → use as-is 2. Compact format fits → switch to compact, keep all skills 3. 
Compact still too large → binary search largest compact prefix Other changes: - escapeXml() utility for safe XML attribute values - formatSkillsCompact() emits same XML structure minus <description> - Compact char-budget check reserves 150 chars for the warning line the caller prepends, preventing prompt overflow at the boundary - 13 tests covering all tiers, edge cases, and budget reservation - docs/.generated/config-baseline.json: fix pre-existing oxfmt issue * docs: document compact skill prompt fallback --------- Co-authored-by: Frank Yang <[email protected]> * Gateway: simplify startup and stabilize mock responses tests * test: fix stale web search and boot-md contracts * Gateway tests: centralize mock responses provider setup * Gateway tests: share ordered client teardown helper * Infra: ignore ciao probing cancellations * Docs: repair unreleased changelog attribution * Docs: normalize unreleased changelog refs * Channels: ignore enabled-only disabled plugin config * perf: reduce status json startup memory * perf: lazy-load status route startup helpers * Plugins: stage local bundled runtime tree * Build: share root dist chunks across tsdown entries * fix(logging): make logger import browser-safe * fix(changelog): add entry for Control UI logger import fix (openclaw#48469) * fix(changelog): note Control UI logger import fix * fix(changelog): attribute Control UI logger fix entry * fix(changelog): credit original Control UI fix author * Plugins: remove public extension-api surface (openclaw#48462) * Plugins: remove public extension-api surface * Plugins: fix loader setup routing follow-ups * CI: ignore non-extension helper dirs in extension-fast * Docs: note extension-api removal as breaking * fix(ui): language dropdown selection not persisting after refresh (openclaw#48019) Merged via squash. 
Prepared head SHA: 06c8258 Co-authored-by: git-jxj <[email protected]> Co-authored-by: altaywtf <[email protected]> Reviewed-by: @altaywtf * fix(plugins): late-binding subagent runtime for non-gateway load paths (openclaw#46648) Merged via squash. Prepared head SHA: 4474265 Co-authored-by: jalehman <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman * fix: enable auto-scroll during assistant response streaming Fix auto-scroll behavior when AI assistant streams responses in the web UI. Previously, the viewport would remain at the sent message position and users had to manually click a badge to see streaming responses. Fixes openclaw#14959 Changes: - Reset chat scroll state before sending message to ensure viewport readiness - Force scroll to bottom after message send to position viewport correctly - Detect streaming start (chatStream: null -> string) and trigger auto-scroll - Ensure smooth scroll-following during entire streaming response Co-Authored-By: Claude Opus 4.6 <[email protected]> * fix(ui): align chatStream lifecycle type with nullable state * fix(whatsapp): restore implicit reply mentions for LID identities (openclaw#48494) Threads selfLid from the Baileys socket through the inbound WhatsApp pipeline and adds LID-format matching to the implicit mention check in group gating, so reply-to-bot detection works when WhatsApp sends the quoted sender in @lid format. Also fixes the device-suffix stripping regex (was a silent no-op). Closes openclaw#23029 Co-authored-by: sparkyrider <[email protected]> Reviewed-by: @ademczuk * fix(compaction): stabilize toolResult trim/prune flow in safeguard (openclaw#44133) Merged via squash. 
Prepared head SHA: ec789c6 Co-authored-by: SayrWolfridge <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman * Fix launcher startup regressions (openclaw#48501) * Fix launcher startup regressions * Fix CI follow-up regressions * Fix review follow-ups * Fix workflow audit shell inputs * Handle require resolve gaxios misses * fix: remove orphaned tool_result blocks during compaction (openclaw#15691) (openclaw#16095) Merged via squash. Prepared head SHA: b772432 Co-authored-by: claw-sylphx <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman * docs: rename onboarding user-facing wizard copy Co-authored-by: Tak <[email protected]> * fix(plugins): keep built plugin loading on one module graph (openclaw#48595) * Plugins: stabilize global catalog contracts * Channels: add global threading and directory contracts * Tests: improve extension runner discovery * CI: run global contract lane * Plugins: speed up auth-choice contracts * Plugins: fix catalog contract mocks * refactor: move provider catalogs into extensions * refactor: route bundled channel setup helpers through private sdk bridges * test: fix check contract type drift * Tests: lock plugin slash commands to one runtime graph * Tests: cover Discord provider plugin registry * Tests: pin loader command activation semantics * Tests: cover Telegram plugin auth on real registry * refactor(slack): share setup helpers * refactor(whatsapp): reuse shared normalize helpers * Tlon: lazy-load channel runtime paths * Tests: document Discord plugin auth gating * feat(plugins): add speech provider registration * docs(plugins): document capability ownership model * fix: detect Ollama "prompt too long" as context overflow error (openclaw#34019) Merged via squash. 
Prepared head SHA: 825a402 Co-authored-by: lishuaigit <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman * agent: preemptive context overflow detection during tool loops (openclaw#29371) Merged via squash. Prepared head SHA: 19661b8 Co-authored-by: keshav55 <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman --------- Co-authored-by: Peter Steinberger <[email protected]> Co-authored-by: MoerAI <[email protected]> Co-authored-by: Joey Krug <[email protected]> Co-authored-by: bot_apk <[email protected]> Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: Vincent Koc <[email protected]> Co-authored-by: Tak Hoffman <[email protected]> Co-authored-by: brokemac79 <[email protected]> Co-authored-by: ObitaBot <[email protected]> Co-authored-by: Prompt Driven <[email protected]> Co-authored-by: Nimrod Gutman <[email protected]> Co-authored-by: Gustavo Madeira Santana <[email protected]> Co-authored-by: Myeongwon Choi <[email protected]> Co-authored-by: Myeongwon Choi <[email protected]> Co-authored-by: ImLukeF <[email protected]> Co-authored-by: 郑耀宏 <[email protected]> Co-authored-by: Ayaan Zaidi <[email protected]> Co-authored-by: huntharo <[email protected]> Co-authored-by: Yauheni Shauchenka <[email protected]> Co-authored-by: Ubuntu <[email protected]> Co-authored-by: Altay <[email protected]> Co-authored-by: Radek Sienkiewicz <[email protected]> Co-authored-by: velvet-shark <[email protected]> Co-authored-by: Val Alexander <[email protected]> Co-authored-by: Hung-Che Lo <[email protected]> Co-authored-by: Frank Yang <[email protected]> Co-authored-by: git-jxj <[email protected]> Co-authored-by: git-jxj <[email protected]> Co-authored-by: altaywtf <[email protected]> Co-authored-by: Josh Lehman <[email protected]> Co-authored-by: jalehman <[email protected]> Co-authored-by: Jaewon Hwang <[email protected]> Co-authored-by: Claude Opus 4.6 <[email 
protected]> Co-authored-by: sparkyrider <[email protected]> Co-authored-by: sparkyrider <[email protected]> Co-authored-by: Sayr Wolfridge <[email protected]> Co-authored-by: SayrWolfridge <[email protected]> Co-authored-by: Clayton Shaw <[email protected]> Co-authored-by: claw-sylphx <[email protected]> Co-authored-by: Tak <[email protected]> Co-authored-by: lishuaigit <[email protected]> Co-authored-by: lishuaigit <[email protected]> Co-authored-by: Keshav Rao <[email protected]> Co-authored-by: keshav55 <[email protected]>
openclaw#46648) Merged via squash. Prepared head SHA: 4474265 Co-authored-by: jalehman <[email protected]> Co-authored-by: jalehman <[email protected]> Reviewed-by: @jalehman
Resolves an issue where plugins loaded via non-gateway code paths (e.g. loadSchemaWithPlugins) could not access subagent methods, even when the gateway had already initialized a real subagent runtime.
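A minimal sketch of the late-binding idea, for illustration only. It assumes a process-global state slot and a Proxy that resolves its target on every property access, as the review comment on `src/plugins/runtime/index.ts` describes. The `SubagentRuntime` shape, the `Symbol.for` key, and the names `createLateBindingSubagent` and `setGatewaySubagentRuntime` are hypothetical and not taken from the PR; only `gatewaySubagentState` and `unavailable` appear in the reviewed snippet.

```typescript
// Hypothetical minimal interface; the real subagent runtime exposes more methods.
interface SubagentRuntime {
  run(task: string): Promise<string>;
}

type GatewaySubagentState = { subagent?: SubagentRuntime };

// Process-global slot. A registered symbol lets separately loaded copies of
// this module share one state object (the key name is an assumption).
const STATE_KEY = Symbol.for("example.gatewaySubagentState");
const globalStore = globalThis as unknown as Record<symbol, GatewaySubagentState | undefined>;
const gatewaySubagentState: GatewaySubagentState = (globalStore[STATE_KEY] ??= {});

// Fallback used until the gateway binds a real runtime.
const unavailable: SubagentRuntime = {
  run: async () => {
    throw new Error("subagent runtime is unavailable outside the gateway");
  },
};

// Late-binding proxy: each property access re-resolves the current target,
// so a proxy handed out before the gateway starts still sees the real runtime later.
function createLateBindingSubagent(): SubagentRuntime {
  return new Proxy({} as SubagentRuntime, {
    get(_target, prop) {
      const resolved = gatewaySubagentState.subagent ?? unavailable;
      // Pass the resolved object as receiver so getters see the real target as `this`.
      return Reflect.get(resolved, prop, resolved);
    },
  });
}

// Hypothetical setter the gateway would call once its runtime exists.
function setGatewaySubagentRuntime(subagent: SubagentRuntime): void {
  gatewaySubagentState.subagent = subagent;
}
```

Under these assumptions, a plugin runtime created through a non-gateway path holds only the proxy. Its calls fail with the "unavailable" error until `setGatewaySubagentRuntime` runs, and reach the real runtime afterwards without the plugin being reloaded.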
This change introduces: