Fix launcher startup regressions#48501

Merged
Takhoffman merged 5 commits into main from codex/startup-memory-launcher-fix
Mar 16, 2026

@Takhoffman
Contributor

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: openclaw.mjs startup could fail after build when a transitive import was missing, and the launcher incorrectly reported that dist/entry.(m)js was missing.
  • Why it matters: this broke the startup-memory CI lane at the launcher boundary and hid the real root cause from developers.
  • What changed: made gaxios compat lazy/optional with a safe fallback, hardened the launcher to preserve transitive import errors, added launcher smoke coverage in CI, and added startup-memory failure guidance with an LLM-ready prompt.
  • What did NOT change (scope boundary): this PR does not change the separate startup RSS budget or reduce command memory usage.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • node openclaw.mjs --help now surfaces real transitive import failures instead of always collapsing them into a generic missing-build-output error.
  • startup-memory CI failures now print a short local repro path and an LLM-ready prompt.
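
A minimal sketch of how a launcher could tell a missing entry file apart from a missing transitive import. This is not the code in openclaw.mjs: the names `ImportErrorLike`, `EntryRef`, `isDirectEntryMiss`, and `classifyImportFailure` are hypothetical, and the sketch assumes the error either carries a `url` property or a message that begins with "Cannot find module '<path>'".

```typescript
// Hypothetical shape of an error raised by a failed dynamic import.
interface ImportErrorLike {
  code?: string;
  url?: string; // assumed to hold the URL of the module that was not found
  message?: string;
}

// Hypothetical description of the entry file the launcher tried to load.
interface EntryRef {
  url: string; // e.g. file:///app/dist/entry.js
  path: string; // e.g. /app/dist/entry.js
}

// True only when the module that could not be found is the entry file itself.
function isDirectEntryMiss(err: ImportErrorLike, entry: EntryRef): boolean {
  if (err.code !== "ERR_MODULE_NOT_FOUND") return false;
  if (err.url !== undefined) return err.url === entry.url;
  // Fallback when no url is attached: match the assumed message prefix.
  return (err.message ?? "").startsWith(`Cannot find module '${entry.path}'`);
}

type LaunchOutcome = "try-next-entry" | "rethrow";

// A direct miss lets the launcher try the next candidate (or print the
// friendly missing-build message); anything else is surfaced unchanged.
function classifyImportFailure(err: ImportErrorLike, entry: EntryRef): LaunchOutcome {
  return isDirectEntryMiss(err, entry) ? "try-next-entry" : "rethrow";
}
```

Under these assumptions, a missing dist/entry.js still leads to the friendly launcher message, while a missing package imported by the entry file is rethrown with its original error.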

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 24.14.0 for build/help verification; local shell for targeted tests
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): default local config; one unrelated stale plugin warning observed during status --json

Steps

  1. pnpm build
  2. node openclaw.mjs --help
  3. pnpm test:startup:memory

Expected

  • Built launcher commands succeed.
  • Missing transitive imports surface the real error.
  • startup-memory failures emit actionable guidance.

Actual

  • Verified locally after the patch: launcher smoke commands succeed and startup-memory passes on this branch.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: pnpm vitest run src/infra/gaxios-fetch-compat.test.ts test/openclaw-launcher.e2e.test.ts src/index.test.ts, pnpm build, node openclaw.mjs --help, node openclaw.mjs status --json --timeout 1, and pnpm test:startup:memory.
  • Edge cases checked: missing direct gaxios resolution falls back to the legacy fetch shim; launcher preserves transitive ERR_MODULE_NOT_FOUND; true missing build output still reports the friendly launcher message.
  • What you did not verify: full GitHub Actions execution on Linux runners.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert commit aa8180156c.
  • Files/config to restore: openclaw.mjs, src/entry.ts, src/index.ts, src/infra/gaxios-fetch-compat.ts, .github/workflows/ci.yml, scripts/check-cli-startup-memory.mjs.
  • Known bad symptoms reviewers should watch for: openclaw.mjs --help failing after build, generic missing-dist errors masking transitive import failures, or startup-memory failures without remediation guidance.

Risks and Mitigations

  • Risk: the fallback compat path could diverge from the directly patched gaxios path in nested dependency layouts.
    • Mitigation: targeted compat tests now cover both direct patching and fallback shim behavior.
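
For readers unfamiliar with the lazy/optional pattern, the fallback behavior described in this PR can be sketched roughly as follows. This is an illustration only, not the contents of src/infra/gaxios-fetch-compat.ts: `loadOptional`, `installCompat`, and their parameters are hypothetical names, and the loader is injected so the sketch stays self-contained.

```typescript
type InstallState = "installed" | "shimmed";

interface ErrorWithCode {
  code?: string;
}

// Error codes Node uses when a module cannot be resolved (ESM import and CJS require).
const MISSING_MODULE_CODES = new Set<string>(["ERR_MODULE_NOT_FOUND", "MODULE_NOT_FOUND"]);

// Runs `load`; returns null only when the module is missing and rethrows anything else.
async function loadOptional<T>(load: () => Promise<T>): Promise<T | null> {
  try {
    return await load();
  } catch (err) {
    const code = (err as ErrorWithCode | null)?.code;
    if (code !== undefined && MISSING_MODULE_CODES.has(code)) return null;
    throw err;
  }
}

// Patches the module when it is available, otherwise installs the fallback shim.
async function installCompat<T>(
  load: () => Promise<T>,
  patch: (mod: T) => void,
  installShim: () => void,
): Promise<InstallState> {
  const mod = await loadOptional(load);
  if (mod === null) {
    installShim();
    return "shimmed";
  }
  patch(mod);
  return "installed";
}
```

In a real call site the `load` argument would be something like `() => import("gaxios")`. Note that this sketch treats any missing-module error from the loader as "package absent" and does not separate a missing package from a missing dependency of that package.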

@aisle-research-bot

aisle-research-bot bot commented Mar 16, 2026

🔒 Aisle Security Analysis

We found 2 potential security issue(s) in this PR:

| # | Severity | Title |
|---|----------|-------|
| 1 | 🔵 Low | Race condition in async installGaxiosFetchCompat can allow requests before proxy/TLS dispatcher patch is applied |
| 2 | 🔵 Low | CI smoke test runs openclaw status --json which performs outbound network calls (npm registry, git fetch, gateway probe) despite --timeout 1 |

1. 🔵 Race condition in async installGaxiosFetchCompat can allow requests before proxy/TLS dispatcher patch is applied

| Property | Value |
|----------|-------|
| Severity | Low |
| CWE | CWE-362 |
| Location | src/infra/gaxios-fetch-compat.ts:268-274 |

Description

The async initializer installGaxiosFetchCompat() uses a global installState guard that returns immediately for any state other than "not-installed".

Because the function now performs asynchronous work (dynamic module resolution/import), concurrent callers can observe installState === "installing" and return without waiting for the patch to complete. If such a caller proceeds to make HTTP requests via gaxios immediately after its awaited installGaxiosFetchCompat() call resolves, those requests may run without the compat fetch, meaning:

  • proxy / NO_PROXY and HTTP(S)_PROXY translation to undici ProxyAgent may not be applied
  • mTLS options (cert/key) may not be applied via an undici dispatcher

This creates a startup-time window where requests can bypass expected proxy/TLS routing controls.

Vulnerable code:

export async function installGaxiosFetchCompat(): Promise<void> {
  if (installState !== "not-installed" || typeof globalThis.fetch !== "function") {
    return;
  }

  installState = "installing";
  // async work ...
}

Notes on current repo call sites:

  • Current internal call sites (src/entry.ts, src/index.ts) are awaited and appear to be invoked once, so exploitability depends on whether other code (plugins/consumers) can call this concurrently or trigger network traffic during startup. However, the function is exported and the race is real if used concurrently.
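
The early-return window described above can be reproduced in isolation. The following is a simplified simulation, not the repository code: `simulateDynamicImport`, `caller`, and `demo` are hypothetical, and a few microtask turns stand in for the real dynamic import.

```typescript
type InstallState = "not-installed" | "installing" | "installed";

let installState: InstallState = "not-installed";
let patched = false;

// Stand-in for the dynamic import: yields several microtask turns.
async function simulateDynamicImport(): Promise<void> {
  for (let i = 0; i < 5; i++) {
    await Promise.resolve();
  }
}

// Mirrors the quoted guard: any state other than "not-installed" returns at once.
async function installCompat(): Promise<void> {
  if (installState !== "not-installed") return;
  installState = "installing";
  await simulateDynamicImport();
  patched = true;
  installState = "installed";
}

// Records whether the patch was in place when this caller's await resolved.
async function caller(observed: boolean[]): Promise<void> {
  await installCompat();
  observed.push(patched);
}

// Starts two callers concurrently and returns what each one observed.
async function demo(): Promise<boolean[]> {
  const observed: boolean[] = [];
  await Promise.all([caller(observed), caller(observed)]);
  return observed;
}
```

In this simulation the second caller sees `installState === "installing"`, returns immediately, and records `false` because the patch has not been applied yet. That is the window in which a real request would run without the compat fetch.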

Recommendation

Make installation idempotent and awaitable by sharing a single in-flight promise. Callers arriving during "installing" should await the same promise rather than returning early.

Example fix:

let installPromise: Promise<void> | null = null;

export function installGaxiosFetchCompat(): Promise<void> {
  if (typeof globalThis.fetch !== "function") return Promise.resolve();
  if (installState === "installed" || installState === "shimmed") return Promise.resolve();
  if (installState === "installing" && installPromise) return installPromise;

  installState = "installing";
  installPromise = (async () => {
    const Gaxios = await loadGaxiosConstructor();
    if (!Gaxios) {
      installLegacyWindowFetchShim();
      installState = "shimmed";
      return;
    }

    const prototype = Gaxios.prototype;
    const originalDefaultAdapter = prototype._defaultAdapter;
    const compatFetch = createGaxiosCompatFetch();

    prototype._defaultAdapter = function patchedDefaultAdapter(this: unknown, config: GaxiosFetchRequestInit) {
      return originalDefaultAdapter.call(this, config.fetchImplementation ? config : {
        ...config,
        fetchImplementation: compatFetch,
      });
    };

    installState = "installed";
  })().catch(err => {
    installState = "not-installed";
    throw err;
  });

  return installPromise;
}

Optionally, consider whether "shimmed" should allow retrying installation if gaxios becomes available later (to avoid a permanent degraded mode).


2. 🔵 CI smoke test runs openclaw status --json which performs outbound network calls (npm registry, git fetch, gateway probe) despite --timeout 1

| Property | Value |
|----------|-------|
| Severity | Low |
| CWE | CWE-693 |
| Location | .github/workflows/ci.yml:322-326 |

Description

The new CI smoke test step runs node openclaw.mjs status --json --timeout 1 on PRs. The status --json command performs multiple network-capable operations that are not disabled by --timeout 1:

  • NPM registry HTTP request: getUpdateCheckResult({ includeRegistry: true }) calls fetchWithTimeout("https://registry.npmjs.org/openclaw/<tag>").
  • Git remote fetch: getUpdateCheckResult({ fetchGit: true }) runs git fetch --quiet --prune.
  • Gateway probe WebSocket connection attempt: scanStatusJsonFast() calls probeGateway({ url, timeoutMs }), which creates a GatewayClient connection to the resolved Gateway URL.

Even with --timeout 1, the gateway probe enforces a minimum 250ms timer, and the update check uses a fixed ~2500ms timeout (independent of CLI --timeout).

Security impact / risk in CI context:

  • Outbound requests from PR jobs can be undesirable (policy/compliance) and may leak metadata (runner IP, timing) to external services.
  • Adds flakiness due to reliance on network availability.

Relevant code locations:

  • Workflow step: .github/workflows/ci.yml
  • Status JSON scan triggers update check + probe: src/commands/status.scan.fast-json.ts (getUpdateCheckResult({ fetchGit: true, includeRegistry: true }), resolveGatewayProbeSnapshot())
  • Registry fetch + git fetch implementation: src/infra/update-check.ts (fetchWithTimeout("https://registry.npmjs.org/... "), git fetch --quiet --prune)
  • Gateway probe uses WebSocket client + min timer: src/gateway/probe.ts (Math.max(250, opts.timeoutMs) timer)
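
To make the timing claim concrete, here is a small sketch of the arithmetic, assuming the CLI `--timeout` value reaches the probe as milliseconds. The constant and function names are hypothetical; only the `Math.max(250, opts.timeoutMs)` expression and the approximate 2500 ms figure come from the analysis above.

```typescript
// Floor applied by the gateway probe, per the Math.max(250, opts.timeoutMs) expression cited above.
const MIN_PROBE_TIMEOUT_MS = 250;
// Approximate fixed update-check timeout cited above; independent of the CLI flag.
const UPDATE_CHECK_TIMEOUT_MS = 2500;

// The probe never waits less than the floor, whatever the CLI passes in.
function effectiveProbeTimeoutMs(cliTimeoutMs: number): number {
  return Math.max(MIN_PROBE_TIMEOUT_MS, cliTimeoutMs);
}

interface EffectiveTimeouts {
  probeMs: number;
  updateCheckMs: number;
}

// Collects the timeouts that actually apply for a given CLI value.
function effectiveTimeouts(cliTimeoutMs: number): EffectiveTimeouts {
  return {
    probeMs: effectiveProbeTimeoutMs(cliTimeoutMs),
    updateCheckMs: UPDATE_CHECK_TIMEOUT_MS,
  };
}
```

Under that assumption, `effectiveTimeouts(1)` yields a 250 ms probe timer and a 2500 ms update-check timeout, so `--timeout 1` shortens neither network-capable step to 1 ms.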

Recommendation

Make the CI smoke test deterministic and offline-safe:

  • Prefer a pure local smoke test that does not reach the network (e.g., only --help, or a dedicated --smoke/--offline mode).
  • If you need to keep status --json in CI, add and use a flag/env to disable update checks and probes in smoke contexts (example):
- name: Smoke test CLI launcher status json (offline)
  env:
    OPENCLAW_DISABLE_UPDATE_CHECK: "1"
    OPENCLAW_DISABLE_GATEWAY_PROBE: "1"
  run: node openclaw.mjs status --json --timeout 1

and gate the implementation in code (example):

const disableUpdate = process.env.OPENCLAW_DISABLE_UPDATE_CHECK === "1";
const updatePromise = disableUpdate
  ? Promise.resolve({ /* minimal update object */ })
  : getUpdateCheckResult({ timeoutMs: updateTimeoutMs, fetchGit: true, includeRegistry: true });

This keeps CI from making outbound HTTP/git requests and reduces flakiness.


Analyzed PR: #48501 at commit be78da9

Last updated on: 2026-03-16T23:00:41Z

@openclaw-barnacle openclaw-barnacle bot added the docs, scripts, size: XL, and maintainer labels Mar 16, 2026
@Takhoffman Takhoffman force-pushed the codex/startup-memory-launcher-fix branch from aa81801 to b53818c Compare March 16, 2026 21:41
@openclaw-barnacle openclaw-barnacle bot added the size: M label and removed the docs and size: XL labels Mar 16, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 16, 2026

Greptile Summary

This PR fixes two startup regressions: (1) the openclaw.mjs launcher now distinguishes direct entry-file misses from transitive import failures, surfacing the real root cause instead of always printing a generic "missing dist/entry.(m)js" message; and (2) installGaxiosFetchCompat is made async/lazy so a missing gaxios package no longer crashes startup—it falls back to the legacy window-fetch shim instead. Supporting changes add CI smoke steps for the launcher, structured actionable guidance for startup-memory failures, and a new plugin-boundary ratchet script + baseline.

  • openclaw.mjs: The tryImport helper uses a URL-aware isDirectModuleNotFoundError check so transitive ERR_MODULE_NOT_FOUND errors are re-thrown and visible. Complementary e2e tests cover both the transitive-failure and truly-missing-dist paths.
  • gaxios-fetch-compat.ts: Top-level import { Gaxios } from "gaxios" removed; loading is now lazy via loadGaxiosConstructor(). The fallback installLegacyWindowFetchShim is idempotent. entry.ts and index.ts correctly await the now-async function. One concern: the idempotency guard (installState !== "not-installed") is evaluated before the first await, leaving a window where concurrent callers can both proceed and double-wrap _defaultAdapter.
  • check-plugin-boundary-ratchet.mjs: Solid new lint script using TypeScript AST parsing to track cross-boundary imports in bundled plugins against a snapshot baseline, supporting gradual migration without blocking on a single large refactor.
  • check-cli-startup-memory.mjs: Failure output now includes a local repro command and an LLM-ready prompt, making CI failures directly actionable.
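
As an illustration of the ratchet idea mentioned for check-plugin-boundary-ratchet.mjs, the comparison against a snapshot baseline might look like the sketch below. This is not the script's actual logic: `BoundaryImport`, `compareToBaseline`, and the treatment of resolved entries are assumptions, and the AST scan that would produce the `current` list is omitted.

```typescript
// Hypothetical record: one cross-boundary import found in a plugin source file.
interface BoundaryImport {
  file: string;
  specifier: string;
}

// Stable key so entries can be compared as strings.
const keyOf = (v: BoundaryImport): string => `${v.file}::${v.specifier}`;

interface RatchetResult {
  added: string[]; // present now but not in the baseline: would fail the check
  resolved: string[]; // in the baseline but gone now: the baseline can shrink
}

// Compares the current scan with the baseline in both directions.
function compareToBaseline(current: BoundaryImport[], baseline: BoundaryImport[]): RatchetResult {
  const currentKeys = new Set<string>(current.map(keyOf));
  const baselineKeys = new Set<string>(baseline.map(keyOf));
  const added = Array.from(currentKeys).filter((k) => !baselineKeys.has(k)).sort();
  const resolved = Array.from(baselineKeys).filter((k) => !currentKeys.has(k)).sort();
  return { added, resolved };
}
```

A check built this way only fails when `added` is non-empty, which is what allows existing violations to be migrated gradually rather than in one large refactor.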

Confidence Score: 4/5

  • Safe to merge; the core launcher and gaxios fixes are correct and well-tested, with one low-risk async idempotency gap in the compat module.
  • The launcher fix is clean and URL-precise. The gaxios lazy-load fallback is functionally correct for all realistic single-caller paths. The one real concern — the async gap between the installState guard and the first await in installGaxiosFetchCompat — would only manifest with concurrent callers, which is not how any current call site uses it. All changes ship with targeted tests and the new CI smoke steps catch launcher regressions early.
  • src/infra/gaxios-fetch-compat.ts — review the async idempotency gap in installGaxiosFetchCompat before the pattern is reused elsewhere.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/infra/gaxios-fetch-compat.ts
Line: 228-246

Comment:
**Async gap leaves idempotency guard open to concurrent calls**

`installState` is checked before the `await loadGaxiosConstructor()` call but is not updated until after the `await` resolves. If two callers invoke `installGaxiosFetchCompat()` concurrently (both are awaited on separate microtask queues before either resolves), both will pass the `installState !== "not-installed"` guard simultaneously. The first caller then patches `Gaxios.prototype._defaultAdapter` and sets `installState = "installed"`, but the second caller has already captured `originalDefaultAdapter` before the first patched it — or the second call proceeds and double-wraps the adapter.

In practice this is low risk today (both call sites are serial), but the invariant is fragile. A minimal fix is to set `installState` to a sentinel before the `await`:

```typescript
export async function installGaxiosFetchCompat(): Promise<void> {
  if (installState !== "not-installed" || typeof globalThis.fetch !== "function") {
    return;
  }
  installState = "installed"; // claim the slot before the first await

  const Gaxios = await loadGaxiosConstructor();
  if (!Gaxios) {
    installLegacyWindowFetchShim();
    installState = "shimmed";
    return;
  }
  // ... rest of install
}
```

(Adjust the sentinel value as appropriate for the final state machine.)

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: test/openclaw-launcher.e2e.test.ts
Line: 7-16

Comment:
**Temp directory leaks if `makeLauncherFixture` throws mid-setup**

`fs.mkdtemp` creates the directory before `fs.copyFile` and `fs.mkdir` are called. If either of those subsequent operations throws, the function exits without returning the path, so the caller never pushes it to `fixtureRoots`, and the `afterEach` cleanup never runs on it.

Consider registering the path for cleanup as soon as it is created:

```typescript
async function makeLauncherFixture(fixtureRoots: string[]): Promise<string> {
  const fixtureRoot = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-launcher-"));
  fixtureRoots.push(fixtureRoot); // register immediately so afterEach can clean it up
  await fs.copyFile(
    path.resolve(process.cwd(), "openclaw.mjs"),
    path.join(fixtureRoot, "openclaw.mjs"),
  );
  await fs.mkdir(path.join(fixtureRoot, "dist"), { recursive: true });
  return fixtureRoot;
}
```

Each test would then pass `fixtureRoots` to the helper instead of pushing after the fact.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: b53818c


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2dc3411350

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26f6757a3c

@Takhoffman Takhoffman merged commit 313e5bb into main Mar 16, 2026
29 of 37 checks passed
@Takhoffman Takhoffman deleted the codex/startup-memory-launcher-fix branch March 16, 2026 22:21
vincentkoc pushed a commit to vincentkoc/openclaw that referenced this pull request Mar 17, 2026
* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses
jonBoone added a commit to jonBoone/openclaw that referenced this pull request Mar 17, 2026
* refactor: make setup the primary wizard surface

* test: move setup surface coverage

* docs: prefer setup wizard command

* fix: follow up shared interactive regressions (openclaw#47715)

* fix(plugins): resolve lazy runtime from package root

* fix(daemon): accept 'Last Result' schtasks key variant on Windows (openclaw#47726)

Some Windows locales/versions emit 'Last Result' instead of 'Last Run Result' in schtasks output, causing gateway status to falsely report 'Runtime: unknown'. Fall back to the shorter key when the canonical key is absent.

* fix: accept schtasks Last Result key on Windows (openclaw#47844) (thanks @MoerAI)

* refactor(plugins): move auth profile hooks into providers

* fix: resume orphaned subagent sessions after SIGUSR1 reload

Closes openclaw#47711

After a SIGUSR1 gateway reload aborts in-flight subagent LLM calls, the gateway now scans for orphaned sessions and sends a synthetic resume message to restart their work. Also makes the deferral timeout configurable via gateway.reload.deferralTimeoutMs (default: 5 minutes, up from 90s).

* fix: address Greptile review feedback

- Remove unrelated pnpm-lock.yaml changes
- Move abortedLastRun flag clearing to AFTER successful resume
  (prevents permanent session loss on transient gateway failures)
- Use dynamic import for orphan recovery module to avoid startup
  memory overhead
- Add test assertion that flag is preserved on resume failure

* fix: add retry with exponential backoff for orphan recovery

Addresses Codex review feedback — if recovery fails (e.g. gateway
still booting), retries up to 3 times with exponential backoff
(5s → 10s → 20s) before giving up.

* fix: address all review comments on PR openclaw#47719 + implement resume context and config idempotency guard

* fix: address 6 review comments on PR openclaw#47719

1. [P1] Treat remap failures as resume failures — if replaceSubagentRunAfterSteer
   returns false, do NOT clear abortedLastRun, increment failed count.

2. [P2] Count scan-level exceptions as retryable failures — set result.failed > 0
   in the outer catch block so scheduleOrphanRecovery retry logic triggers.

3. [P2] Persist resumed-session dedupe across recovery retries — accept
   resumedSessionKeys as a parameter; scheduleOrphanRecovery lifts the Set to
   its own scope and passes it through retries.

4. [Greptile] Use typed config accessors instead of raw structural cast for TLS
   check in lifecycle.ts.

5. [Greptile] Forward gateway.reload.deferralTimeoutMs to deferGatewayRestartUntilIdle
   in scheduleGatewaySigusr1Restart so user-configured value is not silently ignored.

6. [Greptile] Same as item 4; already addressed by the typed config fix.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix: land SIGUSR1 orphan recovery regressions (openclaw#47719) (thanks @joeykrug)

* refactor: move channel messaging hooks into plugins

* fix: stabilize windows parallels smoke harness

* refactor(plugins): move provider onboarding auth into plugins

* docs: restore onboard as canonical setup command

* docs: restore onboard docs references

* refactor: move channel capability diagnostics into plugins

* refactor: split slack block action handling

* refactor: split plugin interactive dispatch adapters

* refactor: unify reply content checks

* refactor: extract discord shared interactive mapper

* refactor: unify telegram interactive button resolution

* build: add land gate parity script

* test: fix setup wizard smoke mocks

* docs: sync config baseline

* Status: split heartbeat summary helpers

* Security: trim audit policy import surfaces

* Security: lazy-load deep skill audit helpers

* Security: lazy-load audit config snapshot IO

* Config: keep native command defaults off heavy channel registry

* Status: split lightweight gateway agent list

* refactor(plugins): simplify provider auth choice metadata

* test: add openshell sandbox e2e smoke

* feishu: add structured card actions and interactive approval flows (openclaw#47873)

* feishu: add structured card actions and interactive approval flows

* feishu: address review fixes and test-gate regressions

* feishu: hold inflight card dedup until completion

* feishu: restore fire-and-forget bot menu handling

* feishu: format card interaction helpers

* Feishu: add changelog entry for card interactions

* Feishu: add changelog entry for ACP session binding

* build: remove land gate script

* Status: lazy-load tailscale and memory scan deps

* Tests: fix Feishu full registration mock

* Tests: cover plugin capability matrix

* Gateway: import normalizeAgentId in hooks

* fix: recover bonjour advertiser from ciao announce loops

* fix: preserve loopback gateway scopes for local auth

* Status: lazy-load summary session helpers

* Status: lazy-load security audit commands

* refactor: move channel delivery and ACP seams into plugins

* Security: split audit runtime surfaces

* Tests: add channel actions contract helper

* Tests: add channel plugin contract helper

* Tests: add Slack channel contract suite

* Tests: add Mattermost channel contract suite

* Tests: add Telegram channel contract suite

* Tests: add Discord channel contract suite

* fix(session): preserve external channel route when webchat views session (openclaw#47745)

When a Telegram/WhatsApp/iMessage session was viewed or messaged from the
dashboard/webchat, resolveLastChannelRaw() unconditionally returned 'webchat'
for any isDirectSessionKey() or isMainSessionKey() match, overwriting the
persisted external delivery route.

This caused subagent completion events to be delivered to the webchat/dashboard
instead of the original channel (Telegram, WhatsApp, etc.), silently dropping
messages for the channel user.

Fix: only allow webchat to own routing when no external delivery route has been
established (no persisted external lastChannel, no external channel hint in the
session key). If an external route exists, webchat is treated as admin/monitoring
access and must not mutate the delivery route.

Updated/added tests to document the correct behaviour.

Fixes openclaw#47745

* fix: address bot nit on session route preservation (openclaw#47797) (thanks @brokemac79)

* Tests: add plugin contract suites

* Tests: add plugin contract registry

* Tests: add global plugin contract suite

* Tests: add global actions contract suite

* Tests: add global setup contract suite

* Tests: add global status contract suite

* Tests: replace local channel contracts

* refactor(plugins): move onboarding auth metadata to manifests

* refactor: move remaining channel seams into plugins

* refactor: add plugin-owned outbound adapters

* fix: scope localStorage settings key by basePath to prevent cross-deployment conflicts

- Add settingsKeyForGateway() function similar to tokenSessionKeyForGateway()
- Use scoped key format: openclaw.control.settings.v1:https://example.com/gateway-a
- Add migration from legacy static key on load
- Fixes openclaw#47481

* Tests: add provider contract suites

* Tests: add provider contract registry

* Tests: add global provider contract suite

* Tests: add global web search contract suite

* fix: stabilize ci gate

* Tests: add provider registry contract suite

* Tests: relax provider auth hint contract

* !refactor(browser): remove Chrome extension path and add MCP doctor migration (openclaw#47893)

* Browser: replace extension path with Chrome MCP

* Browser: clarify relay stub and doctor checks

* Docs: mark browser MCP migration as breaking

* Browser: reject unsupported profile drivers

* Browser: accept clawd alias on profile create

* Doctor: narrow legacy browser driver migration

* feishu: harden media support and align capability docs (openclaw#47968)

* feishu: harden media support and action surface

* feishu: format media action changes

* feishu: fix review follow-ups

* fix: scope Feishu target aliases to Feishu (openclaw#47968) (thanks @Takhoffman)

* fix: make docs i18n use gpt-5.4 overrides

* docs: regenerate zh-CN onboarding references

* Tests: tighten provider wizard contracts

* Tests: add plugin loader contract suite

* refactor: remove dock shim and move session routing into plugins

* Plugins: add provider runtime contracts

* GitHub Copilot: move runtime tests to provider contracts

* Z.ai: move runtime tests to provider contracts

* Anthropic: move runtime tests to provider contracts

* Google: move runtime tests to provider contracts

* OpenAI: move runtime tests to provider contracts

* Qwen Portal: move runtime tests to provider contracts

* fix(core): restore outbound fallbacks and gate checks

* style(core): normalize rebase fallout

* fix: accept sandbox plugin id hints

* Plugins: capture tool registrations in test registry

* Plugins: cover Firecrawl tool ownership

* Firecrawl: drop local registration contract test

* Plugins: add provider catalog contracts

* Plugins: narrow provider runtime contracts

* refactor(plugin-sdk): use scoped core imports for bundled channels

* fix: unblock ci gates

* Plugins: add provider wizard contracts

* fix: unblock docs and registry checks

* refactor: finish plugin-owned channel runtime seams

* refactor(plugin-sdk): clean shared core imports

* Plugins: add provider auth contracts

* Plugins: dedupe routing imports in channel adapters

* Plugins: add provider discovery contracts

* fix: stop bonjour before re-advertising

* Plugins: extend provider discovery contracts

* docs: codify macOS parallels discord smoke

* refactor: move session lifecycle and outbound fallbacks into plugins

* refactor(plugins): derive compat provider ids from manifests

* Plugins: cover catalog discovery providers

* Tests: type auth contract prompt mocks

* fix: mount CLI auth dirs in docker live tests

* refactor: route shared channel sdk imports through plugin seams

* Plugins: add auth choice contracts

* Plugins: restore routing seams and discovery fixtures

* fix: harden bonjour retry recovery

* Runtime: lazy-load channel runtime singletons

* refactor: tighten plugin sdk channel seams

* refactor: route remaining channel imports through plugin sdk

* refactor(plugins): finish provider auth boundary cleanup

* fix(infra): wire gaxios-fetch-compat shim to prevent node-fetch crash on Node.js 25

* fix(infra): also wire gaxios-fetch-compat shim into src/index.ts (gateway entry)

* fix: keep gaxios compat off the package root (openclaw#47914) (thanks @pdd-cli)

* refactor: shrink public channel plugin sdk surfaces

* refactor: add private channel sdk bridges

* fix: restore effective setup wizard lazy import

* docs: add frontmatter to parallels discord skill

* fix: retry runtime postbuild skill copy races

* Gateway: lazily resolve channel runtime

* Gateway: cover lazy channel runtime resolution

* Plugins: add Claude marketplace registry installs (openclaw#48058)

* Changelog: note Claude marketplace plugin support

* Plugins: add Claude marketplace installs

* E2E: cover marketplace plugin installs in Docker

* UI: keep thinking helpers browser-safe

* Infra: restore check after gaxios compat

* Docs: refresh generated config baseline

* docs: reorder unreleased changelog entries

* fix: split browser-safe thinking helpers

* fix(macos): restore debug build helpers (openclaw#48046)

* Docs: add Claude marketplace plugin install guidance

* Channels: expand contract suites

* Channels: add contract surface coverage

* Channels: centralize outbound payload contracts

* Channels: centralize group policy contracts

* Channels: centralize inbound context contracts

* Tests: add contract runner

* Tests: harden WhatsApp inbound contract cleanup

* Tests: add extension test runner

* Runtime: lazy-load Discord channel ops

* Docs: use placeholders for marketplace plugin examples

* Release: trim generated docs from npm pack

* Runtime: lazy-load Telegram and Slack channel ops

* Tests: detect changed extensions

* Tests: cover changed extension detection

* Docs: add extension test workflow

* CI: add changed extension test lane

* BlueBubbles: lazy-load channel runtime paths

* Plugin SDK: restore scoped imports for bundled channels

* Plugin SDK: consolidate shared channel exports

* Channels: fix surface contract plugin lookup

* Status: stabilize startup memory probes

* Media: avoid slow auth misses in auto-detect

* Tests: scope Codex bundle loader fixture

* Tests: isolate bundle surface fixtures

* feat(telegram): add configurable silent error replies (openclaw#19776)

Port and complete openclaw#19776 on top of the current Telegram extension layout.

Adds a default-off `channels.telegram.silentErrorReplies` setting. When enabled, Telegram bot replies marked as errors are delivered silently across the regular bot reply flow, native/slash command replies, and fallback sends.

Thanks @auspic7 

Co-authored-by: Myeongwon Choi <[email protected]>
Co-authored-by: ImLukeF <[email protected]>

* fix(ui): auto load Usage tab data on navigation

* fix(telegram): keep silent error fallback replies quiet

* test(gateway): restore agent request route mock

* Cron: isolate active-model delivery tests

* Tests: align media auth fixture with selection checks

* Plugins: preserve lazy runtime provider resolution

* Bootstrap: report nested entry import misses

* fix(channels): parse bundled targets without plugin registry

* test(telegram): cover shared parsing without registry

* Plugin SDK: split setup and sandbox subpaths

* Providers: centralize setup defaults and helper boundaries

* Plugins: decouple bundled web search discovery

* Plugin SDK: update entrypoint metadata

* Secrets: honor caller env during runtime validation

* Tests: align Docker cache checks with non-root images

* Plugin SDK: keep root alias reflection lazy

* Providers: scope compat resolution to owning plugins

* Plugin SDK: add narrow setup subpaths

* Plugin SDK: update entrypoint metadata

* fix(slack): harden bolt import interop (openclaw#45953)

* fix(slack): harden bolt import interop

* fix(slack): simplify bolt interop resolver

* fix(slack): harden startup bolt interop

* fix(slack): place changelog entry at section end

---------

Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Altay <[email protected]>

* Tests: fix green check typing regressions

* Plugins: avoid booting bundled providers for catalog hooks

* fix: bypass telegram runtime proxy during health checks

* fix: align telegram probe test mock

* test: remove stale synology zod mock

* fix(android): reduce chat recomposition churn

* fix(android): preserve chat message identity on refresh

* fix(android): shrink chat image attachments

* Browser: support non-Chrome existing-session profiles via userDataDir (openclaw#48170)

Merged via squash.

Prepared head SHA: e490035
Co-authored-by: velvet-shark <[email protected]>
Reviewed-by: @velvet-shark

* fix(local-storage): improve VITEST environment check for localStorage access

* fix: normalize discord commands allowFrom auth

* test: update discord subagent hook mocks

* test: mock telegram native command reply pipeline

* fix(android): lazy-init node runtime after onboarding

* docs(config): refresh generated baseline

* Plugins: share channel plugin id resolution

* Gateway: defer full channel plugins until after listen

* Gateway: gate deferred channel startup behind opt-in

* Docs: document deferred channel startup opt-in

* feat(skills): preserve all skills in prompt via compact fallback before dropping (openclaw#47553)

* feat(skills): add compact format fallback for skill catalog truncation

When the full-format skill catalog exceeds the character budget,
applySkillsPromptLimits now tries a compact format (name + location
only, no description) before binary-searching for the largest fitting
prefix. This preserves full model awareness of registered skills in
the common overflow case.

Three-tier strategy:
1. Full format fits → use as-is
2. Compact format fits → switch to compact, keep all skills
3. Compact still too large → binary search largest compact prefix

Other changes:
- escapeXml() utility for safe XML attribute values
- formatSkillsCompact() emits same XML structure minus <description>
- Compact char-budget check reserves 150 chars for the warning line
  the caller prepends, preventing prompt overflow at the boundary
- 13 tests covering all tiers, edge cases, and budget reservation
- docs/.generated/config-baseline.json: fix pre-existing oxfmt issue
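
The three-tier strategy above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from openclaw#47553: the `Skill` shape, the `<skills>`/`<skill>` element names, and the `renderSkills` and `limitSkillsPrompt` helpers are hypothetical. Only the tier order, the name-plus-location compact form, and the 150-character reserve come from the commit message.

```typescript
interface Skill {
  name: string;
  location: string;
  description: string;
}

interface LimitedPrompt {
  text: string;
  format: "full" | "compact";
  kept: number; // how many skills survived
}

// Characters reserved for the warning line the caller prepends (from the commit message).
const WARNING_RESERVE = 150;

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Assumed XML layout; the compact form omits <description>.
function renderSkills(skills: Skill[], compact: boolean): string {
  const items = skills.map((s) => {
    const head = `<skill name="${escapeXml(s.name)}" location="${escapeXml(s.location)}">`;
    return compact
      ? `${head}</skill>`
      : `${head}<description>${escapeXml(s.description)}</description></skill>`;
  });
  return `<skills>${items.join("")}</skills>`;
}

function limitSkillsPrompt(skills: Skill[], maxChars: number): LimitedPrompt {
  // Tier 1: the full format fits, use it as-is.
  const full = renderSkills(skills, false);
  if (full.length <= maxChars) return { text: full, format: "full", kept: skills.length };

  // Tier 2: the compact format fits once the warning line is accounted for.
  const budget = maxChars - WARNING_RESERVE;
  const compactAll = renderSkills(skills, true);
  if (compactAll.length <= budget) {
    return { text: compactAll, format: "compact", kept: skills.length };
  }

  // Tier 3: binary search for the largest compact prefix that fits.
  let lo = 0;
  let hi = skills.length;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (renderSkills(skills.slice(0, mid), true).length <= budget) lo = mid;
    else hi = mid - 1;
  }
  return { text: renderSkills(skills.slice(0, lo), true), format: "compact", kept: lo };
}
```

The binary search is valid because the rendered length never decreases as the prefix grows, so "fits in the budget" is monotonic in the prefix length.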

* docs: document compact skill prompt fallback

---------

Co-authored-by: Frank Yang <[email protected]>

* Gateway: simplify startup and stabilize mock responses tests

* test: fix stale web search and boot-md contracts

* Gateway tests: centralize mock responses provider setup

* Gateway tests: share ordered client teardown helper

* Infra: ignore ciao probing cancellations

* Docs: repair unreleased changelog attribution

* Docs: normalize unreleased changelog refs

* Channels: ignore enabled-only disabled plugin config

* perf: reduce status json startup memory

* perf: lazy-load status route startup helpers

* Plugins: stage local bundled runtime tree

* Build: share root dist chunks across tsdown entries

* fix(logging): make logger import browser-safe

* fix(changelog): add entry for Control UI logger import fix (openclaw#48469)

* fix(changelog): note Control UI logger import fix

* fix(changelog): attribute Control UI logger fix entry

* fix(changelog): credit original Control UI fix author

* Plugins: remove public extension-api surface (openclaw#48462)

* Plugins: remove public extension-api surface

* Plugins: fix loader setup routing follow-ups

* CI: ignore non-extension helper dirs in extension-fast

* Docs: note extension-api removal as breaking

* fix(ui): language dropdown selection not persisting after refresh (openclaw#48019)

Merged via squash.

Prepared head SHA: 06c8258
Co-authored-by: git-jxj <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Reviewed-by: @altaywtf

* fix(plugins): late-binding subagent runtime for non-gateway load paths (openclaw#46648)

Merged via squash.

Prepared head SHA: 4474265
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* fix: enable auto-scroll during assistant response streaming

Fix auto-scroll behavior when AI assistant streams responses in the web UI.
Previously, the viewport would remain at the sent message position and users
had to manually click a badge to see streaming responses.

Fixes openclaw#14959

Changes:
- Reset chat scroll state before sending message to ensure viewport readiness
- Force scroll to bottom after message send to position viewport correctly
- Detect streaming start (chatStream: null -> string) and trigger auto-scroll
- Ensure smooth scroll-following during entire streaming response
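
The streaming-start detection listed above (`chatStream` going from `null` to a string) can be sketched as a small pure function. The `ScrollAction` type and the `nextScrollAction` helper are hypothetical names used for illustration; the actual UI code may structure this differently.

```typescript
type ScrollAction = "force-bottom" | "follow" | "none";

// Hypothetical helper: given the previous and next chatStream values,
// decide how the viewport should react.
function nextScrollAction(prev: string | null, next: string | null): ScrollAction {
  // null -> string: streaming has just started, so jump to the bottom.
  if (prev === null && next !== null) return "force-bottom";
  // string -> different string: a new chunk arrived, keep following.
  if (prev !== null && next !== null && next !== prev) return "follow";
  // Unchanged, idle, or the stream ended: leave the viewport alone.
  return "none";
}
```

Keeping the decision separate from the DOM call makes the transition logic testable without a browser; the caller would map `"force-bottom"` and `"follow"` onto whatever scroll routine the chat view already uses.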

Co-Authored-By: Claude Opus 4.6 <[email protected]>

* fix(ui): align chatStream lifecycle type with nullable state

* fix(whatsapp): restore implicit reply mentions for LID identities (openclaw#48494)

Threads selfLid from the Baileys socket through the inbound WhatsApp
pipeline and adds LID-format matching to the implicit mention check
in group gating, so reply-to-bot detection works when WhatsApp sends
the quoted sender in @lid format.

Also fixes the device-suffix stripping regex (was a silent no-op).
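
A minimal sketch of the matching described above, assuming identities of the form `<user>:<device>@<server>` (for example `123:4@lid`). Only `selfLid` is named in the commit message; `stripDeviceSuffix`, `isReplyToSelf`, and the exact identity format are assumptions made for illustration.

```typescript
// Removes a ":<device>" suffix that sits directly before the "@",
// e.g. "123:4@lid" -> "123@lid". Identities without a suffix are unchanged.
function stripDeviceSuffix(jid: string): string {
  return jid.replace(/:\d+(?=@)/, "");
}

// Hypothetical implicit-mention check: a reply counts as addressed to the bot
// when the quoted sender matches either the bot's regular identity or its LID.
function isReplyToSelf(
  quotedSender: string | undefined,
  selfJid: string,
  selfLid?: string,
): boolean {
  if (!quotedSender) return false;
  const sender = stripDeviceSuffix(quotedSender);
  if (sender === stripDeviceSuffix(selfJid)) return true;
  return selfLid !== undefined && sender === stripDeviceSuffix(selfLid);
}
```

Normalizing both sides before comparing is what keeps a device suffix from defeating the match, which is the kind of failure a no-op stripping regex would cause.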

Closes openclaw#23029

Co-authored-by: sparkyrider <[email protected]>
Reviewed-by: @ademczuk

* fix(compaction): stabilize toolResult trim/prune flow in safeguard (openclaw#44133)

Merged via squash.

Prepared head SHA: ec789c6
Co-authored-by: SayrWolfridge <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* Fix launcher startup regressions (openclaw#48501)

* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses
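
The launcher hardening in this PR separates "the build output is missing" from "something the build output imports is missing". One way to make that distinction is sketched below. It assumes Node's ESM loader reports `ERR_MODULE_NOT_FOUND` with a message beginning `Cannot find module '<path>'` or `Cannot find package '<name>'`; the `classifyImportFailure` helper and the message parsing are illustrative and are not the code in `openclaw.mjs`.

```typescript
type ImportFailure = "entry-missing" | "transitive-missing" | "other";

// Hypothetical helper: classify an error thrown by `await import(entry)`.
function classifyImportFailure(err: unknown, entryPath: string): ImportFailure {
  if (typeof err !== "object" || err === null) return "other";
  const { code, message } = err as { code?: unknown; message?: unknown };
  if (code !== "ERR_MODULE_NOT_FOUND" || typeof message !== "string") return "other";

  // Assumption: the first quoted segment names the module that could not be found.
  const match = /Cannot find (?:module|package) '([^']+)'/.exec(message);
  const missing = match?.[1];
  if (missing === undefined) return "other";

  // Only report a missing build when the entry file itself is the missing module;
  // anything else is a transitive import failure that should be surfaced as-is.
  return missing.endsWith(entryPath) ? "entry-missing" : "transitive-missing";
}
```

With a classifier like this, the launcher would print the generic missing-build-output hint only for `"entry-missing"` and rethrow the original error in the other cases, so the real root cause stays visible.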

* fix: remove orphaned tool_result blocks during compaction (openclaw#15691) (openclaw#16095)

Merged via squash.

Prepared head SHA: b772432
Co-authored-by: claw-sylphx <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* docs: rename onboarding user-facing wizard copy

Co-authored-by: Tak <[email protected]>

* fix(plugins): keep built plugin loading on one module graph (openclaw#48595)

* Plugins: stabilize global catalog contracts

* Channels: add global threading and directory contracts

* Tests: improve extension runner discovery

* CI: run global contract lane

* Plugins: speed up auth-choice contracts

* Plugins: fix catalog contract mocks

* refactor: move provider catalogs into extensions

* refactor: route bundled channel setup helpers through private sdk bridges

* test: fix check contract type drift

* Tests: lock plugin slash commands to one runtime graph

* Tests: cover Discord provider plugin registry

* Tests: pin loader command activation semantics

* Tests: cover Telegram plugin auth on real registry

* refactor(slack): share setup helpers

* refactor(whatsapp): reuse shared normalize helpers

* Tlon: lazy-load channel runtime paths

* Tests: document Discord plugin auth gating

* feat(plugins): add speech provider registration

* docs(plugins): document capability ownership model

* fix: detect Ollama "prompt too long" as context overflow error (openclaw#34019)

Merged via squash.

Prepared head SHA: 825a402
Co-authored-by: lishuaigit <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* agent: preemptive context overflow detection during tool loops (openclaw#29371)

Merged via squash.

Prepared head SHA: 19661b8
Co-authored-by: keshav55 <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

---------

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: MoerAI <[email protected]>
Co-authored-by: Joey Krug <[email protected]>
Co-authored-by: bot_apk <[email protected]>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Tak Hoffman <[email protected]>
Co-authored-by: brokemac79 <[email protected]>
Co-authored-by: ObitaBot <[email protected]>
Co-authored-by: Prompt Driven <[email protected]>
Co-authored-by: Nimrod Gutman <[email protected]>
Co-authored-by: Gustavo Madeira Santana <[email protected]>
Co-authored-by: Myeongwon Choi <[email protected]>
Co-authored-by: ImLukeF <[email protected]>
Co-authored-by: 郑耀宏 <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: huntharo <[email protected]>
Co-authored-by: Yauheni Shauchenka <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Altay <[email protected]>
Co-authored-by: Radek Sienkiewicz <[email protected]>
Co-authored-by: velvet-shark <[email protected]>
Co-authored-by: Val Alexander <[email protected]>
Co-authored-by: Hung-Che Lo <[email protected]>
Co-authored-by: Frank Yang <[email protected]>
Co-authored-by: git-jxj <[email protected]>
Co-authored-by: altaywtf <[email protected]>
Co-authored-by: Josh Lehman <[email protected]>
Co-authored-by: jalehman <[email protected]>
Co-authored-by: Jaewon Hwang <[email protected]>
Co-authored-by: Claude Opus 4.6 <[email protected]>
Co-authored-by: sparkyrider <[email protected]>
Co-authored-by: Sayr Wolfridge <[email protected]>
Co-authored-by: SayrWolfridge <[email protected]>
Co-authored-by: Clayton Shaw <[email protected]>
Co-authored-by: claw-sylphx <[email protected]>
Co-authored-by: Tak <[email protected]>
Co-authored-by: lishuaigit <[email protected]>
Co-authored-by: Keshav Rao <[email protected]>
Co-authored-by: keshav55 <[email protected]>
hqwuzhaoyi pushed a commit to hqwuzhaoyi/openclaw that referenced this pull request on Mar 17, 2026:
* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses
nikolaisid pushed a commit to nikolaisid/openclaw that referenced this pull request on Mar 18, 2026:
* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request on Mar 25, 2026:
* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses

(cherry picked from commit 313e5bb)
alexey-pelykh added a commit to remoteclaw/remoteclaw that referenced this pull request on Mar 25, 2026:
* fix(ci): stop serializing push workflow runs

(cherry picked from commit 0a20c5c)

* test: harden path resolution test helpers

(cherry picked from commit 1ad47b8)

* Fix launcher startup regressions (openclaw#48501)

* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses

(cherry picked from commit 313e5bb)

* refactor(scripts): move container setup entrypoints

(cherry picked from commit 46ccbac)

* perf(ci): gate install smoke on changed-smoke (openclaw#52458)

(cherry picked from commit 4bd90f2)

* Docs: prototype generated plugin SDK reference (openclaw#51877)

* Chore: unblock synced main checks

* Docs: add plugin SDK docs implementation plan

* Docs: scaffold plugin SDK reference phase 1

* Docs: mark plugin SDK reference surfaces unstable

* Docs: prototype generated plugin SDK reference

* docs(plugin-sdk): replace generated reference with api baseline

* docs(plugin-sdk): drop generated reference plan

* docs(plugin-sdk): align api baseline flow with config docs

---------

Co-authored-by: Onur <[email protected]>
Co-authored-by: Vincent Koc <[email protected]>
(cherry picked from commit 4f1e12a)

* fix(ci): harden docker builds and unblock config docs

(cherry picked from commit 9f08af1)

* Docs: add config drift baseline statefile (openclaw#45891)

* Docs: add config drift statefile generator

* Docs: generate config drift baseline

* CI: move config docs drift runner into workflow sanity

* Docs: emit config drift baseline json

* Docs: commit config drift baseline json

* Docs: wire config baseline into release checks

* Config: fix baseline drift walker coverage

* Docs: regenerate config drift baselines

(cherry picked from commit cbec476)

* Release: add plugin npm publish workflow (openclaw#47678)

* Release: add plugin npm publish workflow

* Release: make plugin publish scope explicit

(cherry picked from commit d41c9ad)

* build: default to Node 24 and keep Node 22 compat

(cherry picked from commit deada7e)

* ci(android): use explicit flavor debug tasks

(cherry picked from commit 0c2e6fe)

* ci: harden pnpm sticky cache on PRs

(cherry picked from commit 29b36f8)

* CI: add built plugin singleton smoke (openclaw#48710)

(cherry picked from commit 5a2a4ab)

* chore: add code owners for npm release paths

(cherry picked from commit 5c9fae5)

* test: add extension plugin sdk boundary guards

(cherry picked from commit 77fb258)

* ci: tighten cache docs and node22 gate

(cherry picked from commit 797b6fe)

* ci: add npm release workflow and CalVer checks (openclaw#42414) (thanks @onutc)

(cherry picked from commit 8ba1b6e)

* CI: add CLI startup memory regression check

(cherry picked from commit c0e0115)

* Add bad-barnacle label to prevent barnacle closures. (openclaw#51945)

(cherry picked from commit c449a0a)

* ci: speed up scoped workflow lanes

(cherry picked from commit d17490f)

* ci: restore PR pnpm cache fallback

(cherry picked from commit e1d0545)

* CI: guard gateway watch against duplicate runtime regressions (openclaw#49048)

(cherry picked from commit f036ed2)

* fix: correct domain reference in docker setup script

* fix: adapt cherry-picks for fork TS strictness

* fix: adapt cherry-picked tests for fork structure

- Dockerfile test: OPENCLAW_ → REMOTECLAW_ ARG names
- ci-changed-scope test: add missing runChangedSmoke field
- doc-baseline test: rename to e2e (needs dist/ build artifacts)
- extension boundary test: update baselines and expectations for fork

* fix: adjust ci-changed-scope test for fork's narrower skills regex

---------

Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Tak Hoffman <[email protected]>
Co-authored-by: Bob <[email protected]>
Co-authored-by: Onur <[email protected]>
Co-authored-by: Altay <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Onur Solmaz <[email protected]>
Co-authored-by: Harold Hunt <[email protected]>
sbezludny pushed a commit to sbezludny/openclaw that referenced this pull request on Mar 27, 2026:
* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses

Labels

maintainer (Maintainer-authored PR), scripts (Repository scripts), size: M
