fix(feishu): recover WebSocket after SDK retry exhaustion by vincentkoc · Pull Request #73739 · openclaw/openclaw

vincentkoc · 2026-04-28T18:36:29Z

Summary

Add a narrow Feishu WebSocket supervisor/recovery path for #52618 so transient network disruption or SDK retry exhaustion does not leave the gateway with a dead WSClient until manual restart.

This follows current main at e2295b3 and should preserve the existing reconnect/heartbeat behavior from #72411 while adding the missing OpenClaw-owned recovery loop for the #52618 failure mode.

Scope

Recreate Feishu WSClient after confirmed SDK retry exhaustion or terminal start failure.
Use bounded exponential backoff with abortable sleep.
Preserve bot identity and per-account maps across supervisor cycles, clearing them only on true monitor shutdown.
Keep shutdown cleanup race-safe and avoid reopening sockets after abort.
Add focused regression coverage for retry exhaustion, abort during backoff, and cleanup state.
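
The backoff item in the scope list above can be pictured with a small TypeScript sketch. It is not the PR's code: `backoffDelayMs` and `abortableSleep` are hypothetical names, and the 1 s floor and 30 s ceiling are taken from the Greptile summary of this PR rather than from the diff itself.

```typescript
// Hypothetical helpers for illustration; the PR's real helpers may differ.
const BASE_DELAY_MS = 1_000; // assumed floor (1 s)
const MAX_DELAY_MS = 30_000; // assumed ceiling (30 s)

/** Delay for a zero-based attempt: base * 2^attempt, capped at the maximum. */
function backoffDelayMs(
  attempt: number,
  baseMs: number = BASE_DELAY_MS,
  maxMs: number = MAX_DELAY_MS,
): number {
  const exponent = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * 2 ** exponent);
}

/** Resolves true after `ms`, or false as soon as the signal aborts. Never rejects. */
function abortableSleep(ms: number, signal: AbortSignal): Promise<boolean> {
  if (signal.aborted) return Promise.resolve(false);
  return new Promise<boolean>((resolve) => {
    const onAbort = () => {
      clearTimeout(timer); // stop the pending timer so nothing fires after abort
      resolve(false);
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve(true);
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}
```

Returning a boolean instead of rejecting lets the caller treat "aborted during backoff" as a normal exit and skip opening a new socket.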

Validation

pnpm test:serial extensions/feishu/src/async.test.ts extensions/feishu/src/monitor.cleanup.test.ts extensions/feishu/src/client.test.ts
pnpm check:changed

Fixes #52618.
Related: #59753, #55532, #68766, #72411.

ProjectClownfish replacement details:

Cluster: ghcrawl-156682-autonomous-smoke
Source PRs: none
Credit: Preserve public credit for the merged maintainer repair fix(feishu): repair WebSocket reconnect and heartbeat config #72411 and its credited source PRs where applicable. Do not reuse or summarize security-routed PR fix(feishu): supervisor loop to recover from WSClient retry exhaustion #57978 in a public fix PR until central security handling clears that exact item. Mention fix(feishu): WebSocket connection not recovered after network disruption #52618 as the canonical non-security supervisor/retry-exhaustion issue and [Bug]: Feishu infinite reconnect: Feishu WebSocket reconnect loop never stops - gateway becomes unresponsive without manual restart #59753 as closed duplicate evidence.
Validation: pnpm test:serial extensions/feishu/src/async.test.ts extensions/feishu/src/monitor.cleanup.test.ts extensions/feishu/src/client.test.ts; pnpm check:changed

clawsweeper · 2026-04-28T18:39:45Z

Codex review: passed for ClawSweeper automerge.

What this changes:

The PR adds Feishu WSClient lifecycle callback plumbing, terminal retry-exhaustion detection, abortable backoff client recreation, identity cleanup handling, focused regression tests, and a changelog entry.

Automerge follow-up:

Automerge review is clean, but the PR is maintainer/member-authored and GitHub reports dirty mergeability; the remaining action is branch repair or maintainer merge handling plus exact-head validation, not a new source fix PR from this review.

Security review:

Security review cleared: The diff only changes Feishu runtime recovery logic, focused tests, and changelog text; it adds no dependency, CI, install, publish, permission, or secret-handling surface.

Review details

Best possible solution:

Resolve the dirty merge state on this branch or a credited replacement, rerun exact-head Feishu tests plus the changed gate, then land this Feishu-owned recovery as the canonical fix for SDK retry exhaustion; keep broader silent-stall/watchdog behavior as a separate follow-up if maintainers still want it.

Do we have a high-confidence way to reproduce the issue?

Yes. The issue is reproducible by inspection on current main: after a successful wsClient.start, monitorWebSocket only waits for abort. The PR tests simulate the SDK terminal onError callback and assert that the client is recreated after backoff.

Is this the best way to solve the issue?

Yes. The proposed fix is the narrow maintainable path for the reported SDK retry-exhaustion mode: it uses the SDK constructor callback contract, filters non-terminal callback errors, preserves Feishu heartbeat defaults, and keeps the behavior inside the Feishu plugin rather than adding a core channel special case.
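
One way to picture "filters non-terminal callback errors" is the TypeScript sketch below. It is an assumption, not the PR's classifier: `isTerminalWsError` and `terminalOnly` are hypothetical names, and matching on the "reconnect exhausted" wording relies only on the SDK message quoted later in this review. A real classifier would also need to cover the "initial connect cannot retry" case.

```typescript
// Assumption: a terminal SDK error carries "reconnect exhausted" in its message.
// Everything else is treated as recoverable and left to the SDK's own retries.
const TERMINAL_PATTERN = /reconnect exhausted/i;

function isTerminalWsError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return TERMINAL_PATTERN.test(message);
}

/** Wraps a terminal-error handler so that recoverable errors never reach it. */
function terminalOnly(report: (err: unknown) => void): (err: unknown) => void {
  return (err) => {
    if (isTerminalWsError(err)) report(err);
  };
}
```

With a wrapper like this, only the filtered callback would be handed to the client factory, so a single failed reconnect attempt would not tear down a live client.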

Acceptance criteria:

pnpm test:serial extensions/feishu/src/async.test.ts extensions/feishu/src/monitor.cleanup.test.ts extensions/feishu/src/client.test.ts
pnpm check:changed

What I checked:

Current main lacks post-start SDK error recovery: On current main, monitorWebSocket creates one WSClient, calls start, resets the startup attempt counter, and then waits only for abort; there is no SDK onError callback or post-start client recreation path. (extensions/feishu/src/monitor.transport.ts:181, 12882a88b137)
PR adds terminal callback recovery: At PR head 1faf660c3482, createFeishuWSClient(account, { onError }) is wired through monitorWebSocket; terminal errors are classified, the dead client is closed without clearing identity, and a new client is created after bounded abortable backoff. (extensions/feishu/src/monitor.transport.ts:213, 1faf660c3482)
Dependency contract supports the callback seam: The published @larksuiteoapi/node-sdk types document onError as firing when the initial connect cannot retry or reconnectCount is exhausted, and the published JS invokes it with "WebSocket reconnect exhausted after ... attempts".
Regression tests cover the intended recovery path: The PR tests client recreation after SDK reconnect exhaustion, recoverable callback errors staying on the current client, and abort during reconnect backoff without opening a new socket. (extensions/feishu/src/monitor.cleanup.test.ts:139, 1faf660c3482)
Mergeability is still blocked: The public PR API reports head 1faf660c34824e84e84b941dd747da449d23fcf2, mergeable: false, rebaseable: false, and mergeable_state: dirty. (1faf660c3482)
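
The recovery path described in the checks above can be sketched as a small supervisor loop in TypeScript. This is a simplified illustration under stated assumptions, not the code in monitor.transport.ts: `WsClientLike`, `ClientFactory`, and `superviseWs` are hypothetical names, the sleep function is injected, and the 1 s to 30 s bounds come from the Greptile summary.

```typescript
// Hypothetical shapes; the real WSClient and createFeishuWSClient signatures may differ.
interface WsClientLike {
  start(): Promise<void>;
  close(): void;
}
type ClientFactory = (opts: { onError: (err: Error) => void }) => WsClientLike;

/** Recreates the client after each terminal error until abort; returns clients created. */
async function superviseWs(
  createClient: ClientFactory,
  signal: AbortSignal,
  sleep: (ms: number, signal: AbortSignal) => Promise<boolean>,
  isTerminal: (err: Error) => boolean = () => true,
): Promise<number> {
  let attempt = 0;
  let created = 0;
  while (!signal.aborted) {
    let reportTerminal: (err: Error) => void = () => {};
    const terminal = new Promise<void>((resolve) => {
      reportTerminal = (err) => {
        if (isTerminal(err)) resolve(); // ignore recoverable callback errors
      };
    });
    let onAbort: () => void = () => {};
    const aborted = new Promise<void>((resolve) => {
      onAbort = () => resolve();
      signal.addEventListener("abort", onAbort, { once: true });
    });
    const client = createClient({ onError: (err) => reportTerminal(err) });
    created += 1;
    try {
      await client.start();
      attempt = 0; // a successful start resets the backoff
      await Promise.race([terminal, aborted]);
    } catch {
      // Terminal start failure: fall through to close and back off.
    } finally {
      signal.removeEventListener("abort", onAbort);
      client.close(); // close the dead client; identity state is not touched here
    }
    if (signal.aborted) break;
    const delay = Math.min(30_000, 1_000 * 2 ** attempt);
    attempt += 1;
    if (!(await sleep(delay, signal))) break; // aborted during backoff: open no new socket
  }
  return created;
}
```

Injecting `sleep` keeps the loop testable without real timers, which is roughly the shape a test for "abort during reconnect backoff" needs.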

Likely related people:

steipete: History and prior ClawSweeper context point to Peter as a dominant recent maintainer around the Feishu monitor/client files and release/test maintenance for this area. (role: recent maintainer and adjacent owner; confidence: high; commits: f46bd2e0cc17, 8e5c2efb8d85, a448042c2edd; files: extensions/feishu/src/monitor.transport.ts, extensions/feishu/src/client.ts, extensions/feishu/src/monitor.cleanup.test.ts)
vincentkoc: Vincent authored the adjacent merged Feishu WebSocket startup/backoff and heartbeat repair in fix(feishu): repair WebSocket reconnect and heartbeat config #72411, which this PR explicitly builds on; local history verifies commit d93e6f61580a touched the same monitor, client, and cleanup-test files. (role: recent Feishu WebSocket maintainer; confidence: high; commits: 217a8fbd4dd6, d93e6f61580a; files: extensions/feishu/src/monitor.transport.ts, extensions/feishu/src/client.ts, extensions/feishu/src/monitor.cleanup.test.ts)
Yifeng Wang: Prior feature-history context identifies Yifeng as connected to the bundled Feishu plugin replacement and multi-account baseline that the current monitor/client path evolved from. (role: introduced Feishu plugin baseline; confidence: medium; commits: 2267d58afcc7, 5f6e1c19bd18; files: extensions/feishu)

Remaining risk / open question:

GitHub reports the PR head as dirty/unrebaseable, so the reviewed code cannot be merged until the branch is repaired or replaced.
This read-only review did not run the targeted Vitest command or changed gate; exact-head validation should be rerun after the mergeability repair.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 12882a88b137.

greptile-apps · 2026-04-28T18:41:17Z

Greptile Summary

This PR adds an application-owned supervisor loop to monitorWebSocket that detects Feishu SDK retry exhaustion via the onError lifecycle callback and recreates the WSClient with bounded exponential backoff (1 s–30 s). Bot identity maps (botOpenIds, botNames) are preserved across recovery cycles and cleared only on true monitor shutdown, and a new clearIdentity flag on cleanupFeishuWsClient encodes that distinction explicitly.
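
The distinction between a recovery cycle and a true shutdown can be illustrated with a short TypeScript sketch. The map names follow the summary above, but the module layout, the `wsClients` map, and the `cleanupWsClient` function are hypothetical stand-ins and not the signature of cleanupFeishuWsClient.

```typescript
// Hypothetical per-account state; the real module may organize this differently.
const wsClients = new Map<string, { close(): void }>();
const botOpenIds = new Map<string, string>();
const botNames = new Map<string, string>();

/** Closes and forgets the account's client; drops identity only when asked to. */
function cleanupWsClient(accountId: string, opts: { clearIdentity: boolean }): void {
  const client = wsClients.get(accountId);
  wsClients.delete(accountId);
  try {
    client?.close();
  } catch {
    // Best-effort close: a dead socket may throw, which should not block cleanup.
  }
  if (opts.clearIdentity) {
    // Only a true monitor shutdown forgets who the bot is.
    botOpenIds.delete(accountId);
    botNames.delete(accountId);
  }
}
```

In this sketch a supervisor cycle would call `cleanupWsClient(id, { clearIdentity: false })`, while monitor shutdown would pass `clearIdentity: true`.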

Confidence Score: 4/5

Safe to merge; only a documentation/implicit-contract P2 concern was found.

No P0 or P1 issues. The supervisor loop logic, race-safe settled flag, abort cleanup, and backoff math are all correct. The single P2 concern is that onError is used as an unconditional terminal signal without a comment documenting the SDK contract this relies on.
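
As a rough illustration of what a race-safe settled flag buys, the TypeScript sketch below lets whichever of "terminal error" or "abort" arrives first win, and turns any later signal into a no-op. `createOnceSignal` is a hypothetical helper, not a function from the PR.

```typescript
/** A promise plus a settle function that only takes effect the first time it is called. */
function createOnceSignal<T>(): { promise: Promise<T>; settle: (value: T) => boolean } {
  let settled = false;
  let resolveFn: (value: T) => void = () => {};
  const promise = new Promise<T>((resolve) => {
    resolveFn = resolve;
  });
  return {
    promise,
    settle(value: T): boolean {
      if (settled) return false; // a later abort or error cannot re-run side effects
      settled = true;
      resolveFn(value);
      return true;
    },
  };
}
```

A promise already ignores repeated resolution, so the flag matters mainly because the boolean return lets the caller run cleanup exactly once, on the call that actually settled.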

extensions/feishu/src/monitor.transport.ts — the implicit onError-is-terminal assumption near line 203.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: extensions/feishu/src/monitor.transport.ts
Line: 203-204

Comment:
**Silent assumption that `onError` is terminal-only**

`reportTerminalError` is passed as `onError`, treating every SDK error callback as a signal to destroy and recreate the WSClient. If the Feishu SDK fires `onError` for intermediate, recoverable errors (e.g., a single failed reconnect attempt within its own retry window), this will prematurely close a live connection and restart the full client — potentially interfering with the SDK's own built-in reconnect logic.

The separation of `onReconnecting`/`onReconnected` from `onError` in the SDK's constructor options suggests `onError` is indeed reserved for terminal failure, but this assumption is not documented. A brief comment explaining this contract would protect against regressions if the SDK's error semantics ever change or if a future maintainer adds filtering.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(feishu): recover WebSocket after SDK..."

openclaw-clownfish · 2026-04-29T01:26:02Z

ProjectClownfish pushed a narrow repair to this branch so the original contributor path can stay canonical.

Source PR: #73739
Validation: pnpm test:serial extensions/feishu/src/monitor.cleanup.test.ts extensions/feishu/src/client.test.ts; pnpm check:changed
Contributor credit is preserved in the branch history and PR context.

vincentkoc · 2026-05-01T12:03:33Z

/clawsweeper automerge

clawsweeper · 2026-05-01T12:04:23Z

🦞🦞
ClawSweeper saw the passing review, but did not merge yet.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=1faf660c34824e84e84b941dd747da449d23fcf2)
Merge status: mergeable state is CONFLICTING

I left the PR open for the remaining gate instead of bypassing it.

Automerge progress:

2026-05-01 12:40:42 UTC review queued `1faf660c3482` (queued)
2026-05-01 12:44:35 UTC review passed `1faf660c3482` (structured ClawSweeper verdict: pass (sha=1faf660c34824e84e84b941dd747da449d23f...)
2026-05-01 12:08:26 UTC repair queued `1faf660c3482` (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25213737576
2026-05-01 12:37:56 UTC repair queued `1faf660c3482` (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25214525112
2026-05-01 12:40:51 UTC repair queued `1faf660c3482` (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/25214609454
2026-05-01 12:45:36 UTC repair completed (no branch change) in 4m 53s Run: https://github.com/openclaw/clawsweeper/actions/runs/25214525112 validation command failed (pnpm check:changed): [check:changed] lanes=extensions, extensionTests, docs [check:changed] extensions/feishu/src/client.test.ts: ex...

@jalehman

fix(feishu): recover WebSocket after SDK retry exhaustion (openclaw#73739)
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>

…3739) * fix(feishu): recover WebSocket after SDK retry exhaustion * fix(feishu): recover WebSocket after SDK retry exhaustion --------- Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>

vincentkoc added the clownfish:human-review and clawsweeper (Tracked by ClawSweeper automation) labels Apr 28, 2026

openclaw-barnacle Bot added the channel: feishu (Channel integration: feishu), size: M, and maintainer (Maintainer-authored PR) labels Apr 28, 2026

greptile-apps Bot reviewed Apr 28, 2026

Comment thread extensions/feishu/src/monitor.transport.ts Outdated

vincentkoc removed clownfish:merge-ready labels Apr 28, 2026

clawsweeper Bot mentioned this pull request Apr 28, 2026

fix(feishu): WebSocket connection not recovered after network disruption #52618

Closed

openclaw-clownfish Bot force-pushed the clownfish/ghcrawl-156682-autonomous-smoke branch from 5dd7729 to 1faf660 on April 29, 2026 01:26

clawsweeper Bot added the clawsweeper:automerge (Maintainer opted this PR into bounded ClawSweeper-reviewed automerge) label May 1, 2026

vincentkoc and others added 2 commits May 1, 2026 06:25

fix(feishu): recover WebSocket after SDK retry exhaustion

d503574

fix(feishu): recover WebSocket after SDK retry exhaustion

3d7dc23

vincentkoc force-pushed the clownfish/ghcrawl-156682-autonomous-smoke branch from 1faf660 to 3d7dc23 on May 1, 2026 13:27

vincentkoc self-assigned this May 1, 2026

vincentkoc merged commit f9b47ad into main May 1, 2026
79 of 86 checks passed

vincentkoc deleted the clownfish/ghcrawl-156682-autonomous-smoke branch May 1, 2026 13:27

github-actions Bot mentioned this pull request May 1, 2026

📡 Upstream Digest — 2026-05-01 14:47 UTC curtismercier/openclaw-mods#736

Open

clawsweeper Bot mentioned this pull request May 2, 2026

fix(feishu): reconcile WebSocket reconnect backoff #73998

Open

vincentkoc merged 2 commits into main from clownfish/ghcrawl-156682-autonomous-smoke

1 participant

vincentkoc commented Apr 28, 2026

clawsweeper Bot commented Apr 28, 2026 • edited

greptile-apps Bot commented Apr 28, 2026

Greptile Summary

Confidence Score: 4/5

openclaw-clownfish Bot commented Apr 29, 2026

vincentkoc commented May 1, 2026

clawsweeper Bot commented May 1, 2026 • edited