fix(process): use globalThis singleton for command-queue state to survive code-splitting#41903
fix(process): use globalThis singleton for command-queue state to survive code-splitting#41903best wants to merge 1 commit intoopenclaw:mainfrom
Conversation
…vive code-splitting
command-queue.ts stores shared runtime state (lanes, gatewayDraining,
nextTaskId) as module-scoped variables. tsdown code-splitting duplicates
this module into 10 independent chunks, each with its own isolated copy.
This causes:
- setCommandLaneConcurrency() only updates one chunk's lanes Map
- agents.defaults.maxConcurrent config is completely ignored
- All requests serialize through maxConcurrent=1 regardless of config
- resetAllLanes() (SIGUSR1) only resets one chunk's lanes
- markGatewayDraining() is invisible to pi-embedded chunks
Replace module-scoped state with a globalThis singleton using
Symbol.for('openclaw.commandQueueState'), following the same pattern
already used in internal-hooks.ts and context-engine/registry.ts.
Greptile SummaryThis PR correctly fixes a real bug where tsdown code-splitting duplicated The fix replaces three module-scoped variables ( Technical correctness verified:
The change is minimal, targeted, and follows an established codebase pattern. Error classes remain module-scoped (separate pre-existing concern noted in #41901). Confidence Score: 5/5
Last reviewed commit: cbff929 |
|
@codex review |
|
Codex Review: Didn't find any major issues. Chef's kiss. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@vincentkoc This PR is ready for review. CI green (30/30), Greptile 5/5, Codex passed with no issues. Fixes a state isolation bug in |
|
Linking #43683 here because it carries the same The broader audit found additional mutable state duplicated into separate bundled chunks beyond
So this is not a competing root cause. It is the same bundled-runtime singleton split, fixed across the rest of the live state surfaces that were still diverging in production builds. Tracking the broader pass in #43683. |
|
Thanks for the comprehensive follow-up in #43683 — that's exactly the broader audit we were planning to do next. Happy to see the fix expanded to cover the followup queue, dedupe cache, and channel-specific state as well. Closing this PR in favor of #43683 since it covers the same |
Summary
command-queue.tsstores shared runtime state (lanes,gatewayDraining,nextTaskId) as module-scoped variables. tsdown code-splitting duplicates this module into 10 independent chunks, each with isolated state.setCommandLaneConcurrency()only updates one chunk; actual message processing uses different chunks stuck atmaxConcurrent=1.agents.defaults.maxConcurrentconfig is completely ignored for all users. All requests serialize regardless of configuration. Additionally,resetAllLanes()andmarkGatewayDraining()only affect one chunk.globalThissingleton viaSymbol.for('openclaw.commandQueueState'), following the same pattern used ininternal-hooks.tsandcontext-engine/registry.ts. Updated 14 call sites.lanes.tslane-name helpers. Error classes remain module-scoped (separate concern).Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
agents.defaults.maxConcurrentnow actually takes effect across all chunks (was silently ignored)resetAllLanes()on SIGUSR1 now correctly resets lanes in all chunks (was only resetting one)markGatewayDraining()now visible to pi-embedded chunks during shutdown (was invisible)Security Impact (required)
Repro + Verification
Environment
agents.defaults.maxConcurrent: 10Steps
agents.defaults.maxConcurrent: 10Expected
Actual
maxConcurrent=1Evidence
All 16 existing
command-queue.test.tstests pass unchanged:dist/ state duplication confirmed:
$ rg -l 'let nextTaskId = 1;' dist/ dist/compact-BIp3lbnh.js dist/pi-embedded-CJVNBk0y.js dist/pi-embedded-DePf5reQ.js dist/plugin-sdk/dispatch-C5fHAX5a.js dist/plugin-sdk/dispatch-D36HVShU.js dist/plugin-sdk/dispatch-DHll-mp1.js dist/plugin-sdk/dispatch-DJpqKByO.js dist/plugin-sdk/dispatch-vBEK_Alg.js dist/plugin-sdk/reply-DYZ7va08.js dist/reply-BZQv8rey.jsHuman Verification (required)
resetAllLanes()still correctly clears globalThis state;markGatewayDraining()propagates across chunks; primitive values (gatewayDraining,nextTaskId) accessed through state object (not destructured) to ensure mutations propagatepnpm build && pnpm check(shallow clone, no full dev dependencies); cross-chunkinstanceoffor error classes (separate concern, noted in issue)Review Conversations
Compatibility / Migration
Failure Recovery (if this breaks)
src/process/command-queue.tsSymbol.forkey collides with external code (extremely unlikely given namespace), queue state could leakRisks and Mitigations
CommandLaneClearedError,GatewayDrainingError) are still per-chunk; cross-chunkinstanceofchecks would failAI-Assisted Contribution
globalThissingleton used ininternal-hooks.tsandcontext-engine/registry.ts) tocommand-queue.tswhich was missed