Summary
Implement lane backpressure management for multi-agent webchat sessions to prevent UI freezes during heavy agent-to-agent exchanges. Currently, when nested agents exchange large payloads (18k+ tokens), the session lane queue blocks new messages for 2-4+ minutes with no user feedback, causing browser "wait or leave page" dialogs.
Related bug report: #11284
Problem
In multi-agent setups (e.g., a coordinator agent + a Claude Code subagent, both on Opus 4.5), heavy exchanges create a cascade:
- Agent A sends a large request (~18k tokens input)
- Agent B generates a large response (~6k tokens output, streamed over 30-60+ seconds)
- During streaming, any new messages queue in the session lane (queueAhead=1)
- The queue wait exceeds 4 minutes (waitedMs=248826)
- The webchat UI freezes — no spinner, no progress indicator, no "agents are collaborating" message
- Browser shows "wait or leave page" dialog
- User has no way to know if the system is working or stuck
This is distinct from #7725 (Ollama timeout causing zombie state) — here the system is functioning correctly but the UX is broken.
Proposed solution
1. Webchat UI: streaming activity indicator during lane wait
When queueAhead > 0 and a run is active on the lane, the webchat UI should show a live indicator:
🔄 Agents collaborating... (2 of 3 messages processing)
CC is generating a response (~45s elapsed)
[Cancel] [View partial output]
Implementation hint: The gateway already emits lane wait exceeded diagnostics. Expose this as a WebSocket event (e.g., lane.status) that the Control UI subscribes to. The UI can then show a non-blocking overlay instead of letting the browser think the page is hung.
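A minimal sketch of what such a lane.status payload and the UI-side formatting could look like. Only queueAhead and waitedMs come from the gateway diagnostics quoted above; the interface name, the other fields, and both helper functions are hypothetical and would need to match whatever the gateway actually emits.

```typescript
// Hypothetical shape for a `lane.status` WebSocket event.
// Only queueAhead and waitedMs are taken from the existing diagnostics;
// every other field name here is an assumption.
interface LaneStatusEvent {
  type: "lane.status";
  sessionId: string;
  queueAhead: number;   // messages waiting behind the active run
  waitedMs: number;     // how long the queued message has waited so far
  activeAgent?: string; // label of the agent currently generating, if known
}

// Show the indicator only when something is queued and a run is active.
function shouldShowIndicator(ev: LaneStatusEvent, runActive: boolean): boolean {
  return runActive && ev.queueAhead > 0;
}

// Build the text for a non-blocking overlay from one status event.
function formatIndicator(ev: LaneStatusEvent): string {
  const seconds = Math.round(ev.waitedMs / 1000);
  const who = ev.activeAgent ?? "An agent";
  return `Agents collaborating... ${who} is generating a response (~${seconds}s elapsed, ${ev.queueAhead} queued)`;
}
```

The UI would call shouldShowIndicator on every event and hide the overlay as soon as it returns false, so the indicator never outlives the run it describes.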
2. Lane: configurable per-session concurrency for nested agents
Currently, nested agent sessions share a single lane, serializing all messages. For multi-agent setups, allow configurable concurrency:
{
  "agents": {
    "defaults": {
      "lane": {
        "maxConcurrent": 1,
        "warnThresholdMs": 30000,
        "nestedAgentParallel": false
      }
    }
  }
}
warnThresholdMs: Emit a user-visible warning when lane wait exceeds this (default: 30s)
nestedAgentParallel: When true, allow nested agent runs to execute in parallel rather than serialized (experimental, opt-in)
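As an illustration of how these settings might be consumed, the sketch below merges per-session overrides onto the proposed defaults and checks a wait time against warnThresholdMs. The key names and default values mirror the config above; LaneConfig, resolveLaneConfig, and shouldWarn are hypothetical names, not existing gateway code.

```typescript
// Mirrors the proposed "lane" config block; the type name is an assumption.
interface LaneConfig {
  maxConcurrent: number;
  warnThresholdMs: number;
  nestedAgentParallel: boolean;
}

// Defaults taken from the proposal: serialized lane, 30s warning, parallelism off.
const DEFAULT_LANE_CONFIG: LaneConfig = {
  maxConcurrent: 1,
  warnThresholdMs: 30_000,
  nestedAgentParallel: false,
};

// Merge optional overrides onto the defaults (hypothetical helper).
function resolveLaneConfig(overrides: Partial<LaneConfig> = {}): LaneConfig {
  return { ...DEFAULT_LANE_CONFIG, ...overrides };
}

// True once a lane wait has exceeded the configured warning threshold.
function shouldWarn(waitedMs: number, config: LaneConfig): boolean {
  return waitedMs > config.warnThresholdMs;
}
```

With the defaults, the observed wait of waitedMs=248826 would have triggered the warning roughly three and a half minutes before the message was finally dequeued.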
3. Tool call validation: fail-fast on missing required parameters
The read tool requires a path parameter. When an agent issues read with argsType=object but no path, the gateway should:
- Return a tool_result error to the calling agent (not just log a warning)
- Include a helpful message: "Error: read tool requires 'path' parameter. Received: {}"
- Allow the agent to retry rather than silently continuing without the data
This is a separate but related issue — malformed tool calls during heavy exchanges suggest the model may be under context pressure and needs the error signal to self-correct.
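The fail-fast check could be sketched roughly as follows. The error text follows the message proposed above; the ToolResult shape, the REQUIRED_PARAMS registry, and validateToolCall are hypothetical and stand in for whatever structures the gateway uses for tool dispatch.

```typescript
// Hypothetical result type; the real tool_result structure may differ.
type ToolResult =
  | { ok: true }
  | { ok: false; isError: true; content: string };

// Hypothetical registry of required parameters per tool.
// Only the `read` -> `path` requirement comes from the proposal.
const REQUIRED_PARAMS: Record<string, string[]> = { read: ["path"] };

// Validate a tool call before dispatch and return an error result
// (instead of only logging) when a required parameter is missing.
function validateToolCall(tool: string, args: Record<string, unknown>): ToolResult {
  const required = REQUIRED_PARAMS[tool] ?? [];
  const missing = required.filter((name) => {
    const value = args[name];
    return value === undefined || value === null || value === "";
  });
  if (missing.length === 0) return { ok: true };

  const names = missing.map((n) => `'${n}'`).join(", ");
  const noun = missing.length === 1 ? "parameter" : "parameters";
  return {
    ok: false,
    isError: true,
    content: `Error: ${tool} tool requires ${names} ${noun}. Received: ${JSON.stringify(args)}`,
  };
}
```

Returning the failure as a result rather than throwing keeps it in the normal tool-result path, so the calling agent sees the message and can retry with a corrected call.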
4. WebSocket keepalive during long runs
The webchat WebSocket connection should send periodic heartbeat frames during long-running lane tasks to prevent the browser from treating the connection as stale:
[Every 15s during active run]
→ ws ping (or custom "heartbeat" event with elapsed time)
← ws pong
This prevents browser-level "page unresponsive" dialogs that are triggered by lack of WebSocket activity.
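A rough sketch of the custom "heartbeat" event variant, assuming the gateway has some send function for the session's socket. The event shape, buildHeartbeat, and startHeartbeat are hypothetical names. One reason to prefer an application-level event over a raw ws ping is that the browser WebSocket API does not expose protocol-level ping/pong frames to page scripts, so only an ordinary message can drive an elapsed-time display in the UI.

```typescript
// Hypothetical application-level heartbeat message.
interface HeartbeatEvent {
  type: "heartbeat";
  elapsedMs: number; // time since the run started
}

// Pure helper so the payload can be tested without timers.
function buildHeartbeat(startedAtMs: number, nowMs: number): HeartbeatEvent {
  return { type: "heartbeat", elapsedMs: Math.max(0, nowMs - startedAtMs) };
}

// Start emitting heartbeats through `send` every `intervalMs`
// (15s by default, as proposed). Returns a function that stops the
// heartbeat; call it when the run finishes or the socket closes.
function startHeartbeat(
  send: (payload: string) => void,
  intervalMs = 15_000,
  now: () => number = Date.now,
): () => void {
  const startedAt = now();
  const timer = setInterval(() => {
    send(JSON.stringify(buildHeartbeat(startedAt, now())));
  }, intervalMs);
  return () => clearInterval(timer);
}
```

The stop function should be called from the same place that marks the lane run as complete, otherwise the interval keeps firing after the run has ended.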
Alternatives considered
| Approach | Pros | Cons |
| --- | --- | --- |
| Do nothing | Zero effort | Multi-agent webchat is broken for heavy workloads |
| UI-only fix (option 1+4) | Low effort, high UX impact | Doesn't solve underlying serialization |
| Full lane refactor (option 2) | Solves root cause | Complex, risk of race conditions |
| This proposal (1+3+4) | Moderate effort, addresses both UX and data integrity | Option 2 deferred to later |
Recommended implementation order
- Tool call validation (option 3) — smallest change, highest correctness impact
- WebSocket keepalive (option 4) — small change, prevents browser dialogs
- UI activity indicator (option 1) — medium change, biggest UX improvement
- Lane concurrency (option 2) — future enhancement, needs design discussion
Environment where observed
- OpenClaw 2026.2.3-1, Windows 11, Node.js v24.13.0
- Two nested agents on Opus 4.5 via Claude Max
- Webchat in Brave browser
- Lane wait times: 123-248 seconds
- Gateway RAM: ~470 MB (single node process running 6+ hours)