[Bug]: nested lane hardcoded to maxConcurrent: 1 — sessions_send broadcasts cause cascading timeouts #14214
Description
Summary
In multi-agent setups (10 agents), sessions_send calls all route through the nested command lane, which has a hardcoded maxConcurrent: 1. When an agent broadcasts to N other agents via sequential sessions_send calls, the queue grows linearly and each call blocks for up to 30 seconds waiting for its turn. Agents later in the queue reliably hit the 30-second timeout.
For example, with 10 agents, a broadcast to 9 agents takes up to 270 seconds wall-clock (9 × 30s serial), and agents at positions 3+ in the queue always see Session Send: agent:<id>:main failed: timeout.
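The arithmetic above can be checked with a toy model. The sketch below is not OpenClaw code: `simulateSequentialBroadcast` and its parameters are hypothetical, and it assumes that each nested run holds a lane slot for a fixed time, that the caller gives up after the timeout while the run keeps its slot, and that the next send is issued as soon as the previous call returns.

```typescript
// Toy model of a sequential sessions_send broadcast through one command lane.
// All names here are illustrative assumptions, not OpenClaw internals.
interface SendResult {
  position: number;    // 1-based position in the broadcast
  issuedAtSec: number; // when the caller issued this send
  timedOut: boolean;   // true if the run had not finished by the caller's deadline
}

function simulateSequentialBroadcast(
  runSecs: number[],     // assumed duration of each recipient's nested run
  maxConcurrent: number, // lane concurrency (1 today for the nested lane)
  timeoutSec: number,    // per-call timeout (30 s in this report)
): { results: SendResult[]; wallClockSec: number } {
  // Time at which each lane slot becomes free again.
  const slotFreeAt: number[] = Array.from({ length: maxConcurrent }, () => 0);
  const results: SendResult[] = [];
  let now = 0;

  runSecs.forEach((runSec, i) => {
    // The queued run takes the slot that frees up first.
    let slot = 0;
    for (let s = 1; s < slotFreeAt.length; s++) {
      if (slotFreeAt[s] < slotFreeAt[slot]) slot = s;
    }
    const start = Math.max(now, slotFreeAt[slot]);
    const finish = start + runSec;
    slotFreeAt[slot] = finish; // the run keeps the slot even if the caller gives up

    const deadline = now + timeoutSec;
    results.push({ position: i + 1, issuedAtSec: now, timedOut: finish > deadline });

    // The caller moves on when the run finishes or the timeout fires, whichever is first.
    now = Math.min(finish, deadline);
  });

  return { results, wallClockSec: now };
}

// Worst case from the summary: 9 recipients whose runs all outlast the timeout.
const worstCase = simulateSequentialBroadcast(Array(9).fill(300), 1, 30);

// Head-of-line blocking: one slow recipient followed by fast ones.
const serialLane = simulateSequentialBroadcast([120, 5, 5, 5, 5], 1, 30);
const widerLane = simulateSequentialBroadcast([120, 5, 5, 5, 5], 4, 30);
```

Under these assumptions the worst case reproduces the 9 × 30 s figure, and the second pair of runs shows how a single slow recipient can push the sends queued behind it past their deadlines on a one-slot lane, while a four-slot lane confines the timeout to the slow recipient.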
Steps to reproduce
- Set up 3+ agents with agentToAgent enabled
- Have one agent use sessions_send to broadcast a message to all other agents sequentially
- Observe the nested lane queue growing in the gateway logs:
  lane enqueue: lane=nested queueSize=2
  lane enqueue: lane=nested queueSize=3
  lane enqueue: lane=nested queueSize=4
  ...
- Agents later in the queue will get timeout errors
Expected behavior
The nested lane concurrency should be configurable via openclaw.json, similar to how agents.defaults.maxConcurrent (Main lane) and agents.defaults.subagents.maxConcurrent (Subagent lane) work today.
For example:
{
  "agents": {
    "defaults": {
      "maxConcurrent": 8,
      "subagents": { "maxConcurrent": 12 },
      "nestedMaxConcurrent": 4
    }
  }
}
Or alternatively via a dedicated sessions config key:
{
  "agents": {
    "defaults": {
      "sessions": { "maxConcurrent": 4 }
    }
  }
}
Actual behavior
The nested lane is always maxConcurrent: 1, regardless of configuration. The applyGatewayLaneConcurrency() function in src/gateway/server-lanes.ts sets concurrency for Main, Cron, and Subagent lanes but omits Nested:
function applyGatewayLaneConcurrency(cfg) {
  setCommandLaneConcurrency(CommandLane.Cron, cfg.cron?.maxConcurrentRuns ?? 1);
  setCommandLaneConcurrency(CommandLane.Main, resolveAgentMaxConcurrent(cfg));
  setCommandLaneConcurrency(CommandLane.Subagent, resolveSubagentMaxConcurrent(cfg));
  // Missing: setCommandLaneConcurrency(CommandLane.Nested, ???);
}
The same omission exists in the hot-reload path, where lane concurrency is re-applied after config changes.
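One possible shape for the fix is sketched below. It assumes the first proposed config key (`agents.defaults.nestedMaxConcurrent`). `resolveNestedMaxConcurrent` and `applyNestedLaneConcurrency` are hypothetical names, and the `CommandLane` enum, config type, and `setCommandLaneConcurrency` shown here are local stand-ins so the snippet compiles on its own; the real definitions in OpenClaw may differ.

```typescript
// Stand-ins for the real OpenClaw definitions (assumed shapes, for illustration only).
enum CommandLane {
  Main = "main",
  Cron = "cron",
  Subagent = "subagent",
  Nested = "nested",
}

interface GatewayConfigSketch {
  agents?: {
    defaults?: {
      maxConcurrent?: number;
      subagents?: { maxConcurrent?: number };
      nestedMaxConcurrent?: number; // proposed key, does not exist today
    };
  };
}

const laneConcurrency = new Map<CommandLane, number>();

function setCommandLaneConcurrency(lane: CommandLane, max: number): void {
  laneConcurrency.set(lane, Math.max(1, Math.floor(max)));
}

// Hypothetical resolver: falls back to 1 so existing installs keep today's behavior.
function resolveNestedMaxConcurrent(cfg: GatewayConfigSketch): number {
  const raw = cfg.agents?.defaults?.nestedMaxConcurrent;
  if (typeof raw === "number" && Number.isFinite(raw) && raw >= 1) {
    return Math.floor(raw);
  }
  return 1;
}

// The line that would be added to both the startup and hot-reload paths.
function applyNestedLaneConcurrency(cfg: GatewayConfigSketch): void {
  setCommandLaneConcurrency(CommandLane.Nested, resolveNestedMaxConcurrent(cfg));
}

applyNestedLaneConcurrency({ agents: { defaults: { nestedMaxConcurrent: 4 } } });
const configuredNested = laneConcurrency.get(CommandLane.Nested);

applyNestedLaneConcurrency({});
const defaultNested = laneConcurrency.get(CommandLane.Nested);
```

Defaulting to 1 when the key is absent or invalid would keep the change backward compatible; whether a higher default is safe depends on lane semantics this report does not cover.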
Environment
- OpenClaw version: 2026.2.9
- OS: macOS 15.3.1 (Sequoia), Apple Silicon Mac Mini
- Install method: npm global (npm install -g openclaw)
- Node.js: v22.22.0
Logs or screenshots
Gateway diagnostic logs showing the queue growth during a broadcast:
17:47:40 lane enqueue: lane=nested queueSize=2
17:47:47 lane enqueue: lane=nested queueSize=3
17:48:10 lane enqueue: lane=nested queueSize=4
17:48:40 lane enqueue: lane=nested queueSize=5
17:49:10 lane enqueue: lane=nested queueSize=6
17:49:40 lane enqueue: lane=nested queueSize=7
Each sessions_send tool call returns after ~30 seconds with a timeout status, even though the message is eventually delivered.
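The enqueue cadence in the excerpt can be extracted mechanically. The helper below is a sketch that assumes every line matches the `HH:MM:SS lane enqueue: lane=<name> queueSize=<n>` shape shown above and that all timestamps fall on the same day; `parseEnqueueLines` and `gapsInSeconds` are hypothetical names, not part of OpenClaw.

```typescript
interface EnqueueEvent {
  atSec: number;     // seconds since midnight
  lane: string;
  queueSize: number;
}

// Parse lines of the form "17:47:40 lane enqueue: lane=nested queueSize=2".
// Lines that do not match are skipped.
function parseEnqueueLines(lines: string[]): EnqueueEvent[] {
  const pattern = /^(\d{2}):(\d{2}):(\d{2}) lane enqueue: lane=(\w+) queueSize=(\d+)$/;
  const events: EnqueueEvent[] = [];
  for (const line of lines) {
    const m = pattern.exec(line.trim());
    if (!m) continue;
    events.push({
      atSec: Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]),
      lane: m[4],
      queueSize: Number(m[5]),
    });
  }
  return events;
}

// Seconds between consecutive enqueue events.
function gapsInSeconds(events: EnqueueEvent[]): number[] {
  return events.slice(1).map((e, i) => e.atSec - events[i].atSec);
}

const excerpt = [
  "17:47:40 lane enqueue: lane=nested queueSize=2",
  "17:47:47 lane enqueue: lane=nested queueSize=3",
  "17:48:10 lane enqueue: lane=nested queueSize=4",
  "17:48:40 lane enqueue: lane=nested queueSize=5",
  "17:49:10 lane enqueue: lane=nested queueSize=6",
  "17:49:40 lane enqueue: lane=nested queueSize=7",
];

const events = parseEnqueueLines(excerpt);
const gaps = gapsInSeconds(events);
```

For this excerpt the gaps settle at 30 seconds from the fourth enqueue onward, which lines up with the 30-second timeout while the queue depth keeps increasing by one per send.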
Workaround
We're patching the bundled JS to inject setCommandLaneConcurrency(CommandLane.Nested, 4) at both places where lane concurrency is applied (startup and hot reload). The patch has to be re-applied after every update.
Related issues
- [Enhancement]: Lane backpressure management + tool call validation for multi-agent webchat sessions #11286 (lane backpressure management)
- Session lane can deadlock indefinitely after LLM timeout, blocking all subsequent messages #7630 (session lane deadlock)