feat: per-agent queue lanes (fixes #16055) #52655
yegorkryukov wants to merge 3 commits into openclaw:main from
Greptile Summary
This PR introduces per-agent queue lanes, allowing multi-agent setups to process messages in parallel rather than competing on the shared
Key changes:
Two issues to address before merging:
Confidence Score: 4/5
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/config/agent-limits.test.ts
Line: 77-83
Comment:
**Test will fail: assertion contradicts implementation**
The test expects `toBeUndefined()` when `laneConcurrency: 0` is supplied, but the implementation in `agent-limits.ts` does:
```typescript
if (typeof raw === "number" && Number.isFinite(raw)) {
  return Math.max(1, Math.floor(raw));
}
```
`0` is a finite number, so this branch is taken and returns `Math.max(1, 0)` = `1`, not `undefined`. Running this test as-is will produce a failing assertion.
Either fix the guard in `resolveAgentLaneConcurrency` to exclude non-positive values (matching the test's intent and the comment "0 is not positive, so zod would reject it, but the resolver guards anyway"):
```typescript
// in agent-limits.ts
if (typeof raw === "number" && Number.isFinite(raw) && raw > 0) {
  return Math.max(1, Math.floor(raw));
}
```
Or, if the clamping behaviour is intentional, update the expectation:
```suggestion
it("clamps to minimum of 1", () => {
  const cfg = {
    agents: { list: [{ id: "bot-a", laneConcurrency: 0 }] },
  } as unknown as OpenClawConfig;
  // 0 is not positive, so zod would reject it, but the resolver guards anyway
  expect(resolveAgentLaneConcurrency(cfg, "bot-a")).toBe(1);
});
```
How can I resolve this? If you propose a fix, please make it concise.
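To make the difference between the two options concrete, here is a minimal sketch of both guards as standalone functions. The names `clampingGuard` and `positiveOnlyGuard` are hypothetical; only the `if` conditions are taken from the comment above, and the `return undefined` fall-through is an assumption about what the resolver does when the branch is skipped.

```typescript
// Hypothetical stand-in for the guard as currently implemented:
// any finite number is accepted and clamped up to at least 1.
function clampingGuard(raw: unknown): number | undefined {
  if (typeof raw === "number" && Number.isFinite(raw)) {
    return Math.max(1, Math.floor(raw));
  }
  return undefined; // assumption: non-numeric input yields undefined
}

// Hypothetical stand-in for the suggested guard:
// non-positive numbers are rejected before any clamping happens.
function positiveOnlyGuard(raw: unknown): number | undefined {
  if (typeof raw === "number" && Number.isFinite(raw) && raw > 0) {
    return Math.max(1, Math.floor(raw));
  }
  return undefined; // assumption: rejected input yields undefined
}
```

With an input of `0`, the first variant returns `1` and the second returns `undefined`, which is the mismatch the failing assertion exposes. Both variants still round a fractional positive value such as `0.5` up to `1`.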
---
This is a comment left during a code review.
Path: src/config/agent-limits.ts
Line: 41-51
Comment:
**Unsafe type cast — `AgentConfig` type not updated**
`entry` is typed as `AgentConfig` (from `OpenClawConfig`), but `AgentConfig` in `types.agents.ts` does not yet declare `lane` or `laneConcurrency`. This forces the `(entry as Record<string, unknown> | undefined)` cast here and in `resolveAgentLane`, bypassing TypeScript's type checker for the new fields.
Adding the two optional fields to `AgentConfig` in `src/config/types.agents.ts` would let both functions access them without any cast and catch any future typos at compile time:
```typescript
// in types.agents.ts — AgentConfig
/** Custom global queue lane for this agent's inbound runs (default: "main"). */
lane?: string;
/** Concurrency cap for this agent's custom lane. */
laneConcurrency?: number;
```
This is also visible in the test file, which uses `as unknown as OpenClawConfig` throughout the new test cases to work around the missing types.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "feat: per-agent queue lanes to fix multi..."
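As a rough illustration of the typing point in the second comment, the sketch below declares the two optional fields and reads them without any cast. The interfaces model only the fields relevant here, and `findAgentEntry` and `resolveAgentLaneSketch` are hypothetical names; returning `undefined` when no custom lane is set is an assumption, not the PR's confirmed behaviour.

```typescript
// Assumed shapes: only the fields relevant to this review comment are modelled.
interface AgentConfig {
  id: string;
  /** Custom global queue lane for this agent's inbound runs (default: "main"). */
  lane?: string;
  /** Concurrency cap for this agent's custom lane. */
  laneConcurrency?: number;
}

interface OpenClawConfig {
  agents?: { list?: AgentConfig[] };
}

// Hypothetical helper: look up an agent entry by id.
function findAgentEntry(cfg: OpenClawConfig, agentId: string): AgentConfig | undefined {
  return cfg.agents?.list?.find((entry) => entry.id === agentId);
}

// Cast-free lane lookup. Returning undefined for "no custom lane" is an assumption.
function resolveAgentLaneSketch(cfg: OpenClawConfig, agentId: string): string | undefined {
  const lane = findAgentEntry(cfg, agentId)?.lane?.trim();
  return lane ? lane : undefined;
}

// A test-style config literal that type-checks without `as unknown as OpenClawConfig`.
const typedCfg: OpenClawConfig = {
  agents: { list: [{ id: "bot-a", lane: "lane-a", laneConcurrency: 3 }, { id: "bot-b" }] },
};
```

Once the fields are declared on the type, a misspelled property name in either the resolver or a test fixture becomes a compile-time error instead of a silent `undefined`.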
7b33388 to 427e5a8
Adds `lane` and `laneConcurrency` config fields to `agents.list[]` entries,
enabling each agent to use its own global queue lane instead of sharing the
default "main" lane.
Previously, all inbound messages from all agents were serialized through
the shared main lane (capped at maxConcurrent, default 4), creating a
bottleneck for multi-agent setups where independent bots should process
messages in parallel.
Changes:
- Add `lane` (string) and `laneConcurrency` (number) to AgentEntrySchema
- Add resolveAgentLane() and resolveAgentLaneConcurrency() helpers
- Pass per-agent lane to runEmbeddedPiAgent in the auto-reply dispatch
- Apply per-agent lane concurrency on gateway startup and hot-reload
- Add config labels and help text
- Add unit tests for the new resolvers
Config example:
  agents:
    list:
      - id: "agent-one"
        lane: "lane-one"
        laneConcurrency: 10
      - id: "agent-two"
        lane: "lane-two"
        laneConcurrency: 10
Fixes openclaw#16055
427e5a8 to d49366f
Address Greptile review feedback:
- Add lane and laneConcurrency fields to AgentConfig in types.agents.ts
- Remove (entry as Record<string, unknown>) casts in resolvers
- Remove (as unknown as OpenClawConfig) casts in tests
Issue:
Re:
Okay, I'll try again. It seems the problem is with my compilation.
I tested a related concurrency case locally and found another likely bottleneck that may be worth considering alongside this PR.

In my Discord setup, different channel-bound agents/sessions were still sometimes visibly queueing behind each other even though session routing was already isolated correctly. After ruling out

Relevant code path:

Current logic:
```typescript
const sessionLane = resolveSessionLane(params.sessionKey?.trim() || params.sessionId);
const globalLane = resolveGlobalLane(params.lane);
```
And in
```typescript
export function resolveGlobalLane(lane?: string) {
  const cleaned = lane?.trim();
  if (cleaned === CommandLane.Cron) {
    return CommandLane.Nested;
  }
  return cleaned ? cleaned : CommandLane.Main;
}
```
So when no explicit lane is passed, otherwise isolated embedded runs can still funnel through the same fallback global lane.

I tried a minimal local patch:
```typescript
const sessionLane = resolveSessionLane(params.sessionKey?.trim() || params.sessionId);
const resolvedLaneKey = params.sessionKey?.trim() || params.sessionId;
const globalLane =
  params.lane?.trim() || (resolvedLaneKey ? `agent-run:${resolvedLaneKey}` : resolveGlobalLane());
```
Result from local manual validation:

This is only local/manual validation, but it suggests that per-agent lanes alone may not fully eliminate cross-session serialization if

I opened a follow-up issue with more detail here: #55566

Might be worth checking whether this PR should also cover the embedded global lane fallback path, or whether that should stay as a separate follow-up change.
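The fallback behaviour described in this comment can be reproduced in isolation. The sketch below copies `resolveGlobalLane` from the snippet above and wraps the "current" and "patched" lane selection in two hypothetical functions. The string values of `CommandLane` and the `RunParams` shape are assumptions made only so the example runs on its own.

```typescript
// Assumed lane names; the real CommandLane values are not shown in this thread.
const CommandLane = { Main: "main", Cron: "cron", Nested: "nested" } as const;

// Copied from the snippet quoted in the comment above.
function resolveGlobalLane(lane?: string): string {
  const cleaned = lane?.trim();
  if (cleaned === CommandLane.Cron) {
    return CommandLane.Nested;
  }
  return cleaned ? cleaned : CommandLane.Main;
}

// Hypothetical parameter shape covering only the fields used here.
interface RunParams {
  sessionKey?: string;
  sessionId?: string;
  lane?: string;
}

// Current logic: without an explicit lane, every run lands on the shared fallback lane.
function currentGlobalLane(params: RunParams): string {
  return resolveGlobalLane(params.lane);
}

// Patched logic from the comment: derive a per-session lane when no lane is passed.
function patchedGlobalLane(params: RunParams): string {
  const resolvedLaneKey = params.sessionKey?.trim() || params.sessionId;
  return (
    params.lane?.trim() ||
    (resolvedLaneKey ? `agent-run:${resolvedLaneKey}` : resolveGlobalLane())
  );
}
```

Two runs with different session ids and no explicit lane share one global lane under the current logic, but get distinct `agent-run:` lanes under the patch. One detail worth noting in this sketch: because the patched expression reads `params.lane` directly, an explicit `"cron"` lane is no longer remapped to the nested lane, so a real change would probably want to keep routing explicit lanes through `resolveGlobalLane`.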
Re: embedded global lane fallback
@jlin53882, good catch, and I can reproduce the issue in the source.

What this PR covers
is wired through exactly one call site: (the main inbound reply path). That's the hot path for normal user messages, so per-agent lane config works correctly there.

What it doesn't cover
Several other
All of these fall back to

My take on scope
Your patch (deriving a per-session global lane from
I think the cleanest fix is surgical: pass
I'll scope that into this PR. It's a small change and directly connected to the feature. Will push shortly.
42c6677 to 48b0fda
Pushed the fix. Two changes in the new commit:
Intentionally left out:
All 12 tests still pass, no new type errors in changed files.
Problem
All inbound messages from all agents are serialized through the shared `main` queue lane, capped at `agents.defaults.maxConcurrent` (default 4). For multi-agent setups with 5+ independent bots, this creates a bottleneck where agents queue behind each other despite being completely independent (separate bot tokens, workspaces, and sessions).
Reported in #16055: setting `maxConcurrent: 100` does not help because the value wasn't propagating to the main lane correctly for per-agent isolation.

Solution
Add `lane` and `laneConcurrency` config fields to `agents.list[]` entries. When configured, an agent's inbound runs use a dedicated global queue lane instead of `"main"`, enabling true parallel processing across agents.

Config example
Agents without a custom `lane` continue to use the shared `main` lane (backwards compatible).

Changes
- `src/config/zod-schema.agent-runtime.ts`: Add `lane` (string) and `laneConcurrency` (positive int) to `AgentEntrySchema`
- `src/config/agent-limits.ts`: Add `resolveAgentLane()` and `resolveAgentLaneConcurrency()` helpers
- `src/auto-reply/reply/agent-runner-execution.ts`: Pass per-agent lane to `runEmbeddedPiAgent` in the auto-reply dispatch path
- `src/gateway/server-lanes.ts`: Apply per-agent lane concurrency on gateway startup
- `src/gateway/server-reload-handlers.ts`: Apply per-agent lane concurrency on hot-reload (SIGUSR1 / config change)
- `src/config/schema.labels.ts` + `schema.help.ts`: Config labels and help text
- `src/config/agent-limits.test.ts`: Unit tests for the new resolvers

How it works
The existing queue in `command-queue.ts` already supports arbitrary named lanes with independent concurrency caps via `setCommandLaneConcurrency()`. This PR simply:
When `laneConcurrency` is omitted, the custom lane inherits `agents.defaults.maxConcurrent`.

Backwards compatible
`lane` use the shared `main` lane as before

Fixes #16055
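
To illustrate the idea behind the description above, here is a toy, synchronous model of named lanes with independent concurrency caps. It is not the `command-queue.ts` implementation: `LaneScheduler`, `AgentLaneEntry`, and `applyAgentLanes` are hypothetical names, and the only behaviour taken from the description is that each lane has its own cap and that a custom lane without `laneConcurrency` inherits the default `maxConcurrent`.

```typescript
// Toy model of named queue lanes; every name in this block is hypothetical.
class LaneScheduler {
  private readonly defaultCap: number;
  private readonly caps = new Map<string, number>();
  private readonly running = new Map<string, number>();
  private readonly waiting = new Map<string, string[]>();

  constructor(defaultCap: number) {
    this.defaultCap = defaultCap;
  }

  // Set the concurrency cap for one lane (at least 1).
  setConcurrency(lane: string, cap: number): void {
    this.caps.set(lane, Math.max(1, Math.floor(cap)));
  }

  getConcurrency(lane: string): number {
    return this.caps.get(lane) ?? this.defaultCap;
  }

  // Returns true if the task starts immediately, false if it has to wait in its lane.
  submit(lane: string, taskId: string): boolean {
    const active = this.running.get(lane) ?? 0;
    if (active < this.getConcurrency(lane)) {
      this.running.set(lane, active + 1);
      return true;
    }
    const queue = this.waiting.get(lane) ?? [];
    queue.push(taskId);
    this.waiting.set(lane, queue);
    return false;
  }

  // Marks one running task in the lane as finished.
  // Returns the id of the waiting task that takes its slot, if any.
  finish(lane: string): string | undefined {
    const next = this.waiting.get(lane)?.shift();
    if (next === undefined) {
      const active = this.running.get(lane) ?? 0;
      this.running.set(lane, Math.max(0, active - 1));
    }
    return next;
  }
}

// Assumed subset of an agents.list[] entry.
interface AgentLaneEntry {
  id: string;
  lane?: string;
  laneConcurrency?: number;
}

// Give each custom lane its own cap; lanes without laneConcurrency inherit maxConcurrent.
function applyAgentLanes(
  scheduler: LaneScheduler,
  agents: AgentLaneEntry[],
  maxConcurrent: number,
): void {
  for (const agent of agents) {
    const lane = agent.lane?.trim();
    if (!lane) continue; // no custom lane: the agent stays on the shared lane
    scheduler.setConcurrency(lane, agent.laneConcurrency ?? maxConcurrent);
  }
}
```

In this model, two agents submitting to the same lane with a cap of 1 serialize (the second `submit` returns `false`), while the same two agents on `lane-one` and `lane-two` both start immediately, since a full lane never blocks another lane.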