[Bug]: Gateway agentRunSeq Map Never Pruned Causes Memory Exhaustion #6036
Description
CVSS Assessment
| Metric | Value |
|---|---|
| Score | 6.5 / 10.0 |
| Severity | Medium |
| Vector | CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H |
Summary
The gateway server maintains an agentRunSeq Map to track sequence numbers for agent runs, but this Map is never pruned. Unlike other Maps in the gateway context (dedupe, chatAbortControllers, chatRunState.abortedRuns) which are cleaned up in the maintenance loop, agentRunSeq accumulates entries indefinitely, causing slow memory exhaustion on long-running gateway servers.
Affected Code
File: src/gateway/server-runtime-state.ts:155
const agentRunSeq = new Map<string, number>();
File: src/gateway/server-chat.ts:246-264
const last = agentRunSeq.get(evt.runId) ?? 0;
if (evt.seq <= last) {
  // skip duplicate
}
agentRunSeq.set(evt.runId, evt.seq); // Entry added, never removed
File: src/gateway/server-maintenance.ts:75-117
// Maintenance loop prunes other maps but NOT agentRunSeq
const dedupeCleanup = setInterval(() => {
  // Prunes: dedupe, chatAbortControllers, chatRunState.abortedRuns
  // MISSING: agentRunSeq is never cleaned up
}, 60_000);
Attack Surface
How is this reached?
- Network (HTTP/WebSocket endpoint, API call)
- Adjacent Network (same LAN, requires network proximity)
- Local (local file, CLI argument, environment variable)
- Physical (requires physical access to machine)
Authentication required?
- None (unauthenticated/public access)
- Low (any authenticated user)
- High (admin/privileged user only)
Entry point: Any authenticated gateway client initiating agent runs via WebSocket connection
Exploit Conditions
Complexity:
- Low (no special conditions, works reliably)
- High (requires race condition, specific config, or timing)
User interaction:
- None (automatic, no victim action needed)
- Required (victim must click, visit, or perform action)
Prerequisites: Gateway server running with authenticated clients making agent requests over time
Impact Assessment
Scope:
- Unchanged (impact limited to vulnerable component)
- Changed (can affect other components, escape sandbox)
What can an attacker do?
| Impact Type | Level | Description |
|---|---|---|
| Confidentiality | None | No data disclosure |
| Integrity | None | No data modification |
| Availability | High | Gateway process memory exhaustion leading to OOM crash or degraded performance |
Steps to Reproduce
- Start a gateway server with persistent uptime
- Make repeated agent runs from authenticated clients over days/weeks
- Monitor memory usage of the gateway process
- Observe that memory grows linearly with the total number of unique runId values ever seen
- Eventually the gateway becomes unresponsive or crashes due to memory exhaustion
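The growth described in the steps above can be approximated in isolation, without waiting days on a live gateway. The following is a minimal sketch, not the gateway's actual code: `AgentEvent`, `recordEvent`, and `simulateRuns` are hypothetical names, and the early return on duplicates is an assumption about what the "skip duplicate" branch does.

```typescript
// Hypothetical stand-in for the gateway's agentRunSeq map.
const agentRunSeq = new Map<string, number>();

// Assumed event shape: only the two fields used in the quoted snippet.
interface AgentEvent {
  runId: string;
  seq: number;
}

// Follows the shape of the logic quoted from server-chat.ts:
// remember the highest sequence number seen for each runId.
function recordEvent(evt: AgentEvent): boolean {
  const last = agentRunSeq.get(evt.runId) ?? 0;
  if (evt.seq <= last) {
    return false; // duplicate or stale event, skipped (assumed behavior)
  }
  agentRunSeq.set(evt.runId, evt.seq); // entry added, never removed
  return true;
}

// Simulate many completed runs. Nothing ever deletes their entries,
// so the map keeps one entry per unique runId.
function simulateRuns(count: number, eventsPerRun: number): number {
  for (let i = 0; i < count; i++) {
    const runId = `run-${i}`;
    for (let seq = 1; seq <= eventsPerRun; seq++) {
      recordEvent({ runId, seq });
    }
  }
  return agentRunSeq.size;
}

const retained = simulateRuns(50_000, 3);
console.log(`entries retained after all runs finished: ${retained}`);
```

The map size tracks the number of unique runId values, not the number of runs currently active, which is why memory only ever grows on a long-lived process.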
Recommended Fix
Add agentRunSeq to the maintenance cleanup loop in server-maintenance.ts:
// Alongside the dedupeCleanup interval, add:
const agentRunSeqCleanup = setInterval(() => {
  const maxAge = 60 * 60 * 1000; // 1 hour (only needed for Option 2)
  const now = Date.now(); // only needed for Option 2
  // Either track timestamps with entries, or clear periodically
  // Option 1: Clear the entire map once it exceeds a size cap (loses sequence tracking for all runs, including active ones)
  if (agentRunSeq.size > 10000) {
    agentRunSeq.clear();
  }
  // Option 2: Track timestamps and prune entries older than maxAge
}, 60_000);
References
- CWE: CWE-770 - Allocation of Resources Without Limits or Throttling
- Related: Similar to other unbounded cache issues in the codebase (presence cache, sticker cache, rate limit map)
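Option 2 from the Recommended Fix (track timestamps and prune old entries) is only outlined in a comment. A minimal sketch of one way it could look is shown here. It assumes the map's value type is changed from a bare number to a small record; `SeqEntry`, `recordSeq`, `pruneAgentRunSeq`, and `MAX_AGE_MS` are hypothetical names, not existing gateway identifiers.

```typescript
// Hypothetical entry shape: the original map stores only the sequence number.
interface SeqEntry {
  seq: number;
  lastSeenMs: number; // when this runId last produced an accepted event
}

const agentRunSeq = new Map<string, SeqEntry>();
const MAX_AGE_MS = 60 * 60 * 1000; // 1 hour, matching the value suggested in the fix

// Record an event and refresh the entry's timestamp.
// Returns false when the event is a duplicate or older than what was seen.
function recordSeq(runId: string, seq: number, now: number = Date.now()): boolean {
  const last = agentRunSeq.get(runId)?.seq ?? 0;
  if (seq <= last) {
    return false;
  }
  agentRunSeq.set(runId, { seq, lastSeenMs: now });
  return true;
}

// Remove entries that have been idle for longer than MAX_AGE_MS.
// Deleting the entry currently being visited is safe with Map.prototype.forEach.
function pruneAgentRunSeq(now: number = Date.now()): number {
  let removed = 0;
  agentRunSeq.forEach((entry, runId) => {
    if (now - entry.lastSeenMs > MAX_AGE_MS) {
      agentRunSeq.delete(runId);
      removed++;
    }
  });
  return removed;
}
```

Calling `pruneAgentRunSeq()` from the existing 60-second maintenance interval would bound the map by the number of runs active within the last hour. The trade-off is that a duplicate event arriving after its entry has been pruned would be accepted as new, so the age limit should exceed the longest expected gap between events of a single run.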