fix(oom): cap output buffers to stop 96GB memory blowups #288

subsy merged 6 commits into subsy:main
Conversation

@dyxushuai is attempting to deploy a commit to the plgeek Team on Vercel. A member of the Team first needs to authorize it.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough

Adds memory-bounded streaming for stdout/stderr, per-iteration raw on-disk capture and plumbing to persist raw files, and exposes test helpers; includes unit tests for buffer-limiting and file-backed log streaming.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Engine
    participant AgentPlugin
    participant Filesystem
    participant LogsPersistence
    Engine->>AgentPlugin: start iteration (capture IO)
    AgentPlugin->>AgentPlugin: onStdout/onStderr -> appendWithCharLimit
    AgentPlugin->>Filesystem: write raw stdout/stderr chunks to temp files
    AgentPlugin-->>Engine: return AgentExecutionResult (includes raw file paths)
    Engine->>LogsPersistence: saveIterationLog(..., rawStdoutFilePath, rawStderrFilePath)
    LogsPersistence->>Filesystem: stream raw files into persisted log
    LogsPersistence-->>Engine: persisted log metadata
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ Passed checks (4 passed)

No actionable comments were generated in the recent review. 🎉
Codecov Report: ❌ Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff            @@
##             main     #288     +/-  ##
=========================================
+ Coverage   44.17%   44.47%   +0.29%
=========================================
  Files          95       96       +1
  Lines       29839    30132     +293
=========================================
+ Hits        13181    13400     +219
- Misses      16658    16732      +74
```
Pull request overview
This PR adds in-memory output truncation to prevent unbounded stdout/stderr growth during long-running ralph-tui run sessions that can lead to OOMs.
Changes:

- Cap agent execution stdout/stderr buffers held by `BaseAgentPlugin`.
- Cap engine "live output" buffers and truncate per-iteration in-memory history output while preserving tail markers.
- Add regression tests covering truncation behavior and tail preservation.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/plugins/agents/base.ts | Adds a capped append helper and applies it to captured agent stdout/stderr buffers. |
| src/plugins/agents/base.memory.test.ts | Tests helper behavior for truncation + tail preservation. |
| src/engine/index.ts | Caps engine live output buffers and truncates iteration-history agentResult stored in memory; exposes helpers for tests. |
| src/engine/memory-limits.test.ts | Tests engine-side truncation of stored iteration history output. |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/engine/index.ts (1)
1243-1250: ⚠️ Potential issue | 🟡 Minor — Disk log fallback for stderr may receive a truncated buffer.

The call `agentResult.stderr ?? this.state.currentStderr` falls back to the engine's live buffer, which is now capped at 250 K. If the agent's own `stderr` field is empty while `currentStderr` overflowed, the on-disk iteration log will silently lose early stderr content. Given the PR's stated goal of keeping "full raw output in on-disk iteration logs", you may want to accumulate a separate unbounded (or disk-backed) stderr buffer for the log writer, or at least document this as an accepted limitation.
🧹 Nitpick comments (1)
src/plugins/agents/base.ts (1)
147-180: Duplicate implementation of `appendWithCharLimit` — extract to a shared utility.

This function is duplicated verbatim in `src/engine/index.ts` (lines 92–125). Both copies share the same algorithm and signature. If a bug is found or the truncation logic changes, both sites must be updated in lockstep. Consider extracting it into a shared module (e.g., `src/utils/string-limits.ts`) and importing it in both `base.ts` and `engine/index.ts`, each supplying its own constants and prefix.

♻️ Proposed shared utility

Create a new file `src/utils/string-limits.ts`:

```typescript
/**
 * ABOUTME: Shared helper for capping in-memory string buffers while preserving tail content.
 */

/**
 * Append chunk data while enforcing an in-memory size cap.
 * Keeps the most recent content (tail) to preserve completion markers near the end.
 */
export function appendWithCharLimit(
  current: string,
  chunk: string,
  maxChars: number,
  prefix: string
): string {
  if (!chunk) return current;
  if (maxChars <= 0) return '';
  const totalLen = current.length + chunk.length;
  if (totalLen <= maxChars) {
    return current + chunk;
  }
  if (maxChars <= prefix.length) {
    return prefix.slice(0, maxChars);
  }
  const keep = maxChars - prefix.length;
  const combinedTailStart = totalLen - keep;
  let tail: string;
  if (combinedTailStart >= current.length) {
    const startInChunk = combinedTailStart - current.length;
    tail = chunk.slice(startInChunk);
  } else {
    const tailFromCurrent = current.slice(combinedTailStart);
    const remaining = keep - tailFromCurrent.length;
    const tailFromChunk = remaining > 0 ? chunk.slice(-remaining) : '';
    tail = tailFromCurrent + tailFromChunk;
  }
  return prefix + tail;
}
```

Then in both `base.ts` and `engine/index.ts`:

```diff
-function appendWithCharLimit(
-  current: string,
-  chunk: string,
-  maxChars: number,
-  prefix = STREAM_TRUNCATED_PREFIX
-): string {
-  // ... duplicated body ...
-}
+import { appendWithCharLimit } from '../../utils/string-limits.js';
```
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/engine/index.ts`:
- Around line 1046-1058: The inner try/catch that creates rawOutputTempDir and
opens rawStdoutFd/rawStderrFd currently clears rawOutputTempDir on any failure
which prevents the outer finally cleanup from removing the temp dir; instead,
preserve rawOutputTempDir so the finally block can delete it: in the catch for
the mkdtemp/openSync block do not set rawOutputTempDir = undefined, only close
any successfully opened descriptors (rawStdoutFd, rawStderrFd) and clear the
corresponding file path variables (rawStdoutFilePath, rawStderrFilePath) as
needed (use closeRawOutputFiles or explicit closeSync), then rethrow or allow
the outer logic to handle the error so the finally cleanup that removes
rawOutputTempDir can run.
In `@src/logs/persistence.ts`:
- Around line 680-714: The finally block currently calls
closeWriteStream(stream) unconditionally which can throw and mask an earlier
error from writeChunkToStream or pipeFileToStream; change the pattern in the raw
file-stream branch so you capture any primary error (e.g., assign to a variable
like originalError inside the try/catch) and then in finally attempt to
closeWriteStream(stream) inside its own try/catch—if closeWriteStream throws,
rethrow the originalError if present, otherwise rethrow the close error;
reference stream, writeChunkToStream, pipeFileToStream, and closeWriteStream
when making this change so the original write/pipe errors aren’t swallowed by a
close failure.
🧹 Nitpick comments (2)
src/engine/index.ts (2)
1020-1044: A write failure on one stream disables both raw output streams.

If `writeSync` to stdout fails, both `rawStdoutFilePath` and `rawStderrFilePath` are set to `undefined`, losing the stderr raw capture as well (and vice-versa). This is probably intentional as a defensive measure (shared temp directory likely broken), but worth a brief inline comment to clarify intent.
75-129: Extract `appendWithCharLimit` to a shared utility module.

The function is defined identically in both `src/engine/index.ts` (line 96) and `src/plugins/agents/base.ts` (line 147), differing only in the default prefix parameter. This duplication risks divergence and violates DRY principles. Extract it into a shared utility module (e.g. `src/utils/buffer.ts`) and import from both locations, passing the context-specific prefix strings as needed.
@dyxushuai thanks for reporting - will look at this tomorrow

The latest updates on your projects. Learn more about Vercel for GitHub.

thanks @dyxushuai - made a few final fixes, and merging this
Why

We hit a real OOM during a long `ralph-tui run` session, even on a 96 GB machine. The process was retaining output in several overlapping places:

Those copies kept growing over time and eventually exhausted memory.
How

- Cap `currentOutput`/`currentStderr`.
- Cap `current + chunk` allocations under heavy streaming output.

Tests

- `bun test src/plugins/agents/base.memory.test.ts src/engine/memory-limits.test.ts tests/logs/persistence.test.ts`
- `bun run typecheck`
- `bun run build`
- `ubs ...` (not run in this environment: command not found)

Summary by CodeRabbit
- New Features
- Bug Fixes
- Tests