Summary
In dist/probe-*.js (the IMessageRpcClient class, an imsg rpc JSON-RPC client that talks to a child process over its stdin), request() calls this.child.stdin.write(line) without an error callback, and no 'error' listener is attached to the stream. When the child closes its stdin (rate limit, child crash, network hiccup), the write fails and Node emits an asynchronous 'error' event carrying EPIPE. A try/catch around request() cannot catch an asynchronous write failure, and with no 'error' listener on the stream the error escapes to the process-level uncaughtException handler, which calls process.exit(1).
Affected code (2026.4.29, file probe-DGfoCahw.js:113)
async request(method, params, opts) {
  if (!this.child || !this.child.stdin) throw new Error("imsg rpc not running");
  const id = this.nextId++;
  const line = `${JSON.stringify({ jsonrpc: "2.0", id, method, params: params ?? {} })}\n`;
  const timeoutMs = opts?.timeoutMs ?? 1e4;
  const response = new Promise((resolve, reject) => {
    const key = String(id);
    const timer = timeoutMs > 0 ? setTimeout(() => {
      this.pending.delete(key);
      reject(new Error(`imsg rpc timeout (${method})`));
    }, timeoutMs) : void 0;
    this.pending.set(key, { resolve, reject, timer });
  });
  this.child.stdin.write(line); // ← no callback, no error handler
  return await response;
}
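To illustrate why a try/catch around the write does not help, here is a minimal sketch that swaps child.stdin for an in-process Writable whose writes always fail on a later tick with an EPIPE-like error. The names makeBrokenPipe and observeWriteFailure are hypothetical and not part of the codebase; this illustrates the stream behaviour under that assumption and is not a reproduction of the real pipe.

```javascript
const { Writable } = require('node:stream');

// Hypothetical stand-in for child.stdin: every write fails on a later tick,
// the way a write to a closed pipe reports EPIPE.
function makeBrokenPipe() {
  return new Writable({
    write(chunk, encoding, callback) {
      const err = Object.assign(new Error('write EPIPE'), { code: 'EPIPE' });
      setImmediate(() => callback(err));
    },
  });
}

// Writes one line and records where the failure became visible.
function observeWriteFailure() {
  return new Promise((resolve) => {
    const stdin = makeBrokenPipe();
    const seen = { caughtSync: false, order: [] };
    stdin.on('error', (err) => {
      seen.order.push(`event:${err.code}`);
      resolve(seen);
    });
    try {
      stdin.write('{"jsonrpc":"2.0","id":1}\n', (err) => {
        seen.order.push(`callback:${err ? err.code : 'ok'}`);
      });
    } catch (err) {
      seen.caughtSync = true; // not reached: write() itself does not throw here
    }
  });
}

observeWriteFailure().then((seen) => console.log(seen));
// { caughtSync: false, order: [ 'callback:EPIPE', 'event:EPIPE' ] }
```

The failure is reported first to the write callback and then as an 'error' event on the stream. If the 'error' listener in this sketch is removed, Node raises the error as an uncaught exception, which is the failure mode this report describes.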
Symptom (real-world incident)
2026-05-01T10:50:57.032+08:00 [openclaw] Uncaught exception: Error: write EPIPE
at <CodexAppServerClient or IMessageRpcClient>.writeMessage
Supervisor:
gateway exited (code=1, total=38044s, listen=38026s)
Gateway had been up 10.5 hours; one EPIPE killed it.
Same pattern was present in 2026.4.11
- harness-CmLE805l.js:478 (CodexAppServerClient.writeMessage): refactored away in 2026.4.29 (good)
- probe-Bh4qEP-V.js:343 (IMessageRpcClient.request): still present at probe-DGfoCahw.js:113
Other unfixed sites (lower risk but same class of bug)
In 2026.4.29:
- exec-BgVqrNG-.js: child.stdin.write(input ?? "") after spawn (one-shot pattern)
- supervisor-8fcihB5y.js: child.stdin.write(params.input) (one-shot, mirrored pattern)
- bash-tools.exec-runtime-*.js (if still present): PTY DSR cursor response from stdout event handler
Suggested patch
this.child.stdin.write(line, (err) => {
  if (err) {
    const pending = this.pending.get(String(id));
    if (pending) {
      this.pending.delete(String(id));
      if (pending.timer) clearTimeout(pending.timer);
      pending.reject(err);
    }
  }
});
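This rejects the awaiting request() promise with the EPIPE error, so the caller sees a clean exception, and leaves the rest of the pending-map cleanup to the existing failAll / stop paths. One caveat: per the Node.js stream documentation, the write callback is called before 'error' is emitted, not instead of it, so on its own this patch is unlikely to prevent the uncaughtException. It needs to be paired with an 'error' listener on the stream.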
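Because the same request could in principle be failed from more than one place (the write callback, the timeout, failAll), the cleanup in the patch can be factored into a small idempotent helper. The sketch below assumes the pending-map entry shape shown in request(), that is { resolve, reject, timer }. The name rejectPending is hypothetical and does not exist in the codebase.

```javascript
// Hypothetical helper (not in the codebase): fail one pending request at most once.
// Assumes entries shaped like those stored by request(): { resolve, reject, timer }.
function rejectPending(pending, key, err) {
  const entry = pending.get(key);
  if (!entry) return false; // already settled by a response, a timeout, or failAll
  pending.delete(key);
  if (entry.timer) clearTimeout(entry.timer);
  entry.reject(err);
  return true;
}

// Demo with a plain Map standing in for this.pending.
const demoPending = new Map();
const demoRequest = new Promise((resolve, reject) => {
  demoPending.set('7', { resolve, reject, timer: setTimeout(() => {}, 10_000) });
});
demoRequest.catch((err) => console.log('request rejected:', err.message));

console.log(rejectPending(demoPending, '7', new Error('write EPIPE'))); // true
console.log(rejectPending(demoPending, '7', new Error('write EPIPE'))); // false
```

Inside request(), the write callback would then reduce to: if (err) rejectPending(this.pending, String(id), err);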
Reproduce locally (rough)
- Start gateway with anything that uses imsg rpc (e.g. embedded Codex app-server, or one of the bundled plugins that use IMessageRpcClient).
- Force the child to close its stdin while a request is in-flight (kill the child process, close the underlying transport).
- Trigger another request() on the (now closed) stdin → uncaught EPIPE → process.exit(1).
Environment
- Node v22.22
- Termux on Android (glibc wrapper), but EPIPE behaviour is kernel-level and reproduces on any platform.
- Same code path exists on Linux/macOS Node installs.
Why an 'error' listener on the stream is also worth considering
Even with the per-write callback fix, a child.stdin.on('error', ...) listener (attached once at client init, either a no-op or routed to failAll) would catch races where write() returns successfully but the kernel later signals EPIPE on flush, before any specific request can claim the error.
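A minimal sketch of such a listener follows. The body of failAll here is an assumption about what the existing failAll path does (reject and clear every pending request), guardStdin is a hypothetical name, and an EventEmitter stands in for child.stdin so the error can be emitted by hand.

```javascript
const { EventEmitter } = require('node:events');

// Assumed behaviour of the existing failAll path: reject and forget every pending request.
function failAll(pending, err) {
  for (const entry of pending.values()) {
    if (entry.timer) clearTimeout(entry.timer);
    entry.reject(err);
  }
  pending.clear();
}

// Hypothetical init-time hook: one listener for the lifetime of the stream.
function guardStdin(stdin, pending) {
  stdin.on('error', (err) => failAll(pending, err));
}

// Demo: two requests are in flight when the stream reports EPIPE.
const fakeStdin = new EventEmitter();
const inFlight = new Map();
const failures = [];
for (const key of ['1', '2']) {
  inFlight.set(key, {
    resolve() {},
    reject: (err) => failures.push(`${key}:${err.code}`),
    timer: setTimeout(() => {}, 10_000),
  });
}
guardStdin(fakeStdin, inFlight);
fakeStdin.emit('error', Object.assign(new Error('write EPIPE'), { code: 'EPIPE' }));
console.log(failures, inFlight.size); // [ '1:EPIPE', '2:EPIPE' ] 0
```

Routing to failAll instead of a no-op means callers learn about the broken pipe immediately, without waiting for each request's timeout to fire.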