fix(codex): reject start promise immediately on process exit during startup #1929

Merged
piorpua merged 1 commit into main from fix/sentry-ELECTRON-E4
Mar 30, 2026

Conversation

@kaizhou-lab
Collaborator

Summary

  • Sentry issues: ELECTRON-E4 (7 events), ELECTRON-G1 (4 events) — "Codex process failed to start or was killed during startup"
  • When the Codex child process exits during startup, the exit handler now rejects the start promise immediately with the actual exit code and signal, instead of waiting for the 5-second timeout to fire with a generic error message
  • Added unit tests covering startup exit rejection, spawn error, and successful startup scenarios
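
The behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the actual code in CodexConnection.ts: the function name `waitForStartup`, the constant `STARTUP_TIMEOUT_MS`, the `alive` flag, and the use of a bare `EventEmitter` in place of the spawned child process are all assumptions made for the example.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical default mirroring the 5-second startup window in the PR description.
const STARTUP_TIMEOUT_MS = 5000;

// `child` stands in for the spawned process; only its "exit" and "error" events are used.
function waitForStartup(
  child: EventEmitter,
  timeoutMs: number = STARTUP_TIMEOUT_MS
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let alive = true;

    child.once("exit", (code: number | null, signal: string | null) => {
      alive = false;
      // Reject right away with the real exit details instead of waiting for the timer.
      reject(
        new Error(`Codex process exited during startup (code: ${code}, signal: ${signal})`)
      );
    });

    child.once("error", (err: Error) => {
      alive = false;
      reject(err);
    });

    setTimeout(() => {
      if (alive) {
        resolve();
      } else {
        // Redundant after an early exit: a promise settles only once, so this is a no-op.
        reject(new Error("Codex process failed to start or was killed during startup"));
      }
    }, timeoutMs);
  });
}
```

With this shape, a caller sees the exit code and signal in the rejection message as soon as the process dies, rather than the generic message five seconds later.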

Test plan

  • bun run lint — 0 errors
  • bunx tsc --noEmit — passes
  • bunx vitest run — all relevant tests pass (pre-existing failures in unrelated previewFileWatch.dom.test.ts)
  • Verify in production that Codex startup errors now include exit code info in Sentry

…tartup

When the Codex child process exits during startup (before the 5-second
timeout), the exit handler now rejects the start promise immediately with
the actual exit code and signal, instead of waiting for the timeout to
fire with a generic "failed to start or was killed" message.

This provides more informative error messages for debugging and prevents
the error from being reported to Sentry as a generic startup failure.

Closes: ELECTRON-E4, ELECTRON-G1
@kaizhou-lab kaizhou-lab marked this pull request as ready for review March 30, 2026 08:57
@piorpua piorpua added the bot:reviewing Review in progress (mutex) label Mar 30, 2026
@piorpua
Contributor

piorpua commented Mar 30, 2026

Code Review: fix(codex): reject start promise immediately on process exit during startup (#1929)

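Change Overview

This PR fixes Sentry issues ELECTRON-E4 and ELECTRON-G1 ("Codex process failed to start or was killed during startup"). It adds logic to the exit event handler in CodexConnection.ts that rejects the start promise immediately, instead of waiting for the 5-second timeout to report the error. It also adds tests/unit/codexConnectionStartup.test.ts, which covers three scenarios: startup failure, spawn error, and successful startup.


Approach Assessment

Verdict: ✅ The approach is sound

The fix targets the root cause precisely: when the process exits during startup, the promise is no longer rejected only after the 5-second timeout. It is rejected as soon as the exit event fires, with the specific exit code attached. This is fully consistent with the existing architecture and introduces no new complexity. The tests cover all three scenarios, which is sufficient coverage.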


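Issue List

🔵 LOW: The 5-second setTimeout still fires after the process exits early

File: src/process/agent/codex/connection/CodexConnection.ts, lines 295–301

Description

When the process exits with a non-zero code, the exit handler calls reject() immediately and sets this.child to null via handleProcessExit(). Five seconds later the setTimeout still fires; because this.child === null, it takes the else branch and calls reject() again. A JavaScript Promise can be settled only once, so the second reject() is a no-op and behavior is unaffected, but the redundant call could be eliminated with a flag or clearTimeout.

Suggested fix (optional): keep a settled flag inside the Promise and route both the exit handler and the setTimeout callback through settleReject / settleResolve wrappers to avoid duplicate calls.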
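
One way the optional suggestion might look is sketched below. It is an illustration under stated assumptions rather than the project's implementation: `waitForStartupSettledOnce` and the plain `EventEmitter` standing in for the child process are hypothetical, while the `settled` flag and the `settleResolve` / `settleReject` names follow the wording of the suggestion. The sketch also assumes the timer resolves the promise when the process is still alive.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical helper: settle the startup promise exactly once and clean up afterwards.
function waitForStartupSettledOnce(child: EventEmitter, timeoutMs: number = 5000): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    let settled = false;

    // Cancel the pending timer and detach listeners so nothing fires after settling.
    const cleanup = (): void => {
      clearTimeout(timer);
      child.off("exit", onExit);
      child.off("error", onError);
    };

    const settleResolve = (): void => {
      if (settled) return;
      settled = true;
      cleanup();
      resolve();
    };

    const settleReject = (err: Error): void => {
      if (settled) return;
      settled = true;
      cleanup();
      reject(err);
    };

    const onExit = (code: number | null, signal: string | null): void => {
      settleReject(
        new Error(`Codex process exited during startup (code: ${code}, signal: ${signal})`)
      );
    };

    const onError = (err: Error): void => settleReject(err);

    child.once("exit", onExit);
    child.once("error", onError);

    // Only reached if neither "exit" nor "error" arrived within the startup window.
    const timer = setTimeout(settleResolve, timeoutMs);
  });
}
```

Because the timer is cleared on an early exit, the timeout callback never runs in that case, which removes the redundant reject() the review points out. Clearing the timer also avoids keeping a 5-second timer pending after the promise has already been rejected.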


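Summary

| # | Severity | File | Issue |
| --- | --- | --- | --- |
| 1 | 🔵 LOW | CodexConnection.ts:295–301 | The 5s setTimeout still fires after the process exits early, causing a redundant reject() |

Conclusion

Approved for merge: there are no blocking issues. The fix logic is correct and test coverage is sufficient; the only finding is one LOW-severity code-tidiness suggestion that does not affect functionality.


This report was generated by the local pr-review skill with full project context and no truncation limits.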

@piorpua
Contributor

piorpua commented Mar 30, 2026

✅ Automatically reviewed; no blocking issues. Auto-merge is being triggered.

@piorpua piorpua merged commit d36e5fe into main Mar 30, 2026
17 checks passed
@piorpua piorpua deleted the fix/sentry-ELECTRON-E4 branch March 30, 2026 10:53
@piorpua piorpua added bot:done Auto-merged by bot and removed bot:reviewing Review in progress (mutex) labels Mar 30, 2026
wuhao1477 added a commit to wuhao1477/AionUi that referenced this pull request Mar 30, 2026
* 'main' of github.com:wuhao1477/AionUi: (40 commits)
  fix(agents): prevent unhandled promise rejection in bootstrap initialization (iOfficeAI#1933)
  fix(gemini): restore context after stopping a reply (iOfficeAI#1932)
  fix(codex): reject start promise immediately on process exit during startup (iOfficeAI#1929)
  fix(conversation): sync renamed titles with detail view (iOfficeAI#1927)
  fix(paste): deduplicate filenames when pasting multiple images simultaneously
  fix(mobile): add SafeArea support and update app icon (iOfficeAI#1926)
  fix(database): guard against undefined params in databaseBridge providers (iOfficeAI#1924)
  fix(conversation): validate type field before creating conversation (iOfficeAI#1921)
  fix(docs): restore wechat_group_5.png reference to wx-5.png in readme
  fix(snapshot): add maxBuffer to git add/commit exec calls (iOfficeAI#1914)
  refactor(acp): consolidate AGENT_SKILLS_DIRS into ACP_BACKENDS_ALL (iOfficeAI#1913)
  fix(gemini): guard against EACCES in workspace realpath during init (ELECTRON-BM) (iOfficeAI#1912)
  fix(channels): send raw QR ticket instead of page URL in WeChat WebUI login SSE (iOfficeAI#1910)
  .md format
  chore(pr-automation): fix missed sleep 5 in comment to sleep 10
  chore(pr-automation): increase auto-merge retry delay to 10s
  chore(pr-automation): add 5s retry for transient GitHub mergeStateStatus UNKNOWN
  fix(docs): remove trailing whitespace in OfficeCLI readmes
  chore(pr-automation): verify auto-merge success before labeling bot:done
  fix(snapshot): guard against non-existent workspace in WorkspaceSnapshotService.init (iOfficeAI#1906)
  ...