Skip to content

feat: add Gemini CLI, Codex, and Kiro CLI agent plugins#170

Merged
subsy merged 16 commits intosubsy:mainfrom
medhatgalal:pr/multi-cli-support
Jan 22, 2026
Merged

feat: add Gemini CLI, Codex, and Kiro CLI agent plugins#170
subsy merged 16 commits intosubsy:mainfrom
medhatgalal:pr/multi-cli-support

Conversation

@medhatgalal
Copy link
Contributor

@medhatgalal medhatgalal commented Jan 20, 2026

Summary

Adds support for three additional AI coding agents:

  • Gemini CLI - Google's Gemini AI assistant
  • Codex - OpenAI's Codex CLI
  • Kiro CLI - AWS Kiro AI assistant

Changes

New Agent Plugins

  • src/plugins/agents/builtin/gemini.ts - Gemini CLI with JSONL output parsing, yoloMode support
  • src/plugins/agents/builtin/codex.ts - Codex CLI with fullAuto and sandbox modes
  • src/plugins/agents/builtin/kiro.ts - Kiro CLI with trustAllTools and agent selection

Bug Fix

  • Store resolved command path in detect() for all agents (claude, opencode, gemini, codex, kiro)
  • Ensures execute() uses the same path that was validated during detection

Documentation

  • Updated README with new agents in features list and skills paths table
  • Updated CONTRIBUTING project structure

Tests

  • Unit tests for all three new agents covering metadata, initialization, and setup questions

Testing

bun test tests/plugins/gemini-agent.test.ts tests/plugins/codex-agent.test.ts tests/plugins/kiro-agent.test.ts
# 48 pass, 0 fail

bun run typecheck  # passes
bun run build      # passes

Skills Paths

Each agent has skillsPaths configured for the multi-agent skill system:

Agent Skills Location
Gemini CLI ~/.gemini/skills/
Codex ~/.codex/skills/
Kiro CLI ~/.kiro/skills/

Summary by CodeRabbit

  • New Features

    • Added Gemini, Codex and Kiro CLI agent integrations with selection and skills support.
  • Improvements

    • Agents persist discovered CLI paths; execution lifecycle now invokes end-of-run hooks.
    • sessionId is propagated into runtime config and iteration log metadata.
    • Iteration log filenames gain a session-aware timestamped format (backwards compatible).
    • TUI parsing extended to surface Gemini/Codex events.
  • Documentation

    • README, contributing guide and site docs/navigation updated for the new agents.
  • Tests

    • Comprehensive tests added for Gemini, Codex, Kiro agents and session-aware logs.

✏️ Tip: You can customize this high-level summary in your review settings.

- Add GeminiAgentPlugin for Gemini CLI
- Add CodexAgentPlugin for OpenAI Codex CLI
- Add KiroAgentPlugin for AWS Kiro CLI
- Store resolved command path in detect() for all agents
- Add skillsPaths for multi-agent skill system support
- Include unit tests for new agents
@vercel
Copy link

vercel bot commented Jan 20, 2026

@medhatgalal is attempting to deploy a commit to the plgeek Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 20, 2026

Warning

Rate limit exceeded

@subsy has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minutes and 19 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

Adds three built-in agent plugins (Gemini CLI, Codex, Kiro); caches resolved executable paths in Claude and OpenCode during detect(); registers/exports new agents; updates TUI output parsing for Gemini/Codex; adds tests and docs; introduces session-aware iteration log filenames and propagates sessionId through run/resume.

Changes

Cohort / File(s) Summary
New Agent Plugins
src/plugins/agents/builtin/codex.ts, src/plugins/agents/builtin/gemini.ts, src/plugins/agents/builtin/kiro.ts
Add Codex, Gemini and Kiro agent plugins: initialization, detect (CLI discovery + version), buildArgs/getStdinInput, JSONL streaming parsing for Codex/Gemini, setup questions, validation, factories and default exports.
Agent Registry & Exports
src/plugins/agents/builtin/index.ts
Import and register new agents; expand public exports to include new factories and plugin classes.
Existing Agents: Path Caching
src/plugins/agents/builtin/claude.ts, src/plugins/agents/builtin/opencode.ts
detect() now stores resolved CLI path on the plugin instance (this.commandPath) for reuse during execute().
TUI Output Parsing
src/tui/output-parser.ts
Add Gemini and Codex JSONL detection/parsing and formatting; integrate into parseAgentOutput and StreamingOutputParser flows to surface tool calls, tool results and errors.
Commands: sessionId propagation
src/commands/run.tsx, src/commands/resume.tsx, src/config/types.ts
Add optional sessionId to config type and assign config.sessionId when creating or resuming sessions to enable session-aware log filenames.
Logging: session-aware filenames & sorting
src/logs/persistence.ts, tests/logs/persistence.test.ts
New filename format {sessionId}_{timestamp}_{taskId}.log when sessionId+startedAt present; generateLogFilename signature updated; list/has/cleanup handle legacy and new formats and sort by startedAt.
Execution lifecycle hook
src/plugins/agents/base.ts
Propagate optional options into RunningExecution and invoke execution.options?.onEnd?.(result) before resolving an execution to allow end-of-run cleanup.
Tests
tests/plugins/*.test.ts, tests/logs/persistence.test.ts
Add comprehensive tests for Codex/Gemini/Kiro plugins and expand persistence tests for session-aware filenames and chronological ordering.
Docs & Navigation
website/content/docs/plugins/agents/*.mdx, website/lib/navigation.ts, README.md, CONTRIBUTING.md
Add detailed docs for new agents, update README/CONTRIBUTING listings and add navigation entries (badged "New").
Engine logging payload
src/engine/index.ts
saveIterationLog now includes sessionId in metadata passed to persistence.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Manager as Agent Manager
    participant Plugin as Agent Plugin
    participant CLI as CLI Executable
    participant Parser as Output Parser
    participant Config as Configuration

    Manager->>Plugin: initialize(config)
    Plugin->>Config: read model/flags/timeout
    Config-->>Plugin: config values

    Manager->>Plugin: detect()
    Plugin->>CLI: findCommandPath(defaultCommand)
    CLI-->>Plugin: resolved path
    Plugin->>CLI: run --version (with timeout)
    CLI-->>Plugin: version string / error
    Plugin->>Plugin: store commandPath & version
    Plugin-->>Manager: AgentDetectResult

    Manager->>Plugin: buildArgs(prompt, files, options)
    Plugin-->>Manager: CLI argument array

    Manager->>Plugin: execute(prompt,...)
    Plugin->>CLI: run using stored commandPath + args (stdin)
    CLI-->>Plugin: stdout / stream (JSONL or text)
    Plugin->>Parser: forward raw JSONL (onJsonlMessage)
    Parser-->>Manager: parsed events / streamed display segments
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hopped through code with tiny paws,
Gemini, Codex, Kiro joined the cause,
Paths tucked in pockets, sessions stamped tight,
Logs now ordered by timestamp and light,
A rabbit hums: the tests passed through the night!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the primary change—adding three new agent plugins (Gemini CLI, Codex, and Kiro CLI)—which aligns with the substantial additions in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/plugins/agents/builtin/kiro.ts`:
- Around line 117-123: When handling the child process close in the
proc.on('close', ...) callback, don't treat exit code 0 as a success unless the
version regex actually matched; update the logic in that handler (the proc.on
close block that currently inspects stdout/stderr and calls resolve) to check
versionMatch before resolving success: true — if versionMatch is falsy, resolve
with success: false and include a clear error message (e.g., "Unexpected version
format") plus the raw stdout/stderr for diagnostics; keep the existing behavior
for non‑zero codes but ensure the same diagnostic info is included in the error
payload.
- Around line 92-100: The runVersion function currently spawns the resolved
executable path with shell: true; change this to shell: false to avoid
unnecessary shell invocation when 'command' is a resolved path, and only enable
shell: true when necessary for platform-specific behavior (e.g., wrap with a
runtime check for Windows if you need cmd.exe semantics). Update the spawn call
in runVersion (the const proc = spawn(command, ['-V'], { ... })) to set shell:
false and adjust any Windows handling around command resolution or quoting so
the direct spawn works cross-platform.
🧹 Nitpick comments (2)
README.md (1)

173-181: Consider including Kiro in the table for improved scannability.

Whilst the current note below the table is clear, users scanning the table might initially miss that Kiro is supported. Consider adding Kiro as a table row:

| Kiro CLI | Uses [steering files](https://kiro.dev/docs/cli/steering/) (`~/.kiro/steering/`) instead of skills |

This would improve discoverability whilst maintaining the explanation about the different mechanism.

src/plugins/agents/builtin/kiro.ts (1)

155-176: Terminate options before the prompt to avoid flag parsing.

If the user prompt starts with -, it could be interpreted as an option. If kiro-cli supports it, insert -- before the prompt.

🔧 Suggested change
-    // Prompt goes last
-    args.push(prompt);
+    // Prompt goes last
+    args.push('--', prompt);

- Add Gemini CLI, Codex, Kiro CLI to supported agents list
- Add skills paths for new agents
- Update project structure comment
- Use shell: false on Unix, shell: true only on Windows for spawn
- Add validation for version parsing before resolving success
- Apply fixes to gemini, codex, and kiro agents
@subsy
Copy link
Owner

subsy commented Jan 21, 2026

@medhatgalal thanks for this - reviewing shortly - would you mind please adding docs updates to add the new agent plugins to website/docs ?

@codecov
Copy link

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 55.93087% with 561 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.09%. Comparing base (5fb1ab5) to head (50adbf8).
⚠️ Report is 17 commits behind head on main.

Files with missing lines Patch % Lines
src/plugins/agents/builtin/codex.ts 48.87% 205 Missing ⚠️
src/plugins/agents/builtin/gemini.ts 46.52% 200 Missing ⚠️
src/plugins/agents/builtin/kiro.ts 59.32% 72 Missing ⚠️
src/tui/output-parser.ts 86.12% 29 Missing ⚠️
src/logs/persistence.ts 56.66% 26 Missing ⚠️
src/plugins/agents/base.ts 31.25% 22 Missing ⚠️
src/commands/run.tsx 0.00% 3 Missing ⚠️
src/plugins/agents/builtin/claude.ts 33.33% 2 Missing ⚠️
src/plugins/agents/builtin/opencode.ts 33.33% 2 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #170      +/-   ##
==========================================
+ Coverage   42.16%   44.09%   +1.92%     
==========================================
  Files          73       76       +3     
  Lines       20731    21991    +1260     
==========================================
+ Hits         8742     9696     +954     
- Misses      11989    12295     +306     
Files with missing lines Coverage Δ
src/config/types.ts 100.00% <ø> (ø)
src/engine/index.ts 50.09% <100.00%> (+0.03%) ⬆️
src/plugins/agents/builtin/index.ts 90.90% <100.00%> (+6.29%) ⬆️
src/plugins/agents/builtin/claude.ts 45.43% <33.33%> (-0.07%) ⬇️
src/plugins/agents/builtin/opencode.ts 45.38% <33.33%> (-0.09%) ⬇️
src/commands/run.tsx 8.46% <0.00%> (-0.02%) ⬇️
src/plugins/agents/base.ts 59.15% <31.25%> (+26.84%) ⬆️
src/logs/persistence.ts 50.59% <56.66%> (+12.17%) ⬆️
src/tui/output-parser.ts 91.69% <86.12%> (-2.63%) ⬇️
src/plugins/agents/builtin/kiro.ts 59.32% <59.32%> (ø)
... and 2 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

AI Agent and others added 2 commits January 21, 2026 17:34
Move prompt from command-line arguments to stdin for gemini, codex,
and kiro agents. This avoids shell interpretation issues with special
characters (&, |, >, etc.) on Windows where shell: true is required.

Follows the same pattern established in PR subsy#159 for OpenCode.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/plugins/agents/builtin/codex.ts`:
- Around line 182-214: Codex currently sets structuredOutputFormat: 'jsonl' and
enables --json in buildArgs but doesn't parse JSONL events; update the Codex
plugin by overriding execute() (similar to the Claude/OpenCode implementations)
to parse JSONL lines from STDOUT into structured events and forward parsed
events to onStdout/onEvent handlers when options?.subagentTracing is true, or if
you intend not to support JSONL remove the structuredOutputFormat declaration
and the --json flag logic in buildArgs; look for the Codex class, its buildArgs
method, and the base execute/onStdout behavior to implement the JSONL parsing
and event emission consistently with other agents.

In `@src/plugins/agents/builtin/gemini.ts`:
- Around line 168-195: The Gemini agent currently declares
structuredOutputFormat as 'jsonl' and pushes '--output-format', 'stream-json' in
buildArgs but does not parse JSONL; either implement JSONL parsing by adding
parseJsonlLine() and updating execute() in the Gemini agent to read stdout
line-by-line, parse each JSONL record (handling errors and end-of-stream) and
emit/subagent-trace accordingly (also mimic Claude/OpenCode by adding
'--verbose' when using 'stream-json' if required), or remove
structuredOutputFormat and stop pushing '--output-format', 'stream-json' in
buildArgs (and remove any related subagentTracing branching) so the agent
returns raw stdout consistently; refer to the buildArgs, execute,
structuredOutputFormat, and the new parseJsonlLine symbol when making the
changes.
🧹 Nitpick comments (1)
src/plugins/agents/builtin/codex.ts (1)

98-145: Clear the timeout once the version probe settles.
Avoids late kill()/double resolve after a fast exit.

♻️ Proposed refactor
-    return new Promise((resolve) => {
+    return new Promise((resolve) => {
+      let settled = false;
+      let timeout: ReturnType<typeof setTimeout>;
+      const finish = (result: { success: boolean; version?: string; error?: string }) => {
+        if (settled) return;
+        settled = true;
+        clearTimeout(timeout);
+        resolve(result);
+      };
       // Only use shell on Windows where direct spawn may not work
       const useShell = process.platform === 'win32';
       const proc = spawn(command, ['--version'], {
         stdio: ['ignore', 'pipe', 'pipe'],
         shell: useShell,
       });
@@
-      proc.on('error', (error) => {
-        resolve({ success: false, error: `Failed to execute: ${error.message}` });
-      });
+      proc.on('error', (error) => {
+        finish({ success: false, error: `Failed to execute: ${error.message}` });
+      });
@@
-          resolve({ success: true, version: versionMatch[1] });
+          finish({ success: true, version: versionMatch[1] });
         } else {
-          resolve({ success: false, error: stderr || `Exited with code ${code}` });
+          finish({ success: false, error: stderr || `Exited with code ${code}` });
         }
       });
 
-      setTimeout(() => {
+      timeout = setTimeout(() => {
         proc.kill();
-        resolve({ success: false, error: 'Timeout waiting for --version' });
+        finish({ success: false, error: 'Timeout waiting for --version' });
       }, 5000);
     });
   }

@subsy
Copy link
Owner

subsy commented Jan 21, 2026

@medhatgalal Im working on your branch to add tests, docs, a few little bug fixes, and adding JSONL parsing for output display to work properly - hopefully will be able to get this merged tonight

@vercel
Copy link

vercel bot commented Jan 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
ralph-tui Ignored Ignored Preview Jan 22, 2026 0:26am

Request Review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@src/logs/persistence.ts`:
- Around line 704-711: The filename filter's newPattern is too strict (only
lower-case hex) and will skip valid logs created by generateLogFilename();
update newPattern to accept any sessionId token (e.g., one or more
non-underscore chars) such as
/^[^_]+_\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}_.*\.log$/ and use this relaxed
pattern wherever hasIterationLogs() or similar logic checks the "new" format so
both files.filter(... legacyPattern || newPattern) and hasIterationLogs()
recognise the same new-format filenames. Ensure the updated pattern still
enforces the timestamp and taskId parts and keep legacyPattern unchanged.

In `@src/plugins/agents/builtin/codex.ts`:
- Around line 331-384: The onStdout wrapper in override execute drops JSONL
records split across chunks; fix by buffering incomplete data between calls: add
a persistent buffer (e.g., this._jsonlBuffer or a closure variable inside
execute) and on each incoming chunk prepend the buffer, split on '\n', keep the
last partial line in the buffer, and only parse/forward the complete lines to
onJsonlMessage and to parseCodexOutputToEvents; ensure when the stream ends you
flush any remaining buffered line. Update references: execute,
onStdout/onJsonlMessage, parseCodexOutputToEvents, processAgentEventsToSegments,
and processAgentEvents to use only the complete-line data, preserving the
leftover for the next chunk.

In `@src/plugins/agents/builtin/gemini.ts`:
- Around line 301-354: The onStdout wrapper in execute currently splits each
incoming chunk by '\n' and can drop partial JSONL records that span chunks;
change the wrapper to maintain a persistent buffer string (e.g., let jsonlBuffer
= '') closed over the onStdout callback, append each incoming chunk to it,
extract complete lines by finding newline boundaries, parse/forward only those
complete lines to options.onJsonlMessage (and skip or log malformed JSON), and
keep any trailing partial line in jsonlBuffer for the next chunk; after
extracting complete lines, pass the reconstructed data for display parsing via
parseGeminiOutputToEvents -> processAgentEventsToSegments/processAgentEvents as
before so that parseGeminiOutputToEvents, processAgentEventsToSegments, and
processAgentEvents operate only on full event text.

In `@website/content/docs/plugins/agents/codex.mdx`:
- Around line 133-173: Docs incorrectly state that `--json` is only passed when
tracing is enabled; update the text around the `codex exec` example and the
"Enabling Tracing" section to say that Codex is always invoked with `--json` (so
output is parseable) and that `subagentTracingDetail` (and the TUI keys t /
Shift+T) only control whether Ralph TUI parses/displays the subagent JSONL, not
whether `--json` is used; reference the `codex exec [options]` example and the
`--json` flag and replace any wording like "When tracing enabled" with a clear
statement that `--json` is always included and tracing only affects
display/verbosity.

In `@website/content/docs/plugins/agents/gemini.mdx`:
- Around line 130-169: The docs incorrectly state that `--output-format
stream-json` is optional, but the implementation always appends it; update the
argument construction in the Ralph TUI Gemini launch logic to add
`--output-format stream-json` only when tracing is enabled (e.g., when
`subagentTracingDetail` is not "off"/false or when the TUI trace toggle is
active), and update the documentation to reflect that the flag is only present
when tracing is enabled; locate the code that builds the gemini args (the block
that constructs `gemini [options]` / the array used to spawn the process) and
make the `--output-format stream-json` entry conditional based on the
`subagentTracingDetail`/trace toggle state.
🧹 Nitpick comments (1)
src/logs/persistence.ts (1)

48-63: Consider UTC for filename timestamps to avoid timezone drift.

Using local time means filenames depend on the machine’s timezone and can diverge from the ISO startedAt. UTC keeps naming deterministic across environments.

Proposed tweak (UTC formatting)
-  const year = date.getFullYear();
-  const month = String(date.getMonth() + 1).padStart(2, '0');
-  const day = String(date.getDate()).padStart(2, '0');
-  const hours = String(date.getHours()).padStart(2, '0');
-  const minutes = String(date.getMinutes()).padStart(2, '0');
-  const seconds = String(date.getSeconds()).padStart(2, '0');
+  const year = date.getUTCFullYear();
+  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
+  const day = String(date.getUTCDate()).padStart(2, '0');
+  const hours = String(date.getUTCHours()).padStart(2, '0');
+  const minutes = String(date.getUTCMinutes()).padStart(2, '0');
+  const seconds = String(date.getUTCSeconds()).padStart(2, '0');

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/plugins/agents/builtin/codex.ts`:
- Around line 201-247: The runVersion function starts a timer that can fire
after proc emits 'close' or 'error', causing a late proc.kill() and duplicate
resolve; fix it by storing the timeout ID (e.g., const timer = setTimeout(...)),
clearTimeout(timer) in both proc.on('error', ...) and proc.on('close', ...)
handlers, and add a simple resolved guard (let settled = false; set settled =
true before each resolve and skip if already true) so proc.kill() and resolve
are not called twice; reference runVersion, proc, timer/timeout and the
'error'/'close' handlers when applying the change.
- Around line 338-405: BaseAgentPlugin currently never calls the
AgentExecuteOptions.onEnd lifecycle hook, so Codex's jsonlBuffer can never be
flushed when the agent exits; implement onEnd invocation in
BaseAgentPlugin.completeExecution() (call options.onEnd?.() after the execution
promise resolves/rejects) and then in src/plugins/agents/builtin/codex.ts (the
override execute method using jsonlBuffer) wrap or rely on that onEnd so the
remaining jsonlBuffer content is parsed and forwarded (flush buffer -> parse
last line(s) to JSON via JSON.parse and emit via
options.onJsonlMessage/options.onStdoutSegments/options.onStdout). Ensure you
reference BaseAgentPlugin.completeExecution, AgentExecuteOptions.onEnd, and the
execute override/jsonlBuffer in codex.ts when making the changes.
♻️ Duplicate comments (2)
src/logs/persistence.ts (1)

704-712: LGTM! Past review concern addressed.

The relaxed pattern [^_]+ now accepts any non-underscore session ID token, addressing the previous feedback about the overly strict hex-only pattern. The pattern correctly enforces the timestamp format whilst remaining flexible on the session ID format. This is also consistent with the pattern used in hasIterationLogs.

website/content/docs/plugins/agents/gemini.mdx (1)

130-169: LGTM!

The documentation now correctly states that --output-format stream-json is always enabled (lines 132, 155, 165), which accurately reflects the implementation. This addresses the previous review feedback.

@medhatgalal
Copy link
Contributor Author

Thanks @subsy for jumping in on this! Happy to help with any remaining items after your changes land.

Looking at the CodeRabbit feedback, I can tackle:

  • The shell: false + platform check for kiro.ts/codex.ts spawn calls
  • Timeout cleanup in version check functions
  • Any remaining doc updates for the website

Let me know what's left after your push and I'll pick up from there.

AI Agent added 6 commits January 21, 2026 23:23
- Add full documentation for all three new agent plugins
- Update navigation to include new agents with "New" labels
- Expand test coverage to >50% for all three plugins:
  - codex.ts: 57.46% coverage (41 tests)
  - gemini.ts: 56.74% coverage (42 tests)
  - kiro.ts: 51.25% coverage (37 tests)
- Tests cover buildArgs, getStdinInput, validateSetup, and config options
Two issues fixed:

1. **Chronological sorting bug**: When iteration numbers reset between
   sessions, the code incorrectly showed old logs because it assumed
   alphabetical filename order = chronological order. Now sorts by
   `startedAt` timestamp from log metadata.

2. **New filename format**: Changed from `iteration-{NNN}-{taskId}.log`
   to `{sessionId}_{timestamp}_{taskId}.log` for better uniqueness and
   sorting. Example: `a1b2c3d4_2024-01-15_10-30-45_BEAD-001.log`

Also includes agent plugin improvements:
- Codex/Gemini: Always use JSON output format for proper parsing
- Codex/Gemini: Filter out echoed user messages in JSONL stream
- Codex/Gemini: Add extractErrorMessage() for proper error display
- Relax sessionId regex pattern to accept any non-underscore token
  (was incorrectly restricted to hex-only [a-f0-9]{8})
- Add JSONL buffering in codex.ts and gemini.ts to handle records
  split across stream chunks
- Update docs to clarify --json and --output-format stream-json are
  always used, not conditional on tracing config
Add JSONL parser for Gemini CLI and Codex CLI output formats to
enable proper display of completed iteration logs. Without this,
viewing Gemini/Codex iterations showed "Agent output could not
be parsed - showing raw JSON".

- Add GeminiEvent interface for stream-json output format
- Add parseGeminiJsonlLine() and formatGeminiEventForDisplay()
- Integrate into parseAgentOutput() for log viewing
- Integrate into StreamingOutputParser for real-time display
- Codex reuses same parser (similar JSONL format)
- Kiro doesn't need parser (text output only)
- Add getSandboxRequirements() with mise installation paths so sandbox
  can access npm global packages installed via mise
- Add '-' argument to tell Codex CLI to read prompt from stdin
  (per https://developers.openai.com/codex/cli/reference)
Codex CLI outputs different event types than originally assumed:
- item.started/item.completed with agent_message, command_execution, file_*
- thread.started, turn.started, turn.completed (skipped)

Now correctly extracts text from agent_message items and shows
command executions and file operations as tool_use events.
@subsy subsy force-pushed the pr/multi-cli-support branch from dd3b47d to 95fd490 Compare January 21, 2026 23:24
@subsy
Copy link
Owner

subsy commented Jan 21, 2026

@medhatgalal I think we're all set now just resolving a couple more pieces of feedback but should be good to go - thanks again!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@src/plugins/agents/builtin/gemini.ts`:
- Around line 308-375: The onStdout wrapper in override execute buffers partial
JSONL lines in jsonlBuffer but never processes leftover content if the process
exits without a trailing newline; update override execute so after calling
super.execute(...) you capture the returned AgentExecutionHandle and attach a
finalization/close/exit callback on that handle (the AgentExecutionHandle
returned by super.execute) to flush any remaining jsonlBuffer: trim and, if
non-empty and startsWith('{'), JSON.parse and call options.onJsonlMessage, then
feed the leftover into parseGeminiOutputToEvents and forward to
options.onStdoutSegments/options.onStdout as the existing logic does, and
finally clear jsonlBuffer. Ensure this uses the same parsing/forwarding code
paths as inside the onStdout wrapper so the last JSONL/event isn't dropped.
- Around line 184-231: In runVersion, prevent the timeout from firing after the
process ends by creating a timeout handle (e.g., const timeout =
setTimeout(...)) and clear it in both proc.on('error') and proc.on('close')
handlers; also guard against double-resolve by tracking a settled flag (e.g.,
let settled = false) and only call resolve (and clearTimeout) when !settled,
then set settled = true before returning. This change should be applied around
the proc, timeout, and the resolve calls inside runVersion to ensure proc.kill()
and resolve are not called twice.

In `@src/tui/output-parser.ts`:
- Around line 94-122: The GeminiEvent interface currently lists undocumented
event kinds "tool_call" and "function_result"; remove those two from the type
union in the GeminiEvent declaration so the union matches the documented event
types ('init' | 'message' | 'tool_use' | 'tool_result' | 'result' | 'error'),
and then search for any code referencing GeminiEvent types "tool_call" or
"function_result" and either remove/replace those usages or add a clear
comment/feature-flag if you intend to keep them for backward/forward
compatibility; ensure TypeScript types and any related tests are updated
accordingly.
- Around line 384-395: The parsing block currently treats Codex and Gemini as
interchangeable by calling parseGeminiJsonlLine when useCodexParser or
useGeminiParser is true; separate the flows so Codex uses a dedicated
parser/formatter (e.g., parseCodexJsonlLine and formatCodexEventForDisplay)
while Gemini continues to use parseGeminiJsonlLine and
formatGeminiEventForDisplay. Update the conditional to check useCodexParser and
call the Codex-specific functions (setting hasJsonl, pushing formatted parts to
parsedParts, and continuing) and only invoke parseGeminiJsonlLine when
useGeminiParser is true to avoid mis-parsing incompatible event shapes.

In `@website/content/docs/plugins/agents/kiro.mdx`:
- Around line 97-102: The docs show a mismatch between the timeout default in
the options table (`timeout = 0`) and the full config example (`timeout =
300000`); update the Kiro docs by either changing the table default to reflect
the example or add a short clarifying sentence below the table referencing the
`timeout` option (mention `timeout`, `0`, and `300000`) that explains `0` means
no timeout and examples like `300000` (ms) are recommended for
production/autonomous runs; ensure the clarification appears near the table and
the full config example so readers see both values together.
- Around line 39-52: The "Select an Agent" step shows an unchanged basic run
command and doesn't demonstrate how to pass agent selection via CLI; update the
Step titled "Select an Agent" so the bash snippet demonstrates agent selection
(e.g., include the CLI flag such as `--agent kiro` or the `--agent-options`
syntax) and/or add a short follow-up run example that uses the agent flag;
ensure the existing toml config block `[agentOptions] agent = "code-reviewer"`
remains and place the corrected CLI example (e.g., `ralph-tui run --prd
./prd.json --agent kiro`) in the Step so readers see both config and CLI ways to
select an agent.
- Around line 18-20: Replace the generic placeholder pointing to
https://kiro.dev with concrete Kiro CLI installation steps: update the fenced
bash snippet in the kiro.mdx docs to reference the downloads URL
(https://kiro.dev/downloads/) and include the common install commands such as
the curl install script (curl -fsSL https://cli.kiro.dev/install | bash) and
Homebrew alternative (brew install kiro-cli), and add a short note about
verifying the install with kiro --version; ensure these commands replace the
existing placeholder block so users can copy/paste directly.
♻️ Duplicate comments (2)
src/plugins/agents/builtin/codex.ts (2)

210-256: Clear the version-check timeout on completion.

The timeout can still fire after close or error events, causing a late kill() call and potential double-resolve. This was flagged in a previous review and appears unaddressed.

🛠️ Suggested fix
   private runVersion(
     command: string
   ): Promise<{ success: boolean; version?: string; error?: string }> {
     return new Promise((resolve) => {
+      let settled = false;
+      const finish = (result: { success: boolean; version?: string; error?: string }) => {
+        if (settled) return;
+        settled = true;
+        clearTimeout(timeoutId);
+        resolve(result);
+      };
       // Only use shell on Windows where direct spawn may not work
       const useShell = process.platform === 'win32';
       const proc = spawn(command, ['--version'], {
         stdio: ['ignore', 'pipe', 'pipe'],
         shell: useShell,
       });

       let stdout = '';
       let stderr = '';

       proc.stdout?.on('data', (data: Buffer) => {
         stdout += data.toString();
       });

       proc.stderr?.on('data', (data: Buffer) => {
         stderr += data.toString();
       });

       proc.on('error', (error) => {
-        resolve({ success: false, error: `Failed to execute: ${error.message}` });
+        finish({ success: false, error: `Failed to execute: ${error.message}` });
       });

       proc.on('close', (code) => {
         if (code === 0) {
           const versionMatch = stdout.match(/(\d+\.\d+\.\d+)/);
           if (!versionMatch?.[1]) {
-            resolve({
+            finish({
               success: false,
               error: `Unable to parse codex version output: ${stdout}`,
             });
             return;
           }
-          resolve({ success: true, version: versionMatch[1] });
+          finish({ success: true, version: versionMatch[1] });
         } else {
-          resolve({ success: false, error: stderr || `Exited with code ${code}` });
+          finish({ success: false, error: stderr || `Exited with code ${code}` });
         }
       });

-      setTimeout(() => {
+      const timeoutId = setTimeout(() => {
         proc.kill();
-        resolve({ success: false, error: 'Timeout waiting for --version' });
+        finish({ success: false, error: 'Timeout waiting for --version' });
       }, 5000);
     });
   }

347-414: Buffered JSONL content is not flushed when the process exits.

The JSONL buffering implementation correctly handles incomplete lines split across chunks. However, if Codex exits without a final newline, any remaining content in jsonlBuffer is never processed. This could silently drop the last JSONL event.

Consider wrapping the returned AgentExecutionHandle.promise to flush and parse any remaining buffer content after execution completes.

🛠️ Suggested approach
     const parsedOptions: AgentExecuteOptions = {
       ...options,
+      onEnd: () => {
+        // Flush any remaining buffered content
+        if (jsonlBuffer.trim()) {
+          const trimmed = jsonlBuffer.trim();
+          if (trimmed.startsWith('{')) {
+            try {
+              const parsed = JSON.parse(trimmed);
+              options?.onJsonlMessage?.(parsed);
+            } catch {
+              // Not valid JSON, skip
+            }
+          }
+          const events = parseCodexOutputToEvents(jsonlBuffer);
+          if (events.length > 0) {
+            if (options?.onStdoutSegments) {
+              const segments = processAgentEventsToSegments(events);
+              if (segments.length > 0) {
+                options.onStdoutSegments(segments);
+              }
+            }
+            if (options?.onStdout) {
+              const parsed = processAgentEvents(events);
+              if (parsed.length > 0) {
+                options.onStdout(parsed);
+              }
+            }
+          }
+        }
+        options?.onEnd?.();
+      },
       onStdout: (options?.onStdout || options?.onStdoutSegments || options?.onJsonlMessage)

Note: This requires BaseAgentPlugin to invoke onEnd in completeExecution(). If that's not yet implemented, an alternative is to wrap the returned handle's promise.

🧹 Nitpick comments (6)
website/content/docs/plugins/agents/kiro.mdx (1)

213-217: Consider adding Codex agent to Next Steps.

The PR adds Gemini, Codex, and Kiro agents together. For completeness, consider adding a link to the Codex agent documentation alongside Claude and Gemini.

📝 Suggested addition
 ## Next Steps

 - **[Claude Agent](/docs/plugins/agents/claude)** - Anthropic's Claude Code
 - **[Gemini Agent](/docs/plugins/agents/gemini)** - Google's Gemini CLI
+- **[Codex Agent](/docs/plugins/agents/codex)** - OpenAI's Codex CLI
 - **[Configuration](/docs/configuration/options)** - Full options reference
website/content/docs/plugins/agents/codex.mdx (1)

124-131: Consider adding a comma for readability.

Static analysis suggests a comma may improve readability in the sentence about autonomous operation.

✏️ Suggested fix
-When `fullAuto` is enabled (default), Codex will auto-approve all actions without prompting. This is required for Ralph TUI's autonomous operation since it cannot relay interactive prompts.
+When `fullAuto` is enabled (default), Codex will auto-approve all actions without prompting. This is required for Ralph TUI's autonomous operation, since it cannot relay interactive prompts.
website/content/docs/plugins/agents/gemini.mdx (1)

117-128: Consider adding a comma for readability.

Static analysis suggests a comma may improve readability in the sentence about autonomous operation.

✏️ Suggested fix
-When `yoloMode` is enabled (default), Gemini will skip approval prompts and auto-approve all actions. This is required for Ralph TUI's autonomous operation since it cannot relay interactive prompts.
+When `yoloMode` is enabled (default), Gemini will skip approval prompts and auto-approve all actions. This is required for Ralph TUI's autonomous operation, since it cannot relay interactive prompts.
tests/plugins/gemini-agent.test.ts (1)

135-137: Update type signature to reflect undefined handling.

The validateModel method type signature declares validateModel(model: string): string | null, yet the implementation checks if (model === '' || model === undefined). This mismatch necessitates the type assertion undefined as unknown as string in the test. Instead, update the interface in src/plugins/agents/types.ts to validateModel(model: string | undefined): string | null to match the actual implementation, eliminating the need for the assertion.

src/logs/persistence.ts (1)

48-63: Local timezone usage in timestamp formatting is intentional but should be documented.

The formatTimestampForFilename function uses local time methods (getHours(), getMinutes(), getSeconds()) rather than UTC equivalents. This means the same ISO timestamp will produce different filenames depending on the server's timezone.

This is likely intentional for human readability (local times are more meaningful to users), but it could cause confusion when debugging logs across different machines or timezones.

Consider adding a brief comment noting this is intentional:

📝 Suggested documentation
 /**
  * Format a timestamp for use in filenames.
  * Input: ISO 8601 timestamp (e.g., '2024-01-15T10:30:45.123Z')
  * Output: filesystem-safe timestamp (e.g., '2024-01-15_10-30-45')
+ * Note: Uses local timezone for human readability.
  */
src/tui/output-parser.ts (1)

148-207: Event formatting provides good noise suppression.

The formatGeminiEventForDisplay function appropriately:

  • Skips user messages (echoed input)
  • Skips init and result/stats events
  • Shows assistant content (main output)
  • Shows tool calls with truncated input details
  • Only shows tool results on error (avoiding verbose success output)

One minor issue: the break on Line 159 is unreachable after the return undefined on Line 154 and the return event.content on Line 157. The break would only execute if neither condition matched, but in that case the function falls through to Line 206 anyway.

🧹 Optional cleanup
     case 'message':
       // Skip user messages (they echo the input prompt)
       if (event.role === 'user') {
         return undefined;
       }
       // Assistant message - the main content we want to display
       if (event.role === 'assistant' && event.content) {
         return event.content;
       }
-      break;
+      return undefined;

1. runVersion race condition fix:
   - Add settled guard to prevent duplicate resolves
   - Store timeout ID and clear it in error/close handlers
   - Use safeResolve wrapper for consistent cleanup

2. Buffer flush on exit:
   - Add options to RunningExecution interface in base.ts
   - Call onEnd lifecycle hook in completeExecution before resolving
   - Add flushBuffer helper in codex.ts execute method
   - Wrap onEnd to flush remaining jsonlBuffer content

This ensures no JSONL data is lost when the agent exits with
partial content still in the buffer.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/plugins/agents/base.ts`:
- Around line 515-517: The call to the plugin lifecycle hook
execution.options?.onEnd?.(result) in completeExecution must be wrapped in a
try/catch so that exceptions from onEnd do not prevent resolution; modify the
completeExecution implementation to invoke execution.options.onEnd(result)
inside a try {} catch (err) {} block (log or swallow the error via processLogger
or a provided logger) and always proceed to resolve/return the result even if
onEnd throws.

gemini.ts:
- Fix runVersion race condition with settled guard and clearTimeout
- Add buffer flush on exit via onEnd lifecycle hook

output-parser.ts:
- Remove undocumented GeminiEvent types (tool_call, function_result)
- Add dedicated CodexEvent interface and parsing functions
- Separate Codex and Gemini parsing flows to avoid mis-parsing

kiro.mdx:
- Add concrete installation steps (curl, Homebrew)
- Add CLI --agent-options example for agent selection
- Clarify timeout: 0 means no timeout, recommend 300000 for production
AI Agent added 2 commits January 21, 2026 23:52
Exceptions from the onEnd lifecycle hook should not prevent the
execution promise from resolving. Wrap the call in try/catch and
log errors to debug output while always proceeding to resolve.
Add comprehensive tests for:
- StreamingOutputParser with Codex and Gemini formats
- parseAgentOutput function with agent-specific parsing
- Codex plugin parsing functions (extractErrorMessage, parseCodexJsonLine, parseCodexOutputToEvents)
- Gemini plugin parsing functions
- BaseAgentPlugin onEnd lifecycle hook (including error handling)
- Plugin configuration, buildArgs, and setup questions

Export internal parsing functions for testing:
- codex.ts: extractErrorMessage, parseCodexJsonLine, parseCodexOutputToEvents
- gemini.ts: extractErrorMessage, parseGeminiJsonLine, parseGeminiOutputToEvents

Coverage improvements:
- output-parser.ts: 96% line coverage
- base.ts: 63% line coverage
- codex.ts/gemini.ts: 47%/44% line, 70%/68% function coverage
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/plugins/agents/base.test.ts`:
- Around line 305-308: The doc comment above the test plugin is inaccurate;
update it to reflect the actual commands used by the test implementation (it
runs "cmd /c echo" on Windows and "echo" on POSIX shells), e.g., change the
sentence from "Uses 'echo' or 'true' commands" to explicitly state "Uses 'cmd /c
echo' on Windows and 'echo' on other platforms" so the comment matches the
behavior around the test plugin in this file.

In `@src/plugins/agents/base.ts`:
- Around line 515-527: The onEnd hook (execution.options?.onEnd) is only called
in completeExecution, so if sandbox resolution fails and the code takes the
early .catch path callers never get onEnd; update the sandbox resolution error
handling to invoke execution.options?.onEnd(resultOrErrorContext) inside that
catch before rethrowing or resolving so plugins always get a final call, and
ensure the call is wrapped in the same try/catch/logging pattern used in
completeExecution (use debugLog when process.env.RALPH_DEBUG) to avoid letting
hook errors change control flow.

- Update doc comment for RealExecuteTestPlugin to accurately describe
  platform-specific commands ('cmd /c echo' on Windows, 'echo' on POSIX)
- Call onEnd lifecycle hook in sandbox resolution catch block, ensuring
  plugins always receive the callback even if execution fails before
  process spawns
- Use same try/catch/debugLog pattern as completeExecution for consistency
@subsy
Copy link
Owner

subsy commented Jan 22, 2026

@medhatgalal feel free to take a look.
I'll plan to merge this tomorrow morning UK.
I havent been able to test the kiro plugin so would appreciate you verifying that it's all good.

@medhatgalal
Copy link
Contributor Author

@subsy Will do. You did all the lift! I'll check everything including docs! Thanks a bunch. I love the tool and I wanted to help. Keep up the great work. Will post a comment tonight so you find it when you wake up.

Cheers.

@medhatgalal
Copy link
Contributor Author

✅ Kiro Plugin Test Results

Tested the Kiro integration tonight - everything works great!

Unit Tests

  • 38/38 Kiro agent tests pass (bun test tests/plugins/kiro-agent.test.ts)
  • Build and typecheck pass

Live Execution Test

  • Kiro CLI v1.24.0 detected at /opt/homebrew/bin/kiro-cli
  • Ran ralph-tui run --agent kiro --prd ./prd.json --iterations 1 --headless
  • Kiro launched successfully, displayed banner, read task, created output file
  • Task completed correctly

Notes

  • 5 unrelated test failures exist (skills list command, process utility, BaseAgentPlugin lifecycle) - not Kiro-specific
  • Ready for merge from my side 👍

@subsy subsy merged commit 4a57852 into subsy:main Jan 22, 2026
9 checks passed
@medhatgalal
Copy link
Contributor Author

@subsy found some failure modes related to using and addressing ANSI failures that sometimes Kiro-cli emitted in response. Fixing in a new PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments