fix: detect Node.js / Python shebang scripts as their agent identity (#9870)#10022
fix: detect Node.js / Python shebang scripts as their agent identity (#9870)#10022lonexreb wants to merge 3 commits intowarpdotdev:masterfrom
Conversation
…arpdotdev#9870) On Linux, agents distributed as Node.js or Python shebang scripts surface as `node /usr/local/bin/codex` (the runtime, not the script) because `/proc/PID/comm` is initialised from the executable basename. The bare prefix match in `CLIAgent::detect` saw `node` and missed Codex entirely, so the third-party CLI agent toolbar never appeared. The reproduction in warpdotdev#9870 confirms this is the canonical case: the @openai/codex npm package installs `/usr/local/bin/codex` as a symlink to `codex.js` whose shebang is `#!/usr/bin/env node`. Detection failed for every Linux user of Codex, but worked on macOS where shebang process naming differs. Fix: - New helper `path_basename_token` peels a leading path component so `/usr/local/bin/codex` → `codex` for matching purposes. - New helper `shebang_script_basename` recognises a known script-runtime invocation (`node`, `nodejs`, `bun`, `deno`, `python`, `python3`) and pulls the script basename from `argv[1]`, stripping common script extensions (`.js`, `.mjs`, `.cjs`, `.ts`, `.py`). - `detect()` matches `command_prefix()` against the exact first word OR either of the two derived tokens. The change is purely additive — every existing detect path still matches exactly as before, including the Uber-team `aifx agent run claude` special case. The new candidates only fire when the exact-word match fails, so there's no false-positive surface for unrelated commands like `node ./my-app.js` or `python /home/user/script.py`. Tests in `cli_agent_tests.rs` cover: - `node /usr/local/bin/codex` → Codex (the canonical warpdotdev#9870 case) - `node` with flags before the script path - `node /path/to/codex.js` (extension stripped) - `nodejs`, `bun` runtimes - Path-prefixed binaries: `/usr/local/bin/codex`, `./goose` - Negative tests: `node`, `node ./my-app.js`, `python script.py`, `node --version` all return None. Followups deliberately out of scope: - A user-configurable runtime list (V0 hardcodes the 6 most common). - An `argv[0]` rewrite via `process.title` patches upstream — that's a Codex CLI repo concern, not a Warp concern.
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
Reviewed the CLI agent detection changes for path-prefixed binaries and Node/Python-style shebang script invocations, along with the added unit coverage.
Concerns
- Runtime flag parsing skips every leading flag without accounting for flags that consume the following token, which can make non-script invocations like
node -e codexorpython -c clauderesolve as known agents.
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| let _runtime = tokens.next()?; | ||
| // Skip leading flags like `-e '...'` or `--inspect`. Stop at the first | ||
| // non-flag token: that's the script path. | ||
| let script = tokens.find(|t| !t.starts_with('-'))?; |
There was a problem hiding this comment.
node -e codex, node --require codex, or python -c claude can false-positive as agents. Account for value-taking runtime flags before selecting the script token.
Resolves the merge conflict with master after warpdotdev#9667 (Mistral Vibe agent recognition) merged. Combines both surfaces: - The Vibe arm now also resolves through path_basename_token and shebang_script_basename, so `/usr/local/bin/vibe-acp` and `node /opt/vibe-acp` both detect as Vibe. Also addresses oz-for-oss review on PR warpdotdev#10022: > This treats the first non-flag token as a script even when the > preceding runtime flag consumes an argument, so invocations like > `node -e codex`, `node --require codex`, or `python -c claude` can > false-positive as agents. Adds VALUE_TAKING_RUNTIME_FLAGS, a list of flags that consume the following token as their value: - Node: -e, --eval, -p, --print, -r, --require, -C, --conditions - Python: -c, -m, -X, -W - Deno (subset): -A, --allow-read, --allow-write, --allow-net - Bun (subset): -d, --define shebang_script_basename now consumes the value alongside the flag, so `node -e codex` no longer false-positives as Codex. The `--key=value` form remains a single whitespace token and is handled by the existing `starts_with('-')` flag check. New regression tests in cli_agent_tests.rs: - `node -e codex` → None - `node --eval codex` → None - `node --require claude /home/user/app.js` → None - `python -c claude` → None - `python -m gemini` → None - `node --require some-mod /usr/local/bin/codex` → Codex (legitimate agent script after value-taking flag is consumed) - `node --inspect=127.0.0.1:9229 /usr/local/bin/codex` → Codex (--key=value form preserves correct script detection)
|
/oz-review |
|
I'm re-reviewing this pull request in response to a review request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR adds basename-based detection for path-prefixed CLI agents and Node/Python-style shebang runtime invocations.
Concerns
- The basename fallback applies to every agent prefix, including the generic Cursor prefix
agent, which creates false positives for unrelated scripts or binaries namedagent. - The runtime parser does not stay aligned with
extract_first_commandwhen shell parsing skips leading environment assignments, so env-prefixed shebang invocations can still fail detection.
Verdict
Found: 0 critical, 2 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| || candidate_basename.as_deref() == Some(prefix) | ||
| || runtime_invoked_basename.as_deref() == Some(prefix) |
There was a problem hiding this comment.
command_prefix(), including CLIAgent::CursorCli whose prefix is the generic string agent; /tmp/agent, node /tmp/agent.js, or python /tmp/agent.py will now be detected as Cursor. Keep exact matching for generic prefixes or allowlist only unambiguous agent names for basename/runtime recovery.
| } | ||
| // Use whitespace splitting (not shell parsing) — this is best-effort | ||
| // recovery, not security-critical input handling. | ||
| let mut tokens = command.split_whitespace(); |
There was a problem hiding this comment.
extract_first_command; with escape_char set, FOO=1 node /usr/local/bin/codex produces resolved_first_word == "node", but this iterator consumes FOO=1 as the runtime and never reaches the Codex script. Strip the same leading assignments or share the parsed tokens before scanning runtime arguments.
Addresses two findings from oz-for-oss re-review on PR warpdotdev#10022: 1. `CLIAgent::CursorCli`'s `command_prefix()` is the bare string "agent", which is generic enough to false-positive when the basename / runtime- script fallbacks fire on unrelated scripts and binaries. `/tmp/agent`, `node /tmp/agent.js`, `python /tmp/agent.py` would all incorrectly detect as Cursor. New `GENERIC_AGENT_PREFIXES` const lists prefixes that are too generic for fallback recovery; `prefix_eligible_for_basename_recovery` gates the candidate_basename and runtime_invoked_basename branches in `CLIAgent::detect`. Exact-word match still works for Cursor, just not the basename fallbacks. Distinctive prefixes (codex, claude, etc.) continue to recover via basename normally. 2. `shebang_script_basename` did its own whitespace tokenization, which diverged from `extract_first_command`'s shell-aware parsing when the user's command starts with env-var assignments. `FOO=1 node /usr/local/ bin/codex` would have `resolved_first_word == "node"` (correct) but `shebang_script_basename` would read `FOO=1` as the runtime token and never reach the script. Added `is_env_var_assignment` helper (matches POSIX env-var name pattern: `[A-Za-z_][A-Za-z0-9_]*=`). `shebang_script_basename` now skips leading env-var assignments before reading the runtime token, keeping it aligned with `extract_first_command`. New regression tests: - `test_detect_generic_prefix_no_basename_false_positive`: - `agent` → CursorCli (exact match still works) - `/tmp/agent`, `node /tmp/agent.js`, `python /tmp/agent.py` → None - `/tmp/codex` → Codex (distinctive prefixes still recover) - `test_detect_node_shebang_with_env_var_prefix`: - `FOO=1 node /usr/local/bin/codex` → Codex - `FOO=1 BAR=baz node /usr/local/bin/codex` → Codex - `PATH=/tmp:/usr/bin node /usr/local/bin/codex` → Codex
|
The two important issues from the previous bot review are now addressed in commit 409b61b:
The bot's re-review at 09:30 UTC on May 4 completed without requesting further human review, but the PR still shows |
You're able to re-request review from the agent by using the /oz-review command. Once you feel this PR is ready for review by a maintainer, please feel free to @/tag me! I'd be happy to take a look |
|
I'm re-reviewing this pull request in response to a review request. You can view the conversation on Warp. I reviewed this pull request and requested human review from: Comment Powered by Oz |
|
I'm re-reviewing this pull request in response to a review request. You can view the conversation on Warp. I reviewed this pull request and requested human review from: Comment Powered by Oz |
Oz no longer requests changes for this pull request after the latest automated review.
There was a problem hiding this comment.
Overview
This PR extends CLI agent detection to recover agent identities from path-prefixed binaries and Node/Python-style shebang runtime invocations, with regression coverage for Codex and false-positive cases.
Concerns
- No blocking correctness or security concerns found.
- Non-blocking follow-up: Deno is listed as a supported script runtime, but the current token parsing does not handle common Deno command shapes.
Verdict
Found: 0 critical, 0 important, 1 suggestions
Approve with nits
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| /// runtime, not the script name — so a command surfaces as e.g. | ||
| /// `node /usr/local/bin/codex` and we have to look at the second token to | ||
| /// recover the agent identity. See #9870 for the canonical Codex case. | ||
| const SCRIPT_RUNTIMES: &[&str] = &["node", "nodejs", "bun", "deno", "python", "python3"]; |
There was a problem hiding this comment.
💡 [SUGGESTION] deno is listed as a runtime, but typical invocations put run before the script and permission flags like --allow-read can be boolean, so the helper can return run or consume the script path. Handle Deno's command shape or remove it from this list until supported.
There was a problem hiding this comment.
Overview
This PR expands CLI agent detection to recover agent names from path-prefixed commands and common runtime-invoked shebang scripts, with coverage for the Codex Linux failure, basename recovery, value-taking runtime flags, generic prefix gating, and env-var prefixes.
Concerns
- Deno is listed as a supported runtime, but common Deno shebangs use
deno run ... <script>, so the current first-positional parser still will not recover those script basenames.
Verdict
Found: 0 critical, 0 important, 1 suggestions
Approve with nits
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| /// runtime, not the script name — so a command surfaces as e.g. | ||
| /// `node /usr/local/bin/codex` and we have to look at the second token to | ||
| /// recover the agent identity. See #9870 for the canonical Codex case. | ||
| const SCRIPT_RUNTIMES: &[&str] = &["node", "nodejs", "bun", "deno", "python", "python3"]; |
There was a problem hiding this comment.
💡 [SUGGESTION] deno does not follow the helper's argv[1] is the script assumption for common shebangs (deno run [flags] <script>), so this list advertises Deno support that will still return None. Either skip the run subcommand and Deno flags before selecting the script, or remove deno until that path is supported.
harryalbert
left a comment
There was a problem hiding this comment.
@lonexreb, after looking over this PR a bit more, I think I'm a little hesitant about the solution here. While it would potentially address the issue, it's depending on a lot of special node, python, etc. state that could change with new versions or older versions (and is not generalizable to other runtimes that folks might be using). I'd prefer a less specialized solution if possible.
I chatted with the agent a bit, and I think that a more robust solution shape might look something like:
Persist the shell preexec command
◦ Add preexec_command: Option to Block.
◦ In Block::preexec at app/src/terminal/model/block.rs:3247, store data.command before/while handling PreexecValue.
◦ Add a narrow getter, e.g. preexec_command(&self) -> Option<&str>.
Use preexec command for CLI-agent detection
◦ In detect_cli_agent_from_model at app/src/terminal/view/use_agent_footer/mod.rs:352, prefer:
▪ active_block.preexec_command()
▪ fallback to active_block.command_with_secrets_obfuscated(false) at app/src/terminal/view/use_agent_footer/mod.rs:373
◦ This makes detection use what the shell was about to execute, not whatever command text/process shape Warp later reconstructed.
Simplify CLIAgent::detect
◦ Keep:
exact first-token match: codex
alias expansion
path basename match: /usr/local/bin/codex
existing special cases like aifx agent run claude / vibe-acp
◦ Remove the broad runtime parser for Node/Python/Deno/Bun flags from app/src/terminal/cli_agent.rs.
Curious on your thoughts on this shape/if you agree!
Fixes #9870.
The bug
On Linux, agents distributed as Node.js or Python shebang scripts surface as
node /usr/local/bin/<agent>because/proc/PID/commis initialised from the executable basename, not fromargv[0]or the symlink name. The bare prefix match inCLIAgent::detectsawnodeand missed the agent entirely, so the third-party CLI agent toolbar never appeared.The canonical reproduction in #9870: the
@openai/codexnpm package installs/usr/local/bin/codexas a symlink tocodex.jswhose shebang is#!/usr/bin/env node. Detection failed for every Linux user of Codex.The fix
Two new private helpers in
app/src/terminal/cli_agent.rs, plus a small extension todetect()to consider their output:path_basename_tokenpeels a leading path component so/usr/local/bin/codex→codexfor matching purposes. Covers fully-qualified invocations.shebang_script_basenamerecognises a known script-runtime invocation (node,nodejs,bun,deno,python,python3) and pulls the script basename fromargv[1], stripping common script extensions (.js,.mjs,.cjs,.ts,.py). Skips leading flags (node --inspect /usr/local/bin/codex).detect()matchescommand_prefix()against the exact first word OR either of the two derived tokens.The change is purely additive — every existing detect path still matches exactly as before, including the Uber-team
aifx agent run claudespecial case. The new candidates only fire when the exact-word match fails, so there's no false-positive surface for unrelated commands.Test coverage
cli_agent_tests.rsgets three new test functions covering:node /usr/local/bin/codex→CLIAgent::Codex(the canonical Codex agent toolbelt does not appear on Linux because Node.js script process shows as 'node' not 'codex' #9870 case)node --inspect /usr/local/bin/codex(runtime flag before script path)node /path/to/codex.js(extension stripped)nodejs,bunalternative runtimes/usr/local/bin/codex,./goose,/opt/homebrew/bin/claudenode,node ./my-app.js,python /home/user/script.py,node --versionall returnNone.Why not patch upstream
process.title?The reporter notes that adding
process.title = 'codex'to the Codex CLI's entry script would update/proc/PID/commtocodexon Linux. That's a fix in the wrong tree — every shebang-script-distributed agent would need the same upstream patch, including agents Warp doesn't ship and can't influence. Recognising the pattern in Warp's detection layer covers all of them at once.Test plan
npm install -g @openai/codex, runcodexin a Warp terminal — third-party agent toolbar appears (was: never appeared).node ./my-random-app.js— no agent toolbar appears (negative case).cli_agent_tests.rstests continue to pass.Scope
app/src/terminal/cli_agent.rs(+77/-2),app/src/terminal/cli_agent_tests.rs(+103).detect()but in non-overlapping arms (Vibe adds avibe-acpexact-word check, this adds basename / runtime fallbacks). Either can rebase cleanly onto the other.