perf: Add brokered nsjail Claude runtime#2468
Conversation
|
✅ No security or compliance issues detected. Reviewed everything up to f415523. Security OverviewDetected Code ChangesThe diff is too large to display a summary of code changes. |
There was a problem hiding this comment.
4 issues found across 22 files
Confidence score: 2/5
- There is a high-confidence race in
tracecat/agent/runtime/claude_code/broker.pywhererun_turn()may create a new session afterstop()sets_closed, which can lead to inconsistent lifecycle state and hard-to-reproduce runtime failures. docker-compose.dev.ymlintroduces broad privileges (SYS_ADMIN,seccomp:unconfined,/dev/net/tun) foragent-executorunconditionally, which is a meaningful security-risk increase even in a dev profile.- Additional medium-severity reliability issues in
tracecat/agent/sandbox/shim_entrypoint.py(stdin broken pipe/reset handling) andtracecat/agent/runtime/claude_code/transport.py(path resolution mismatch skipping jail rewrite) raise concrete regression risk in process execution flows. - Pay close attention to
tracecat/agent/runtime/claude_code/broker.py,docker-compose.dev.yml,tracecat/agent/sandbox/shim_entrypoint.py,tracecat/agent/runtime/claude_code/transport.py- fix lifecycle race, privilege escalation defaults, stdin-forwarding robustness, and executable path normalization before merge.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tracecat/agent/sandbox/shim_entrypoint.py">
<violation number="1" location="tracecat/agent/sandbox/shim_entrypoint.py:124">
P2: Handle broken pipe/reset while forwarding stdin so early child stdin closure doesn't crash the shim.</violation>
</file>
<file name="tracecat/agent/runtime/claude_code/transport.py">
<violation number="1" location="tracecat/agent/runtime/claude_code/transport.py:274">
P2: Resolve `executable` for consistent comparison with the resolved `host_site_packages_root`. Without this, symlinked venvs or CLI paths will silently skip the jail rewrite, and the subprocess will fail to find the CLI binary inside the sandbox.</violation>
</file>
<file name="docker-compose.dev.yml">
<violation number="1" location="docker-compose.dev.yml:295">
P1: `agent-executor` now runs with `SYS_ADMIN` + `seccomp:unconfined` + `/dev/net/tun` unconditionally, which grants broad host-level privileges even when nsjail remains disabled by default.</violation>
</file>
<file name="tracecat/agent/runtime/claude_code/broker.py">
<violation number="1" location="tracecat/agent/runtime/claude_code/broker.py:52">
P1: `run_turn()` can start a new session after `stop()` sets `_closed` due to a check-then-lock race. Re-check `_closed` inside the locked section before registering `_active_turns`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tracecat/agent/sandbox/shim_entrypoint.py">
<violation number="1" location="tracecat/agent/sandbox/shim_entrypoint.py:299">
P2: Use `os.environ.get(VAR) or DEFAULT` instead of `os.environ.get(VAR, DEFAULT)` so an empty-string env var falls back to the default rather than creating `Path("")` (which resolves to cwd).
(Based on your team's feedback about using the `or` fallback pattern for env vars before type conversion.) [FEEDBACK_USED]</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
0d19d76 to
a720bcd
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
7b13a2a to
42987ff
Compare
LiteLLM is managed externally, so start_configured_llm_proxy and stop_configured_llm_proxy were pure ceremony with no actual work.
The transport factory was optional to support a legacy HTTP path that is no longer used in production. Making it required simplifies the runtime by removing conditional env var logic for ANTHROPIC_BASE_URL and HOME, both of which are now handled exclusively by the transport.
2cd330b to
272765a
Compare
|
You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment |
|
@cubic-dev-ai review |
|
@codex review |
@daryllimyt I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fc96e0a206
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Summary
Testing
Summary by cubic
Adds a worker‑global, brokered Claude runtime via a standalone
nsjailshim as the only execution path. Startup and shutdown are faster and more reliable with a unified, size‑capped init payload, non‑blocking shim stdin handling, stable per‑session paths, correct transport mode handling, and durable workflows routed through the executor with session control across approval continuation.New Features
SandboxedCLITransportusingorjsonfor faster I/O.nsjailmodes; fixed LLM socket path.Refactors
ClaudeAgentRuntimenow requires atransport_factory.Written for commit f415523. Summary will update on new commits.