Skip to content

fix(shell): replace global static output buffer with per-instance TempDir#7632

Merged
codefromthecrypt merged 2 commits intomainfrom
fix/shell-output-buffer-race
Mar 4, 2026
Merged

fix(shell): replace global static output buffer with per-instance TempDir#7632
codefromthecrypt merged 2 commits intomainfrom
fix/shell-output-buffer-race

Conversation

@codefromthecrypt
Copy link
Copy Markdown
Collaborator

@codefromthecrypt codefromthecrypt commented Mar 4, 2026

Summary

Shell output truncation uses a global static Mutex<HashMap> to cache temp file paths. This is racy in production too, not just tests: agent.rs dispatches tool calls concurrently via stream::select_all, so two shell commands in the same turn can clobber each other's saved output. The temp files also leak because NamedTempFile::keep() persists them forever.

This moves the temp directory onto ShellTool itself and adds an AtomicUsize call index that round-robins through a fixed number of file slots (fetch_add % OUTPUT_SLOTS, same pattern as elastic/elasticsearch-rs). Concurrent calls get distinct slots, sequential calls reuse them, and everything is cleaned up when the tool drops.

Type of Change

  • Bug fix

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

two new tests for slot cycling and distinctness.

@codefromthecrypt codefromthecrypt requested a review from baxen March 4, 2026 00:29
@codefromthecrypt
Copy link
Copy Markdown
Collaborator Author

@michaelneale @baxen this is primarily a test deflake thing for me, but I went ahead and made it safer with cleanup

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d4145e9f9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copy link
Copy Markdown
Collaborator

@michaelneale michaelneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems ok, if you want to tweak go ahead. yeah panic is ugly in rust (although if can't write to tmp... not in a good state is it!)

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d815e8f78

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Ok(Self {
info,
shell_tool: Arc::new(ShellTool::new()),
shell_tool: Arc::new(ShellTool::new()?),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle ShellTool init failures without process panic

ShellTool::new() is now fallible (tempfile::tempdir()?), but this constructor path is still used from the platform extension registry via DeveloperClient::new(ctx).unwrap() in crates/goose/src/agents/platform_extensions/mod.rs, so an unwritable/invalid temp directory now crashes extension initialization instead of returning a tool error. This turns a recoverable environment issue (e.g., broken TMPDIR, disk/inode exhaustion) into a startup panic for the developer extension.

Useful? React with 👍 / 👎.

…pDir

Shell output truncation used a global static Mutex<HashMap> to cache
temp file paths. This was racy under concurrent tool calls dispatched
via stream::select_all in agent.rs, and leaked files via
NamedTempFile::keep().

ShellTool now owns a TempDir and an AtomicUsize call index. Concurrent
calls round-robin through a fixed number of file slots (fetch_add %
OUTPUT_SLOTS, same pattern as elastic/elasticsearch-rs RoundRobin).
Files are bounded and cleaned up when the tool drops.

Signed-off-by: Adrian Cole <[email protected]>
@codefromthecrypt codefromthecrypt force-pushed the fix/shell-output-buffer-race branch from 7d815e8 to 131d631 Compare March 4, 2026 23:07
@codefromthecrypt codefromthecrypt added this pull request to the merge queue Mar 4, 2026
Merged via the queue into main with commit 42fc515 Mar 4, 2026
22 checks passed
@codefromthecrypt codefromthecrypt deleted the fix/shell-output-buffer-race branch March 4, 2026 23:17
craigwalkeruk pushed a commit to craigwalkeruk/custom-goose that referenced this pull request Mar 5, 2026
tlongwell-block added a commit that referenced this pull request Mar 5, 2026
* origin/main:
  fix(shell): replace global static output buffer with per-instance TempDir (#7632)
  opt: remove timestamped config file backup (#7618)
  chore(deps): bump @hono/node-server from 1.19.9 to 1.19.10 in /ui/desktop (#7662)
  chore(deps): bump hono from 4.12.3 to 4.12.5 in /evals/open-model-gym/mcp-harness (#7661)
  chore(deps): bump hono from 4.12.2 to 4.12.5 in /ui/desktop (#7660)
  fix: resolve parameters in initial message with autosubmit (#7659)
  fix: this should not be blocked (#7656)
  Relax the assertion for the model list ACP test (#7653)
  fix: add analyzer extension in recipe to maintain backwards compatibility  (#7652)
  docs: add GOOSE_INPUT_LIMIT environment variable documentation (#7299)
wpfleger96 added a commit that referenced this pull request Mar 6, 2026
* origin/main: (29 commits)
  Update to rmcp 1.1.0 (#7619)
  Fix max turns configuration (#7612)
  feat: add base path field to custom provider configuration (#7614)
  fix: compare extension configs before skipping add_extension (#7650)
  chore(release): release version 1.27.0 (minor) (#7611)
  feat: better private channel detection, bot version debugging (#7680)
  chore(deps): bump svgo from 3.3.2 to 3.3.3 in /documentation (#7667)
  fix: only add viewable channels to bot context (#7678)
  chore: added a recipe to help identify high risk change prs for testing (#7651)
  fix: make sure platform binary exists (#7676)
  fix(shell): replace global static output buffer with per-instance TempDir (#7632)
  opt: remove timestamped config file backup (#7618)
  chore(deps): bump @hono/node-server from 1.19.9 to 1.19.10 in /ui/desktop (#7662)
  chore(deps): bump hono from 4.12.3 to 4.12.5 in /evals/open-model-gym/mcp-harness (#7661)
  chore(deps): bump hono from 4.12.2 to 4.12.5 in /ui/desktop (#7660)
  fix: resolve parameters in initial message with autosubmit (#7659)
  fix: this should not be blocked (#7656)
  Relax the assertion for the model list ACP test (#7653)
  fix: add analyzer extension in recipe to maintain backwards compatibility  (#7652)
  docs: add GOOSE_INPUT_LIMIT environment variable documentation (#7299)
  ...
michaelneale added a commit that referenced this pull request Mar 6, 2026
* origin/main: (40 commits)
  fix: make TLS configurable in goosed agent via GOOSE_TLS env var (#7686)
  Update to rmcp 1.1.0 (#7619)
  Fix max turns configuration (#7612)
  feat: add base path field to custom provider configuration (#7614)
  fix: compare extension configs before skipping add_extension (#7650)
  chore(release): release version 1.27.0 (minor) (#7611)
  feat: better private channel detection, bot version debugging (#7680)
  chore(deps): bump svgo from 3.3.2 to 3.3.3 in /documentation (#7667)
  fix: only add viewable channels to bot context (#7678)
  chore: added a recipe to help identify high risk change prs for testing (#7651)
  fix: make sure platform binary exists (#7676)
  fix(shell): replace global static output buffer with per-instance TempDir (#7632)
  opt: remove timestamped config file backup (#7618)
  chore(deps): bump @hono/node-server from 1.19.9 to 1.19.10 in /ui/desktop (#7662)
  chore(deps): bump hono from 4.12.3 to 4.12.5 in /evals/open-model-gym/mcp-harness (#7661)
  chore(deps): bump hono from 4.12.2 to 4.12.5 in /ui/desktop (#7660)
  fix: resolve parameters in initial message with autosubmit (#7659)
  fix: this should not be blocked (#7656)
  Relax the assertion for the model list ACP test (#7653)
  fix: add analyzer extension in recipe to maintain backwards compatibility  (#7652)
  ...
Abhijay007 pushed a commit to Abhijay007/goose that referenced this pull request Mar 6, 2026
wpfleger96 added a commit that referenced this pull request Mar 6, 2026
* origin/main: (59 commits)
  fix: restore smart-approve mode (#7690)
  fix: make TLS configurable in goosed agent via GOOSE_TLS env var (#7686)
  Update to rmcp 1.1.0 (#7619)
  Fix max turns configuration (#7612)
  feat: add base path field to custom provider configuration (#7614)
  fix: compare extension configs before skipping add_extension (#7650)
  chore(release): release version 1.27.0 (minor) (#7611)
  feat: better private channel detection, bot version debugging (#7680)
  chore(deps): bump svgo from 3.3.2 to 3.3.3 in /documentation (#7667)
  fix: only add viewable channels to bot context (#7678)
  chore: added a recipe to help identify high risk change prs for testing (#7651)
  fix: make sure platform binary exists (#7676)
  fix(shell): replace global static output buffer with per-instance TempDir (#7632)
  opt: remove timestamped config file backup (#7618)
  chore(deps): bump @hono/node-server from 1.19.9 to 1.19.10 in /ui/desktop (#7662)
  chore(deps): bump hono from 4.12.3 to 4.12.5 in /evals/open-model-gym/mcp-harness (#7661)
  chore(deps): bump hono from 4.12.2 to 4.12.5 in /ui/desktop (#7660)
  fix: resolve parameters in initial message with autosubmit (#7659)
  fix: this should not be blocked (#7656)
  Relax the assertion for the model list ACP test (#7653)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants