Skip to content

fix(sandbox): sanitize security errors before posting to channels#51323

Open
ernestodeoliveira wants to merge 2 commits intoopenclaw:mainfrom
ernestodeoliveira:fix/sandbox-security-error-pii-leak-v2
Open

fix(sandbox): sanitize security errors before posting to channels#51323
ernestodeoliveira wants to merge 2 commits intoopenclaw:mainfrom
ernestodeoliveira:fix/sandbox-security-error-pii-leak-v2

Conversation

@ernestodeoliveira
Copy link
Copy Markdown
Contributor

@ernestodeoliveira ernestodeoliveira commented Mar 20, 2026

Summary

When an agent fails to start due to a sandbox security violation, the full raw error message was being posted to the configured channel (Discord, Telegram, etc.) as if it were the agent's reply. This leaks sensitive internal information to public-facing channels.

Example of what was leaking:

ÔÜá´©Å Agent failed before reply: Sandbox security: bind mount "/home/user/.openclaw/sandbox-configs/config.json:/app/config.json:ro" source "/home/user/.openclaw/sandbox-configs/config.json" is outside allowed roots (/home/user/.openclaw/workspace-agent).

This exposes full filesystem paths including OS usernames, sandbox configuration details, workspace directory structure, and config file names.

Details

Added isSandboxSecurityError() to the error classification module. When a sandbox security error is detected in runAgentTurnWithFallback, the channel receives a safe generic message instead of the raw error:

ÔÜá´©Å Agent failed to start. Check logs: openclaw logs --follow

The original error is still logged internally via defaultRuntime ÔÇö it is only redacted from the outbound channel payload.

The fix follows the same pattern already used for billing errors, rate limit errors, and other sensitive failure modes in agent-runner-execution.ts.

Related Issues

Fixes #51275

How to Validate

  1. Configure a sandbox bind mount with a source path outside the allowed workspace root
  2. Trigger the agent via a channel (Discord, Telegram, etc.)
  3. Confirm the channel receives the generic message, not the raw path
  4. Confirm openclaw logs --follow still shows the full error internally

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • Windows
      • npm run

AI-Assisted Disclosure

  • This PR was built with Claude Code (Anthropic)
  • Fully tested locally — tests pass, oxfmt formatting verified
  • Author reviewed and understands all changes
  • Codex review not available locally (not installed)

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: S labels Mar 20, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR fixes an information-disclosure bug where raw sandbox security error messages (containing full filesystem paths and OS usernames) were being forwarded to external channels (Discord, Telegram, etc.) as agent replies. The fix adds isSandboxSecurityError() to the error classification module and wires it into runAgentTurnWithFallback so that only a safe generic message is posted to the channel, while the full error is still recorded internally via defaultRuntime.

Key changes:

  • errors.ts: New isSandboxSecurityError(raw: string) that matches all error messages prefixed with "Sandbox security:" — consistent with every error produced in validate-sandbox-security.ts and zod-schema.agent-runtime.ts
  • agent-runner-execution.ts: isSandboxError check inserted after the internal log call (so sensitive details are still persisted) and before the channel payload is assembled
  • pi-embedded-helpers.ts: Barrel export added for the new helper
  • Test coverage for all known sandbox error variants — though the final test case ("does not leak filesystem paths in the safe fallback message") only asserts against a hardcoded string literal rather than exercising the integration path in agent-runner-execution.ts

Confidence Score: 4/5

  • This PR is safe to merge; the logic is correct, follows existing patterns, and the security fix is complete.
  • The implementation is straightforward and correctly mirrors the pattern used for billing/rate-limit errors. The isSandboxSecurityError prefix match covers all known error sources. The only minor gap is a test case that asserts a hardcoded literal rather than the actual integration path, which slightly reduces confidence in regression coverage but does not introduce any functional risk.
  • No files require special attention beyond the weak integration test case in src/agents/pi-embedded-helpers/sandbox-security-error.test.ts.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/sandbox-security-error.test.ts
Line: 62-80

Comment:
**Test asserts a hardcoded literal, not the actual integration**

The second half of this test (lines 73–79) only asserts that the hardcoded string `"⚠️ Agent failed to start. Check logs: openclaw logs --follow"` doesn't contain filesystem paths, which is trivially true and can never fail regardless of what the production code actually returns. It doesn't verify that `agent-runner-execution.ts` actually *uses* that safe message when `isSandboxSecurityError` returns `true`.

Consider replacing these assertions with an integration-level test (or at least a comment explaining they are intentional), so a future refactor that accidentally changes the fallback message in `agent-runner-execution.ts` would still be caught:

```ts
// Instead of asserting a literal string, derive the safe message from the
// same source the production code uses, or write an integration test that
// exercises runAgentTurnWithFallback with a sandbox error and checks the
// returned payload.text.
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "fix(sandbox): saniti..."

Comment on lines +62 to +80
});

it("does not leak filesystem paths in the safe fallback message", () => {
const rawError =
'Sandbox security: bind mount "/home/ernest/.openclaw/sandbox-configs/config.json:/app/config.json:ro" ' +
'source "/home/ernest/.openclaw/sandbox-configs/config.json" is outside allowed roots ' +
"(/home/ernest/.openclaw/workspace-agent).";

expect(isSandboxSecurityError(rawError)).toBe(true);

// The safe message the channel receives should NOT contain any of these:
const safeMessage = "⚠️ Agent failed to start. Check logs: openclaw logs --follow";
expect(safeMessage).not.toContain("/home/ernest");
expect(safeMessage).not.toContain(".openclaw");
expect(safeMessage).not.toContain("sandbox-configs");
expect(safeMessage).not.toContain("workspace-agent");
expect(safeMessage).not.toContain("bind mount");
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Test asserts a hardcoded literal, not the actual integration

The second half of this test (lines 73–79) only asserts that the hardcoded string "⚠️ Agent failed to start. Check logs: openclaw logs --follow" doesn't contain filesystem paths, which is trivially true and can never fail regardless of what the production code actually returns. It doesn't verify that agent-runner-execution.ts actually uses that safe message when isSandboxSecurityError returns true.

Consider replacing these assertions with an integration-level test (or at least a comment explaining they are intentional), so a future refactor that accidentally changes the fallback message in agent-runner-execution.ts would still be caught:

// Instead of asserting a literal string, derive the safe message from the
// same source the production code uses, or write an integration test that
// exercises runAgentTurnWithFallback with a sandbox error and checks the
// returned payload.text.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-helpers/sandbox-security-error.test.ts
Line: 62-80

Comment:
**Test asserts a hardcoded literal, not the actual integration**

The second half of this test (lines 73–79) only asserts that the hardcoded string `"⚠️ Agent failed to start. Check logs: openclaw logs --follow"` doesn't contain filesystem paths, which is trivially true and can never fail regardless of what the production code actually returns. It doesn't verify that `agent-runner-execution.ts` actually *uses* that safe message when `isSandboxSecurityError` returns `true`.

Consider replacing these assertions with an integration-level test (or at least a comment explaining they are intentional), so a future refactor that accidentally changes the fallback message in `agent-runner-execution.ts` would still be caught:

```ts
// Instead of asserting a literal string, derive the safe message from the
// same source the production code uses, or write an integration test that
// exercises runAgentTurnWithFallback with a sandbox error and checks the
// returned payload.text.
```

How can I resolve this? If you propose a fix, please make it concise.

@ernestodeoliveira ernestodeoliveira force-pushed the fix/sandbox-security-error-pii-leak-v2 branch from 39abd2c to dccc3ba Compare March 20, 2026 23:37
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 39abd2cb8b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

if (!raw) {
return false;
}
return raw.includes("Sandbox security:");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Sanitize SSH sandbox startup errors too

This matcher only catches messages containing Sandbox security:, but embedded runs also fail in resolveSandboxContext() on SSH-specific validations like Sandbox SSH workspaceRoot must be an absolute POSIX path: ... (src/agents/sandbox/config.ts:181-186) and Invalid sandbox SSH target: ... (src/agents/sandbox/ssh.ts:133-139). Those exceptions are thrown before any reply from runEmbeddedAttempt() (src/agents/pi-embedded-runner/run/attempt.ts:1411-1415), so they still fall through to the raw Agent failed before reply: ... channel message and leak the same hostnames/paths this patch is trying to redact.

Useful? React with 👍 / 👎.

Comment on lines +638 to +639
: isSandboxError
? "⚠️ Agent failed to start. Check logs: openclaw logs --follow"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep detailed sandbox failures on internal webchat runs

This branch redacts sandbox failures unconditionally, but runAgentTurnWithFallback() is also the reply path for Control UI / webchat sessions (src/gateway/server-methods/chat.ts:1301-1325). Unlike the CLI path, which emits a lifecycle phase: "error" event with the raw exception (src/auto-reply/reply/agent-runner-execution.ts:282-294), the embedded path has no parallel error emission, so browser-only operators now lose the only actionable explanation for a bad bind/network/AppArmor setting and just see Agent failed to start.

Useful? React with 👍 / 👎.

@ernestodeoliveira ernestodeoliveira force-pushed the fix/sandbox-security-error-pii-leak-v2 branch from bde9d66 to 6c0cee0 Compare March 21, 2026 21:30
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c0cee0bcb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +152 to +154
const code = (err as NodeJS.ErrnoException).code ?? "UNKNOWN";
log.warn(
`Hook "${safeLogValue(entry.hook.name)}" failed to load: ${safeLogValue(entry.hook.handlerPath)} (${safeLogValue(code)}). Skipping.`,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve actual hook import errors in loader logs

When a hook module has a parse error or throws during top-level evaluation, import() usually raises a plain SyntaxError/Error with no code. This branch now casts it to ErrnoException and logs only err.code, so those failures collapse to (UNKNOWN) here (and again in the legacy-handler catch below) while the loader silently skips the hook. That removes the only actionable clue for fixing a broken hook, because operators can no longer see the real syntax/runtime error message.

Useful? React with 👍 / 👎.

Comment on lines +153 to +154
log.warn(
`Hook "${safeLogValue(entry.hook.name)}" failed to load: ${safeLogValue(entry.hook.handlerPath)} (${safeLogValue(code)}). Skipping.`,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep hook load failures at error severity

Downgrading hook load failures from error to warn makes them disappear entirely in deployments that run with logging.level=error or logging.consoleLevel=error: isFileLogLevelEnabled() and shouldLogToConsole() both suppress warn-level output in that configuration. In those environments a missing or broken hook now becomes a silent no-op during startup, whereas the previous error log was still visible.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Sandbox security errors posted as channel messages — leaks internal paths and PII

1 participant