fix: redact raw error messages from channel-facing agent failure replies by henry-the-frog · Pull Request #51308 · openclaw/openclaw

henry-the-frog · 2026-03-20T22:23:20Z

Summary

Fixes #51275 — agent preflight errors (sandbox security violations, config issues, etc.) were being forwarded verbatim to channels like Discord, leaking internal filesystem paths, OS usernames, and sandbox configuration details.

Changes

src/auto-reply/reply/agent-runner-execution.ts:

Added isSandboxOrSecurity classifier for sandbox security error messages
Sandbox/security errors now show: "Agent failed to start due to a configuration error"
All other unclassified errors now show: "Agent failed before reply" (generic, no raw message)
Removed unused safeMessage/trimmedMessage variables (error details are no longer interpolated into user-facing text)
Full error details are still logged via defaultRuntime.error() and visible in openclaw logs --follow

src/auto-reply/reply.triggers...e2e.test.ts:

Updated expected error text to match new generic format

Security Impact

Before: channel messages could contain:

Full filesystem paths including OS username
Sandbox configuration details (bind mounts, allowed roots)
Internal workspace directory structure

After: all error details stay in operator logs only.

greptile-apps · 2026-03-20T22:25:25Z

Greptile Summary

This PR fixes a security issue where raw agent error messages — containing internal filesystem paths, OS usernames, and sandbox configuration details — were being forwarded verbatim to external channels (e.g., Discord). The fix replaces all raw error interpolation in user-facing fallback replies with static, generic strings, while preserving full error details in operator logs via defaultRuntime.error().

Key changes:

Added an isSandboxOrSecurity classifier (/sandbox security|outside allowed roots/i) to show a slightly more descriptive "configuration error" message for that class of failures.
All other unclassified errors now produce a fully generic "Agent failed before reply." message — no raw error text is ever sent to channels.
Removed the now-unnecessary safeMessage/trimmedMessage variables that were previously used to sanitize and interpolate error details.
The existing test case for "sandbox is not defined." validates the generic fallback correctly, but the new isSandboxOrSecurity-specific message ("Agent failed to start due to a configuration error...") has no test coverage — a case with e.g. "sandbox security violation" or "outside allowed roots" in the error string should be added.

Confidence Score: 4/5

Safe to merge — the security fix is correct and the fallback chain is strictly safer than before; the only gap is missing test coverage for the new isSandboxOrSecurity branch.
The logic change is minimal and directionally correct: raw error text is never forwarded to channels in any code path. The isSandboxOrSecurity classifier is additive (missing matches fall to the safe generic message), so the regex scope doesn't introduce regression risk. Score is 4 rather than 5 only because the new classifier branch lacks a dedicated test case.
The test file (src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts) is missing a test case that exercises the isSandboxOrSecurity branch in agent-runner-execution.ts.

Comments Outside Diff (1)

src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts, line 153-163 (link)

Missing test case for isSandboxOrSecurity branch

The test updates the expected output for "sandbox is not defined.", which correctly hits the generic fallback because "sandbox is not defined." doesn't match the new /sandbox security|outside allowed roots/i regex. However, the new isSandboxOrSecurity classifier branch — which produces the distinct "⚠️ Agent failed to start due to a configuration error. Check \openclaw logs --follow` for details."` message — has no test coverage at all.

Consider adding a test case that exercises the actual patterns the regex targets, e.g.:

Without this, a future refactor of the regex could silently break the specific "configuration error" UX without any test catching it.

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts
Line: 153-163

Comment:
**Missing test case for `isSandboxOrSecurity` branch**

The test updates the expected output for `"sandbox is not defined."`, which correctly hits the generic fallback because `"sandbox is not defined."` doesn't match the new `/sandbox security|outside allowed roots/i` regex. However, the new `isSandboxOrSecurity` classifier branch — which produces the distinct `"⚠️ Agent failed to start due to a configuration error. Check \`openclaw logs --follow\` for details."` message — has no test coverage at all.

Consider adding a test case that exercises the actual patterns the regex targets, e.g.:



Without this, a future refactor of the regex could silently break the specific "configuration error" UX without any test catching it.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts
Line: 153-163

Comment:
**Missing test case for `isSandboxOrSecurity` branch**

The test updates the expected output for `"sandbox is not defined."`, which correctly hits the generic fallback because `"sandbox is not defined."` doesn't match the new `/sandbox security|outside allowed roots/i` regex. However, the new `isSandboxOrSecurity` classifier branch — which produces the distinct `"⚠️ Agent failed to start due to a configuration error. Check \`openclaw logs --follow\` for details."` message — has no test coverage at all.

Consider adding a test case that exercises the actual patterns the regex targets, e.g.:

```suggestion
      const errorCases = [
        {
          error: "sandbox is not defined.",
          expected: "⚠️ Agent failed before reply. Check `openclaw logs --follow` for details.",
        },
        {
          error: "sandbox security violation: outside allowed roots /home/user/secret",
          expected:
            "⚠️ Agent failed to start due to a configuration error. Check `openclaw logs --follow` for details.",
        },
        {
          error: "Context window exceeded",
          expected:
            "⚠️ Context overflow — prompt too large for this model. Try a shorter message or a larger-context model.",
        },
      ] as const;
```

Without this, a future refactor of the regex could silently break the specific "configuration error" UX without any test catching it.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "fix: redact raw erro..."}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98c8282e4d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-20T22:29:15Z

src/auto-reply/reply/agent-runner-execution.ts

          : isRoleOrderingError
            ? "⚠️ Message ordering conflict - please try again. If this persists, use /new to start a fresh session."
-            : `⚠️ Agent failed before reply: ${trimmedMessage}.\nLogs: openclaw logs --follow`;
+            : isSandboxOrSecurity
+              ? "⚠️ Agent failed to start due to a configuration error. Check `openclaw logs --follow` for details."
+              : "⚠️ Agent failed before reply. Check `openclaw logs --follow` for details.";


Preserve actionable failures on internal webchat runs

runAgentTurnWithFallback() already distinguishes the internal control surface (webchat) via shouldSurfaceToControlUi, and runReplyAgent() returns this fallback text directly to that chat surface. After this change, the final branch always emits the same generic copy here, so a webchat session that throws before any lifecycle error event is emitted (for example the existing sandbox is not defined path in src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts) loses the only actionable error the operator sees. That makes the commit redact internal debugging output as well, even though the title says the change is for channel-facing replies.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-20T22:29:15Z

src/auto-reply/reply/agent-runner-execution.ts

          : isRoleOrderingError
            ? "⚠️ Message ordering conflict - please try again. If this persists, use /new to start a fresh session."
-            : `⚠️ Agent failed before reply: ${trimmedMessage}.\nLogs: openclaw logs --follow`;
+            : isSandboxOrSecurity
+              ? "⚠️ Agent failed to start due to a configuration error. Check `openclaw logs --follow` for details."
+              : "⚠️ Agent failed before reply. Check `openclaw logs --follow` for details.";


Keep sanitized retry guidance for exhausted transient HTTP failures

This branch still retries isTransientHttp errors once, but if the second attempt also fails it now falls straight into the generic Agent failed before reply text. That regresses the existing safe channel-facing copy from sanitizeUserFacingText() / formatRawAssistantErrorForUi() (for example the friendly 521/529 unavailable/overloaded messages covered in src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts), so common provider outages stop telling users that the failure is transient and retryable. The helper already strips raw HTML/API payloads, so this is a behavior loss, not just a secrecy hardening.

Useful? React with 👍 / 👎.

Preflight errors (sandbox security, config issues, etc.) were being forwarded verbatim to channels like Discord, leaking filesystem paths, OS usernames, and sandbox configuration details. Now: - Sandbox/security errors get a specific generic message - All other unclassified errors get a generic message - Full error details remain in openclaw logs for operator debugging Fixes openclaw#51275

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ba97c76779

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-23T14:41:33Z

src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts

+      const errorCases = [
+        {
+          error: "sandbox is not defined.",
+          expected: "⚠️ Agent failed before reply. Check `openclaw logs --follow` for details.",


Sync the harness mock with the redacted fallback copy

This expectation now asserts the new redacted message, but the file-level runReplyAgent mock still hard-codes the old ⚠️ Agent failed before reply: …\nLogs: … format in src/auto-reply/reply.triggers.trigger-handling.targets-active-session-native-stop.e2e.test.ts:42-47. As written, the first error case will fail even if the runtime change is correct, so this regression test no longer validates the behavior the commit is trying to lock in.

Useful? React with 👍 / 👎.

openclaw-barnacle bot added the size: XS label Mar 20, 2026

chatgpt-codex-connector bot reviewed Mar 20, 2026

View reviewed changes

scoootscooob mentioned this pull request Mar 21, 2026

fix(agent): clarify embedded transport errors #51419

Merged

20 tasks

henry-the-frog force-pushed the fix/redact-preflight-errors-from-channels branch from 98c8282 to 1236d8c Compare March 21, 2026 20:46

henry-the-frog force-pushed the fix/redact-preflight-errors-from-channels branch from 1236d8c to ba97c76 Compare March 23, 2026 14:33

chatgpt-codex-connector bot reviewed Mar 23, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: redact raw error messages from channel-facing agent failure replies#51308

fix: redact raw error messages from channel-facing agent failure replies#51308
henry-the-frog wants to merge 1 commit intoopenclaw:mainfrom
henry-the-frog:fix/redact-preflight-errors-from-channels

henry-the-frog commented Mar 20, 2026

Uh oh!

greptile-apps bot commented Mar 20, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

henry-the-frog commented Mar 20, 2026

Summary

Changes

Security Impact

Uh oh!

greptile-apps bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Comments Outside Diff (1)

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps bot commented Mar 20, 2026 •

edited

Loading