Skip to content

fix(agents): recover sandbox edit after post-write failure#45964

Open
5Funingyuan wants to merge 3 commits intoopenclaw:mainfrom
5Funingyuan:fix-sandbox-edit-post-write-recovery
Open

fix(agents): recover sandbox edit after post-write failure#45964
5Funingyuan wants to merge 3 commits intoopenclaw:mainfrom
5Funingyuan:fix-sandbox-edit-post-write-recovery

Conversation

@5Funingyuan
Copy link
Copy Markdown

Summary

  • Problem: sandboxed edit could report a false "failed" result even when the file write had already succeeded, because sandboxed edit did not use the same post-write recovery path that host edit already had.
  • Why it matters: mutating tool failures are surfaced to users, so a false failure is misleading and can make a successful edit look unsafe or incomplete.
  • What changed: generalized wrapHostEditToolWithPostWriteRecovery(...) to accept injected path resolution and read-back behavior, then wired sandbox edit to verify through the sandbox fs bridge and return success when newText is present and oldText is no longer present after an upstream post-write throw.
  • What did NOT change (scope boundary): this does not change generic tool-runner error classification, does not change host edit behavior beyond keeping its existing default recovery path, and does not add any new permissions, config, or network behavior.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

For sandboxed agent runs, a successful edit no longer emits a false failure when the upstream edit tool throws after the write has already landed (for example, during post-write diff/result formatting).

No config or default behavior changes outside this recovery path.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22.16.0, Vitest; sandbox path covered through a bridge-backed unit test
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): None

Steps

  1. Create a sandboxed edit tool using a sandbox fs bridge.
  2. Simulate an upstream post-write throw while bridge read-back shows newText present and oldText absent.
  3. Execute the edit tool and observe whether it returns success or surfaces a failure.

Expected

  • If the write already landed, the edit should return success instead of surfacing a false failure.
  • Pre-write failures should still be reported as failures.

Actual

  • Before this change, sandboxed edit could surface failure because it did not use post-write recovery.
  • After this change, sandboxed edit recovers in the same class of post-write failure that host edit already handled.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)
$ volta run --node 22.16.0 pnpm test -- src/agents/pi-tools.read.host-edit-recovery.test.ts

RUN  v4.1.0 /Users/a1-6/openclaw

Test Files  1 passed (1)
Tests       4 passed (4)

@openclaw-barnacle openclaw-barnacle bot added agents Agent runtime and tooling size: S labels Mar 14, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 14, 2026

Greptile Summary

This PR fixes a false-failure bug in sandboxed edit: when the upstream edit library throws after successfully writing the file (e.g. during diff/result formatting), sandboxed edits now recover in the same way the host edit tool already did. The fix generalizes wrapHostEditToolWithPostWriteRecovery to accept injected resolvePath and readFile hooks, then wires the sandbox tool to verify through the SandboxFsBridge before deciding whether to recover or rethrow.

  • pi-tools.host-edit.ts — adds PostWriteRecoveryOptions so callers can inject path resolution and read-back behavior without changing the host default path.
  • pi-tools.read.tscreateSandboxedEditTool now wraps the base tool with post-write recovery using a pass-through path resolver and the bridge's readFile.
  • Test file adds a sandbox success test with a bridge mock, but is missing parity tests for the two rethrow conditions (newText absent; oldText still present), which leaves pre-write failure detection for sandbox unverified.
  • The default host readFile lambda drops the signal parameter, so an in-flight recovery read won't be aborted when a cancellation signal fires in host mode.

Confidence Score: 4/5

  • Safe to merge — the change is well-scoped, host behavior is unchanged, and the core recovery logic is correct.
  • The logic is sound: the generalization is minimal, the sandbox wiring correctly delegates path resolution and reads to the bridge, and the recovery condition (hasNew && !stillHasOld) mirrors the already-reviewed host implementation. The score is 4 rather than 5 because the default host readFile silently drops the AbortSignal, and the test suite is missing the two sandbox rethrow cases that would confirm pre-write failure detection works end-to-end for sandbox mode.
  • src/agents/pi-tools.host-edit.ts (signal handling in default readFile) and src/agents/pi-tools.read.host-edit-recovery.test.ts (missing sandbox rethrow tests).
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 33-35

Comment:
**Default `readFile` silently drops `signal`**

The default `readFile` lambda captures only `filePath` and ignores the `signal` argument passed on line 68. If an `AbortSignal` is triggered while the recovery read is in flight (host mode, no custom `readFile` provided), the read will not be cancelled, potentially delaying or masking the abort.

```suggestion
  const readFile =
    options?.readFile ??
    ((filePath: string, signal?: AbortSignal) => fs.readFile(filePath, { encoding: "utf-8", signal }));
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-tools.read.host-edit-recovery.test.ts
Line: 91-123

Comment:
**Missing sandbox "should rethrow" test cases**

The host tool has three tests covering both recovery and the two rethrow conditions (newText absent, oldText still present). The new sandbox test only covers the success/recovery path. Without parity tests, a regression in the sandbox rethrow logic (e.g. a bad bridge mock returning the wrong content) would go undetected.

Consider adding at minimum:

```ts
it("rethrows for sandbox edit when bridge readFile shows newText is absent", async () => {
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from("unchanged content", "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText: "x", newText: "never-written" }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});

it("rethrows for sandbox edit when bridge readFile shows oldText still present (pre-write failure)", async () => {
  const oldText = "replace me";
  const newText = "new content";
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from(`before ${oldText} after ${newText}`, "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText, newText }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 361c450

Comment on lines +33 to +35
const readFile =
options?.readFile ??
((filePath: string) => fs.readFile(filePath, "utf-8"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default readFile silently drops signal

The default readFile lambda captures only filePath and ignores the signal argument passed on line 68. If an AbortSignal is triggered while the recovery read is in flight (host mode, no custom readFile provided), the read will not be cancelled, potentially delaying or masking the abort.

Suggested change
const readFile =
options?.readFile ??
((filePath: string) => fs.readFile(filePath, "utf-8"));
const readFile =
options?.readFile ??
((filePath: string, signal?: AbortSignal) => fs.readFile(filePath, { encoding: "utf-8", signal }));
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 33-35

Comment:
**Default `readFile` silently drops `signal`**

The default `readFile` lambda captures only `filePath` and ignores the `signal` argument passed on line 68. If an `AbortSignal` is triggered while the recovery read is in flight (host mode, no custom `readFile` provided), the read will not be cancelled, potentially delaying or masking the abort.

```suggestion
  const readFile =
    options?.readFile ??
    ((filePath: string, signal?: AbortSignal) => fs.readFile(filePath, { encoding: "utf-8", signal }));
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +91 to +123
it("returns success for sandbox edit when upstream throws after writing", async () => {
const filePath = "/workspace/MEMORY.md";
const oldText = "# Memory";
const newText = "Blog Writing";
const bridge: SandboxFsBridge = {
resolvePath: () => ({
hostPath: "/tmp/MEMORY.md",
relativePath: "MEMORY.md",
containerPath: filePath,
}),
readFile: vi.fn(async () => Buffer.from(`\n\n${newText}\n`, "utf-8")),
writeFile: vi.fn(async () => {}),
mkdirp: vi.fn(async () => {}),
remove: vi.fn(async () => {}),
rename: vi.fn(async () => {}),
stat: vi.fn(async () => ({ type: "file", size: newText.length, mtimeMs: Date.now() })),
};

const tool = createSandboxedEditTool({ root: "/workspace", bridge });
const result = await tool.execute("call-1", { path: filePath, oldText, newText }, undefined);

expect(result).toBeDefined();
const content = Array.isArray((result as { content?: unknown }).content)
? (result as { content: Array<{ type?: string; text?: string }> }).content
: [];
const textBlock = content.find((b) => b?.type === "text" && typeof b.text === "string");
expect(textBlock?.text).toContain("Successfully replaced text");
expect(bridge.readFile).toHaveBeenCalledWith({
filePath,
cwd: "/workspace",
signal: undefined,
});
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing sandbox "should rethrow" test cases

The host tool has three tests covering both recovery and the two rethrow conditions (newText absent, oldText still present). The new sandbox test only covers the success/recovery path. Without parity tests, a regression in the sandbox rethrow logic (e.g. a bad bridge mock returning the wrong content) would go undetected.

Consider adding at minimum:

it("rethrows for sandbox edit when bridge readFile shows newText is absent", async () => {
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from("unchanged content", "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText: "x", newText: "never-written" }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});

it("rethrows for sandbox edit when bridge readFile shows oldText still present (pre-write failure)", async () => {
  const oldText = "replace me";
  const newText = "new content";
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from(`before ${oldText} after ${newText}`, "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText, newText }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.read.host-edit-recovery.test.ts
Line: 91-123

Comment:
**Missing sandbox "should rethrow" test cases**

The host tool has three tests covering both recovery and the two rethrow conditions (newText absent, oldText still present). The new sandbox test only covers the success/recovery path. Without parity tests, a regression in the sandbox rethrow logic (e.g. a bad bridge mock returning the wrong content) would go undetected.

Consider adding at minimum:

```ts
it("rethrows for sandbox edit when bridge readFile shows newText is absent", async () => {
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from("unchanged content", "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText: "x", newText: "never-written" }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});

it("rethrows for sandbox edit when bridge readFile shows oldText still present (pre-write failure)", async () => {
  const oldText = "replace me";
  const newText = "new content";
  const bridge: SandboxFsBridge = {
    ...
    readFile: vi.fn(async () => Buffer.from(`before ${oldText} after ${newText}`, "utf-8")),
  };
  const tool = createSandboxedEditTool({ root: "/workspace", bridge });
  await expect(
    tool.execute("call-1", { path: "/workspace/MEMORY.md", oldText, newText }, undefined),
  ).rejects.toThrow("Simulated post-write failure");
});
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f528009bfd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

operations: createSandboxEditOperations(params),
}) as unknown as AnyAgentTool;
return wrapToolParamNormalization(base, CLAUDE_PARAM_GROUPS.edit);
const withRecovery = wrapHostEditToolWithPostWriteRecovery(base, params.root, {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid marking sandbox edits successful on pre-write failures

This new sandbox wiring unconditionally applies wrapHostEditToolWithPostWriteRecovery, whose heuristic returns success whenever the file currently contains newText and not oldText; that also matches pre-write failures like "oldText not found" when the file already had the target text before this call. In that case the edit did not actually apply any replacement, but callers now get a success result in sandbox mode, which can hide real edit failures and lead the agent to continue with incorrect assumptions.

Useful? React with 👍 / 👎.

@5Funingyuan
Copy link
Copy Markdown
Author

I took another pass through the latest failing checks after fixing the sandbox edit recovery test typing issue.

At this point, the remaining failures appear to be in unrelated extension and Windows test shards rather than in the sandbox edit recovery path changed in this PR. The current failures are showing up in areas like extensions/telegram, extensions/discord, extensions/whatsapp, extensions/slack, extensions/signal, and src/infra/heartbeat-runner.*, not in the src/agents/pi-tools.* changes here.

The targeted sandbox recovery test for this PR passes locally, and the PR-specific typing issue in src/agents/pi-tools.read.host-edit-recovery.test.ts has already been fixed in a follow-up commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] edit tool reports 'failed' but file is actually modified

1 participant