fix(infra): block ambient Homebrew env vars from brew resolution #74463

Merged
pgondhi987 merged 6 commits into openclaw:main from pgondhi987:fix/fix-533
May 1, 2026

@pgondhi987
Contributor

Summary

  • Problem: HOMEBREW_BREW_FILE and HOMEBREW_PREFIX were read from ambient process.env (including workspace .env) to resolve the Homebrew executable and bin directory. An attacker-controlled repository could point these vars at a malicious local binary and gain code execution during skill install or dependency bootstrap.
  • Why it matters: Any user who opens a repository and triggers a Homebrew-backed skill install (e.g. a go install or a brew-based dependency) is vulnerable; no explicit action is needed beyond loading the workspace.
  • What changed: resolveBrewExecutable() and resolveBrewPathDirs() in src/infra/brew.ts no longer consult HOMEBREW_BREW_FILE or HOMEBREW_PREFIX at all — resolution uses only hardcoded standard paths (/opt/homebrew, /usr/local, ~/.linuxbrew). Both keys are also added to BLOCKED_WORKSPACE_DOTENV_KEYS in src/infra/dotenv.ts as defense-in-depth. The HOMEBREW_PREFIX env-fallback in resolveBrewBinDir() (src/agents/skills-install.ts) is removed.
  • What did NOT change: Brew detection via hasBinary("brew") (PATH-based) is unaffected. Standard Homebrew install locations continue to work. No user-facing behavior change for standard setups.
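
The hardcoded-only resolution described under "What changed" can be sketched roughly as follows. This is a minimal illustration, not the code in src/infra/brew.ts: the names `resolveBrewExecutableSketch`, `BrewSketchOptions`, and the injectable `isExecutable` check are assumptions made for the example. Only the three standard prefixes and the fact that the Homebrew env vars are ignored come from the summary above.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical options shape for this sketch (not the real BrewResolutionOptions).
interface BrewSketchOptions {
  homeDir?: string;
  /** Accepted so existing call sites keep compiling, but never read. */
  env?: Record<string, string | undefined>;
  /** Injectable for tests; defaults to a real X_OK check on disk. */
  isExecutable?: (filePath: string) => boolean;
}

function isExecutableOnDisk(filePath: string): boolean {
  try {
    fs.accessSync(filePath, fs.constants.X_OK);
    return true;
  } catch {
    return false;
  }
}

// The only prefixes considered; the list cannot be extended from the environment.
function standardBrewPrefixes(homeDir: string): string[] {
  return ["/opt/homebrew", "/usr/local", path.posix.join(homeDir, ".linuxbrew")];
}

function resolveBrewExecutableSketch(opts: BrewSketchOptions = {}): string | undefined {
  const homeDir = opts.homeDir ?? os.homedir();
  const isExecutable = opts.isExecutable ?? isExecutableOnDisk;
  for (const prefix of standardBrewPrefixes(homeDir)) {
    const candidate = path.posix.join(prefix, "bin", "brew");
    if (isExecutable(candidate)) {
      return candidate;
    }
  }
  // HOMEBREW_BREW_FILE and HOMEBREW_PREFIX are never consulted.
  return undefined;
}
```

Passing `env: { HOMEBREW_BREW_FILE: "/tmp/evil/bin/brew" }` to this sketch does not change which candidates are probed, which is the property the new negative-assertion tests are meant to lock in.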

Change Type (select all)

  • Bug fix
  • Security hardening

Scope (select all touched areas)

  • Skills / tool execution

Linked Issue/PR

  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: resolveBrewExecutable() trusted env vars that can be injected by a workspace .env file, allowing arbitrary executable redirection.
  • Missing detection / guardrail: HOMEBREW_BREW_FILE / HOMEBREW_PREFIX were not on the workspace dotenv blocklist and were read unconditionally from process.env.
  • Contributing context (if known): Skill install and PATH bootstrap paths fall back to resolveBrewExecutable() when brew is not on PATH, making the injection reachable without user interaction beyond opening a workspace.

Regression Test Plan (if applicable)

  • Coverage level:
    • Unit test
    • Seam / integration test
  • Target test or file: src/infra/brew.test.ts, src/infra/dotenv.test.ts, src/infra/path-env.test.ts, src/agents/skills-install-fallback.test.ts
  • Scenario the test should lock in: env-injected HOMEBREW_BREW_FILE/HOMEBREW_PREFIX do not affect brew executable resolution or PATH construction; workspace dotenv blocklist rejects both keys; GOBIN is not set from env prefix after brew --prefix failure.
  • Why this is the smallest reliable guardrail: Each guard (dotenv block, function-level env ignore, resolveBrewBinDir removal) has a dedicated negative-assertion test.
  • Existing test that already covers this: None (gap that enabled the vulnerability).
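
The GOBIN scenario in the test plan can be illustrated with a small sketch. It assumes a bin-dir resolver that takes the result of `brew --prefix` from an injected function and falls back to standard prefixes only. The names `resolveBrewBinDirSketch` and `buildGoInstallEnvSketch`, and their parameters, are hypothetical and are not the signatures in src/agents/skills-install.ts.

```typescript
import * as path from "node:path";

// Stand-in for running `brew --prefix`; returns undefined when the command fails.
type BrewPrefixRunner = () => string | undefined;

function resolveBrewBinDirSketch(
  runBrewPrefix: BrewPrefixRunner,
  dirExists: (dir: string) => boolean,
  homeDir: string,
): string | undefined {
  const prefix = runBrewPrefix()?.trim();
  if (prefix) {
    return path.posix.join(prefix, "bin");
  }
  // No HOMEBREW_PREFIX fallback here: only fixed locations are tried.
  const standardPrefixes = ["/opt/homebrew", "/usr/local", path.posix.join(homeDir, ".linuxbrew")];
  for (const candidatePrefix of standardPrefixes) {
    const binDir = path.posix.join(candidatePrefix, "bin");
    if (dirExists(binDir)) {
      return binDir;
    }
  }
  return undefined;
}

// GOBIN is set only when a bin dir was actually resolved.
function buildGoInstallEnvSketch(
  baseEnv: Record<string, string | undefined>,
  binDir: string | undefined,
): Record<string, string | undefined> {
  return binDir ? { ...baseEnv, GOBIN: binDir } : { ...baseEnv };
}
```

With `brew --prefix` failing and no standard directory present, the resulting environment carries no GOBIN even if the base environment contains an attacker-chosen HOMEBREW_PREFIX.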

User-visible / Behavior Changes

Users whose non-standard Homebrew prefix is exposed only through an environment variable, with brew itself not on PATH, may no longer have that prefix auto-detected. Workaround: ensure brew is on PATH. Standard installs (/opt/homebrew, /usr/local, ~/.linuxbrew) are unaffected.

Diagram (if applicable)

Before:
workspace .env -> process.env.HOMEBREW_BREW_FILE -> resolveBrewExecutable() -> executes attacker binary

After:
workspace .env -> blocked by dotenv blocklist (no-op)
resolveBrewExecutable() -> hardcoded standard paths only
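
The "After" path for the workspace .env can be sketched as a key filter applied while loading. This is an illustration under stated assumptions: `BLOCKED_KEYS_SKETCH` stands in for BLOCKED_WORKSPACE_DOTENV_KEYS (which holds more than these two keys), the parser handles only plain KEY=VALUE lines, and the rule that existing values are not overridden is assumed rather than taken from src/infra/dotenv.ts.

```typescript
// Hypothetical stand-in for BLOCKED_WORKSPACE_DOTENV_KEYS; only the two keys
// added by this PR are listed.
const BLOCKED_KEYS_SKETCH: ReadonlySet<string> = new Set([
  "HOMEBREW_BREW_FILE",
  "HOMEBREW_PREFIX",
]);

// Parses plain KEY=VALUE lines, skipping blanks and # comments.
// Quoting, `export` prefixes, and multiline values are out of scope here.
function parseDotEnvSketch(contents: string): Array<[string, string]> {
  const entries: Array<[string, string]> = [];
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    entries.push([line.slice(0, eq).trim(), line.slice(eq + 1).trim()]);
  }
  return entries;
}

// Applies workspace entries to `env` and returns the keys that were rejected.
function applyWorkspaceDotEnvSketch(
  contents: string,
  env: Record<string, string | undefined>,
): string[] {
  const rejected: string[] = [];
  for (const [key, value] of parseDotEnvSketch(contents)) {
    if (BLOCKED_KEYS_SKETCH.has(key)) {
      rejected.push(key); // blocked keys never reach the environment
      continue;
    }
    if (env[key] === undefined) {
      env[key] = value; // assumption: values already in env are kept
    }
  }
  return rejected;
}
```

Because the resolver no longer reads these variables at all, the blocklist is a second layer: it stops the keys from reaching other consumers of the environment.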

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? Yes — Homebrew executable resolution no longer trusts env vars; attack surface reduced.
  • Data access scope changed? No
  • Explanation: The command execution surface is narrowed: resolveBrewExecutable() can no longer return an attacker-supplied path. Hardcoded standard paths are the only candidates.

Repro + Verification

Environment

  • OS: Linux / macOS
  • Runtime/container: Node 22+
  • Model/provider: N/A
  • Integration/channel: N/A
  • Relevant config: workspace .env with HOMEBREW_BREW_FILE or HOMEBREW_PREFIX

Steps

  1. Create a workspace .env containing HOMEBREW_BREW_FILE=./evil-brew/bin/brew or HOMEBREW_PREFIX=./evil-brew.
  2. Trigger a skill install that falls back to brew resolution (e.g. go-tool install with brew not on PATH).
  3. Observe that the attacker-controlled binary is not invoked; install falls through to hardcoded candidates.

Expected

  • resolveBrewExecutable() returns a path under /opt/homebrew, /usr/local, or ~/.linuxbrew only, or undefined.
  • HOMEBREW_BREW_FILE / HOMEBREW_PREFIX are not present in process.env after loadDotEnv().

Actual (after fix)

  • Both conditions satisfied.

Evidence

  • Failing test/log before + passing after — new unit/integration tests added to all three affected layers (dotenv block, brew resolution, skill-install GOBIN).

Human Verification (required)

⚠️ AI-assisted PR (generated by OpenAI Codex, reviewed by Claude). No human has run these changes locally.

  • Verified scenarios: Code review confirmed all three removal sites (brew.ts env reads, dotenv blocklist, skills-install.ts env fallback) and all new test assertions.
  • Edge cases checked: Signal extension call site (resolveBrewExecutable() with no opts), path-env.ts PATH bootstrap, plugin-sdk re-export — all confirmed safe with the new hardcoded-only behavior.
  • What you did not verify: Live smoke test on a machine with a non-standard Homebrew prefix; E2E skill install flow on macOS.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes (for standard Homebrew installs)
  • Config/env changes? Yes — HOMEBREW_BREW_FILE and HOMEBREW_PREFIX are now blocked from workspace .env and ignored in brew resolution.
  • Migration needed? No — users with non-standard prefixes should add brew to their PATH.

Risks and Mitigations

  • Risk: Users with a non-standard Homebrew prefix (env-only, brew not on PATH) lose auto-detection.
    • Mitigation: Standard install paths cover the vast majority of users; env-var redirection was the exact attack vector being closed.

@openclaw-barnacle openclaw-barnacle Bot added the labels agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) on Apr 29, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR closes the workspace .env injection path for Homebrew executable resolution by removing HOMEBREW_BREW_FILE / HOMEBREW_PREFIX reads from resolveBrewExecutable(), resolveBrewPathDirs(), and resolveBrewBinDir(), and blocking both keys in BLOCKED_WORKSPACE_DOTENV_KEYS.

  • src/plugins/provider-openai-codex-oauth-tls.ts line 67 still reads process.env.HOMEBREW_PREFIX to resolve the OpenSSL cert bundle path — an attacker-controlled value here could redirect TLS certificate trust, enabling MITM against the Codex OAuth flow. This use site was not included in the fix and is not covered by any of the new tests.

Confidence Score: 3/5

Not safe to merge as-is — the fix is incomplete, leaving a security-relevant HOMEBREW_PREFIX read in the TLS plugin.

A P1 security finding (incomplete hardening of HOMEBREW_PREFIX in provider-openai-codex-oauth-tls.ts) caps the score at 4, and the fact that it directly relates to the stated security goal of this PR pulls it below the ceiling to 3. The three primary fix sites (brew.ts, skills-install.ts, dotenv.ts) are well-executed with strong negative-assertion test coverage.

src/plugins/provider-openai-codex-oauth-tls.ts — contains a remaining process.env.HOMEBREW_PREFIX read not addressed by this PR

Security Review

  • TLS cert bundle hijack (partial fix gap): resolveHomebrewPrefixFromExecPath in src/plugins/provider-openai-codex-oauth-tls.ts still reads process.env.HOMEBREW_PREFIX to locate the OpenSSL cert bundle (${prefix}/etc/openssl@3/cert.pem). An attacker-controlled prefix could redirect TLS trust to a malicious CA, enabling MITM against the Codex OAuth flow. The new dotenv blocklist closes the workspace .env injection path, but the direct process.env read remains and is not hardened like brew.ts was.

Comments Outside Diff (1)

  1. src/plugins/provider-openai-codex-oauth-tls.ts, line 67-68

    P1 security Missed HOMEBREW_PREFIX read not covered by this fix

    resolveHomebrewPrefixFromExecPath still reads process.env.HOMEBREW_PREFIX as a fallback. An attacker-controlled value here doesn't redirect binary execution, but it does redirect TLS cert bundle resolution — resolveCertBundlePath() constructs ${prefix}/etc/openssl@3/cert.pem from this, so a workspace-injected prefix could point to a malicious CA bundle and enable MITM against the OpenAI Codex OAuth TLS flow. The dotenv blocklist added in this PR prevents workspace .env injection, but the read itself remains and would silently trust any HOMEBREW_PREFIX already present in the ambient environment.

This is a comment left during a code review.
Path: src/infra/brew.ts
Line: 16-17

Comment:
**Dead `env` field in `BrewResolutionOptions`**

The `env` field is accepted but never read inside either `resolveBrewExecutable` or `resolveBrewPathDirs` after this refactor. The comment "Kept for injected test environments" implies callers can still control resolution via `env`, but the implementations ignore it entirely. Consider removing the field to avoid misleading future readers, or making the comment explicit that `env` is a no-op retained only for call-site backward compatibility.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: address issue"

Comment thread src/infra/brew.ts Outdated
@clawsweeper
Contributor

clawsweeper Bot commented Apr 29, 2026

Codex review: needs changes before merge.

What this changes:

The PR removes Homebrew env-var fallback from brew executable, PATH, and skill-install bin-dir resolution, blocks HOMEBREW_BREW_FILE and HOMEBREW_PREFIX in workspace dotenv loading, and adds regression tests for those paths.

Required change before merge:

A narrow automated repair can add the missing changelog entry only; maintainer/security approval and validation remain required afterward.

Security review:

Security review cleared: The diff narrows Homebrew env trust and adds focused blocklist/test coverage without adding dependencies, permissions, downloads, lifecycle hooks, or new secret exposure.

Review findings:

  • [P3] Add the required changelog entry — src/infra/dotenv.ts:27-28
Review details

Best possible solution:

Land the hardening after adding the required changelog entry and getting maintainer/security approval: keep Homebrew env vars ignored for executable, PATH, and skill-install bin-dir resolution; keep the SDK option as deprecated/no-op compatibility; and track any OAuth TLS Homebrew-prefix hardening separately if maintainers want to remove that ambient fallback too.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a source-level reproduction path: HOMEBREW_BREW_FILE or HOMEBREW_PREFIX can influence the brew helper and skill-install fallback, while the PR head adds negative tests for those paths.

Is this the best way to solve the issue?

Yes, with one process gap. The code direction is the narrow maintainable fix for the named command-resolution paths, and the remaining required repair is a changelog entry rather than a code redesign.

Full review comments:

  • [P3] Add the required changelog entry — src/infra/dotenv.ts:27-28
    This hardening changes user-visible security and compatibility behavior, but the PR diff only updates code and tests. Add a single-line CHANGELOG.md entry under Unreleased / ### Fixes crediting @pgondhi987 before merge.
    Confidence: 0.91

Overall correctness: patch is correct
Overall confidence: 0.84

Acceptance criteria:

  • git diff --check
  • pnpm test src/infra/brew.test.ts src/infra/dotenv.test.ts src/infra/path-env.test.ts src/agents/skills-install-fallback.test.ts
  • pnpm check:changed (Testbox before merge)

What I checked:

  • Current main brew resolver trusts Homebrew env: resolveBrewPathDirs reads env.HOMEBREW_PREFIX, and resolveBrewExecutable reads both env.HOMEBREW_BREW_FILE and env.HOMEBREW_PREFIX before standard candidates. (src/infra/brew.ts:30, 394bc9c46586)
  • Current main skill-install fallback trusts Homebrew env: resolveBrewBinDir falls back to process.env.HOMEBREW_PREFIX when brew --prefix does not return a prefix. (src/agents/skills-install.ts:237, 394bc9c46586)
  • Current main dotenv blocklist gap: The workspace dotenv blocklist on main does not include HOMEBREW_BREW_FILE or HOMEBREW_PREFIX. (src/infra/dotenv.ts:16, 394bc9c46586)
  • Latest PR head preserves SDK compatibility: The PR head keeps env?: NodeJS.ProcessEnv as a deprecated no-op compatibility field while ignoring Homebrew env vars during resolution. (src/infra/brew.ts:14, f48129237df2)
  • Latest PR head removes skill-install env fallback: The PR head resolves the brew bin directory from brew --prefix or standard paths only, with no HOMEBREW_PREFIX fallback. (src/agents/skills-install.ts:220, f48129237df2)
  • Latest PR head blocks Homebrew env keys in workspace dotenv: The PR head adds both Homebrew keys to BLOCKED_WORKSPACE_DOTENV_KEYS. (src/infra/dotenv.ts:27, f48129237df2)

Likely related people:

  • steipete: Current blame for the Homebrew helper, dotenv blocklist, and skill-install fallback points to recent maintainer refactors, and feature history shows Peter introduced Linuxbrew skill bootstrap, adjusted Linuxbrew path preference, and later hardened PATH handling. (role: introduced behavior / likely maintainer; confidence: high; commits: 8e5c2efb8d85, 18c43fe46229, 7214cf39ec00; files: src/infra/brew.ts, src/infra/path-env.ts, src/agents/skills-install.ts)
  • vincentkoc: Recent merged work hardened dependency install surfaces adjacent to the skill-install fallback touched by this PR. (role: recent maintainer; confidence: medium; commits: abd5ec98ab01; files: src/agents/skills-install.ts, src/agents/skills-install.test.ts)
  • eleqtrizit: Recent dotenv hardening commits expanded the workspace runtime env blocklist and reserved OpenClaw runtime env controls, which is the blocklist surface changed here. (role: adjacent owner; confidence: medium; commits: dbfcef319618, 018494fa3ebb; files: src/infra/dotenv.ts, src/infra/dotenv.test.ts)
  • Devin Robison: Feature history shows the untrusted CWD .env filtering behavior entered through a prior workspace-dotenv hardening commit, which is part of the same trust boundary. (role: introduced related workspace dotenv filtering; confidence: medium; commits: 6a793248024d; files: src/infra/dotenv.ts, src/infra/dotenv.test.ts)

Remaining risk / open question:

  • Users with a non-standard Homebrew prefix available only through HOMEBREW_PREFIX lose that fallback and must rely on brew in PATH or a standard install location; maintainers should accept that compatibility tradeoff.
  • The existing OpenAI Codex OAuth TLS helper still reads ambient HOMEBREW_PREFIX for cert-bundle fallback. The new workspace dotenv blocklist mitigates this PR's workspace .env injection path, but that adjacent TLS behavior may deserve a separate security decision.
  • This was a read-only review, so the targeted tests and changed gate were not executed here.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 394bc9c46586.

@clawsweeper
Contributor

clawsweeper Bot commented Apr 29, 2026

Codex review: found issues before merge.

What this changes:

The PR removes ambient Homebrew env-var fallback from brew executable/PATH and skill-install bin-dir resolution, blocks HOMEBREW_BREW_FILE and HOMEBREW_PREFIX in workspace dotenv loading, and adds regression tests for those paths.

Maintainer follow-up before merge:

No automated repair lane: this PR has the protected maintainer label and changes security-sensitive env and command-resolution behavior. The next action is maintainer/security review and a focused PR revision or approval.

Review findings:

  • [P2] Preserve the brew helper options type — src/infra/brew.ts:14-16
Review details

Best possible solution:

Revise and land this hardening after maintainer/security review: keep Homebrew env vars ignored by default for executable/PATH/skill-install resolution, block them from workspace dotenv, preserve the public SDK options shape as deprecated/no-op compatibility, and decide separately whether the OAuth TLS Homebrew prefix fallback should be hardened too.

Full review comments:

  • [P2] Preserve the brew helper options type — src/infra/brew.ts:14-16
    resolveBrewExecutable is exported through the public openclaw/plugin-sdk/setup-tools subpath. Removing env?: NodeJS.ProcessEnv from the options type can break external TypeScript plugin code that still calls resolveBrewExecutable({ env }) with an object literal, even if the value must now be ignored. Keep the property as a deprecated/no-op compatibility field while preserving the new resolution behavior.
    Confidence: 0.86
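
The compatibility concern in this finding can be shown with a short sketch. The type and function names below (`BrewOptionsCompat`, `listBrewCandidatesCompat`) are hypothetical stand-ins for the exported helper and its options type; the point is only that an optional, ignored `env` field keeps older call sites compiling.

```typescript
// Hypothetical options type that keeps `env` as a deprecated no-op.
interface BrewOptionsCompat {
  homeDir?: string;
  /** @deprecated No-op: ignored during resolution, kept for source compatibility. */
  env?: Record<string, string | undefined>;
}

// Stand-in for the exported helper; only `homeDir` influences the result.
function listBrewCandidatesCompat(opts: BrewOptionsCompat = {}): string[] {
  const home = opts.homeDir ?? "/home/user";
  return [
    "/opt/homebrew/bin/brew",
    "/usr/local/bin/brew",
    `${home}/.linuxbrew/bin/brew`,
  ];
}

// A plugin call site written against the old options type still compiles...
const fromPlugin = listBrewCandidatesCompat({ env: { HOMEBREW_PREFIX: "/tmp/evil" } });
// ...and yields the same candidates as a call without `env`.
const fromDefault = listBrewCandidatesCompat();
```

If `env` were dropped from the interface, the object literal in the `fromPlugin` call would fail TypeScript's excess property check, which is the break for external plugin code that the finding describes.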

Overall correctness: patch is incorrect
Overall confidence: 0.78

What I checked:

  • Protected label: The provided GitHub context lists the protected maintainer label, so this PR needs explicit maintainer handling rather than automated close or repair.
  • Current main still trusts Homebrew env in brew helper: resolveBrewPathDirs reads env.HOMEBREW_PREFIX, and resolveBrewExecutable reads env.HOMEBREW_BREW_FILE and env.HOMEBREW_PREFIX, matching the PR's target behavior. (src/infra/brew.ts:30, 8cf724a381a3)
  • Current main still uses Homebrew prefix in skill install fallback: resolveBrewBinDir falls back to process.env.HOMEBREW_PREFIX after brew --prefix fails, and go installs can use that result as GOBIN. (src/agents/skills-install.ts:237, 8cf724a381a3)
  • Workspace dotenv blocklist gap on main: The current workspace dotenv blocklist does not include HOMEBREW_BREW_FILE or HOMEBREW_PREFIX; the PR adds both keys. (src/infra/dotenv.ts:13, 8cf724a381a3)
  • Skill install security context: The skills docs say gateway-backed skill dependency installs run installer metadata after scanning, so hardening executable resolution on that path is security-relevant. Public docs: docs/tools/skills.md. (docs/tools/skills.md:149, 8cf724a381a3)
  • Public SDK export surface: resolveBrewExecutable is re-exported from src/plugin-sdk/setup-tools.ts, and package.json exposes ./plugin-sdk/setup-tools with generated types. Removing the optional env option is therefore a public TypeScript surface change. (package.json:144, 8cf724a381a3)

Likely related people:

  • steipete: Blame for the current Homebrew helper, dotenv blocklist, PATH bootstrap, skill-install fallback, and adjacent OpenAI Codex OAuth TLS prefix read points to the same recent current-main commit; this is routing context, not blame. (role: introduced behavior / likely maintainer; confidence: medium; commits: 8055e74485e2; files: src/infra/brew.ts, src/infra/dotenv.ts, src/infra/path-env.ts)

Remaining risk / open question:

  • The PR intentionally drops env-only nonstandard Homebrew prefix discovery; maintainers should confirm that compatibility tradeoff for macOS/Linux users and external plugin setup flows.
  • The remaining HOMEBREW_PREFIX read in src/plugins/provider-openai-codex-oauth-tls.ts is adjacent security-sensitive current-main behavior. The new dotenv blocklist mitigates workspace .env injection, but ambient env trust may need separate maintainer/security review.
  • This was a read-only review, so targeted tests and CI were not run here.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8cf724a381a3.

@pgondhi987
Contributor Author

@greptile review

@pgondhi987 pgondhi987 merged commit f86953f into openclaw:main May 1, 2026
5 checks passed
ihnotic added a commit to ihnotic/openclaw that referenced this pull request May 1, 2026
* refactor: trim provider discovery internal exports

* refactor: trim voice runtime internal exports

* fix(whatsapp): stage qrcode runtime dependency

* refactor: trim stream helper internal exports

* refactor: trim provider constant exports

* fix: make Discord voice reconnect timing resilient

* refactor: trim onboarding internal helpers

* docs(sandboxing): clarify sandbox setup scripts require source checkout (openclaw#75594)

Add inline docker build commands for npm-installed users who don't have the
source checkout scripts. Update all docs referencing sandbox-setup.sh,
sandbox-common-setup.sh and sandbox-browser-setup.sh to note they are
source-checkout-only and link to the new inline instructions.

Fixes openclaw#75485.

* fix(plugins): skip update when bundled plugin version is newer than installed clawhub/marketplace version (openclaw#75604)

* fix(plugin-sdk): restore deprecated reply pipeline compat exports

* refactor: trim test support helper exports

* test: stabilize Parallels update smokes

* refactor: trim sessions spawn harness type exports

* fix(gateway): refresh stale channel health cache

* fix(agent): apply configured fast mode to embedded runs

* fix(agent): default missing model cost metadata

* fix(agent): honor explicit OpenAI SSE transport

* fix(extensions): guard model and Twilio fetches

* fix(cli): repair agent runtime deps during startup

* fix(whatsapp): mirror qrcode from root runtime deps

* refactor: trim sanitize history harness exports

* refactor: trim subagent test helper exports

* [codex] Defer status reaction cleanup (openclaw#75582)

Summary:
- The PR updates the shared status reaction controller to track active remove-capable reactions, defer cleanup until clear/restoreInitial, adjust controller and Slack lifecycle tests, add a changelog entry, and carries qrcode runtime-dependency mirror hunks from its older base.

ClawSweeper fixups:
- Included follow-up commit: fix: limit status reaction restore cleanup
- Included follow-up commit: chore: merge main into status reaction cleanup
- Included follow-up commit: fix: mirror qrcode runtime dependency

Validation:
- ClawSweeper review passed for head f3efcb4.
- Required merge gates passed before the squash merge.

Prepared head SHA: f3efcb4
Review: openclaw#75582 (comment)

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>

* refactor: trim runtime test helper type exports

* refactor: delete unused test helpers

* refactor: delete unused test support modules

* fix(agents): trim trailing assistant turns and rewrite blank user messages in session repair (openclaw#75606)

* fix(agents): trim trailing assistant turns and rewrite blank user messages in session repair

Session-file repair now:
- Trims trailing assistant messages so the JSONL never ends on
  role=assistant, preventing the Anthropic 400 prefill-loop that
  fires when thinking is enabled. (openclaw#75271)
- Rewrites blank-only user messages to a synthetic '(continue)'
  placeholder instead of dropping them, so strict providers
  (Qwen/mlx-vlm, Anthropic) no longer reject transcripts missing
  a user turn. (openclaw#75313)

Closes openclaw#75271, closes openclaw#75313.

* refactor: clean up comments in session-file repair

* fix(agents): preserve trailing assistant tool-call turns during session trim

Mirror the outbound guard (stripTrailingAssistantPrefillTurns):
skip assistant entries containing toolCall/toolUse/functionCall
blocks so transcript repair can synthesize missing tool results.

Addresses PR review feedback from clawsweeper on openclaw#75606.

* refactor: delete unused contract test helpers

* test: use built-in OpenAI provider in Windows smoke

* fix: recover Discord voice auto-join after resume

* refactor: trim cli test helper exports

* fix: send twilio notify twiml directly

* Prefer Codex native workspace tools (openclaw#75308)

Summary:
- The PR adds Codex dynamic-tool profile config defaulting to `native-first`, filters duplicate workspace/process/planning tools from Codex app-server thread payloads, keeps managed `web_search`, updates docs/manifest/config baselines/changelog, and adds regression tests.

ClawSweeper fixups:
- Included follow-up commit: test(codex): pin native-first tool catalog
- Included follow-up commit: chore(config): refresh generated schema baseline
- Included follow-up commit: chore: add codex native-first changelog
- Included follow-up commit: chore: move native-first changelog entry
- Included follow-up commit: chore: refresh config baseline after rebase

Validation:
- ClawSweeper review passed for head 30e5cec.
- Required merge gates passed before the squash merge.

Prepared head SHA: 30e5cec
Review: openclaw#75308 (comment)

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: pashpashpash <[email protected]>

* refactor: trim cli program test exports

* refactor: trim command test helper exports

* ci: stop parity gate on pull requests

* chore: update dependencies

* chore: sync whatsapp dependency lockfile

* fix(agents): dedupe messaging tool replies by route

* fix: handle EPIPE errors on child process stdin writes (openclaw#75602)

Fix three child-process stdin write paths that let async EPIPE errors
escape to uncaughtException and crash the gateway.

extensions/imessage/src/client.ts (the actual openclaw#75438 crash path):
- Add child.stdin.on('error') listener in start() to catch async EPIPE
  and reject all pending requests via failAll().
- Add write callback to request() stdin.write() that rejects the
  specific pending request on error, instead of leaving it hanging
  until timeout.

src/agents/mcp-stdio-transport.ts:
- Fix write callback race in send(): previously resolved the promise
  immediately when write() returned true, then the write callback with
  EPIPE would fire after the promise was already fulfilled. Now always
  settles the promise from the write callback so the outcome is known
  before resolving.

src/process/exec.ts:
- Add stdin.on('error') before writing input so EPIPE from a
  prematurely-exited child is swallowed — the process exit handler
  reports the real status.

One reporter observed a gateway crash after 10.5 hours of stable
uptime — a single EPIPE on an iMessage RPC child process stdin write
killed the gateway with code 1.

Fixes: openclaw#75438

* refactor: trim cron test helper exports

* refactor: delete unused repo scan helper

* fix: default Discord voice to explicit opt-in

* refactor: trim test utility exports

* fix: show google meet twilio call diagnostics

* build: refresh a2ui bundle hash

* refactor: trim auto reply test helper exports

* fix(app): retry device tokens on pinned gateways (openclaw#75537)

* test(e2e): add bundled plugin runtime smoke

* test(e2e): activate channel rows for runtime smoke

* test(e2e): let channel runtime smoke load channels

* test(e2e): tolerate missing pgrep in runtime smoke

* test(e2e): assert canonical TTS provider in smoke

* test(e2e): satisfy runtime smoke lint

* test(e2e): account for lazy plugin commands in smoke

* test(e2e): give plugin runtime RPCs more headroom

* test(e2e): share runtime deps across matrix probes

* test(e2e): skip lazy tool catalog probes

* test(e2e): isolate plugin matrix runtime deps

* test(e2e): configure tts provider sections in matrix

* fix(plugins): preserve requested speech fallback

* test(e2e): harden bundled plugin runtime smoke

* docs(changelog): credit plugin runtime smoke fix

* fix(plugins): scope requested speech providers

* fix(test): satisfy plugin smoke lint

* fix(test): tolerate channel readiness degradation

* refactor: trim gateway test helper exports

* fix: allow doctor repair size drops

* fix: async transcript I/O to unblock gateway event loop (openclaw#75595)

* fix: async transcript I/O to unblock gateway event loop

Two related fixes for event-loop starvation caused by synchronous file
operations on session transcript files during gateway hot paths.

## sessions.list: yield between transcript reads (openclaw#75330)

Extract filterAndSortSessionEntries() from listSessionsFromStore() and
add a new listSessionsFromStoreAsync() that yields to the event loop
via setImmediate every 10 session rows. The sessions.list RPC handler
now uses the async version.

The synchronous version is kept for callers that need it (sessions-
resolve visibility checks, embedded backends, subagent tools).

The dominant blocker is readSessionTitleFieldsFromTranscript(), which
performs fs.statSync + fs.openSync + fs.readSync (head) + fs.readSync
(tail) for every session row that requests derived titles or last-
message previews. With 100+ sessions, this blocks the event loop for
32-64 seconds, starving WebSocket heartbeats, channel I/O, and
concurrent RPC.

## session compaction: async file copy (openclaw#75414)

Add captureCompactionCheckpointSnapshotAsync() using fs.promises for
stat, copyFile, and unlink instead of fsSync equivalents. Switch both
compact.ts and compact.queued.ts to the async version.

The synchronous copyFileSync of large transcript files (20MB+ observed
in production) was blocking the event loop for the entire copy duration
— one reporter measured a 43-minute event loop block from a single
compaction checkpoint capture.

Refs: openclaw#75330, openclaw#75414

* test: cover async transcript I/O responsiveness

* fix: avoid sync checkpoint metadata reads

* refactor: trim agent test helper exports

* fix(gateway): preflight strict agent delivery

* fix(onboard): run noninteractive migration imports

* fix(discord): send component-only native replies

* fix(slack): send block-only slash replies

* fix(telegram): send interactive-only button replies

* fix(line): send quick-reply-only payloads

* fix(auto-reply): keep docking in direct chats

* fix(discord): pin text dm main route owner

* fix(channels): pin dm main route owners

* fix(update): verify daemon restart port

* fix(update): use service env for doctor

* refactor: trim status scan test exports

* slack: persist thread participation best-effort (openclaw#75583)

* refactor: delete unused test helper code

* fix(gateway): defer session store read maintenance

* discord: persist component registries best-effort (openclaw#75584)

* test: retry Windows Parallels agent turn

* fix: type non-interactive onboard prompter

* refactor: trim gateway test helper barrel

* fix: declare deepinfra manifest model discovery

* refactor: derive deepinfra catalog from manifest

* docs: note deepinfra model catalog migration

* refactor: trim gateway helper state exports

* fix: allow onboarding config size drops

* fix: declare groq setup auth metadata

* fix: declare groq manifest model catalog

* docs: note groq manifest catalog migration

* fix: carry matrix dm allowlist state

* refactor: trim shared test helper exports

* fix: preserve OpenAI Codex xhigh thinking policy

* fix(infra): block ambient Homebrew env vars from brew resolution (openclaw#74463)

* fix: address issue

* fix: address issue

* fix: address PR review feedback

* fix: address PR review feedback

* fix: address codex review feedback

* docs: add changelog entry for PR merge

* fix: declare venice manifest catalog metadata

* refactor: derive venice fallback catalog from manifest

* docs: note venice manifest catalog migration

* refactor: trim extension internal type exports

* fix(infra): block Windows system path env vars from workspace .env injection (openclaw#74456)

* fix: address issue

* fix: address PR review feedback

* fix: address codex review feedback

* fix: address codex review feedback

* fix: address codex review feedback

* docs: add changelog entry for PR merge

* Update CHANGELOG.md

* fix: block workspace CLOUDSDK_PYTHON override and always set trusted interpreter for gcloud (openclaw#74492)

* fix: address issue

* docs: add changelog entry for PR merge

* refactor: trim extension runtime barrels

* fix(plugins): scope install slot selection

* fix(plugins): satisfy slot registry type

* fix: declare zai manifest model catalog

* docs: note zai manifest catalog migration

* refactor: trim private extension exports

* test(docker): install procps for plugin watchdogs

* refactor: delete unused extension shared shims

* fix: declare stepfun setup auth metadata

* test(pairing): clear allowlist cache before read spy (openclaw#74147)

* fix: declare qianfan setup auth metadata

* fix(doctor): keep plugin runtime deps repair explicit (openclaw#75603)

* fix(doctor): keep plugin runtime deps repair explicit

* fix(doctor): keep plugin runtime deps repair explicit

* fix(doctor): keep plugin runtime deps repair explicit

---------

Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* feat(slack): publish App Home tab views

* fix(onboarding): scope post-config runtime deps (openclaw#75653)

* fix(whatsapp): drop stale qrcode runtime dependency

* fix(zai): satisfy catalog lint

* refactor: trim internal extension seams

* test: require parallels agent responses

* refactor: trim extension runtime reexports

* fix(feishu): recover WebSocket after SDK retry exhaustion (openclaw#73739)

* fix(feishu): recover WebSocket after SDK retry exhaustion

* fix(feishu): recover WebSocket after SDK retry exhaustion

---------

Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>

* feat(google-meet): add transcribe caption health

* refactor: trim extension test hooks

* test: configure parallels smoke provider timeout

* fix(doctor): keep noninteractive service repair explicit

* fix(plugins): use built code for tool discovery

* test(pairing): pass read spy path after cache reset

* refactor: trim extension helper shims

* fix(slack): declare Slack type dependency

* test(plugins): mock install slot registry

* docs(changelog): backfill 84e9463 qianfan and a4fd45c stepfun setup auth metadata

* test: quote parallels provider config json

* refactor: trim messaging runtime barrels

* test(plugin-sdk): align facade loader windows fast path

* refactor: trim extension barrel leftovers

* test(ci): stabilize pricing and codex web config checks

* refactor: trim channel dead exports

* fix(config): accept optional Codex search location

* test(config): isolate codex web schema acceptance

* refactor: trim extension shim reexports

* test(config): type fresh codex schema import

* refactor: trim browser action barrel

* docs(changelog): finalize 2026.4.30 notes

* refactor: trim provider model constants

* fix(channels): honor module loader native opt-out

* refactor: trim core command dead exports

* refactor: trim provider internal exports

* refactor: trim extension helper exports

* test(parallels): write Windows provider config via batch file

* refactor: trim browser route exports

* refactor: trim qqbot helper exports

* refactor: trim qqbot bridge exports

* test(parallels): expose portable Git to Windows agent turns

* refactor: trim qqbot utility exports

* fix(context-engine): honor assembled prompt authority in precheck (openclaw#74255)

Merged via squash.

Prepared head SHA: 650b023
Co-authored-by: 100yenadmin <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* refactor: trim qqbot internal types

* refactor: trim mattermost helper exports

* refactor: trim matrix helper exports

* refactor: trim tlon helper exports

* test(parallels): retry Windows agent idle exits

* refactor: trim browser helper types

* test(parallels): budget Windows agent retry

* fix(ci): keep tool display serialization local

* refactor: trim voice-call helper exports

* refactor: trim bluebubbles helper exports

* refactor: trim bluebubbles config helper exports

* fix(webchat): create dashboard sessions from New Chat (openclaw#73725)

Summary:
- The PR rewires Control UI/WebChat New Chat to create and switch to a dashboard session through `sessions.create`, adds guarded UI/session helper logic and regression tests, and updates the changelog.

ClawSweeper fixups:
- Included follow-up commit: fix(webchat): create dashboard sessions from New Chat

Validation:
- ClawSweeper review passed for head 983c634.
- Required merge gates passed before the squash merge.

Prepared head SHA: 983c634
Review: openclaw#73725 (comment)

Co-authored-by: vincentkoc <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* refactor: trim brave and diffs helper exports

* fix(discord): avoid resolving token during action discovery (openclaw#75424)

Summary:
- The PR changes Discord message-action discovery to inspect configured accounts without resolving bot tokens, resolves scoped channel SecretRefs during message-tool execution even with an injected config snapshot, adds regression tests and a changelog entry, and restores a tool-display serializer export.

ClawSweeper fixups:
- Included follow-up commit: fix(discord): avoid resolving token during action discovery
- Included follow-up commit: fix(tools): restore tool display serializer export

Validation:
- ClawSweeper review passed for head a2cd832.
- Required merge gates passed before the squash merge.

Prepared head SHA: a2cd832
Review: openclaw#75424 (comment)

Co-authored-by: Clawdbot <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>

* Mattermost: refresh slash callback command validation (openclaw#72923)

* fix(mattermost): refresh slash callback tokens

* fix(mattermost): reconcile slash callback method

* fix(mattermost): bound slash command lookups

* fix(mattermost): cache slash validation lookups

* fix(mattermost): refresh slash routing

* fix(mattermost): require slash callback secret

* fix(mattermost): rate limit slash validation

* fix(mattermost): throttle slash validation

* fix(mattermost): bound slash token cache

* fix(mattermost): sanitize slash callback logs

* fix(mattermost): avoid stale slash token cache

* fix(mattermost): scope slash token gate to command

* fix(mattermost): rate-limit slash validation

* fix(mattermost): redact slash validation errors

* fix(mattermost): satisfy slash sanitizer lint

* Move Mattermost slash refresh changelog entry to Unreleased Fixes

* Apply oxfmt accordion blank-line on Mattermost slash docs

---------

Co-authored-by: Devin Robison <[email protected]>

* refactor: trim discord helper exports

* refactor: trim discord internal helper exports

* refactor: trim discord monitor helper exports

* refactor: trim discord test helper exports

* refactor: trim feishu helper exports

* docs(doctor): clarify service repair prompts

Clarify when doctor reports service repair state versus when gateway install performs launcher writes.

Thanks @vincentkoc

* fix(security): keep plain audit off plugin runtimes

Keep routine security audit on config/filesystem checks by default, reserving plugin runtime collectors for deep audit paths.

Thanks @vincentkoc

* test(e2e): bound telegram rtt warm samples

* test(rtt): expose warm sample metrics

* test(e2e): target successful rtt samples

* test(e2e): measure telegram normal reply rtt

* test(e2e): allow rtt retries to reach sample target

* refactor: trim provider helper exports

* fix(ci): satisfy rtt lint rules

* refactor: trim google meet helper exports

* refactor: trim google meet transport exports

* refactor: trim google chat helper exports

* test(rtt): support main package measurements

* refactor: trim irc helper exports

* refactor: trim signal helper exports

* refactor: trim line helper exports

* test(plugins): materialize runtime deps fixtures

* test(parallels): force Windows OpenAI SSE smoke

* refactor: trim mattermost helper exports

* refactor: trim nextcloud talk helper exports

* refactor: trim provider helper exports

* fix(gateway): bound session transcript hot paths

Bound recent transcript reads and oversized injected-message writes across gateway session paths.

Thanks @vincentkoc

* fix(plugins): reuse cold inspect registry snapshots (openclaw#75620)

Summary:
- The PR reuses a request-scoped cold manifest registry/runtime context across plugin status and inspect report paths, threads that context through provider/setup/metadata helpers, adds targeted coverage, and adds a changelog entry.

ClawSweeper fixups:
- Included follow-up commit: fix(plugins): preserve setup auto-enable lookup

Validation:
- ClawSweeper review passed for head 4d8e8e2.
- Required merge gates passed before the squash merge.

Prepared head SHA: 4d8e8e2
Review: openclaw#75620 (comment)

Co-authored-by: Vincent Koc <[email protected]>

* fix(plugins): avoid source rebuilds for policy toggles

Reuse current installed-plugin registry records for policy-only enable and disable refreshes.

Thanks @vincentkoc

* refactor: trim msteams helper exports

* test(parallels): force POSIX OpenAI SSE smoke

* refactor: trim telegram helper exports

* refactor: trim whatsapp helper exports

* test(parallels): batch POSIX provider config

* refactor: trim slack helper exports

* refactor: trim matrix helper exports

* test(release): repair release validation checks

* refactor: trim voice call helper exports

* refactor: trim memory wiki helper exports

* refactor: trim nostr helper exports

* test(release): forward validation fixes

* test(release): run doctor fix in setup-entry e2e

* refactor: trim qa matrix helper exports

* refactor: trim memory core helper exports

* test(release): fix setup fallback loader validation

* refactor: trim file transfer helper exports

* refactor: trim tlon helper exports

* refactor: trim browser helper exports

* test(release): harden channel add setup fallback

* refactor: trim imessage helper exports

* test(ci): update imessage runtime api guard

* test(release): include mirrored root runtime deps

* refactor: trim qa lab helper exports

* fix(bluebubbles): UTI-aware audio attachment detection (openclaw#75488)

Co-authored-by: Omar Shahine <[email protected]>

* test(release): remove stale runtime deps local

* refactor: trim qqbot helper exports

* refactor: trim codex internal exports

* refactor: trim whatsapp test helper exports

* refactor: trim telegram test harness exports

* refactor: remove stale openrouter runtime barrel

* refactor: trim zalo helper exports

* test(release): harden docker release validation

* fix(rtt): parse telegram scenario list

* refactor: trim feishu lifecycle helper exports

* test(plugins): use valid plugin origin in loader test

* fix(rtt): wait between telegram samples

* fix(config): surface backup restore copy failures in audit and logs (openclaw#70515)

Merged via squash.

Prepared head SHA: 7c77974
Co-authored-by: davidangularme <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

* refactor: trim zalouser helper exports

* fix: stop stale OpenAI Responses replay in Telegram

Drop incomplete OpenAI/Codex tool-call replay turns and avoid previous_response_id deltas when Responses store is disabled. Verified on Zeus with focused tests, build, UI build, channel probe, and live Telegram-session canary.

* fix: stop stale Telegram final-answer replay

Suppress stale replayed final answers after newer Telegram/tool turns, supersede overlapping runs, and preserve current anchored output when replay frames contain old finals. Verified locally and on live Zeus with Telegram multi-skill expected-answer checks and ZeusChat model-switch browser checks.

* Stop direct NO_REPLY fallback chatter (#11)

Co-authored-by: Zeus3000 <[email protected]>

* Repair mismatched reset transcript files (#13)

Co-authored-by: Zeus3000 <[email protected]>

* Promote substantive stale-replay commentary (#15)

Co-authored-by: Zeus3000 <[email protected]>

* fix: preserve Telegram transcript user turns

---------

Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Alex Knight <[email protected]>
Co-authored-by: Vincent Koc <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: Peter Steinberger <[email protected]>
Co-authored-by: pashpashpash <[email protected]>
Co-authored-by: Shakker <[email protected]>
Co-authored-by: Pavan Kumar Gondhi <[email protected]>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
Co-authored-by: Andrew <[email protected]>
Co-authored-by: 100yenadmin <[email protected]>
Co-authored-by: jalehman <[email protected]>
Co-authored-by: vincentkoc <[email protected]>
Co-authored-by: Conan-Scott <[email protected]>
Co-authored-by: Clawdbot <[email protected]>
Co-authored-by: Agustin Rivera <[email protected]>
Co-authored-by: Devin Robison <[email protected]>
Co-authored-by: Ayaan Zaidi <[email protected]>
Co-authored-by: Omar Shahine <[email protected]>
Co-authored-by: Omar Shahine <[email protected]>
Co-authored-by: Fred David blum <[email protected]>
Co-authored-by: davidangularme <[email protected]>
Co-authored-by: Zeus3000 <[email protected]>
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
…nclaw#74463)

* fix: address issue

* fix: address issue

* fix: address PR review feedback

* fix: address PR review feedback

* fix: address codex review feedback

* docs: add changelog entry for PR merge

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR size: M