
fix(gateway): preserve raw media invoke for HTTP tool clients #34365

Merged
obviyus merged 1 commit into main from fix/http-tools-media-invoke
Mar 4, 2026

@obviyus obviyus commented Mar 4, 2026

Summary

  • keep media nodes invoke blocked by default for agent-context safety
  • allow raw media invoke for HTTP /tools/invoke clients only
  • add regression tests for both paths

Test

  • pnpm test src/agents/openclaw-tools.camera.test.ts src/gateway/tools-invoke-http.test.ts
  • pnpm test src/agents/tools/nodes-tool.test.ts

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR adds an allowMediaInvokeCommands flag to guard media-returning invoke commands (photos.latest, camera.snap, camera.clip, screen.record) to prevent base64 context bloat in agent sessions. The flag is optional and defaults to false (blocked) for all existing callers, preserving backward compatibility.

Key changes:

  • nodes-tool.ts: Guard condition updated from if (dedicatedAction) → if (dedicatedAction && !options?.allowMediaInvokeCommands). With the flag left undefined (the default), media invocations stay blocked to prevent context bloat.
  • openclaw-tools.ts: Clean passthrough of the new flag from createOpenClawTools → createNodesTool.
  • tools-invoke-http.ts: HTTP /tools/invoke clients unconditionally set allowMediaInvokeCommands: true, allowing them to receive raw media payloads directly. This is documented in a clear inline comment.
  • Tests: Both code paths are covered — the existing test verifies the blocked default path, and new tests verify the opt-in path for HTTP callers.
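
For readers who want the guard in isolation, here is a minimal sketch of the default-blocked, opt-in check described above. It is not the code from nodes-tool.ts: assertInvokeAllowed, NodesToolOptions, the error text, and the camera_clip action name are assumptions made for illustration; only allowMediaInvokeCommands, the four command names, and the other dedicated action names come from this PR.

```typescript
// Sketch only: helper and type names are hypothetical, not the repository's API.
interface NodesToolOptions {
  allowMediaInvokeCommands?: boolean;
}

// Media-returning node commands mapped to their dedicated nodes actions.
// "camera_clip" is an assumed action name; the others appear in the PR discussion.
const MEDIA_INVOKE_ACTIONS = new Map<string, string>([
  ["photos.latest", "photos_latest"],
  ["camera.snap", "camera_snap"],
  ["camera.clip", "camera_clip"],
  ["screen.record", "screen_record"],
]);

// Throws when a media command is sent through the generic invoke path
// and the caller has not opted in.
function assertInvokeAllowed(invokeCommand: string, options?: NodesToolOptions): void {
  const normalized = invokeCommand.trim().toLowerCase();
  const dedicatedAction = MEDIA_INVOKE_ACTIONS.get(normalized);
  if (dedicatedAction && !options?.allowMediaInvokeCommands) {
    throw new Error(
      `invoke of "${normalized}" is blocked; use action="${dedicatedAction}" instead`,
    );
  }
}
```

Calling assertInvokeAllowed("photos.latest") with no options throws, while passing { allowMediaInvokeCommands: true } lets the same command through, which mirrors the agent path and the HTTP path respectively.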

Confidence Score: 4/5

  • Safe to merge. The change is minimal, maintains backward compatibility, and both code paths are well-tested.
  • The PR introduces a straightforward guard condition that blocks media-returning invoke commands by default while allowing HTTP clients to opt in. The default behavior is preserved for all existing callers, breaking no existing functionality. Both the blocked and allowed paths are covered by tests. The logic is correct and the implementation is clean.
  • No files require special attention.

Last reviewed commit: fffaa70

@openclaw-barnacle openclaw-barnacle bot added the gateway, agents, size: S, and maintainer labels Mar 4, 2026
@obviyus obviyus force-pushed the fix/http-tools-media-invoke branch from fffaa70 to 4784160 March 4, 2026 11:47
@obviyus obviyus merged commit 7b5e64e into main Mar 4, 2026
1 check passed
@obviyus obviyus deleted the fix/http-tools-media-invoke branch March 4, 2026 11:47

aisle-research-bot bot commented Mar 4, 2026

🔒 Aisle Security Analysis

We found 3 potential security issue(s) in this PR:

| # | Severity | Title |
| 1 | 🟠 High | Media/base64 payload exposure & DoS via nodes action="invoke" when allowMediaInvokeCommands enabled |
| 2 | 🟠 High | POST /tools/invoke enables raw camera/screen media exfiltration via allowMediaInvokeCommands=true |
| 3 | 🟡 Medium | Unbounded media/base64 payloads enabled for HTTP /tools/invoke via allowMediaInvokeCommands |

1. 🟠 Media/base64 payload exposure & DoS via nodes action="invoke" when allowMediaInvokeCommands enabled

| Property | Value |
| Severity | High |
| CWE | CWE-201 |
| Location | src/agents/tools/nodes-tool.ts:751-785 |

Description

The generic nodes tool path action="invoke" returns raw node.invoke results (including base64 media) directly via jsonResult, bypassing the dedicated media actions’ guardrails.

Dedicated media actions (e.g., action="photos_latest") add multiple safety measures that are not applied to action="invoke":

  • Request guardrails: photos_latest clamps limit (1..20) and sets default maxWidth/quality before calling photos.latest.
  • Output normalization: dedicated actions write media to disk and return MEDIA:<path>/FILE:<path> rather than returning base64 in details.
  • Image sanitization: dedicated actions call sanitizeToolResultImages(...) to downscale/recompress image blocks to safe limits; jsonResult(...) does no such sanitization and includes base64 inside the JSON string.

With allowMediaInvokeCommands enabled, callers can invoke media-returning commands like photos.latest through action="invoke" and receive unbounded base64 in the tool output. This creates:

  • Sensitive data exposure risk: raw photo/camera/screen-record base64 returned directly.
  • Denial-of-service risk: unbounded response size and memory usage during JSON serialization (and potentially downstream LLM context bloat if used in agent flows).

Vulnerable code (generic invoke path now permits media commands when the flag is set):

const dedicatedAction = MEDIA_INVOKE_ACTIONS[invokeCommandNormalized as keyof typeof MEDIA_INVOKE_ACTIONS];
if (dedicatedAction && !options?.allowMediaInvokeCommands) {
  throw new Error(...);
}
...
const raw = await callGatewayTool("node.invoke", gatewayOpts, { nodeId, command: invokeCommand, params: invokeParams, ... });
return jsonResult(raw ?? {});

Recommendation

Do not allow raw media-returning node commands through the generic invoke path unless the same guardrails as the dedicated actions are enforced.

Recommended fixes (choose one):

  1. Route media commands to dedicated handlers even when called via invoke:
if (dedicatedAction) {
  // internally dispatch to the same implementation as action="photos_latest"/"camera_snap"/...
  return await handleDedicatedMediaAction(dedicatedAction, params);
}
  2. If invoke must be supported:
  • Enforce per-command parameter clamps (e.g., limit, maxWidth, quality) identical to dedicated actions.
  • Apply response size limits and strip/replace large base64 fields (e.g., return MEDIA:<path> only).
  • Apply sanitizeToolResultImages (or an equivalent) to any returned base64 images.
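
As a rough illustration of the parameter-clamp option, the sketch below bounds limit to the 1..20 range mentioned in the description and fills in defaults for maxWidth and quality. The helper names and both default values are assumptions; the real defaults used by the dedicated photos_latest action are not shown in this PR.

```typescript
// Sketch only: names and default values are hypothetical.
interface PhotosLatestParams {
  limit?: number;
  maxWidth?: number;
  quality?: number;
}

const ASSUMED_DEFAULT_MAX_WIDTH = 1600; // assumption, not taken from the repository
const ASSUMED_DEFAULT_QUALITY = 0.9; // assumption, not taken from the repository

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Applies the 1..20 clamp on limit and fills missing maxWidth/quality
// before the params are forwarded to photos.latest.
function clampPhotosLatestParams(params: PhotosLatestParams): Required<PhotosLatestParams> {
  const requested =
    typeof params.limit === "number" && Number.isFinite(params.limit)
      ? Math.trunc(params.limit)
      : 1;
  return {
    limit: clamp(requested, 1, 20),
    maxWidth: params.maxWidth ?? ASSUMED_DEFAULT_MAX_WIDTH,
    quality: params.quality ?? ASSUMED_DEFAULT_QUALITY,
  };
}
```

Running the same clamp on both the dedicated action and the generic invoke path would keep the two entry points from drifting apart.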

Also ensure allowMediaInvokeCommands is false by default for all externally reachable entrypoints.


2. 🟠 POST /tools/invoke enables raw camera/screen media exfiltration via allowMediaInvokeCommands=true

| Property | Value |
| Severity | High |
| CWE | CWE-200 |
| Location | src/gateway/tools-invoke-http.ts:248-268 |

Description

POST /tools/invoke now unconditionally enables allowMediaInvokeCommands: true when creating the tool execution context.

This disables the safety check in the nodes tool that previously blocked action="invoke" for known media-returning node commands (camera/photos/screen) to avoid returning base64 blobs.

Impact:

  • Any HTTP client that can access /tools/invoke (i.e., with the gateway shared secret token/password, or anonymously if gateway.auth.mode="none") can invoke media-returning node commands via tool=nodes + action=invoke and receive the raw base64 payload directly in the HTTP response.
  • This bypasses the safer dedicated actions (e.g. photos_latest, camera_snap, screen_record) which write media to a temp file and only return MEDIA:/path references.

Vulnerable flow (new behavior):

  • Input: HTTP request body to /tools/invoke selecting tool nodes and arguments including action: "invoke", invokeCommand: "photos.latest" (or camera.snap, screen.record, etc.)
  • Policy gap: tool policies/allowlists operate at the tool name level (nodes), not per-action/command, so enabling raw invoke media broadens what nodes can leak over HTTP.
  • Sink: gateway returns tool output verbatim via sendJson(res, 200, { ok: true, result }).

Relevant code:

// src/gateway/tools-invoke-http.ts
const allTools = createOpenClawTools({
  ...
  // HTTP callers consume tool output directly; preserve raw media invoke payloads.
  allowMediaInvokeCommands: true,
  ...
});
// src/agents/tools/nodes-tool.ts
const dedicatedAction = MEDIA_INVOKE_ACTIONS[invokeCommandNormalized as ...];
if (dedicatedAction && !options?.allowMediaInvokeCommands) {
  throw new Error("... blocked ...");
}
...
return jsonResult(raw ?? {}); // includes payload base64

Recommendation

Do not enable raw media-returning nodes invoke commands for the generic HTTP surface by default.

Suggested mitigation options:

  1. Make it opt-in via config (default false):
// src/gateway/tools-invoke-http.ts
const allowMediaInvokeCommands = cfg.gateway?.tools?.allowMediaInvokeCommands === true;

const allTools = createOpenClawTools({
  ...,
  allowMediaInvokeCommands,
});
  2. Prefer safer dedicated actions: keep allowMediaInvokeCommands false and require callers to use nodes actions like photos_latest / camera_snap / screen_record that return file references instead of base64.

  3. If raw payloads are required, add explicit authorization and response shaping:

  • require a separate privileged token/scope for media invoke
  • strip or truncate large base64 fields from result before returning
  • return a short-lived, authenticated download URL instead of embedding base64 in JSON.
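
One way the "strip or truncate large base64 fields" option could look is sketched below. It walks a JSON-like result and replaces any string longer than a threshold with a short placeholder. redactLargeStrings, the Json type, and the placeholder format are illustrative assumptions and do not exist in the repository.

```typescript
// Sketch only: names and placeholder format are hypothetical.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Returns a copy of `value` in which every string longer than `maxChars`
// is replaced by a placeholder that records the original length.
function redactLargeStrings(value: Json, maxChars: number): Json {
  if (typeof value === "string") {
    return value.length > maxChars ? `[omitted ${value.length} chars]` : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => redactLargeStrings(item, maxChars));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = redactLargeStrings(child, maxChars);
    }
    return out;
  }
  return value; // null, boolean, number
}
```

Applied to a tool result just before it is sent, this keeps small metadata fields such as format or dimensions while dropping the bulk payload.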

3. 🟡 Unbounded media/base64 payloads enabled for HTTP /tools/invoke via allowMediaInvokeCommands

| Property | Value |
| Severity | Medium |
| CWE | CWE-400 |
| Location | src/gateway/tools-invoke-http.ts:248-268 |

Description

POST /tools/invoke now constructs the toolset with allowMediaInvokeCommands: true, which disables the safety guard in the nodes tool that previously blocked media-returning invokeCommands (e.g., photos.latest, camera.snap, camera.clip, screen.record).

Impact:

  • Denial of service risk: media commands can return very large base64 payloads. The nodes tool’s invoke action will return them via jsonResult(...), which performs JSON.stringify on the full payload, potentially allocating very large strings/buffers and returning huge HTTP responses.
  • Bypasses safer dedicated actions: the dedicated nodes actions (e.g., photos_latest) clamp parameters (like limit) and convert media to temp files, avoiding returning base64 directly. Enabling raw invoke for media commands bypasses these safety bounds.

Vulnerable change (enables the bypass for all HTTP tool invocations):

const allTools = createOpenClawTools({
  // ...
  allowMediaInvokeCommands: true,
  // ...
});

This is particularly risky because POST /tools/invoke is a generic, scriptable surface; any authenticated caller can trigger these potentially huge responses by selecting the nodes tool with action="invoke" and a media command.

Recommendation

Do not enable raw media invoke payloads by default on the generic HTTP tool surface, or enforce strict size/parameter limits when it is enabled.

Option A (safer default): remove the unconditional enablement so media-returning commands remain blocked via action="invoke" and callers must use dedicated actions.

const allTools = createOpenClawTools({
  // ...
  // allowMediaInvokeCommands: true, // remove
  config: cfg,
});

Option B (if raw payloads are required): gate it behind an explicit, privileged request option (and/or stronger auth scope), and add a hard cap on response sizes for nodes invoke results. For example:

  • Only set allowMediaInvokeCommands when a specific header/flag is present and the caller is authorized for it.
  • In nodes tool action="invoke", when invokeCommand is a known media command, clamp parameters (limit/maxWidth/durationMs) and/or reject base64 payloads over a maximum size before calling jsonResult.

This preserves functionality while preventing oversized responses from exhausting memory/bandwidth.
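
The size cap in Option B could be approximated without decoding the payload, since base64 length determines the decoded byte count. The sketch below is one possible shape; estimateBase64Bytes, assertBase64WithinLimit, and the error text are hypothetical names, and the limit itself would be a deployment choice.

```typescript
// Sketch only: helper names are hypothetical.

// Estimates the decoded size of a padded base64 string without decoding it:
// every 4 characters encode 3 bytes, minus one byte per trailing "=".
function estimateBase64Bytes(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}

// Rejects a payload before it is serialized into the tool result.
function assertBase64WithinLimit(base64: string, maxBytes: number): void {
  const bytes = estimateBase64Bytes(base64);
  if (bytes > maxBytes) {
    throw new Error(`media payload of ~${bytes} bytes exceeds limit of ${maxBytes} bytes`);
  }
}
```

Checking the estimate before calling jsonResult would avoid the large JSON.stringify allocation that the finding describes.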


Analyzed PR: #34365 at commit 4784160

Last updated on: 2026-03-04T12:18:05Z

obviyus commented Mar 4, 2026

Landed via temp rebase onto main.

  • Gate: pnpm test src/agents/openclaw-tools.camera.test.ts src/gateway/tools-invoke-http.test.ts src/agents/tools/nodes-tool.test.ts
  • Land commit: 4784160
  • Merge commit: 7b5e64e

mrosmarin added a commit to mrosmarin/openclaw that referenced this pull request Mar 4, 2026
* main: (92 commits)
  fix: preserve raw media invoke for HTTP tool clients (openclaw#34365)
  fix: prevent nodes media base64 context bloat (openclaw#34332)
  docs(changelog): credit @Brotherinlaw-13 for openclaw#34318
  fix(telegram): materialize dm draft final to avoid duplicates
  fix: relay ACP sessions_spawn parent streaming (openclaw#34310) (thanks @vincentkoc) (openclaw#34310)
  fix: kill stuck ACP child processes on startup and harden sessions in discord threads (openclaw#33699)
  feat(ios): add Live Activity connection status + stale cleanup (openclaw#33591)
  chore(docs): add plugins refactor changelog entry
  Chore: remove accidental .DS_Store artifact
  Plugins/zalouser: migrate to scoped plugin-sdk imports
  Plugins/zalo: migrate to scoped plugin-sdk imports
  Plugins/whatsapp: migrate to scoped plugin-sdk imports
  Plugins/voice-call: migrate to scoped plugin-sdk imports
  Plugins/twitch: migrate to scoped plugin-sdk imports
  Plugins/tlon: migrate to scoped plugin-sdk imports
  Plugins/thread-ownership: migrate to scoped plugin-sdk imports
  Plugins/test-utils: migrate to scoped plugin-sdk imports
  Plugins/talk-voice: migrate to scoped plugin-sdk imports
  Plugins/synology-chat: migrate to scoped plugin-sdk imports
  Plugins/qwen-portal-auth: migrate to scoped plugin-sdk imports
  ...