Gateway: follow up HEIC input image handling by vincentkoc · Pull Request #38146 · openclaw/openclaw

vincentkoc · 2026-03-06T16:34:16Z

Summary

Problem: the merged HEIC input_image PR introduced two follow-up issues: non-HEIC images started going through MIME sniffing unexpectedly, and OpenAI chat-completions maxTotalImageBytes still counted pre-normalized HEIC base64 bytes instead of the larger post-conversion JPEG payload.
Why it matters: this changed non-HEIC behavior outside the intended scope and could let normalized HEIC payloads exceed the effective total image budget.
What changed: HEIC sniffing is now scoped back to declared HEIC/HEIF inputs only, HEIC tests are hermetic via detectMime mocking, and chat-completions total image budgeting now counts the normalized image payload.
What did NOT change (scope boundary): no local-path, SSRF, or general MIME allowlist policy was broadened beyond the already-merged HEIC feature.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related Gateway: normalize HEIC input_image sources #38122

User-visible / Behavior Changes

Non-HEIC input_image inputs keep their previous declared-MIME behavior.
OpenAI chat-completions maxTotalImageBytes now correctly applies after HEIC/HEIF normalization to JPEG.

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22 / pnpm
Model/provider: OpenAI-compatible chat completions path
Integration/channel (if any): Gateway HTTP input_image
Relevant config (redacted): gateway.http.endpoints.chatCompletions.maxTotalImageBytes

Steps

Send chat-completions requests with HEIC image_url data URIs and a tight maxTotalImageBytes limit.
Let the gateway normalize the HEIC input, which produces a larger JPEG payload.
Verify the request is rejected against the normalized payload size and that non-HEIC inputs still skip MIME sniffing.
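
As a rough illustration of the first step, the sketch below builds a chat-completions request body carrying a HEIC data URI and estimates the decoded byte size from the base64 text. The message shape follows the OpenAI-style image_url content part. The helper names, the model string, and the 12-byte placeholder payload are hypothetical and not taken from this PR.

```typescript
// Hypothetical helpers for illustration; not part of the openclaw codebase.

/** Decoded byte length of a padded base64 string (no whitespace expected). */
function base64DecodedLength(b64: string): number {
  if (b64.length === 0) return 0;
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

/** Wraps base64 image data in an OpenAI-style chat-completions body. */
function buildImageRequest(model: string, mimeType: string, b64: string) {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Describe this image." },
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${b64}` } },
        ],
      },
    ],
  };
}

// Placeholder only: 12 bytes resembling the start of an ftyp box, not a valid image.
const placeholderHeicBase64 = "AAAAGGZ0eXBoZWlj";
const exampleRequest = buildImageRequest("example-model", "image/heic", placeholderHeicBase64);
const declaredPayloadBytes = base64DecodedLength(placeholderHeicBase64);
```

With a real HEIC file, setting maxTotalImageBytes just above declaredPayloadBytes gives a tight limit that the larger normalized JPEG should exceed.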

Expected

HEIC follow-up keeps non-HEIC behavior unchanged and enforces total image budgets on the actual normalized payload.

Actual

The merged PR sniffed all image MIME types and chat-completions only pre-counted the raw HEIC base64 size.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: HEIC tests now mock MIME detection explicitly; non-HEIC base64 inputs do not trigger sniffing; normalized HEIC payload size is counted against chat-completions maxTotalImageBytes.
Edge cases checked: unchanged base64 image inputs are not double-counted in chat-completions accounting.
What you did not verify: live provider requests through a real HEIC image conversion path.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps:

Failure Recovery (if this breaks)

How to disable/revert this change quickly: revert this PR.
Files/config to restore: src/media/input-files.ts, src/gateway/openai-http.ts
Known bad symptoms reviewers should watch for: non-HEIC images unexpectedly rejected by MIME mismatch, or oversized normalized HEIC payloads still slipping past total image budget checks.

Risks and Mitigations

Risk: HEIC requests near the configured total image limit may now fail where they previously slipped through.
- Mitigation: that failure now reflects the actual bytes delivered to the provider, and the tests lock the intended accounting behavior.

aisle-research-bot · 2026-03-06T16:50:22Z

🔒 Aisle Security Analysis

We found 1 potential security issue in this PR:

#	Severity	Title
1	🟡 Medium	MIME type spoofing allows non-image content to bypass image MIME allowlist in normalizeInputImage

1. 🟡 MIME type spoofing allows non-image content to bypass image MIME allowlist in normalizeInputImage

Property	Value
Severity	Medium
CWE	CWE-20
Location	`src/media/input-files.ts:237-245`

Description

normalizeInputImage now skips MIME sniffing for all non-HEIC declared MIME types and trusts the caller-controlled params.mimeType.

This enables content-type confusion / validation bypass:

Input: params.mimeType comes from attacker-controlled sources:
- base64 images: source.mediaType (data URI / request JSON fields)
- URL images: HTTP Content-Type from an attacker-controlled origin
Change/regression: for any declared MIME not in HEIC_INPUT_IMAGE_MIMES, the code no longer calls detectMime() on the actual bytes.
Impact: an attacker can declare an allowed image MIME (e.g., image/png) while sending bytes that are actually a different (potentially disallowed) file type (e.g., PDF/ZIP/HTML). The allowlist check is then performed against the declared MIME, not the actual content.
- Previously, detectMime({buffer}) (via file-type) would identify many non-image payloads (e.g., application/pdf), causing them to be rejected.

Vulnerable behavior introduced in this change:

sourceMime becomes declaredMime for non-HEIC inputs without verifying the buffer.
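
The behavior described above can be sketched roughly as follows. This is a simplified model, not the code in src/media/input-files.ts: the contents of HEIC_INPUT_IMAGE_MIMES, the single-argument detectMime stub, and the fallback to the declared type are all assumptions made for illustration.

```typescript
// Sketch only: names mirror the PR discussion, bodies are assumptions.
type DetectMime = (buffer: Uint8Array) => Promise<string | undefined>;

// Assumed contents; the PR only says "declared HEIC/HEIF inputs".
const HEIC_INPUT_IMAGE_MIMES = new Set(["image/heic", "image/heif"]);

async function resolveSourceMime(
  declaredMime: string,
  buffer: Uint8Array,
  detectMime: DetectMime,
): Promise<string> {
  const declared = declaredMime.trim().toLowerCase();
  if (!HEIC_INPUT_IMAGE_MIMES.has(declared)) {
    // Non-HEIC: the declared type is trusted and the bytes are never inspected.
    return declared;
  }
  // Declared HEIC/HEIF: sniff the bytes, falling back to the declaration.
  return (await detectMime(buffer)) ?? declared;
}

// Demonstrates the spoofing concern: PDF bytes declared as image/png.
async function demoSpoof(): Promise<{ mime: string; sniffCalls: number }> {
  let sniffCalls = 0;
  const detect: DetectMime = async () => {
    sniffCalls += 1;
    return "application/pdf";
  };
  const pdfBytes = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d]); // "%PDF-"
  const mime = await resolveSourceMime("image/png", pdfBytes, detect);
  return { mime, sniffCalls };
}
```

In this model the PDF bytes resolve to image/png without the detector ever running, so a later allowlist check would see only the declared type.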

Recommendation

Do not rely on a declared MIME type for security decisions.

Mitigations (choose one depending on desired strictness/perf):

Restore sniffing for all images and validate that the detected type is an allowed image MIME:

const declaredMime = normalizeMimeType(params.mimeType) ?? "application/octet-stream";
const detected = normalizeMimeType(await detectMime({ buffer: params.buffer, headerMime: params.mimeType }));

// If we can detect a concrete type, prefer it.
const sourceMime = detected ?? declaredMime;

// Optional hardening: if declared claims image/* but detected is a non-image, reject.
if (declaredMime.startsWith("image/") && detected && !detected.startsWith("image/")) {
  throw new Error(`Invalid image content: detected ${detected} for declared ${declaredMime}`);
}

if (!params.limits.allowedMimes.has(sourceMime)) {
  throw new Error(`Unsupported image MIME type: ${sourceMime}`);
}

At minimum, if declaredMime is an allowed image type, ensure the buffer looks like an image (magic bytes) and reject if detectMime() identifies a non-image (e.g., application/pdf, application/zip).
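
A minimal sketch of such a magic-byte check is shown below, assuming a small hand-written signature table. The helper names are hypothetical, and the table covers only a few well-known formats; a real implementation would more likely rely on the existing detectMime() path.

```typescript
// Hypothetical helper; the byte sequences are the well-known file signatures.
const SIGNATURES: ReadonlyArray<{ mime: string; bytes: readonly number[] }> = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "image/gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

/** Returns the MIME type of the first matching signature, if any. */
function sniffByMagic(buffer: Uint8Array): string | undefined {
  for (const sig of SIGNATURES) {
    if (buffer.length >= sig.bytes.length && sig.bytes.every((b, i) => buffer[i] === b)) {
      return sig.mime;
    }
  }
  return undefined;
}

/** Throws when the bytes are positively identified as a non-image type. */
function assertLooksLikeDeclaredImage(declaredMime: string, buffer: Uint8Array): void {
  const detected = sniffByMagic(buffer);
  if (detected && !detected.startsWith("image/")) {
    throw new Error(`Invalid image content: detected ${detected} for declared ${declaredMime}`);
  }
}
```

This version only rejects on a positive non-image match, so formats missing from the table pass through unchanged. A stricter variant would require a positive image match instead.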

Also consider logging/auditing mismatches (declaredMime vs detected) to catch abuse.

Analyzed PR: #38146 at commit 3552736

Last updated on: 2026-03-06T17:02:22Z

greptile-apps · 2026-03-06T16:54:52Z

Greptile Summary

This PR is a targeted follow-up to the HEIC input_image feature (#38122), fixing two regressions it introduced:

MIME sniffing scope: All images were being passed through detectMime(), unintentionally changing behavior for non-HEIC inputs. The fix gates detectMime behind a HEIC_INPUT_IMAGE_MIMES check in normalizeInputImage() (lines 238-242 of src/media/input-files.ts), ensuring only declared HEIC/HEIF inputs trigger MIME sniffing. Non-HEIC images now use their declared MIME directly, preserving backward compatibility.
Budget accounting: maxTotalImageBytes was counting pre-conversion HEIC base64 size. The fix in resolveImagesForRequest() (lines 300-312 of src/gateway/openai-http.ts) now:
- Checks pre-extraction bytes early to avoid expensive conversion when already over budget
- Always counts post-extraction bytes (post-JPEG normalization) against the total
- This ensures HEIC→JPEG-inflated payloads are correctly charged against the configured limit
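
The accounting described in this list can be sketched as below. It is an illustration under stated assumptions, not the code in resolveImagesForRequest(): the ImageInput shape, the injected normalize function, and the fake converter that doubles HEIC size are all hypothetical.

```typescript
// Sketch under assumptions: names and shapes are illustrative, not the PR's code.
interface ImageInput {
  mimeType: string;
  bytes: Uint8Array;
}
type Normalize = (input: ImageInput) => Promise<ImageInput>;

async function resolveImagesWithinBudget(
  inputs: ImageInput[],
  maxTotalImageBytes: number,
  normalize: Normalize,
): Promise<ImageInput[]> {
  let total = 0;
  const resolved: ImageInput[] = [];
  for (const input of inputs) {
    // Cheap pre-check: skip conversion when the raw bytes already exceed the budget.
    if (total + input.bytes.length > maxTotalImageBytes) {
      throw new Error("Total image bytes exceed maxTotalImageBytes");
    }
    const normalized = await normalize(input);
    // Charge each image exactly once, using the size that will actually be sent.
    total += normalized.bytes.length;
    if (total > maxTotalImageBytes) {
      throw new Error("Total image bytes exceed maxTotalImageBytes after normalization");
    }
    resolved.push(normalized);
  }
  return resolved;
}

// Fake converter for the demo: HEIC becomes a JPEG twice the size, others pass through.
const fakeHeicToJpeg: Normalize = async (input) =>
  input.mimeType === "image/heic"
    ? { mimeType: "image/jpeg", bytes: new Uint8Array(input.bytes.length * 2) }
    : input;
```

With a budget of 100 bytes, two unchanged 50-byte PNG inputs pass because each is charged once, while a single 60-byte HEIC input is rejected because its 120-byte normalized form is what gets counted. The early pre-check is only a safe shortcut if normalization does not shrink payloads.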

Test coverage is comprehensive and hermetic:

input-files.fetch-guard.test.ts (lines 102-125): Verifies non-HEIC images skip detectMime via mock assertions
openai-http.image-budget.test.ts (lines 21-40): Confirms normalized payloads are counted post-extraction
openai-http.image-budget.test.ts (lines 42-67): Verifies unchanged payloads aren't double-counted

CHANGELOG entry (line 141) accurately documents the follow-up changes.

Confidence Score: 5/5

This PR is safe to merge: it fixes two well-scoped regressions with clear tests and no behavioral changes outside the stated scope.
All three code changes (MIME scoping, budget accounting, test hermeticity) are logically correct and verified by comprehensive tests. The budget accounting correctly counts post-extraction bytes for all source types, and the early-rejection fast-path for base64 is safe because HEIC→JPEG conversion typically produces larger (not smaller) payloads. The MIME sniffing is properly scoped to declared HEIC/HEIF inputs only, verified by explicit test assertions. No security surface changes, no API contract changes, and non-HEIC behavior is explicitly locked by a new test.
No files require special attention.

Last reviewed commit: 3552736

* Media: scope HEIC MIME sniffing * Media: hermeticize HEIC input tests * Gateway: fix HEIC image budget accounting * Gateway: add HEIC image budget regression test * Changelog: note HEIC follow-up fix (cherry picked from commit 9521e61)

vincentkoc added 5 commits March 6, 2026 11:33

Media: scope HEIC MIME sniffing

96277a3

Media: hermeticize HEIC input tests

b54edae

Gateway: fix HEIC image budget accounting

86c100b

Gateway: add HEIC image budget regression test

01c8ccc

Changelog: note HEIC follow-up fix

c6be9d4

vincentkoc self-assigned this Mar 6, 2026

openclaw-barnacle bot added the gateway (Gateway runtime), size: S, and maintainer (Maintainer-authored PR) labels Mar 6, 2026

vincentkoc marked this pull request as ready for review March 6, 2026 16:50

Merge branch 'main' into vincentkoc-code/telegram-media-max-config

3552736

vincentkoc merged commit 9521e61 into main Mar 6, 2026
8 checks passed

vincentkoc deleted the vincentkoc-code/telegram-media-max-config branch March 6, 2026 16:54

github-actions bot mentioned this pull request Mar 6, 2026

📡 Upstream Digest — 2026-03-06 18:34 UTC curtismercier/openclaw-mods#194

Open

vincentkoc mentioned this pull request Mar 6, 2026

Media: reject spoofed input_image MIME payloads #38289

Merged

18 tasks

alexyyyander mentioned this pull request Mar 7, 2026

fix/gateway token mismatch 38617 #38676

Closed

alexey-pelykh mentioned this pull request Mar 10, 2026

Cherry-pick: gateway HEIC images, schema lookup, SecretRef degradation remoteclaw/remoteclaw#835

Closed

vincentkoc merged 6 commits into main from vincentkoc-code/telegram-media-max-config
