Gateway: follow up HEIC input image handling #38146

Merged
vincentkoc merged 6 commits into main from vincentkoc-code/telegram-media-max-config
Mar 6, 2026
Conversation

@vincentkoc
Member

Summary

  • Problem: the merged HEIC input_image PR introduced two follow-up issues: non-HEIC images started going through MIME sniffing unexpectedly, and OpenAI chat-completions maxTotalImageBytes still counted pre-normalized HEIC base64 bytes instead of the larger post-conversion JPEG payload.
  • Why it matters: this changed non-HEIC behavior outside the intended scope and could let normalized HEIC payloads exceed the effective total image budget.
  • What changed: HEIC sniffing is now scoped back to declared HEIC/HEIF inputs only, HEIC tests are hermetic via detectMime mocking, and chat-completions total image budgeting now counts the normalized image payload.
  • What did NOT change (scope boundary): no local-path, SSRF, or general MIME allowlist policy was broadened beyond the already-merged HEIC feature.
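
The scoping described above can be illustrated with a minimal sketch. It assumes a set named like the HEIC_INPUT_IMAGE_MIMES mentioned in review and an injected detector standing in for detectMime; the set contents, the helper names, and the signatures below are illustrative assumptions, not the code merged in this PR.

```typescript
// Assumed contents; the real HEIC_INPUT_IMAGE_MIMES set is not shown in this conversation.
const HEIC_INPUT_IMAGE_MIMES: ReadonlySet<string> = new Set(["image/heic", "image/heif"]);

// Hypothetical detector shape standing in for detectMime().
type MimeDetector = (bytes: Uint8Array) => Promise<string | undefined>;

// Lower-case the type and drop parameters such as "; charset=binary".
function normalizeMimeType(mime: string | undefined): string | undefined {
  const base = mime?.split(";")[0]?.trim().toLowerCase();
  return base ? base : undefined;
}

// Sniff only when the caller declared HEIC/HEIF; otherwise keep the declared type.
async function resolveSourceMime(
  declared: string | undefined,
  bytes: Uint8Array,
  detect: MimeDetector,
): Promise<string> {
  const declaredMime = normalizeMimeType(declared) ?? "application/octet-stream";
  if (!HEIC_INPUT_IMAGE_MIMES.has(declaredMime)) {
    return declaredMime; // non-HEIC: the detector is never invoked
  }
  // Declared HEIC/HEIF: prefer the detected type, fall back to the declared one.
  return normalizeMimeType(await detect(bytes)) ?? declaredMime;
}
```

Passing the detector as a parameter is a choice made for this sketch only; it lets a test substitute a stub and assert that non-HEIC inputs never reach it.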

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Non-HEIC input_image inputs keep their previous declared-MIME behavior.
  • OpenAI chat-completions maxTotalImageBytes now correctly applies after HEIC/HEIF normalization to JPEG.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm
  • Model/provider: OpenAI-compatible chat completions path
  • Integration/channel (if any): Gateway HTTP input_image
  • Relevant config (redacted): gateway.http.endpoints.chatCompletions.maxTotalImageBytes

Steps

  1. Send chat-completions requests with HEIC image_url data URIs and a tight maxTotalImageBytes limit.
  2. Normalize the HEIC input to a larger JPEG payload.
  3. Verify the request is rejected against the normalized payload size and that non-HEIC inputs still skip MIME sniffing.
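
For step 1, a request body carrying an image as a data URI might be assembled roughly as follows. This is a sketch only: the helper names, the model string, and the prompt text are placeholders, and the message shape follows the public OpenAI chat-completions image_url convention rather than anything specific to this gateway.

```typescript
// Encode raw bytes as base64 using the global btoa().
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i += 1) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Build a chat-completions body with one image passed as a data URI.
// `model` and the prompt text are placeholders for the reproduction.
function buildImageRequest(model: string, mime: string, bytes: Uint8Array) {
  const url = `data:${mime};base64,${toBase64(bytes)}`;
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Describe this image." },
          { type: "image_url", image_url: { url } },
        ],
      },
    ],
  };
}

// Usage: the bytes here are dummy data, not a real HEIC file.
const body = buildImageRequest("placeholder-model", "image/heic", new Uint8Array([1, 2, 3]));
```

A real reproduction would read the bytes of an actual HEIC file and set maxTotalImageBytes so that the input fits under the limit but the normalized JPEG does not.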

Expected

  • HEIC follow-up keeps non-HEIC behavior unchanged and enforces total image budgets on the actual normalized payload.

Actual

  • The merged PR sniffed all image MIME types and chat-completions only pre-counted the raw HEIC base64 size.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: HEIC tests now mock MIME detection explicitly; non-HEIC base64 inputs do not trigger sniffing; normalized HEIC payload size is counted against chat-completions maxTotalImageBytes.
  • Edge cases checked: unchanged base64 image inputs are not double-counted in chat-completions accounting.
  • What you did not verify: live provider requests through a real HEIC image conversion path.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR.
  • Files/config to restore: src/media/input-files.ts, src/gateway/openai-http.ts
  • Known bad symptoms reviewers should watch for: non-HEIC images unexpectedly rejected by MIME mismatch, or oversized normalized HEIC payloads still slipping past total image budget checks.

Risks and Mitigations

  • Risk: HEIC requests near the configured total image limit may now fail where they previously slipped through.
    • Mitigation: that failure now reflects the actual bytes delivered to the provider, and the tests lock the intended accounting behavior.

@vincentkoc vincentkoc self-assigned this Mar 6, 2026
@openclaw-barnacle openclaw-barnacle bot added gateway Gateway runtime size: S maintainer Maintainer-authored PR labels Mar 6, 2026
@vincentkoc vincentkoc marked this pull request as ready for review March 6, 2026 16:50
@aisle-research-bot

aisle-research-bot bot commented Mar 6, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

| # | Severity | Title |
| --- | --- | --- |
| 1 | 🟡 Medium | MIME type spoofing allows non-image content to bypass image MIME allowlist in normalizeInputImage |

1. 🟡 MIME type spoofing allows non-image content to bypass image MIME allowlist in normalizeInputImage

| Property | Value |
| --- | --- |
| Severity | Medium |
| CWE | CWE-20 |
| Location | src/media/input-files.ts:237-245 |

Description

normalizeInputImage now skips MIME sniffing for all non-HEIC declared MIME types and trusts the caller-controlled params.mimeType.

This enables content-type confusion / validation bypass:

  • Input: params.mimeType comes from attacker-controlled sources:
    • base64 images: source.mediaType (data URI / request JSON fields)
    • URL images: HTTP Content-Type from an attacker-controlled origin
  • Change/regression: for any declared MIME not in HEIC_INPUT_IMAGE_MIMES, the code no longer calls detectMime() on the actual bytes.
  • Impact: an attacker can declare an allowed image MIME (e.g., image/png) while sending bytes that are actually a different (potentially disallowed) file type (e.g., PDF/ZIP/HTML). The allowlist check is then performed against the declared MIME, not the actual content.
    • Previously, detectMime({buffer}) (via file-type) would often identify many non-image payloads (e.g., application/pdf) causing rejection.

Vulnerable behavior introduced in this change:

  • sourceMime becomes declaredMime for non-HEIC inputs without verifying the buffer.

Recommendation

Do not rely on a declared MIME type for security decisions.

Mitigations (choose one depending on desired strictness/perf):

  1. Restore sniffing for all images and validate that the detected type is an allowed image MIME:

         const declaredMime = normalizeMimeType(params.mimeType) ?? "application/octet-stream";
         const detected = normalizeMimeType(
           await detectMime({ buffer: params.buffer, headerMime: params.mimeType }),
         );
         // If we can detect a concrete type, prefer it.
         const sourceMime = detected ?? declaredMime;
         // Optional hardening: if declared claims image/* but detected is a non-image, reject.
         if (declaredMime.startsWith("image/") && detected && !detected.startsWith("image/")) {
           throw new Error(`Invalid image content: detected ${detected} for declared ${declaredMime}`);
         }

         if (!params.limits.allowedMimes.has(sourceMime)) {
           throw new Error(`Unsupported image MIME type: ${sourceMime}`);
         }

  2. At minimum, if declaredMime is an allowed image type, ensure the buffer looks like an image (magic bytes) and reject if detectMime() identifies a non-image (e.g., application/pdf, application/zip).

Also consider logging/auditing mismatches (declaredMime vs detected) to catch abuse.
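
As an illustration of the magic-byte option, a standalone check could look like the sketch below. It covers only PNG, JPEG, GIF, and WebP, and it is stricter than the recommendation as written because it also rejects anything it cannot recognize; the function names are hypothetical and nothing here is taken from the repository.

```typescript
// True when `buf` contains the bytes of `sig` starting at `offset`.
function hasBytesAt(buf: Uint8Array, sig: number[], offset = 0): boolean {
  if (buf.length < offset + sig.length) return false;
  return sig.every((byte, i) => buf[offset + i] === byte);
}

// Identify a few common image formats from their leading bytes.
function sniffImageMime(buf: Uint8Array): string | undefined {
  if (hasBytesAt(buf, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (hasBytesAt(buf, [0xff, 0xd8, 0xff])) return "image/jpeg";
  if (hasBytesAt(buf, [0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  // WebP is a RIFF container: "RIFF" at offset 0 and "WEBP" at offset 8.
  if (hasBytesAt(buf, [0x52, 0x49, 0x46, 0x46]) && hasBytesAt(buf, [0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return undefined;
}

// Reject buffers whose leading bytes do not match the declared image type.
function assertLooksLikeDeclaredImage(declaredMime: string, buf: Uint8Array): void {
  const sniffed = sniffImageMime(buf);
  if (sniffed === undefined) {
    throw new Error(`Content is not a recognized image format (declared ${declaredMime})`);
  }
  if (sniffed !== declaredMime) {
    throw new Error(`Declared ${declaredMime} but content looks like ${sniffed}`);
  }
}
```

Formats outside this short table would be rejected by the sketch, so a real check would need a signature list that matches the configured allowlist.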


Analyzed PR: #38146 at commit 3552736

Last updated on: 2026-03-06T17:02:22Z

@vincentkoc vincentkoc merged commit 9521e61 into main Mar 6, 2026
8 checks passed
@vincentkoc vincentkoc deleted the vincentkoc-code/telegram-media-max-config branch March 6, 2026 16:54
@greptile-apps
Contributor

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR is a targeted follow-up to the HEIC input_image feature (#38122), fixing two regressions it introduced:

  1. MIME sniffing scope: All images were being passed through detectMime(), unintentionally changing behavior for non-HEIC inputs. The fix gates detectMime behind a HEIC_INPUT_IMAGE_MIMES check in normalizeInputImage() (lines 238-242 of src/media/input-files.ts), ensuring only declared HEIC/HEIF inputs trigger MIME sniffing. Non-HEIC images now use their declared MIME directly, preserving backward compatibility.

  2. Budget accounting: maxTotalImageBytes was counting pre-conversion HEIC base64 size. The fix in resolveImagesForRequest() (lines 300-312 of src/gateway/openai-http.ts) now:

    • Checks pre-extraction bytes early to avoid expensive conversion when already over budget
    • Always counts post-extraction bytes (post-JPEG normalization) against the total
    • This ensures HEIC→JPEG-inflated payloads are correctly charged against the configured limit
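
The two-stage accounting described in item 2 can be sketched as below. The PendingImage shape and the chargeImages name are hypothetical, and the sketch is synchronous for brevity, whereas the real resolveImagesForRequest() is not shown here and may differ in structure and error handling.

```typescript
// Hypothetical shape for an image awaiting normalization.
interface PendingImage {
  inputBytes: number; // size of the payload as received
  normalize: () => { bytes: number }; // stands in for extraction / HEIC-to-JPEG conversion
}

// Charge each image against the budget using its post-normalization size.
function chargeImages(images: PendingImage[], maxTotalImageBytes: number): number {
  let total = 0;
  for (const image of images) {
    // Fast path: skip the conversion entirely if the raw input already overflows.
    if (total + image.inputBytes > maxTotalImageBytes) {
      throw new Error("Total image bytes exceed maxTotalImageBytes (pre-normalization)");
    }
    // Count the normalized size exactly once, whether or not conversion changed it.
    total += image.normalize().bytes;
    if (total > maxTotalImageBytes) {
      throw new Error("Total image bytes exceed maxTotalImageBytes (post-normalization)");
    }
  }
  return total;
}
```

With a 200-byte limit, a 100-byte input that normalizes to 300 bytes passes the early check but fails the second one, while two unchanged 100-byte images total exactly 200 and are accepted.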

Test coverage is comprehensive and hermetic:

  • input-files.fetch-guard.test.ts (lines 102-125): Verifies non-HEIC images skip detectMime via mock assertions
  • openai-http.image-budget.test.ts (lines 21-40): Confirms normalized payloads are counted post-extraction
  • openai-http.image-budget.test.ts (lines 42-67): Verifies unchanged payloads aren't double-counted

CHANGELOG entry (line 141) accurately documents the follow-up changes.

Confidence Score: 5/5

  • This PR is safe to merge—it tightens two well-scoped regressions with clear tests and no behavioral changes outside the stated scope.
  • All three code changes (MIME scoping, budget accounting, test hermeticity) are logically correct and verified by comprehensive tests. The budget accounting correctly counts post-extraction bytes for all source types, and the early-rejection fast-path for base64 is safe because HEIC→JPEG conversion typically produces larger (not smaller) payloads. The MIME sniffing is properly scoped to declared HEIC/HEIF inputs only, verified by explicit test assertions. No security surface changes, no API contract changes, and non-HEIC behavior is explicitly locked by a new test.
  • No files require special attention.

Last reviewed commit: 3552736

ant1eicher pushed a commit to ant1eicher/openclaw that referenced this pull request Mar 6, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
Saitop pushed a commit to NomiciAI/openclaw that referenced this pull request Mar 8, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
jenawant pushed a commit to jenawant/openclaw that referenced this pull request Mar 10, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
dhoman pushed a commit to dhoman/chrono-claw that referenced this pull request Mar 11, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
senw-developers pushed a commit to senw-developers/va-openclaw that referenced this pull request Mar 17, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
V-Gutierrez pushed a commit to V-Gutierrez/openclaw-vendor that referenced this pull request Mar 17, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 20, 2026
* Media: scope HEIC MIME sniffing

* Media: hermeticize HEIC input tests

* Gateway: fix HEIC image budget accounting

* Gateway: add HEIC image budget regression test

* Changelog: note HEIC follow-up fix

(cherry picked from commit 9521e61)