[codex] fix weixin pdf downloads #302

Merged
everettjf merged 2 commits into main from codex/weixin-pdf-downloads
Mar 26, 2026

@everettjf
Contributor

This fixes Weixin inbound PDF handling so uploaded documents are either stored as real files or explicitly rejected instead of being surfaced as unusable garbage data.

Before this change, the Weixin runtime only summarized inbound file_item payloads as text and did not persist them at all. After adding persistence, it became clear that the CDN download path could still save encrypted or malformed payloads as .pdf files because the upstream metadata is inconsistent: encrypt_type may be 0 even when an aes_key is present, and the key itself may be encoded in more than one way. In practice that meant users could send a PDF, see a saved path, and still end up with a file whose contents were not actually a PDF.
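To make the key ambiguity concrete, here is a minimal sketch of how candidate keys might be derived from a single aes_key string. It is not the code from this PR: the description does not name the encodings involved, so raw bytes, hex, and base64 are assumptions, and `candidate_keys`, `decode_hex`, and `decode_base64` are hypothetical helper names.

```rust
/// AES key lengths in bytes (AES-128, AES-192, AES-256).
const AES_KEY_LENS: [usize; 3] = [16, 24, 32];

/// Decode a hex string; None on odd length or any non-hex character.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Decode standard-alphabet base64 (padding optional); None on invalid characters.
fn decode_base64(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for b in s.trim_end_matches('=').bytes() {
        let v = match b {
            b'A'..=b'Z' => b - b'A',
            b'a'..=b'z' => b - b'a' + 26,
            b'0'..=b'9' => b - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return None,
        };
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Assumed interpretations of the key field: raw bytes, hex, base64.
/// Only results with a valid AES key length are kept, without duplicates.
fn candidate_keys(raw: &str) -> Vec<Vec<u8>> {
    let decoded = vec![
        Some(raw.as_bytes().to_vec()),
        decode_hex(raw),
        decode_base64(raw),
    ];
    let mut keys: Vec<Vec<u8>> = Vec::new();
    for key in decoded.into_iter().flatten() {
        if AES_KEY_LENS.contains(&key.len()) && !keys.contains(&key) {
            keys.push(key);
        }
    }
    keys
}
```

Under these assumptions a 32-character hex string is a plausible key three ways (32 raw bytes, 16 hex-decoded bytes, 24 base64-decoded bytes), which is why key length alone cannot pick the right interpretation and each candidate has to be tried and its output validated.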

The fix adds an inbound file pipeline for Weixin that downloads file_item.media, attempts decryption across the key encodings and AES key sizes observed in the payloads, validates the resulting bytes against the expected file signature, and only then writes into the working directory uploads tree. For PDFs specifically, the runtime now requires a valid %PDF- header and tolerates a short prefix before the header if Weixin prepends extra bytes. When validation fails, the runtime records a clear download_failed=... note instead of writing a bad file.
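The validation step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the merged implementation: `MAX_PREFIX`, `pdf_header_offset`, `validate_pdf`, and the `Inbound` enum are hypothetical names, the 64-byte prefix limit is an arbitrary choice, and whether the runtime strips the prefix or keeps it is not stated in the description (this sketch strips it).

```rust
const PDF_MAGIC: &[u8] = b"%PDF-";

/// Hypothetical limit on how many bytes may precede the header.
const MAX_PREFIX: usize = 64;

/// Offset of the `%PDF-` header if it starts within the first
/// `MAX_PREFIX` bytes (inclusive), otherwise None.
fn pdf_header_offset(bytes: &[u8]) -> Option<usize> {
    bytes
        .windows(PDF_MAGIC.len())
        .take(MAX_PREFIX + 1)
        .position(|w| w == PDF_MAGIC)
}

/// Outcome of validating a downloaded (and possibly decrypted) payload.
#[derive(Debug, PartialEq)]
enum Inbound<'a> {
    /// Bytes that are safe to write to the uploads tree.
    Save(&'a [u8]),
    /// Reason to record instead of writing a file (wording is hypothetical).
    Reject(&'static str),
}

/// Accept the payload only if a PDF header is found near the start.
/// Assumption: any bytes before the header are dropped before saving.
fn validate_pdf<'a>(bytes: &'a [u8]) -> Inbound<'a> {
    match pdf_header_offset(bytes) {
        Some(offset) => Inbound::Save(&bytes[offset..]),
        None => Inbound::Reject("missing %PDF- header"),
    }
}
```

In a pipeline like the one described, each decryption candidate would be passed through a check of this kind, and nothing would be written to disk unless one of them comes back as `Save`.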

The change also improves observability around this path. We now log whether media metadata exists, whether an AES key was present, the raw key length, candidate decoded key lengths, and the truncated media JSON on failure. That makes further protocol mismatches debuggable without having to inspect the database or saved files manually.
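As a rough illustration of the diagnostics listed above, the sketch below builds a single failure line from those fields. The field names, the `describe_failure` and `truncate_chars` helpers, and the 200-character cap are all hypothetical; the actual log format in the PR may differ.

```rust
/// Hypothetical cap on how much media JSON goes into one log line.
const MAX_JSON_CHARS: usize = 200;

/// Truncate to at most `max_chars` characters without splitting a
/// multi-byte UTF-8 character.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((idx, _)) => &s[..idx],
        None => s,
    }
}

/// Build a one-line description of a failed inbound download.
/// Only key lengths are reported; the key material itself is never logged.
fn describe_failure(
    media_json: Option<&str>,
    aes_key: Option<&str>,
    candidate_lens: &[usize],
) -> String {
    format!(
        "has_media={} has_aes_key={} raw_key_len={} candidate_key_lens={:?} media={}",
        media_json.is_some(),
        aes_key.is_some(),
        aes_key.map_or(0, |k| k.len()),
        candidate_lens,
        truncate_chars(media_json.unwrap_or(""), MAX_JSON_CHARS)
    )
}
```

Reporting lengths rather than key bytes keeps the line useful for spotting an encoding mismatch while avoiding secrets in logs.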

Validated with cargo test weixin -- --nocapture, which includes new tests covering successful inbound download-and-save behavior and rejection of invalid PDF payloads.

@everettjf everettjf marked this pull request as ready for review March 25, 2026 17:36
@everettjf everettjf self-assigned this Mar 25, 2026
@everettjf everettjf added the bug and enhancement labels Mar 25, 2026
@everettjf everettjf merged commit daba3e9 into main Mar 26, 2026
10 checks passed