Validate actual base64 data size in FileUpload, not client-reported size#3816
Conversation
The store_files tool checked the client-provided `size` field to enforce max_file_size, but this field is untrusted input: a client could set size=1 while sending a multi-megabyte payload, bypassing the limit. The tool now computes the actual size from the base64 data length instead.

🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
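A minimal sketch of the enforcement change described above (function and parameter names are hypothetical, not the PR's actual code): the limit check derives the size from the base64 payload itself, so a forged `size` field has no effect.

```python
import base64

def check_upload(data_b64: str, max_file_size: int) -> int:
    """Enforce max_file_size from the actual base64 payload length,
    never from a client-reported size field (untrusted input)."""
    n = len(data_b64)
    # Canonical base64 carries 0, 1, or 2 trailing '=' padding chars,
    # all within the last two characters of the string.
    padding = data_b64.count("=", max(0, n - 2))
    actual_size = n * 3 // 4 - padding
    if actual_size > max_file_size:
        raise ValueError(
            f"file exceeds max_file_size ({actual_size} > {max_file_size})"
        )
    return actual_size

# A client claiming size=1 while sending 2 KiB is still rejected.
big = base64.b64encode(b"x" * 2048).decode()
try:
    check_upload(big, max_file_size=1024)
    rejected = False
except ValueError:
    rejected = True
```

Note that the check never decodes the payload; it only inspects the string length and trailing padding, so it is cheap even for large inputs.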
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aec827c528
```python
padding = b64.count("=", max(0, n - 2))
return n * 3 // 4 - padding
```
Enforce limits from strict decoded byte length
`_b64_decoded_size` uses `n * 3 // 4` minus trailing `=`, which is exact only for canonical base64 without extra formatting, but this provider later decodes with permissive `base64.b64decode(...)` semantics. As a result, folded payloads (for example from `base64.encodebytes`, which inserts newlines) are overcounted and can be rejected even when their decoded size is within `max_file_size` (e.g. 100 decoded bytes can be computed as 102), while some non-canonical trailing `=` forms are undercounted by 1 byte. Because `store_files` now depends on this helper for enforcement, the gate does not check the true decoded size for decodable inputs.
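The overcount Codex describes can be reproduced directly (the figures below assume `base64.encodebytes`' standard 76-character line wrapping; the helper body is taken from the diff under review):

```python
import base64

payload = b"x" * 100
# encodebytes inserts a newline after every 76 characters and at the end,
# so the encoded string is longer than canonical base64.
folded = base64.encodebytes(payload).decode()

n = len(folded)
padding = folded.count("=", max(0, n - 2))  # window sees "=\n", counts only 1
estimated = n * 3 // 4 - padding

# Permissive b64decode discards the newlines and recovers all 100 bytes,
# but the length-based estimate overshoots.
true_size = len(base64.b64decode(folded))
```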
The data field arrives as a JSON string value (it is a property inside a tool call arguments dict), so it cannot contain raw newlines or other formatting — JSON string escaping guarantees canonical base64. base64.encodebytes() line-wrapping is not a concern here since the MCP transport serializes everything through JSON.
For non-canonical trailing `=` padding: standard base64 always produces 0, 1, or 2 `=` characters, and the `count("=", max(0, n-2))` window handles all three cases exactly. A payload would need to be malformed (e.g. `====`) to produce an off-by-one, and `base64.b64decode` would reject that anyway downstream in `on_store`.
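The three canonical padding cases from the reply above can be verified with a standalone sketch (`_b64_decoded_size` is reproduced from the diff under review):

```python
import base64

def _b64_decoded_size(b64: str) -> int:
    # Trailing '=' can only appear in the last two characters of
    # canonical base64, so this window covers 0, 1, and 2 padding chars.
    n = len(b64)
    padding = b64.count("=", max(0, n - 2))
    return n * 3 // 4 - padding

# Inputs of length 3, 2, and 1 bytes produce 0, 1, and 2 '=' chars
# respectively; the estimate matches the true decoded length each time.
results = []
for raw in (b"abc", b"ab", b"a"):
    enc = base64.b64encode(raw).decode()
    results.append(_b64_decoded_size(enc) == len(raw))
```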
jlowin left a comment
Correct fix — computing actual base64 payload size instead of trusting client-reported values closes the bypass cleanly. Tests are thorough.
Summary
The `store_files` tool in `FileUpload` enforced `max_file_size` by checking the client-provided `size` field, which is untrusted input. A client could set `size: 1` while sending a multi-megabyte base64 payload, completely bypassing the limit.

This PR computes the actual data size from the base64 payload length (`len(data) * 3 // 4`) and uses that for enforcement instead.

Note: since MCP delivers all tool arguments as a single JSON payload, the data is already in server memory by the time the tool runs. This fix prevents oversized files from being stored, but transport-level request size limits (e.g. uvicorn's `--limit-request-line`) are needed to prevent receipt entirely.

Closes #3815
🤖 Generated with Claude Code