Skip to content

fix: materialize generators before result conversion, handle bytes gracefully#3830

Merged
jlowin merged 1 commit intomainfrom
fix/tool-return-type-handling
Apr 11, 2026
Merged

fix: materialize generators before result conversion, handle bytes gracefully#3830
jlowin merged 1 commit intomainfrom
fix/tool-return-type-handling

Conversation

@strawgate
Copy link
Copy Markdown
Collaborator

Tool return values had three classes of bugs in the result conversion pipeline: async generators were never consumed (tool output was the generator's repr string), sync generators were consumed during text serialization but exhausted before structured content was built (inconsistent text vs structured), and non-UTF8 bytes crashed with PydanticSerializationError.

@mcp.tool
async def stream(count: int) -> AsyncIterator[str]:
    for i in range(count):
        yield f"item {i}"

# Before: data=None, content="<async_generator object stream at 0x...>"
# After:  data=None, content='["item 0","item 1","item 2"]'
@mcp.tool()
def gen_tool() -> list:
    return (x for x in [1, 2, 3])

# Before: text=[1,2,3], structuredContent={"result": []}  (inconsistent!)
# After:  text=[1,2,3], structuredContent={"result": [1,2,3]}
@mcp.tool()
def binary() -> bytes:
    return bytes(range(256))

# Before: PydanticSerializationError: invalid utf-8 sequence
# After:  base64-encoded text content, no crash

Fixes #3829

🤖 Generated with Claude Code

@marvin-context-protocol marvin-context-protocol Bot added bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality. labels Apr 10, 2026
@marvin-context-protocol
Copy link
Copy Markdown
Contributor

marvin-context-protocol Bot commented Apr 10, 2026

Test Failure Analysis

Note

Updated to reflect latest analysis.

Summary: The static_analysis job failed because codespell found misspelled words in .codex/refresh/refresh-20260409-193437.md, a local tooling artifact that was accidentally committed in this PR.

Root Cause: The .codex/refresh/refresh-20260409-193437.md file should not have been included in this PR. It's a Codex-generated local workspace snapshot containing absolute paths to the author's machine (/Users/bill.easton/repos/fastmcp/...) and GitHub issue titles (which happen to contain misspellings). The codespell hook flagged:

Suggested Solution: Remove .codex/refresh/refresh-20260409-193437.md (and the .codex/ directory entirely) from this PR. Since the misspellings are in copied GitHub issue titles, "fixing" the spelling in the file would be inaccurate — and the file itself is a local machine artifact that shouldn't be committed.

git rm -r .codex/
git commit --amend --no-edit   # or a new commit
Detailed Analysis

Failing hook: codespell (exit code 65)

Log excerpt:

codespell................................................................[Failed]
- hook id: codespell
- exit code: 65

  .codex/refresh/refresh-20260409-193437.md:290: publically ==> publicly
  .codex/refresh/refresh-20260409-193437.md:298: inheritence ==> inheritance

Why the file shouldn't be committed: The file contains the author's local absolute path:

- Repo root: /Users/bill.easton/repos/fastmcp/.claude/worktrees/immutable-herding-crystal

This is a local workspace snapshot, not a project artifact. All other checks (ruff, prettier, ty, loq) passed — this is the only failure.

Related Files
  • .codex/refresh/refresh-20260409-193437.md — the accidentally committed Codex refresh snapshot that triggered the failure (should be removed)
  • pyproject.toml — contains [tool.codespell] config with ignore-words-list = "asend,shttp,te" (not relevant here; the real fix is removing the file)
  • .pre-commit-config.yaml — defines the codespell hook that caught the issue

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c6ea1d2c8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/fastmcp/tools/function_tool.py Outdated
Comment thread src/fastmcp/tools/base.py
Comment on lines +293 to +295
# Bytes can't be represented as structured JSON content
if isinstance(raw_value, bytes):
return ToolResult(content=content)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve structured content when output schema exists

This unconditional bytes fast-path drops structured_content even when an output schema is declared, creating schema/result mismatches for cases like bytes | None (which still auto-generates an output schema) or explicit output_schema usage. In those scenarios, clients receive data=None despite the server advertising structured output. The bytes shortcut should be conditional on self.output_schema is None or bytes should be normalized into a schema-compatible structured value.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — low risk in practice because bytes return types no longer generate an output_schema (suppressed in function_parsing.py), so the early return only fires when there's no schema to violate. The bytes | None edge case is real but unlikely. Leaving as-is for now.

@strawgate strawgate force-pushed the fix/tool-return-type-handling branch from c0e45f5 to 14194a8 Compare April 10, 2026 22:07
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c0e45f5479

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/fastmcp/tools/function_parsing.py Outdated
Comment on lines +209 to +210
if output_type is bytes:
output_type = _UnserializableType
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Suppress schemas for wrapped bytes return annotations

The new bytes guard only handles the exact type check output_type is bytes, so wrapped forms like Annotated[bytes, ...] still auto-generate an output schema. At runtime, convert_result now unconditionally returns content-only for any bytes value, so these tools advertise structured output but return structured_content=None (I reproduced this with an Annotated[bytes, ...] tool). This schema/result mismatch can break clients that validate tool responses against the declared output schema.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid edge case. Annotated[bytes, ...] is uncommon for tool return types — the primary use case is plain bytes. Leaving as-is; a follow-up could unwrap Annotated to check the inner type.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added _contains_bytes_type() that recurses through Union and Annotated, matching the same pattern used for prefab types. Annotated[bytes, ...] now correctly suppresses output_schema.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 14194a8297

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/fastmcp/tools/base.py
Comment on lines +499 to +505
if isinstance(item, bytes):
try:
return TextContent(type="text", text=item.decode("utf-8"))
except UnicodeDecodeError:
import base64

return TextContent(type="text", text=base64.b64encode(item).decode("ascii"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Route bytes through custom serializer when provided

The new bytes fast path returns decoded/base64 text directly, which skips _serialize_with_fallback and therefore never calls a tool’s configured serializer. This regresses deprecated-but-supported serializer behavior for byte-returning tools (for example Tool.from_function(..., serializer=...) now ignores the serializer and emits built-in decoding instead), so clients relying on a custom wire format for bytes will receive different content after this change.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged. The serializer parameter is already deprecated. The bytes fast path is intentional — custom serializers can't meaningfully handle non-UTF8 binary data anyway.

@strawgate strawgate force-pushed the fix/tool-return-type-handling branch from 14194a8 to fdfeda8 Compare April 11, 2026 00:35
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fdfeda8612

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +297 to +298
if inspect.isgenerator(result):
return list(result)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Materialize sync generators in a cancellable context

Using list(result) here runs the sync generator body on the event-loop thread, so anyio.fail_after(...) cannot preempt blocking work inside the generator (no cancellation checkpoints are hit during iteration). Fresh evidence: with timeout=0.1 and a tool returning a sync generator that does time.sleep(0.2) before yielding, call_tool completed after ~0.21s without raising TimeoutError. This leaves timeout guarantees broken for sync generators and can stall concurrent requests.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same fundamental limitation as sync tool timeouts (bug 024 / existing skipped test). anyio.fail_after can't interrupt OS threads. The generator materialization inherits this pre-existing behavior — list(result) on the event loop thread is no different from the sync tool execution itself. Not introduced by this PR.

…acefully

- Detect async/sync generators after tool execution and materialize
  into lists before the result conversion pipeline processes them
- Generator materialization runs inside timeout scope so slow generators
  respect the configured timeout
- Handle bytes return types: UTF-8 bytes as text, non-UTF-8 as base64
- Suppress output_schema for bytes return types (can't be structured JSON)
- Catch UnicodeDecodeError alongside PydanticSerializationError in
  convert_result for robustness

Fixes #3829

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@strawgate strawgate force-pushed the fix/tool-return-type-handling branch from fdfeda8 to cf15349 Compare April 11, 2026 04:01
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cf153491b2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +47 to +48
if origin is Union or origin is types.UnionType or origin is Annotated:
return any(_contains_bytes_type(a) for a in get_args(tp))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Detect literal bytes before generating output schemas

_contains_bytes_type only recurses through Union and Annotated, so return annotations like Literal[b"abc"] are missed and still get an inferred output_schema. After this commit, Tool.convert_result unconditionally returns structured_content=None for any runtime bytes value (src/fastmcp/tools/base.py), so these tools now advertise structured output but never provide it, which can break clients that validate responses against the declared schema. Fresh evidence: a tool annotated -> Literal[b"abc"] currently exposes a wrapped schema but returns only text content.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Literal[b"abc"] as a tool return type is not a realistic pattern. Not addressing.

Copy link
Copy Markdown
Member

@jlowin jlowin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sound approach — materializing generators before conversion addresses the root cause, and the bytes fallback path is sensible.

@jlowin jlowin merged commit eec52b1 into main Apr 11, 2026
12 of 15 checks passed
@jlowin jlowin deleted the fix/tool-return-type-handling branch April 11, 2026 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tool return type handling: generators, None, bytes, non-dict output_schema

2 participants