
Add search_result_serializer hook and serialize_tools_for_output_markdown#3337

Merged
jlowin merged 3 commits into PrefectHQ:main from MagnusS0:feat/code-mode-markdown-serializer
Mar 1, 2026

Conversation

@MagnusS0
Contributor

@MagnusS0 MagnusS0 commented Feb 28, 2026

Description

While migrating the OpenBB MCP server to FastMCP v3 I was playing around with CodeMode and noticed that search results were eating a lot of the context savings it's supposed to provide. The LLM gets back full JSON tool definitions (same as list_tools), which is fine for simple tools but balloons fast with anything schema-heavy.

My solution is a search_result_serializer hook on BaseSearchTransform (and CodeMode), so you can swap in whatever serialization makes sense for your use case. The built-in serialize_tools_for_output_markdown strips the JSON boilerplate and renders just what the LLM actually needs to pick and call a tool. In my simple benchmark across an 11-tool catalog with some complex OpenBB-style schemas, it cut search result tokens by ~65-70% (benchmark script). Default behavior is unchanged; this is fully opt-in.

from fastmcp import FastMCP
from fastmcp.experimental.transforms import CodeMode
from fastmcp.server.transforms.search import serialize_tools_for_output_markdown

mcp = FastMCP("Server", transforms=[
    CodeMode(search_result_serializer=serialize_tools_for_output_markdown)
])

The markdown output for a tool looks like this:

### create_document

Create a new document in the workspace with the given metadata.

**Parameters**
- `title` (string, required)
- `content` (string, required)
- `tags` (string[], required)
- `author` (string, required)
- `published` (boolean)
- `parent_id` (string?)

**Returns**
- `value` (object)

Both built-in serializers (serialize_tools_for_output_json and serialize_tools_for_output_markdown) are exported from fastmcp.server.transforms.search and work on standalone search transforms too, not just CodeMode.
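For illustration, the markdown shape above could be produced by something like this minimal standalone sketch (hypothetical code showing the general rendering approach, not the library's actual serializer implementation):

```python
# Illustrative sketch only: a minimal re-implementation of the kind of
# markdown rendering described above, not FastMCP's actual serializer.
def tool_to_markdown(name: str, description: str, schema: dict) -> str:
    lines = [f"### {name}", "", description, "", "**Parameters**"]
    required = set(schema.get("required", []))
    for param, spec in schema.get("properties", {}).items():
        type_ = spec.get("type", "any")
        suffix = ", required" if param in required else ""
        lines.append(f"- `{param}` ({type_}{suffix})")
    return "\n".join(lines)

md = tool_to_markdown(
    "create_document",
    "Create a new document in the workspace.",
    {
        "properties": {"title": {"type": "string"}, "published": {"type": "boolean"}},
        "required": ["title"],
    },
)
# md contains "### create_document" followed by the parameter bullets.
```

The point is simply that a few lines of formatting replace a full JSON Schema dump, which is where the token savings come from.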

Generated with Claude Code

Contributors Checklist

Review Checklist

  • I have self-reviewed my changes
  • My Pull Request is ready for review

…down

Adds a `search_result_serializer` hook to `BaseSearchTransform` (and by
extension `BM25SearchTransform` and `RegexSearchTransform`) so callers
can control how search results are serialized before being returned to
the LLM. The same hook is available on `CodeMode`.

Adds a new built-in `serialize_tools_for_output_markdown` serializer that
renders tool definitions as compact markdown (~65-70% fewer tokens than
the default JSON format). The existing JSON serializer is now also public
as `serialize_tools_for_output_json`. Both are exported from
`fastmcp.server.transforms.search`.

🤖 Generated with Claude Code
@marvin-context-protocol marvin-context-protocol Bot added enhancement Improvement to existing functionality. For issues and smaller PR improvements. server Related to FastMCP server implementation or server-side functionality. labels Feb 28, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d70767e79


CodeMode now falls back to the search transform's own serializer when no
explicit search_result_serializer is set, so a pre-configured transform
(e.g. BM25SearchTransform(search_result_serializer=...)) is no longer
silently ignored.

_schema_section now distinguishes between a missing properties key (single
unnamed value) and an empty properties dict (zero-argument tool), rendering
the latter as "*(no parameters)*" instead of a fake value argument.
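The distinction can be sketched like this (hypothetical code mirroring the described behavior, not the actual `_schema_section` implementation):

```python
# Hypothetical sketch of the missing-vs-empty `properties` distinction
# described above (not the real _schema_section code).
def params_section(schema: dict) -> str:
    if "properties" not in schema:
        # No properties key at all: treat as a single unnamed value.
        return "- `value` (object)"
    if not schema["properties"]:
        # Properties key present but empty: a zero-argument tool.
        return "*(no parameters)*"
    return "\n".join(f"- `{name}`" for name in schema["properties"])
```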

🤖 Generated with Claude Code
@marvin-context-protocol
Contributor

Test Failure Analysis

Summary: The CI failure is a pre-existing flaky timeout test unrelated to this PR's changes. The same test (TestTimeout::test_timeout_client_timeout_does_not_override_tool_call_timeout_if_lower) has failed across multiple unrelated branches on the same day.

Root Cause: tests/client/test_sse.py::TestTimeout::test_timeout_client_timeout_does_not_override_tool_call_timeout_if_lower uses very tight timing thresholds (client timeout=0.1s, tool sleep of 0.03s, per-call timeout of 2s). Under CI load, the connection initialization itself — the __aenter__ / initialize() handshake — consumes the 0.1s client timeout before the actual call_tool is ever invoked. The error is thrown during session setup (mcp/shared/session.py:294), not during the tool call.

This PR (feat/code-mode-markdown-serializer) did not touch tests/client/test_sse.py or any timeout-related client code. An earlier run of this exact branch (22528048506) passed all tests cleanly.

Suggested Solution: Re-run the failed job — this is a CI infrastructure flake. The underlying test itself should be fixed separately by either relaxing its timing margins or marking it with @pytest.mark.flaky to tolerate intermittent failures.

Detailed Analysis

Failing test (tests/client/test_sse.py:188-200):

async def test_timeout_client_timeout_does_not_override_tool_call_timeout_if_lower(
    self, sse_server: str
):
    async with Client(
        transport=SSETransport(sse_server),
        timeout=0.1,   # <-- only 100ms for entire connection
    ) as client:
        await client.call_tool("sleep", {"seconds": 0.03}, timeout=2)

Error from logs:

E  mcp.shared.exceptions.McpError: Timed out while waiting for response to ClientRequest. Waited 0.1 seconds.
   .venv/lib/python3.10/site-packages/mcp/shared/session.py:294: McpError

The timeout fires at client.py:476 during self._session_state.initialize_result = await self.session.initialize(), which is the MCP session handshake that happens inside async with Client(...) as client:. The 0.1s budget is fully consumed by the SSE handshake under CI load, never reaching call_tool.

Same test failed on unrelated branches today:

  • Run 22525457770 (branch claude/review-issue-3035-E6W3s) — same error, same test
  • Run 22528355900 (this PR) — same error, same test

This PR's changes are confined to:

  • src/fastmcp/experimental/transforms/code_mode.py
  • src/fastmcp/server/transforms/search/
  • docs/servers/transforms/code-mode.mdx

None of these files are involved in the failing test.

Related Files
  • /home/runner/work/fastmcp/fastmcp/tests/client/test_sse.py — contains the flaky test (lines 188-200)
  • /home/runner/work/fastmcp/fastmcp/src/fastmcp/client/client.py — where timeout fires during initialize() (line 476)
  • /home/runner/work/fastmcp/fastmcp/src/fastmcp/experimental/transforms/code_mode.py — PR changes (unrelated)
  • /home/runner/work/fastmcp/fastmcp/src/fastmcp/server/transforms/search/base.py — PR changes (unrelated)

🤖 Generated with Claude Code

@MagnusS0
Contributor Author

A more general comment on token usage and search tools, maybe for a follow-up PR.
From a context perspective, I have found the two-step search approach taken by mcp-cli:

  1. candidate selection (name + short description)
  2. full schema for only 1-2 finalists

to be a lot more token-efficient; it also scales better when you have a lot of tools.

Could be something like:

# default `summary`: name, short description, tags/category
# optional `detail="full"` for current behavior

search(query, detail="summary")

# return full input/output schema for selected tool(s)
get_schema(name | [name1, name2, ...])

@jlowin
Member

jlowin commented Mar 1, 2026

Thanks @MagnusS0 this is really solid work! The serializer hook and the markdown rendering are both well thought out, and I totally agree with the direction you're going here. Your follow-up comment about two-stage search actually resolves some tension I've been feeling about how search results scale with complex tool catalogs, so thanks for surfacing that.

Here's where I think this should land architecturally: we'll split the current search tool into a progressive disclosure flow —

  • search returns just names and short descriptions, super lightweight, with an option to get more detail
  • get_schema takes tool name(s) and returns abbreviated markdown (your serializer) by default, with an option to get the full JSON schema when the LLM needs it
  • execute stays as-is

The LLM regulates its own token budget: scan candidates cheaply, pull schema detail only for the tools it actually cares about. A sensible default would let users err toward more detail on simple servers (where two calls are worse than one due to latency) or less detail on complex servers (where two scoped calls are probably more efficient overall).

Since you already suggested this as a follow-up, I'd love to merge this as-is and then quickly follow up with a PR that implements the staged approach on top. You've already done the heavy lifting here with the serializer — the rest is reshaping the tool surface around it. This will all land together before it ships. Ideally targeting later this week or next.

@jlowin jlowin added this to the 3.1 milestone Mar 1, 2026
@jlowin jlowin merged commit 1e72f24 into PrefectHQ:main Mar 1, 2026
6 checks passed
@MagnusS0
Contributor Author

MagnusS0 commented Mar 1, 2026

Thanks @jlowin, really glad it landed well and that you like the direction!

One more thought for the staged approach, since you're designing the tool surface now anyway: for MCPs or converted APIs with, say, 10–20 domains, even a lightweight search over all candidates can get noisy. A list_categories() call as a cheap first step would let the agent scope before it does anything else:

list_categories()   # ["equity", "sec", "etf", ...]
search("price", category="equity", detail="summary")   # scoped candidates
get_schema("price_history")   # full detail

This could map to MCP tags if tools already have them, so there's no extra annotation burden. For servers where tags aren't set, maybe don't expose the tool at all.
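Deriving categories from existing tool tags could be as simple as this (hypothetical helper names; assumes each tool carries a set of tags):

```python
# Hypothetical sketch: derive search categories from MCP tool tags,
# so servers that already tag their tools get scoping for free.
def list_categories(tools: dict[str, set[str]]) -> list[str]:
    """tools maps tool name -> set of tags."""
    categories: set[str] = set()
    for tags in tools.values():
        categories |= tags
    return sorted(categories)

def search_in_category(tools: dict[str, set[str]], category: str) -> list[str]:
    """Restrict candidate selection to one category before ranking."""
    return sorted(name for name, tags in tools.items() if category in tags)

tools = {
    "price_history": {"equity"},
    "balance_sheet": {"equity", "fundamentals"},
    "filing_search": {"sec"},
}
print(list_categories(tools))               # ['equity', 'fundamentals', 'sec']
print(search_in_category(tools, "equity"))  # ['balance_sheet', 'price_history']
```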

One pattern I've hit personally: sub-categories (e.g. equity → {price, fundamentals}). Probably niche, but if it's easy to expose as an opt-in, say a custom categories mapping on the transform, it would be very nice.

Just flagging this before the surface is locked in; it's easier to bake in now than retrofit later.

@jlowin
Member

jlowin commented Mar 1, 2026

So there is this interesting fallback (we actually started with it, but it's too complicated for simple cases) where you can use the call tool to actually learn what the tools are. We could even expose the schema as a data structure and allow an LLM to process it arbitrarily. I think it's an interesting extreme case. Maybe we find a way to support a spectrum of tools with some signature. I'll think on it!



Development

Successfully merging this pull request may close these issues.

CodeMode search results inflate context for tools with complex schemas

2 participants