Fix session persistence across tool calls in multi-server MCPConfigTransport #3330
jlowin merged 10 commits into PrefectHQ:main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ae71307f01
Test Failure Analysis

Summary: One timing-sensitive test failed in the "Tests with lowest-direct dependencies" CI job.

Root Cause:

Suggested Solution: Make the test less sensitive to wall-clock timing by either:

The companion test

Detailed Analysis

Failing test:

Log excerpt:
```python
async def _ping_loop(self, session: Any, session_id: int) -> None:
    try:
        while True:
            await anyio.sleep(self.interval_ms / 1000)  # sleep FIRST, then ping
            await session.send_ping()
    finally:
        self._active_sessions.discard(session_id)
```

Note the sleep-first pattern: the loop sleeps 50 ms, sends a ping, sleeps again, and so on. With a 350 ms window the math predicts ~7 pings, but the "lowest-direct dependencies" matrix job (which pins minimum compatible versions) ran on a busy runner, and cumulative sleep drift consumed almost the entire window after just one iteration. Only the "lowest-direct dependencies" job failed (1 fail / 4619 pass); all other matrix jobs passed.

Related Files
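One way to implement the suggested fix is to assert a loose lower bound on the ping count rather than an exact number. The sketch below is hypothetical (it uses asyncio in place of anyio, and the helper names are illustrative, not the project's actual test code), but it shows the idea: a generous observation window plus a `>= 1` assertion survives scheduler jitter that an exact-count assertion would not.

```python
import asyncio


async def ping_loop(interval_s: float, pings: list[float]) -> None:
    # sleep-first pattern, mirroring the excerpt above
    while True:
        await asyncio.sleep(interval_s)
        pings.append(asyncio.get_running_loop().time())


async def main() -> int:
    pings: list[float] = []
    task = asyncio.create_task(ping_loop(0.01, pings))
    await asyncio.sleep(0.2)  # window much longer than the interval
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    # robust assertion: at least one ping, not an exact count
    assert len(pings) >= 1, "no pings observed in window"
    return len(pings)


print(asyncio.run(main()) >= 1)  # True
```

The alternative mentioned in the analysis, a fake clock, removes wall-clock dependence entirely but requires the loop to accept an injectable sleep function.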
Root cause

`MCPConfigTransport.connect_session()` created `ProxyClient` instances but never connected them before passing them to `create_proxy()`. This meant `_create_client_factory()` saw them as disconnected and used the `client.new()` path, creating a fresh connection (and a new session ID) for every tool call.

Fix
Session persistence (`config.py`): `_create_proxy()` now connects each client via `AsyncExitStack` before mounting it on the composite server. `StatefulProxyClient` is used instead of `ProxyClient` because its context-restoring handler wrappers prevent stale ContextVars in the reused session's receive-loop task; this fixes elicitation, logging, and progress forwarding, which silently failed with `ProxyClient` during testing. Note that a new INFO message is printed to the user when reconnecting to a session; we might consider downgrading it to DEBUG.
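The LIFO unwinding of `AsyncExitStack`, which the cleanup ordering for `_create_proxy()` relies on, can be demonstrated in isolation. The callback names below are placeholders standing in for `transport.close()` and `client._disconnect(force=True)`; the point is only the ordering guarantee.

```python
import asyncio
from contextlib import AsyncExitStack


async def main() -> list[str]:
    order: list[str] = []

    async def record(name: str) -> None:
        order.append(name)

    async with AsyncExitStack() as stack:
        # pushed first, so it runs LAST on unwind
        stack.push_async_callback(record, "transport.close")
        # pushed second, so it runs FIRST on unwind
        stack.push_async_callback(record, "client._disconnect")
    return order


print(asyncio.run(main()))  # ['client._disconnect', 'transport.close']
```

Because callbacks run in reverse push order, pushing the transport close first guarantees the client is disconnected before its transport goes away.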
Since `StatefulProxyClient.__aexit__` is intentionally a no-op (designed for the `new_stateful()` use case), `enter_async_context` alone cannot clean up the connection. Instead, the client is connected manually via `await client.__aenter__()`, and explicit cleanup callbacks are pushed onto the `AsyncExitStack` in LIFO order: `transport.close()` is pushed first (runs last), then `client._disconnect(force=True)` is pushed second (runs first). This ensures the client is force-disconnected before the underlying transport is closed when the stack unwinds.

Context restoration (`proxy.py`): `_restore_request_context()` now also restores `_current_context` so `get_context()` works in the receive-loop task. Following existing patterns, `(RequestContext, weakref[FastMCP])` is stashed instead of the `Context` instance (whose properties are ContextVar-dependent and would resolve stale values), and a fresh `Context` is constructed at restore time, after `request_ctx` has been fixed.

Test fixes (`test_mcp_config.py`): `test_multi_server_timeout_propagation` replaced its `wraps=` spy with a full `AsyncMock` to match the new `async _create_proxy` signature and to avoid real stdio connections that caused hangs.

Contributors Checklist
Review Checklist
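The stash-and-restore pattern described under "Context restoration" (capture a ContextVar's value where it is valid, then re-set it inside another task before use) can be illustrated with a minimal sketch. The variable and function names here are illustrative stand-ins, not FastMCP's actual API.

```python
import asyncio
from contextvars import ContextVar

# hypothetical stand-in for a request-scoped ContextVar
_current_context: ContextVar[str] = ContextVar("_current_context")


async def receive_loop(stashed: str, results: list[str]) -> None:
    # restore the stashed value before any context-dependent work,
    # then reset so the task's context is left clean
    token = _current_context.set(stashed)
    try:
        results.append(_current_context.get())
    finally:
        _current_context.reset(token)


async def main() -> list[str]:
    results: list[str] = []
    _current_context.set("request-42")
    stashed = _current_context.get()  # capture while the request is active
    # a task created here copies the current context, but a long-lived
    # receive loop may predate the request, hence the explicit restore
    await asyncio.create_task(receive_loop(stashed, results))
    return results


print(asyncio.run(main()))  # ['request-42']
```

Stashing the raw value (rather than an object whose properties re-read ContextVars lazily) is what avoids resolving stale values at restore time, which is the same reason the PR stashes `(RequestContext, weakref[FastMCP])` instead of the `Context` instance.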