fix(transport): single Server instance causes response routing to wrong clients #138

@polaz

Description

Problem

The MCP server uses a single global Server instance shared across all Streamable HTTP transport sessions. This causes responses to be routed to the wrong client when multiple agents connect concurrently.

Root Cause

In src/server.ts:49:

export const server = new Server(...)

Each new session calls server.connect(transport), which overwrites this._transport in the SDK's Protocol class:

// @modelcontextprotocol/sdk/shared/protocol.js:215
async connect(transport) { this._transport = transport; }

When Client1 and Client2 are connected:

  1. Client1 connects → this._transport = transport1
  2. Client2 connects → this._transport = transport2 (transport1 orphaned!)
  3. Client1 sends request → _onrequest captures this._transport = transport2 (WRONG!)
  4. Response sent to transport2 → fails with "No connection established for request ID"
  5. Client1 never gets response → hangs indefinitely
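
The sequence above can be reproduced in isolation. The following is a minimal sketch, not the SDK's code: FakeTransport and SharedServer are hypothetical stand-ins that only mimic the "last connect wins" behaviour described in this issue.

```typescript
// Hypothetical stand-ins for the SDK's Transport and Protocol classes.
interface FakeTransport {
  name: string;
  sent: string[];
}

class SharedServer {
  private transport?: FakeTransport;

  // Mimics the behaviour described above: each connect overwrites the previous transport.
  connect(transport: FakeTransport): void {
    this.transport = transport;
  }

  // The reply goes to whichever transport connected last,
  // not to the transport the request arrived on.
  handleRequest(from: FakeTransport, requestId: number): void {
    if (!this.transport) throw new Error("not connected");
    this.transport.sent.push(`response ${requestId} (request came from ${from.name})`);
  }
}

const transport1: FakeTransport = { name: "transport1", sent: [] };
const transport2: FakeTransport = { name: "transport2", sent: [] };

const shared = new SharedServer();
shared.connect(transport1);          // Client1 connects
shared.connect(transport2);          // Client2 connects, transport1 is orphaned
shared.handleRequest(transport1, 1); // Client1 sends a request

console.log(transport1.sent.length); // 0: Client1 never receives its response
console.log(transport2.sent.length); // 1: the response landed on Client2's transport
```

In the real server the misrouted send fails instead of being delivered, but the effect on Client1 is the same: no response ever arrives.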

Evidence from Envoy Logs

SSE GET requests show the DR (Downstream Reset) flag after ~125s with 0 response bytes: the stream opens but never receives any data, confirming the transport routing failure.

Related Sub-Problems

These issues are tightly coupled to the same transport lifecycle code:

Race Condition in Transport Creation

Concurrent requests that arrive between transport creation and the onsessioninitialized callback can create duplicate transports for the same session.
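
One possible way to close this window is to register the in-flight creation synchronously, so later callers reuse it. This is a sketch under assumptions: it presumes the session ID is known to the caller, and SessionTransport, createTransport and getOrCreateTransport are hypothetical names, not SDK APIs.

```typescript
// Hypothetical transport type; the real one would come from the SDK.
interface SessionTransport {
  id: string;
}

const pending = new Map<string, Promise<SessionTransport>>();
let created = 0;

// Hypothetical async factory standing in for transport construction and initialization.
async function createTransport(sessionId: string): Promise<SessionTransport> {
  created++;
  await Promise.resolve(); // simulates the asynchronous gap before initialization completes
  return { id: sessionId };
}

// The promise is stored before any await, so concurrent callers for the
// same session share one creation instead of each starting their own.
function getOrCreateTransport(sessionId: string): Promise<SessionTransport> {
  let entry = pending.get(sessionId);
  if (!entry) {
    entry = createTransport(sessionId);
    pending.set(sessionId, entry);
    // Drop a failed attempt so a later request can retry.
    entry.catch(() => pending.delete(sessionId));
  }
  return entry;
}

const first = getOrCreateTransport("session-a");
const second = getOrCreateTransport("session-a");
console.log(first === second); // true
console.log(created);          // 1
```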

Error Handling When Headers Already Sent

The catch block in SSE mode calls res.status(500).json(...) even after SSE streaming has started (headers already sent), causing ERR_HTTP_HEADERS_SENT.
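
A guard on headersSent avoids the second write. The sketch below uses a minimal ResponseLike interface modeled on the Express-style calls quoted above; sendErrorSafely is a hypothetical helper name, and ending the stream is only one possible fallback.

```typescript
// Minimal structural subset of an Express-style response; only the members used here.
interface ResponseLike {
  headersSent: boolean;
  status(code: number): ResponseLike;
  json(body: unknown): ResponseLike;
  end(): void;
}

// Hypothetical helper: sends a JSON 500 only while headers can still be written.
function sendErrorSafely(res: ResponseLike, message: string): "json" | "ended" {
  if (res.headersSent) {
    // The SSE stream has already started, so the status line is gone.
    // Closing the stream is the only safe action left.
    res.end();
    return "ended";
  }
  res.status(500).json({ error: message });
  return "json";
}
```

In the catch block this would replace the unconditional res.status(500).json(...) call.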

Memory Leak from Orphaned Sessions

When clients disconnect without sending DELETE, sessions/transports are never cleaned up. Only explicit DELETE requests trigger server.close().
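
An idle-timeout sweep is one way to reclaim such sessions. This is a minimal sketch: SessionRegistry and Closable are hypothetical names, and the TTL value is an assumption rather than a recommended setting.

```typescript
// Hypothetical shape: anything with a close() method, such as a per-session server.
interface Closable {
  close(): Promise<void> | void;
}

interface SessionEntry {
  resource: Closable;
  lastSeen: number;
}

class SessionRegistry {
  private sessions = new Map<string, SessionEntry>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  add(id: string, resource: Closable, now: number = Date.now()): void {
    this.sessions.set(id, { resource, lastSeen: now });
  }

  // Call on every request for the session to mark it as still alive.
  touch(id: string, now: number = Date.now()): void {
    const entry = this.sessions.get(id);
    if (entry) entry.lastSeen = now;
  }

  // Closes and removes sessions idle for longer than ttlMs; returns the removed IDs.
  sweep(now: number = Date.now()): string[] {
    const removed: string[] = [];
    this.sessions.forEach((entry, id) => {
      if (now - entry.lastSeen > this.ttlMs) {
        try {
          // Swallow close errors so one bad session cannot abort the sweep.
          void Promise.resolve(entry.resource.close()).catch(() => {});
        } catch {
          // close() threw synchronously; the session is still removed below.
        }
        this.sessions.delete(id);
        removed.push(id);
      }
    });
    return removed;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

A caller would typically run sweep() on a timer and also call it from the transport's close handler, so that both silent disconnects and explicit DELETE requests end up on the same cleanup path.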

Proposed Solution

Create a per-session Server instance or implement a session→transport routing map:

Option A: Per-session Server (cleanest)

const sessions = new Map<string, Server>();
// Each new session gets its own Server instance
const sessionServer = new Server(serverInfo, options);
await sessionServer.connect(transport);
sessions.set(sessionId, sessionServer);

Option B: Transport Router (if per-instance is too expensive)

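// Keep a single Server but remember each session's transport.
// Routing inside _onrequest is not shown; this only records the mapping.
class RoutingServer extends Server {
  private transportMap = new Map<string, Transport>();

  // A separate method is used because connect(transport, sessionId) would not
  // type-check as an override of Server.connect(transport).
  async connectSession(transport: Transport, sessionId: string): Promise<void> {
    this.transportMap.set(sessionId, transport);
  }
}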

Fix Checklist

  • Implement per-session Server instances OR session→transport routing
  • Add session cleanup on disconnect/timeout (not just DELETE)
  • Fix error handling for SSE streams (check headersSent before sending error)
  • Add session timeout/cleanup mechanism (configurable TTL)
  • Add integration tests for concurrent multi-client scenarios
  • Add metrics: active sessions count, orphaned session detection

Impact

Critical: every multi-client scenario (which is the production case with Claude Code) is affected. Agents randomly hang or have their connections reset.

Files to Modify

  • src/server.ts — Main transport architecture refactor
  • Potentially new src/session-manager.ts for session lifecycle
