Multi-Tenant Client Support (Server-to-Server) #193

mbleigh · 2025-03-07T19:05:59Z

mbleigh
Mar 7, 2025

Pre-submission Checklist

I have verified this would not be more appropriate as a feature request in a specific repository
I have searched existing discussions to avoid duplicates

Your Idea

MCP is currently designed to favor single-tenant clients. That is, it works well if you have a dedicated device or VM for the client that can install and start up servers on demand. Servers are generally built with the assumption that a single client will be communicating with them.

This becomes a problem when we want to make MCP servers available to server-side agents that serve many users. An MCP server configuration (e.g. via command-line startup args) is a singleton, but it is both inefficient and impractical for server-side agents to spin up new MCP server processes for each user.

What is needed is the ability for MCP to provide server configuration such that each request can "reconfigure" the server for its needs.

Note: While this issue is related to the overall "statefulness" of MCP, it is not strictly an issue of statefulness/statelessness. Servers could remain stateful while supporting multi-tenant clients.

The Problem

Let's imagine an MCP server with specialized tools for managing GitHub issues. The tools are designed to always operate in the context of a single repository, e.g. searchIssues(query: string) only searches issues in the configured repository. When I start it up I do something like:

export GITHUB_ACCESS_TOKEN="...";
npx my-github-mcp-server myusername/myrepo

Fundamentally, this server acts as a simple proxy to the GitHub API by providing a tool. All it needs to properly function is an access token and a repo name. I'd argue that the majority of MCP servers in the wild today operate this way today -- most of their logic is stateless, the only "state" comes from the configuration at startup time.

If I'm building a desktop app that orchestrates MCP servers client-side, this is an acceptable state of affairs. But as soon as I'm building a server-side agent that serves many clients, this becomes an impossible bottleneck. I need to be able to spin up and execute potentially thousands of concurrent instances of the same MCP server, one per user, just to perform a simple API proxy operation with two input variables.

There needs to be a way to operate a single instance of an MCP server that can serve many clients. This is not simply a transport issue because even with HTTP transport, MCP servers are currently built to serve individual clients.

Proposed Solution

The base Request and Response interfaces could be enhanced to support multi-tenant clients universally in a way that is compatible with the current "stateful" connection between client and server:

export interface Request {
  method: string;
  params?: {
    _meta?: {
      clientId?: string; // if provided, uniquely identifies a specific user of a multi-tenant client
      clientConfig?: Record<string,unknown>; // configuration specific to this user
    };
    [key: string]: unknown;
  };
}

export interface Result {
  /**
   * This result property is reserved by the protocol to allow clients and servers to attach additional metadata to their responses.
   */
  _meta?: {
    clientId?: string; // the clientId from the request this result is fulfilling
  };
  [key: string]: unknown;
}

export interface ServerCapabilities {
  clientConfig?: {
    schema?: object; // (optional) valid JSON schema describing the accepted clientConfig
  }
}

This relatively simple change opens up multi-tenant clients in a backwards compatible manner. For a server implementor, they just need to take whatever configuration is passed at startup time and make it optionally provided through clientConfig instead.

If my GitHub MCP server from above implemented clientConfig, it would start up like so:

npx my-github-mcp-server # no config arguments

And when queried about its capabilities, it might respond with:

{
  clientConfig: {
    schema: {
      type: "object",
      properties: {
        githubAccessToken: {type: 'string', describe: 'required for private repo'},
        repo: {type: 'string'},
      },
      required: ["repo"],
    }
  }
}

Now when my multi-tenant client calls this MCP server, it can include user metadata:

{
  _meta: {
    clientId: currentUser.id,
    clientConfig: { 
      repo: currentSession.repo,
    }
  }
}

My server implementation will look at the client config for a particular request and adapt its behavior accordingly.

Secrets and Stored Values

The largest potential issue with the above proposal is the potential for proliferating sending sensitive credentials over the wire between MCP multi-tenant client and MCP server. Safety would be significantly enhanced if secrets could be passed either once per client or never at all.

I won't get into specific solutions for this to keep the discussion focused on multi-tenant client support at the more basic level, but there are a few ways this could be done:

Allow servers to indicate to the client that certain config values have been "stored".
Add explicit support for "secrets" that allow clients and servers to negotiate an out-of-band transfer mechanism.
Use public-key cryptography so that the client must encrypt secret values before sending them over the wire. The public key for signing secrets could be discoverable in the server capabilities.

Each of these solutions is more involved than the simple metadata additions proposed above, but for proper multi-client support one of these (or a better idea) would likely need to be pursued.

Conclusion

For MCP to become the universal protocol for adding capabilities to agents, it must evolve to support a multi-tenant client approach. Solving this well would open up a wide world for server-to-server MCP, both multi-tenant clients and multi-tenant servers (an MCP server that can server multi-tenant clients could also reasonably server many single-tenant clients from a hosted URL).

Looking forward to some discussion on the matter!

Scope

GuillaumeDesforges · 2025-03-08T23:34:42Z

GuillaumeDesforges
Mar 8, 2025

In my humble opinion, this is one of the biggest limitations of the MCP.

I would need to be able to provide additional data to tools called that is not decided by the LLM, such as the identity of the user (email, account id, ...). Note that user authentication has already happened at this point, but tools might use this identity to perform some authorization.

1 reply

jspahrsummers Mar 13, 2025
Maintainer

Authorization is part of the MCP draft specification. It sounds like you're just looking for a way to get authorization information during tool execution, which is supported by the Python and TypeScript SDKs, and could/should be added to the others.

jspahrsummers · 2025-03-13T12:36:30Z

jspahrsummers
Mar 13, 2025
Maintainer

It's indeed the case that stdio MCP servers are geared toward single tenancy (as there is a host application involved in launching them, and we generally assume that there is a single "local user").

I'm not following the way in which remote MCP servers, available over HTTP, cannot be multi-tenant though. We have multiple working examples of MCP servers hosted on HTTP being able to serve multiple clients and users simultaneously. Is this perhaps an issue with the documentation or the SDK?

7 replies

allenporter Mar 16, 2025

What is being missed in this topic this is that the spec is fairly new, and didn't initially support OAuth or have standards around how remote connections are supported -- but now it totally does. The solution is not to complicate the spec and build complex stdio server orchestration systems. Totally agree with jspahrsummers.

the-vampiire Mar 17, 2025

So while it's true that I could build my own version of every MCP server out there that is multi-tenant friendly and hosted an an HTTP endpoint, I'd be losing out on the entire value of MCP because I couldn't use anyone else's server off-the-shelf.

how does your proposal solve this concern? if the stdio servers are designed to assume single tenancy (by nature of running on the user host machine) then i think it will require more retrofitting than just config metadata. the simplicity of development that stdio and single tenancy assumption bring are the cause for their ubiquity.

mbleigh Mar 17, 2025
Author

What is being missed in this topic this is that the spec is fairly new, and didn't initially support OAuth or have standards around how remote connections are supported -- but now it totally does.

Authentication doesn't really solve this on its own. You can have server-to-server authentication that still doesn't support multi-tenant clients.

how does your proposal solve this concern? if the stdio servers are designed to assume single tenancy (by nature of running on the user host machine) then i think it will require more retrofitting than just config metadata. the simplicity of development that stdio and single tenancy assumption bring are the cause for their ubiquity.

Most stdio MCP servers that I have seen (e.g. the reference servers built by Anthropic) do not really have any significant code that depends on being single-client. It's the matter of setting a variable from the request instead of at startup.

mbleigh Mar 17, 2025
Author

One of the things this discussion has made me wonder is whether tooling, rather than protocol changes, could be an important part of the solution. If the default SDKs for building MCP and the default tutorials for MCP were encouraging people to build servers that could be run either in stdio or using the new stateless HTTP protocol, and those tools explicitly solved for the issue of single-tenant vs. multi-tenant clients, then a protocol-level solution might not be necessary.

MCP will not be a ubiquitous solution if it doesn't solve for multi-tenant clients which I actually expect may become the majority of all model-interacting clients if they aren't already. There are more hosted web apps that do AI stuff than there are desktop apps that do AI stuff.

geelen Mar 24, 2025

The benefit of MCP is in being able to easily install prepackaged servers built by other people. If you look at the ecosystem, it is heavily (90%+?) biased toward stdio servers right now.

I think in a matter of months this will switch to 90% the other way, as you point out:

There are more hosted web apps that do AI stuff than there are desktop apps that do AI stuff.

The reason authorization "solves" this is because without it remote MCP servers aren't really securable in the same way as local ones, hence the skew towards local/stdio only. But as soon as you give people the tools to run hosted MCP servers securely (and enough clients support it, in the interim there's https://github.com/geelen/mcp-remote), I can't see any reason to persist with local ones. After all, given the choice between deploying a new version to all my users at once versus pushing a new release to NPM and hoping everyone updates, I know what I'd be choosing.

the-vampiire · 2025-03-17T13:06:57Z

the-vampiire
Mar 17, 2025

I need to be able to spin up and execute potentially thousands of concurrent instances of the same MCP server, one per user, just to perform a simple API proxy operation with two input variables.

i dont see the issue with this and it appears central to the proposal. the "overhead" (protocol wrapper logic / processing) of a stdio server seems minimal relative to the actual processing of the tools. do you think there would be substantive difference between orchestrating 1 container per user vs one container processing configs?

5 replies

mbleigh Mar 17, 2025
Author

It becomes a problem when you combine it with the statefulness of the protocol. If each stdio server needs to be spun up per user, and the protocol is stateful and requires an ongoing connection, then MCP is introducing scaling limitations to my multi-tenant client because of limitations in the protocol. Spinning up thousands of long-running processes in a server environment is going to impact the reliability and performance of the overall system for very little benefit.

If I'm given the choice between "adopt MCP servers as implemented in the ecosystem" or "reimplement the capabilities of MCP servers in my client natively" for a multi-tenant client, right now I'd choose the latter. I feel like that's a problem if MCP really wants to be universally adopted.

the-vampiire Mar 17, 2025

i want to understand your perspective, can you clarify if i am understanding this correctly:

some stdio servers have tools that use arguments derived from their host context (env or startup args) and become implicit arguments to the tool calls
this poses a challenge to your view of multi-tenancy because without this standard you will need to provide a new host context for each client session to set these implicit args
you want to standardize implicit args through _meta.config so these servers do not rely on host context
with this in place you will be able to fully configure each tool call at runtime from a single server process

mbleigh Mar 17, 2025
Author

You've got it more or less correct. Requiring a new host context for each client session creates an unnecessary burden on multi-tenant clients and will lead to them not adopting MCP. It's easier for me to reimplement a GitHub MCP server as just native tools for my client than it is for me to build infrastructure to spin up a new long-running process for every user that connects to my client.

So this proposal attempts to avoid a fragmented ecosystem where desktop apps leverage MCP but multi-tenant cloud-hosted apps don't.

the-vampiire Mar 17, 2025

ya it sounds like you want tool calls to be "pure functions" (in the sense they dont depend on that host context state). i think thats a good idea in general tbh although env vars are pretty convenient for secrets and you would be repeatedly sending args per call.

maybe this base context could be held in a session JWT that gets passed on the requests.

i still feel this could be solved at the architectural level rather than in the protocol though. whether you have a single server or per-client you will need an HTTP proxy to expose them.

an alternative that manages this at the architectural level is to use the proxy as an adapter for multi-tenancy:

client request
create container (or direct on host if you live dangerously) with user / session context
call tools with args
destroy when client closes the connection

stdio servers in a lean container can spin up near instantly so you could reduce server resources by just killing them when they are stale. although i imagine they have pretty low idling usage you could set a global or per server TTL and i think you will be pretty close to achieving your desired overhead without affecting the protocol.

mbleigh Mar 17, 2025
Author

They don't need to be pure functions per se, but they do need to account for the difference between "client" and "user of client" which is the missing piece currently. The problem with the architectural solution is that it's rather onerous to build, requires substantial infra investment (an arbitrary sandboxed but quick to kill container environment), and likely more difficult than just re-implementing the MCP servers I want to utilize in my client.

If I'm building a multi-tenant client on serverless infra, way easier to just roll my own tools.

irvinebroque · 2025-03-24T19:55:18Z

irvinebroque
Mar 24, 2025

It is both inefficient and impractical for server-side agents to spin up new MCP server processes for each user.

Hmmm — we've found it to be quite practical and efficient at Cloudflare.

With Durable Objects as the underlying compute primitive for building MCP servers on Cloudflare, each MCP client session has its own isolated instance of an MCP Server.

We obviously love Durable Objects, but there are other ways to solve this with a process or instance per server, and don't quite see how spec needs to change or get more complex here. Pretty simple and clear as-is to simply say "create unique instances of MCP servers" — clear cut rather than having client IDs bleed into the spec. Don't quite see why client metadata should need to bleed into client space like this?

0 replies

PipelineBaron · 2025-03-25T14:03:02Z

PipelineBaron
Mar 25, 2025

Both means of transport, the soon-to-be-deprecated existing SSE transport and the soon-to-be-replacement streamable HTTP, reasonably support multi-tenancy. The issue described by the OP is an implementation problem, not a protocol problem. The expectation is that you should be able to run servers explicitly implemented as single-tenant STDIO servers as remote multi-tenant SSE/Streamable servers with minimal backend code changes.

Right now, most servers are still relatively nascent because the protocol itself is. I suspect that in the future, as both the protocol and implementations mature, we can expect to see more server implementations capable of operating against most/all supported transports, but they are just not there yet.

Also, the protocol is designed to support gateway/proxy processes, and intermediary servers at this layer can solve many authentication/authorization issues.

Example SSE multi-tenancy

Unique SSE endpoints + Auth bearer tokens

Each tenant gets its unique SSE endpoint and bearer token for authorization. Connecting to this endpoint with the correct bearer token will grant you a session-unique POST endpoint.

Global SSE endpoint + Identifying and auth via bearer token

This is the same as the previous technique but uses a single SSE endpoint and disambiguates/authenticates users via bearer tokens. In this method, the SSE endpoint would still return and point you to a session unique POST endpoint.

Example Streamable HTTP multi-tenancy

Not much to say here, session id's are natively supported by the protocol. See: https://github.com/modelcontextprotocol/specification/blob/3a57a033865162a09443d4f50992b6e5382d32e6/docs/specification/draft/basic/transports.md#session-management

1 reply

mbleigh Mar 27, 2025
Author

I think it's a protocol problem when there is an enormous gap in capability between the different transports of the protocol, especially since the less-capable protocol is the one that everyone is implementing.

I agree that using mcp-session-id with the new HTTP transport you can solve some (but not all, honestly) of the issues raised in this discussion, but that doesn't really matter because the only thing the ecosystem is building is stdio. Just because people can build multi-tenant capable MCP servers doesn't mean that they are doing so today or that the protocol helps them in doing so by providing a good set of primitives and guidance to make that the default.

My argument is that multi-tenant clients are likely to be more plentiful than desktop-style clients, and MCP as a protocol should make sure the needs of those clients are addressed if it wants to be the common standard for all clients.

nbarbettini · 2025-03-27T12:05:27Z

nbarbettini
Mar 27, 2025

@mbleigh, I'm curious for your thoughts on #234. It isn't multi-tenancy exactly, but I think it addresses the core problem of handling multiple users in 1 server.

2 replies

mbleigh Mar 27, 2025
Author

I think that's a helpful proposal and that there should generally be a way for tools to declare a schema for "side-channel" data which may include authorization information. But it shouldn't just be authorization -- in general the tool inputSchema is going to be exposed directly to LLMs via the clients, and the LLM shouldn't always know about all of the arguments that are actually necessary to complete a tool (e.g. requesting a specific user id's information where the client has a session for a specific user id).

I think there should be something like a metaSchema for tools as well that allow for passing of such side-channel data.

nbarbettini Mar 27, 2025

I like the idea that there are parameters (metadata) that are important to a tool call, but should never be shown to an LLM. Authorization (e.g. OAuth tokens), secrets, API keys, and other predefined values (many MCP servers require a slack team ID, workspace ID, etc) could fit into that idea.

Kevandrew · 2025-04-03T21:47:44Z

Kevandrew
Apr 3, 2025

Yeah, this multi-tenant client issue is a big one. Trying to run most current stdio servers for multiple users just doesn't scale for hosted agents.

I built the Ithena Governance SDK (@ithena-one/mcp-governance) to wrap the base MCP server with AuthN/RBAC/etc., and figuring out how to handle per-request context securely was a core part of that.

My take on this proposal:

The Problem's Real: Totally agree. Spinning up separate processes per user is a non-starter for many server-side agent use cases.
Per-Request Context (_meta.clientConfig): Makes sense. It gives existing servers a path to handle multi-tenant requests without a full rewrite, just by reading config from the request instead of only at startup. Standardizes passing that kind of per-request config.
Secrets & Connection to Multi-user Authorization #234: This is the tricky bit. Shoving raw secrets like githubAccessToken into _meta.clientConfig feels wrong, like you mentioned.
- In the other thread (Multi-user Authorization #234), I supported the client passing the end-user's external OAuth token in _meta.authorization. That seems like the spot for the token needed for the final hop (e.g., calling the actual GitHub API).
- This _meta.clientConfig feels better suited for non-secret config needed per-request, like the target repo name in your example.
- If a tool does need a secret that's not the end-user's external token (maybe an API key the agent itself uses), sending it raw seems bad. My SDK uses a server-side CredentialResolver. Maybe the client could pass just an ID in clientConfig (like { credentialRef: 'agent-key' }), and the governed server looks up the real secret safely on the backend based on who the agent is? Keeps the actual secret off the wire for that hop.
Server-Side Governance is Still Needed: Even with this _meta stuff, the MCP server itself still needs to:
- Know who the calling agent is (AuthN).
- Check if that agent (and maybe the user in _meta.clientId) is allowed to use this tool, potentially checking against the config passed (like "Can agent X touch repo Y?") (RBAC). That's the kind of stuff the Ithena SDK handles server-side.

So, adding _meta.clientConfig for configuration seems useful for multi-tenancy. It should work alongside _meta.authorization (from #234, for external auth) and needs solid server-side governance (agent AuthN/AuthZ, secure secret fetching if needed) to be safe.

Happy to chat more about how governance interfaces could use context from these _meta fields.

0 replies

jackedelic · 2025-05-26T12:21:58Z

jackedelic
May 26, 2025

I would like to add that certain MCP remote hosting platforms, in particular smithery, provide sdk that allows mcp server developers to create mcp server (that actuall talks to the backend API) only upon session creation, that is, POST /mcp. (I am not affiliated to smithery at all, I've just been starting to look into how smithery deploys our mcp servers)
see: https://github.com/smithery-ai/sdk/blob/210c5e2fa0561154a12cdfe4f182a9f4f7f1f069/typescript/sdk/src/server/stateless.ts#L56

It looks like currently there are only two such mcp servers written using smithery sdk (github and slack) from their mcp-servers github repos https://github.com/smithery-ai/mcp-servers.

Roughly how it works is -

A stateless server is created upon initialization.
Whenever there is a POST /mcp request, a new mcp server and a transport object is created with the access token. This requires the POST request to contain the access token in its query params (base64 decoded). The mcp server will then handle the request.
The mcp server and transport object created in step 2 is destroyed when the handler returns.

0 replies

Multi-Tenant Client Support (Server-to-Server) #193

Uh oh!

Uh oh!

Pre-submission Checklist

Your Idea

The Problem

Proposed Solution

Secrets and Stored Values

Conclusion

Scope

Replies: 8 comments · 16 replies

Uh oh!

Uh oh!

Uh oh!

jspahrsummers Mar 13, 2025 Maintainer

Uh oh!

jspahrsummers Mar 13, 2025 Maintainer

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mbleigh Mar 17, 2025 Author

Uh oh!

mbleigh Mar 17, 2025 Author

Uh oh!

Uh oh!

Uh oh!

mbleigh Mar 17, 2025 Author

Uh oh!

Uh oh!

mbleigh Mar 17, 2025 Author

Uh oh!

Uh oh!

mbleigh Mar 17, 2025 Author

Uh oh!

Uh oh!

Example SSE multi-tenancy

Unique SSE endpoints + Auth bearer tokens

Global SSE endpoint + Identifying and auth via bearer token

Example Streamable HTTP multi-tenancy

Uh oh!

mbleigh Mar 27, 2025 Author

Uh oh!

Uh oh!

mbleigh Mar 27, 2025 Author

Uh oh!

Uh oh!

Uh oh!

Replies: 8 comments 16 replies

jspahrsummers Mar 13, 2025
Maintainer

jspahrsummers
Mar 13, 2025
Maintainer

mbleigh Mar 17, 2025
Author

mbleigh Mar 17, 2025
Author

mbleigh Mar 17, 2025
Author

mbleigh Mar 17, 2025
Author

mbleigh Mar 17, 2025
Author

mbleigh Mar 27, 2025
Author

mbleigh Mar 27, 2025
Author