SEP: Context Middleware for Application-Controlled Context Transformation
Preamble
Title: Context Middleware for Application-Controlled Context Transformation
Author: Peder HP (@PederHP)
Status: Proposal
Type: Standards Track
Created: 2025-10-04
Abstract
This SEP introduces a Context Middleware capability to MCP, enabling application-controlled transformation and enrichment of context. While MCP currently supports user-controlled context (Prompts), model-controlled actions (Tools), and static/semi-static data (Resources), there is no standardized mechanism for applications to dynamically transform context based on their own logic and timing.
Context Middleware provides a protocol-level primitive for (context + parameters) → context functions, executed at the application's discretion. This enables critical use cases including moderation systems, PII redaction/restoration, logging middleware, RAG implementations, tool filtering, and hallucination detection—all as application-controlled operations rather than model-driven tool calls.
Important: Context Middleware is designed for backend infrastructure and enterprise application development, not as a user-facing, plug-and-play feature. Users should never be prompted to connect arbitrary third-party middleware servers to their AI assistants. Instead, this capability enables AI assistant builders and enterprise organizations to construct sophisticated context processing pipelines using MCP as the standardized protocol, where middleware servers are managed by the application infrastructure—transparent to end users.
Intended Deployment Model
Backend Infrastructure, Not User-Facing Plugin
Context Middleware is fundamentally different from other MCP capabilities in its intended deployment model. While Tools, Resources, and Prompts are designed to be provided by servers that users can discover, install, and connect to their AI assistants, Context Middleware is designed for application infrastructure and backend pipelines.
Why This Matters
Security & Privacy Risk:
Allowing users to connect arbitrary third-party middleware servers to their AI assistant would create an unacceptable security risk. Every conversation, every message, every piece of context would flow through these servers. This is fundamentally different from:
- Tools: Model decides if and when to call; results are appended to context rather than modifying it
- Resources: User or application explicitly requests specific data
- Prompts: Templates that don't process user conversations
Context Middleware processes the entire conversation stream, making it unsuitable for untrusted third parties. Connecting to any MCP server is already an act of high trust: the server can inject instructions into context, and appended tool results and inserted tool descriptions are possible attack vectors for a malicious server. Sending the entire conversation goes further, because it allows a remote server provider to listen in on user conversations in their entirety. That is a different kind of trust than a purely security-based one. Hence, consumer-facing MCP servers should not expose Middleware, and consumer-facing clients should not support it.
Intended Use Cases
✅ Enterprise Backend Pipelines
- Internal PII redaction service before sending to LLM provider
- Corporate compliance logging middleware
- Internal moderation systems
- Company-wide RAG infrastructure
✅ AI Assistant Application Infrastructure
- Chat platform content safety systems (pre-inference violation detection)
- Coding assistants and agents running secret detection and redaction pre-inference
- Agent frameworks with pluggable context processors
✅ 1st-Party Infrastructure
- AI assistant builder's own middleware servers
- Co-hosted middleware in same trust boundary as application
- Internal development tools for testing and debugging
❌ NOT for User-Facing Plugin Marketplaces
- Users should not be able to "add" middleware servers like they add tool servers
- Middleware should not appear in user-facing server directories
- Platform applications (chat, IDEs, etc.) should not expose middleware server configuration to users
Example Architecture
Here's how Context Middleware fits into a real-world AI assistant (please note RAG is just one use case and this is not primarily a RAG-oriented feature):
┌─────────────────────────────────────────────────────────────┐
│ User's View │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ AI Assistant (e.g., ChatApp) │ │
│ │ │ │
│ │ User can configure: │ │
│ │ • Tool Servers (GitHub, Notion, Calculator) │ │
│ │ • Resource Servers (not much yet) │ │
│ │ • Prompt Servers (Actions, Prompt Libraries) │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Application Backend │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Context Processing Pipeline │ │
│ │ (Invisible to User) │ │
│ │ │ │
│ │ 1. User Message │ │
│ │ ↓ │ │
│ │ 2. Middleware: PII Redaction (Internal) │ │
│ │ ↓ │ │
│ │ 3. Middleware: Content Moderation (Internal) │ │
│ │ ↓ │ │
│ │ 4. Middleware: RAG Injection (Internal) │ │
│ │ ↓ │ │
│ │ 5. Send to LLM │ │
│ │ ↓ │ │
│ │ 6. Model Response │ │
│ │ ↓ │ │
│ │ 7. Middleware: Hallucination Detection (Internal) │ │
│ │ ↓ │ │
│ │ 8. Middleware: PII Restoration (Internal) │ │
│ │ ↓ │ │
│ │ 9. Return to User │ │
│ │ │ │
│ │ All middleware servers are: │ │
│ │ • Managed by application infrastructure │ │
│ │ • In the same trust boundary │ │
│ │ • Never exposed to users │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
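The numbered flow in the diagram can be sketched as two ordered stage lists around a model call. This is a minimal illustration under stated assumptions, not part of the proposal: `Stage`, `runStages`, `replaceAllText`, `callModel`, and `handleTurn` are hypothetical names, and the stub stages are local functions standing in for `middleware/invoke` calls to internal servers.

```typescript
// Hypothetical sketch: local functions stand in for middleware/invoke calls.
interface TextContent {
  type: "text";
  text: string;
}

type Stage = (context: TextContent[]) => TextContent[];

// Apply stages in order, feeding each stage's output into the next.
function runStages(stages: Stage[], context: TextContent[]): TextContent[] {
  return stages.reduce((current, stage) => stage(current), context);
}

// Build a stage that replaces every occurrence of `from` with `to`.
function replaceAllText(from: string, to: string): Stage {
  return (context) =>
    context.map((item): TextContent => ({
      type: "text",
      text: item.text.split(from).join(to),
    }));
}

const redactName = replaceAllText("Jane Smith", "[PERSON_1]"); // outbound stub
const restoreName = replaceAllText("[PERSON_1]", "Jane Smith"); // inbound stub

// Stand-in for the LLM call; records what the model was shown.
let seenByModel = "";
function callModel(context: TextContent[]): TextContent[] {
  seenByModel = context.map((item) => item.text).join("\n");
  return [{ type: "text", text: "Reviewed: " + seenByModel }];
}

function handleTurn(userContext: TextContent[]): TextContent[] {
  const outbound = runStages([redactName], userContext); // diagram steps 2-4
  const modelOutput = callModel(outbound); // diagram steps 5-6
  return runStages([restoreName], modelOutput); // diagram steps 7-8
}

const reply = handleTurn([
  { type: "text", text: "Review the contract for Jane Smith" },
]);
```

The application owns both stage lists, so the model only ever sees the redacted text while the user only ever sees the restored text.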
In practice, a backend server might have tools, resources, prompts and/or context middleware. There will be cases where it makes sense to bundle these in a single internal server. Many assistants make use of internal / server-side tools and resources, and there is nothing fundamentally wrong about having these on the same server as context middleware.
Current Reality
Organizations are already building this, in two forms:
- AI assistant platforms have proprietary backend context pipelines for safety, compliance, and quality
- Enterprises building custom assistants create REST APIs for context processing, often co-hosted with MCP servers, or perform these operations in service code, which recreates the very MxN problem that MCP was in part created to solve
Context Middleware standardizes this pattern rather than forcing every organization to build proprietary solutions.
When Platforms Might Support It
There are specific cases where consumer-facing platforms might allow middleware configuration:
Enterprise deployments:
- IT admin configures organization-wide middleware (not end users)
- Corporate compliance middleware is mandated, not optional
- Examples: required PII redaction, audit logging, content policies
Advanced/developer modes:
- Power users explicitly opt into experimental features
- Clear warnings about security implications
- Similar to allowing custom model endpoints or API configurations
Self-hosted deployments:
- User controls entire infrastructure
- User accepts responsibility for security
- Examples: open-source AI assistants, personal agent frameworks
Guidance for Platform Implementers
If you're building a consumer-facing AI assistant:
- Do NOT expose middleware server configuration to regular users
- Do NOT include middleware servers in user-facing marketplaces
- DO use middleware internally for your own safety/compliance systems
- DO consider middleware for enterprise/admin configurations only
If you're building an enterprise AI platform:
- DO allow IT admins to configure required middleware
- DO make middleware configuration separate from user-facing servers
- DO provide audit trails for middleware operations
- DO ensure middleware runs in trusted infrastructure
If you're building agent frameworks or development tools:
- DO expose middleware as a developer-level feature
- DO document security implications clearly
- DO provide examples of safe middleware implementations
- DO consider middleware as part of your internal architecture
Motivation
The Gap in MCP's Control Model
MCP's current capabilities operate at three control levels:
- User-controlled: Prompts
- Model-controlled: Tools
- Application-controlled: Resources (but limited to static/templated data retrieval)
However, many real-world AI assistant requirements need application-controlled context transformation—where the application, not the model, decides when and how to modify context. Current workarounds include:
- Abusing Resources: Resources signal static data retrieval, not dynamic context mutation, and passing a full context through them is borderline impossible
- Forcing Tool Use: Adding latency and token costs when the application already knows what to inject
- Custom Implementations: Every application builds proprietary context transformation layers
- Application-Level Tool Use: Middleware can be implemented as tools with the right input and output schema that are never exposed to the model, but this requires manually filtering out those tools, which is brittle
Real-World Use Cases
Moderation Systems
- Pass conversation history to moderation server
- Server analyzes content, injects warnings for model, flags policy violations
- Enables content filtering without model involvement
- Application can reject messages before sending to LLM
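As a rough sketch of the last point, an application could turn a moderation result into a forward-or-reject decision before any inference happens. The metadata fields `allow` and `flags` are taken from this SEP's own content moderation example and are not standardized; `Decision` and `decideModeration` are hypothetical names, and the fail-closed policy is an assumption of this sketch.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface MiddlewareInvokeResult {
  content: TextContent[];
  metadata?: Record<string, unknown>;
}

type Decision =
  | { action: "forward"; content: TextContent[] }
  | { action: "reject"; reason: string };

// Assumed policy: fail closed. Anything other than an explicit
// `allow: true` in the metadata blocks the message.
function decideModeration(result: MiddlewareInvokeResult): Decision {
  const metadata = result.metadata || {};
  if (metadata["allow"] === true) {
    return { action: "forward", content: result.content };
  }
  const rawFlags = metadata["flags"];
  const flags = Array.isArray(rawFlags)
    ? rawFlags.filter((flag): flag is string => typeof flag === "string")
    : [];
  return {
    action: "reject",
    reason: flags.length > 0 ? flags.join(", ") : "no explicit allow",
  };
}

const allowed = decideModeration({
  content: [{ type: "text", text: "How can I hurt someone's feelings?" }],
  metadata: { flags: ["potential_harm_seeking"], severity: "medium", allow: true },
});

const blocked = decideModeration({
  content: [{ type: "text", text: "(message text)" }],
  metadata: { flags: ["violence"], allow: false },
});
```

Because metadata is freeform, the application narrows each field at runtime instead of trusting its shape.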
PII Redaction & Restoration
- Outbound: Remove PII before sending context to model (legal/compliance requirement)
- Inbound: Restore PII in model responses using handles
- Transparent to user, protects sensitive data
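The handle mechanism can be illustrated with a small round trip. This is a sketch under stated assumptions rather than a real PII detector: it only matches US SSNs written as `123-45-6789`, and `redactSsns`, `restorePii`, and `RedactionResult` are hypothetical names. The handle-to-value map stays inside the application and is never sent to the model.

```typescript
interface RedactionResult {
  text: string;
  redactions: Record<string, string>; // handle -> original value
}

// Illustrative pattern only: US SSNs written as 123-45-6789.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

// Outbound: replace each match with a numbered handle and remember the original.
function redactSsns(text: string): RedactionResult {
  const redactions: Record<string, string> = {};
  let counter = 0;
  const redacted = text.replace(SSN_PATTERN, (match) => {
    counter += 1;
    const handle = "SSN_" + counter;
    redactions[handle] = match;
    return "[" + handle + "]";
  });
  return { text: redacted, redactions };
}

// Inbound: substitute every "[HANDLE]" with the value it stands for.
function restorePii(text: string, redactions: Record<string, string>): string {
  return Object.keys(redactions).reduce(
    (current, handle) =>
      current.split("[" + handle + "]").join(redactions[handle]),
    text
  );
}

const outbound = redactSsns("My SSN is 123-45-6789");
const restored = restorePii("Noted, [SSN_1] is on file.", outbound.redactions);
```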
Logging & Audit
- Legal requirement in enterprise: log all context sent to models
- Application-controlled timing ensures complete audit trail
- Can include metadata about why context was modified
RAG (Non-Agentic)
- Classic RAG pattern: query → retrieve relevant documents → inject as context
- No model decision needed—application knows retrieval is required
- More efficient than tool-based RAG (no extra round-trip, fewer tokens)
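A toy version of this pattern is sketched below. It assumes a small in-memory corpus and a naive keyword-overlap score in place of a real retriever; `Doc`, `tokenize`, `scoreDoc`, and `injectDocuments` are hypothetical names, and the `[Retrieved Documents]` prefix simply mirrors the RAG example in this SEP.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface Doc {
  title: string;
  body: string;
}

// Lowercased word tokens; a toy stand-in for real query analysis.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 1);
}

// Score = number of distinct query terms that appear in the document.
function scoreDoc(queryTerms: string[], doc: Doc): number {
  const docTerms = tokenize(doc.title + " " + doc.body);
  const distinct = queryTerms.filter((t, i) => queryTerms.indexOf(t) === i);
  return distinct.filter((t) => docTerms.indexOf(t) !== -1).length;
}

// Prepend the best-matching documents as a new context item.
function injectDocuments(
  context: TextContent[],
  corpus: Doc[],
  maxResults: number
): TextContent[] {
  const terms = tokenize(context.map((item) => item.text).join(" "));
  const hits = corpus
    .map((doc) => ({ doc, relevance: scoreDoc(terms, doc) }))
    .filter((hit) => hit.relevance > 0)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, maxResults);
  if (hits.length === 0) {
    return context; // nothing relevant: leave context untouched
  }
  const block = hits
    .map((hit) => "# " + hit.doc.title + "\n" + hit.doc.body)
    .join("\n\n");
  const retrieved: TextContent = {
    type: "text",
    text: "[Retrieved Documents]\n\n" + block,
  };
  return [retrieved].concat(context);
}

const enriched = injectDocuments(
  [{ type: "text", text: "What was discussed in the Q4 planning meeting?" }],
  [
    { title: "Q4 Planning Notes", body: "Focus on EMEA expansion" },
    { title: "Lunch menu", body: "Pizza on Friday" },
  ],
  3
);
```

The application calls this unconditionally before inference, so no tool schema or model decision is involved.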
Tool Filtering
- Analyze conversation context to determine relevant tools
- Filter tool set before model sees them
- Reduces token usage, improves model performance
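On the application side, applying a filter result might look like the sketch below. It assumes the middleware reports its choice in a `suggestedToolSet` metadata field, as in this SEP's tool filtering example; `Tool` and `applyToolFilter` are hypothetical names, and falling back to the full tool list on a missing or malformed field is a design assumption of the sketch.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Keep only tools named in metadata.suggestedToolSet. Fall back to the full
// list when the field is missing, malformed, or matches nothing, so that a
// bad middleware response never hides every tool from the model.
function applyToolFilter(
  tools: Tool[],
  metadata: Record<string, unknown> | undefined
): Tool[] {
  const suggested = metadata ? metadata["suggestedToolSet"] : undefined;
  if (!Array.isArray(suggested)) {
    return tools;
  }
  const names = suggested.filter(
    (name): name is string => typeof name === "string"
  );
  const kept = tools.filter((tool) => names.indexOf(tool.name) !== -1);
  return kept.length > 0 ? kept : tools;
}

const allTools: Tool[] = [
  { name: "search_email", description: "Search email" },
  { name: "search_documents", description: "Search documents" },
  { name: "search_calendar", description: "Search calendar" },
  { name: "calculate", description: "Evaluate arithmetic" },
  { name: "send_email", description: "Send an email" },
];

const filtered = applyToolFilter(allTools, {
  suggestedToolSet: ["search_documents", "search_email", "search_calendar"],
});
```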
Hallucination Detection & Fixing
- Validate model responses for correctness (URLs, code, facts)
- Flag hallucinations, provide corrected context
- Fix broken URLs automatically
- Application can choose to re-run the inference, possibly with injected instructions related to the observed hallucinations (experience shows that simply re-running is sometimes good enough)
- Applications can notify the user about a detected hallucination and not correct the model (depending on the use case)
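One narrow form of URL validation is sketched here. It assumes the application can supply a list of URLs known to be valid (for example, URLs that appeared in retrieved documents) and flags anything else; it does not fetch URLs or check facts. `UrlCheck`, `URL_PATTERN`, and `checkUrls` are hypothetical names.

```typescript
interface UrlCheck {
  url: string;
  known: boolean;
}

// Rough URL matcher; stops at whitespace and common closing delimiters.
const URL_PATTERN = /https?:\/\/[^\s)>\]"']+/g;

// Extract URLs from a model response and mark those not in the known set.
function checkUrls(responseText: string, knownUrls: string[]): UrlCheck[] {
  const found: string[] = responseText.match(URL_PATTERN) || [];
  return found.map((raw) => {
    const url = raw.replace(/[.,;:!?]+$/, ""); // drop trailing punctuation
    return { url, known: knownUrls.indexOf(url) !== -1 };
  });
}

const checks = checkUrls(
  "See https://example.com/docs and https://example.com/made-up.",
  ["https://example.com/docs"]
);
const suspicious = checks.filter((check) => !check.known);
```

What to do with `suspicious` is left to the application: re-run inference, correct the link, or notify the user.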
Why Not Existing Primitives?
Why Not Resources?
- Resources represent static/templated data retrieval patterns
- Context Middleware is a dynamic transformation process
- Sending large amounts of context is very awkward with Resource Templates
- Example: RAG with message history cannot be expressed as a resource template—the query is the full conversation context
Why Not Tools?
- Tools require model decision and invocation (round-trip latency)
- Tools add token costs (schema + context in every tool call)
- Context Middleware is for when the application already knows what to transform
- Some transformations are inherently not intended for the model (guardrails, moderation, PII redaction)
- Example: Always injecting current timestamp should not require a tool call
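The timestamp case is small enough to sketch in full. The function name `injectTimestamp`, the ISO 8601 format, and the choice to add a separate leading context item are assumptions of this sketch; the clock is passed in so the transformation stays deterministic.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

// Prepend the current time as its own context item.
// `now` is a parameter so callers (and tests) control the clock.
function injectTimestamp(context: TextContent[], now: Date): TextContent[] {
  const stamp: TextContent = {
    type: "text",
    text: "[Current time: " + now.toISOString() + "]",
  };
  return [stamp].concat(context);
}

const stamped = injectTimestamp(
  [{ type: "text", text: "What's on my calendar today?" }],
  new Date("2025-10-04T15:42:00Z")
);
```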
Specification
Capability Declaration
Servers advertise Context Middleware support during initialization:
{
"capabilities": {
"contextMiddleware": {}
}
}
Listing Middleware Functions
Clients discover available middleware via middleware/list:
Request:
{
"jsonrpc": "2.0",
"id": 1,
"method": "middleware/list"
}
Response:
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"middleware": [
{
"name": "pii_redaction",
"description": "Remove PII from context before sending to LLM",
"inputSchema": {
"type": "object",
"properties": {
"aggressiveness": {
"type": "string",
"enum": ["standard", "strict"],
"default": "standard"
}
}
}
},
{
"name": "content_moderation",
"description": "Analyze context for policy violations",
"inputSchema": {
"type": "object",
"properties": {
"policies": {
"type": "array",
"items": {"type": "string"}
}
}
}
}
]
}
}
Invoking Middleware
Clients invoke middleware via middleware/invoke:
Request:
{
"jsonrpc": "2.0",
"id": 2,
"method": "middleware/invoke",
"params": {
"name": "pii_redaction",
"arguments": {
"aggressiveness": "strict"
},
"context": [
{
"type": "text",
"text": "My name is John Doe and my SSN is 123-45-6789"
}
]
}
}
Response:
{
"jsonrpc": "2.0",
"id": 2,
"result": {
"content": [
{
"type": "text",
"text": "My name is [PERSON_1] and my SSN is [SSN_1]"
}
],
"metadata": {
"redactions": [
{"handle": "PERSON_1", "type": "person"},
{"handle": "SSN_1", "type": "ssn"}
]
}
}
}
Schema
NB! The schema should also cover Tool call and result payloads, which means the invoke request must be able to pass more than Content.
MiddlewareDefinition:
interface MiddlewareDefinition {
name: string; // Unique identifier
description?: string; // Human-readable description
inputSchema?: { // JSON Schema for arguments
type: "object";
properties?: Record<string, unknown>;
};
}
MiddlewareInvokeRequest:
interface MiddlewareInvokeRequest {
method: "middleware/invoke";
params: {
name: string; // Middleware to invoke
arguments?: Record<string, unknown>; // Validated against inputSchema
context: Content[]; // Context to transform
};
}
MiddlewareInvokeResult:
interface MiddlewareInvokeResult {
content: Content[]; // Transformed context
metadata?: Record<string, unknown>; // Optional metadata
}
Content Types:
Context uses MCP's standard Content types: TextContent, ImageContent, AudioContent, EmbeddedResource, etc.
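A client could pair these interfaces with a request builder and a runtime check on the untrusted result, roughly as sketched below. This is illustrative only: it handles just text content, adds the JSON-RPC envelope fields shown in the wire examples, and `buildInvokeRequest`, `isTextContent`, and `parseInvokeResult` are hypothetical helper names rather than SDK functions.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface InvokeRequest {
  jsonrpc: "2.0";
  id: number;
  method: "middleware/invoke";
  params: {
    name: string;
    arguments: Record<string, unknown>;
    context: TextContent[];
  };
}

interface InvokeResult {
  content: TextContent[];
  metadata?: Record<string, unknown>;
}

function buildInvokeRequest(
  id: number,
  name: string,
  context: TextContent[],
  args: Record<string, unknown> = {}
): InvokeRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "middleware/invoke",
    params: { name, arguments: args, context },
  };
}

function isTextContent(value: unknown): value is TextContent {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return candidate["type"] === "text" && typeof candidate["text"] === "string";
}

// Validate a decoded JSON result before the application trusts its shape.
function parseInvokeResult(value: unknown): InvokeResult {
  if (typeof value !== "object" || value === null) {
    throw new Error("result must be an object");
  }
  const result = value as Record<string, unknown>;
  const content = result["content"];
  if (!Array.isArray(content) || !content.every(isTextContent)) {
    throw new Error("result.content must be an array of text content");
  }
  const metadata = result["metadata"];
  if (metadata === undefined) {
    return { content };
  }
  if (typeof metadata !== "object" || metadata === null || Array.isArray(metadata)) {
    throw new Error("result.metadata must be an object when present");
  }
  return { content, metadata: metadata as Record<string, unknown> };
}

const request = buildInvokeRequest(
  2,
  "pii_redaction",
  [{ type: "text", text: "My name is John Doe" }],
  { aggressiveness: "strict" }
);
```

A full implementation would extend the guard to the other Content types and to tool call payloads once the schema covers them.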
Design Decisions
Generic vs. Specialized
Decision: Keep the capability entirely generic—a simple (context + params) → context function.
Rationale:
- Avoids protocol bloat from feature-specific capabilities
- Maximizes flexibility for diverse use cases
- Follows the successful pattern established by Tools
- RAG is just one specialization of context transformation
Alternative Considered: RAG-specific capability with properties like maxResults, minRelevance
- Rejected: Too narrow, doesn't address the broader need for application-controlled transformations
Metadata Structure
Decision: Return metadata as an optional freeform object.
Rationale:
- Different transformations need different metadata (redaction handles, relevance scores, moderation flags)
- Application layer knows how to interpret its own middleware's metadata
- Extensible without protocol changes
Application vs. Server Control
Decision: Application invokes middleware, decides timing.
Rationale:
- Middleware is application-controlled by definition
- Applications know their requirements (PII laws, moderation policies, logging needs)
- Timing matters: outbound vs. inbound transformations have different semantics
Examples
Example 1: PII Redaction (Outbound)
Use Case: Application must remove PII before sending to model (compliance requirement)
Request:
{
"method": "middleware/invoke",
"params": {
"name": "pii_redaction",
"arguments": {},
"context": [
{
"type": "text",
"text": "Please review this contract for Jane Smith ([email protected], SSN: 987-65-4321)"
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "Please review this contract for [PERSON_1] ([EMAIL_1], SSN: [SSN_1])"
}
],
"metadata": {
"redactions": {
"PERSON_1": "Jane Smith",
"EMAIL_1": "[email protected]",
"SSN_1": "987-65-4321"
}
}
}
Example 2: PII Restoration (Inbound)
Use Case: Restore PII in model response so user sees actual names/details
Request:
{
"method": "middleware/invoke",
"params": {
"name": "pii_restoration",
"arguments": {
"redactions": {
"PERSON_1": "Jane Smith",
"EMAIL_1": "[email protected]"
}
},
"context": [
{
"type": "text",
"text": "I've reviewed the contract for [PERSON_1]. Please send it to [EMAIL_1]."
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "I've reviewed the contract for Jane Smith. Please send it to [email protected]."
}
]
}
Example 3: Content Moderation
Use Case: Check conversation for policy violations before sending to model
Request:
{
"method": "middleware/invoke",
"params": {
"name": "content_moderation",
"arguments": {
"policies": ["violence", "harassment"]
},
"context": [
{
"type": "text",
"text": "How can I hurt someone's feelings?"
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "[MODERATION WARNING: This query may seek harmful advice]\n\nHow can I hurt someone's feelings?"
}
],
"metadata": {
"flags": ["potential_harm_seeking"],
"severity": "medium",
"allow": true
}
}
Example 4: RAG Context Injection
Use Case: Retrieve relevant documents based on conversation and inject as context
Request:
{
"method": "middleware/invoke",
"params": {
"name": "document_retrieval",
"arguments": {
"maxResults": 3,
"sources": ["internal_docs", "email"]
},
"context": [
{
"type": "text",
"text": "What was discussed in the Q4 planning meeting?"
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "[Retrieved Documents]\n\n# Q4 Planning Notes (Oct 1, 2025)\n- Focus on EMEA expansion\n- Budget: $2M allocated\n\n# Email from Sarah (Oct 3)\nRe: Q4 priorities - confirming team assignments\n\n---\n\nWhat was discussed in the Q4 planning meeting?"
}
],
"metadata": {
"sources": [
{"title": "Q4 Planning Notes", "relevance": 0.95},
{"title": "Email from Sarah", "relevance": 0.87}
]
}
}
Example 5: Timestamp Injection
Use Case: Always include current time in context (more efficient than a tool)
Request:
{
"method": "middleware/invoke",
"params": {
"name": "timestamp_injector",
"arguments": {},
"context": [
{
"type": "text",
"text": "What's on my calendar today?"
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "[Current time: Saturday, October 4, 2025, 3:42 PM UTC]\n\nWhat's on my calendar today?"
}
]
}
Example 6: Tool Filtering
Use Case: Reduce tool set based on conversation relevance
Request:
{
"method": "middleware/invoke",
"params": {
"name": "tool_filter",
"arguments": {
"availableTools": [
"search_email",
"search_documents",
"search_calendar",
"calculate",
"send_email"
]
},
"context": [
{
"type": "text",
"text": "Find my meeting notes from last week about the marketing campaign"
}
]
}
}
Response:
{
"content": [
{
"type": "text",
"text": "Find my meeting notes from last week about the marketing campaign"
}
],
"metadata": {
"relevantTools": [
{"name": "search_documents", "score": 0.92},
{"name": "search_email", "score": 0.78},
{"name": "search_calendar", "score": 0.65}
],
"suggestedToolSet": ["search_documents", "search_email", "search_calendar"]
}
}
Backwards Compatibility
This proposal is fully backwards compatible:
- New capability, no changes to existing primitives
- Clients that don't support contextMiddleware simply won't use it
- Servers that don't implement it won't advertise the capability
- No breaking changes to protocol
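In client code, this opt-in behaviour reduces to a capability check before any middleware call is attempted. The sketch below assumes the server's advertised capabilities are available as a plain object after initialization; `ServerCapabilities` and `supportsContextMiddleware` are hypothetical names.

```typescript
// Capabilities as advertised by the server during initialization.
interface ServerCapabilities {
  [capability: string]: unknown;
}

// True only when the server advertised a contextMiddleware capability object.
function supportsContextMiddleware(
  capabilities: ServerCapabilities | undefined
): boolean {
  if (!capabilities) {
    return false;
  }
  const value = capabilities["contextMiddleware"];
  return typeof value === "object" && value !== null;
}

const withMiddleware = supportsContextMiddleware({ contextMiddleware: {} });
const toolsOnly = supportsContextMiddleware({ tools: {} });
```

A client that gets `false` simply skips `middleware/list` and `middleware/invoke` for that server.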
Security Considerations
Critical: Trust Boundary Requirements
Context Middleware servers MUST operate within the application's trust boundary. Unlike Tools or Resources, middleware processes the entire conversation stream, making trust paramount.
Deployment Requirements:
- Middleware servers should be managed by application infrastructure
- Should run in the same security context as the application
- Should NOT be user-configurable in end user-facing applications
- Should be treated as internal application components
Privacy & Data Protection
PII and Sensitive Data:
- Middleware is specifically designed to handle PII (redaction/restoration)
- Enables compliance with GDPR, CCPA, HIPAA, and similar regulations
- Applications must ensure middleware servers have appropriate data handling certifications
- Audit logs should track all middleware operations on sensitive data
Data Minimization:
- Applications should only send necessary context to middleware
- Consider whether full conversation history is required or if recent messages suffice
- Metadata should not leak sensitive information
Content Safety
Moderation & Policy Enforcement:
- Middleware enables application-level content filtering before model inference
- Can implement organization-specific safety policies
- Should support different policy levels (enterprise, consumer, regulated industries)
- Moderation decisions should be auditable
Authentication & Authorization
Middleware Access Control:
- Applications should authenticate middleware servers
- Alternatives to OAuth are valid inside a trust boundary
- If using external middleware servers, consider ways to ensure identity of middleware server
Attack Vectors
Potential Risks:
- Malicious Middleware: If user-configurable, could exfiltrate conversations
- Mitigation: Never allow user configuration in consumer apps
- Middleware Compromise: Attacker gains access to conversation stream
- Mitigation: Strong authentication, network isolation, monitoring
- Data Leakage: Metadata inadvertently exposes sensitive information
- Mitigation: Do not forward metadata to models, validate metadata schemas, audit metadata content
- Injection Attacks: Malicious content in middleware responses
- Mitigation: Sanitize middleware outputs, validate against schemas
Comparison with Other Capabilities
Why Middleware Requires Stricter Security:
| Capability | Trust Model | User Config | Data Exposure |
|---|---|---|---|
| Tools | Model-controlled calls | ✅ Yes | Limited (specific tool inputs) |
| Resources | Explicit requests | ✅ Yes | Limited (specific resources) |
| Prompts | Templates | ✅ Yes | Minimal (template params) |
| Middleware | Full conversation stream | ❌ No | Complete context |
Recommendations for Implementers
Application Developers:
- Never expose middleware configuration to untrusted users
- Run middleware servers in isolated environments
- Implement comprehensive logging and monitoring
- Regular security audits of middleware implementations
- Have incident response plans for middleware breaches
Middleware Server Developers:
- Clearly document data handling practices
- Provide security certifications for enterprise use
- Implement rate limiting and abuse prevention
- Support audit logging capabilities
- Follow principle of least privilege for data access
Enterprise Deployments:
- Require security review before deploying middleware
- Implement network segmentation for middleware servers
- Regular penetration testing of middleware infrastructure
- Data classification and handling policies
- Compliance verification for regulated industries
Reference Implementation
To be provided
A reference implementation will include:
- SDK implementation: At least one SDK branch with this fully implemented
- Server example: PII redaction middleware with redaction/restoration
- Client example: Host application using middleware in request/response pipeline
- Sample use case: Complete moderation or RAG implementation using Context Middleware
Alternatives Considered
1. Extend Resources with Context Parameters
Idea: Allow resources to accept context as input
Rejected:
- Conflicts with Resources' semantic meaning (static/templated data)
- Would require something beyond Resource Templates
- Having a third basic Resource type adds confusion, especially if it's function-like
2. Add Tool Annotations for Application Control
Idea: Mark specific tools as "application-invoked"
Rejected:
- Tools are semantically model-controlled
- Mixing control models creates confusion
- Brittle, clients might not respect this
- Bloating the tools contract adds complexity for everyone
3. Multiple Narrow Capabilities (RAG, Moderation, etc.)
Idea: Create separate capabilities for each use case
Rejected:
- Protocol bloat—every new use case needs a capability
- Underlying pattern is the same: context transformation
- Generic primitive is more future-proof
Open Questions
- Pagination: Should middleware responses support pagination for large context transformations?
  - Initial Answer: No. Like Sampling, transformations should be batched. Context is sent to LLMs in single requests anyway.
- Streaming: Should middleware support streaming transformations?
  - Initial Answer: Not in v1. It adds complexity, with unclear benefit for initial use cases.
- Chaining: Should the protocol support middleware chains?
  - Initial Answer: No. The application layer can chain multiple invocations if needed.
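Application-level chaining could be as small as the sketch below, which feeds each result's content into the next invocation and keeps each step's metadata separately. The `Invoke` function type is a hypothetical stand-in for whatever transport sends `middleware/invoke`; it is synchronous here only for brevity, and `ChainStep`, `ChainOutcome`, and `invokeChain` are likewise assumed names.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface InvokeParams {
  name: string;
  arguments: Record<string, unknown>;
  context: TextContent[];
}

interface InvokeResult {
  content: TextContent[];
  metadata?: Record<string, unknown>;
}

// Hypothetical transport: sends middleware/invoke and returns the result.
type Invoke = (params: InvokeParams) => InvokeResult;

interface ChainStep {
  name: string;
  arguments?: Record<string, unknown>;
}

interface ChainOutcome {
  content: TextContent[];
  metadataByStep: Record<string, Record<string, unknown>>;
}

// Run steps in order; each step receives the previous step's content.
function invokeChain(
  invoke: Invoke,
  steps: ChainStep[],
  context: TextContent[]
): ChainOutcome {
  const metadataByStep: Record<string, Record<string, unknown>> = {};
  let current = context;
  steps.forEach((step) => {
    const result = invoke({
      name: step.name,
      arguments: step.arguments || {},
      context: current,
    });
    current = result.content;
    if (result.metadata) {
      metadataByStep[step.name] = result.metadata; // kept out of the context
    }
  });
  return { content: current, metadataByStep };
}

// Fake transport used only to exercise the chain.
const fakeInvoke: Invoke = (params) => {
  if (params.name === "upper") {
    return {
      content: params.context.map((item): TextContent => ({
        type: "text",
        text: item.text.toUpperCase(),
      })),
      metadata: { changed: true },
    };
  }
  return {
    content: params.context.map((item): TextContent => ({
      type: "text",
      text: item.text + "!",
    })),
  };
};

const outcome = invokeChain(
  fakeInvoke,
  [{ name: "upper" }, { name: "exclaim" }],
  [{ type: "text", text: "hi" }]
);
```

Keeping metadata per step, outside the context, also makes it straightforward to avoid forwarding it to the model.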
Summary
Context Middleware fills a critical gap in MCP's capability model by providing a standardized, application-controlled primitive for context transformation. It's a flexible, generic mechanism that enables enterprise-critical features (PII redaction, moderation, audit logging) alongside performance optimizations (RAG, tool filtering, timestamp injection) and quality improvements (hallucination detection).
By following the successful pattern of Tools—simple, generic, extensible—Context Middleware becomes a fundamental building block for sophisticated AI assistant architectures without bloating the protocol with feature-specific capabilities.