Description
Problem Statement
Currently, all enabled MCP servers load at session start, consuming significant context budget before any actual work begins.
Measured impact (from community reports):
| MCP Server | Approx. Token Cost | Source |
|---|---|---|
| GitHub (27 tools) | ~18,000 | Issue #11364 |
| AWS MCP servers | ~18,300 | Issue #7172 |
| Cloudflare | ~15,000+ | Community reports |
| Sentry | ~14,000 | Community reports |
| Playwright (21 tools) | ~13,647 | Scott Spence |
| Supabase | ~12,000+ | Community reports |
| Average per tool | ~550-850 | Issue #11364 |
| 7 servers total | 67,300 (33.7%) | Issue #11364 |
Many MCPs may never be used in a given session, yet they permanently occupy context space.
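The percentage in the last table row can be reproduced with simple arithmetic. The sketch below assumes a 200,000-token context window, which is what 67,300 tokens at 33.7% implies. The function name and the window constant are illustrative and not taken from Claude Code.

```python
# Assumption: a 200,000-token context window, inferred from the table's
# "67,300 (33.7%)" row. Adjust CONTEXT_WINDOW for other models.
CONTEXT_WINDOW = 200_000


def context_share(tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Return the fraction of the context window occupied by `tokens`."""
    if window <= 0:
        raise ValueError("window must be positive")
    return tokens / window


if __name__ == "__main__":
    # Figures taken from the table above.
    for label, tokens in [("GitHub", 18_000), ("7 servers total", 67_300)]:
        print(f"{label}: {context_share(tokens):.1%} of context")
```

Under the same assumption, any single server from the table already costs somewhere between 6% and 9% of the window before the first prompt.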
Real-World Pain Point: The Modern Work Hub Dilemma
Modern knowledge workers manage numerous platforms simultaneously:
| Category | Platforms |
|---|---|
| Code Hosting | GitHub, GitLab, Bitbucket |
| Project Management | Jira, Linear, Asana, Notion |
| Communication | Slack, Discord, Teams |
| CI/CD | Vercel, Netlify, AWS |
| Monitoring | Sentry, Datadog |
The Dilemma:
- Option A (Install All): 50,000+ tokens consumed at session start = a quarter or more of a 200K-token context window gone before any work begins
- Option B (Separate by Project): Defeats Claude Code's value as a unified command center
Neither option is acceptable. We need on-demand loading to unlock Claude Code's potential as a universal work orchestrator.
Key Distinction: Context Isolation, Not Lazy Loading
This proposal is fundamentally different from traditional lazy loading approaches.
| Approach | Main Context | Load Time | Complexity |
|---|---|---|---|
| Traditional Lazy Loading | Gets populated when MCP needed | Runtime dynamic | High (state management) |
| Our Proposal: Context Isolation | Always stays clean | At fork creation | Low (reuses context: fork) |
Traditional Lazy Loading:
Main Context ──[need MCP]──> Load MCP ──> Main Context (now occupied)
Our Proposal (Context Isolation):
Main Context (stays clean)
└── Fork Agent Context ──> Load MCPs ──> Isolated Context
└── Released when done
This approach:
- Keeps main context permanently clean (not temporarily)
- Reuses existing `context: fork` infrastructure (lower implementation cost)
- No runtime dynamic loading complexity (load once at fork creation)
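The difference between the two approaches can be illustrated with a toy model. This is a minimal sketch, not how Claude Code manages context internally: `Context`, `lazy_load`, `run_in_fork`, and the `MCP_COSTS` mapping are all hypothetical names, and the token costs are the approximate figures from the table earlier in this issue.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window that only tracks token usage per loaded item."""
    name: str
    loaded: dict[str, int] = field(default_factory=dict)

    def load(self, item: str, tokens: int) -> None:
        self.loaded[item] = tokens

    @property
    def used(self) -> int:
        return sum(self.loaded.values())


# Approximate per-server token costs from the table earlier in this issue.
MCP_COSTS = {"github": 18_000, "sentry": 14_000}


def lazy_load(main: Context, server: str) -> None:
    """Traditional lazy loading: tool definitions land in the main context."""
    main.load(server, MCP_COSTS[server])


def run_in_fork(main: Context, servers: list[str]) -> int:
    """Context isolation: MCPs load into a fork that is discarded afterwards.

    Returns the fork's token usage; `main` is never modified.
    """
    fork = Context(name=f"{main.name}/fork")
    for server in servers:
        fork.load(server, MCP_COSTS[server])
    return fork.used  # the fork is dropped on return, releasing its context
```

With lazy loading the main context grows by the server's cost once it is first needed and stays that size. With the fork, the same cost is paid only inside the fork, and the main context's usage stays unchanged.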
Observation
Claude Code 2.1.x introduced context: fork for skills, enabling isolated context for specialized operations. This architecture already supports:
- Spawning isolated sub-contexts
- Independent tool permissions per fork
- Clean context separation
Proposal: On-demand MCP Loading
Extend the fork architecture to support MCP assignment at the agent/skill level:
# Example: agents/database-specialist.md
---
name: database-specialist
description: Database operations expert
tools: [Read, Bash, Grep]
mcp: [postgres, redis] # Only loads when this agent runs
context: fork
---
# Example: skills/deploy/SKILL.md
---
description: Deploy to production
mcp: [vercel, github] # Only loads during /deploy
context: fork
---
Proposed Architecture
Main Session (Lean)
│
├── Base MCPs only: filesystem, memory
│ (minimal context footprint)
│
├── Task: database-specialist (forked)
│ └── Loads: postgres, redis (isolated)
│
└── Skill: /deploy (forked)
└── Loads: vercel, github (isolated)
Benefits
- Context Efficiency: Main context stays lean, only loading MCPs when needed
- Granular Permissions: Each agent/skill has its own MCP scope
- Progressive Security: Layered access control instead of all-or-nothing
- Scalability: As MCP ecosystem grows, selective loading becomes essential
Proposed Implementation: Two-Sided Configuration
The ideal solution combines both MCP-side and Agent/Skill-side configuration for maximum flexibility and backward compatibility:
MCP-Side: Lazy Loading Flag in settings.json
{
  "mcpServers": {
    "memory": { "command": "...", "lazy": false },
    "github": { "command": "...", "lazy": true },
    "postgres": { "command": "...", "lazy": true }
  }
}
Agent/Skill-Side: Frontmatter Declaration
# agents/database-specialist.md
---
name: database-specialist
mcp:
  required: [postgres]
  optional: [redis]
context: fork
---
Loading Logic:
| MCP `lazy` Setting | Agent/Skill Declaration | Result |
|---|---|---|
| `false` (or omitted) | - | ✅ Load at session start (current behavior) |
| `true` | Not declared | ❌ Don't load |
| `true` | `mcp: [xxx]` | ✅ Load when agent/skill runs |
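The loading logic can be expressed as a small decision function. This is a minimal sketch of the proposed behavior, not an existing Claude Code API: `plan_loading` and the result labels `session_start`, `on_fork`, and `skip` are hypothetical names, and the `lazy` key is the flag proposed in this issue.

```python
from typing import Any


def plan_loading(
    mcp_servers: dict[str, dict[str, Any]],
    declared: set[str],
) -> dict[str, str]:
    """Map each configured server to 'session_start', 'on_fork', or 'skip'.

    `mcp_servers` mirrors the "mcpServers" object in settings.json.
    `declared` holds the server names listed in an agent's or skill's
    `mcp` frontmatter.
    """
    plan: dict[str, str] = {}
    for name, config in mcp_servers.items():
        lazy = config.get("lazy", False)  # omitted flag keeps current behavior
        if not lazy:
            plan[name] = "session_start"
        elif name in declared:
            plan[name] = "on_fork"
        else:
            plan[name] = "skip"
    return plan
```

For the settings shown earlier and an agent declaring only `postgres`, `memory` would load at session start, `postgres` would load when the agent's fork is created, and `github` would not load at all.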
Why this approach?
- Backward compatible: Omitting `lazy` maintains current behavior
- Gradual migration: Move heavy MCPs to `lazy: true` one at a time
- Fine-grained control: Both infrastructure and application level settings
Challenges to Consider
| Challenge | Possible Solution |
|---|---|
| MCP startup latency | Warm pool or pre-connect |
| State after fork ends | Stateless design or session cache |
| Tool discovery | Lazy manifest (know tools exist, load on use) |
| Credential scoping | Env var inheritance with scope limits |
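As one illustration of the "lazy manifest" idea from the table, the sketch below keeps only tool names in memory and fetches a full definition the first time a tool is used. Everything here is an assumption for illustration: `LazyToolManifest` is a hypothetical class, and `fetch` stands in for whatever call would retrieve a tool's schema from its MCP server.

```python
from typing import Callable


class LazyToolManifest:
    """Knows tool names up front; fetches full definitions only on first use."""

    def __init__(self, names: list[str], fetch: Callable[[str], dict]):
        self._names = list(names)
        self._fetch = fetch  # hypothetical loader supplied by the caller
        self._definitions: dict[str, dict] = {}

    def names(self) -> list[str]:
        """Cheap listing: tool names only, no schemas."""
        return list(self._names)

    def definition(self, name: str) -> dict:
        """Return the full definition, fetching and caching it if needed."""
        if name not in self._names:
            raise KeyError(name)
        if name not in self._definitions:
            self._definitions[name] = self._fetch(name)
        return self._definitions[name]

    def loaded(self) -> list[str]:
        """Names of tools whose definitions have been fetched so far."""
        return sorted(self._definitions)
```

The trade-off is that the model sees only names until a tool is requested, so names would need to be descriptive enough for tool discovery to work.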
About Us: Claude World
This proposal comes from Claude World - a Claude Code developer community based in Taiwan.
- 200+ developers joined on Day 1 of our community launch
- We focus on Claude Code best practices, advanced patterns, and architectural improvements
- MCP efficiency and context management are among our most discussed topics
- We're actively documenting MCP token costs, designing workarounds, and sharing learnings with the global community
We'd love to hear the team's thoughts on this direction!
Related
- `context: fork` implementation in 2.1.0
- Context awareness features in 2.0.65+
- MCP ecosystem growth
- Issue #7336: Feature Request: Lazy Loading for MCP Servers and Tools (95% context reduction possible)
- Issue #11364: Lazy-load MCP tool definitions to reduce context usage
Full Write-up
For detailed analysis with architecture diagrams and implementation considerations: