[Feature Request] MCP Tool Search - Lazy Loading for 85% Token Reduction #9350
Description
Feature Request: MCP Tool Search - Lazy Loading for Token Optimization
Summary
Implement MCP Tool Search functionality to enable lazy loading of MCP server tools, dramatically reducing token consumption and improving context efficiency for multi-MCP server setups.
Problem Statement
Currently, OpenCode loads all tool definitions from all connected MCP servers at session startup. This causes severe context window bloat:
- 7+ MCP servers: Consume 67,000+ tokens before any user interaction
- Single Docker MCP server (135 tools): Consumes 125,000 tokens
- Typical 4-server setup: Burns 51,000 tokens (46.9% of context window)
- Result: Users lose 50-70% of their 200K context limit before writing a single prompt
This creates a brutal tradeoff: either limit MCP servers to 2-3 core tools, or accept that half your context budget disappears before work begins.
Proposed Solution: MCP Tool Search (Lazy Loading)
Implement a lazy loading system that:
- Loads only a lightweight search index (~5K tokens) at startup instead of all tool definitions
- Dynamically fetches tool definitions only when needed for a specific task
- Automatically activates when tool descriptions would exceed 10% of available context
- Maintains backward compatibility with all existing MCP servers
Expected Benefits
| Metric | Before | After | Improvement |
|---|---|---|---|
| MCP Tools Token Consumption | 39.8K tokens (19.9%) | ~5K tokens (2.5%) | 85% reduction |
| Available Context | 92K tokens (46%) | 195K tokens (97.5%) | 112% increase |
| 4-Server Setup | 51K tokens | 8.5K tokens | ~83% reduction |
| Tool Selection Accuracy (Opus 4.5) | 79.5% | 88.1% | 10.8% increase |
| Tool Selection Accuracy (Opus 4) | 49% | 74% | 51% increase |
Implementation Approach
1. Core Architecture
- Create a lightweight tool registry/index at session startup
- Implement on-demand tool definition loading via search over the index (regex matching or BM25 ranking)
- Use `serverInstructions` field metadata for intelligent tool discovery
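To make the registry idea concrete, here is a minimal sketch of what a lightweight index entry could look like next to a full tool definition. The `RegistryEntry` shape, `buildRegistry`, and the 4-characters-per-token estimate are assumptions for illustration, not OpenCode's actual types or tokenizer; only `name`, `description`, and `inputSchema` come from the MCP tool format.

```typescript
// Full tool definition as an MCP server reports it. The schema is the expensive part.
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: unknown;
}

// Hypothetical compact entry kept in context at startup.
interface RegistryEntry {
  server: string;
  tool: string;
  summary: string; // truncated description, no schema
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Build one compact entry per tool, dropping the input schema entirely.
function buildRegistry(
  servers: Record<string, ToolDefinition[]>,
  maxSummaryChars = 80,
): RegistryEntry[] {
  const entries: RegistryEntry[] = [];
  for (const server of Object.keys(servers)) {
    for (const tool of servers[server]) {
      entries.push({
        server,
        tool: tool.name,
        summary: (tool.description ?? "").slice(0, maxSummaryChars),
      });
    }
  }
  return entries;
}

// Estimated token cost of the registry when rendered one line per tool.
function registryTokens(entries: RegistryEntry[]): number {
  return entries.reduce(
    (sum, e) => sum + estimateTokens(`${e.server}.${e.tool}: ${e.summary}`),
    0,
  );
}
```

Comparing `registryTokens(buildRegistry(servers))` against `estimateTokens(JSON.stringify(servers))` gives a quick, approximate view of how much the index saves for a given setup.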
2. Configuration Options
// Global settings
{
"enable_tool_search": true // Auto-enabled when beneficial
}
// Per-MCP server configuration
{
"mcpServers": {
"my-server": {
"command": "node",
"args": ["/path/to/server.js"],
"serverInstructions": "Database operations for PostgreSQL including queries, schema management, and data migrations. Use for any database-related tasks."
}
}
}
3. Activation Logic
- Automatic: Activates when tool descriptions exceed 10% of the context window
- Manual Override: Allow users to enable/disable via settings
- Per-Server Control: Option to disable for specific high-frequency tools
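The activation rules above can be sketched as a single decision function. This is an illustration under assumptions: the `ActivationInput` shape and the `shouldUseToolSearch` name are hypothetical, and an unset `setting` is treated here as "decide automatically".

```typescript
interface ActivationInput {
  toolDefinitionTokens: number; // estimated tokens for all tool definitions
  contextWindowTokens: number;  // total context window of the model
  setting?: boolean;            // hypothetical manual override; undefined means automatic
  thresholdRatio?: number;      // defaults to the 10% threshold from the proposal
}

function shouldUseToolSearch(input: ActivationInput): boolean {
  const { toolDefinitionTokens, contextWindowTokens, setting } = input;
  const thresholdRatio = input.thresholdRatio ?? 0.1;

  // Manual override wins in both directions.
  if (setting !== undefined) return setting;

  // Automatic: activate only when definitions exceed the threshold share.
  return toolDefinitionTokens > contextWindowTokens * thresholdRatio;
}
```

With the numbers from the problem statement, a 51,000-token tool set in a 200K window would activate tool search automatically, while a 15,000-token set would not.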
4. User Experience
- Transparent operation - no workflow changes required
- Monitor usage via `/context` and `/mcp` commands
- Clear indication of which tools are loaded on-demand
Technical Implementation Details
Search Index Creation
- Index tool names, descriptions, and `serverInstructions`
- Use BM25 ranking or regex matching to match user queries to tools
- Keep index under 5K tokens total
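As one possible shape for the search side, here is a small self-contained BM25 ranker over tool text. It is a sketch, not a proposed final implementation: the `IndexedTool` shape and `Bm25Index` class are hypothetical, the tokenizer is deliberately naive (lowercase alphanumeric runs, no stemming), and `k1 = 1.2`, `b = 0.75` are common defaults rather than tuned values.

```typescript
// Hypothetical input: one text blob per tool (name + description + serverInstructions).
interface IndexedTool {
  id: string;
  text: string;
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

class Bm25Index {
  private docs: { id: string; termFreq: Map<string, number>; length: number }[] = [];
  private docFreq = new Map<string, number>();
  private avgLength = 0;
  private k1: number;
  private b: number;

  constructor(tools: IndexedTool[], k1 = 1.2, b = 0.75) {
    this.k1 = k1;
    this.b = b;
    let totalLength = 0;
    for (const tool of tools) {
      const terms = tokenize(tool.text);
      const termFreq = new Map<string, number>();
      for (const term of terms) termFreq.set(term, (termFreq.get(term) ?? 0) + 1);
      // Document frequency counts each term once per tool.
      termFreq.forEach((_count, term) => {
        this.docFreq.set(term, (this.docFreq.get(term) ?? 0) + 1);
      });
      this.docs.push({ id: tool.id, termFreq, length: terms.length });
      totalLength += terms.length;
    }
    this.avgLength = tools.length > 0 ? totalLength / tools.length : 0;
  }

  // Returns tools with a positive score, best match first.
  search(query: string, limit = 5): { id: string; score: number }[] {
    const queryTerms = Array.from(new Set(tokenize(query)));
    const n = this.docs.length;
    const results: { id: string; score: number }[] = [];
    for (const doc of this.docs) {
      let score = 0;
      for (const term of queryTerms) {
        const tf = doc.termFreq.get(term);
        if (!tf) continue;
        const df = this.docFreq.get(term) ?? 0;
        const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
        const lengthNorm = 1 - this.b + this.b * (doc.length / (this.avgLength || 1));
        score += (idf * tf * (this.k1 + 1)) / (tf + this.k1 * lengthNorm);
      }
      if (score > 0) results.push({ id: doc.id, score });
    }
    return results.sort((x, y) => y.score - x.score).slice(0, limit);
  }
}
```

Because the naive tokenizer does no stemming, "queries" and "query" do not match each other; a real implementation would likely need stemming or a regex fallback to cover such cases.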
On-Demand Loading
- Intercept tool invocation requests
- Query index for relevant tools
- Load only matched tool definitions into context
- Cache loaded tools for session duration
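The loading and caching steps above could look roughly like the following. The `LazyToolLoader` class and the `FetchDefinition` callback are hypothetical names; how a single definition is actually obtained from an MCP server (for example by filtering a `tools/list` response) is left to the callback and is not specified here.

```typescript
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: unknown;
}

// Hypothetical callback that retrieves one full definition from a server.
type FetchDefinition = (server: string, tool: string) => Promise<ToolDefinition>;

class LazyToolLoader {
  // Caching the promise (not the value) also deduplicates concurrent requests.
  private cache = new Map<string, Promise<ToolDefinition>>();
  private fetchDefinition: FetchDefinition;

  constructor(fetchDefinition: FetchDefinition) {
    this.fetchDefinition = fetchDefinition;
  }

  load(server: string, tool: string): Promise<ToolDefinition> {
    const key = `${server}/${tool}`;
    let pending = this.cache.get(key);
    if (!pending) {
      pending = this.fetchDefinition(server, tool);
      this.cache.set(key, pending);
      // Drop failed fetches so a later call can retry.
      pending.catch(() => this.cache.delete(key));
    }
    return pending;
  }

  // Keys of definitions requested so far, e.g. for a status display.
  loadedKeys(): string[] {
    return Array.from(this.cache.keys());
  }
}
```

The cache lives as long as the loader instance, so creating one loader per session gives the session-duration caching described in the list.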
Server Instructions Enhancement
{
"serverInstructions": "Database operations for PostgreSQL including queries, schema management, and data migrations. Use for any database-related tasks. Supports SQL queries, table creation, schema migration, and backup operations."
}
References
- Anthropic's MCP Tool Search: Released January 2026
- Original Feature Request: Lazy Loading for MCP Servers and Tools (95% context reduction possible), anthropics/claude-code#7336
- Documentation: https://claudefa.st/blog/tools/mcp-extensions/mcp-tool-search
- Analysis: https://venturebeat.com/orchestration/claude-code-just-got-updated-with-one-of-the-most-requested-user-features
- Performance Data: https://medium.com/@joe.njenga/claude-code-just-cut-mcp-context-bloat-by-46-9-51k-tokens-down-to-8-5k-with-new-tool-search-ddf9e905f734
Use Cases
- Multi-MCP Users: Run 7+ servers without context exhaustion
- Complex Workflows: Maintain conversation history across extended sessions
- Tool-Rich Environments: Access 100+ tools without performance penalty
- Enterprise Deployments: Scale tool integrations without token constraints
Priority
High - This is a critical feature for scalability and user experience. It removes the primary constraint limiting MCP adoption and enables more sophisticated agent workflows.
Additional Considerations
- Backward Compatibility: Must work with all existing MCP servers
- Performance: Search operations should be fast (<100ms)
- Monitoring: Provide tools to track token savings
- Migration Path: Smooth transition from current behavior
Related Features
- Programmatic Tool Use
- Advanced Tool Discovery
- Context Optimization
- MCP Server Management
This feature would let OpenCode handle complex, multi-tool workflows without giving up a large share of the context window. The implementation approach follows the pattern established by Anthropic's Claude Code.