Feature Request: Improve Claude Code Token Management with MCP Servers #7172

@nCubed

Problem Statement

Claude Code's current MCP server architecture creates significant workflow friction and inefficient resource utilization. All configured MCP servers load their complete tool schemas into the context at session initialization, consuming tokens regardless of actual usage.

Specific Issues:

  • Static token overhead: 18.3k tokens (9.2% of context) consumed by unused AWS MCP servers
  • Configuration-time resource decisions: the server set must be fixed up front, even though tool needs emerge during discovery-driven work
  • Session restart required to modify MCP server availability
  • Premature optimization pressure: users must choose between token efficiency and tool availability
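
The overhead figure can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this issue (200k window, 89k in use, 18.3k of MCP schemas). It assumes the 18.3k is already counted inside the 89k, which the issue does not state explicitly.

```python
# Figures reported in this issue (see "Current Environment").
CONTEXT_WINDOW = 200_000  # total context size in tokens
CURRENT_USAGE = 89_000    # tokens in use when the issue was written
MCP_OVERHEAD = 18_300     # tokens taken by MCP tool schemas


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Share of the whole window spent on MCP schemas (the issue rounds this to 9.2%).
overhead_of_window = percent(MCP_OVERHEAD, CONTEXT_WINDOW)

# Share of the tokens currently in use that are MCP schemas.
# Assumption: the 18.3k overhead is included in the 89k usage figure.
overhead_of_usage = percent(MCP_OVERHEAD, CURRENT_USAGE)

# Usage that would remain if unused schemas cost nothing (same assumption).
usage_without_mcp = CURRENT_USAGE - MCP_OVERHEAD

print(f"MCP overhead: {overhead_of_window:.2f}% of the window")
print(f"MCP overhead: {overhead_of_usage:.1f}% of current usage")
print(f"Usage without MCP schemas: {usage_without_mcp} tokens")
```

Under that assumption, roughly a fifth of the tokens in use at the time of the report were tool schemas rather than code or conversation.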

Impact on Developer Workflow

Real development scenarios require dynamic tool access patterns that the current architecture cannot support:

  1. Mid-conversation discovery: A developer realizes while debugging that they need AWS documentation, but the relevant MCP server was not loaded at session start
  2. Context-dependent tooling: Different projects require different AWS services (Lambda vs CDK vs pricing analysis)
  3. Token budget management: 18.3k tokens of static overhead reduce the effective context window by roughly 4-5k lines of code
  4. Workflow interruption: Restarting sessions to change MCP configuration breaks conversation continuity

Technical Root Cause

The system treats MCP servers as session-scoped heavyweight resources rather than on-demand lightweight services. Tool schema definitions are loaded eagerly at session start instead of lazily on first use, so context tokens are spent on tools that may never be called.
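
To make the eager-versus-lazy distinction concrete, here is a minimal sketch. `ServerStub`, `eager_cost`, and `lazy_cost` are hypothetical names and do not reflect Claude Code internals. The server names come from this issue, but the per-server token split is invented for illustration; only the 18.3k total is reported.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class ServerStub:
    """Hypothetical stand-in for an MCP server (not a real Claude Code type)."""
    name: str
    schema_tokens: int
    loaded: bool = False

    def load(self) -> int:
        """Load the schema once; return the tokens this call added to context."""
        if self.loaded:
            return 0
        self.loaded = True
        return self.schema_tokens


def make_servers() -> List[ServerStub]:
    # Illustrative split only: the issue reports the 18.3k total, not these parts.
    return [
        ServerStub("aws-core", 3_000),
        ServerStub("aws-documentation", 2_300),
        ServerStub("aws-cdk", 8_000),
        ServerStub("aws-pricing", 5_000),
    ]


def eager_cost(servers: Iterable[ServerStub]) -> int:
    """Current behaviour as described: every schema is loaded at session start."""
    return sum(server.load() for server in servers)


def lazy_cost(servers: Iterable[ServerStub], referenced: Set[str]) -> int:
    """Proposed behaviour: load a schema only if the server is actually referenced."""
    return sum(server.load() for server in servers if server.name in referenced)


eager_total = eager_cost(make_servers())
lazy_total = lazy_cost(make_servers(), referenced={"aws-documentation"})
print(f"eager: {eager_total} tokens, lazy: {lazy_total} tokens")
```

With these made-up numbers, a session that only ever consults `aws-documentation` pays for one schema instead of four.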

Proposed Solutions

Primary: Runtime MCP Server Management

  • Enable/disable servers within active sessions without configuration changes
  • UI controls in /mcp interface for real-time server toggling
  • Tool schema loading/unloading on demand
  • Preserve conversation context during server state changes
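
One way to picture runtime management is a session object whose set of enabled servers can change while the conversation history stays untouched. This is only a sketch under stated assumptions: `SessionSketch` and its methods are hypothetical, and the token counts are placeholders rather than measured values.

```python
from typing import Dict, List, Set


class SessionSketch:
    """Hypothetical session model: server state and history are independent."""

    def __init__(self, schema_tokens: Dict[str, int]) -> None:
        self._schema_tokens = dict(schema_tokens)  # configured servers -> schema size
        self.enabled: Set[str] = set()             # servers whose schemas are in context
        self.history: List[str] = []               # conversation turns

    def enable(self, name: str) -> None:
        """Load a configured server's schema without restarting the session."""
        if name not in self._schema_tokens:
            raise KeyError(f"unknown server: {name}")
        self.enabled.add(name)

    def disable(self, name: str) -> None:
        """Unload a server's schema; a no-op if it was not enabled."""
        self.enabled.discard(name)

    def schema_overhead(self) -> int:
        """Tokens currently spent on tool schemas."""
        return sum(self._schema_tokens[name] for name in self.enabled)


# Placeholder token counts, not measurements.
session = SessionSketch({"aws-documentation": 2_300, "aws-cdk": 8_000})
session.history.append("user: why does this Lambda time out?")

session.enable("aws-documentation")        # needed mid-conversation
overhead_while_enabled = session.schema_overhead()

session.disable("aws-documentation")       # no longer needed
overhead_after_disable = session.schema_overhead()

print(overhead_while_enabled, overhead_after_disable, len(session.history))
```

The point of the design is that `enable` and `disable` never touch `history`, which is the behaviour the last bullet asks for.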

Secondary: Intelligent Tool Loading

  • Lazy schema initialization: Load tool definitions only when first referenced
  • Contextual server suggestions: Claude identifies and requests needed servers mid-conversation
  • Automatic schema eviction: Unload unused tool definitions to reclaim tokens
  • Token-aware prioritization: Prefer lightweight servers when context pressure exists
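
Lazy loading and automatic eviction could be combined in a small least-recently-used cache with a token budget. The sketch below is an assumption about how such a policy might look, not a description of any existing mechanism; `SchemaCache`, the budget, and the token counts are all hypothetical.

```python
from collections import OrderedDict
from typing import List


class SchemaCache:
    """Hypothetical LRU cache of tool schemas bounded by a token budget."""

    def __init__(self, budget_tokens: int) -> None:
        self.budget = budget_tokens
        # Maps server name -> schema tokens, ordered least recently used first.
        self._loaded: "OrderedDict[str, int]" = OrderedDict()

    @property
    def used(self) -> int:
        return sum(self._loaded.values())

    def reference(self, name: str, tokens: int) -> List[str]:
        """Load a schema on first reference; return the servers evicted to fit it."""
        if name in self._loaded:
            self._loaded.move_to_end(name)  # mark as most recently used
            return []
        evicted: List[str] = []
        # Evict least recently used schemas until the new one fits.
        # A schema larger than the whole budget is still loaded, into an empty cache.
        while self._loaded and self.used + tokens > self.budget:
            oldest, _ = self._loaded.popitem(last=False)
            evicted.append(oldest)
        self._loaded[name] = tokens
        return evicted

    def loaded(self) -> List[str]:
        return list(self._loaded)


cache = SchemaCache(budget_tokens=10_000)       # placeholder budget
cache.reference("aws-core", 3_000)              # placeholder sizes
cache.reference("aws-documentation", 2_300)
evicted = cache.reference("aws-pricing", 5_000)  # would exceed the budget
print(evicted, cache.loaded(), cache.used)
```

A token-aware variant could evict the largest idle schema first instead of the oldest; LRU is used here only because it is the simplest policy to reason about.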

Tertiary: Enhanced Configuration Scoping

  • Session profiles: Quick-switch between predefined MCP server combinations
  • Project-based auto-configuration: Automatically load relevant servers based on project type detection
  • Usage analytics: Track MCP server utilization to inform configuration optimization
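
Session profiles and project-based auto-configuration might be expressed as a mapping from profile name to server list, plus a detection step. Everything here is illustrative: the profile names, the `PROFILES` mapping, and `detect_profile` are hypothetical, and the detection rules (a `cdk.json` file for CDK projects, `.tf` files for Terraform) are simple heuristics chosen for the example.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical profiles built from the server names listed in this issue.
PROFILES: Dict[str, List[str]] = {
    "serverless-terraform": ["aws-core", "aws-documentation"],
    "cdk": ["aws-core", "aws-cdk"],
    "cost-review": ["aws-pricing"],
}


def detect_profile(filenames: Iterable[str]) -> Optional[str]:
    """Guess a profile from file names in the project root (heuristic only)."""
    names = list(filenames)
    if "cdk.json" in names:
        return "cdk"
    if any(name.endswith(".tf") for name in names):
        return "serverless-terraform"
    return None  # no match: leave the choice to the user


def servers_for(filenames: Iterable[str]) -> List[str]:
    """Return the servers to load for a project, or none if detection fails."""
    profile = detect_profile(filenames)
    return PROFILES[profile] if profile is not None else []


terraform_servers = servers_for(["main.tf", "handler.py"])
unknown_servers = servers_for(["README.md"])
print(terraform_servers, unknown_servers)
```

For the Terraform-based setup described in this issue, such a profile would leave `aws-cdk` and `aws-pricing` unloaded by default.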

Success Criteria

  1. Zero-restart server management: Developers can enable the aws-documentation MCP server mid-conversation without interrupting the session
  2. Token efficiency: Unused servers consume zero context tokens
  3. Workflow preservation: MCP server changes maintain conversation history and context
  4. Predictable performance: Server loading/unloading operations complete within 2-3 seconds

Business Justification

This change directly affects developer productivity with Claude Code:

  • Reduced cognitive overhead: No need to predict entire toolchain requirements at session start
  • Improved context utilization: Recover 9%+ of context window for actual code and conversation
  • Enhanced user experience: Eliminate artificial workflow constraints that force suboptimal behavior

Current Environment

  • Claude Code with global MCP server configuration
  • AWS MCP servers: aws-core, aws-documentation, aws-cdk, aws-pricing
  • Context usage: 89k/200k tokens with 18.3k MCP overhead
  • Development focus: Serverless/Lambda with Terraform (not CDK)

Priority Classification

High Priority: This addresses a fundamental architectural constraint that forces users to spend context tokens on tools they do not use, which undercuts the core value of Claude Code as a development productivity tool.