
research(mcp): Semantic Tool Discovery — embedding-based MCP tool retrieval reduces context overhead without accuracy loss (arXiv:2603.20313) #2321

@bug-ops

Description


Paper

Semantic Tool Discovery for Large Language Models: A Vector-Based Approach to MCP Tool Selection

Summary

The paper addresses the MCP scalability problem: exposing 50–100+ tools in context inflates token usage and cost and degrades accuracy. It proposes embedding all available MCP tools into a dense vector index and, at inference time, selecting only the 3–5 tools most semantically relevant to each query, cutting context overhead without reducing task success rate.
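The retrieval scheme can be illustrated with a minimal sketch. The term-frequency `embed()` here is a toy stand-in for the dense sentence-embedding model the paper assumes, and the tool names and descriptions are invented:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": term-frequency vector. A real implementation
    # would call a dense sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(v * b[t] for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_tools(query: str, tools: dict[str, str], k: int = 3) -> list[str]:
    # Rank all tool descriptions by similarity to the query and
    # return only the k most relevant tool names.
    q = embed(query)
    ranked = sorted(tools, key=lambda name: cosine(q, embed(tools[name])), reverse=True)
    return ranked[:k]

# Invented example tool registry:
tools = {
    "read_file": "read the contents of a file from disk",
    "send_email": "send an email message to a recipient",
    "query_db": "run a sql query against the database",
    "resize_image": "resize or crop an image file",
}
```

For a query like `"run a select query on the database"`, `top_k_tools(query, tools, k=2)` ranks `query_db` first, so only that small subset ever enters the model's context.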

Relevance to Zeph

zeph-mcp currently loads and passes all registered MCP tools to the context. PR #2293 (merged) added prune_tools() via LLM-based selection, but this paper provides a faster, cheaper, peer-reviewed alternative: embedding-based retrieval instead of LLM prompting for tool selection.

Key implementation considerations:

  • Tools would be embedded once at connect time into a per-server vector index
  • Per-query retrieval replaces the LLM call in prune_tools() — lower latency, no token cost
  • Complements tool_schema_filter (which already uses embedding-based selection for Zeph's built-in tools) — MCP tools could use the same pipeline
  • The existing EmbeddingAnomalyGuard embedding infrastructure in zeph-mcp could be reused
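A minimal sketch of how the considerations above could fit together, assuming a hypothetical per-server index built once at connect time and queried per request. `ToolIndex` and `toy_embed` are illustrative names, not existing zeph-mcp APIs; a real implementation would reuse the `EmbeddingAnomalyGuard` embedding model rather than the normalized term-frequency vectors used here:

```python
import math
from collections import Counter

def toy_embed(text: str) -> dict:
    # Illustrative stand-in for a real embedding model:
    # L2-normalized term-frequency vector as a sparse dict.
    c = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in c.values()))
    return {t: v / norm for t, v in c.items()}

def dot(a: dict, b: dict) -> float:
    return sum(v * b.get(t, 0.0) for t, v in a.items())

class ToolIndex:
    """Hypothetical per-server vector index: tool descriptions are
    embedded once at connect time, then each query retrieves the
    top-k tools instead of invoking an LLM-based prune_tools()."""

    def __init__(self, tools: dict[str, str], embed_fn=toy_embed):
        self.embed_fn = embed_fn
        self.names = list(tools)
        # Embed every tool description exactly once, at connect time.
        self.vectors = [embed_fn(desc) for desc in tools.values()]

    def retrieve(self, query: str, k: int = 5) -> list[str]:
        # Per-query retrieval: pure vector math, no LLM call,
        # so latency and token cost are both near zero.
        q = self.embed_fn(query)
        ranked = sorted(
            zip(self.names, self.vectors),
            key=lambda nv: dot(q, nv[1]),
            reverse=True,
        )
        return [name for name, _ in ranked[:k]]
```

Because the index is per-server, reconnecting or a tool-list change only re-embeds that server's tools, and the same pipeline could serve both MCP tools and the built-in tools already handled by tool_schema_filter.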

Distinct from the MCP/ACP/A2A interoperability survey (#2307) and tool invocation reliability taxonomy (#2234).

Priority

P2 — directly applicable to zeph-mcp::pruning, provides a faster/cheaper alternative to current LLM-based approach.

Metadata


Labels

  • P2 (High value, medium complexity)
  • research (Research-driven improvement)
