research(mcp): Semantic Tool Discovery — embedding-based MCP tool retrieval reduces context overhead without accuracy loss (arXiv:2603.20313) #2321
Description
Paper
Semantic Tool Discovery for Large Language Models: A Vector-Based Approach to MCP Tool Selection
- arXiv: https://arxiv.org/abs/2603.20313 (submitted 19 March 2026)
Summary
Addresses the MCP scalability problem: exposing 50–100+ tools in context inflates token usage and cost and degrades accuracy. The paper proposes embedding all available MCP tools into a dense vector index and dynamically selecting only the 3–5 most semantically relevant tools per query at inference time, reducing context overhead without degrading task success rate.
Relevance to Zeph
zeph-mcp currently loads and passes all registered MCP tools to the context. PR #2293 (merged) added prune_tools() via LLM-based selection, but this paper provides a faster, cheaper, peer-reviewed alternative: embedding-based retrieval instead of LLM prompting for tool selection.
Key implementation considerations:
- Tools would be embedded once at connect time into a per-server vector index
- Per-query retrieval replaces the LLM call in `prune_tools()` — lower latency, no token cost
- Complements `tool_schema_filter` (which already uses embedding-based selection for Zeph's built-in tools) — MCP tools could use the same pipeline
- The existing `EmbeddingAnomalyGuard` embedding infrastructure in `zeph-mcp` could be reused
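A minimal sketch of the connect-time index plus per-query top-k retrieval described above. Everything here is hypothetical: `ToolIndex`, the example tools, and the toy hashed bag-of-words `embed()` are stand-ins (a real implementation would call whatever embedding model backs `EmbeddingAnomalyGuard`, not CRC-hashed token counts):

```python
import math
import re
import zlib


def embed(text: str, dim: int = 256) -> list[float]:
    """Toy stand-in for a real embedding model: deterministic hashed
    bag-of-words, L2-normalized so dot product equals cosine similarity."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalized, so this is cosine similarity.
    return sum(x * y for x, y in zip(a, b))


class ToolIndex:
    """Per-server vector index, built once when the MCP server connects."""

    def __init__(self, tools: dict[str, str]):
        # tools maps tool name -> natural-language description
        self.vectors = {name: embed(desc) for name, desc in tools.items()}

    def top_k(self, query: str, k: int = 3) -> list[str]:
        """Return the k tool names most semantically similar to the query,
        replacing the LLM call in prune_tools() with a vector lookup."""
        q = embed(query)
        ranked = sorted(
            self.vectors, key=lambda n: cosine(q, self.vectors[n]), reverse=True
        )
        return ranked[:k]


# Usage: build at connect time, retrieve per query (hypothetical tool set).
index = ToolIndex({
    "fetch_url": "Download the contents of a web page given its URL",
    "run_sql": "Execute a SQL query against the configured database",
    "send_email": "Send an email message to a recipient",
})
print(index.top_k("query the orders table in the database", k=1))
```

The same index could serve both MCP tools and Zeph's built-in tools if it shares the `tool_schema_filter` embedding pipeline; the only per-query cost is one query embedding and a similarity scan (or an ANN lookup at larger tool counts).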
Distinct from the MCP/ACP/A2A interoperability survey (#2307) and tool invocation reliability taxonomy (#2234).
Priority
P2 — directly applicable to zeph-mcp::pruning, provides a faster/cheaper alternative to current LLM-based approach.