feat(mcp): embedding-based semantic tool discovery (#2321) by bug-ops · Pull Request #2342 · bug-ops/zeph

bug-ops · 2026-03-28T10:44:55Z

Summary

Add SemanticToolIndex to zeph-mcp: embeds MCP tools once at connect time, retrieves top-K by cosine similarity per query — replaces the per-turn LLM call in prune_tools() when strategy = "embedding"
ToolDiscoveryConfig in zeph-config with strategy, top_k, min_similarity, strict, embedding_provider fields; default strategy is "none" (backward compatible, opt-in)
Description sanitization before embedding (strip control chars, 200 char cap) to prevent embedding poisoning
Explicit warn! on all fallback paths; strict = true skips tool sync on failure instead of silently passing all tools
Fixes pre-existing gap: apply_mcp_pruning() was wired but never called in runner.rs
Fixes pre-existing bug: Llm strategy + pruning_enabled=false skipped sync_mcp_executor_tools()
--init wizard updated with step_mcp_discovery(); --migrate-config handled automatically via default.toml
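
As a rough illustration of the description sanitization step listed above, here is a minimal sketch. The function name `sanitize_description` and the constant are hypothetical, and counting the 200 char cap in Unicode scalar values rather than bytes is an assumption; the actual code in zeph-mcp may differ.

```rust
/// Cap taken from the summary ("200 char cap"); counting in `char`s is an assumption.
const MAX_DESCRIPTION_CHARS: usize = 200;

/// Hypothetical helper: drops control characters, then keeps at most
/// `MAX_DESCRIPTION_CHARS` characters of what remains.
fn sanitize_description(raw: &str) -> String {
    raw.chars()
        // `char::is_control` covers the Unicode `Cc` category, including NUL, BEL, tab and newline.
        .filter(|c| !c.is_control())
        // Truncate on character boundaries so multi-byte text is never split mid-codepoint.
        .take(MAX_DESCRIPTION_CHARS)
        .collect()
}
```

Note that this sketch also removes tabs and newlines, since they are control characters; an implementation that wants to preserve word boundaries might map them to spaces instead.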

Test plan

cargo nextest run -p zeph-mcp --features full --lib — 315 tests pass (17 in semantic_index)
cargo nextest run --workspace --features full --exclude exarch-python --exclude exarch-node --lib --bins — 6963 passed
Set [mcp.tool_discovery] strategy = "embedding" in config and verify top-K tool selection in a session with 10+ MCP tools
Set strict = true, kill embedding provider, verify warn log and no tool sync on that turn
Run --init wizard and verify [mcp.tool_discovery] section is prompted and written to output config
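
The strict = true check in the test plan can be pictured with the sketch below. It models only the decision described in the summary: on embedding failure, strict mode skips tool sync for the turn, while non-strict mode warns and falls back to all tools. The `ToolSync` enum and `resolve_tool_sync` function are hypothetical names, and `eprintln!` stands in for the `warn!` macro to keep the example dependency-free.

```rust
/// Hypothetical outcome of tool discovery for a single turn.
#[derive(Debug, PartialEq)]
enum ToolSync {
    /// Sync only the tools chosen by the embedding index.
    Selected(Vec<String>),
    /// Non-strict failure path: fall back to every tool.
    AllTools(Vec<String>),
    /// Strict failure path: skip tool sync on this turn.
    Skip,
}

/// Decides what to sync given the selection result and the `strict` flag.
fn resolve_tool_sync(
    selection: Result<Vec<String>, String>,
    all_tools: &[String],
    strict: bool,
) -> ToolSync {
    match selection {
        Ok(selected) => ToolSync::Selected(selected),
        Err(err) => {
            // Every fallback path is logged explicitly; the PR uses `warn!` here.
            eprintln!("warn: embedding tool discovery failed: {err}");
            if strict {
                ToolSync::Skip
            } else {
                ToolSync::AllTools(all_tools.to_vec())
            }
        }
    }
}
```

With the embedding provider killed, `selection` would be an `Err`, so a strict configuration yields `ToolSync::Skip`, which matches the "no tool sync on that turn" expectation.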

Add SemanticToolIndex to zeph-mcp that embeds MCP tools once at connect time and retrieves top-K by cosine similarity per query, replacing the LLM call in prune_tools() when strategy = "embedding".

- SemanticToolIndex: build() embeds tools concurrently (buffer_unordered 8), select() returns top-K filtered by min_similarity threshold
- always_include resolved from original tool list (not only indexed entries) so failed-to-embed tools can still be pinned
- ToolDiscoveryStrategy defined in zeph-mcp (no circular dep); config type ToolDiscoveryConfig in zeph-config with serde conversion in agent_setup
- Description sanitization before embedding (strips control chars, 200 char cap) to prevent embedding poisoning via keyword-stuffed tool descriptions
- Explicit warn! on all fallback paths; strict = true skips tool sync instead of silently passing all tools on embedding failure
- Default strategy is "none" (opt-in); backward compatible
- Fixes pre-existing gap: apply_mcp_pruning() was parsed but never wired
- Fixes pre-existing bug: Llm + pruning_enabled=false skipped sync_mcp_executor_tools(); added explicit else clause
- --init wizard: step_mcp_discovery() prompts for strategy/top_k/provider
- --migrate-config: handled automatically via default.toml additions

Closes #2321
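
To make the select() behavior described in the commit message concrete, here is a minimal, dependency-free sketch of top-K retrieval by cosine similarity with a min_similarity threshold and always_include pinning. The function names, the `(name, embedding)` tuple layout, and the choice to append pinned tools on top of the top-K results are all assumptions for illustration, not the actual SemanticToolIndex API.

```rust
/// Cosine similarity of two vectors; returns 0.0 for mismatched lengths or zero-norm input.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical stand-in for the index's select step.
/// `indexed` holds only tools that embedded successfully; `all_tool_names` is the original list.
fn select_top_k(
    query: &[f32],
    indexed: &[(String, Vec<f32>)],
    all_tool_names: &[String],
    always_include: &[String],
    top_k: usize,
    min_similarity: f32,
) -> Vec<String> {
    // Score every indexed tool and drop those below the threshold.
    let mut scored: Vec<(f32, &String)> = indexed
        .iter()
        .map(|(name, embedding)| (cosine_similarity(query, embedding), name))
        .filter(|(score, _)| *score >= min_similarity)
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    let mut selected: Vec<String> = scored
        .into_iter()
        .take(top_k)
        .map(|(_, name)| name.clone())
        .collect();
    // Pinned tools are checked against the full tool list, so a tool that
    // failed to embed (and is absent from `indexed`) can still be included.
    for name in always_include {
        if all_tool_names.contains(name) && !selected.contains(name) {
            selected.push(name.clone());
        }
    }
    selected
}
```

For example, with a query embedding of `[1.0, 0.0]`, indexed tools `read_file` at `[1.0, 0.0]`, `list_dir` at `[0.9, 0.1]`, and `web_search` at `[0.0, 1.0]`, plus `top_k = 2` and `min_similarity = 0.5`, the sketch returns `read_file` and `list_dir`, followed by any pinned tool present in the original list.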

github-actions bot added the enhancement, size/XL, documentation, rust, core, dependencies, and config labels Mar 28, 2026

bug-ops force-pushed the semantic-tool-discovery branch from 8f9a16b to 8f87220 Compare March 28, 2026 10:45

bug-ops enabled auto-merge (squash) March 28, 2026 10:47

bug-ops added 2 commits March 28, 2026 11:57

fix(mcp): resolve clippy warnings in init.rs wizard step

718d989

bug-ops force-pushed the semantic-tool-discovery branch from f04adb8 to 718d989 Compare March 28, 2026 10:57

bug-ops merged commit 20f4571 into main Mar 28, 2026
25 checks passed

bug-ops deleted the semantic-tool-discovery branch March 28, 2026 11:04

bug-ops merged 2 commits into main from semantic-tool-discovery
