feat(mcp): add per-message caching for prune_tools to avoid redundant LLM calls #2298

@bug-ops

Problem

prune_tools fires an LLM call on every agent loop iteration in which the tool count exceeds min_tools_to_prune. A multi-turn conversation with 5 tool-use steps therefore makes 5 extra LLM calls per user message, adding roughly 500 ms of latency at 100 ms per call.

Solution

Cache the pruned tool set per user message, keyed on (message_content_hash, tool_list_hash). Reset the cache when a new user message arrives or when the MCP tool list changes.

Implementation note

The cache should live in the agent loop state, not in prune_tools itself, which is a stateless free function.
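A minimal sketch of the proposed cache, to make the keying and reset behaviour concrete. All names here (PruneCacheKey, PruneCache, get_or_prune) are hypothetical and not taken from zeph-core or zeph-mcp. Tools are represented as plain name strings, and the pruning step is passed in as a synchronous closure standing in for the real prune_tools, which presumably is async and takes an LLM provider. A single-entry cache is assumed to be enough: a key mismatch (new user message or changed tool list) simply overwrites the entry, which gives the reset behaviour described above without explicit invalidation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: (message_content_hash, tool_list_hash).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PruneCacheKey {
    message_hash: u64,
    tool_list_hash: u64,
}

/// Hash any hashable value with the standard library's default hasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

impl PruneCacheKey {
    /// Assumption: the tool list is identified by its tool names.
    /// The hash is order-sensitive, so a reordered list counts as changed.
    pub fn new(message: &str, tool_names: &[String]) -> Self {
        Self {
            message_hash: hash_of(message),
            tool_list_hash: hash_of(tool_names),
        }
    }
}

/// Single-entry cache intended to be a field of the agent loop state.
#[derive(Default)]
pub struct PruneCache {
    entry: Option<(PruneCacheKey, Vec<String>)>,
}

impl PruneCache {
    /// Return the cached pruned set if the key matches; otherwise call
    /// `prune` (a stand-in for the stateless prune_tools) and store the result.
    pub fn get_or_prune<F>(&mut self, message: &str, tools: &[String], prune: F) -> Vec<String>
    where
        F: FnOnce(&str, &[String]) -> Vec<String>,
    {
        let key = PruneCacheKey::new(message, tools);
        if let Some((cached_key, pruned)) = &self.entry {
            if *cached_key == key {
                return pruned.clone();
            }
        }
        // Miss: new user message or changed tool list. Overwrite the entry.
        let pruned = prune(message, tools);
        self.entry = Some((key, pruned.clone()));
        pruned
    }
}
```

With this shape, five loop iterations for the same user message and tool list would invoke the pruning closure once rather than five times. Comparing 64-bit hashes instead of full contents leaves a small collision risk; storing the message and tool names directly in the key would remove it at the cost of extra memory.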

Priority

Must be addressed before the wiring PR for acceptable UX.

Component

zeph-core (agent loop), zeph-mcp (pruning.rs)

Labels

P2: High value, medium complexity
enhancement: New feature or request
llm: zeph-llm crate (Ollama, Claude)
tools: Tool execution and MCP integration
