research(tools): parallel tool dispatch within a single LLM response (LLMCompiler) #1646
Description
Source
LLMCompiler: An LLM Compiler for Parallel Function Calling (ICML 2024, widely deployed in production 2025–2026)
Finding
LLMCompiler identifies tool calls in a single LLM response that have no data dependencies between them (no argument of tool B references the output of tool A) and executes them in parallel. The paper reports up to 3.7x lower latency and up to 6.7x cost savings compared with sequential ReAct.
Applicability
Zeph's tool executor in zeph-tools executes tool calls sequentially within a single agent turn even when calls are independent. The TaskGraph / DagScheduler infrastructure already exists for inter-turn parallelism; the gap is intra-turn parallel dispatch.
Implementation sketch:
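- Parse tool calls from the LLM response and build a dependency graph (a call whose arguments reference another call's tool_use_id is dependent; otherwise it is independent)
- Execute independent groups in parallel via futures::future::try_join_all or tokio::task::JoinSet (tokio itself has no try_join_all function; its try_join! macro only accepts a fixed number of futures)
- Return all results before the next LLM turn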
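The steps above could look roughly like the following sketch. It is only an illustration under several assumptions: `ToolCall`, `plan_waves`, and `run_waves` are hypothetical names rather than existing zeph-tools APIs, a dependency is detected by naive substring matching of one call's id inside another call's raw arguments, and `std::thread::scope` stands in for the async dispatch so the example needs no external crates.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parsed tool call (not Zeph's actual type).
#[derive(Debug, Clone)]
struct ToolCall {
    id: String,   // tool_use_id assigned in the LLM response
    name: String, // tool name
    args: String, // raw JSON arguments
}

/// Groups call indices into "waves": every call in a wave depends only on
/// calls from earlier waves. Returns None if the dependencies form a cycle.
fn plan_waves(calls: &[ToolCall]) -> Option<Vec<Vec<usize>>> {
    // deps[i] = indices of calls whose id appears in call i's arguments.
    // Assumption: substring matching is good enough here; real ids that are
    // prefixes of each other (call_1 / call_10) would need proper JSON parsing.
    let deps: Vec<Vec<usize>> = calls
        .iter()
        .enumerate()
        .map(|(i, call)| {
            calls
                .iter()
                .enumerate()
                .filter(|(j, other)| *j != i && call.args.contains(other.id.as_str()))
                .map(|(j, _)| j)
                .collect()
        })
        .collect();

    let mut done: HashSet<usize> = HashSet::new();
    let mut waves = Vec::new();
    while done.len() < calls.len() {
        // A call is ready once all of its dependencies have completed.
        let ready: Vec<usize> = (0..calls.len())
            .filter(|i| !done.contains(i) && deps[*i].iter().all(|d| done.contains(d)))
            .collect();
        if ready.is_empty() {
            return None; // remaining calls wait on each other
        }
        done.extend(ready.iter().copied());
        waves.push(ready);
    }
    Some(waves)
}

/// Runs each wave in parallel (scoped threads as a stand-in for async tasks)
/// and returns the results in the original call order.
fn run_waves<F>(calls: &[ToolCall], waves: &[Vec<usize>], run: F) -> Vec<String>
where
    F: Fn(&ToolCall) -> String + Sync,
{
    let mut results = vec![String::new(); calls.len()];
    for wave in waves {
        let outputs: Vec<(usize, String)> = std::thread::scope(|s| {
            // Spawn every call of the wave before joining any of them.
            let handles: Vec<_> = wave
                .iter()
                .map(|&i| {
                    let run = &run;
                    s.spawn(move || (i, run(&calls[i])))
                })
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("tool thread panicked"))
                .collect()
        });
        for (i, out) in outputs {
            results[i] = out;
        }
    }
    results
}

fn main() {
    let calls = vec![
        ToolCall { id: "toolu_a".into(), name: "read_file".into(), args: r#"{"path":"Cargo.toml"}"#.into() },
        ToolCall { id: "toolu_b".into(), name: "list_dir".into(), args: r#"{"path":"src"}"#.into() },
        ToolCall { id: "toolu_c".into(), name: "summarize".into(), args: r#"{"input":"toolu_a"}"#.into() },
    ];
    let waves = plan_waves(&calls).expect("cyclic tool-call dependencies");
    println!("{:?}", waves); // prints [[0, 1], [2]]
    let results = run_waves(&calls, &waves, |c| format!("{} done", c.name));
    println!("{:?}", results);
}
```

In an async executor the per-wave thread spawn would presumably be replaced by awaiting the wave's futures together, and the outputs of earlier waves would be substituted into dependent calls' arguments before dispatch; both are left out here to keep the sketch small.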
Related: #1628 (DagScheduler async dispatch — inter-task, not intra-turn)
Priority
High — clear latency and cost benefit with existing infrastructure; no new abstractions needed.