
research(tools): parallel tool dispatch within a single LLM response (LLMCompiler) #1646

@bug-ops


Source

LLMCompiler: An LLM Compiler for Parallel Function Calling (ICML 2024, widely deployed in production 2025–2026)

Finding

LLMCompiler identifies tool calls in a single LLM response that have no data dependencies between them (that is, no argument of tool B references the output of tool A) and executes them in parallel. The paper reports up to 3.7x lower latency and up to 6.7x cost savings vs. sequential ReAct.

Applicability

Zeph's tool executor in zeph-tools executes tool calls sequentially within a single agent turn even when calls are independent. The TaskGraph / DagScheduler infrastructure already exists for inter-turn parallelism; the gap is intra-turn parallel dispatch.

Implementation sketch:

  1. Parse tool calls from LLM response and build a dependency graph (calls referencing another call's tool_use_id are dependent; otherwise independent)
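  2. Execute each group of independent calls concurrently, e.g. via futures::future::try_join_all or tokio::task::JoinSet (tokio itself has no try_join_all function, only the try_join! macro for a fixed number of futures)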
  3. Return all results before the next LLM turn
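
The three steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not Zeph's actual code: `ToolCall`, `plan_waves`, and `run_waves` are hypothetical names, and the `depends_on` field assumes the referenced tool_use_ids have already been extracted from each call's arguments. To stay runnable with the standard library only, it uses `std::thread::scope` in place of an async join; in zeph-tools the same wave structure would presumably be driven by an async combinator on the tokio runtime.

```rust
use std::collections::{HashMap, HashSet};
use std::thread;

/// Hypothetical, simplified tool call (not Zeph's real type).
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    /// IDs of calls in the same response whose output this call's arguments reference.
    pub depends_on: Vec<String>,
}

/// Step 1: group call indices into "waves". Every call in a wave depends
/// only on calls from earlier waves, so a wave can run concurrently.
pub fn plan_waves(calls: &[ToolCall]) -> Result<Vec<Vec<usize>>, String> {
    let known: HashSet<&str> = calls.iter().map(|c| c.id.as_str()).collect();
    for c in calls {
        if let Some(d) = c.depends_on.iter().find(|d| !known.contains(d.as_str())) {
            return Err(format!("call {} depends on unknown id {}", c.id, d));
        }
    }
    let mut done: HashSet<&str> = HashSet::new();
    let mut remaining: Vec<usize> = (0..calls.len()).collect();
    let mut waves = Vec::new();
    while !remaining.is_empty() {
        // A call is ready once all of its dependencies have completed.
        let (ready, blocked): (Vec<usize>, Vec<usize>) = remaining
            .iter()
            .copied()
            .partition(|&i| calls[i].depends_on.iter().all(|d| done.contains(d.as_str())));
        if ready.is_empty() {
            return Err("dependency cycle among tool calls".to_string());
        }
        for &i in &ready {
            done.insert(calls[i].id.as_str());
        }
        waves.push(ready);
        remaining = blocked;
    }
    Ok(waves)
}

/// Steps 2 and 3: run each wave on scoped threads, waves in order, and
/// return all outputs in the original call order. `exec` receives the
/// call plus the outputs of all previously completed calls, keyed by id.
pub fn run_waves<F>(calls: &[ToolCall], waves: &[Vec<usize>], exec: F) -> Vec<String>
where
    F: Fn(&ToolCall, &HashMap<String, String>) -> String + Sync,
{
    let mut outputs: HashMap<String, String> = HashMap::new();
    for wave in waves {
        let finished: Vec<(String, String)> = thread::scope(|s| {
            let handles: Vec<_> = wave
                .iter()
                .map(|&i| {
                    let call = &calls[i];
                    let (exec, outputs) = (&exec, &outputs);
                    s.spawn(move || (call.id.clone(), exec(call, outputs)))
                })
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("tool thread panicked"))
                .collect()
        });
        outputs.extend(finished);
    }
    calls
        .iter()
        .map(|c| outputs.remove(&c.id).unwrap_or_default())
        .collect()
}
```

For a response containing two independent calls and a third that consumes both, `plan_waves` yields two waves: the first two calls run together, then the dependent one. Collecting results per wave rather than short-circuiting on the first failure is a design choice in this sketch; a real executor would also need to decide how tool errors in one wave affect dependent calls.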

Related: #1628 (DagScheduler async dispatch — inter-task, not intra-turn)

Priority

High — clear latency and cost benefit with existing infrastructure; no new abstractions needed.


    Labels

    llm (zeph-llm crate: Ollama, Claude), performance (Performance improvements), research (Research-driven improvement)
