ADR-0003: Async coloring costs in production — evidence from Codex, Symphony, opencode #3
Summary
This ADR documents the real-world cost of function coloring in production systems, using OpenAI Codex (Rust/Tokio), OpenAI Symphony (Elixir/BEAM), and opencode (TypeScript) as evidence. The findings directly validate ZEP-0002's decision to reject both `async fn` and `Io` parameter infection in favor of fiber-based structured concurrency.
This is a language design ADR — it's about what happens when a language forces concurrency into function signatures, not about building any specific application.
Evidence: What coloring costs in practice
Rust async/await (OpenAI Codex)
Codex is a 63k-star Rust CLI. Its core module (`codex.rs`) has 150+ import lines, half of which are async runtime plumbing (`tokio::sync::*`, `futures::prelude::*`, `async_channel::*`, `tokio_util::sync::CancellationToken`, `#[async_trait]`).
Concrete costs observed:
- Trait object workarounds. Codex defines a `SessionTask` trait for polymorphic task execution. Because `async fn` in Rust traits isn't object-safe, every implementation needs `#[async_trait]`, which boxes the future, adding allocation and indirection. With uncolored functions, this is just a function pointer.
- Cancellation token threading. Every async function that needs to be cancellable must accept a `CancellationToken` parameter and check it at yield points. This is parameter infection by another name: not `Io`, but `CancellationToken`. The token must be threaded through `run_turn → try_run_turn → stream → tool_exec → permission_check`. In a fiber model, cancellation state is per-fiber and checked via `checkCancel()` with zero parameters.
- Shared state ceremony. Mutable state that crosses async boundaries requires `Arc<Mutex<T>>` or `Arc<RwLock<T>>`. Codex wraps session state, agent status, and configuration in these. With fibers on a work-stealing pool, fiber-aware locks achieve the same safety without `Arc` wrapping.
- Reference cycle avoidance. Codex uses `Weak<ThreadManagerState>` in `AgentControl` to avoid reference cycles between sessions and their parent manager. This is a symptom of `Arc`-based shared ownership forced by async; fiber-scoped lifetimes eliminate this category of bug.
- Startup prewarm complexity. `RegularTask` has a `Mutex<Option<ModelClientSession>>` to hold a prewarmed WebSocket session that gets taken once. This pattern exists because async task construction and async task execution are separate steps. With fibers, construction and execution happen in the same function scope.
TypeScript async/await (opencode)
opencode demonstrates coloring in a dynamically typed language:
- Full-chain infection. `SessionPrompt.loop()` → `resolveTools()` → `execute()` → `ask()`: every function is `async` because the leaf (`ask()`) might prompt the user. Adding one I/O call to a utility function forces `async` up the entire call chain.
- Event bus as coloring workaround. opencode's typed pub/sub bus exists partly to decouple async producers from async consumers. The bus is itself an async boundary. With blocking channels, this decoupling is free.
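The full-chain infection described above can be sketched in a few lines. This is a minimal illustration, not opencode's actual code: the function names mirror the chain named in the list, while the `Tool` type, the function bodies, and `promptLoop` (standing in for `SessionPrompt.loop()`) are assumptions made for the example.

```typescript
// Hypothetical stand-ins for the call chain; not opencode's real implementation.
type Tool = { name: string; needsPermission: boolean };

// Leaf: it might prompt the user, so it returns a Promise.
async function ask(question: string): Promise<boolean> {
  // Simulated user answer: any non-empty question is approved.
  return question.length > 0;
}

// Colored because it awaits ask().
async function execute(tool: Tool): Promise<string> {
  if (tool.needsPermission) {
    const allowed = await ask(`Allow ${tool.name}?`);
    if (!allowed) return `${tool.name}: denied`;
  }
  return `${tool.name}: ok`;
}

// Colored because it awaits execute().
async function resolveTools(tools: Tool[]): Promise<string[]> {
  const results: string[] = [];
  for (const tool of tools) {
    results.push(await execute(tool));
  }
  return results;
}

// Colored because it awaits resolveTools(); its callers are colored in turn.
async function promptLoop(tools: Tool[]): Promise<string[]> {
  return await resolveTools(tools);
}
```

Only `ask()` performs anything resembling I/O, yet all four signatures return `Promise`. Removing `async` from any middle function would leave it holding a `Promise` rather than a value, which is the propagation the list item describes.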
Elixir/BEAM (OpenAI Symphony) — the uncolored baseline
Symphony is OpenAI's agent orchestrator in Elixir. Zero function coloring. A function that reads a file, calls an API, or sleeps looks identical whether called from a GenServer, a Task, or inline. The language doesn't distinguish sync from async.
Result: Symphony's codebase has dramatically less ceremony than Codex despite solving a similar problem (agent orchestration with concurrency, cancellation, and isolation). Functions are universally composable.
What this means for ZEP-0002
These three systems demonstrate the same taxonomy of coloring costs:
| Cost category | Rust (Codex) | TypeScript (opencode) | Elixir (Symphony) | ZEP-0002 (zag) |
|---|---|---|---|---|
| Function signature infection | `async fn` everywhere | `async` everywhere | None | None |
| Cancellation threading | `CancellationToken` param | `AbortSignal` param | Process signals | `checkCancel()` (no param) |
| Trait/interface workarounds | `#[async_trait]` boxing | N/A | N/A | Plain function pointers |
| Shared state wrapping | `Arc<Mutex<T>>` | N/A (single-threaded) | N/A (message passing) | Fiber-aware locks (no `Arc`) |
| Reference cycle risk | `Weak<T>` patterns | N/A | N/A | Scope-based lifetimes |
ZEP-0002's fiber model eliminates all five cost categories while preserving the good patterns these systems use (protocol boundaries, bounded concurrency, cooperative cancellation, OS-level sandboxing).
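The "Cancellation threading" row can be made concrete with a short TypeScript sketch of the `AbortSignal` case. The function names loosely follow the Codex chain quoted earlier and are hypothetical here, as are `ensureNotAborted` and the function bodies. `AbortController` and `AbortSignal` are the standard web and Node.js APIs.

```typescript
// Hypothetical call chain: every layer accepts the signal only to pass it down.
function ensureNotAborted(signal: AbortSignal): void {
  if (signal.aborted) throw new Error("cancelled");
}

// Leaf: the only place the signal is needed for its own sake.
async function permissionCheck(tool: string, signal: AbortSignal): Promise<boolean> {
  ensureNotAborted(signal);
  return tool.length > 0; // simulated policy: any named tool is allowed
}

// Middle layer: takes the signal so it can forward it and re-check after awaiting.
async function toolExec(tool: string, signal: AbortSignal): Promise<string> {
  const allowed = await permissionCheck(tool, signal);
  ensureNotAborted(signal);
  return allowed ? `${tool}: done` : `${tool}: denied`;
}

// Top layer: the signal enters here and is threaded through every call below.
async function runTurn(tools: string[], signal: AbortSignal): Promise<string[]> {
  const out: string[] = [];
  for (const tool of tools) {
    out.push(await toolExec(tool, signal));
  }
  return out;
}
```

A caller creates an `AbortController`, passes `controller.signal` to `runTurn`, and calls `controller.abort()` to cancel. The extra `signal` parameter on all three signatures is the cost the table contrasts with a parameterless `checkCancel()`.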
Validation of specific ZEP-0002 decisions
- "Concurrency at the call site, not in function types": Codex shows that the alternative (coloring) scales poorly, with 150+ import lines in one file, half of them async plumbing.
- `zag.checkCancel()` with no parameters: Codex's `CancellationToken` threading is the exact problem this solves.
- Fiber-aware `Mutex`/`RwLock`: Codex's `Arc<Mutex<T>>` wrapping disappears when locks are fiber-aware.
- `zag.Scope` for structured lifetimes: Codex's `Weak<T>` reference cycle avoidance is unnecessary when fiber scopes define lifetimes.
- `zag.blockingCall` for FFI: Codex uses dedicated threads for subprocess execution (`sandbox-exec`, `codex-linux-sandbox`). Same escape hatch.
Proposed ADR-0003 scope
Record as docs/adr/ADR-0003-async-coloring-costs.md:
- Document the five categories of coloring cost with real-world evidence
- Reference Codex (Rust), opencode (TypeScript), Symphony (Elixir) as case studies
- Validate ZEP-0002's specific design decisions against this evidence
- Note that higher-level patterns (supervision, actor mailboxes, task DAGs) are separate language design concerns for future ZEPs — this ADR only addresses the foundational async model