Implement AgentRuntime interface and type contracts #3
Description
Context
RemoteClaw is middleware that connects CLI-based AI agents (Claude, Gemini, Codex, OpenCode) to messaging channels (WhatsApp, Telegram, Slack, Discord, etc.). It replaces the upstream OpenClaw execution engine (an in-process Pi-based orchestrator) with a subprocess model where each agent CLI runs as a child process.
This issue defines the foundational type contracts that every other middleware component depends on.
Problem
The middleware layer (src/middleware/) does not yet exist. All downstream work (CLI runtime implementations, event extractors, error classifier, session map, channel bridge, MCP server, delivery adapter) depends on the type contracts defined here.
The key architectural insight is that in the subprocess model, the delivery pipeline has two sources of truth:
- CLI subprocess output: text, session info, usage, timing, abort status (captured as AgentRunResult)
- Gateway-side MCP server: what messages the agent sent, how many cron jobs were added (captured as McpSideEffects)
These combine into AgentDeliveryResult, which is what the delivery pipeline consumers receive.
Architecture
CLI subprocess (claude/gemini/codex/opencode)
    |
    |  NDJSON stream → AgentEvent[] → AgentRunResult
    |
    v
ChannelBridge.handle()
    |                   \
    |  AgentRunResult    \  McpSideEffects (from MCP server)
    |                     \
    v                      v
AgentDeliveryResult = AgentRunResult + McpSideEffects + derived payloads
    |
    v
Delivery pipeline
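As a rough illustration of the first arrow in the diagram, the sketch below splits a chunked stdout stream into NDJSON lines and parses each one. The helper names (parseNdjson, fakeStdout, collectNdjson) and the sample event shapes are assumptions made for this example; the real event extractors are the subject of later issues.

```typescript
// Hypothetical helper: buffers raw stdout chunks and yields one parsed JSON
// value per complete line. A JSON object may be split across chunk boundaries.
async function* parseNdjson(chunks: AsyncIterable<string>): AsyncGenerator<unknown> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += chunk;
    let newlineIndex = buffer.indexOf("\n");
    while (newlineIndex !== -1) {
      const line = buffer.slice(0, newlineIndex).trim();
      buffer = buffer.slice(newlineIndex + 1);
      if (line.length > 0) yield JSON.parse(line);
      newlineIndex = buffer.indexOf("\n");
    }
  }
  // Flush a final line that was not newline-terminated.
  const rest = buffer.trim();
  if (rest.length > 0) yield JSON.parse(rest);
}

// Simulated stdout in which one JSON object is split across two chunks.
async function* fakeStdout(): AsyncGenerator<string> {
  yield '{"type":"text","text":"Hel';
  yield 'lo"}\n{"type":"done"}\n';
}

// Drains the parser into an array so the result is easy to inspect.
async function collectNdjson(): Promise<unknown[]> {
  const values: unknown[] = [];
  for await (const value of parseNdjson(fakeStdout())) values.push(value);
  return values;
}
```

Mapping each parsed value onto an AgentEvent is CLI-specific and is deliberately left out here.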
Tasks
Create src/middleware/types.ts with the following type contracts:
1. AgentRuntime interface
The core interface that all CLI runtime implementations (Claude, Gemini, Codex, OpenCode) will implement:
export interface AgentRuntime {
  execute(params: AgentExecuteParams): AsyncIterable<AgentEvent>;
}
2. AgentExecuteParams
Input to execute() — the prompt, session context, MCP config, abort signal, working directory, environment variables.
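One possible shape for these inputs is sketched below. Every field name here is an assumption; the issue only lists the categories of input, so the final contract may differ (for example, the MCP config might be an object rather than a file path).

```typescript
// Field names are hypothetical; only the categories come from the issue text.
type AgentExecuteParamsSketch = {
  prompt: string;
  sessionId?: string;            // session context, e.g. for resuming a conversation
  mcpConfigPath?: string;        // MCP config, assumed here to be a path to a config file
  abortSignal?: AbortSignal;     // lets the caller cancel a running subprocess
  workingDirectory?: string;
  env?: Record<string, string>;  // extra environment variables for the child process
};

const abortController = new AbortController();

const exampleParams: AgentExecuteParamsSketch = {
  prompt: "Summarize the last three messages",
  abortSignal: abortController.signal,
  workingDirectory: "/tmp/agent-workspace",
  env: { NO_COLOR: "1" },
};
```

Passing an AbortSignal rather than a cancel callback keeps the contract compatible with standard cancellation plumbing: calling abortController.abort() flips exampleParams.abortSignal.aborted to true for any runtime observing it.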
3. AgentEvent discriminated union
Events emitted during CLI subprocess execution:
| Event Type | Purpose |
|---|---|
| text | Text content from the agent |
| tool_use | Agent is invoking a tool |
| tool_result | Tool execution result |
| error | Error during execution |
| done | Execution complete, carries AgentRunResult |
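A minimal sketch of how such a union could be declared and consumed follows. Only the five type tags come from the table; the payload field names (toolName, input, output, message, result), the trimmed result type, and the fakeEvents stub are assumptions for illustration.

```typescript
// Trimmed stand-in for AgentRunResult so the example is self-contained.
type AgentRunResultSketch = { text: string; durationMs: number; aborted: boolean };

// Payload field names are hypothetical; the `type` tags match the table above.
type AgentEventSketch =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolName: string; input: unknown }
  | { type: "tool_result"; toolName: string; output: string }
  | { type: "error"; message: string }
  | { type: "done"; result: AgentRunResultSketch };

// Stub event source standing in for a real CLI runtime's execute().
async function* fakeEvents(): AsyncGenerator<AgentEventSketch> {
  yield { type: "text", text: "Hello" };
  yield { type: "tool_use", toolName: "message", input: { to: "123" } };
  yield { type: "tool_result", toolName: "message", output: "ok" };
  yield { type: "done", result: { text: "Hello", durationMs: 12, aborted: false } };
}

// Switching on `type` narrows each branch to the matching variant.
async function summarize(events: AsyncIterable<AgentEventSketch>): Promise<string[]> {
  const log: string[] = [];
  for await (const event of events) {
    switch (event.type) {
      case "text":
        log.push(`text:${event.text}`);
        break;
      case "tool_use":
        log.push(`tool_use:${event.toolName}`);
        break;
      case "tool_result":
        log.push(`tool_result:${event.output}`);
        break;
      case "error":
        log.push(`error:${event.message}`);
        break;
      case "done":
        log.push(`done:${event.result.durationMs}ms`);
        break;
      default: {
        // Compile-time exhaustiveness check: fails to type-check if a variant is added.
        const unreachable: never = event;
        throw new Error(`unexpected event: ${JSON.stringify(unreachable)}`);
      }
    }
  }
  return log;
}
```

The never assignment in the default branch means that adding a sixth event type later produces a compile error in every consumer that has not handled it.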
4. AgentRunResult
Final CLI output summary:
export type AgentRunResult = {
  text: string;
  sessionId: string | undefined;
  durationMs: number;
  usage: AgentUsage | undefined;
  aborted: boolean;
  totalCostUsd?: number | undefined;
  apiDurationMs?: number | undefined;
  numTurns?: number | undefined;
  stopReason?: string | undefined;
  errorSubtype?: string | undefined;
  permissionDenials?: PermissionDenial[] | undefined;
};
5. McpSideEffects and McpMessageTarget
Gateway-side MCP server tracking:
export type McpMessageTarget = {
  tool: string; // e.g., "message", "sessions_send", "telegram_send"
  provider: string; // e.g., "telegram", "discord", "slack"
  accountId?: string;
  to?: string; // target identifier (chat ID, channel ID, etc.)
};
export type McpSideEffects = {
  sentTexts: string[];
  sentMediaUrls: string[];
  sentTargets: McpMessageTarget[];
  cronAdds: number;
};
6. AgentDeliveryResult
The three-type delivery contract (AgentRunResult + McpSideEffects = AgentDeliveryResult):
export type AgentDeliveryResult = {
  payloads: ReplyPayload[];
  run: AgentRunResult;
  mcp: McpSideEffects;
  error?: string | undefined;
};
Why composite (.run + .mcp) instead of flat?
- Clean separation: CLI output vs gateway-side tracking
- Self-documenting: result.mcp.sentTexts is clear about origin
- No field collisions
- ChannelBridge constructs it naturally: { run: agentResult, mcp: mcpServer.drain() }
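To make the drain() call concrete, here is a hedged sketch of a gateway-side tracker that accumulates side effects and hands them over in one step. The class McpSideEffectTracker and its record methods are hypothetical; only the drain() name and the McpSideEffects shape come from this issue, and the run value is a trimmed stand-in for AgentRunResult.

```typescript
type McpMessageTarget = { tool: string; provider: string; accountId?: string; to?: string };
type McpSideEffects = {
  sentTexts: string[];
  sentMediaUrls: string[];
  sentTargets: McpMessageTarget[];
  cronAdds: number;
};

// Hypothetical tracker; the real MCP server is specified in a later issue.
class McpSideEffectTracker {
  private effects: McpSideEffects = McpSideEffectTracker.empty();

  private static empty(): McpSideEffects {
    return { sentTexts: [], sentMediaUrls: [], sentTargets: [], cronAdds: 0 };
  }

  recordSend(text: string, target: McpMessageTarget): void {
    this.effects.sentTexts.push(text);
    this.effects.sentTargets.push(target);
  }

  recordCronAdd(): void {
    this.effects.cronAdds += 1;
  }

  // Returns everything recorded so far and resets the tracker to zero values.
  drain(): McpSideEffects {
    const drained = this.effects;
    this.effects = McpSideEffectTracker.empty();
    return drained;
  }
}

const tracker = new McpSideEffectTracker();
tracker.recordSend("hi", { tool: "message", provider: "telegram", to: "12345" });

// Trimmed stand-in for an AgentRunResult.
const runSketch = { text: "done", aborted: false };
const composite = { run: runSketch, mcp: tracker.drain() };
const afterDrain = tracker.drain();
```

Because drain() swaps in a fresh zero-value object, a second call (afterDrain) sees empty arrays and cronAdds of 0, so side effects from one run cannot leak into the next.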
7. Supporting types
- AgentUsage: token counts (input, output, cache read/write)
- PermissionDenial: permission denial tracking
- ChannelMessage: incoming message from a channel
- BridgeCallbacks: streaming callbacks (onPartialReply, onBlockReply, onToolResult)
Acceptance Criteria
- src/middleware/types.ts exists and exports all types listed above
- AgentRuntime interface with execute(params) -> AsyncIterable<AgentEvent>
- AgentEvent is a discriminated union covering text, tool_use, tool_result, error, done
- Three-type delivery contract: AgentRunResult + McpSideEffects = AgentDeliveryResult
- pnpm build passes with the new types
- No runtime code: this is types only (implementations come in subsequent issues)
Design Decisions
- Composite over flat: AgentDeliveryResult wraps run and mcp as nested objects rather than flattening all fields. This preserves origin clarity and avoids naming collisions.
- payloads at top level: The delivery pipeline universally operates on ReplyPayload[]. For CLI backends, payloads is text ? [{ text }] : []. The conversion happens once in ChannelBridge.
- McpSideEffects has zero-value defaults: All fields default to [] or 0, so no optionality is needed.
- didSendViaMcpTool is derived: sentTargets.length > 0 (no separate boolean field).
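The payloads and didSendViaMcpTool decisions can be sketched as two small functions. Since the issue itself is types-only, this is an illustration of how a later ChannelBridge might apply them, not part of the deliverable; toDeliveryResult, emptyMcpSideEffects, the trimmed types, and the one-field ReplyPayload shape are all assumptions.

```typescript
// Assumed minimal shape; the real ReplyPayload is defined elsewhere in the codebase.
type ReplyPayload = { text: string };
// Trimmed stand-ins for AgentRunResult, McpSideEffects and AgentDeliveryResult.
type RunSketch = { text: string; aborted: boolean };
type McpSketch = {
  sentTexts: string[];
  sentMediaUrls: string[];
  sentTargets: { tool: string; provider: string }[];
  cronAdds: number;
};
type DeliverySketch = { payloads: ReplyPayload[]; run: RunSketch; mcp: McpSketch; error?: string };

// Zero-value defaults: every field is present, so consumers need no undefined checks.
function emptyMcpSideEffects(): McpSketch {
  return { sentTexts: [], sentMediaUrls: [], sentTargets: [], cronAdds: 0 };
}

// Hypothetical conversion step: payloads are derived once from the run text.
function toDeliveryResult(run: RunSketch, mcp: McpSketch = emptyMcpSideEffects()): DeliverySketch {
  const payloads: ReplyPayload[] = run.text ? [{ text: run.text }] : [];
  return { payloads, run, mcp };
}

// Derived flag: no separate boolean is stored on the result.
function didSendViaMcpTool(result: DeliverySketch): boolean {
  return result.mcp.sentTargets.length > 0;
}

const silentRun = toDeliveryResult({ text: "", aborted: false });
const repliedViaTool = toDeliveryResult(
  { text: "Sent it.", aborted: false },
  { ...emptyMcpSideEffects(), sentTargets: [{ tool: "message", provider: "slack" }] },
);
```

Here silentRun has no payloads and didSendViaMcpTool(silentRun) is false, while repliedViaTool carries one payload and reports true, which matches the rules stated in the bullets above.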