Implement ClaudeCliRuntime concrete class

## Context

RemoteClaw's middleware architecture uses CLI subprocesses to interact with AI agents. The abstract base class `CLIRuntimeBase` (in `src/middleware/cli-runtime-base.ts`) handles subprocess spawning, NDJSON parsing, watchdog timers, abort signal propagation, and stdin prompt delivery. Concrete runtimes extend it and implement three abstract methods.

This issue implements the **Claude CLI runtime** — the first and primary concrete runtime.

### Architecture

```
AgentRuntime (interface, src/middleware/types.ts)
  └── CLIRuntimeBase (abstract, src/middleware/cli-runtime-base.ts)
        └── ClaudeCliRuntime ← THIS ISSUE
```

`CLIRuntimeBase` requires subclasses to implement:

```typescript
/** Construct CLI-specific command-line arguments. */
protected abstract buildArgs(params: AgentExecuteParams): string[];

/** Parse a single NDJSON line into an AgentEvent (or null to skip). */
protected abstract extractEvent(line: string): AgentEvent | null;

/** Construct provider-specific environment variables. */
protected abstract buildEnv(params: AgentExecuteParams): Record<string, string>;
```

### Dependencies

- `src/middleware/types.ts` — `AgentRuntime`, `AgentExecuteParams`, `AgentEvent`, `AgentRunResult`, etc.
- `src/middleware/cli-runtime-base.ts` — `CLIRuntimeBase` abstract class

Both exist on `main`.

## Specification

### File: `src/middleware/runtimes/claude.ts`

Create `ClaudeCliRuntime` extending `CLIRuntimeBase`.

#### Constructor

```typescript
constructor() {
  super("claude"); // CLI binary name
}
```

#### `buildArgs(params: AgentExecuteParams): string[]`

Build the Claude CLI argument list:

| Flag | Value | When |
|------|-------|------|
| `-p` | (none) | Always — enables pipe/print mode |
| `--output-format` | `stream-json` | Always — NDJSON streaming output |
| `--verbose` | (none) | Always — enables usage/cost reporting in output |
| `--resume` | `params.sessionId` | When `params.sessionId` is provided |
| `--mcp-config` | `<JSON string>` | When `params.mcpServers` has entries |
| *(positional)* | `params.prompt` | When prompt length ≤ 10,000 chars (stdin threshold is handled by `CLIRuntimeBase`) |

**MCP config format** (Claude JSON format, passed as inline string):

```json
{
  "mcpServers": {
    "<server-name>": {
      "command": "<command>",
      "args": ["<arg1>", "<arg2>"],
      "env": { "<KEY>": "<VALUE>" }
    }
  }
}
```

The `--mcp-config` flag accepts JSON strings directly (no temp file needed). Pass the serialized JSON as a CLI argument: `--mcp-config '{"mcpServers": {...}}'`. This eliminates temp file lifecycle management entirely.

**Important**: The prompt is passed as a positional argument (last arg) only when it fits within CLI argument limits. `CLIRuntimeBase` handles the >10KB stdin delivery case — `buildArgs()` should always include the prompt as the final positional argument; the base class writes to stdin *in addition* when the threshold is exceeded.

#### `extractEvent(line: string): AgentEvent | null`

Parse a single NDJSON line from Claude's `stream-json` output into an `AgentEvent`.

**Claude `stream-json` format** ([headless docs](https://code.claude.com/docs/en/headless), [SDK streaming docs](https://platform.claude.com/docs/en/agent-sdk/streaming-output)):

Each NDJSON line is one of:
1. A **`stream_event` envelope** wrapping a standard Claude API streaming event
2. A **final line** (`result`) emitted after all streaming events

**`stream_event` envelope structure:**

```json
{
  "type": "stream_event",
  "uuid": "<UUID>",
  "session_id": "<session-id>",
  "parent_tool_use_id": "<tool-use-id> | null",
  "event": { /* Standard Claude API RawMessageStreamEvent */ }
}
```

**Event mapping (`stream_event` lines — where `line.type === "stream_event"`):**

| Inner `event.type` | Condition | Maps To | Notes |
|---------------------|-----------|---------|-------|
| `message_start` | — | *Skip* | Extract `session_id` from envelope |
| `content_block_start` | `content_block.type === "text"` | *Skip* | Text content arrives via deltas |
| `content_block_start` | `content_block.type === "tool_use"` | Buffer | Store `name`, `id` from `content_block`; init input accumulator |
| `content_block_delta` | `delta.type === "text_delta"` | `AgentTextEvent` | `{ type: "text", text: delta.text }` |
| `content_block_delta` | `delta.type === "input_json_delta"` | Accumulate | Append `delta.partial_json` to buffered tool input |
| `content_block_delta` | `delta.type === "thinking_delta"` | *Skip* | Extended thinking content, not user-facing |
| `content_block_stop` | Tool buffered | `AgentToolUseEvent` | Emit with `JSON.parse(accumulated_input)`; clear buffer |
| `content_block_stop` | No tool buffered | *Skip* | End of text block |
| `message_delta` | — | *Skip* | Extract `delta.stop_reason` and `usage` for result metadata |
| `message_stop` | — | *Skip* | — |
| `ping` | — | *Skip* | Keepalive |

**Event mapping (final `result` line — where `line.type === "result"`):**

The `result` line is emitted after all `stream_event` lines. It contains cost, usage, session ID, and run metadata. Map to `AgentDoneEvent` with populated `AgentRunResult`.

> **Note on tool results**: The Claude CLI handles tool execution internally. `AgentToolResultEvent` mapping requires empirical verification — the CLI may or may not emit observable events for tool results. If it does, they likely appear as part of the interleaved conversation flow (the next response's `stream_event` lines will have `parent_tool_use_id` set). Initial implementation may omit `AgentToolResultEvent` and add it when the exact format is confirmed.

**Stateful fields** (instance-level, reset per `execute()` call):
- `currentSessionId: string | undefined` — from first `stream_event.session_id` envelope
- `accumulatedText: string` — concatenated text deltas for `AgentRunResult.text`
- `toolBuffer: { name: string; id: string; input: string } | null` — in-progress `tool_use` block
- `lastUsage: AgentUsage | undefined` — from `message_delta` event's `usage` field
- `lastStopReason: string | undefined` — from `message_delta` event's `delta.stop_reason`

**Session ID tracking**: Every `stream_event` envelope contains a `session_id` field. Capture it from the first event. Also available on the final `result` line. Store as instance state for inclusion in `AgentRunResult`.

**Usage extraction** (from `message_delta` inner event and/or `result` line):

```typescript
// message_delta event contains usage in its top-level usage field:
// { output_tokens: number }
// The result line contains cumulative usage:
{
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}
```

Map to `AgentUsage`:
- `inputTokens` ← `input_tokens`
- `outputTokens` ← `output_tokens`
- `cacheReadTokens` ← `cache_read_input_tokens`
- `cacheWriteTokens` ← `cache_creation_input_tokens`

**Result metadata mapping** (from `result` line → `AgentRunResult`):
- `text` ← accumulated from all `text_delta` events
- `sessionId` ← from `stream_event.session_id` envelope or `result.session_id`
- `usage` ← from `result` line usage fields
- `totalCostUsd` ← `result.cost_usd`
- `apiDurationMs` ← `result.duration_api_ms`
- `numTurns` ← `result.num_turns`
- `stopReason` ← from `message_delta` `delta.stop_reason` or `result.subtype`

> **Note on field names**: The exact field names on the `result` line (e.g., `cost_usd` vs `costUsd`, `duration_api_ms` vs `apiDurationMs`) require empirical verification via `claude -p --output-format stream-json` output. The names above are best-guess based on SDK type definitions; implementation should adapt to actual output.

**Note**: `CLIRuntimeBase.execute()` currently synthesizes a minimal `AgentDoneEvent` with empty fields after the subprocess exits. `extractEvent()` should emit its own `AgentDoneEvent` from the `result` line *before* the stream closes, which will be yielded to consumers. The base class's synthetic done event will follow — consumers should handle receiving the richer one first. Alternatively, refactor `CLIRuntimeBase` to skip its synthetic done event if a subclass already emitted one (preferred if straightforward).

#### `buildEnv(params: AgentExecuteParams): Record<string, string>`

Return environment variable overrides for the Claude subprocess. Currently returns an empty record `{}`.

Auth credentials (`ANTHROPIC_API_KEY`, `CLAUDE_CODE_OAUTH_TOKEN`) are passed through `params.env` by the caller, not hardcoded in the runtime. The runtime should not assume any particular auth mechanism.

### File: `src/middleware/runtimes/claude.test.ts`

Unit tests verifying:

1. **Argument construction** (6+ test cases):
   - Basic invocation: `-p --output-format stream-json --verbose <prompt>`
   - Session resume: adds `--resume <session-id>`
   - MCP config: serializes JSON, adds `--mcp-config <json-string>`
   - Short prompt: included as positional arg
   - All flags present: session + MCP + prompt combined
   - No session, no MCP: minimal args

2. **Event extraction** (10+ test cases):
   - `stream_event` with `message_start` → skip (but session ID captured from envelope)
   - `stream_event` with `content_block_delta` / `text_delta` → `AgentTextEvent`
   - `stream_event` with `content_block_start` / `tool_use` → buffers tool name+id
   - `stream_event` with `content_block_delta` / `input_json_delta` → accumulates tool input
   - `stream_event` with `content_block_stop` (after tool_use) → `AgentToolUseEvent` with parsed input
   - `stream_event` with `content_block_stop` (after text) → skip
   - `stream_event` with `message_delta` → extracts stop_reason and usage
   - `stream_event` with `thinking_delta` → skip (returns `null`)
   - `result` line → `AgentDoneEvent` with full `AgentRunResult` (usage, cost, session, etc.)
   - `ping` → skip (returns `null`)
   - Unknown event type → skip (returns `null`)
   - Malformed JSON → handled by `CLIRuntimeBase` (skip at base level)

3. **Environment construction** (2+ test cases):
   - Returns empty record (no hardcoded env vars)
   - Does not inject auth vars (caller responsibility)

4. **MCP config serialization** (2+ test cases):
   - JSON string is correctly serialized from `McpServerConfig` entries
   - `--mcp-config` arg is omitted when no MCP servers configured

## Acceptance Criteria

- [ ] `src/middleware/runtimes/claude.ts` exists and exports `ClaudeCliRuntime`
- [ ] Class extends `CLIRuntimeBase` and implements all three abstract methods
- [ ] `buildArgs()` produces correct Claude CLI flags for all parameter combinations
- [ ] `extractEvent()` correctly maps Claude `stream_event` envelopes and `result` line to `AgentEvent` types
- [ ] `extractEvent()` is stateful: buffers tool_use blocks, accumulates text, tracks session ID
- [ ] Session ID is extracted from `stream_event` envelope and included in the done result
- [ ] Usage/cost metadata is populated from the `result` event
- [ ] MCP server config is serialized to JSON string and passed via `--mcp-config`
- [ ] Unit tests cover argument construction, event extraction, and environment setup
- [ ] `pnpm build` passes
- [ ] `pnpm test` passes

## Reference

- The existing upstream CLI runner at `src/agents/cli-runner.ts` and `src/agents/cli-backends.ts` shows how Claude is invoked today (using `--output-format json`, collecting full output). Our implementation uses `stream-json` for streaming NDJSON events instead, which is the key architectural difference.
- [Claude Code headless mode docs](https://code.claude.com/docs/en/headless) — defines `stream-json` output format and `stream_event` envelope
- [Agent SDK streaming output docs](https://platform.claude.com/docs/en/agent-sdk/streaming-output) — defines `SDKPartialAssistantMessage` type (`stream_event` with `RawMessageStreamEvent` inner events)
- **Known limitation**: When `maxThinkingTokens` is explicitly set (extended thinking mode), `StreamEvent` messages are not emitted — the SDK returns only the final `AssistantMessage` and `ResultMessage`.
- **Empirical verification needed**: Field-level names on the `result` line and exact tool result event format require capture of actual `claude -p --output-format stream-json` output during implementation.




Flag	Value	When
`-p`	(none)	Always — enables pipe/print mode
`--output-format`	`stream-json`	Always — NDJSON streaming output
`--verbose`	(none)	Always — enables usage/cost reporting in output
`--resume`	`params.sessionId`	When `params.sessionId` is provided
`--mcp-config`	`<JSON string>`	When `params.mcpServers` has entries
(positional)	`params.prompt`	When prompt length ≤ 10,000 chars (stdin threshold is handled by `CLIRuntimeBase`)

Inner `event.type`	Condition	Maps To	Notes
`message_start`	—	Skip	Extract `session_id` from envelope
`content_block_start`	`content_block.type === "text"`	Skip	Text content arrives via deltas
`content_block_start`	`content_block.type === "tool_use"`	Buffer	Store `name`, `id` from `content_block`; init input accumulator
`content_block_delta`	`delta.type === "text_delta"`	`AgentTextEvent`	`{ type: "text", text: delta.text }`
`content_block_delta`	`delta.type === "input_json_delta"`	Accumulate	Append `delta.partial_json` to buffered tool input
`content_block_delta`	`delta.type === "thinking_delta"`	Skip	Extended thinking content, not user-facing
`content_block_stop`	Tool buffered	`AgentToolUseEvent`	Emit with `JSON.parse(accumulated_input)`; clear buffer
`content_block_stop`	No tool buffered	Skip	End of text block
`message_delta`	—	Skip	Extract `delta.stop_reason` and `usage` for result metadata
`message_stop`	—	Skip	—
`ping`	—	Skip	Keepalive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement ClaudeCliRuntime concrete class #8

Context

Architecture

Dependencies

Specification

File: `src/middleware/runtimes/claude.ts`

Constructor

`buildArgs(params: AgentExecuteParams): string[]`

`extractEvent(line: string): AgentEvent | null`

`buildEnv(params: AgentExecuteParams): Record<string, string>`

File: `src/middleware/runtimes/claude.test.ts`

Acceptance Criteria

Reference

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Implement ClaudeCliRuntime concrete class #8

Description

Context

Architecture

Dependencies

Specification

File: src/middleware/runtimes/claude.ts

Constructor

buildArgs(params: AgentExecuteParams): string[]

extractEvent(line: string): AgentEvent | null

buildEnv(params: AgentExecuteParams): Record<string, string>

File: src/middleware/runtimes/claude.test.ts

Acceptance Criteria

Reference

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

File: `src/middleware/runtimes/claude.ts`

`buildArgs(params: AgentExecuteParams): string[]`

`extractEvent(line: string): AgentEvent | null`

`buildEnv(params: AgentExecuteParams): Record<string, string>`

File: `src/middleware/runtimes/claude.test.ts`