Bug type
Regression (worked before, now fails)
Summary
`normalizeModelCompat()` forces `supportsUsageInStreaming: false` for all non-`api.openai.com` endpoints, including Azure OpenAI (`*.openai.azure.com`). This prevents pi-ai from sending `stream_options: { include_usage: true }`, so Azure never returns usage data in streaming chunks. Result: `/status` shows `Context: 0/128k (0%)`, `totalTokens` is always `undefined`, and `memoryFlush` can never trigger.
Steps to reproduce
- Configure an Azure OpenAI provider with `api: "openai-completions"` and a `baseUrl` pointing to `*.openai.azure.com`
- Send a message to the agent
- Run `/status`
Expected behavior
`/status` should show a non-zero `Context` reflecting actual prompt token usage (e.g., `Context: 17k/128k (13%)`), and `memoryFlush` should be able to trigger when the threshold is reached.
Actual behavior
`/status` shows `Context: 0/128k (0%)`. `totalTokens` in `sessions.json` is `null`/`undefined` for all Azure model sessions. `memoryFlush` never triggers because the threshold check depends on `totalTokens > 0`.
OpenClaw version
2026.3.3
Operating system
Ubuntu 24.04 (Docker)
Install method
docker (local build from source)
Logs, screenshots, and evidence
🦞 OpenClaw 2026.3.3
🧠 Model: azure-gpt5mini/gpt-5-mini · 🔑 api-key (models.json)
🗄️ Cache: 100% hit · 16k cached, 0 new
📚 Context: 0/128k (0%) · 🧹 Compactions: 0
Root cause trace:
- `src/agents/model-compat.ts:14-21` — `isOpenAINativeEndpoint()` only matches `host === "api.openai.com"`. Azure's `*.openai.azure.com` returns `false`.
- `src/agents/model-compat.ts:79` — `needsForce = true` for all non-native endpoints (including Azure).
- `src/agents/model-compat.ts:91` — Forces `{ supportsDeveloperRole: false, supportsUsageInStreaming: false }`, overwriting any user config (even an explicit `supportsUsageInStreaming: true` in `models.json`).
- `node_modules/@mariozechner/pi-ai/dist/providers/openai-completions.js:319` — Only adds `stream_options: { include_usage: true }` when `compat.supportsUsageInStreaming !== false`. Since it IS `false`, `stream_options` is never sent.
- Without `stream_options`, Azure doesn't return usage in streaming chunks → usage stays `{ input: 0, output: 0 }`.
- `src/agents/usage.ts:149` — `derivePromptTokens({ input: 0, cacheRead: 0, cacheWrite: 0 })` returns `undefined` (sum = 0).
- `src/auto-reply/reply/session-usage.ts:104` — `totalTokens = undefined`.
After fix — `/status` correctly shows:
🧮 Tokens: 9.3k in / 401 out · 💵 Cost: $0.0031
🗄️ Cache: 44% hit · 7.3k cached, 0 new
📚 Context: 17k/128k (13%) · 🧹 Compactions: 0
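The failure chain above can be reduced to a few lines. This is a hypothetical sketch, not the actual OpenClaw source — the function names mirror the report (`isOpenAINativeEndpoint`, `normalizeModelCompat`, `derivePromptTokens`), but the real signatures may differ:

```typescript
// Step 1: the native-endpoint check matches only api.openai.com,
// so any *.openai.azure.com host is treated as "non-native".
function isOpenAINativeEndpoint(baseUrl: string): boolean {
  return new URL(baseUrl).host === "api.openai.com";
}

// Step 2: non-native endpoints get supportsUsageInStreaming forced to false,
// overriding even an explicit `true` from models.json.
function normalizeModelCompat(
  baseUrl: string,
  compat: { supportsUsageInStreaming?: boolean },
) {
  if (!isOpenAINativeEndpoint(baseUrl)) {
    return { ...compat, supportsDeveloperRole: false, supportsUsageInStreaming: false };
  }
  return compat;
}

// Step 3: with usage never streamed, the token counts stay 0, and a zero
// sum is reported as undefined rather than 0 — so totalTokens never exists.
function derivePromptTokens(
  u: { input: number; cacheRead: number; cacheWrite: number },
): number | undefined {
  const sum = u.input + u.cacheRead + u.cacheWrite;
  return sum > 0 ? sum : undefined;
}

const azure = normalizeModelCompat("https://myres.openai.azure.com/openai", {
  supportsUsageInStreaming: true, // explicit user config is still overridden
});
console.log(azure.supportsUsageInStreaming); // false
console.log(derivePromptTokens({ input: 0, cacheRead: 0, cacheWrite: 0 })); // undefined
```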
Impact and severity
Affected: All users using Azure OpenAI with `api: "openai-completions"` (Chat Completions API)
Severity: High — silently breaks context tracking and `memoryFlush`
Frequency: 100% repro on any Azure OpenAI endpoint
Consequence: `/status` always shows 0 context; `memoryFlush` never triggers (threshold depends on `totalTokens`); users cannot monitor context window utilization
Additional information
Introduced by commit 2671f04 (`fix(agents): disable usage streaming chunks on non-native openai-completions`, 2026-03-05), which was fixing #8714 — usage-only stream chunks breaking the `choices[0]` parser on some non-native backends. That fix was correct for generic backends but did not account for Azure OpenAI, which fully supports `stream_options: { include_usage: true }`.
Fix: Add an `isAzureOpenAIEndpoint()` check matching `host.endsWith(".openai.azure.com")`. For Azure endpoints, only force `supportsDeveloperRole: false` (Azure rejects the `developer` role) but preserve `supportsUsageInStreaming` (default `true`).
Config-only workaround (setting `compat.supportsUsageInStreaming: true` in `models.json`) does NOT work because `normalizeModelCompat` explicitly overrides user-set values for non-native endpoints.
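For illustration, the proposed fix could look roughly like this. It is a sketch under the assumptions stated above — `isAzureOpenAIEndpoint` and the exact `ModelCompat` shape are hypothetical, not the actual patch:

```typescript
// Sketch of the proposed fix; names and types are assumptions, not the real patch.
interface ModelCompat {
  supportsDeveloperRole?: boolean;
  supportsUsageInStreaming?: boolean;
}

function isAzureOpenAIEndpoint(baseUrl: string): boolean {
  return new URL(baseUrl).host.endsWith(".openai.azure.com");
}

function normalizeModelCompat(baseUrl: string, compat: ModelCompat): ModelCompat {
  if (new URL(baseUrl).host === "api.openai.com") {
    return compat; // native endpoint: leave user config untouched
  }
  if (isAzureOpenAIEndpoint(baseUrl)) {
    // Azure rejects the developer role, but DOES support
    // stream_options: { include_usage: true } — keep it enabled (default true).
    return { supportsUsageInStreaming: true, ...compat, supportsDeveloperRole: false };
  }
  // Other non-native backends: keep the conservative behavior from #8714.
  return { ...compat, supportsDeveloperRole: false, supportsUsageInStreaming: false };
}
```

The Azure branch spreads the default before `compat`, so an explicit `supportsUsageInStreaming: false` in `models.json` would still be honored — only the missing-value default changes.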