
[Bug]: Azure OpenAI models report 0 context tokens — supportsUsageInStreaming forcefully disabled #38784

@Hua688

Description


Bug type

Regression (worked before, now fails)

Summary

normalizeModelCompat() forces supportsUsageInStreaming: false for all non-api.openai.com endpoints, including Azure OpenAI (*.openai.azure.com). This prevents pi-ai from sending stream_options: { include_usage: true }, so Azure never returns usage data in streaming chunks. Result: /status shows Context: 0/128k (0%), totalTokens is always undefined, and memoryFlush can never trigger.

Steps to reproduce

  1. Configure an Azure OpenAI provider with api: "openai-completions" and a baseUrl pointing to *.openai.azure.com
  2. Send a message to the agent
  3. Run /status

Expected behavior

/status should show non-zero Context reflecting actual prompt token usage (e.g., Context: 17k/128k (13%)), and memoryFlush should be able to trigger when the threshold is reached.

Actual behavior

/status shows Context: 0/128k (0%). totalTokens in sessions.json is null/undefined for all Azure model sessions. memoryFlush never triggers because the threshold check depends on totalTokens > 0.
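A sketch of why totalTokens ends up undefined (derivePromptTokens is paraphrased from src/agents/usage.ts based on the trace below; the real signature may differ):

```typescript
// Hypothetical paraphrase of derivePromptTokens: a zero sum yields undefined,
// not 0, so downstream consumers see "no usage data" rather than "0 tokens".
function derivePromptTokens(u: {
  input: number;
  cacheRead: number;
  cacheWrite: number;
}): number | undefined {
  const sum = u.input + u.cacheRead + u.cacheWrite;
  return sum > 0 ? sum : undefined;
}

// With Azure never reporting streaming usage, every field stays 0, so
// totalTokens is undefined and a check like `totalTokens > 0` never passes.
```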

OpenClaw version

2026.3.3

Operating system

Ubuntu 24.04 (Docker)

Install method

docker (local build from source)

Logs, screenshots, and evidence

🦞 OpenClaw 2026.3.3
🧠 Model: azure-gpt5mini/gpt-5-mini · 🔑 api-key (models.json)
🗄️ Cache: 100% hit · 16k cached, 0 new
📚 Context: 0/128k (0%) · 🧹 Compactions: 0

Root cause trace:

  1. src/agents/model-compat.ts:14-21 — isOpenAINativeEndpoint() only matches host === "api.openai.com". Azure's *.openai.azure.com returns false.
  2. Line 79 — needsForce = true for all non-native endpoints (including Azure).
  3. Line 91 — Forces { supportsDeveloperRole: false, supportsUsageInStreaming: false }, overwriting any user config (even explicit supportsUsageInStreaming: true in models.json).
  4. node_modules/@mariozechner/pi-ai/dist/providers/openai-completions.js:319 — Only adds stream_options: { include_usage: true } when compat.supportsUsageInStreaming !== false. Since it IS false, stream_options is never sent.
  5. Without stream_options, Azure doesn't return usage in streaming chunks → usage stays { input: 0, output: 0 }.
  6. src/agents/usage.ts:149 — derivePromptTokens({ input: 0, cacheRead: 0, cacheWrite: 0 }) returns undefined (sum = 0).
  7. src/auto-reply/reply/session-usage.ts:104 — totalTokens = undefined.

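The forcing logic in steps 1–3 can be sketched as follows (a paraphrase of model-compat.ts reconstructed from the trace above; function and field names follow the issue text, but the actual code may differ):

```typescript
interface ModelCompat {
  supportsDeveloperRole?: boolean;
  supportsUsageInStreaming?: boolean;
}

// Only the exact native host matches; *.openai.azure.com returns false.
function isOpenAINativeEndpoint(baseUrl: string): boolean {
  return new URL(baseUrl).host === "api.openai.com";
}

function normalizeModelCompat(baseUrl: string, compat: ModelCompat): ModelCompat {
  const needsForce = !isOpenAINativeEndpoint(baseUrl);
  if (needsForce) {
    // Overwrites user config — even an explicit
    // supportsUsageInStreaming: true in models.json is discarded.
    return {
      ...compat,
      supportsDeveloperRole: false,
      supportsUsageInStreaming: false,
    };
  }
  return compat;
}
```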
After the fix, /status correctly shows:

🧮 Tokens: 9.3k in / 401 out · 💵 Cost: $0.0031
🗄️ Cache: 44% hit · 7.3k cached, 0 new
📚 Context: 17k/128k (13%) · 🧹 Compactions: 0

Impact and severity

Affected: All users using Azure OpenAI with api: "openai-completions" (Chat Completions API)
Severity: High — silently breaks context tracking and memoryFlush
Frequency: 100% repro on any Azure OpenAI endpoint
Consequence: /status always shows 0 context; memoryFlush never triggers (threshold depends on totalTokens); users cannot monitor context window utilization

Additional information

Introduced by commit 2671f04 (fix(agents): disable usage streaming chunks on non-native openai-completions, 2026-03-05), which was fixing #8714 — usage-only stream chunks breaking choices[0] parser on some non-native backends. The fix was correct for generic backends but did not account for Azure OpenAI, which fully supports stream_options: { include_usage: true }.

Fix: Add isAzureOpenAIEndpoint() check matching host.endsWith(".openai.azure.com"). For Azure endpoints, only force supportsDeveloperRole: false (Azure rejects developer role) but preserve supportsUsageInStreaming (default true).
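A minimal sketch of that fix, assuming the structure outlined in the root cause trace (names follow the issue text; the real model-compat.ts may be organized differently):

```typescript
interface ModelCompat {
  supportsDeveloperRole?: boolean;
  supportsUsageInStreaming?: boolean;
}

function isAzureOpenAIEndpoint(baseUrl: string): boolean {
  return new URL(baseUrl).host.endsWith(".openai.azure.com");
}

function normalizeModelCompat(baseUrl: string, compat: ModelCompat): ModelCompat {
  if (new URL(baseUrl).host === "api.openai.com") {
    return compat; // native endpoint: leave user config untouched
  }
  if (isAzureOpenAIEndpoint(baseUrl)) {
    // Azure rejects the developer role but fully supports
    // stream_options: { include_usage: true }, so preserve
    // supportsUsageInStreaming (defaulting to true).
    return { supportsUsageInStreaming: true, ...compat, supportsDeveloperRole: false };
  }
  // Generic non-native backends keep the conservative behavior from 2671f04.
  return { ...compat, supportsDeveloperRole: false, supportsUsageInStreaming: false };
}
```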

Config-only workaround (setting compat.supportsUsageInStreaming: true in models.json) does NOT work because normalizeModelCompat explicitly overrides user-set values for non-native endpoints.
