
feat(tts): add Azure Speech TTS provider #51321

Open
leonchui wants to merge 6 commits into openclaw:main from leonchui:feature/azure-tts

Conversation

@leonchui

Summary

Add Azure Speech TTS provider to OpenClaw with SSML synthesis support.

Problem

  • OpenClaw currently supports Edge TTS, ElevenLabs, and OpenAI TTS
  • Azure Speech has 400+ neural voices including Cantonese (zh-HK) which users have requested
  • Many users already have Azure accounts and API keys

What Changed

  • Added src/tts/providers/azure.ts - Azure TTS provider implementation
  • Updated src/tts/provider-registry.ts to register Azure provider
  • Updated src/config/types.tts.ts with Azure config types
  • Updated src/config/zod-schema.core.ts with Azure validation schema

Features

  • SSML-based synthesis for natural speech
  • 400+ neural voices including Cantonese (zh-HK-HiuMaanNeural)
  • Config options: apiKey, region, voice, lang, outputFormat
  • Environment variables: AZURE_SPEECH_API_KEY, AZURE_SPEECH_REGION
  • Provider ID: 'azure' with alias 'azure-tts'
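For context, a provider id plus alias like 'azure' / 'azure-tts' is typically registered by pointing both keys at the same instance. A minimal map-based sketch — `registerProvider` and the `TtsProvider` shape here are hypothetical, not OpenClaw's actual provider-registry.ts API:

```typescript
// Minimal map-based provider registry sketch; names are illustrative,
// not OpenClaw's actual provider-registry.ts API.
interface TtsProvider {
  id: string;
  synthesize: (text: string) => Promise<Uint8Array>;
}

const registry = new Map<string, TtsProvider>();

function registerProvider(provider: TtsProvider, aliases: string[] = []): void {
  registry.set(provider.id, provider);
  for (const alias of aliases) {
    registry.set(alias, provider); // alias resolves to the same instance
  }
}

const azureProvider: TtsProvider = {
  id: "azure",
  synthesize: async () => new Uint8Array(),
};

// Register under the canonical id plus the documented alias.
registerProvider(azureProvider, ["azure-tts"]);

console.log(registry.get("azure-tts") === registry.get("azure")); // true
```

Looking up either key returns the same provider, so both 'azure' and 'azure-tts' work in config.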

Config Example

```json
{
  "messages": {
    "tts": {
      "provider": "azure",
      "azure": {
        "apiKey": "your-key",
        "region": "eastus",
        "voice": "zh-HK-HiuMaanNeural",
        "lang": "zh-HK"
      }
    }
  }
}
```

Human Verification

  • Tested Azure TTS synthesis with Cantonese voice (zh-HK-HiuMaanNeural)
  • Verified API authentication works with East US region
  • Voice output plays correctly in Telegram

@greptile-apps
Contributor

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR adds an Azure Speech TTS provider with SSML synthesis support and 400+ neural voice options. The config/schema wiring (types.tts.ts and zod-schema.core.ts) and the registry registration (provider-registry.ts) are correct, but the core provider implementation (src/tts/providers/azure.ts) contains several critical logic bugs that collectively prevent the provider from working with config-file settings and cause incorrect API routing.

Key issues found:

  • baseUrl silently ignored in listAzureVoices — the base variable is computed from normalizeAzureBaseUrl but never used; voice listing always hits https://{region}.tts.speech.microsoft.com regardless of a custom baseUrl.
  • region silently ignored in synthesize — the region variable is computed but never used in building the endpoint; synthesis always routes to the hard-coded eastus default when no explicit baseUrl is set.
  • Deprecated-voice filter is broken — .filter((v) => v.Status !== "Deprecated") runs after .map(), which drops the Status field; deprecated voices are never excluded.
  • config.azure is not part of ResolvedTtsConfig — the synthesize/isConfigured callbacks receive a ResolvedTtsConfig that has no azure sub-object, so all config-file values (API key, region, voice, etc.) are always undefined; only env vars are reachable. This means voice has no fallback and synthesis unconditionally throws for users relying on the config file.
  • overrides.azure is not part of TtsDirectiveOverrides — per-call directive overrides for voice, language, and output format can never be applied to the Azure provider.

Confidence Score: 1/5

  • Not safe to merge — the provider will silently ignore config-file settings and always route synthesis to the wrong region for most users.
  • Multiple critical logic bugs in the core provider file mean the feature does not work as described. Config-file API keys and voice settings are unreachable at runtime (missing azure field on ResolvedTtsConfig), region routing is broken (computed but unused variable), the deprecated-voice filter never fires, and directive overrides are also unreachable. These are not edge cases — they affect every user who follows the documented config example.
  • src/tts/providers/azure.ts requires significant fixes. src/tts/tts.ts also needs to be updated to add azure to ResolvedTtsConfig, resolveTtsConfig(), and TtsDirectiveOverrides.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/tts/providers/azure.ts
Line: 30-32

Comment:
**`baseUrl` silently ignored in `listAzureVoices`**

`base` is computed on line 30 (`normalizeAzureBaseUrl(params.baseUrl)`) but is never used anywhere. The voice-list URL is then built unconditionally as `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`, completely discarding the caller-supplied `baseUrl`. Anyone who configures a custom `baseUrl` (e.g., for private endpoints or sovereign clouds) will find it silently ignored for voice listing, while it _is_ respected in `synthesize`. The fix is to derive the hostname from `baseUrl` when it is provided, falling back to the region-based URL:

```suggestion
  const region = params.region || "eastus";
  const url = params.baseUrl
    ? `${normalizeAzureBaseUrl(params.baseUrl)}/cognitiveservices/voices/list`
    : `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/providers/azure.ts
Line: 100-101

Comment:
**`region` ignored in `synthesize` — endpoint always defaults to `eastus`**

`region` is resolved on line 100 but is never referenced again. The endpoint is built entirely from `baseUrl` (line 113: `${baseUrl}/cognitiveservices/v1`), and `normalizeAzureBaseUrl` hard-codes `"https://eastus.tts.speech.microsoft.com"` as its default. This means that if a user sets `region: "westeurope"` (or any other region) but does not also set a matching `baseUrl`, all synthesis requests silently go to `eastus`, causing authentication errors or incorrect routing.

The region should be used to construct the base URL when no explicit `baseUrl` is provided:

```suggestion
      const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
      const baseUrl = req.config?.azure?.baseUrl
        ? normalizeAzureBaseUrl(req.config.azure.baseUrl)
        : `https://${region}.tts.speech.microsoft.com`;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/providers/azure.ts
Line: 45-55

Comment:
**Deprecated-voice filter never fires — `Status` is dropped by `.map()`**

The `.map()` on lines 47–53 projects each `AzureVoiceListEntry` into a new object that does not include the `Status` field. The subsequent `.filter()` on line 54 then checks `voice.Status !== "Deprecated"`, but `voice.Status` is always `undefined` on the mapped object, so the condition is always `true` and deprecated voices are never removed. TypeScript will also flag `voice.Status` as an unknown property on the mapped type.

The filter needs to run on the original entries before (or during) the map:

```suggestion
  return Array.isArray(voices)
    ? voices
        .filter((voice) => voice.Status !== "Deprecated")
        .map((voice) => ({
          id: voice.ShortName?.trim() ?? "",
          name: voice.DisplayName?.trim() || voice.ShortName?.trim() || undefined,
          category: voice.VoiceType?.trim() || undefined,
          locale: voice.Locale?.trim() || undefined,
          gender: voice.Gender?.trim() || undefined,
        }))
        .filter((voice) => voice.id.length > 0)
    : [];
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/providers/azure.ts
Line: 88-111

Comment:
**`config.azure` does not exist on `ResolvedTtsConfig` — all config values are always `undefined`**

`SpeechSynthesisRequest.config` and `SpeechProviderConfiguredContext.config` are both typed as `ResolvedTtsConfig` (see `src/tts/provider-types.ts` and `src/tts/tts.ts`). `ResolvedTtsConfig` in `src/tts/tts.ts` defines named sub-configs for `elevenlabs`, `openai`, and `edge`, but has **no `azure` field**. As a result, every access to `req.config?.azure`, `req.config.azure`, or `config.azure` in `isConfigured`, `listVoices`, and `synthesize` will always resolve to `undefined` at runtime, and TypeScript should report a compile-time error on those accesses.

The practical effect is that API key, region, voice, language, and output-format settings written in the JSON config file are completely ignored; only the `AZURE_SPEECH_API_KEY` / `AZURE_SPEECH_REGION` environment variables are reachable. The `voice` field has no env-var fallback at all, so synthesis will unconditionally throw `"Azure voice not configured"` for any user who relies on config-file settings.

To fix this properly:
1. Add an `azure` sub-object to `ResolvedTtsConfig` in `src/tts/tts.ts` (mirroring the pattern for `elevenlabs` / `openai` / `edge`).
2. Populate it in `resolveTtsConfig()` by reading `raw.azure.*` and resolving secrets with `normalizeResolvedSecretInputString`.
3. Update `SpeechSynthesisRequest` or the shared config type so the provider can access resolved values.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tts/providers/azure.ts
Line: 102-107

Comment:
**`overrides.azure` does not exist on `TtsDirectiveOverrides` — override fields are always `undefined`**

`SpeechSynthesisRequest.overrides` is typed as `TtsDirectiveOverrides` (defined in `src/tts/tts.ts`). That type includes `openai`, `elevenlabs`, and `microsoft` override bags, but has **no `azure` field**. Every access to `req.overrides?.azure?.voice`, `req.overrides?.azure?.lang`, and `req.overrides?.azure?.outputFormat` will always evaluate to `undefined` at runtime, and TypeScript should flag these as errors.

In practice this means that per-call directive overrides for voice, language, and output format can never be applied to the Azure provider. An `azure` bag should be added to `TtsDirectiveOverrides` and wired through `parseTtsDirectives` in the same way the existing providers' overrides are handled.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "feat(tts): add Azure..."
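The two type-level gaps called out above (no azure field on ResolvedTtsConfig, no azure bag on TtsDirectiveOverrides) could be closed along these lines. This is a sketch only: the field names mirror the PR's config example, and it skips the secret resolution via normalizeResolvedSecretInputString that the real resolveTtsConfig() would need.

```typescript
// Sketch of the missing `azure` sub-objects; shapes are assumptions
// modeled on the PR's config example, not the actual src/tts/tts.ts types.
interface ResolvedAzureTtsConfig {
  apiKey?: string;
  region?: string;
  voice?: string;
  lang?: string;
  outputFormat?: string;
  baseUrl?: string;
}

interface ResolvedTtsConfig {
  // ...existing elevenlabs / openai / edge sub-configs...
  azure?: ResolvedAzureTtsConfig;
}

interface TtsDirectiveOverrides {
  // ...existing openai / elevenlabs / microsoft bags...
  azure?: Pick<ResolvedAzureTtsConfig, "voice" | "lang" | "outputFormat">;
}

// What resolveTtsConfig() would add: merge raw config with env-var
// fallbacks so config-file values are actually reachable at runtime.
function resolveAzureConfig(
  raw: { azure?: ResolvedAzureTtsConfig },
  env: Record<string, string | undefined> = {},
): ResolvedAzureTtsConfig {
  return {
    apiKey: raw.azure?.apiKey ?? env.AZURE_SPEECH_API_KEY,
    region: raw.azure?.region ?? env.AZURE_SPEECH_REGION ?? "eastus",
    voice: raw.azure?.voice,
    lang: raw.azure?.lang,
    outputFormat: raw.azure?.outputFormat,
    baseUrl: raw.azure?.baseUrl,
  };
}
```

With something like this in place, req.config?.azure?.voice in the provider resolves to the configured voice instead of always being undefined.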


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 33b95fed9a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

```ts
  ),
  synthesize: async (req) => {
    const apiKey =
      req.config.azure?.apiKey || process.env.AZURE_SPEECH_API_KEY;
```

P1 Carry Azure settings through resolved TTS config

When users configure Azure through messages.tts.azure as introduced in this commit, this provider reads req.config.azure, but resolveTtsConfig in src/tts/tts.ts:274-343 never populates an azure section on ResolvedTtsConfig. In practice that means config-only Azure setups cannot work: isConfigured stays false, apiKey/voice/region are dropped, and synthesis only succeeds if the same values are also present in environment variables.

Useful? React with 👍 / 👎.

Comment on lines +100 to +101
```ts
const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
const baseUrl = normalizeAzureBaseUrl(req.config?.azure?.baseUrl);
```

P1 Build the synth endpoint from the selected Azure region

If a deployment sets AZURE_SPEECH_REGION (or later wires messages.tts.azure.region) to anything other than eastus and leaves baseUrl unset, synthesis still posts to East US. region is computed here, but normalizeAzureBaseUrl(undefined) on the next line hard-codes https://eastus.tts.speech.microsoft.com, so voice listing can hit one region while synthesis hits another and fails with region/resource mismatches.

Useful? React with 👍 / 👎.

Comment on lines +116 to +118
```ts
const response = await fetch(endpoint, {
  method: "POST",
  headers: {
```

P2 Honor configured timeouts on Azure TTS requests

This fetch never uses an AbortController, so neither the global messages.tts.timeoutMs nor the new messages.tts.azure.timeoutMs can stop a slow Azure call. In textToSpeech the provider loop awaits each synthesize sequentially (src/tts/tts.ts:701-729), so a hung Azure request can stall the whole reply path indefinitely instead of timing out and falling back to the next provider.

Useful? React with 👍 / 👎.
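The pattern this comment asks for can be sketched generically: race the request against a timer, and (for fetch specifically) also pass an AbortSignal so the underlying request is actually cancelled. The helper name is illustrative, not existing OpenClaw code.

```typescript
// Generic timeout wrapper: reject if `promise` has not settled within
// `timeoutMs`. For fetch, additionally pass an AbortSignal so the HTTP
// request itself is cancelled, e.g.
//   fetch(url, { signal: AbortSignal.timeout(timeoutMs) })
// Names are illustrative, not OpenClaw's tts.ts code.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`TTS request timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```

Wrapping each provider's synthesize call this way lets a hung Azure request fail fast so the provider loop can fall back to the next provider.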

Comment on lines +30 to +32
```ts
const base = normalizeAzureBaseUrl(params.baseUrl);
const region = params.region || "eastus";
const url = `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`;
```
Contributor


P2 baseUrl silently ignored in listAzureVoices

base is computed on line 30 (normalizeAzureBaseUrl(params.baseUrl)) but is never used anywhere. The voice-list URL is then built unconditionally as https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list, completely discarding the caller-supplied baseUrl. Anyone who configures a custom baseUrl (e.g., for private endpoints or sovereign clouds) will find it silently ignored for voice listing, while it is respected in synthesize. The fix is to derive the hostname from baseUrl when it is provided, falling back to the region-based URL:

Suggested change

```diff
-const base = normalizeAzureBaseUrl(params.baseUrl);
-const region = params.region || "eastus";
-const url = `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`;
+const region = params.region || "eastus";
+const url = params.baseUrl
+  ? `${normalizeAzureBaseUrl(params.baseUrl)}/cognitiveservices/voices/list`
+  : `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`;
```

Comment on lines +100 to +101
```ts
const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
const baseUrl = normalizeAzureBaseUrl(req.config?.azure?.baseUrl);
```
Contributor


P2 region ignored in synthesize — endpoint always defaults to eastus

region is resolved on line 100 but is never referenced again. The endpoint is built entirely from baseUrl (line 113: ${baseUrl}/cognitiveservices/v1), and normalizeAzureBaseUrl hard-codes "https://eastus.tts.speech.microsoft.com" as its default. This means that if a user sets region: "westeurope" (or any other region) but does not also set a matching baseUrl, all synthesis requests silently go to eastus, causing authentication errors or incorrect routing.

The region should be used to construct the base URL when no explicit baseUrl is provided:

Suggested change

```diff
-const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
-const baseUrl = normalizeAzureBaseUrl(req.config?.azure?.baseUrl);
+const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
+const baseUrl = req.config?.azure?.baseUrl
+  ? normalizeAzureBaseUrl(req.config.azure.baseUrl)
+  : `https://${region}.tts.speech.microsoft.com`;
```

Comment on lines +45 to +55
```ts
return Array.isArray(voices)
  ? voices
      .map((voice) => ({
        id: voice.ShortName?.trim() ?? "",
        name: voice.DisplayName?.trim() || voice.ShortName?.trim() || undefined,
        category: voice.VoiceType?.trim() || undefined,
        locale: voice.Locale?.trim() || undefined,
        gender: voice.Gender?.trim() || undefined,
      }))
      .filter((voice) => voice.id.length > 0 && voice.Status !== "Deprecated")
  : [];
```
Contributor


P2 Deprecated-voice filter never fires — Status is dropped by .map()

The .map() on lines 47–53 projects each AzureVoiceListEntry into a new object that does not include the Status field. The subsequent .filter() on line 54 then checks voice.Status !== "Deprecated", but voice.Status is always undefined on the mapped object, so the condition is always true and deprecated voices are never removed. TypeScript will also flag voice.Status as an unknown property on the mapped type.

The filter needs to run on the original entries before (or during) the map:

Suggested change

```diff
-return Array.isArray(voices)
-  ? voices
-      .map((voice) => ({
-        id: voice.ShortName?.trim() ?? "",
-        name: voice.DisplayName?.trim() || voice.ShortName?.trim() || undefined,
-        category: voice.VoiceType?.trim() || undefined,
-        locale: voice.Locale?.trim() || undefined,
-        gender: voice.Gender?.trim() || undefined,
-      }))
-      .filter((voice) => voice.id.length > 0 && voice.Status !== "Deprecated")
-  : [];
+return Array.isArray(voices)
+  ? voices
+      .filter((voice) => voice.Status !== "Deprecated")
+      .map((voice) => ({
+        id: voice.ShortName?.trim() ?? "",
+        name: voice.DisplayName?.trim() || voice.ShortName?.trim() || undefined,
+        category: voice.VoiceType?.trim() || undefined,
+        locale: voice.Locale?.trim() || undefined,
+        gender: voice.Gender?.trim() || undefined,
+      }))
+      .filter((voice) => voice.id.length > 0)
+  : [];
```

Comment on lines +88 to +111
```ts
isConfigured: ({ config }) =>
  Boolean(
    config.azure?.apiKey ||
      process.env.AZURE_SPEECH_API_KEY,
  ),
synthesize: async (req) => {
  const apiKey =
    req.config.azure?.apiKey || process.env.AZURE_SPEECH_API_KEY;
  if (!apiKey) {
    throw new Error("Azure Speech API key missing");
  }

  const region = req.config?.azure?.region || process.env.AZURE_SPEECH_REGION || "eastus";
  const baseUrl = normalizeAzureBaseUrl(req.config?.azure?.baseUrl);
  const voice = req.overrides?.azure?.voice ?? req.config?.azure?.voice;
  const lang = req.overrides?.azure?.lang ?? req.config?.azure?.lang;
  const outputFormat =
    req.overrides?.azure?.outputFormat ??
    req.config?.azure?.outputFormat ??
    DEFAULT_AZURE_OUTPUT_FORMAT;

  if (!voice) {
    throw new Error("Azure voice not configured");
  }
```
Contributor


P2 config.azure does not exist on ResolvedTtsConfig — all config values are always undefined

SpeechSynthesisRequest.config and SpeechProviderConfiguredContext.config are both typed as ResolvedTtsConfig (see src/tts/provider-types.ts and src/tts/tts.ts). ResolvedTtsConfig in src/tts/tts.ts defines named sub-configs for elevenlabs, openai, and edge, but has no azure field. As a result, every access to req.config?.azure, req.config.azure, or config.azure in isConfigured, listVoices, and synthesize will always resolve to undefined at runtime, and TypeScript should report a compile-time error on those accesses.

The practical effect is that API key, region, voice, language, and output-format settings written in the JSON config file are completely ignored; only the AZURE_SPEECH_API_KEY / AZURE_SPEECH_REGION environment variables are reachable. The voice field has no env-var fallback at all, so synthesis will unconditionally throw "Azure voice not configured" for any user who relies on config-file settings.

To fix this properly:

  1. Add an azure sub-object to ResolvedTtsConfig in src/tts/tts.ts (mirroring the pattern for elevenlabs / openai / edge).
  2. Populate it in resolveTtsConfig() by reading raw.azure.* and resolving secrets with normalizeResolvedSecretInputString.
  3. Update SpeechSynthesisRequest or the shared config type so the provider can access resolved values.

Comment on lines +102 to +107
```ts
const voice = req.overrides?.azure?.voice ?? req.config?.azure?.voice;
const lang = req.overrides?.azure?.lang ?? req.config?.azure?.lang;
const outputFormat =
  req.overrides?.azure?.outputFormat ??
  req.config?.azure?.outputFormat ??
  DEFAULT_AZURE_OUTPUT_FORMAT;
```
Contributor


P2 overrides.azure does not exist on TtsDirectiveOverrides — override fields are always undefined

SpeechSynthesisRequest.overrides is typed as TtsDirectiveOverrides (defined in src/tts/tts.ts). That type includes openai, elevenlabs, and microsoft override bags, but has no azure field. Every access to req.overrides?.azure?.voice, req.overrides?.azure?.lang, and req.overrides?.azure?.outputFormat will always evaluate to undefined at runtime, and TypeScript should flag these as errors.

In practice this means that per-call directive overrides for voice, language, and output format can never be applied to the Azure provider. An azure bag should be added to TtsDirectiveOverrides and wired through parseTtsDirectives in the same way the existing providers' overrides are handled.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8e07d5c326


Comment on lines +30 to +32
```ts
const base = normalizeAzureBaseUrl(params.baseUrl);
const region = params.region || "eastus";
const url = `https://${region}.tts.speech.microsoft.com/cognitiveservices/voices/list`;
```

P2 Honor azure.baseUrl when listing voices

If a deployment sets messages.tts.azure.baseUrl (or passes listSpeechVoices(..., baseUrl)) for a custom/private Azure Speech endpoint, this still calls https://${region}.tts.speech.microsoft.com/... because the computed base value is never used. That makes Azure voice discovery fail anywhere the new config must target a non-default host, even though this commit adds baseUrl support and src/tts/providers/azure.test.ts:110-123 expects it to work.

Useful? React with 👍 / 👎.

Comment on lines +102 to +106
```ts
const voice = req.overrides?.azure?.voice ?? req.config?.azure?.voice;
const lang = req.overrides?.azure?.lang ?? req.config?.azure?.lang;
const outputFormat =
  req.overrides?.azure?.outputFormat ??
  req.config?.azure?.outputFormat ??
```

P2 Wire Azure directive overrides into the parser

These Azure override reads are currently unreachable. parseTtsDirectives only populates openai, elevenlabs, and microsoft fields (src/tts/tts-core.ts:154-319), and TtsDirectiveOverrides has no azure section (src/tts/tts.ts:159-180). As a result, prompts like [[tts:provider=azure voice=...]] cannot change the Azure voice/lang/output format, so the new provider silently ignores the model-override path it appears to support here.

Useful? React with 👍 / 👎.

Comment on lines +133 to +134
```ts
outputFormat,
fileExtension: outputFormat.includes("mp3") ? ".mp3" : ".wav",
```

P2 Infer Azure file extensions from the selected output format

Any non-MP3 Azure output format is surfaced as .wav here. textToSpeech writes the returned buffer using this extension (src/tts/tts.ts:662-663), so Azure Opus/WebM/PCM responses get saved with the wrong filename and downstream MIME/voice-note detection becomes incorrect. The bug shows up as soon as someone uses the newly added messages.tts.azure.outputFormat setting with anything other than an MP3 format.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ea9ffd2659

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +343 to +345
```typescript
apiKey: normalizeResolvedSecretInputString({
  value: raw.azure?.apiKey,
  path: "messages.tts.azure.apiKey",
```

P2: Auto-select Azure when it is the only configured API provider

Adding Azure config here does not make the new provider reachable from the existing default-provider path. When messages.tts.provider is unset and prefs are empty, getTtsProvider() still only auto-detects OpenAI and ElevenLabs (src/tts/tts.ts:490-506) because resolveTtsApiKey() returns nothing for Azure (src/tts/tts.ts:563-574). In the common setup where a user adds messages.tts.azure.* but does not also set messages.tts.provider, OpenClaw will silently keep using Microsoft instead of the newly configured Azure provider.
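A sketch of the kind of `azure` branch the comment asks for in a `resolveTtsApiKey`-style helper, so auto-detection can see the provider. The config shape and helper name here are assumptions, not the real `src/tts/tts.ts` code.

```typescript
// Hedged sketch: resolving a provider API key from config first, then the
// provider's environment variable, including the new azure branch.
type TtsConfig = {
  openai?: { apiKey?: string };
  elevenlabs?: { apiKey?: string };
  azure?: { apiKey?: string };
};

function resolveApiKey(
  provider: "openai" | "elevenlabs" | "azure",
  config: TtsConfig,
  env: Record<string, string | undefined>,
): string | undefined {
  switch (provider) {
    case "openai":
      return config.openai?.apiKey ?? env.OPENAI_API_KEY;
    case "elevenlabs":
      return config.elevenlabs?.apiKey ?? env.ELEVENLABS_API_KEY;
    case "azure":
      // Mirrors the env var this PR documents: AZURE_SPEECH_API_KEY.
      return config.azure?.apiKey ?? env.AZURE_SPEECH_API_KEY;
  }
}
```

With a branch like this, a default-provider scan that iterates providers and picks the first one with a resolvable key would pick up Azure when it is the only configured API provider.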


Comment on lines +92 to +96
```typescript
isConfigured: ({ config }) =>
  Boolean(
    (config as any)?.azure?.apiKey ||
    process.env.AZURE_SPEECH_API_KEY,
  ),
```

P2: Treat Azure as configured only after a voice is set

This readiness check is too loose for Azure, because unlike the other providers there is no default voice to fall back to. resolveReadySpeechProvider() trusts isConfigured() before synthesizing (src/tts/tts.ts:620-640), but synthesize() then hard-fails with Azure voice not configured at src/tts/providers/azure.ts:117-119. As a result, a deployment with only an API key configured will show Azure as ready and may select it as the primary/fallback provider even though every TTS request will fail at runtime until messages.tts.azure.voice is added.
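The stricter check the comment suggests could look like the sketch below: require both a resolvable key and a configured voice before reporting ready. Field names mirror this review thread, not necessarily the real `ResolvedTtsConfig`.

```typescript
// Sketch under the review's assumption: Azure has no default voice, so an
// API key alone is not enough for the provider to synthesize successfully.
function azureIsConfigured(
  config: { azure?: { apiKey?: string; voice?: string } },
  env: Record<string, string | undefined>,
): boolean {
  const hasKey = Boolean(config.azure?.apiKey ?? env.AZURE_SPEECH_API_KEY);
  const hasVoice = Boolean(config.azure?.voice);
  return hasKey && hasVoice;
}
```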


"X-Microsoft-OutputFormat": outputFormat,
},
body: ssml,
signal: AbortSignal.timeout((req.config as any)?.azure?.timeoutMs ?? 30000),

P2: Fall back to the global TTS timeout for Azure requests

Fresh evidence relative to the earlier timeout report: the new AbortSignal.timeout(...) now honors messages.tts.azure.timeoutMs, but it still bypasses the existing global messages.tts.timeoutMs. Every other hosted provider inherits req.config.timeoutMs, while this path hard-codes azure.timeoutMs ?? 30000, so deployments that only set the shared timeout will wait the full 30 seconds here. Because synthesizeSpeech() tries providers sequentially (src/tts/tts.ts:728-758), one slow Azure request can delay fallback much longer than the rest of the TTS stack.
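The suggested precedence is simple to express: provider-specific timeout first, then the shared `messages.tts.timeoutMs`, then the 30-second default. Field names follow this comment; the real config shape may differ.

```typescript
// Sketch of the three-level timeout fallback the review asks for.
const DEFAULT_TIMEOUT_MS = 30_000;

function resolveAzureTimeout(config: {
  timeoutMs?: number;
  azure?: { timeoutMs?: number };
}): number {
  return config.azure?.timeoutMs ?? config.timeoutMs ?? DEFAULT_TIMEOUT_MS;
}
```

The resulting value would then feed `AbortSignal.timeout(resolveAzureTimeout(req.config))`, so a deployment that only sets the shared timeout no longer waits the full 30 seconds before falling back.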



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 67a5b9b336


Comment on lines +124 to +127
```typescript
const outputFormat =
  azureOverride?.outputFormat ??
  (req.config as any)?.azure?.outputFormat ??
  DEFAULT_AZURE_OUTPUT_FORMAT;
```

P2: Honor voice-note targets when choosing Azure output

Checked src/tts/tts.ts:729-751: OpenClaw passes target="voice-note" for Telegram/Feishu/WhatsApp, and the other hosted providers switch to Opus on that target. Azure ignores req.target here and always uses the configured/default MP3 format, so Feishu auto-TTS will be uploaded as a generic file instead of audio because extensions/feishu/src/media.ts:490-497 only treats .opus/.ogg as msgType: "audio". This means the new provider cannot deliver the expected voice-bubble/audio behavior on at least Feishu even when TTS is otherwise configured correctly.
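A minimal sketch of the target-aware selection the comment describes: switch to an Opus container when the target is a voice note, otherwise use the configured or default format. The format strings are assumed Azure `X-Microsoft-OutputFormat` values, and the real request shape in OpenClaw likely differs.

```typescript
// Illustrative only: voice-note targets get an Opus-in-Ogg format so channel
// code that keys on .opus/.ogg (e.g. Feishu msgType: "audio") works.
function pickAzureFormat(
  target: "file" | "voice-note",
  configured?: string,
): string {
  if (target === "voice-note") return "ogg-48khz-16bit-mono-opus";
  return configured ?? "audio-24khz-96kbitrate-mono-mp3";
}
```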


Comment on lines +84 to +86
id: "azure",
label: "Azure Speech",
aliases: ["azure-tts"],

P2: Canonicalize the new azure-tts alias

The new alias is only registered in getSpeechProvider(). normalizeSpeechProviderId() still canonicalizes only edge, so a config or directive that sets provider: azure-tts leaves getTtsProvider() returning the alias string. From there resolveTtsProviderOrder() will try both azure-tts and azure, so one failing Azure request is retried a second time before any real fallback, and helpers like resolveTtsApiKey() do not recognize the alias at all. Since this commit advertises azure-tts as a provider alias, it should normalize to the canonical azure id like edge does for Microsoft.
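Alias canonicalization can be a small lookup applied before any provider ordering or key resolution, as sketched below. The alias table mirrors this review (`edge` for Microsoft is the existing case); the real `normalizeSpeechProviderId` signature is assumed.

```typescript
// Sketch: map every advertised alias to its canonical provider id once,
// so downstream helpers only ever see canonical ids.
const PROVIDER_ALIASES: Record<string, string> = {
  "azure-tts": "azure",
  edge: "microsoft",
};

function normalizeProviderId(id: string): string {
  const key = id.trim().toLowerCase();
  return PROVIDER_ALIASES[key] ?? key;
}
```

With this in place, `provider: azure-tts` resolves to `azure` before `resolveTtsProviderOrder()` runs, so a failing Azure request is not retried under both names.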


Yobo added 4 commits March 21, 2026 09:45
- Add Azure TTS provider with SSML synthesis
- Support for 400+ neural voices including Cantonese (zh-HK)
- Config options: apiKey, region, voice, lang, outputFormat
- Environment variables: AZURE_SPEECH_API_KEY, AZURE_SPEECH_REGION
- Provider ID: 'azure' with alias 'azure-tts'
- Built-in voices: zh-HK-HiuMaanNeural, zh-HK-HiuGaaiNeural
- Test voice list mapping from Azure API response
- Test filtering of deprecated voices
- Test error handling for API failures
- Test custom baseUrl support
Fixed critical bugs identified by bot reviews:

1. baseUrl now used in listAzureVoices (was computed but unused)
2. region now used in synthesize endpoint construction
3. Deprecated-voice filter runs BEFORE map (Status field available)
4. Added azure to ResolvedTtsConfig type
5. Added azure to TtsDirectiveOverrides for directive support
6. Added DEFAULT_AZURE_OUTPUT_FORMAT constant
7. Added AbortController timeout for synthesize requests
8. Used type assertion for config.azure access (req.config as any)

All changes follow the suggested fixes from greptile-apps and chatgpt-codex-connector reviews.
Fixed all P2 issues identified by code review:

1. Added azure_voice directive support in parseTtsDirectives
   - Added 'azure_voice' and 'azurevoice' directive cases
   - Azure voice validation: accepts non-empty ShortName format

2. Fixed fileExtension mapping for non-MP3 Azure formats
   - Added getFileExtension() helper function
   - Maps: mp3->.mp3, wav->.wav, ogg->.ogg, webm->.webm
   - Default to .mp3 for unknown formats

3. Added azure to TTS_PROVIDERS array
   - Now included in auto-selection priority list

4. Fixed Azure readiness check (isConfigured)
   - Now properly checks for both apiKey AND voice/lang
   - Uses ResolvedTtsConfig properly

5. Fixed timeout handling
   - Added DEFAULT_TIMEOUT_MS constant (30000)
   - Uses timeout from config, directive, or default
   - Properly passed to AbortSignal.timeout()

6. Added azure to resolveTtsApiKey function
   - Returns config.azure.apiKey || AZURE_SPEECH_API_KEY

7. Added azure to TtsDirectiveOverrides type
   - Supports voice, lang, outputFormat overrides
@leonchui leonchui force-pushed the feature/azure-tts branch from 67a5b9b to 5d199eb on March 21, 2026 16:45
Yobo and others added 2 commits March 21, 2026 09:49
The azure type block was placed outside the interface, causing
'export modifier cannot appear on a type member' error.
Now azure is properly inside TtsDirectiveOverrides interface.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 455981919b


Comment on lines +26 to +31
```typescript
function getFileExtension(outputFormat: string): string {
  if (outputFormat.includes("mp3")) return ".mp3";
  if (outputFormat.includes("wav")) return ".wav";
  if (outputFormat.includes("ogg")) return ".ogg";
  if (outputFormat.includes("webm")) return ".webm";
  return ".mp3"; // default to mp3
```

P2: Handle Azure Opus/PCM formats before defaulting to .mp3

This helper still mislabels several supported Azure output formats. Azure's REST API documents audio-...-opus, raw-*, and riff-* values for X-Microsoft-OutputFormat (for example: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech), but anything that doesn't contain mp3, wav, ogg, or webm falls through to .mp3 here. textToSpeech() then writes the returned bytes under that extension (src/tts/tts.ts:698), so choosing one of those formats causes downstream media/channel code to classify the file incorrectly.
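A hedged sketch of a fuller mapper: check container prefixes (`webm-`, `ogg-`, `riff-`, `raw-`) before the generic substring checks, so Opus and PCM formats stop falling through to `.mp3`. The format-name patterns are taken from the Azure documentation the comment links to; the exact mapping (especially `.pcm` for raw output) is an assumption, not OpenClaw's decision.

```typescript
// Assumed Azure format-name patterns: audio-*-mp3, ogg-*-opus, webm-*-opus,
// riff-*-pcm (RIFF container = WAV), raw-* (headerless PCM/mulaw/alaw).
function azureFileExtension(outputFormat: string): string {
  const f = outputFormat.toLowerCase();
  if (f.startsWith("webm")) return ".webm";
  if (f.startsWith("ogg")) return ".ogg";
  if (f.startsWith("riff")) return ".wav"; // RIFF container is a WAV file
  if (f.startsWith("raw")) return ".pcm";  // headerless samples, assumed name
  if (f.includes("opus")) return ".opus";
  if (f.includes("mp3")) return ".mp3";
  return ".mp3"; // last-resort default
}
```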


Comment on lines +127 to +129
```typescript
if (!voice) {
  throw new Error(
    "Azure voice not configured. Set voice in config or use [[tts:voice=zh-HK-HiuMaanNeural]] directive",
```

P2: Point missing-voice errors at a directive Azure actually parses

When messages.tts.azure.voice is unset, this error tells users to retry with [[tts:voice=...]], but parseTtsDirectives() still routes the generic voice= key into overrides.openai (src/tts/tts-core.ts:167-177). Azure only reads req.overrides.azure.voice, so following the suggested fix leaves the provider in the same failing state until the voice is set in config or the caller discovers the separate azure_voice directive.

