fix: validate Edge TTS output file is non-empty before reporting success #43273
howardpen9 wants to merge 1 commit into openclaw:main
The `edgeTTS()` function calls `tts.ttsPromise()` from node-edge-tts, which resolves successfully even when the Bing TTS service sends `turn.end` without any audio frames. This results in a 0-byte MP3 file being returned as a successful TTS result. Add a `statSync` check after `ttsPromise` resolves: if the output file is 0 bytes, throw so the provider-fallback loop in `textToSpeech()` can try the next provider (OpenAI / ElevenLabs) instead of delivering an empty audio file. Closes openclaw#43229
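To illustrate why throwing on a 0-byte file matters, here is a minimal sketch of a provider-fallback loop. The `TtsProvider` interface, `assertNonEmpty`, `synthesizeWithFallback`, and both stub providers are hypothetical names invented for this example; the real `textToSpeech()` loop in `src/tts/tts-core.ts` may be structured differently.

```typescript
import { mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical provider shape for illustration; not the project's real type.
interface TtsProvider {
  name: string;
  synthesize(text: string, outputPath: string): Promise<void>;
}

// Mirrors the check described in the PR: a 0-byte file counts as a failure.
function assertNonEmpty(outputPath: string, providerName: string): void {
  const { size } = statSync(outputPath);
  if (size === 0) {
    throw new Error(`${providerName} produced empty audio file`);
  }
}

// Tries providers in order and returns the name of the first one that
// leaves a non-empty file at outputPath.
async function synthesizeWithFallback(
  text: string,
  outputPath: string,
  providers: TtsProvider[],
): Promise<string> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      await provider.synthesize(text, outputPath);
      assertNonEmpty(outputPath, provider.name);
      return provider.name;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`All TTS providers failed: ${errors.join("; ")}`);
}

// Stub providers: the first resolves but writes nothing, like the Edge bug.
const emptyProvider: TtsProvider = {
  name: "edge-stub",
  synthesize: async (_text, outputPath) => writeFileSync(outputPath, ""),
};
const workingProvider: TtsProvider = {
  name: "fallback-stub",
  synthesize: async (_text, outputPath) => writeFileSync(outputPath, "fake-audio"),
};

const demoPath = join(mkdtempSync(join(tmpdir(), "tts-demo-")), "out.mp3");
```

Calling `synthesizeWithFallback("hello", demoPath, [emptyProvider, workingProvider])` skips the stub that wrote nothing and resolves with the second provider's name. Without the size check, the first provider would be reported as successful.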
Greptile Summary
This PR fixes a real bug where
Confidence Score: 4/5
Last reviewed commit: 29cac6a
    const { size } = statSync(outputPath);
    if (size === 0) {
      throw new Error("Edge TTS produced empty audio file");
    }
Use async stat and guard against missing file
`statSync` blocks the event loop. While this is a minor concern after an async network call, it is still worth using the async variant to stay consistent with the surrounding `async` function.
More importantly, if `ttsPromise` somehow resolves without ever creating the file (e.g. a future `node-edge-tts` version that skips creating the `WriteStream` on certain error paths), `statSync` will throw an `ENOENT` error whose message will surface in the provider-fallback error log instead of the clearer `"Edge TTS produced empty audio file"` message.
Suggested change:

    - const { size } = statSync(outputPath);
    - if (size === 0) {
    -   throw new Error("Edge TTS produced empty audio file");
    - }
    + let size = 0;
    + try {
    +   ({ size } = statSync(outputPath));
    + } catch {
    +   // File was never created; treat the same as an empty file.
    + }
    + if (size === 0) {
    +   throw new Error("Edge TTS produced empty audio file");
    + }
Alternatively, switch to `import { stat } from "node:fs/promises"` and `await stat(outputPath)` with the same try/catch pattern.
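The async alternative might look like the sketch below. `sizeOrZero` and `assertAudioWritten` are hypothetical helper names, not functions from the PR. Unlike the bare `catch` in the suggestion above, this variant treats only `ENOENT` as "empty" and rethrows other errors such as `EACCES`; that is a design choice of this sketch, not something the review asked for.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { stat } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns the file size, or 0 if the file does not exist.
// Any other error (e.g. EACCES) is rethrown rather than masked.
async function sizeOrZero(path: string): Promise<number> {
  try {
    return (await stat(path)).size;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return 0;
    throw err;
  }
}

// Hypothetical helper: rejects with the PR's error message when the
// output file is missing or has zero bytes.
async function assertAudioWritten(outputPath: string): Promise<void> {
  if ((await sizeOrZero(outputPath)) === 0) {
    throw new Error("Edge TTS produced empty audio file");
  }
}

// Demo fixtures: one path that was never created, one with content.
const statDemoDir = mkdtempSync(join(tmpdir(), "tts-stat-"));
const missingPath = join(statDemoDir, "missing.mp3");
const filledPath = join(statDemoDir, "filled.mp3");
writeFileSync(filledPath, "fake-audio");
```

With these fixtures, `assertAudioWritten(filledPath)` resolves, while `assertAudioWritten(missingPath)` rejects with the same message an empty file would produce, so the fallback log stays readable in both cases.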
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 29cac6a5e2
    const { size } = statSync(outputPath);
    if (size === 0) {
      throw new Error("Edge TTS produced empty audio file");
Wait for file flush before rejecting zero-byte Edge output
This synchronous `statSync` check can misclassify successful Edge TTS calls as failures when `ttsPromise()` resolves before the underlying write stream has fully flushed to disk (a realistic timing for async `createWriteStream` writes). In that case `size` is still 0 (or transiently not finalized), so we throw and trigger provider fallback, or a hard failure in Edge-only setups, even though valid audio is written moments later. The check needs a short readiness wait/retry (or a completion signal tied to file flush) instead of a single immediate stat.
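A short readiness wait of the kind described here could be sketched as follows. `waitForNonEmptyFile`, its default attempt count, and its delay are assumptions made for illustration. Whether `ttsPromise()` can actually resolve before the stream flushes is not confirmed in this thread, and a completion signal from the write stream itself would be more robust than polling.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { stat } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { setTimeout as sleep } from "node:timers/promises";

// Polls until the file has a non-zero size or the attempts run out.
// A missing file is treated as size 0 and retried like an empty one.
async function waitForNonEmptyFile(
  path: string,
  attempts = 5,
  delayMs = 50,
): Promise<number> {
  for (let i = 0; i < attempts; i++) {
    const size = await stat(path).then((s) => s.size, () => 0);
    if (size > 0) return size;
    if (i < attempts - 1) await sleep(delayMs);
  }
  throw new Error("Edge TTS produced empty audio file");
}

// Demo: the file starts empty and receives content shortly afterwards,
// simulating a write that lands after the promise has resolved.
const waitDemoDir = mkdtempSync(join(tmpdir(), "tts-wait-"));
const latePath = join(waitDemoDir, "late.mp3");
writeFileSync(latePath, "");
setTimeout(() => writeFileSync(latePath, "fake-audio"), 30);
```

The trade-off is added latency on the genuine failure path: with these defaults, a truly empty result is reported only after four delays (about 200 ms) instead of immediately.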
Summary
- Add a `statSync` check after `tts.ttsPromise()` resolves in `edgeTTS()`: if the output is 0 bytes, throw an error so the provider-fallback loop in `textToSpeech()` tries the next provider instead of delivering an empty audio file.

Root Cause
node-edge-tts's `ttsPromise()` creates a `WriteStream`, connects to Bing's TTS WebSocket, and resolves the promise when it receives `turn.end`. However, if the service sends `turn.end` without any preceding audio frames (e.g. due to rate-limiting, an unsupported voice/format, or transient errors), the promise still resolves successfully, leaving a 0-byte file on disk. `textToSpeech()` then treats this as a successful result and delivers the empty file to the channel (e.g. a Telegram voice message with no audio).

Fix
A single `statSync` check after `ttsPromise` resolves throws when the output file is 0 bytes. This allows the existing provider-fallback mechanism to kick in and try OpenAI/ElevenLabs TTS instead.

Test plan
- `edgeTTS` throws when output file is 0 bytes
- `edgeTTS` succeeds when output file has content

Closes #43229