fix(discord): resample audio to 48kHz for voice messages #32298
steipete merged 1 commit into openclaw:main
Fixes openclaw#32293: Discord voice message plays at ~0.5x speed with 24kHz TTS source
When TTS providers (like mlx-audio Qwen3-TTS) output audio at 24kHz, Discord voice messages play at half speed because Discord expects 48kHz.
This fix adds explicit sample rate conversion to 48kHz in the ensureOggOpus function, ensuring voice messages always play at the correct speed regardless of the input audio's sample rate.
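As a rough illustration of the change described above, the ffmpeg argument list could be built by a small helper like the one below. This is only a sketch: `buildOggOpusArgs` and `DISCORD_VOICE_SAMPLE_RATE` are hypothetical names, and the real `ensureOggOpus` in `src/discord/voice-message.ts` may be structured differently. The flags themselves (`-ar 48000`, `-c:a libopus`, `-b:a 64k`) are the ones quoted in the review of this PR.

```typescript
// Hypothetical helper, not the project's actual code.
// Assumption: voice messages should always be encoded at 48 kHz.
const DISCORD_VOICE_SAMPLE_RATE = 48_000;

function buildOggOpusArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                                      // overwrite the output file if it exists
    "-i", inputPath,                           // source audio at any sample rate
    "-ar", String(DISCORD_VOICE_SAMPLE_RATE),  // resample to 48 kHz
    "-c:a", "libopus",                         // encode with the Opus codec
    "-b:a", "64k",                             // audio bitrate
    outputPath,
  ];
}
```

The resulting array would be passed as the second argument to the existing `execFileAsync("ffmpeg", ...)` call, so a 24kHz TTS file and a 48kHz file both come out at the same rate.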
🔒 Aisle Security Analysis
We found 1 potential security issue in this PR:
1. 🔵 Unbounded ffmpeg processing in voice message conversion can cause resource-exhaustion (DoS)
Description
Impact:
Vulnerable code:
```typescript
await execFileAsync("ffmpeg", [
  "-y",
  "-i",
  filePath,
  "-ar",
  "48000",
  "-c:a",
  "libopus",
  "-b:a",
  "64k",
  outputPath,
]);
```
Recommendation
Add hard limits around transcoding:
```typescript
// Reject overly long inputs before spending CPU time on them.
const duration = await getAudioDuration(filePath);
if (duration > 120) throw new Error("Voice messages must be <= 120s");
```
```typescript
import { runCommandWithTimeout } from "../process/exec.js";

await runCommandWithTimeout([
  "ffmpeg",
  "-y",
  "-i", filePath,
  "-t", "120", // cap work
  "-ar", "48000",
  "-c:a", "libopus",
  "-b:a", "64k",
  outputPath,
], { timeoutMs: 30_000 });
```
Optionally also reduce worst-case probing costs:
Analyzed PR: #32298 at commit
Last updated on: 2026-03-02T23:47:23Z
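The duration limit recommended in the security analysis can be isolated into a small guard that runs before any transcoding starts. The sketch below is an assumption about how that might look: `assertVoiceDuration` and `MAX_VOICE_SECONDS` are hypothetical names, and the 120-second limit is simply the value used in the recommendation.

```typescript
// Hypothetical guard, not part of the PR.
// Assumption: the caller has already obtained the duration in seconds.
const MAX_VOICE_SECONDS = 120;

function assertVoiceDuration(
  durationSeconds: number,
  maxSeconds: number = MAX_VOICE_SECONDS,
): void {
  // A NaN, infinite, or negative duration means probing failed; refuse rather than guess.
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new Error("Could not determine audio duration");
  }
  if (durationSeconds > maxSeconds) {
    throw new Error(`Voice messages must be <= ${maxSeconds}s`);
  }
}
```

A guard like this only bounds the input length. Bounding wall-clock time is a separate concern; Node's `child_process.execFile` accepts a `timeout` option in milliseconds that could serve that purpose if a project-specific wrapper is not available.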
Greptile Summary
This PR fixes half-speed Discord voice message playback (issue #32293) by adding explicit 48kHz resampling to the conversion path.
Key concern:
Confidence Score: 3/5
Last reviewed commit: 09ec684
Additional Comments (1)
Path: src/discord/voice-message.ts
Line: 162-181
Comment:
**Early-return bypasses 48kHz resampling for existing OGG/Opus files**
The fix correctly adds `-ar 48000` to the conversion path, but the early-return for files that are already OGG/Opus at lines 176-178 still returns the file unchanged, without verifying its sample rate. If a TTS provider outputs OGG/Opus natively at 24kHz, this path will skip resampling and Discord will still play the audio at half speed, because the OGG header declares 24kHz and Discord honours that declared rate.
To be fully consistent with the fix's intent, the OGG/Opus fast-path should also verify the stream's sample rate with `ffprobe` and fall through to conversion when it isn't 48kHz:
```typescript
if (ext === ".ogg") {
  try {
    const { stdout: codecOut } = await execFileAsync("ffprobe", [
      "-v", "error",
      "-select_streams", "a:0",
      "-show_entries", "stream=codec_name,sample_rate",
      "-of", "csv=p=0",
      filePath,
    ]);
    const [codec, sampleRate] = codecOut.trim().toLowerCase().split(",");
    if (codec === "opus" && sampleRate === "48000") {
      return { path: filePath, cleanup: false };
    }
    // else: fall through to conversion (wrong codec or wrong sample rate)
  } catch {
    // If probe fails, convert anyway
  }
}
```
Without this change, the bug described in the linked issue can still surface whenever the input is already in an OGG/Opus container but encoded at a sample rate other than 48kHz.
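The fast-path decision in this review comment depends only on the text that `ffprobe` prints, so it can be pulled into a pure function and unit-tested without running `ffprobe`. The following is a minimal sketch under that assumption; `isDiscordReadyOpus` is a hypothetical name, and the expected `codec_name,sample_rate` layout is the one produced by the `-show_entries stream=codec_name,sample_rate -of csv=p=0` arguments quoted in the comment.

```typescript
// Hypothetical helper, not part of the PR.
// Input: raw stdout from the ffprobe call, e.g. "opus,48000\n".
function isDiscordReadyOpus(probeOutput: string): boolean {
  // Only the first line matters because a single stream (a:0) is selected.
  const firstLine = probeOutput.trim().split(/\r?\n/)[0] ?? "";
  const [codec, sampleRate] = firstLine
    .toLowerCase()
    .split(",")
    .map((field) => field.trim());
  // Skip conversion only when both the codec and the sample rate match.
  return codec === "opus" && sampleRate === "48000";
}
```

With a helper like this, the fast-path would return the original file when `isDiscordReadyOpus(codecOut)` is true and fall through to conversion in every other case, including empty or malformed probe output.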
Landed.
Thanks @kevinWangSheng!
Fixes openclaw#32293: Discord voice message plays at ~0.5x speed with 24kHz TTS source
When TTS providers (like mlx-audio Qwen3-TTS) output audio at 24kHz, Discord voice messages play at half speed because Discord expects 48kHz.
This fix adds explicit sample rate conversion to 48kHz in the ensureOggOpus function, ensuring voice messages always play at the correct speed regardless of the input audio's sample rate.
Co-authored-by: Kevin Shenghui
(cherry picked from commit 924d9e3)