Discord voice message plays at ~0.5x speed with 24kHz TTS source (mlx-audio Qwen3-TTS) #32293
Closed
Description
Discord voice messages generated from a 24kHz TTS source play back at roughly 0.5x speed (noticeably slow). The same MP3 file sent as a regular attachment plays at normal speed.
Environment
- OpenClaw latest (npm)
- TTS provider: OpenAI-compatible (mlx-audio server)
- TTS model: mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit
- Source audio: MP3, 24000 Hz, mono, 128 kbps
- Channel: Discord
Steps to Reproduce
- Configure TTS with an OpenAI-compatible server that outputs 24kHz MP3 (e.g. mlx-audio)
- Use the tts tool to generate a voice message
- Listen to the Discord voice message; it plays at ~0.5x speed
Expected
Voice message plays at normal speed (1x).
Actual
Voice message plays at ~0.5x speed. The audio sounds slowed down.
Analysis
The TTS server outputs MP3 at a 24000 Hz sample rate. When OpenClaw converts this to a Discord voice message (Opus encoding), there appears to be a mismatch between the 24 kHz source rate and the 48 kHz rate typically used for Opus, which would explain the factor-of-two slowdown. Sending the same MP3 as a regular file attachment plays correctly.
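As a rough sanity check on the factor of two, the heard speed is the rate at which samples are played divided by the rate at which they were produced. This is a minimal sketch of that arithmetic only; `playback_speed` is a hypothetical helper, not anything from OpenClaw.

```python
def playback_speed(true_rate_hz: int, played_rate_hz: int) -> float:
    """Speed factor heard when audio sampled at true_rate_hz
    is played back as if it were sampled at played_rate_hz."""
    return played_rate_hz / true_rate_hz


# 48 kHz samples played as if they were 24 kHz: half speed (the reported symptom).
print(playback_speed(48_000, 24_000))  # 0.5

# 24 kHz samples played as if they were 48 kHz: double speed.
print(playback_speed(24_000, 48_000))  # 2.0
```

Only the first case matches the reported ~0.5x playback, so the mismatch may sit on the output side of the conversion (for example, in how the encoded audio or its metadata is tagged) rather than in how the input is read. That is speculation until the conversion path is inspected.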
ffprobe output of TTS source:
Stream #0:0: Audio: mp3 (mp3float), 24000 Hz, mono, fltp, 128 kb/s
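To check the sample rate programmatically rather than reading the stream line, ffprobe can emit JSON. The sketch below assumes ffprobe is on the PATH; the helper names are hypothetical and the sample JSON is hand-written for illustration, not captured output.

```python
import json
import subprocess


def ffprobe_cmd(path: str) -> list[str]:
    """Build an ffprobe command that reports the first audio stream as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name,sample_rate,channels",
        "-of", "json",
        path,
    ]


def parse_sample_rate(ffprobe_json: str) -> int:
    """Extract the sample rate (ffprobe reports it as a string) as an int."""
    streams = json.loads(ffprobe_json)["streams"]
    return int(streams[0]["sample_rate"])


def probe_sample_rate(path: str) -> int:
    """Run ffprobe on a file and return its audio sample rate in Hz."""
    result = subprocess.run(
        ffprobe_cmd(path), capture_output=True, text=True, check=True
    )
    return parse_sample_rate(result.stdout)


# Illustrative JSON in the shape ffprobe produces for the command above.
sample = '{"streams": [{"codec_name": "mp3", "sample_rate": "24000", "channels": 1}]}'
print(parse_sample_rate(sample))  # 24000
```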
Workaround
Resampling the audio to 48 kHz before passing it to OpenClaw would likely fix the problem, but OpenClaw should handle this internally.
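One way to try this workaround is to resample the TTS output with ffmpeg before handing it to OpenClaw. This is an untested sketch that assumes ffmpeg is installed; the function names and file names are hypothetical, and whether it actually resolves the slow playback has not been confirmed.

```python
import subprocess


def build_resample_cmd(src: str, dst: str, rate_hz: int = 48_000) -> list[str]:
    """Build an ffmpeg command that resamples src to rate_hz, mono, writing dst."""
    return [
        "ffmpeg",
        "-y",                 # overwrite dst if it already exists
        "-i", src,            # input file (e.g. the 24 kHz TTS MP3)
        "-ar", str(rate_hz),  # output sample rate
        "-ac", "1",           # keep the output mono
        dst,
    ]


def resample(src: str, dst: str, rate_hz: int = 48_000) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_resample_cmd(src, dst, rate_hz), check=True)


# Show the command that would be run for a hypothetical pair of files.
print(" ".join(build_resample_cmd("tts_24k.mp3", "tts_48k.mp3")))
```

Calling `resample("tts_24k.mp3", "tts_48k.mp3")` would then produce a 48 kHz file to send in place of the original.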