feishu: pass audio duration to file upload to fix TTS truncation (#33043) #33060
Greptile Summary
This PR fixes Feishu TTS audio truncation by probing the audio duration via ffprobe and passing it to the Feishu file upload API. Without the
Key changes:
The implementation is straightforward and well-scoped; one minor comment inaccuracy has been flagged for clarification.
Confidence Score: 5/5
Last reviewed commit: 79598f3
// For local paths, we can probe duration before loading into memory.
// Feishu's audio API requires duration in ms; without it the client
// defaults to 0ms and truncates playback.
let localSourcePath: string | undefined;
The comment says "we can probe duration before loading into memory", but the actual `getAudioDurationMs` call happens on line 476, after `loadWebMedia` (line 451) has already loaded the file into memory. The variable `localSourcePath` is captured early, but the probing itself is deferred. The comment describes the intent but misrepresents the runtime timing.
Suggested change
- // For local paths, we can probe duration before loading into memory.
- // Feishu's audio API requires duration in ms; without it the client
- // defaults to 0ms and truncates playback.
- let localSourcePath: string | undefined;
+ // Capture local path before loading so we can probe its duration later.
+ // Feishu's audio API requires duration in ms; without it the client
+ // defaults to 0ms and truncates playback.
+ let localSourcePath: string | undefined;
Path: extensions/feishu/src/media.ts
Line: 438-441
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 79598f3ced
  name = fileName ?? "file";
} else if (mediaUrl) {
  // Detect local file paths (absolute paths not starting with a URL scheme)
  if (/^\//.test(mediaUrl) || /^[A-Za-z]:[/\\]/.test(mediaUrl)) {
Handle file:// media URLs when probing upload duration
sendMediaFeishu only marks a source as local when mediaUrl starts with / or a drive letter, but Feishu outbound passes mediaUrl through unchanged and loadWebMedia explicitly supports file://... inputs by converting them to local paths. For file://...voice.opus/file://...mp4 uploads, this branch skips getAudioDurationMs, so uploadFileFeishu omits duration and the original Feishu truncation behavior (0:00 playback) still occurs for that valid local-media form.
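One way to cover the `file://` case raised in this review comment is to normalize the input before the local-path check. The following is a minimal sketch, not code from the PR: `resolveLocalSourcePath` is a hypothetical helper name, and it assumes Node's `fileURLToPath` from `node:url` is an acceptable way to convert `file://` URLs.

```typescript
import { fileURLToPath } from "node:url";

// Hypothetical helper (not part of the PR): returns a local filesystem path
// when mediaUrl points at a local file, otherwise undefined.
function resolveLocalSourcePath(mediaUrl: string): string | undefined {
  // file:// URLs are converted to a platform path.
  if (mediaUrl.startsWith("file://")) {
    try {
      return fileURLToPath(mediaUrl);
    } catch {
      // Malformed file URL: treat as non-local so no probe is attempted.
      return undefined;
    }
  }
  // Same checks as the snippet under review: POSIX absolute path or
  // Windows drive-letter path.
  if (/^\//.test(mediaUrl) || /^[A-Za-z]:[/\\]/.test(mediaUrl)) {
    return mediaUrl;
  }
  return undefined;
}
```

With a helper like this, remote URLs such as `https://...` still return `undefined`, so duration probing stays limited to local sources.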
… playable audio messages

The Feishu outbound media router only recognized .opus and .ogg as audio, sending common formats like mp3/wav/m4a/flac/aac as file attachments instead of inline playable audio bubbles.

Changes:
- Use existing isAudioFileName() + mimeKind === 'audio' for audio detection, consistent with how image routing already uses mimeKind === 'image'
- Expand detectFileType() to map common audio extensions to 'opus'
- Add audio duration parsing via music-metadata (already in monorepo for extensions/matrix) with graceful degradation
- Update existing test and add coverage for mp3/wav/m4a/flac/aac routing

Verified: Feishu im/v1/files API accepts mp3/wav/m4a/flac/aac with file_type=opus and plays them correctly. This is consistent with the community Feishu plugin @m1heng-clawd/feishu (2.3k+ stars) which maps all common audio extensions to file_type=opus in production.

Related: openclaw#33060, openclaw#43388, openclaw#33736, openclaw#28269
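The extension mapping described in the commit message above could look roughly like the sketch below. It is an illustration only: `detectFileTypeSketch` and `FeishuFileTypeSketch` are hypothetical names, the real `detectFileType()` may handle more types, and the `"stream"` fallback is an assumption about how non-audio, non-video files are labeled.

```typescript
// Hypothetical subset of Feishu file_type values used for illustration.
type FeishuFileTypeSketch = "opus" | "mp4" | "stream";

// Audio extensions named in the commit message, plus the original .opus/.ogg.
const AUDIO_EXTENSIONS = new Set([
  ".opus", ".ogg", ".mp3", ".wav", ".m4a", ".flac", ".aac",
]);

function detectFileTypeSketch(fileName: string): FeishuFileTypeSketch {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  // All common audio formats are uploaded with file_type "opus".
  if (AUDIO_EXTENSIONS.has(ext)) return "opus";
  if (ext === ".mp4") return "mp4";
  // Assumed fallback for everything else.
  return "stream";
}
```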
Summary
- `sendMediaFeishu` calls `uploadFileFeishu` for opus audio without providing the `duration` field (milliseconds). Feishu's `im.file.create` API treats a missing `duration` as 0 ms; the client renders a 0:00 progress bar and halts playback after the initial buffer drains.
- … `sendMediaFeishu`, so the Feishu upload code path is never reached.
- Add `parseFfprobeDurationSecs` (pure parser, easy to unit-test) and `getAudioDurationMs` (ffprobe-backed, returns `number | undefined`) to `src/media/ffmpeg-exec.ts`.
- Expose `getAudioDurationMs` through `PluginRuntimeCore.media` so extensions can use it without bypassing the plugin boundary.
- In `sendMediaFeishu`, when the source is a local file path and the file type is `opus` or `mp4`, the duration is probed before upload and forwarded to `uploadFileFeishu`. If ffprobe is unavailable the field is simply omitted (graceful degradation).
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Feishu TTS audio now plays the full reply instead of being truncated. No behavior changes for any other channel.
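The wiring described in the Summary (probe only for local `opus`/`mp4` sources, and omit `duration` when no value is available) can be sketched as two small pure functions. This is a hedged illustration, not the code in `extensions/feishu/src/media.ts`: `UploadFieldsSketch`, `shouldProbeDuration`, and `withDuration` are hypothetical names, and the real upload parameters may be shaped differently.

```typescript
// Hypothetical shape of the fields forwarded to the upload call.
interface UploadFieldsSketch {
  fileType: string;
  fileName: string;
  duration?: number; // milliseconds
}

// Probe only when the source is a local path and the type carries a duration.
function shouldProbeDuration(
  fileType: string,
  localSourcePath: string | undefined,
): boolean {
  return (
    localSourcePath !== undefined &&
    (fileType === "opus" || fileType === "mp4")
  );
}

// Attach duration only when probing produced a value; otherwise leave the
// field out entirely so the request matches the pre-fix shape.
function withDuration(
  fields: UploadFieldsSketch,
  durationMs: number | undefined,
): UploadFieldsSketch {
  return durationMs === undefined
    ? { ...fields }
    : { ...fields, duration: durationMs };
}
```

Keeping the decision and the field assembly separate from the ffprobe call makes both easy to unit-test without a media file.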
Security Impact (required)
… `duration` field)
Yes, explain risk + mitigation: N/A
Repro + Verification
Environment
`messages.tts.auto: "always"`
Steps
… `messages.tts.auto: "always"` in config.
Expected
Actual (before fix)
Actual (after fix)
Evidence
- `parseFfprobeDurationSecs` unit tests: all green (`pnpm test -- extensions/feishu/src/media.test.ts src/media/ffmpeg-exec.test.ts`)
- `pnpm tsgo`: no new type errors (pre-existing `tlon` module errors unrelated)
Human Verification (required)
`getAudioDurationMs` returns `undefined` gracefully when ffprobe is unavailable; `mediaBuffer`-only calls do not trigger ffprobe.
Compatibility / Migration
Failure Recovery (if this breaks)
- `git revert 79598f3ce`
- `extensions/feishu/src/media.ts`, `src/media/ffmpeg-exec.ts`, `src/plugins/runtime/runtime-media.ts`, `src/plugins/runtime/types-core.ts`
- … `duration` field (extremely unlikely; field is documented and optional-but-recommended by Feishu)
Risks and Mitigations
`getAudioDurationMs` catches all errors and returns `undefined`; `duration` is then omitted from the upload request, preserving the pre-fix behavior rather than hard-failing.
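The graceful-degradation behavior described above can be illustrated with a minimal sketch. It is not the PR's implementation: `parseDurationSecsSketch` and `probeDurationMsSketch` are hypothetical stand-ins for `parseFfprobeDurationSecs` and `getAudioDurationMs`, and the ffprobe arguments and 10-second timeout are assumptions about one reasonable way to query a container's duration.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Pure parser: turns ffprobe's plain-text duration output (seconds) into a
// number, or undefined for empty, "N/A", or negative values.
function parseDurationSecsSketch(stdout: string): number | undefined {
  const secs = Number.parseFloat(stdout.trim());
  return Number.isFinite(secs) && secs >= 0 ? secs : undefined;
}

// ffprobe-backed probe: returns milliseconds, or undefined on any failure
// (ffprobe missing, unreadable file, timeout, unparsable output).
async function probeDurationMsSketch(
  filePath: string,
): Promise<number | undefined> {
  try {
    const { stdout } = await execFileAsync(
      "ffprobe",
      [
        "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        filePath,
      ],
      { encoding: "utf8", timeout: 10_000 }, // assumed timeout
    );
    const secs = parseDurationSecsSketch(stdout);
    return secs === undefined ? undefined : Math.round(secs * 1000);
  } catch {
    // Never throw: the caller omits `duration` when this is undefined.
    return undefined;
  }
}
```

Splitting the parser from the process call mirrors the PR's stated motivation that the parser is pure and easy to unit-test.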