Skip to content

fix(feishu): harden voice reply delivery#43388

Open
WaynePika wants to merge 2 commits intoopenclaw:mainfrom
WaynePika:codex/feishu-voice-hardening
Open

fix(feishu): harden voice reply delivery#43388
WaynePika wants to merge 2 commits intoopenclaw:mainfrom
WaynePika:codex/feishu-voice-hardening

Conversation

@WaynePika
Copy link
Copy Markdown
Contributor

Summary

  • transcode audio-like outbound media to ogg/opus before Feishu audio send
  • include duration metadata for Feishu audio/media uploads
  • strip internal [[tts]] tags from Feishu reply dispatch and outbound payload normalization
  • add regression coverage for media send and reply cleanup

Why

Feishu voice replies were not reliable in practice:

  • Edge TTS produced mp3, but Feishu audio messages need a compatible audio upload path
  • missing duration metadata could show 0:00 or incorrect duration in the client
  • internal [[tts]] tags could leak into user-visible messages

Testing

  • local runtime validation on mini against a real Feishu group
  • verified Feishu audio messages were sent successfully after transcode
  • verified client duration display after adding duration metadata
  • verified latest messages no longer exposed raw [[tts]] tags
  • added test coverage in the Feishu extension and outbound normalization layers

@openclaw-barnacle openclaw-barnacle bot added channel: feishu Channel integration: feishu size: M labels Mar 11, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR hardens Feishu voice reply delivery through three coordinated changes: transcoding incoming audio (mp3, m4a, wav, etc.) to ogg/opus via ffmpeg before upload, attaching ffprobe-derived duration metadata to audio/video uploads, and stripping internal [[tts:...]] tags from both the Feishu reply dispatcher and the shared outbound payload normalizer. All three changes come with test coverage.

Key findings:

  • probeBufferDurationMs is missing a try/catch around its writeFile step — unlike probeMediaDurationMs which returns undefined on any error, a temp-write failure here will propagate and abort the entire media send. See inline comment on media.ts:459.
  • transcodeAudioToFeishuOpus silently returns null on any failure with no log output, so transcode degradation (e.g. ffmpeg not installed or malformed input) is invisible in production diagnostics.
  • stripInternalTtsTags is copy-pasted verbatim into both reply-dispatcher.ts and payloads.ts; a shared utility would be easier to keep in sync.

Confidence Score: 3/5

  • Generally safe to merge; the one logic gap in probeBufferDurationMs should be fixed before shipping to avoid rare-but-complete send failures.
  • The TTS-tag stripping and msg_type routing changes are clean and well-tested. The transcode pipeline is sound in the happy path, but probeBufferDurationMs missing a try/catch means a filesystem hiccup during duration probing can silently kill an audio message send — a regression relative to the graceful error handling in every other helper in this file. The silent transcode failure also reduces observability. These lower confidence below merge-ready.
  • extensions/feishu/src/media.ts — specifically the probeBufferDurationMs function (missing error catch) and the transcodeAudioToFeishuOpus catch block (no logging).

Last reviewed commit: dac035f

Comment on lines +459 to +473
async function probeBufferDurationMs(params: {
buffer: Buffer;
fileName: string;
}): Promise<number | undefined> {
const ext = path.extname(params.fileName).toLowerCase() || ".bin";
const tmpDir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "openclaw-feishu-probe-"));
const filePath = path.join(tmpDir, `probe${ext}`);

try {
await fs.promises.writeFile(filePath, params.buffer);
return await probeMediaDurationMs(filePath);
} finally {
await fs.promises.rm(tmpDir, { recursive: true, force: true }).catch(() => undefined);
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probeBufferDurationMs can throw, aborting the entire media send

Unlike probeMediaDurationMs which wraps all operations in try/catch and returns undefined on failure, probeBufferDurationMs has no error handling around writeFile. If the temp file write fails (e.g. disk full, permissions issue), the exception propagates all the way up through sendMediaFeishu, and no message is delivered at all — even though duration metadata is intended to be best-effort.

This is inconsistent with how transcodeAudioToFeishuOpus (returns null on failure) and probeMediaDurationMs (returns undefined on failure) behave. The call site in sendMediaFeishu doesn't add a .catch() either.

Suggested change
async function probeBufferDurationMs(params: {
buffer: Buffer;
fileName: string;
}): Promise<number | undefined> {
const ext = path.extname(params.fileName).toLowerCase() || ".bin";
const tmpDir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "openclaw-feishu-probe-"));
const filePath = path.join(tmpDir, `probe${ext}`);
try {
await fs.promises.writeFile(filePath, params.buffer);
return await probeMediaDurationMs(filePath);
} finally {
await fs.promises.rm(tmpDir, { recursive: true, force: true }).catch(() => undefined);
}
}
async function probeBufferDurationMs(params: {
buffer: Buffer;
fileName: string;
}): Promise<number | undefined> {
const ext = path.extname(params.fileName).toLowerCase() || ".bin";
const tmpDir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "openclaw-feishu-probe-"));
const filePath = path.join(tmpDir, `probe${ext}`);
try {
await fs.promises.writeFile(filePath, params.buffer);
return await probeMediaDurationMs(filePath);
} catch {
return undefined;
} finally {
await fs.promises.rm(tmpDir, { recursive: true, force: true }).catch(() => undefined);
}
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/media.ts
Line: 459-473

Comment:
**`probeBufferDurationMs` can throw, aborting the entire media send**

Unlike `probeMediaDurationMs` which wraps all operations in `try/catch` and returns `undefined` on failure, `probeBufferDurationMs` has no error handling around `writeFile`. If the temp file write fails (e.g. disk full, permissions issue), the exception propagates all the way up through `sendMediaFeishu`, and no message is delivered at all — even though duration metadata is intended to be best-effort.

This is inconsistent with how `transcodeAudioToFeishuOpus` (returns `null` on failure) and `probeMediaDurationMs` (returns `undefined` on failure) behave. The call site in `sendMediaFeishu` doesn't add a `.catch()` either.

```suggestion
async function probeBufferDurationMs(params: {
  buffer: Buffer;
  fileName: string;
}): Promise<number | undefined> {
  const ext = path.extname(params.fileName).toLowerCase() || ".bin";
  const tmpDir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "openclaw-feishu-probe-"));
  const filePath = path.join(tmpDir, `probe${ext}`);

  try {
    await fs.promises.writeFile(filePath, params.buffer);
    return await probeMediaDurationMs(filePath);
  } catch {
    return undefined;
  } finally {
    await fs.promises.rm(tmpDir, { recursive: true, force: true }).catch(() => undefined);
  }
}
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +431 to +435
} catch {
return null;
} finally {
await fs.promises.rm(tmpDir, { recursive: true, force: true }).catch(() => undefined);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transcode failures are silently swallowed with no log

When ffmpeg fails (binary not installed, malformed input, out of disk space, etc.) the catch block returns null and the code falls back to uploading the original file as type "stream" with msg_type: "file". From the user's perspective the voice message silently becomes a file download, and from an operator's perspective there is no log entry to diagnose why audio delivery degraded.

Consider at minimum emitting a warning through the runtime logger so that transcode failures are observable in production:

  } catch (err) {
    // Returning null here causes sendMediaFeishu to fall back to the original file as "stream"
    // Log the failure so it is visible in production diagnostics
    params.runtime?.error?.(`feishu: audio transcode failed for "${params.fileName}": ${String(err)}`);
    return null;
  }

(A runtime reference would need to be threaded through, or at least a console.warn could serve as a minimal indicator.)

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/feishu/src/media.ts
Line: 431-435

Comment:
**Transcode failures are silently swallowed with no log**

When `ffmpeg` fails (binary not installed, malformed input, out of disk space, etc.) the `catch` block returns `null` and the code falls back to uploading the original file as type `"stream"` with `msg_type: "file"`. From the user's perspective the voice message silently becomes a file download, and from an operator's perspective there is no log entry to diagnose why audio delivery degraded.

Consider at minimum emitting a warning through the runtime logger so that transcode failures are observable in production:

```typescript
  } catch (err) {
    // Returning null here causes sendMediaFeishu to fall back to the original file as "stream"
    // Log the failure so it is visible in production diagnostics
    params.runtime?.error?.(`feishu: audio transcode failed for "${params.fileName}": ${String(err)}`);
    return null;
  }
```

(A runtime reference would need to be threaded through, or at least a `console.warn` could serve as a minimal indicator.)

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +40 to +46
function stripInternalTtsTags(text: string): string {
return text
.replace(/\[\[tts:text\]\][\s\S]*?\[\[\/tts:text\]\]/gi, "")
.replace(/\[\[tts:[^\]]+\]\]/gi, "")
.replace(/\n{3,}/g, "\n\n")
.trim();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate stripInternalTtsTags implementation

The exact same function body is copy-pasted into extensions/feishu/src/reply-dispatcher.ts (lines 24–30). If the tag format ever changes, both sites need to be updated in sync.

Consider extracting this to a shared utility in openclaw/plugin-sdk or a common infra helper so there is a single source of truth.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/outbound/payloads.ts
Line: 40-46

Comment:
**Duplicate `stripInternalTtsTags` implementation**

The exact same function body is copy-pasted into `extensions/feishu/src/reply-dispatcher.ts` (lines 24–30). If the tag format ever changes, both sites need to be updated in sync.

Consider extracting this to a shared utility in `openclaw/plugin-sdk` or a common infra helper so there is a single source of truth.

How can I resolve this? If you propose a fix, please make it concise.

oolong-tea-2026 added a commit to oolong-tea-2026/openclaw that referenced this pull request Mar 24, 2026
… playable audio messages

The Feishu outbound media router only recognized .opus and .ogg as audio,
sending common formats like mp3/wav/m4a/flac/aac as file attachments
instead of inline playable audio bubbles.

Changes:
- Use existing isAudioFileName() + mimeKind === 'audio' for audio detection,
  consistent with how image routing already uses mimeKind === 'image'
- Expand detectFileType() to map common audio extensions to 'opus'
- Add audio duration parsing via music-metadata (already in monorepo for
  extensions/matrix) with graceful degradation
- Update existing test and add coverage for mp3/wav/m4a/flac/aac routing

Verified: Feishu im/v1/files API accepts mp3/wav/m4a/flac/aac with
file_type=opus and plays them correctly. This is consistent with the
community Feishu plugin @m1heng-clawd/feishu (2.3k+ stars) which maps
all common audio extensions to file_type=opus in production.

Related: openclaw#33060, openclaw#43388, openclaw#33736, openclaw#28269
oolong-tea-2026 added a commit to oolong-tea-2026/openclaw that referenced this pull request Mar 27, 2026
… playable audio messages

The Feishu outbound media router only recognized .opus and .ogg as audio,
sending common formats like mp3/wav/m4a/flac/aac as file attachments
instead of inline playable audio bubbles.

Changes:
- Use existing isAudioFileName() + mimeKind === 'audio' for audio detection,
  consistent with how image routing already uses mimeKind === 'image'
- Expand detectFileType() to map common audio extensions to 'opus'
- Add audio duration parsing via music-metadata (already in monorepo for
  extensions/matrix) with graceful degradation
- Update existing test and add coverage for mp3/wav/m4a/flac/aac routing

Verified: Feishu im/v1/files API accepts mp3/wav/m4a/flac/aac with
file_type=opus and plays them correctly. This is consistent with the
community Feishu plugin @m1heng-clawd/feishu (2.3k+ stars) which maps
all common audio extensions to file_type=opus in production.

Related: openclaw#33060, openclaw#43388, openclaw#33736, openclaw#28269
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: feishu Channel integration: feishu size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant