fix(stt): normalize language code to ISO 639-1 for OpenAI Whisper #1939

Merged
piorpua merged 1 commit into main from fix/sentry-ELECTRON-G3
Mar 30, 2026

@kaizhou-lab
Collaborator

Summary

  • Sentry Issue: ELECTRON-G3 (3 events, 3 users across Chile/Colombia/Hong Kong)
  • OpenAI Whisper API requires ISO 639-1 language codes (e.g. en) but the languageHint and config values may arrive in BCP 47 format (e.g. en-us), causing 422 Unprocessable Entity responses
  • Fix: extract the primary language subtag (language.split('-')[0].toLowerCase()) before appending to the form data
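
The normalization described above can be sketched as a small standalone function. This is a minimal illustration, not the code from the PR: the helper name `toWhisperLanguage` is hypothetical, and only the expression `language.split('-')[0].toLowerCase()` comes from the summary.

```typescript
// Hypothetical helper (the name is an assumption, not from the PR).
// Reduces a BCP 47 tag such as "en-us" or "zh-CN" to its primary language subtag.
function toWhisperLanguage(language: string | undefined): string | undefined {
  // Mirror the `if (language)` guard: empty or missing values yield no language field.
  if (!language) {
    return undefined;
  }
  // Expression quoted in the PR summary: keep the text before the first hyphen, lowercased.
  return language.split('-')[0].toLowerCase();
}

console.log(toWhisperLanguage('en-us')); // "en"
console.log(toWhisperLanguage('zh-CN')); // "zh"
console.log(toWhisperLanguage('en'));    // "en"
```

One caveat of this approach: a BCP 47 tag whose primary subtag has three letters (for example `yue-HK`) would be reduced to `yue`, which is not an ISO 639-1 code, so the sketch only covers tags whose primary subtag is already a two-letter code.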

Changes

  • src/process/bridge/services/SpeechToTextService.ts: normalize language code before formData.append('language', ...)
  • tests/unit/SpeechToTextService.test.ts: new test file covering BCP 47 → ISO 639-1 conversion, plain ISO 639-1 passthrough, no-language omission, and error response handling

Verification

  • Unit tests: 5/5 pass
  • Type check: clean
  • Lint: 0 errors
  • Process: main (unit tests sufficient)

Test plan

  • Unit tests cover BCP 47 language hint conversion (e.g. en-us → en)
  • Unit tests cover config language conversion (e.g. zh-CN → zh)
  • Unit tests verify plain ISO 639-1 codes pass through unchanged
  • Unit tests verify language field omitted when no hint/config
  • Unit tests verify error handling on non-ok response
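
The conversion, passthrough, and omission cases in the test plan could be exercised roughly as follows. This is a sketch under stated assumptions, not the contents of tests/unit/SpeechToTextService.test.ts: `FormLike`, `appendLanguage`, and `recordLanguage` are hypothetical names, and a plain recording object stands in for the real form data so the example runs without the service or a test framework.

```typescript
// Minimal stand-in for the part of the form data API the fix touches (assumption).
interface FormLike {
  append(name: string, value: string): void;
}

// Hypothetical wrapper around the normalization; only the split/lowercase
// expression and the `if (language)` guard are taken from the PR discussion.
function appendLanguage(form: FormLike, language?: string): void {
  if (language) {
    form.append('language', language.split('-')[0].toLowerCase());
  }
}

// Returns the value appended under "language", or undefined if nothing was appended.
function recordLanguage(language?: string): string | undefined {
  let recorded: string | undefined;
  const form: FormLike = {
    append: (name, value) => {
      if (name === 'language') recorded = value;
    },
  };
  appendLanguage(form, language);
  return recorded;
}

// [input, expected form value] pairs mirroring the test plan bullets.
const cases: Array<[string | undefined, string | undefined]> = [
  ['en-us', 'en'],        // BCP 47 language hint
  ['zh-CN', 'zh'],        // BCP 47 config value
  ['en', 'en'],           // plain ISO 639-1 passes through
  [undefined, undefined], // no hint or config: field omitted
];

for (const [input, expected] of cases) {
  const actual = recordLanguage(input);
  if (actual !== expected) {
    throw new Error(`language ${String(input)}: expected ${String(expected)}, got ${String(actual)}`);
  }
}
```

The error-response case is left out here because it depends on how the service issues and handles the HTTP request, which this page does not show.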

…request (ELECTRON-G3)

OpenAI Whisper API requires ISO 639-1 language codes (e.g. "en") but
the languageHint and config values may arrive in BCP 47 format
(e.g. "en-us"), causing 422 Unprocessable Entity responses. Extract
the primary subtag before appending to the form data.
@kaizhou-lab kaizhou-lab marked this pull request as ready for review March 30, 2026 09:58
@piorpua piorpua added the bot:reviewing Review in progress (mutex) label Mar 30, 2026
@piorpua
Contributor

piorpua commented Mar 30, 2026

Code Review: fix(stt): normalize language code to ISO 639-1 for OpenAI Whisper (#1939)

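Change overview

This PR fixes Sentry issue ELECTRON-G3: when SpeechToTextService sent the language parameter to the OpenAI Whisper API, it passed the value through in BCP 47 format (e.g. en-us), whereas Whisper accepts only ISO 639-1 format (e.g. en), which caused 422 errors for 3 users. The change touches only one line of logic in SpeechToTextService.ts plus 5 new unit tests.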


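Approach assessment

Conclusion: ✅ The approach is sound

language.split('-')[0].toLowerCase() extracts the primary language subtag, which directly fixes the BCP 47 → ISO 639-1 conversion; the change is minimal and consistent with the Whisper API documentation. The existing if (language) guard already covers empty values, so no extra handling is needed. The tests cover four core scenarios: BCP 47 conversion, plain ISO 639-1 passthrough, omission of the language field when no language is set, and error responses.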


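Issue list

✅ No significant issues found. Code quality is good; approval and merge are recommended.

The no-extraneous-class warning reported by bunx oxlint (SpeechToTextService class, line 143) is a pre-existing issue unrelated to this change and does not block this PR.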


Summary

No issues.

Conclusion

Approve and merge: the fix is precise, the tests are complete, and there are no blocking issues.


This report was generated by the local pr-review skill, with full project context and no truncation limit.

CONCLUSION: APPROVED
IS_CRITICAL_PATH: false
PR_NUMBER: 1939

@piorpua
Contributor

piorpua commented Mar 30, 2026

✅ Automatically reviewed; no blocking issues found. Triggering auto-merge.

@piorpua piorpua merged commit 868bdaf into main Mar 30, 2026
17 checks passed
@piorpua piorpua deleted the fix/sentry-ELECTRON-G3 branch March 30, 2026 11:37
@piorpua piorpua added bot:done Auto-merged by bot and removed bot:reviewing Review in progress (mutex) labels Mar 30, 2026
