fix(mistral): preserve diarization segments in transcription response #23925
Merged
Chesars merged 1 commit into BerriAI:litellm_oss_staging_03_17_2026 on Mar 18, 2026
Fixes BerriAI#23890. Mistral's Voxtral transcription with `diarize=true` returns `segments` (with `speaker_id` and timestamps) and `language`, but these fields were dropped when mapping the response to `TranscriptionResponse`.
Greptile Summary
This PR fixes a data-loss bug where Mistral Voxtral's diarization fields (`segments` and `language`) were dropped from the transcription response.
The implementation follows the pattern already used by Deepgram and ElevenLabs.
Confidence Score: 5/5

| Filename | Overview |
|---|---|
| litellm/llms/mistral/audio_transcription/transformation.py | Adds preservation of segments and language fields from Mistral's diarized transcription API response, following the same pattern used by Deepgram and ElevenLabs. |
| tests/test_litellm/llms/mistral/audio_transcription/test_mistral_audio_transcription_transformation.py | Adds a mock-only unit test test_mistral_audio_transcription_response_transform_diarized that verifies segments and language are preserved in the transformed response when diarization is active. |
Sequence Diagram
sequenceDiagram
participant Client
participant LiteLLM
participant MistralAPI
Client->>LiteLLM: transcription(model="mistral/voxtral-mini-latest", diarize=True)
LiteLLM->>MistralAPI: POST /v1/audio/transcriptions (form: diarize=true)
MistralAPI-->>LiteLLM: { text, language, segments: [{speaker_id, start, end, ...}], usage }
Note over LiteLLM: transform_audio_transcription_response()<br/>Extract text → TranscriptionResponse<br/>Preserve segments → response["segments"]<br/>Preserve language → response["language"]<br/>Store full JSON → _hidden_params
LiteLLM-->>Client: TranscriptionResponse(text, segments, language)
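
Once `segments` reaches the client, a caller can turn it into speaker-labelled lines. The following is a minimal sketch, not code from this PR: `format_speaker_turns` is a hypothetical helper, and the `text` key on each segment and the `speaker_0` style of identifier are assumptions (the diagram only names `speaker_id`, `start`, and `end`).

```python
from typing import Any, Dict, List


def format_speaker_turns(segments: List[Dict[str, Any]]) -> List[str]:
    """Render diarized segments as '[start-end] speaker: text' lines.

    Hypothetical helper: assumes each segment is a dict that may carry
    'speaker_id', 'start', 'end', and (an assumption) 'text'.
    """
    lines: List[str] = []
    for seg in segments:
        # Fall back to neutral defaults when a field is absent.
        speaker = seg.get("speaker_id", "unknown")
        start = float(seg.get("start", 0.0))
        end = float(seg.get("end", 0.0))
        text = str(seg.get("text", "")).strip()
        lines.append(f"[{start:.2f}-{end:.2f}] {speaker}: {text}")
    return lines


# Illustrative data only; not taken from a real API response.
example_segments = [
    {"speaker_id": "speaker_0", "start": 0.0, "end": 1.5, "text": " Hello there. "},
    {"speaker_id": "speaker_1", "start": 1.5, "end": 3.0, "text": "Hi."},
]
turns = format_speaker_turns(example_segments)
```

Using `.get` with defaults keeps the helper tolerant of segments that lack a speaker label, which may happen when diarization is off.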
Last reviewed commit: "fix(mistral): preser..."
f059ba5 into BerriAI:litellm_oss_staging_03_17_2026
5 checks passed
Relevant issues
Fixes #23890
Pre-Submission checklist
- Tests added in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement, see details)
- `make test-unit`
- `@greptileai` review requested, and a Confidence Score of at least 4/5 received before requesting a maintainer review
Type
🐛 Bug Fix
Changes
When using Mistral's Voxtral model with `diarize=true`, the API returns `segments` (with `speaker_id`, timestamps) and `language` fields. These were being dropped in `transform_audio_transcription_response`, which only extracted `text`.
Now `segments` and `language` are preserved on the `TranscriptionResponse` object, matching the pattern used by other providers like Deepgram.
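
The change described above can be illustrated with a small, self-contained sketch. It is not the merged code: `TranscriptionResponseSketch` is a hypothetical dict-based stand-in for LiteLLM's `TranscriptionResponse`, and `transform_response_sketch` only mirrors the idea of copying optional fields through when the provider returns them.

```python
from typing import Any, Dict


class TranscriptionResponseSketch(dict):
    """Hypothetical stand-in for TranscriptionResponse (not the real class)."""

    def __init__(self, text: str) -> None:
        super().__init__(text=text)
        self.text = text
        self._hidden_params: Dict[str, Any] = {}


def transform_response_sketch(response_json: Dict[str, Any]) -> TranscriptionResponseSketch:
    """Map a provider JSON payload to the response object, keeping diarization data."""
    response = TranscriptionResponseSketch(text=response_json.get("text", ""))
    # Before the fix only 'text' was extracted; copy the optional fields too,
    # but only when the provider actually returned them.
    for key in ("segments", "language"):
        if key in response_json:
            response[key] = response_json[key]
    # Keep the full provider payload available for debugging.
    response._hidden_params = response_json
    return response


# Illustrative payload; field values are made up for the example.
raw = {
    "text": "Hello there. Hi.",
    "language": "en",
    "segments": [{"speaker_id": "speaker_0", "start": 0.0, "end": 1.5}],
}
diarized = transform_response_sketch(raw)
plain = transform_response_sketch({"text": "Hello"})
```

Copying the keys conditionally means a non-diarized response is unchanged: it carries `text` and nothing else.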