fix(mistral): preserve diarization segments in transcription response by Chesars · Pull Request #23925 · BerriAI/litellm

Chesars · 2026-03-18T02:05:17Z

Relevant issues

Pre-Submission checklist

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

When Mistral's Voxtral model is called with diarize=true, the API returns segments (each with speaker_id and timestamps) and a language field. Both were being dropped in transform_audio_transcription_response, which extracted only text.

Now segments and language are preserved on the TranscriptionResponse object, matching the pattern used by other providers like Deepgram.
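A minimal sketch of the idea, not the actual diff: the class below is a hypothetical stand-in for TranscriptionResponse that only mimics dictionary-style assignment, and the key-presence check, function name, and sample payload values are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class SketchTranscriptionResponse:
    """Hypothetical stand-in for TranscriptionResponse (not the real litellm class)."""

    def __init__(self, text: Optional[str] = None) -> None:
        self.text = text

    def __setitem__(self, key: str, value: Any) -> None:
        # Dictionary-style assignment stores the value as an attribute.
        setattr(self, key, value)

    def __getitem__(self, key: str) -> Any:
        return getattr(self, key)


def transform_response_sketch(response_json: Dict[str, Any]) -> SketchTranscriptionResponse:
    """Build a response from raw JSON, keeping diarization fields when present."""
    response = SketchTranscriptionResponse(text=response_json.get("text"))
    # Copy the optional fields only if the API returned them
    # (a key-presence check is assumed here, so an explicit None is kept).
    for key in ("segments", "language"):
        if key in response_json:
            response[key] = response_json[key]
    return response


# Illustrative payload; the speaker IDs and the per-segment "text" key are made up.
raw = {
    "text": "Hello. Hi there.",
    "language": "en",
    "segments": [
        {"speaker_id": "speaker_0", "start": 0.0, "end": 0.8, "text": "Hello."},
        {"speaker_id": "speaker_1", "start": 0.9, "end": 1.7, "text": "Hi there."},
    ],
}
result = transform_response_sketch(raw)
print(result["language"], len(result["segments"]))
```

Because the copy is conditional, a non-diarized response (text only) produces an object without segments or language attributes, which keeps the change additive.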

Fixes BerriAI#23890: Mistral's Voxtral transcription with `diarize=true` returns `segments` (with `speaker_id` and timestamps) and `language`, but these fields were dropped when mapping the response to TranscriptionResponse.

greptile-apps · 2026-03-18T02:07:24Z

Greptile Summary

This PR fixes a data-loss bug where Mistral Voxtral's diarization fields (segments and language) were silently dropped during transform_audio_transcription_response, leaving users unable to access speaker-attributed transcript segments.

Changes:

litellm/llms/mistral/audio_transcription/transformation.py: After constructing the base TranscriptionResponse, the fix conditionally copies segments and language from the raw API response JSON to the top-level response object using dictionary-style attribute assignment (response["key"] = value).
tests/test_litellm/llms/mistral/audio_transcription/test_mistral_audio_transcription_transformation.py: Adds test_mistral_audio_transcription_response_transform_diarized, a fully mocked unit test verifying that both segments (with speaker_id, start, end) and language (including None) survive the transformation.

The implementation correctly follows the pattern already used by Deepgram (response["language"], response["words"]) and ElevenLabs (response["language"]), and no real network calls are made in the new test.
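The mocked-test approach described above can be sketched roughly as follows. This is not the test added in the PR: `transform_sketch` is a hypothetical stand-in for the real transformation, the test name and payload values are invented, and only the mocking pattern (a `MagicMock` whose `.json()` returns a canned payload) is the point.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def transform_sketch(raw_response):
    """Hypothetical stand-in for the transformation under test."""
    data = raw_response.json()
    out = SimpleNamespace(text=data["text"])
    for key in ("segments", "language"):
        if key in data:
            setattr(out, key, data[key])
    return out


def test_diarized_fields_survive_transform():
    # Mock the HTTP response object so no network call is made.
    raw_response = MagicMock()
    raw_response.json.return_value = {
        "text": "Hello.",
        "language": None,  # an explicit None should survive, not be dropped
        "segments": [{"speaker_id": "speaker_0", "start": 0.0, "end": 0.8}],
    }

    result = transform_sketch(raw_response)

    assert result.text == "Hello."
    assert result.language is None
    assert result.segments[0]["speaker_id"] == "speaker_0"
    assert result.segments[0]["start"] == 0.0
    assert result.segments[0]["end"] == 0.8


test_diarized_fields_survive_transform()
```

Asserting on `language is None` separately from the segment fields makes it visible if a future change switches from a key-presence check to a truthiness check.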

Confidence Score: 5/5

This PR is safe to merge — it is a small, additive, non-breaking change with appropriate mock test coverage.
The change is minimal (7 lines), follows an already-established pattern used by Deepgram and ElevenLabs, introduces no breaking changes (segments/language are only surfaced when present in the response), and is fully covered by a mock unit test. No hardcoded model flags, no FastAPI imports, no DB calls, and no backwards-incompatible behaviour changes.
No files require special attention.

Important Files Changed

Filename	Overview
litellm/llms/mistral/audio_transcription/transformation.py	Adds preservation of `segments` and `language` fields from Mistral's diarized transcription API response, following the same pattern used by Deepgram and ElevenLabs.
tests/test_litellm/llms/mistral/audio_transcription/test_mistral_audio_transcription_transformation.py	Adds a mock-only unit test `test_mistral_audio_transcription_response_transform_diarized` that verifies `segments` and `language` are preserved in the transformed response when diarization is active.

Sequence Diagram

sequenceDiagram
    participant Client
    participant LiteLLM
    participant MistralAPI

    Client->>LiteLLM: transcription(model="mistral/voxtral-mini-latest", diarize=True)
    LiteLLM->>MistralAPI: POST /v1/audio/transcriptions (form: diarize=true)
    MistralAPI-->>LiteLLM: { text, language, segments: [{speaker_id, start, end, ...}], usage }
    Note over LiteLLM: transform_audio_transcription_response()<br/>Extract text → TranscriptionResponse<br/>Preserve segments → response["segments"]<br/>Preserve language → response["language"]<br/>Store full JSON → _hidden_params
    LiteLLM-->>Client: TranscriptionResponse(text, segments, language)
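On the client side, the preserved segments could be turned into a speaker-attributed transcript along these lines. This is a sketch under assumptions: the helper name is hypothetical, segments are treated as plain dicts, and the per-segment `text` key is assumed (the diagram only names speaker_id, start, and end).

```python
from typing import Any, Dict, List


def format_speaker_turns(segments: List[Dict[str, Any]]) -> List[str]:
    """Render one line per segment as '[start-end] speaker: text'."""
    lines = []
    for seg in segments:
        speaker = seg.get("speaker_id", "unknown")
        text = seg.get("text", "")  # "text" key is an assumption
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}: {text}")
    return lines


# Illustrative segments shaped like the diarized response in the diagram.
sample_segments = [
    {"speaker_id": "speaker_0", "start": 0.0, "end": 1.5, "text": "Hello"},
    {"speaker_id": "speaker_1", "start": 1.6, "end": 2.4, "text": "Hi"},
]
for line in format_speaker_turns(sample_segments):
    print(line)
```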

Last reviewed commit: "fix(mistral): preser..."

vercel bot deployed to Preview March 18, 2026 02:07 View deployment

Chesars merged commit f059ba5 into BerriAI:litellm_oss_staging_03_17_2026 Mar 18, 2026
5 checks passed
