
feat(fine-tuning): fix Azure OpenAI fine-tuning job creation #24687

Merged
yuneng-berri merged 6 commits into BerriAI:main from Sameerlite:litellm_litellm_azure-finetuning-fixes
Mar 27, 2026

Conversation

@Sameerlite (Collaborator)

Summary

Fixes Azure OpenAI fine-tuning job creation by addressing two issues:

  1. Default trainingType=1 for Azure – Azure requires this parameter; omitting it yields a misleading "The specified base model does not support fine-tuning" error
  2. Normalize Azure FineTuningJob responses – Azure returns status: "pending", organization_id: null, result_files: null, which don't match OpenAI's schema and cause Pydantic validation errors

Changes

  • litellm/fine_tuning/main.py – Auto-inject trainingType=1 for Azure when not provided
  • litellm/llms/openai/fine_tuning/handler.py – Normalize Azure responses (pending→queued, null→defaults) before building LiteLLMFineTuningJob
  • litellm/types/llms/openai.py – Add "pending" to OpenAIFileObject.status allowed values
  • Tests for trainingType default and response normalization
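The normalization described above can be sketched as follows. This is a minimal stand-in, not the merged code: the map entries follow the PR description and commit log ("pending"→"queued", "canceling"→"cancelled"), while the exact null-field handling is an assumption.

```python
# Sketch of the Azure -> OpenAI response normalization this PR describes.
# _AZURE_STATUS_MAP entries follow the PR text; the defaults applied to
# null fields ("" and []) are assumptions based on the summary.
_AZURE_STATUS_MAP = {
    "pending": "queued",       # Azure reports freshly created jobs as "pending"
    "canceling": "cancelled",  # mapping documented in the commit log
}

def normalize_fine_tuning_job_dict(job: dict) -> dict:
    normalized = dict(job)
    status = normalized.get("status")
    if status in _AZURE_STATUS_MAP:
        normalized["status"] = _AZURE_STATUS_MAP[status]
    # Azure returns null for fields OpenAI's schema expects to be set.
    if normalized.get("organization_id") is None:
        normalized["organization_id"] = ""
    if normalized.get("result_files") is None:
        normalized["result_files"] = []
    return normalized
```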

Test plan

  • test_azure_trainingtype_defaults_to_one – Verifies trainingType=1 is injected
  • test_normalize_fine_tuning_job_dict_maps_azure_pending – Verifies response normalization
  • test_openai_file_object_accepts_pending_status – Verifies pending file status accepted
  • Manual E2E test with real Azure endpoint (file upload + job creation succeeded)

- Default trainingType=1 for Azure when omitted to avoid misleading "base model does not support fine-tuning" error
- Normalize Azure FineTuningJob responses (pending→queued, null fields→defaults) to match OpenAI schema
- Add pending status support to OpenAIFileObject for Azure file uploads
- Add test coverage for trainingType default and response normalization

Made-with: Cursor
- Move trainingType injection to AzureOpenAIFineTuningAPI handler
- Guard normalization with is_azure flag to only apply to Azure responses
- Override acreate_fine_tuning_job in Azure handler to use is_azure=True
- Update test to directly test _ensure_training_type method
- Add test for OpenAI unchanged behavior

Made-with: Cursor
- Call _ensure_training_type in acreate_fine_tuning_job async override

Made-with: Cursor
- Remove redundant _ensure_training_type call from acreate_fine_tuning_job
- Use explicit _AZURE_STATUS_MAP for status normalization

Made-with: Cursor
…on 4)

- Add cancel/retrieve overrides in AzureOpenAIFineTuningAPI to normalize responses
- Expand _AZURE_STATUS_MAP to handle all known Azure statuses
- Add "pending" to OpenAIFileObject.status allowed values
- Fix async test mock to return awaitable LiteLLMFineTuningJob
- Add test_openai_file_object_accepts_pending_status

Made-with: Cursor
…on 5)

- Remove unused FineTuningJob import from test
- Document "canceling" → "cancelled" mapping in _AZURE_STATUS_MAP

Made-with: Cursor
@vercel

vercel bot commented Mar 27, 2026

The latest updates on your projects:

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| litellm | Ready | Preview, Comment | Mar 27, 2026 2:37pm |

@codspeed-hq (Contributor)

codspeed-hq bot commented Mar 27, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing Sameerlite:litellm_litellm_azure-finetuning-fixes (e635cee) with main (88ed4f9)


@greptile-apps (Contributor)

greptile-apps bot commented Mar 27, 2026

Greptile Summary

This PR fixes Azure OpenAI fine-tuning by addressing two runtime failures: automatically injecting trainingType=1 into extra_body when it is absent (Azure requires this parameter), and normalising Azure-specific response fields (pending status, null organization_id / result_files) to conform to the OpenAI schema before constructing a LiteLLMFineTuningJob.

Key changes:
- AzureOpenAIFineTuningAPI now overrides create_fine_tuning_job, cancel_fine_tuning_job, and retrieve_fine_tuning_job to call _litellm_fine_tuning_job_from_response(response, is_azure=True), applying the status map and null-field defaults.
- _normalize_fine_tuning_job_dict and _litellm_fine_tuning_job_from_response are extracted as module-level helpers in the OpenAI handler and reused across both providers.
- "pending" is added to the OpenAIFileObject.status Literal to match what Azure returns immediately after file upload.
- The existing test_mock_azure_create_fine_tune_job_with_azure_specific_params test is updated to return a LiteLLMFineTuningJob via a coroutine; assertions are unchanged and coverage is not weakened.
- list_fine_tuning_jobs / alist_fine_tuning_jobs are not overridden in the Azure handler, so Azure list-endpoint responses still skip normalisation — a pre-existing gap worth addressing in a follow-up.

Confidence Score: 4/5

Safe to merge; fixes a real Azure runtime failure with no behavioural regression for OpenAI users, but two P2 gaps remain worth addressing.

All three findings are P2 (style / robustness / completeness). The unmapped-status concern is the most consequential — an unexpected Azure status would surface as a Pydantic ValidationError at runtime — but it only affects status values not yet returned by Azure and is easy to harden. The list-endpoint gap and single-use coroutine are minor. No P0/P1 issues found.

Files of concern: litellm/llms/openai/fine_tuning/handler.py (unmapped Azure status handling) and litellm/llms/azure/fine_tuning/handler.py (list endpoint normalisation gap)

Important Files Changed

| Filename | Overview |
|---|---|
| litellm/llms/azure/fine_tuning/handler.py | New Azure handler overriding create/cancel/retrieve to inject trainingType and normalize responses via is_azure=True; logic is correct but list_fine_tuning_jobs is not overridden so list responses skip normalization |
| litellm/llms/openai/fine_tuning/handler.py | Introduces _normalize_fine_tuning_job_dict and _litellm_fine_tuning_job_from_response helper functions; all existing call sites updated cleanly; OpenAI path unaffected (is_azure defaults to False) |
| litellm/types/llms/openai.py | Adds "pending" to OpenAIFileObject.status Literal; backwards-compatible addition with updated docstring |
| tests/batches_tests/test_fine_tuning_api.py | Adds test_azure_trainingtype_defaults_to_one (new mock test); modifies test_mock_azure_create_fine_tune_job_with_azure_specific_params to use LiteLLMFineTuningJob and a coroutine return value — assertions are preserved but single-use coroutine pattern is fragile |
| tests/test_litellm/types/llms/test_types_llms_openai.py | Adds three new unit tests for normalization and OpenAIFileObject; reformats some assertion lines for readability; no coverage weakening detected |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant main as fine_tuning/main.py
    participant AzureHandler as AzureOpenAIFineTuningAPI
    participant SDK as AzureOpenAI SDK
    participant Normalize as _normalize_fine_tuning_job_dict

    Caller->>main: acreate_fine_tuning_job(model, training_file, ...)
    main->>main: _prepare_azure_extra_body() [adds trainingType if in kwargs]
    main->>AzureHandler: create_fine_tuning_job(_is_async=True, ...)
    AzureHandler->>AzureHandler: _ensure_training_type() [defaults trainingType=1 if absent]
    AzureHandler->>AzureHandler: get_openai_client() → AsyncAzureOpenAI
    AzureHandler->>SDK: fine_tuning.jobs.create(**data)
    SDK-->>AzureHandler: FineTuningJob (status="pending", org_id=null, result_files=null)
    AzureHandler->>Normalize: _litellm_fine_tuning_job_from_response(response, is_azure=True)
    Normalize->>Normalize: map status pending→queued, null→defaults
    Normalize-->>AzureHandler: LiteLLMFineTuningJob (status="queued", org_id="", result_files=[])
    AzureHandler-->>main: LiteLLMFineTuningJob
    main-->>Caller: LiteLLMFineTuningJob
```
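The _ensure_training_type step shown in the diagram above can be sketched like this. The function name mirrors the PR; the exact payload layout (trainingType living inside extra_body) is an assumption based on the summary.

```python
# Sketch of the _ensure_training_type defaulting step: inject trainingType=1
# into extra_body when the caller did not provide it. Payload layout is an
# assumption; the merged handler may structure the request data differently.
def ensure_training_type(create_fine_tuning_job_data: dict) -> dict:
    data = dict(create_fine_tuning_job_data)
    extra_body = dict(data.get("extra_body") or {})
    # Azure rejects jobs without trainingType with a misleading
    # "base model does not support fine-tuning" error.
    extra_body.setdefault("trainingType", 1)
    data["extra_body"] = extra_body
    return data
```

A caller-supplied value is never overwritten, so Azure-specific overrides keep working.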

Reviews (1): Last reviewed commit: "feat(fine-tuning): address greptile revi..."

Comment on lines +44 to +46
```python
status = normalized.get("status")
if status in _AZURE_STATUS_MAP:
    normalized["status"] = _AZURE_STATUS_MAP[status]
```

P2 Unmapped Azure statuses pass through unchanged

If Azure ever returns a status value not present in _AZURE_STATUS_MAP (e.g. a new Azure-specific state), _normalize_fine_tuning_job_dict leaves it unchanged. This value then gets passed directly into LiteLLMFineTuningJob(**...), which inherits OpenAI SDK's FineTuningJob with a strict Literal status field. Pydantic will raise a ValidationError for any unrecognised value.

Consider adding an explicit fallback so that an unknown Azure status is either preserved with a warning or mapped to the closest known value, rather than silently failing at object construction:

```python
status = normalized.get("status")
if status in _AZURE_STATUS_MAP:
    normalized["status"] = _AZURE_STATUS_MAP[status]
elif status is not None:
    verbose_logger.warning(
        "Azure fine-tuning: unknown status %r – passing through unchanged", status
    )
```

Comment on lines +35 to +63
```python
async def acreate_fine_tuning_job(
    self,
    create_fine_tuning_job_data: dict,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.create(
        **create_fine_tuning_job_data
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)

async def acancel_fine_tuning_job(
    self,
    fine_tuning_job_id: str,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.cancel(
        fine_tuning_job_id=fine_tuning_job_id
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)

async def aretrieve_fine_tuning_job(
    self,
    fine_tuning_job_id: str,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.retrieve(
        fine_tuning_job_id=fine_tuning_job_id
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)
```

P2 list_fine_tuning_jobs not overridden — Azure list responses skip normalisation

create_fine_tuning_job, cancel_fine_tuning_job, and retrieve_fine_tuning_job are all overridden here to call _litellm_fine_tuning_job_from_response(response, is_azure=True), so their responses are correctly normalised. However list_fine_tuning_jobs / alist_fine_tuning_jobs are not overridden and fall through to the parent OpenAIFineTuningAPI implementation, which returns the raw SDK response without applying the Azure status map.

Azure pagination responses contain FineTuningJob objects with the same "pending" / "notRunning" statuses, so callers consuming a list of Azure jobs will still receive un-normalised status values. This is a gap even if it is technically pre-existing behaviour.
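One way to close this gap is to normalize each job in the paginated response with the same status map. The sketch below is hypothetical: the page/job shapes are stand-ins for the SDK types, and the override name follows the pattern of the other async methods rather than the merged code.

```python
import asyncio
from dataclasses import dataclass, field

# Hypothetical sketch of normalizing Azure list responses, reusing the
# status map described in this PR. FakeJob/FakePage are stand-ins for the
# real SDK pagination types.
_AZURE_STATUS_MAP = {"pending": "queued", "canceling": "cancelled"}

@dataclass
class FakeJob:
    status: str

@dataclass
class FakePage:
    data: list = field(default_factory=list)

async def alist_jobs_normalized(list_coro) -> FakePage:
    page = await list_coro
    for job in page.data:
        # Unknown statuses pass through unchanged, mirroring the create path.
        job.status = _AZURE_STATUS_MAP.get(job.status, job.status)
    return page

async def _fake_azure_list():
    return FakePage(data=[FakeJob("pending"), FakeJob("running")])

page = asyncio.run(alist_jobs_normalized(_fake_azure_list()))
```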

Comment on lines +636 to +640
```python
async def mock_async_create(*args, **kwargs):
    return mock_response

with patch("litellm.llms.azure.fine_tuning.handler.AzureOpenAIFineTuningAPI.create_fine_tuning_job") as mock_create:
    mock_create.return_value = mock_response
    mock_create.return_value = mock_async_create()
```

P2 Single-use coroutine created at mock setup time

mock_async_create() creates a coroutine object once when mock_create.return_value is assigned. A coroutine can only be awaited a single time. If the mock were ever called more than once (e.g. due to retry logic or test ordering), the second await would silently return None or raise a RuntimeError: cannot reuse already awaited coroutine.

Prefer AsyncMock (or a side_effect factory) so a fresh coroutine is created on every call:

```python
from unittest.mock import AsyncMock

mock_create = AsyncMock(return_value=mock_response)
with patch(
    "litellm.llms.azure.fine_tuning.handler.AzureOpenAIFineTuningAPI.create_fine_tuning_job",
    mock_create,
):
    ...
```


@yuneng-berri yuneng-berri self-requested a review March 27, 2026 16:56
@yuneng-berri yuneng-berri enabled auto-merge March 27, 2026 16:57
@yuneng-berri yuneng-berri merged commit f3fe6d1 into BerriAI:main Mar 27, 2026
40 of 41 checks passed
