feat(fine-tuning): fix Azure OpenAI fine-tuning job creation #24687
Conversation
- Default trainingType=1 for Azure when omitted to avoid misleading "base model does not support fine-tuning" error
- Normalize Azure FineTuningJob responses (pending→queued, null fields→defaults) to match OpenAI schema
- Add pending status support to OpenAIFileObject for Azure file uploads
- Add test coverage for trainingType default and response normalization

Made-with: Cursor

- Move trainingType injection to AzureOpenAIFineTuningAPI handler
- Guard normalization with is_azure flag to only apply to Azure responses
- Override acreate_fine_tuning_job in Azure handler to use is_azure=True
- Update test to directly test _ensure_training_type method
- Add test for OpenAI unchanged behavior

Made-with: Cursor

- Call _ensure_training_type in acreate_fine_tuning_job async override

Made-with: Cursor

- Remove redundant _ensure_training_type call from acreate_fine_tuning_job
- Use explicit _AZURE_STATUS_MAP for status normalization

Made-with: Cursor

…on 4)
- Add cancel/retrieve overrides in AzureOpenAIFineTuningAPI to normalize responses
- Expand _AZURE_STATUS_MAP to handle all known Azure statuses
- Add "pending" to OpenAIFileObject.status allowed values
- Fix async test mock to return awaitable LiteLLMFineTuningJob
- Add test_openai_file_object_accepts_pending_status

Made-with: Cursor

…on 5)
- Remove unused FineTuningJob import from test
- Document "canceling" → "cancelled" mapping in _AZURE_STATUS_MAP

Made-with: Cursor
Greptile Summary

This PR fixes Azure OpenAI fine-tuning by addressing two runtime failures: automatically injecting `trainingType=1` when it is omitted, and normalizing Azure `FineTuningJob` responses to match OpenAI's schema.

Confidence Score: 4/5

Safe to merge; fixes a real Azure runtime failure with no behavioural regression for OpenAI users, but two P2 gaps remain worth addressing. All three findings are P2 (style / robustness / completeness). The unmapped-status concern is the most consequential — an unexpected Azure status would surface as a Pydantic ValidationError at runtime — but it only affects status values not yet returned by Azure and is easy to harden. The list-endpoint gap and single-use coroutine are minor. No P0/P1 issues found.

Key files: litellm/llms/openai/fine_tuning/handler.py (unmapped Azure status handling) and litellm/llms/azure/fine_tuning/handler.py (list endpoint normalisation gap)
| Filename | Overview |
|---|---|
| litellm/llms/azure/fine_tuning/handler.py | New Azure handler overriding create/cancel/retrieve to inject trainingType and normalize responses via is_azure=True; logic is correct but list_fine_tuning_jobs is not overridden so list responses skip normalization |
| litellm/llms/openai/fine_tuning/handler.py | Introduces _normalize_fine_tuning_job_dict and _litellm_fine_tuning_job_from_response helper functions; all existing call sites updated cleanly; OpenAI path unaffected (is_azure defaults to False) |
| litellm/types/llms/openai.py | Adds "pending" to OpenAIFileObject.status Literal; backwards-compatible addition with updated docstring |
| tests/batches_tests/test_fine_tuning_api.py | Adds test_azure_trainingtype_defaults_to_one (new mock test); modifies test_mock_azure_create_fine_tune_job_with_azure_specific_params to use LiteLLMFineTuningJob and a coroutine return value — assertions are preserved but single-use coroutine pattern is fragile |
| tests/test_litellm/types/llms/test_types_llms_openai.py | Adds three new unit tests for normalization and OpenAIFileObject; reformats some assertion lines for readability; no coverage weakening detected |
Sequence Diagram
```mermaid
sequenceDiagram
    participant Caller
    participant main as fine_tuning/main.py
    participant AzureHandler as AzureOpenAIFineTuningAPI
    participant SDK as AzureOpenAI SDK
    participant Normalize as _normalize_fine_tuning_job_dict
    Caller->>main: acreate_fine_tuning_job(model, training_file, ...)
    main->>main: _prepare_azure_extra_body() [adds trainingType if in kwargs]
    main->>AzureHandler: create_fine_tuning_job(_is_async=True, ...)
    AzureHandler->>AzureHandler: _ensure_training_type() [defaults trainingType=1 if absent]
    AzureHandler->>AzureHandler: get_openai_client() → AsyncAzureOpenAI
    AzureHandler->>SDK: fine_tuning.jobs.create(**data)
    SDK-->>AzureHandler: FineTuningJob (status="pending", org_id=null, result_files=null)
    AzureHandler->>Normalize: _litellm_fine_tuning_job_from_response(response, is_azure=True)
    Normalize->>Normalize: map status pending→queued, null→defaults
    Normalize-->>AzureHandler: LiteLLMFineTuningJob (status="queued", org_id="", result_files=[])
    AzureHandler-->>main: LiteLLMFineTuningJob
    main-->>Caller: LiteLLMFineTuningJob
```
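The normalization step at the end of the diagram can be sketched as a plain function. This is a minimal illustration, not the actual helper: the status map here carries only the mappings named in this PR ("pending"→"queued", "canceling"→"cancelled"), and the defaults mirror the org_id=""/result_files=[] behaviour shown in the diagram.

```python
# Minimal sketch of the Azure response normalization described above.
# Only the "pending" and "canceling" mappings are documented in this PR;
# the real _AZURE_STATUS_MAP covers more Azure statuses.
_AZURE_STATUS_MAP = {
    "pending": "queued",
    "canceling": "cancelled",
}

def normalize_fine_tuning_job_dict(job: dict, is_azure: bool = False) -> dict:
    if not is_azure:
        return job  # OpenAI responses already match the expected schema
    normalized = dict(job)
    status = normalized.get("status")
    if status in _AZURE_STATUS_MAP:
        normalized["status"] = _AZURE_STATUS_MAP[status]
    # Azure may return nulls where OpenAI's schema guarantees concrete values
    if normalized.get("organization_id") is None:
        normalized["organization_id"] = ""
    if normalized.get("result_files") is None:
        normalized["result_files"] = []
    return normalized

azure_job = {"status": "pending", "organization_id": None, "result_files": None}
print(normalize_fine_tuning_job_dict(azure_job, is_azure=True))
# → {'status': 'queued', 'organization_id': '', 'result_files': []}
```

With `is_azure=False` the input passes through untouched, which is why the shared OpenAI handler path is unaffected.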
Reviews (1): Last reviewed commit: "feat(fine-tuning): address greptile revi..."
```python
status = normalized.get("status")
if status in _AZURE_STATUS_MAP:
    normalized["status"] = _AZURE_STATUS_MAP[status]
```
Unmapped Azure statuses pass through unchanged
If Azure ever returns a status value not present in _AZURE_STATUS_MAP (e.g. a new Azure-specific state), _normalize_fine_tuning_job_dict leaves it unchanged. This value then gets passed directly into LiteLLMFineTuningJob(**...), which inherits OpenAI SDK's FineTuningJob with a strict Literal status field. Pydantic will raise a ValidationError for any unrecognised value.
Consider adding an explicit fallback so that an unknown Azure status is either preserved with a warning or mapped to the closest known value, rather than silently failing at object construction:
```python
status = normalized.get("status")
if status in _AZURE_STATUS_MAP:
    normalized["status"] = _AZURE_STATUS_MAP[status]
elif status is not None:
    verbose_logger.warning(
        "Azure fine-tuning: unknown status %r – passing through unchanged", status
    )
```

```python
async def acreate_fine_tuning_job(
    self,
    create_fine_tuning_job_data: dict,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.create(
        **create_fine_tuning_job_data
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)

async def acancel_fine_tuning_job(
    self,
    fine_tuning_job_id: str,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.cancel(
        fine_tuning_job_id=fine_tuning_job_id
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)

async def aretrieve_fine_tuning_job(
    self,
    fine_tuning_job_id: str,
    openai_client: Union[AsyncOpenAI, AsyncAzureOpenAI],
) -> LiteLLMFineTuningJob:
    response = await openai_client.fine_tuning.jobs.retrieve(
        fine_tuning_job_id=fine_tuning_job_id
    )
    return _litellm_fine_tuning_job_from_response(response, is_azure=True)
```
list_fine_tuning_jobs not overridden — Azure list responses skip normalisation
create_fine_tuning_job, cancel_fine_tuning_job, and retrieve_fine_tuning_job are all overridden here to call _litellm_fine_tuning_job_from_response(response, is_azure=True), so their responses are correctly normalised. However list_fine_tuning_jobs / alist_fine_tuning_jobs are not overridden and fall through to the parent OpenAIFineTuningAPI implementation, which returns the raw SDK response without applying the Azure status map.
Azure pagination responses contain FineTuningJob objects with the same "pending" / "notRunning" statuses, so callers consuming a list of Azure jobs will still receive un-normalised status values. This is a gap even if it is technically pre-existing behaviour.
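One way to close that gap, sketched with stand-in stubs: only the method name and the `is_azure=True` helper call echo the diff above; the page/client classes here are hypothetical stand-ins for the OpenAI SDK's cursor page, and the helper body is reduced to the status mapping alone.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List

# Stand-in for the real helper: only the pending→queued mapping is shown.
def _litellm_fine_tuning_job_from_response(job: dict, is_azure: bool = False) -> dict:
    if is_azure and job.get("status") == "pending":
        return {**job, "status": "queued"}
    return job

@dataclass
class FakePage:
    # Mimics the `.data` attribute of an SDK pagination object
    data: List[dict] = field(default_factory=list)

class FakeJobsAPI:
    async def list(self, **kwargs):
        return FakePage(data=[{"id": "ftjob-1", "status": "pending"}])

# Hypothetical override: normalize every job in the page before returning it
async def alist_fine_tuning_jobs(jobs_api) -> FakePage:
    page = await jobs_api.list()
    page.data = [
        _litellm_fine_tuning_job_from_response(job, is_azure=True)
        for job in page.data
    ]
    return page

page = asyncio.run(alist_fine_tuning_jobs(FakeJobsAPI()))
print(page.data[0]["status"])  # → queued
```

The real override would also need to preserve the page's cursor fields so pagination keeps working; that detail is omitted here.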
```diff
+async def mock_async_create(*args, **kwargs):
+    return mock_response
+
 with patch("litellm.llms.azure.fine_tuning.handler.AzureOpenAIFineTuningAPI.create_fine_tuning_job") as mock_create:
-    mock_create.return_value = mock_response
+    mock_create.return_value = mock_async_create()
```
Single-use coroutine created at mock setup time
mock_async_create() creates a coroutine object once when mock_create.return_value is assigned. A coroutine can only be awaited a single time. If the mock were ever called more than once (e.g. due to retry logic or test ordering), the second await would silently return None or raise a RuntimeError: cannot reuse already awaited coroutine.
Prefer AsyncMock (or a side_effect factory) so a fresh coroutine is created on every call:

```python
from unittest.mock import AsyncMock

mock_create = AsyncMock(return_value=mock_response)
with patch(
    "litellm.llms.azure.fine_tuning.handler.AzureOpenAIFineTuningAPI.create_fine_tuning_job",
    mock_create,
):
    ...
```
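The failure mode is easy to reproduce in isolation; this small self-contained demo (plain MagicMock vs AsyncMock, no litellm code involved) shows the second await of a pre-created coroutine failing:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

async def make_value():
    return "job"

# Fragile pattern: the coroutine is created once, at mock setup time
single = MagicMock()
single.return_value = make_value()
print(asyncio.run(single()))   # → job
try:
    asyncio.run(single())      # same coroutine object, already awaited
except RuntimeError as e:
    print(type(e).__name__)    # → RuntimeError

# Robust pattern: AsyncMock produces a fresh awaitable on every call
robust = AsyncMock(return_value="job")
print(asyncio.run(robust()))   # → job
print(asyncio.run(robust()))   # → job
```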
Summary
Fixes Azure OpenAI fine-tuning job creation by addressing two issues:
- Default `trainingType=1` for Azure – Azure requires this parameter; omitting it yields a misleading "The specified base model does not support fine-tuning" error
- Normalize Azure responses – Azure returns `status: "pending"`, `organization_id: null`, `result_files: null`, which don't match OpenAI's schema and caused Pydantic validation errors

Changes
- `litellm/fine_tuning/main.py` – Auto-inject `trainingType=1` for Azure when not provided
- `litellm/llms/openai/fine_tuning/handler.py` – Normalize Azure responses (pending→queued, null→defaults) before building `LiteLLMFineTuningJob`
- `litellm/types/llms/openai.py` – Add `"pending"` to `OpenAIFileObject.status` allowed values

Test plan
- `test_azure_trainingtype_defaults_to_one` – Verifies `trainingType=1` is injected
- `test_normalize_fine_tuning_job_dict_maps_azure_pending` – Verifies response normalization
- `test_openai_file_object_accepts_pending_status` – Verifies pending file status accepted