5 files changed: +20 −4 lines changed

deployment/kustomizations/base
@@ -81,18 +81,21 @@ data:
         max_completion_tokens: 4096
         n: 1
         seed: 0
+        reasoning_effort: minimal
     - model: gpt-5-mini-2025-08-07
       context_window_size: 380000
       kwargs:
         max_completion_tokens: 4096
         n: 1
         seed: 0
+        reasoning_effort: minimal
     - model: gpt-5-2025-08-07
       context_window_size: 380000
       kwargs:
         max_completion_tokens: 4096
         n: 1
         seed: 0
+        reasoning_effort: minimal
 ---
 type: embedder
 provider: litellm_embedder
@@ -30,19 +30,21 @@ models:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-mini-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
-
+      reasoning_effort: minimal
 ---
 type: embedder
 provider: litellm_embedder
@@ -7,7 +7,8 @@
     _convert_message_to_openai_format,
 )
 from haystack.dataclasses import ChatMessage, StreamingChunk
-from litellm import Router, acompletion
+from litellm import acompletion
+from litellm.router import Router
 
 from src.core.provider import LLMProvider
 from src.providers.llm import (
@@ -99,11 +100,16 @@ async def _run(
             **(generation_kwargs or {}),
         }
 
+        allowed_params = (
+            ["reasoning_effort"] if self._model.startswith("gpt-5") else None
+        )
+
         if self._has_fallbacks:
             completion = await self._router.acompletion(
                 model=self._model,
                 messages=openai_formatted_messages,
                 stream=streaming_callback is not None,
+                allowed_openai_params=allowed_params,
                 mock_testing_fallbacks=self._enable_fallback_testing,
                 **generation_kwargs,
             )
@@ -116,6 +122,7 @@ async def _run(
                 timeout=self._timeout,
                 messages=openai_formatted_messages,
                 stream=streaming_callback is not None,
+                allowed_openai_params=allowed_params,
                 **generation_kwargs,
             )
@@ -30,19 +30,21 @@ models:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-mini-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
-
+      reasoning_effort: minimal
 ---
 type: embedder
 provider: litellm_embedder
@@ -30,19 +30,21 @@ models:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-mini-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
+      reasoning_effort: minimal
   - model: gpt-5-2025-08-07
     context_window_size: 380000
     kwargs:
       max_completion_tokens: 4096
       n: 1
       seed: 0
-
+      reasoning_effort: minimal
 ---
 type: embedder
 provider: litellm_embedder