Commit e0ce916

[Docs] Update serving sphinx (#9169)
1 parent 8d295f2 commit e0ce916

2 files changed: 46 additions and 25 deletions


docs/api/mlrun.serving/index.rst

Lines changed: 20 additions & 0 deletions
@@ -7,10 +7,30 @@ mlrun.serving
     :members: to, error_handler, set_flow, cycle_to
     :private-members:
 
+.. autoclass:: mlrun.serving.states.RootFlowStep
+    :members: add_step, add_shared_model, configure_shared_pool_resource
+    :private-members:
+
+.. autoclass:: mlrun.serving.states.ModelRunnerStep
+    :members: add_model, add_shared_model_proxy, configure_pool_resource
+    :private-members:
+
+.. autoclass:: mlrun.serving.states.Model
+    :members: predict, predict_async, load
+    :private-members:
+
+.. autoclass:: mlrun.serving.states.LLModel
+    :no-members:
+
+.. autoclass:: mlrun.serving.states.ModelRunnerSelector
+    :members: select_models, select_outlets
+    :private-members:
+
 .. automodule:: mlrun.serving
     :members:
     :show-inheritance:
     :undoc-members:
+    :exclude-members: LLModel, Model, ModelRunnerStep, ModelRunner, ModelSelector, MonitoredStep, ModelRunnerSelector
 
 .. automodule:: mlrun.serving.remote
     :members:

mlrun/serving/states.py

Lines changed: 26 additions & 25 deletions
@@ -2655,10 +2655,10 @@ def add_step(
         :param model_endpoint_creation_strategy: Strategy for creating or updating the model endpoint:
 
             * **overwrite**: If model endpoints with the same name exist, delete the `latest` one;
-              create a new model endpoint entry and set it as `latest`.
+              create a new model endpoint entry and set it as `latest`.
 
             * **inplace** (default): If model endpoints with the same name exist, update the `latest`
-              entry; otherwise, create a new entry.
+              entry; otherwise, create a new entry.
 
             * **archive**: If model endpoints with the same name exist, preserve them;
               create a new model endpoint with the same name and set it to `latest`.
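For context, a brief sketch of how the strategy documented above might be used when adding a model step to a serving graph. This example is not part of the commit: the function name, step name, and step class path are hypothetical, and it assumes the usual mlrun serving setup (new_function(kind="serving") followed by set_topology("flow")) together with the add_step parameter described in the docstring.

# Hypothetical usage sketch (not from this commit)
import mlrun

fn = mlrun.new_function("serving-demo", kind="serving")  # assumed: standard serving function
graph = fn.set_topology("flow")  # assumed: returns the root flow step

# "archive" keeps existing model endpoints with this name and registers
# the new endpoint as `latest`, per the docstring above.
graph.add_step(
    class_name="mlrun.serving.states.ModelRunnerStep",  # hypothetical step class for this sketch
    name="runner",
    model_endpoint_creation_strategy="archive",
)
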
@@ -3204,26 +3204,26 @@ def add_shared_model(
         Add a shared model to the graph, this model will be available to all the ModelRunners in the graph
         :param name: Name of the shared model (should be unique in the graph)
         :param model_class: Model class name. If LLModel is chosen
-                            (either by name `LLModel` or by its full path, e.g. mlrun.serving.states.LLModel),
-                            outputs will be overridden with UsageResponseKeys fields.
+                            (either by name `LLModel` or by its full path, e.g. mlrun.serving.states.LLModel),
+                            outputs will be overridden with UsageResponseKeys fields.
         :param execution_mechanism: Parallel execution mechanism to be used to execute this model. Must be one of:
 
             * **process_pool**: To run in a separate process from a process pool. This is appropriate for CPU or GPU
               intensive tasks as they would otherwise block the main process by holding Python's Global Interpreter
               Lock (GIL).
 
             * **dedicated_process**: To run in a separate dedicated process. This is appropriate for CPU or GPU
-              intensive tasks that also require significant Runnable-specific initialization (e.g. a large model).
+              intensive tasks that also require significant Runnable-specific initialization (e.g. a large model).
 
             * **thread_pool**: To run in a separate thread. This is appropriate for blocking I/O tasks, as they would
              otherwise block the main event loop thread.
 
             * **asyncio**: To run in an asyncio task. This is appropriate for I/O tasks that use asyncio, allowing the
              event loop to continue running while waiting for a response.
 
-            * **shared_executor": Reuses an external executor (typically managed by the flow or context) to execute the
-              runnable. Should be used only if you have multiply `ParallelExecution` in the same flow and especially
-              useful when:
+            * **shared_executor**: Reuses an external executor (typically managed by the flow or context) to execute
+              the runnable. Should be used only if you have multiple `ParallelExecution` in the same flow and
+              especially useful when:
 
               - You want to share a heavy resource like a large model loaded onto a GPU.
 
@@ -3237,22 +3237,22 @@ def add_shared_model(
             * **naive**: To run in the main event loop. This is appropriate only for trivial computation and/or file
               I/O. It means that the runnable will not actually be run in parallel to anything else.
 
-        :param model_artifact: model artifact or mlrun model artifact uri
-        :param inputs: list of the model inputs (e.g. features) ,if provided will override the inputs
-                       that been configured in the model artifact, please note that those inputs need
-                       to be equal in length and order to the inputs that model_class
-                       predict method expects
-        :param outputs: list of the model outputs (e.g. labels) ,if provided will override the outputs
-                        that been configured in the model artifact, please note that those outputs need
-                        to be equal to the model_class
-                        predict method outputs (length, and order)
-        :param input_path: input path inside the user event, expect scopes to be defined by dot notation
-                           (e.g "inputs.my_model_inputs"). expects list or dictionary type object in path.
-        :param result_path: result path inside the user output event, expect scopes to be defined by dot
-                            notation (e.g "outputs.my_model_outputs") expects list or dictionary type object
-                            in path.
-        :param override: bool allow override existing model on the current ModelRunnerStep.
-        :param model_parameters: Parameters for model instantiation
+        :param model_artifact: model artifact or mlrun model artifact uri
+        :param inputs: list of the model inputs (e.g. features) ,if provided will override the inputs
+                       that been configured in the model artifact, please note that those inputs need
+                       to be equal in length and order to the inputs that model_class
+                       predict method expects
+        :param outputs: list of the model outputs (e.g. labels) ,if provided will override the outputs
+                        that been configured in the model artifact, please note that those outputs need
+                        to be equal to the model_class
+                        predict method outputs (length, and order)
+        :param input_path: input path inside the user event, expect scopes to be defined by dot notation
+                           (e.g "inputs.my_model_inputs"). expects list or dictionary type object in path.
+        :param result_path: result path inside the user output event, expect scopes to be defined by dot
+                            notation (e.g "outputs.my_model_outputs") expects list or dictionary type object
+                            in path.
+        :param override: bool allow override existing model on the current ModelRunnerStep.
+        :param model_parameters: Parameters for model instantiation
         """
         if isinstance(model_class, Model) and model_parameters:
             raise mlrun.errors.MLRunInvalidArgumentError(
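To make the parameter list above concrete, here is a short, hypothetical sketch of registering a shared model on the graph. None of it is taken from the commit: the model name, artifact URI, and event paths are placeholders, the graph object is assumed from the earlier sketch, and only parameters documented in the docstring above are used.

# Hypothetical usage sketch (placeholder names and URI)
graph.add_shared_model(
    name="sentiment-llm",
    model_class="LLModel",  # per the docstring, outputs are overridden with UsageResponseKeys fields
    execution_mechanism="dedicated_process",  # heavy per-model initialization, see the bullet above
    model_artifact="store://models/demo/sentiment:latest",
    input_path="inputs.my_model_inputs",  # dot-notation scope inside the incoming event
    result_path="outputs.my_model_outputs",
)
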
@@ -3337,8 +3337,9 @@ def configure_shared_pool_resource(
     ) -> None:
         """
         Configure the resource limits for the shared models in the graph.
+
         :param max_processes: Maximum number of processes to spawn (excluding dedicated processes).
-                              Defaults to the number of CPUs or 16 if undetectable.
+                              Defaults to the number of CPUs or 16 if undetectable.
         :param max_threads: Maximum number of threads to spawn. Defaults to 32.
         :param pool_factor: Multiplier to scale the number of process/thread workers per runnable. Defaults to 1.
         """
