Support MLFlow Handler for single process/multi task environment #5728
wyli merged 3 commits into Project-MONAI:dev from
Signed-off-by: Sachidanand Alle <[email protected]>
Hi @SachidanandAlle, may I know how you tested the MLFlow experiment management for bundles?
I agree with your proposal: I think we should try to avoid setting MLFlow's global parameters. Thanks.
Point 1.. yes..

Hi @SachidanandAlle, Thanks in advance.
Run the handler with two different URIs (folders), something like this:

    from concurrent.futures import ThreadPoolExecutor

    def run_task(uri):
        print(f"Running handler update for uri: {uri}")
        # `train` is a placeholder for your own entry point; for a test, run the
        # handler against a mocked train/eval engine instead of a real train.
        r = train(uri)
        return r

    train_tasks = ["uri1", "uri2"]
    futures = {}  # maps each URI to (task, future)
    with ThreadPoolExecutor(2, "Training") as executor:
        for t in train_tasks:
            futures[t] = t, executor.submit(run_task, t)
        for tid, (t, future) in futures.items():
            res = future.result()  # re-raises any exception from the task

This keeps the active run open, so both handlers race for the current URI/experiment/active run. https://github.com/mlflow/mlflow/blob/master/mlflow/tracking/fluent.py#L1529-L1532
Another easy way to reproduce it is via unit tests, invoking the handler from multiple threads.
Hi @Nic-Ma, Thanks,
/build
Signed-off-by: Wenqi Li <[email protected]>
/build
Thanks for your testing, the PR looks good to me. Thanks in advance.
Signed-off-by: Sachidanand Alle [email protected]
The current MLFlow Handler fails when you invoke two train requests back to back with different URIs, or when you run multiple train requests within the same process. The main cause is the use of a global array that stores the active experiment, the active run, and other state, which every handler shares. This causes conflicts between two invocations that use two different URIs.
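One way to picture the alternative to shared global state is a handler that owns its tracking URI, experiment, and run. The sketch below is a toy, standard-library-only illustration under that assumption; `PerHandlerTracker` and its methods are hypothetical names and do not reproduce the MONAI handler or the MLFlow API. With MLFlow itself, this might correspond to each handler holding an `MlflowClient` constructed with an explicit tracking URI, though that mapping is an assumption here rather than something stated in this description.

```python
from concurrent.futures import ThreadPoolExecutor


class PerHandlerTracker:
    """Hypothetical handler that owns its tracking state instead of sharing globals."""

    def __init__(self, tracking_uri, experiment_name="default"):
        self.tracking_uri = tracking_uri
        self.experiment_name = experiment_name
        self.run = None  # this handler's own "active run"

    def start(self):
        # Create a run only if this handler does not already have one open.
        if self.run is None:
            self.run = {"uri": self.tracking_uri, "experiment": self.experiment_name, "metrics": []}

    def log_metric(self, key, value):
        self.run["metrics"].append((key, value))

    def close(self):
        # Hand back the finished run and clear this handler's state.
        finished, self.run = self.run, None
        return finished


def train(uri):
    handler = PerHandlerTracker(uri)
    handler.start()
    handler.log_metric("loss", 0.0)  # placeholder value, not a real result
    return handler.close()


with ThreadPoolExecutor(2, "Training") as executor:
    results = list(executor.map(train, ["uri1", "uri2"]))

print([r["uri"] for r in results])
```

Because each task touches only its own handler instance, the two runs cannot overwrite each other's URI or active run, whether they execute concurrently or one after another.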
Fixes
The above two conditions will help create similar behavior compared to using mlflow.active_run().
Verified
Error Description
Error stack when you run two train workflows within the same process (simply one after another).
Types of changes
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests --disttests
make html command in the docs/ folder.