Fix name conflicts and add test to cover search #224

Merged
wyli merged 7 commits into main from tests on Apr 18, 2023
@mingxin-zheng
Contributor

This is to fix: https://github.com/Project-MONAI/MONAI/actions/runs/4729226287/jobs/8391521501

The issue is caused by the same keys being used in both hyper_parameters.yaml and hyper_parameters_search.yaml of dints.

os.listdir does not guarantee the order of the files it returns. When that order changes, the search can end up using transform#resample_to_spacing.

cache_rate has the same problem, so I included a fix for it in this PR too.
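
To illustrate the kind of conflict described above, here is a minimal sketch, not the actual dints code. It assumes the two config files are merged into one flat dictionary in file-listing order, uses JSON files as a stand-in for the YAML files, and uses a hypothetical `search_cache_rate` key to show the effect of giving the search file its own key name.

```python
import json
import os
import tempfile

# Hypothetical contents: both files define "cache_rate" with different values.
CONFLICTING = {
    "hyper_parameters.json": {"cache_rate": 1.0},
    "hyper_parameters_search.json": {"cache_rate": 0.5},
}

# Same values, but the search file uses its own (hypothetical) key name.
RENAMED = {
    "hyper_parameters.json": {"cache_rate": 1.0},
    "hyper_parameters_search.json": {"search_cache_rate": 0.5},
}


def merge_configs(configs, reverse=False):
    """Write configs to a temp folder, then merge them in listing order.

    Later files overwrite keys from earlier ones, so the result depends on
    the order of the file names. `reverse` simulates a different order.
    """
    merged = {}
    with tempfile.TemporaryDirectory() as folder:
        for name, content in configs.items():
            with open(os.path.join(folder, name), "w") as f:
                json.dump(content, f)
        # os.listdir gives no ordering guarantee; sort to make it explicit here.
        names = sorted(os.listdir(folder), reverse=reverse)
        for name in names:
            with open(os.path.join(folder, name)) as f:
                merged.update(json.load(f))
    return merged


conflict_a = merge_configs(CONFLICTING)
conflict_b = merge_configs(CONFLICTING, reverse=True)
print(conflict_a == conflict_b)  # False: the winning cache_rate depends on order

renamed_a = merge_configs(RENAMED)
renamed_b = merge_configs(RENAMED, reverse=True)
print(renamed_a == renamed_b)  # True: distinct keys make the order irrelevant
```

With shared keys, whichever file is read last wins, so the result changes with the listing order. With distinct keys, the merged result is the same either way.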

Since search is not used by default, it is not covered by any test in MONAI. I added a test case here (and I found issues with multi-GPU when search is enabled).

mingxin-zheng and others added 2 commits April 18, 2023 10:56
Signed-off-by: Wenqi Li <[email protected]>
@wyli
Contributor

wyli commented Apr 18, 2023

2023-04-18 08:00:46.470993 - Length of input patch is recommended to be a multiple of 32.
Traceback (most recent call last):
  File "/tmp/tmpduj1w7ul/work_dir/dints_0/scripts/dummy_runner.py", line 185, in <module>
    fire.Fire(DummyRunnerDiNTS)
  File "/home/wenqil/anaconda3/envs/py38/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/wenqil/anaconda3/envs/py38/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/wenqil/anaconda3/envs/py38/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/tmp/tmpduj1w7ul/work_dir/dints_0/scripts/dummy_runner.py", line 73, in __init__
    image_shape = data_stat["stats_by_cases"][_k]["image_stats"]["shape"]
TypeError: 'NoneType' object is not subscriptable

@wyli
Contributor

wyli commented Apr 18, 2023

  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/train.py", line 23, in run
    run_segmenter(config_file=config_file, **override)
  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/segmenter.py", line 1923, in run_segmenter
    run_segmenter_worker(0, config_file, kwargs)
  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/segmenter.py", line 1897, in run_segmenter_worker
    best_metric = segmenter.run()
  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/segmenter.py", line 1866, in run
    self.train()
  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/segmenter.py", line 1150, in train
    train_loss, train_acc = self.train_epoch(
  File "/tmp/tmpcvwc8wzi/work_dir/segresnet_0/scripts/segmenter.py", line 1578, in train_epoch
    data = data_list[ich]
IndexError: tuple index out of range

@mingxin-zheng
Contributor Author

For the first issue, it seems my changes in the PR were overridden.

@wyli
Contributor

wyli commented Apr 18, 2023

I see, let me revert; that was done by mistake.

Signed-off-by: Wenqi Li <[email protected]>
@mingxin-zheng
Contributor Author

If it is complicated to revert, I can copy-paste those changes into my PR.

wyli added 2 commits April 18, 2023 08:31
Signed-off-by: Wenqi Li <[email protected]>
Signed-off-by: Wenqi Li <[email protected]>
@mingxin-zheng
Contributor Author

Thanks @wyli!

@wyli
Contributor

wyli commented Apr 18, 2023

There's still an error after the above ones were fixed:

======================================================================
ERROR: test_autorunner_ensemble (__main__.TestAutoRunner)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/wenqil/Documents/MONAI/tests/test_integration_autorunner.py", line 132, in test_autorunner_ensemble
    runner.run()
  File "/home/wenqil/Documents/MONAI/monai/apps/auto3dseg/auto_runner.py", line 803, in run
    self._train_algo_in_sequence(history)
  File "/home/wenqil/Documents/MONAI/monai/apps/auto3dseg/auto_runner.py", line 659, in _train_algo_in_sequence
    acc = algo.get_score()
  File "/home/wenqil/Documents/MONAI/monai/apps/auto3dseg/bundle_gen.py", line 264, in get_score
    dict_file = ConfigParser.load_config_file(os.path.join(ckpt_path, "progress.yaml"))
  File "/home/wenqil/Documents/MONAI/monai/bundle/config_parser.py", line 402, in load_config_file
    with open(_filepath) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpd2llrjiy/work_dir/segresnet_0/model/progress.yaml'

wyli requested review from heyufan1995 and myron April 18, 2023 12:36
@wyli
Contributor

wyli commented Apr 18, 2023

This is still an issue when num_steps_per_image>1: @myron, progress.yaml was not saved properly. However, we are merging this and overriding it to num_steps_per_image=1 in the test cases, because we want to run the full integration tests to unblock the release.

@mingxin-zheng
Contributor Author

Looks like @wyli has resolved this issue. I will merge it to the main branch for more integration tests.
