Random CI errors may be related to out of memory

**Is your feature request related to a problem? Please describe.**
https://github.com/Project-MONAI/MONAI/runs/6567755462?check_suite_focus=true
```
======================================================================
ERROR: test_verify_0____w_MONAI_MONAI_tests_testing_data_metadata_json (tests.test_bundle_verify_net.TestVerifyNetwork)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/parameterized/parameterized.py", line 533, in standalone_func
    return func(*(a + p.args), **p.kwargs)
  File "/__w/MONAI/MONAI/tests/test_bundle_verify_net.py", line 43, in test_verify
    subprocess.check_call(cmd, env=test_env)
  File "/opt/conda/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['coverage', 'run', '-m', 'monai.bundle', 'verify_net_in_out', 'network_def', '--meta_file', '/__w/MONAI/MONAI/tests/testing_data/metadata.json', '--config_file', '/__w/MONAI/MONAI/tests/testing_data/inference.json', '-n', '2', '--any', '32', '--args_file', '/tmp/tmp7y1u_zw9/def_args.json', '--_meta_#network_data_format#inputs#image#spatial_shape', "[32,'*','4**p*n']"]' returned non-zero exit status 1.

======================================================================
ERROR: test_bspline (tests.test_global_mutual_information_loss.TestGlobalMutualInformationLoss)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/MONAI/MONAI/tests/test_global_mutual_information_loss.py", line 106, in test_bspline
    result = loss_fn(a2, a1).detach().cpu().numpy()
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
tests finished, printing completed times >10.0s in ascending order...

test_read_patches_cucim_0 (tests.test_masked_inference_wsi_dataset.TestMaskedInferenceWSIDataset) (10.2s)
test_script_0 (tests.test_senet.TestSENET) (10.4s)
test_invert (tests.test_invertd.TestInvertd) (10.6s)
  File "/__w/MONAI/MONAI/monai/losses/image_dissimilarity.py", line 319, in forward
    wa, pa, wb, pb = self.parzen_windowing(pred, target)  # (batch, num_sample, num_bin), (batch, 1, num_bin)
  File "/__w/MONAI/MONAI/monai/losses/image_dissimilarity.py", line 233, in parzen_windowing
    pred_weight, pred_probability = self.parzen_windowing_b_spline(pred, order=3)
  File "/__w/MONAI/MONAI/monai/losses/image_dissimilarity.py", line 283, in parzen_windowing_b_spline
    weight + (4 - 6 * sample_bin_matrix**2 + 3 * sample_bin_matrix**3) * (sample_bin_matrix < 1) / 6
RuntimeError: CUDA out of memory. Tried to allocate 736.00 MiB (GPU 0; 14.76 GiB total capacity; 5.73 GiB already allocated; 94.00 MiB free; 5.75 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
It would be nice to enhance the test logic so that tests are more robust to transient GPU out-of-memory failures like the ones above.
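One possible direction, as a rough sketch only: wrap memory-heavy tests in a retry helper that frees cached memory and re-runs the test when it fails with an out-of-memory `RuntimeError`. The `retry_on_oom` name and its `max_retries` / `wait_seconds` parameters are hypothetical and not part of MONAI. The sketch assumes the OOM surfaces as a `RuntimeError` whose message contains "out of memory", as in the traceback above.

```python
import functools
import gc
import time


def retry_on_oom(max_retries=2, wait_seconds=0.0):
    """Hypothetical helper: re-run a callable if it raises an OOM RuntimeError."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RuntimeError as err:
                    # Re-raise anything that is not an OOM, or if retries are exhausted.
                    if "out of memory" not in str(err) or attempt == max_retries:
                        raise
                    # Drop unreachable Python objects that may still hold GPU tensors.
                    gc.collect()
                    try:
                        import torch  # optional: only used to release cached GPU memory

                        if torch.cuda.is_available():
                            torch.cuda.empty_cache()
                    except ImportError:
                        pass
                    # Give other jobs sharing the GPU a chance to finish.
                    time.sleep(wait_seconds)

        return wrapper

    return decorator
```

A test method could then be decorated with `@retry_on_oom(max_retries=2, wait_seconds=5)`. This would not help the first failure above if the out-of-memory error happens inside the subprocess, since the parent test only sees a `CalledProcessError`. It also only hides contention between jobs sharing a GPU rather than removing it.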

