test error: 3d_segmentation/unet_segmentation_3d_catalyst.ipynb #791

@wyli

Description

Describe the bug

From the nightly tests:

02:36:27  Running ./3d_segmentation/unet_segmentation_3d_catalyst.ipynb
02:36:27  Checking PEP8 compliance...
02:36:28  Running notebook...
02:36:28  Before:
02:36:28      "max_epochs = 50\n",
02:36:28  After:
02:36:28      "max_epochs = 1\n",
02:36:28  Before:
02:36:28      "val_interval = 2\n",
02:36:28  After:
02:36:28      "val_interval = 1\n",
02:36:32  MONAI version: 0.9.1rc3
02:36:32  Numpy version: 1.22.4
02:36:32  Pytorch version: 1.10.2+cu102
02:36:32  MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
02:36:32  MONAI rev id: 7a5de8b7b9db101a431e70ae2aa8ea7ebb8dfffe
02:36:32  MONAI __file__: /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/__init__.py
02:36:32  
02:36:32  Optional dependencies:
02:36:32  Pytorch Ignite version: 0.4.9
02:36:32  Nibabel version: 4.0.1
02:36:32  scikit-image version: 0.19.3
02:36:32  Pillow version: 7.0.0
02:36:32  Tensorboard version: 2.9.1
02:36:32  gdown version: 4.5.1
02:36:32  TorchVision version: 0.11.3+cu102
02:36:32  tqdm version: 4.64.0
02:36:32  lmdb version: 1.3.0
02:36:32  psutil version: 5.9.1
02:36:32  pandas version: 1.1.5
02:36:32  einops version: 0.4.1
02:36:32  transformers version: 4.20.1
02:36:32  mlflow version: 1.27.0
02:36:32  pynrrd version: 0.4.3
02:36:32  
02:36:32  For details about installing the optional dependencies, please visit:
02:36:32      https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
02:36:32  
02:36:32  /opt/conda/lib/python3.8/site-packages/papermill/iorw.py:58: FutureWarning: pyarrow.HadoopFileSystem is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
02:36:32    from pyarrow import HadoopFileSystem
02:37:10  
Executing:   0%|          | 0/28 [00:00<?, ?cell/s]
Executing:   4%|▎         | 1/28 [00:01<00:35,  1.31s/cell]
Executing:  11%|█         | 3/28 [00:19<03:01,  7.24s/cell]
Executing:  18%|█▊        | 5/28 [00:22<01:37,  4.23s/cell]
Executing:  43%|████▎     | 12/28 [00:30<00:31,  1.95s/cell]
Executing:  50%|█████     | 14/28 [00:31<00:22,  1.63s/cell]
Executing:  79%|███████▊  | 22/28 [00:36<00:06,  1.08s/cell]
Executing:  79%|███████▊  | 22/28 [00:37<00:10,  1.72s/cell]
02:37:10  Traceback (most recent call last):
02:37:10    File "/opt/conda/bin/papermill", line 8, in <module>
02:37:10      sys.exit(papermill())
02:37:10    File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
02:37:10      return self.main(*args, **kwargs)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1053, in main
02:37:10      rv = self.invoke(ctx)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
02:37:10      return ctx.invoke(self.callback, **ctx.params)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 754, in invoke
02:37:10      return __callback(*args, **kwargs)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
02:37:10      return f(get_current_context(), *args, **kwargs)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/papermill/cli.py", line 250, in papermill
02:37:10      execute_notebook(
02:37:10    File "/opt/conda/lib/python3.8/site-packages/papermill/execute.py", line 122, in execute_notebook
02:37:10      raise_for_execution_errors(nb, output_path)
02:37:10    File "/opt/conda/lib/python3.8/site-packages/papermill/execute.py", line 234, in raise_for_execution_errors
02:37:10      raise error
02:37:10  papermill.exceptions.PapermillExecutionError: 
02:37:10  ---------------------------------------------------------------------------
02:37:10  Exception encountered at "In [10]":
02:37:10  ---------------------------------------------------------------------------
02:37:10  AttributeError                            Traceback (most recent call last)
02:37:10  Input In [10], in <cell line: 7>()
02:37:10        3 log_dir = os.path.join(root_dir, "logs")
02:37:10        4 runner = MonaiSupervisedRunner(
02:37:10        5     input_key="img", input_target_key="seg", output_key="logits"
02:37:10        6 )  # you can also specify `device` here
02:37:10  ----> 7 runner.train(
02:37:10        8     loaders={"train": train_loader, "valid": val_loader},
02:37:10        9     model=model,
02:37:10       10     criterion=loss_function,
02:37:10       11     optimizer=optimizer,
02:37:10       12     num_epochs=max_epochs,
02:37:10       13     logdir=log_dir,
02:37:10       14     main_metric="dice_metric",
02:37:10       15     minimize_metric=False,
02:37:10       16     verbose=False,
02:37:10       17     timeit=True,  # let's use minimal logs, but with time checkers
02:37:10       18     callbacks={
02:37:10       19         "loss": catalyst.dl.CriterionCallback(
02:37:10       20             input_key="seg", output_key="logits"
02:37:10       21         ),
02:37:10       22         "periodic_valid": catalyst.dl.PeriodicLoaderCallback(
02:37:10       23             valid=val_interval
02:37:10       24         ),
02:37:10       25         "dice_metric": catalyst.dl.MetricCallback(
02:37:10       26             prefix="dice_metric",
02:37:10       27             metric_fn=lambda y_pred, y: get_metric(y_pred, y),
02:37:10       28             input_key="seg",
02:37:10       29             output_key="logits",
02:37:10       30         ),
02:37:10       31     },
02:37:10       32     load_best_on_end=True,  # user-friendly API :)
02:37:10       33 )
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/dl/runner/runner.py:163, in Runner.train(self, model, criterion, optimizer, scheduler, datasets, loaders, callbacks, logdir, resume, num_epochs, valid_loader, main_metric, minimize_metric, verbose, stage_kwargs, checkpoint_data, fp16, distributed, check, overfit, timeit, load_best_on_end, initial_seed, state_kwargs)
02:37:10      139 experiment = self._experiment_fn(
02:37:10      140     stage="train",
02:37:10      141     model=model,
02:37:10     (...)
02:37:10      160     initial_seed=initial_seed,
02:37:10      161 )
02:37:10      162 self.experiment = experiment
02:37:10  --> 163 utils.distributed_cmd_run(self.run_experiment, distributed)
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/utils/scripts.py:132, in distributed_cmd_run(worker_fn, distributed, *args, **kwargs)
02:37:10      122     warnings.warn(
02:37:10      123         "Looks like you are trying to call distributed setup twice, "
02:37:10      124         "switching to normal run for correct distributed training."
02:37:10      125     )
02:37:10      127 if (
02:37:10      128     not distributed
02:37:10      129     or torch.distributed.is_initialized()
02:37:10      130     or world_size <= 1
02:37:10      131 ):
02:37:10  --> 132     worker_fn(*args, **kwargs)
02:37:10      133 elif local_rank is not None:
02:37:10      134     torch.cuda.set_device(int(local_rank))
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:987, in IRunner.run_experiment(self, experiment)
02:37:10      985 if _exception_handler_check(getattr(self, "callbacks", None)):
02:37:10      986     self.exception = ex
02:37:10  --> 987     self._run_event("on_exception")
02:37:10      988 else:
02:37:10      989     raise ex
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:780, in IRunner._run_event(self, event)
02:37:10      769 """Inner method to run specified event on Runners' callbacks.
02:37:10      770 
02:37:10      771 Args:
02:37:10     (...)
02:37:10      777 
02:37:10      778 """
02:37:10      779 for callback in self.callbacks.values():
02:37:10  --> 780     getattr(callback, event)(self)
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/callbacks/exception.py:24, in ExceptionCallback.on_exception(self, runner)
02:37:10       21     return
02:37:10       23 if runner.need_exception_reraise:
02:37:10  ---> 24     raise exception
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:975, in IRunner.run_experiment(self, experiment)
02:37:10      973 try:
02:37:10      974     for stage in self.experiment.stages:
02:37:10  --> 975         self._run_stage(stage)
02:37:10      976 except (Exception, KeyboardInterrupt) as ex:
02:37:10      977     from catalyst.core.callbacks.exception import ExceptionCallback
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:943, in IRunner._run_stage(self, stage)
02:37:10      939 utils.set_global_seed(
02:37:10      940     self.experiment.initial_seed + self.global_epoch + 1
02:37:10      941 )
02:37:10      942 self._run_event("on_epoch_start")
02:37:10  --> 943 self._run_epoch(stage=stage, epoch=self.epoch)
02:37:10      944 self._run_event("on_epoch_end")
02:37:10      946 if self.need_early_stop:
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:922, in IRunner._run_epoch(self, stage, epoch)
02:37:10      920 self._run_event("on_loader_start")
02:37:10      921 with torch.set_grad_enabled(self.is_train_loader):
02:37:10  --> 922     self._run_loader(loader)
02:37:10      923 self._run_event("on_loader_end")
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:857, in IRunner._run_loader(self, loader)
02:37:10      855 self.global_batch_step += 1
02:37:10      856 self.loader_batch_step = i + 1
02:37:10  --> 857 self._run_batch(batch)
02:37:10      858 if self.need_early_stop:
02:37:10      859     self.need_early_stop = False
02:37:10  
02:37:10  File /opt/conda/lib/python3.8/site-packages/catalyst/core/runner.py:822, in IRunner._run_batch(self, batch)
02:37:10      813 """
02:37:10      814 Inner method to run train step on specified data batch,
02:37:10      815 with batch callbacks events.
02:37:10     (...)
02:37:10      819         from DataLoader.
02:37:10      820 """
02:37:10      821 if isinstance(batch, dict):
02:37:10  --> 822     self.batch_size = next(iter(batch.values())).shape[0]
02:37:10      823 else:
02:37:10      824     self.batch_size = len(batch[0])
02:37:10  
02:37:10  AttributeError: 'dict' object has no attribute 'shape'
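
My reading of the traceback (an assumption, not yet confirmed): Catalyst's `IRunner._run_batch` infers the batch size with `next(iter(batch.values())).shape[0]`, which assumes the first value of a dict batch is a tensor. MONAI's dictionary-based pipelines can yield batches that also contain plain-dict entries (e.g. `*_meta_dict` metadata), and if such an entry comes first, the `.shape` lookup fails exactly as above. A torch-free sketch of that failure mode, where `FakeTensor`, `infer_batch_size`, and the batch keys are all illustrative stand-ins:

```python
class FakeTensor:
    """Stand-in for torch.Tensor; only the .shape attribute matters here."""
    def __init__(self, *shape):
        self.shape = shape


def infer_batch_size(batch):
    # Mirrors the logic in catalyst/core/runner.py IRunner._run_batch:
    # it assumes the FIRST value of a dict batch is a tensor.
    if isinstance(batch, dict):
        return next(iter(batch.values())).shape[0]
    return len(batch[0])


# A tensor-only batch works the way Catalyst expects.
ok_batch = {"img": FakeTensor(4, 1, 96, 96, 96), "seg": FakeTensor(4, 1, 96, 96, 96)}
assert infer_batch_size(ok_batch) == 4

# A MONAI-style batch where a metadata dict happens to come first
# reproduces the reported error.
bad_batch = {"img_meta_dict": {"affine": [[1, 0], [0, 1]]},
             "img": FakeTensor(4, 1, 96, 96, 96)}
try:
    infer_batch_size(bad_batch)
except AttributeError as err:
    print(err)  # 'dict' object has no attribute 'shape'
```

If this is the cause, dropping the metadata entries from the batch before handing it to the runner (or disabling `USE_META_DICT`-style metadata tracking in the notebook's transforms) would be one possible workaround.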
