test error: ./modules/TorchIO_MONAI_PyTorch_Lightning.ipynb #987
Closed
Description
[2022-10-13T00:27:52.977Z] Running ./modules/TorchIO_MONAI_PyTorch_Lightning.ipynb
[2022-10-13T00:27:52.977Z] Checking PEP8 compliance...
[2022-10-13T00:27:52.977Z] Running notebook...
[2022-10-13T00:27:55.501Z] MONAI version: 1.0.0+38.g32a237a7
[2022-10-13T00:27:55.501Z] Numpy version: 1.22.2
[2022-10-13T00:27:55.501Z] Pytorch version: 1.13.0a0+d0d6b1f
[2022-10-13T00:27:55.501Z] MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
[2022-10-13T00:27:55.501Z] MONAI rev id: 32a237a7959e4890a963481c23baff84b95a253b
[2022-10-13T00:27:55.501Z] MONAI __file__: /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/__init__.py
[2022-10-13T00:27:55.501Z]
[2022-10-13T00:27:55.501Z] Optional dependencies:
[2022-10-13T00:27:55.501Z] Pytorch Ignite version: 0.4.10
[2022-10-13T00:27:55.501Z] Nibabel version: 4.0.2
[2022-10-13T00:27:55.501Z] scikit-image version: 0.19.3
[2022-10-13T00:27:55.501Z] Pillow version: 9.0.1
[2022-10-13T00:27:55.501Z] Tensorboard version: 2.10.0
[2022-10-13T00:27:55.501Z] gdown version: 4.5.1
[2022-10-13T00:27:55.501Z] TorchVision version: 0.14.0a0
[2022-10-13T00:27:55.501Z] tqdm version: 4.64.1
[2022-10-13T00:27:55.501Z] lmdb version: 1.3.0
[2022-10-13T00:27:55.501Z] psutil version: 5.9.2
[2022-10-13T00:27:55.501Z] pandas version: 1.4.4
[2022-10-13T00:27:55.501Z] einops version: 0.5.0
[2022-10-13T00:27:55.501Z] transformers version: 4.21.3
[2022-10-13T00:27:55.501Z] mlflow version: 1.29.0
[2022-10-13T00:27:55.501Z] pynrrd version: 1.0.0
[2022-10-13T00:27:55.501Z]
[2022-10-13T00:27:55.501Z] For details about installing the optional dependencies, please visit:
[2022-10-13T00:27:55.501Z] https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
[2022-10-13T00:27:55.501Z]
[2022-10-13T00:27:56.870Z] /opt/conda/lib/python3.8/site-packages/papermill/iorw.py:153: UserWarning: the file is not specified with any extension : -
[2022-10-13T00:27:56.870Z] warnings.warn(
[2022-10-13T00:29:09.859Z]
Executing: 0%| | 0/39 [00:00<?, ?cell/s]
Executing: 3%|▎ | 1/39 [00:01<00:49, 1.30s/cell]
Executing: 10%|█ | 4/39 [00:06<00:59, 1.69s/cell]
Executing: 15%|█▌ | 6/39 [00:38<04:17, 7.81s/cell]
Executing: 18%|█▊ | 7/39 [00:55<05:22, 10.07s/cell]
Executing: 21%|██ | 8/39 [00:59<04:23, 8.51s/cell]
Executing: 44%|████▎ | 17/39 [01:04<00:54, 2.47s/cell]
Executing: 62%|██████▏ | 24/39 [01:11<00:26, 1.76s/cell]
Executing: 62%|██████▏ | 24/39 [01:13<00:45, 3.05s/cell]
[2022-10-13T00:29:09.859Z] /opt/conda/lib/python3.8/site-packages/papermill/iorw.py:153: UserWarning: the file is not specified with any extension : -
[2022-10-13T00:29:09.859Z] warnings.warn(
[2022-10-13T00:29:09.859Z] Traceback (most recent call last):
[2022-10-13T00:29:09.859Z] File "/opt/conda/bin/papermill", line 8, in <module>
[2022-10-13T00:29:09.859Z] sys.exit(papermill())
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
[2022-10-13T00:29:09.859Z] return self.main(*args, **kwargs)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1055, in main
[2022-10-13T00:29:09.859Z] rv = self.invoke(ctx)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
[2022-10-13T00:29:09.859Z] return ctx.invoke(self.callback, **ctx.params)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 760, in invoke
[2022-10-13T00:29:09.859Z] return __callback(*args, **kwargs)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
[2022-10-13T00:29:09.859Z] return f(get_current_context(), *args, **kwargs)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/papermill/cli.py", line 250, in papermill
[2022-10-13T00:29:09.859Z] execute_notebook(
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/papermill/execute.py", line 128, in execute_notebook
[2022-10-13T00:29:09.859Z] raise_for_execution_errors(nb, output_path)
[2022-10-13T00:29:09.859Z] File "/opt/conda/lib/python3.8/site-packages/papermill/execute.py", line 232, in raise_for_execution_errors
[2022-10-13T00:29:09.859Z] raise error
[2022-10-13T00:29:09.859Z] papermill.exceptions.PapermillExecutionError:
[2022-10-13T00:29:09.859Z] ---------------------------------------------------------------------------
[2022-10-13T00:29:09.859Z] Exception encountered at "In [11]":
[2022-10-13T00:29:09.859Z] ---------------------------------------------------------------------------
[2022-10-13T00:29:09.859Z] RuntimeError Traceback (most recent call last)
[2022-10-13T00:29:09.859Z] Cell In [11], line 3
[2022-10-13T00:29:09.859Z] 1 start = datetime.now()
[2022-10-13T00:29:09.859Z] 2 print('Training started at', start)
[2022-10-13T00:29:09.859Z] ----> 3 trainer.fit(model=model, datamodule=data)
[2022-10-13T00:29:09.859Z] 4 print('Training duration:', datetime.now() - start)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:740, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, train_dataloader, ckpt_path)
[2022-10-13T00:29:09.859Z] 735 rank_zero_deprecation(
[2022-10-13T00:29:09.859Z] 736 "`trainer.fit(train_dataloader)` is deprecated in v1.4 and will be removed in v1.6."
[2022-10-13T00:29:09.859Z] 737 " Use `trainer.fit(train_dataloaders)` instead. HINT: added 's'"
[2022-10-13T00:29:09.859Z] 738 )
[2022-10-13T00:29:09.859Z] 739 train_dataloaders = train_dataloader
[2022-10-13T00:29:09.859Z] --> 740 self._call_and_handle_interrupt(
[2022-10-13T00:29:09.859Z] 741 self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
[2022-10-13T00:29:09.859Z] 742 )
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:685, in Trainer._call_and_handle_interrupt(self, trainer_fn, *args, **kwargs)
[2022-10-13T00:29:09.859Z] 675 r"""
[2022-10-13T00:29:09.859Z] 676 Error handling, intended to be used only for main trainer function entry points (fit, validate, test, predict)
[2022-10-13T00:29:09.859Z] 677 as all errors should funnel through them
[2022-10-13T00:29:09.859Z] (...)
[2022-10-13T00:29:09.859Z] 682 **kwargs: keyword arguments to be passed to `trainer_fn`
[2022-10-13T00:29:09.859Z] 683 """
[2022-10-13T00:29:09.859Z] 684 try:
[2022-10-13T00:29:09.859Z] --> 685 return trainer_fn(*args, **kwargs)
[2022-10-13T00:29:09.859Z] 686 # TODO: treat KeyboardInterrupt as BaseException (delete the code below) in v1.7
[2022-10-13T00:29:09.859Z] 687 except KeyboardInterrupt as exception:
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:777, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
[2022-10-13T00:29:09.859Z] 775 # TODO: ckpt_path only in v1.7
[2022-10-13T00:29:09.859Z] 776 ckpt_path = ckpt_path or self.resume_from_checkpoint
[2022-10-13T00:29:09.859Z] --> 777 self._run(model, ckpt_path=ckpt_path)
[2022-10-13T00:29:09.859Z] 779 assert self.state.stopped
[2022-10-13T00:29:09.859Z] 780 self.training = False
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1199, in Trainer._run(self, model, ckpt_path)
[2022-10-13T00:29:09.859Z] 1196 self.checkpoint_connector.resume_end()
[2022-10-13T00:29:09.859Z] 1198 # dispatch `start_training` or `start_evaluating` or `start_predicting`
[2022-10-13T00:29:09.859Z] -> 1199 self._dispatch()
[2022-10-13T00:29:09.859Z] 1201 # plugin will finalized fitting (e.g. ddp_spawn will load trained model)
[2022-10-13T00:29:09.859Z] 1202 self._post_dispatch()
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1279, in Trainer._dispatch(self)
[2022-10-13T00:29:09.859Z] 1277 self.training_type_plugin.start_predicting(self)
[2022-10-13T00:29:09.859Z] 1278 else:
[2022-10-13T00:29:09.859Z] -> 1279 self.training_type_plugin.start_training(self)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py:202, in TrainingTypePlugin.start_training(self, trainer)
[2022-10-13T00:29:09.859Z] 200 def start_training(self, trainer: "pl.Trainer") -> None:
[2022-10-13T00:29:09.859Z] 201 # double dispatch to initiate the training loop
[2022-10-13T00:29:09.859Z] --> 202 self._results = trainer.run_stage()
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1289, in Trainer.run_stage(self)
[2022-10-13T00:29:09.859Z] 1287 if self.predicting:
[2022-10-13T00:29:09.859Z] 1288 return self._run_predict()
[2022-10-13T00:29:09.859Z] -> 1289 return self._run_train()
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1311, in Trainer._run_train(self)
[2022-10-13T00:29:09.859Z] 1308 if not self.is_global_zero and self.progress_bar_callback is not None:
[2022-10-13T00:29:09.859Z] 1309 self.progress_bar_callback.disable()
[2022-10-13T00:29:09.859Z] -> 1311 self._run_sanity_check(self.lightning_module)
[2022-10-13T00:29:09.859Z] 1313 # enable train mode
[2022-10-13T00:29:09.859Z] 1314 self.model.train()
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1375, in Trainer._run_sanity_check(self, ref_model)
[2022-10-13T00:29:09.859Z] 1373 # run eval step
[2022-10-13T00:29:09.859Z] 1374 with torch.no_grad():
[2022-10-13T00:29:09.859Z] -> 1375 self._evaluation_loop.run()
[2022-10-13T00:29:09.859Z] 1377 self.call_hook("on_sanity_check_end")
[2022-10-13T00:29:09.859Z] 1379 # reset logger connector
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/base.py:145, in Loop.run(self, *args, **kwargs)
[2022-10-13T00:29:09.859Z] 143 try:
[2022-10-13T00:29:09.859Z] 144 self.on_advance_start(*args, **kwargs)
[2022-10-13T00:29:09.859Z] --> 145 self.advance(*args, **kwargs)
[2022-10-13T00:29:09.859Z] 146 self.on_advance_end()
[2022-10-13T00:29:09.859Z] 147 self.restarting = False
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py:110, in EvaluationLoop.advance(self, *args, **kwargs)
[2022-10-13T00:29:09.859Z] 105 self.data_fetcher = dataloader = self.trainer._data_connector.get_profiled_dataloader(
[2022-10-13T00:29:09.859Z] 106 dataloader, dataloader_idx=dataloader_idx
[2022-10-13T00:29:09.859Z] 107 )
[2022-10-13T00:29:09.859Z] 108 dl_max_batches = self._max_batches[dataloader_idx]
[2022-10-13T00:29:09.859Z] --> 110 dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
[2022-10-13T00:29:09.859Z] 112 # store batch level output per dataloader
[2022-10-13T00:29:09.859Z] 113 self.outputs.append(dl_outputs)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/base.py:145, in Loop.run(self, *args, **kwargs)
[2022-10-13T00:29:09.859Z] 143 try:
[2022-10-13T00:29:09.859Z] 144 self.on_advance_start(*args, **kwargs)
[2022-10-13T00:29:09.859Z] --> 145 self.advance(*args, **kwargs)
[2022-10-13T00:29:09.859Z] 146 self.on_advance_end()
[2022-10-13T00:29:09.859Z] 147 self.restarting = False
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py:122, in EvaluationEpochLoop.advance(self, data_fetcher, dataloader_idx, dl_max_batches, num_dataloaders)
[2022-10-13T00:29:09.859Z] 120 # lightning module methods
[2022-10-13T00:29:09.859Z] 121 with self.trainer.profiler.profile("evaluation_step_and_end"):
[2022-10-13T00:29:09.859Z] --> 122 output = self._evaluation_step(batch, batch_idx, dataloader_idx)
[2022-10-13T00:29:09.859Z] 123 output = self._evaluation_step_end(output)
[2022-10-13T00:29:09.859Z] 125 self.batch_progress.increment_processed()
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py:217, in EvaluationEpochLoop._evaluation_step(self, batch, batch_idx, dataloader_idx)
[2022-10-13T00:29:09.859Z] 215 self.trainer.lightning_module._current_fx_name = "validation_step"
[2022-10-13T00:29:09.859Z] 216 with self.trainer.profiler.profile("validation_step"):
[2022-10-13T00:29:09.859Z] --> 217 output = self.trainer.accelerator.validation_step(step_kwargs)
[2022-10-13T00:29:09.859Z] 219 return output
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py:239, in Accelerator.validation_step(self, step_kwargs)
[2022-10-13T00:29:09.859Z] 234 """The actual validation step.
[2022-10-13T00:29:09.859Z] 235
[2022-10-13T00:29:09.859Z] 236 See :meth:`~pytorch_lightning.core.lightning.LightningModule.validation_step` for more details
[2022-10-13T00:29:09.859Z] 237 """
[2022-10-13T00:29:09.859Z] 238 with self.precision_plugin.val_step_context():
[2022-10-13T00:29:09.859Z] --> 239 return self.training_type_plugin.validation_step(*step_kwargs.values())
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py:219, in TrainingTypePlugin.validation_step(self, *args, **kwargs)
[2022-10-13T00:29:09.859Z] 218 def validation_step(self, *args, **kwargs):
[2022-10-13T00:29:09.859Z] --> 219 return self.model.validation_step(*args, **kwargs)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] Cell In [9], line 29, in Model.validation_step(self, batch, batch_idx)
[2022-10-13T00:29:09.859Z] 27 def validation_step(self, batch, batch_idx):
[2022-10-13T00:29:09.859Z] 28 y_hat, y = self.infer_batch(batch)
[2022-10-13T00:29:09.859Z] ---> 29 loss = self.criterion(y_hat, y)
[2022-10-13T00:29:09.859Z] 30 self.log('val_loss', loss)
[2022-10-13T00:29:09.859Z] 31 return loss
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
[2022-10-13T00:29:09.859Z] 1098 # If we don't have any hooks, we want to skip the rest of the logic in
[2022-10-13T00:29:09.859Z] 1099 # this function, and just call forward.
[2022-10-13T00:29:09.859Z] 1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
[2022-10-13T00:29:09.859Z] 1101 or _global_forward_hooks or _global_forward_pre_hooks):
[2022-10-13T00:29:09.859Z] -> 1102 return forward_call(*input, **kwargs)
[2022-10-13T00:29:09.859Z] 1103 # Do not call functions when jit is used
[2022-10-13T00:29:09.859Z] 1104 full_backward_hooks, non_full_backward_hooks = [], []
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/losses/dice.py:732, in DiceCELoss.forward(self, input, target)
[2022-10-13T00:29:09.859Z] 729 raise ValueError("the number of dimensions for input and target should be the same.")
[2022-10-13T00:29:09.859Z] 731 dice_loss = self.dice(input, target)
[2022-10-13T00:29:09.859Z] --> 732 ce_loss = self.ce(input, target)
[2022-10-13T00:29:09.859Z] 733 total_loss: torch.Tensor = self.lambda_dice * dice_loss + self.lambda_ce * ce_loss
[2022-10-13T00:29:09.859Z] 735 return total_loss
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /home/jenkins/agent/workspace/Monai-notebooks/MONAI/monai/losses/dice.py:715, in DiceCELoss.ce(self, input, target)
[2022-10-13T00:29:09.859Z] 709 warnings.warn(
[2022-10-13T00:29:09.859Z] 710 f"Multichannel targets are not supported in this older Pytorch version {torch.__version__}. "
[2022-10-13T00:29:09.859Z] 711 "Using argmax (as a workaround) to convert target to a single channel."
[2022-10-13T00:29:09.859Z] 712 )
[2022-10-13T00:29:09.859Z] 713 target = torch.argmax(target, dim=1)
[2022-10-13T00:29:09.859Z] --> 715 return self.cross_entropy(input, target)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
[2022-10-13T00:29:09.859Z] 1098 # If we don't have any hooks, we want to skip the rest of the logic in
[2022-10-13T00:29:09.859Z] 1099 # this function, and just call forward.
[2022-10-13T00:29:09.859Z] 1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
[2022-10-13T00:29:09.859Z] 1101 or _global_forward_hooks or _global_forward_pre_hooks):
[2022-10-13T00:29:09.859Z] -> 1102 return forward_call(*input, **kwargs)
[2022-10-13T00:29:09.859Z] 1103 # Do not call functions when jit is used
[2022-10-13T00:29:09.859Z] 1104 full_backward_hooks, non_full_backward_hooks = [], []
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/loss.py:1150, in CrossEntropyLoss.forward(self, input, target)
[2022-10-13T00:29:09.859Z] 1149 def forward(self, input: Tensor, target: Tensor) -> Tensor:
[2022-10-13T00:29:09.859Z] -> 1150 return F.cross_entropy(input, target, weight=self.weight,
[2022-10-13T00:29:09.859Z] 1151 ignore_index=self.ignore_index, reduction=self.reduction,
[2022-10-13T00:29:09.859Z] 1152 label_smoothing=self.label_smoothing)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] File /opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:2846, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
[2022-10-13T00:29:09.859Z] 2844 if size_average is not None or reduce is not None:
[2022-10-13T00:29:09.859Z] 2845 reduction = _Reduction.legacy_get_string(size_average, reduce)
[2022-10-13T00:29:09.859Z] -> 2846 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
[2022-10-13T00:29:09.859Z]
[2022-10-13T00:29:09.859Z] RuntimeError: Expected floating point type for target with class probabilities, got Byte
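The failing call is `F.cross_entropy` receiving a target with the same shape as the input (one-hot / class-probability layout) but stored as `uint8` ("Byte"); the class-probability path requires a floating-point target. Below is a minimal standalone sketch (not the notebook's code — the shapes and the one-hot target are illustrative assumptions) that reproduces the error and shows that casting the target to float avoids it:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(2, 3, 4, 4)  # (batch, classes, H, W)

# One-hot target stored as uint8, as medical label maps often are
target = F.one_hot(torch.randint(0, 3, (2, 4, 4)), num_classes=3)
target = target.permute(0, 3, 1, 2).to(torch.uint8)  # (batch, classes, H, W)

try:
    # Same dims as input -> class-probability path -> float dtype required
    F.cross_entropy(logits, target)
except RuntimeError as e:
    print(e)  # "Expected floating point type for target with class probabilities, got Byte"

# Casting the one-hot target to a floating dtype satisfies the check
loss = F.cross_entropy(logits, target.float())
print(loss.item())
```

This suggests the notebook's label tensors reach `DiceCELoss` as Byte tensors; converting them to float (or to single-channel class indices) before the loss call would be the expected remedy.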