MonaiAlgo: fix logging in multi-gpu training #5355

Merged
wyli merged 2 commits into Project-MONAI:dev from holgerroth:5354-monaialgo-reduce-parallel-logging on Oct 19, 2022

holgerroth (Collaborator) commented Oct 18, 2022

Signed-off-by: Holger Roth [email protected]

Fixes #5354.

Description

Previous output:

2022-10-18 12:45:59,679 - MonaiAlgo - INFO - Using multi-gpu training on rank 1 (available devices: 2)
2022-10-18 12:45:59,681 - MonaiAlgo - INFO - Using multi-gpu training on rank 0 (available devices: 2)
2022-10-18 12:49:48,790 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.802879669048168
2022-10-18 12:49:48,790 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.802879669048168
2022-10-18 12:49:56,579 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.1470419466495514
2022-10-18 12:49:56,579 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.1470419466495514

Output after fix:

2022-10-18 12:51:05,400 - MonaiAlgo - INFO - Using multi-gpu training on rank 0 (available devices: 2)
2022-10-18 12:51:05,410 - MonaiAlgo - INFO - Using multi-gpu training on rank 1 (available devices: 2)
2022-10-18 12:53:09,889 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.6750877521656178
2022-10-18 12:53:25,170 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.06980131566524506
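
For readers who want to reproduce the effect, here is a minimal sketch of one common way to get this behaviour: keep INFO logging on rank 0 and raise the logger level on every other rank. It is not taken from this PR's diff. The helper names `make_rank_logger` and `simulate`, and the choice of WARNING as the threshold, are assumptions made for illustration, and the ranks are simulated in a single process with the standard `logging` module rather than with real distributed workers.

```python
import io
import logging


def make_rank_logger(name: str, rank: int, stream: io.StringIO) -> logging.Logger:
    """Hypothetical helper: INFO on rank 0, WARNING and above on other ranks."""
    # Each simulated rank gets its own logger object; in a real multi-process
    # run every rank would configure its own copy of the same logger instead.
    logger = logging.getLogger(f"{name}.rank{rank}")
    logger.handlers.clear()
    logger.propagate = False

    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger.addHandler(handler)

    # Assumption: non-zero ranks only report warnings and errors.
    logger.setLevel(logging.INFO if rank == 0 else logging.WARNING)
    return logger


def simulate(world_size: int = 2) -> str:
    """Log the same INFO message from every simulated rank and return the output."""
    stream = io.StringIO()
    for rank in range(world_size):
        logger = make_rank_logger("SupervisedTrainer", rank, stream)
        logger.info("Got new best metric of train_accuracy: 0.5")
    return stream.getvalue()


if __name__ == "__main__":
    print(simulate(world_size=2), end="")
```

With two simulated ranks the message is written once instead of twice, while warnings and errors from any rank would still be emitted.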

Types of changes

  • [x] Non-breaking change (fix or new feature that would not break existing functionality).
  • [x] Breaking change (fix or new feature that would cause existing functionality to change).
  • [ ] New tests added to cover the changes.
  • [ ] Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • [ ] Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • [ ] In-line docstrings updated.
  • [ ] Documentation updated, tested make html command in the docs/ folder.

Signed-off-by: Holger Roth <[email protected]>
holgerroth added the enhancement label Oct 18, 2022
holgerroth requested review from Nic-Ma and wyli October 18, 2022 16:57
wyli (Contributor) commented Oct 19, 2022

/build

wyli enabled auto-merge (squash) October 19, 2022 07:13
wyli merged commit 2a46e7d into Project-MONAI:dev Oct 19, 2022
wyli pushed a commit that referenced this pull request Oct 19, 2022
Signed-off-by: Holger Roth <[email protected]>

Fixes #5354.

### Description

Previous output:
```
2022-10-18 12:45:59,679 - MonaiAlgo - INFO - Using multi-gpu training on rank 1 (available devices: 2)
2022-10-18 12:45:59,681 - MonaiAlgo - INFO - Using multi-gpu training on rank 0 (available devices: 2)
2022-10-18 12:49:48,790 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.802879669048168
2022-10-18 12:49:48,790 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.802879669048168
2022-10-18 12:49:56,579 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.1470419466495514
2022-10-18 12:49:56,579 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.1470419466495514
```
Output after fix:
```
2022-10-18 12:51:05,400 - MonaiAlgo - INFO - Using multi-gpu training on rank 0 (available devices: 2)
2022-10-18 12:51:05,410 - MonaiAlgo - INFO - Using multi-gpu training on rank 1 (available devices: 2)
2022-10-18 12:53:09,889 - ignite.engine.engine.SupervisedTrainer - INFO - Got new best metric of train_accuracy: 0.6750877521656178
2022-10-18 12:53:25,170 - ignite.engine.engine.SupervisedEvaluator - INFO - Got new best metric of val_mean_dice: 0.06980131566524506
```

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [x] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Integration tests passed locally by running `./runtests.sh -f -u --net --coverage`.
- [ ] Quick tests passed locally by running `./runtests.sh --quick --unittests --disttests`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated, tested `make html` command in the `docs/` folder.

Signed-off-by: Holger Roth <[email protected]>
holgerroth deleted the 5354-monaialgo-reduce-parallel-logging branch October 19, 2022 15:02
KumoLiu pushed a commit that referenced this pull request Nov 2, 2022
Signed-off-by: KumoLiu <[email protected]>
Labels: enhancement

Issues that merging this pull request may close: Reduce redundant logging in multi-gpu training with MonaiAlgo