2975 Fix the perf issue of RandCropByPosNegLabel#3050
Nic-Ma merged 13 commits into Project-MONAI:dev
merge master
Signed-off-by: Nic Ma <[email protected]>
/black
Signed-off-by: Nic Ma <[email protected]>
Signed-off-by: Nic Ma <[email protected]>
/black
BTW, as the numpy version Thanks.
Signed-off-by: Nic Ma <[email protected]>
/black
So with the data on the GPU, we're only as fast as the numpy implementation with everything on the CPU?
We could change the logic to use torch only if the data is already on the GPU. If it is on the CPU, use numpy regardless of whether the input was a torch tensor or a numpy array:
    if isinstance(x, torch.Tensor) and x.device.type != "cpu":
        ...  # torch.unravel (torch-based unravel, data stays on the GPU)
    else:
        ...  # np.unravel (numpy-based unravel on the CPU)
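A minimal sketch of the dispatch suggested above, in case it helps the discussion. The helper name `unravel_flat_index` is hypothetical and not part of MONAI; the GPU branch uses plain integer arithmetic rather than any particular library routine, and the device check compares `device.type` because an identity test against a freshly created `torch.device("cpu")` object would not behave as intended.

```python
import numpy as np
import torch


def unravel_flat_index(idx, shape):
    """Hypothetical helper: convert a flat index into per-dimension coordinates.

    Torch is used only when the index already lives on a non-CPU device;
    otherwise numpy is used, whether the input was a tensor or not.
    """
    if isinstance(idx, torch.Tensor) and idx.device.type != "cpu":
        # Stay on the GPU: peel off one dimension at a time, last dimension first.
        coords = []
        for dim in reversed(shape):
            coords.append(idx % dim)
            idx = torch.div(idx, dim, rounding_mode="floor")
        return torch.stack(coords[::-1])
    if isinstance(idx, torch.Tensor):
        # CPU tensor: hand over to numpy.
        idx = idx.numpy()
    return np.asarray(np.unravel_index(idx, shape))


# Flat index 17 in a (2, 3, 4) volume is 1*12 + 1*4 + 1.
print(unravel_flat_index(torch.tensor(17), (2, 3, 4)))
```

Note that this sketch returns a tensor on the GPU path and a numpy array on the CPU path; a real implementation would probably want to normalize the return type.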
Hi @rijobro, thanks for your review. What do you think? Thanks.
@Nic-Ma sounds good, thanks for the explanations!
Signed-off-by: Nic Ma <[email protected]>
Fixes #2975.
Description
This PR is a follow-up to ticket #3038 and fixes the training slowdown issue.
The training speed now matches the numpy-version benchmark of the 0.7 release (56s-58s with 21.08 docker, 52s-54s with 21.06 docker).
The main change is to avoid storing the indices on the GPU, because we actually need to get the item() value on the CPU and use it to index the image for cropping.
Status
Ready
Types of changes
./runtests.sh -f -u --net --coverage
./runtests.sh --quick --unittests
make html command in the docs/ folder.
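To illustrate the main change noted in the Description (keeping the candidate indices on the CPU, since each sampled index has to become a plain integer before it can be used to crop), here is a rough numpy-only sketch. It is not the code in this PR: the function name `sample_crop_centers`, its parameters, and the foreground/background rule (`label > 0`) are assumptions made for illustration, and it expects a non-empty label array.

```python
import numpy as np


def sample_crop_centers(label, num_samples, pos_ratio=0.5, seed=0):
    """Hypothetical sketch: pick crop centers from flat indices held on the CPU."""
    rng = np.random.default_rng(seed)
    label = np.asarray(label)
    flat = label.ravel()
    # Flat index arrays stay as CPU numpy arrays, so reading one element is cheap.
    fg = np.nonzero(flat > 0)[0]
    bg = np.nonzero(flat <= 0)[0]

    centers = []
    for _ in range(num_samples):
        # Prefer foreground with probability pos_ratio; fall back if a pool is empty.
        use_fg = fg.size > 0 and (bg.size == 0 or rng.random() < pos_ratio)
        pool = fg if use_fg else bg
        idx = int(pool[rng.integers(pool.size)])
        # Convert the flat index to coordinates that can index the image directly.
        centers.append(tuple(int(c) for c in np.unravel_index(idx, label.shape)))
    return centers


label = np.zeros((4, 4), dtype=int)
label[1, 2] = 1
print(sample_crop_centers(label, num_samples=3, pos_ratio=1.0))
```

If the same indices were stored as GPU tensors, each `int(...)` conversion would force a device-to-host copy, which is the kind of overhead the Description refers to.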