Improve sliding-window inference #6009

Merged
wyli merged 8 commits into Project-MONAI:dev from dongyang0122:update_swinferer
Feb 16, 2023
Conversation

@dongyang0122
Collaborator

Description

Improve sliding-window inference with enhanced GPU memory efficiency.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests --disttests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

Signed-off-by: dongy <[email protected]>
Contributor

@wyli wyli left a comment

Please help double-check, @Nic-Ma @mingxin-zheng.

Signed-off-by: dongy <[email protected]>
@dongyang0122 dongyang0122 self-assigned this Feb 15, 2023
@dongyang0122
Collaborator Author

@wyli one of the tests failed. I don't know what the root cause is.

@mingxin-zheng
Contributor

I just re-ran the test premerge / quick-py3 (macOS-latest) (pull_request) to make sure the failure is not related to test resource limitations.

@mingxin-zheng
Contributor

The code looks fine to me, but I don't have the big picture of how it improves GPU efficiency. @dongyang0122 Can you comment or link to an issue?

@dongyang0122
Collaborator Author

> The code looks fine to me, but I don't have the big picture of how it improves GPU efficiency. @dongyang0122 Can you comment or link to an issue?

@mingxin-zheng output_image_list[ss] is a multi-channel tensor, while count_map_list.pop(0) is a single-channel tensor. Dividing one by the other directly would create multiple copies of count_map_list.pop(0), which takes up large chunks of GPU memory when the number of channels is large. Moreover, torch.isnan(output_i).any() seems to take much more GPU memory when output_i is a 5D array. Doing both in a for loop would reduce the cost.
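
For readers following along, the idea above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: it assumes the loop runs over the channel dimension (dim 1) of a (B, C, *spatial) tensor, and the helper names normalize_in_place and has_nan_per_channel are hypothetical.

```python
import torch


def normalize_in_place(output: torch.Tensor, count_map: torch.Tensor) -> torch.Tensor:
    """Divide `output` (B, C, *spatial) by `count_map` (1, 1, *spatial), one channel at a time.

    Hypothetical helper for illustration only.
    """
    for c in range(output.shape[1]):
        # output[:, c] is a view of shape (B, *spatial); dividing it in place
        # avoids allocating a second tensor as large as the full output.
        output[:, c].div_(count_map[:, 0])
    return output


def has_nan_per_channel(output: torch.Tensor) -> bool:
    """Check for NaNs channel by channel, so the temporary boolean mask
    covers one channel instead of the whole 5D tensor.

    Hypothetical helper for illustration only.
    """
    for c in range(output.shape[1]):
        if torch.isnan(output[:, c]).any():
            return True
    return False
```

Both helpers trade one large temporary for several smaller ones, which is the kind of saving the comment describes. Whether this matches the exact loop in the PR should be checked against the diff.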

@wyli
Contributor

wyli commented Feb 16, 2023

/build

@wyli wyli enabled auto-merge (squash) February 16, 2023 17:39
@wyli wyli merged commit 11745a6 into Project-MONAI:dev Feb 16, 2023
@dongyang0122 dongyang0122 deleted the update_swinferer branch February 22, 2023 15:04