Skip to content

Change batched NMS threshold for-loop version#4990

Merged
fmassa merged 3 commits intomainfrom
ppwwyyxx-patch-1
Dec 1, 2021
Merged

Change batched NMS threshold for-loop version#4990
fmassa merged 3 commits intomainfrom
ppwwyyxx-patch-1

Conversation

@ppwwyyxx
Copy link
Copy Markdown
Contributor

According to the benchmark link #1311 (comment), for GPU the threshold should be higher (which is why detectron2 used 40k).
This PR changes the threshold to be device dependent.

According to the benchmark link #1311 (comment), for GPU the threshold should be higher (which is why detectron2 used 40k).
This PR changes the threshold to be device dependent.
@facebook-github-bot
Copy link
Copy Markdown
Contributor

facebook-github-bot commented Nov 24, 2021

💊 CI failures summary and remediations

As of commit 51b90a7 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job Step Action
CircleCI binary_linux_conda_py3.6_cu111 packaging/build_conda.sh 🔁 rerun

1 job timed out:

  • binary_linux_conda_py3.6_cu111

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@ppwwyyxx ppwwyyxx changed the title Change batched NMS threshold to choose for-loop version Change batched NMS threshold for-loop version Nov 24, 2021
Copy link
Copy Markdown
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @ppwwyyxx

We decided to use the same threshold for CPU and GPU to keep things as simple as possible and avoid potential surprises.

Do you have a use-case where there are more than 4k boxes on a GPU?

@ppwwyyxx
Copy link
Copy Markdown
Contributor Author

LVIS has 1200 classes. So in the last NMS of a faster r-cnn trained on LVIS it's common to have 10k-30k boxes.

Copy link
Copy Markdown
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details @ppwwyyxx . The changes LGTM but I'm wondering, did you observe a significant time improvement with the new threshold?

@ppwwyyxx
Copy link
Copy Markdown
Contributor Author

IIRC we added thresholding in detectron2 because batched_nms OOM for a specific model with too many boxes. Then we did some benchmarks and determined 40k threshold for GPU. This is similar to the conclusion in #1311.
D17157130 has some discussions at the time.

Copy link
Copy Markdown
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @ppwwyyxx !

I believe this is a net improvement. I think the previous threshold was indeed not ideal for GPU, and pytorch/benchmark#547 (comment) seems to indicate that.

@fmassa fmassa merged commit 0aa3717 into main Dec 1, 2021
@fmassa fmassa deleted the ppwwyyxx-patch-1 branch December 1, 2021 13:54
@github-actions
Copy link
Copy Markdown

github-actions bot commented Dec 1, 2021

Hey @fmassa!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit to facebookresearch/detectron2 that referenced this pull request Dec 1, 2021
Summary:
torchvision already has the thresholding logic.
The threshold in torchvision is not good, which will be fixed in pytorch/vision#4990

Reviewed By: zhanghang1989

Differential Revision: D32661585

fbshipit-source-id: 971b68a914c2bba29dff37298c38e31b02735c1d
facebook-github-bot pushed a commit that referenced this pull request Dec 2, 2021
Summary:
According to the benchmark link #1311 (comment), for GPU the threshold should be higher (which is why detectron2 used 40k).
This PR changes the threshold to be device dependent.

Reviewed By: NicolasHug

Differential Revision: D32759198

fbshipit-source-id: df6fff4b7fed27e41d9cfeb51d17a94b4511b4e7

Co-authored-by: Nicolas Hug <[email protected]>
Co-authored-by: Francisco Massa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants