Optimize RetinaNet inference time

## 🚀 Feature

The postprocessing step in RetinaNet is slow: as of today, end-to-end RetinaNet inference takes almost twice as long as Faster R-CNN.
In particular, https://github.com/pytorch/vision/blob/5bb81c8e008c601237e2ffd5cbea7192775130bd/torchvision/models/detection/retinanet.py#L442-L471 runs a Python for loop over the classes. This loop can be replaced by batching the operations over all classes at once, which should greatly improve inference speed.

For reference, Detectron2 has sped up RetinaNet inference several times already; the latest optimization is in https://github.com/facebookresearch/detectron2/commit/8999946492ae6930a4b312dbbea952e326a9f1df . It also batches inference over classes and only loops over the feature maps, whose number is much smaller than the number of COCO classes.
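
To make the "loop over feature maps, batch over classes" idea concrete, here is a minimal sketch of per-level candidate selection. It is not taken from Detectron2 or torchvision: the helper name `select_candidates_per_level` and the `score_thresh` / `topk` defaults are assumptions for illustration, and the input is assumed to be one `[num_anchors, num_classes]` score tensor per feature level.

```python
import torch


def select_candidates_per_level(scores_per_level, score_thresh=0.05, topk=1000):
    """Hypothetical helper: loop over feature levels, batch over classes.

    scores_per_level: list of [num_anchors_l, num_classes] score tensors,
    one per feature map. Returns parallel lists (one entry per level) of
    candidate scores, level-local anchor indices, and class labels.
    """
    all_scores, all_anchor_idxs, all_labels = [], [], []
    for scores in scores_per_level:
        num_classes = scores.shape[1]
        # Flat index = anchor_idx * num_classes + class_idx.
        flat = scores.flatten()

        # Threshold once over all classes, then keep at most `topk` survivors.
        cand_idxs = torch.where(flat > score_thresh)[0]
        cand_scores = flat[cand_idxs]
        k = min(topk, cand_idxs.numel())
        cand_scores, order = cand_scores.topk(k)
        cand_idxs = cand_idxs[order]

        # Recover (anchor, class) pairs from the flat index.
        all_scores.append(cand_scores)
        all_anchor_idxs.append(
            torch.div(cand_idxs, num_classes, rounding_mode="floor"))
        all_labels.append(cand_idxs % num_classes)
    return all_scores, all_anchor_idxs, all_labels
```

With this layout the Python loop runs once per feature level (typically a handful) rather than once per class, and only the selected anchors need to be decoded into boxes before a single class-aware NMS call.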

	for class_index in range(num_classes):
	    # remove low scoring boxes
	    inds = torch.gt(scores_per_image[:, class_index], self.score_thresh)
	    boxes_per_class, scores_per_class, labels_per_class = \
	        boxes_per_image[inds], scores_per_image[inds, class_index], labels_per_image[inds, class_index]
	    other_outputs_per_class = [(k, v[inds]) for k, v in other_outputs_per_image]

	    # remove empty boxes
	    keep = box_ops.remove_small_boxes(boxes_per_class, min_size=1e-2)
	    boxes_per_class, scores_per_class, labels_per_class = \
	        boxes_per_class[keep], scores_per_class[keep], labels_per_class[keep]
	    other_outputs_per_class = [(k, v[keep]) for k, v in other_outputs_per_class]

	    # non-maximum suppression, independently done per class
	    keep = box_ops.nms(boxes_per_class, scores_per_class, self.nms_thresh)

	    # keep only topk scoring predictions
	    keep = keep[:self.detections_per_img]
	    boxes_per_class, scores_per_class, labels_per_class = \
	        boxes_per_class[keep], scores_per_class[keep], labels_per_class[keep]
	    other_outputs_per_class = [(k, v[keep]) for k, v in other_outputs_per_class]

	    image_boxes.append(boxes_per_class)
	    image_scores.append(scores_per_class)
	    image_labels.append(labels_per_class)

	    for k, v in other_outputs_per_class:
	        if k not in image_other_outputs:
	            image_other_outputs[k] = []
	        image_other_outputs[k].append(v)

Optimize RetinaNet inference time #2799
