Add MobileNetV3 architecture for Segmentation#3276
fmassa
left a comment
Looks great, thanks!
I only have a couple of minor (non-blocking) comments. The only thing I would really like to see fixed before merge is to have the correct python -m torch.distributed.launch ... commands for reproducibility.
    self.block = nn.Sequential(*layers)
    self.out_channels = cnf.out_channels
    self.is_strided = cnf.stride > 1
    self._is_cn = cnf.stride > 1
Out of curiosity, what does cn mean here?
It's from the C0, C1 ... C5, Cn names used in Object Detection. I use this attribute internally to find where the downsampling was supposed to happen, but downsampling is not always done with strides, so I had to rename it. If you have a better name for it, I'm happy to change it. I could not think of any...
Thanks for the explanation. Given that this is private, I'm fine with this name.
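For readers following the thread, here is a minimal sketch of how a flag like _is_cn can locate the stages where downsampling was meant to happen, even when dilation has replaced the stride. The Block class and find_stage_indices helper are hypothetical stand-ins written for illustration, not the code from this PR.

```python
# Toy stand-in for a backbone block (hypothetical; not the PR's class).
class Block:
    def __init__(self, stride, dilation=1):
        self.stride = stride
        self.dilation = dilation
        # Record where downsampling was *supposed* to happen, based on the
        # requested stride, before dilation (if any) replaces that stride.
        self._is_cn = stride > 1
        if dilation > 1:
            # With dilation the block keeps the resolution, so the
            # effective stride becomes 1 while the flag stays set.
            self.stride = 1


def find_stage_indices(blocks):
    """Return indices of blocks flagged as the start of a new stage."""
    return [i for i, b in enumerate(blocks) if getattr(b, "_is_cn", False)]


blocks = [Block(2), Block(1), Block(2), Block(1), Block(2, dilation=2), Block(1)]
print(find_stage_indices(blocks))  # [0, 2, 4]
```

Checking stride > 1 directly would miss the dilated block at index 4, which is why a separate flag is useful here.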
    backbone for Detection and Segmentation.
    """
    # non-public config parameters
    reduce_divider = 2 if kwargs.pop('_reduced_tail', False) else 1
Is this feature used in any of the models? Otherwise we can just remove it
This is a unique implementation detail from the paper on MobileNetV3 models and it's supposed to produce a further speed optimization on object detection and segmentation. In our training scripts we don't use it because we do transfer learning from ImageNet but if someone really wants to train it from scratch and go smaller I provide a way to do it.
On current master this is public (see the reduced_tail param), but here I decided to hide it before the release and make it an internal implementation detail for future models. I'm not quite convinced we will use it, but I want to provide an implementation very close to the paper.
Personally I would prefer to keep it hidden for now and decide later whether we want this gone. Let me know.
Sounds good, I'm ok with keeping this private for now and maybe removing it in the future.
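As a rough illustration of the pattern discussed in this thread, the sketch below pops a private _reduced_tail keyword and uses the resulting divider to halve the tail widths. The build_tail_config function and its dictionary keys are hypothetical, and the channel counts (160 and 1280) are assumed values chosen for the example rather than taken from the diff.

```python
def build_tail_config(**kwargs):
    """Hypothetical helper showing how a private kwarg can shrink the tail."""
    # Pop the non-public option so it is not forwarded to other constructors.
    reduce_divider = 2 if kwargs.pop("_reduced_tail", False) else 1

    # Illustrative widths only (assumed for this example).
    tail = {
        "block_out_channels": 160 // reduce_divider,
        "last_channel": 1280 // reduce_divider,
    }
    # Return the remaining kwargs so the caller can pass them on unchanged.
    return tail, kwargs


default_tail, _ = build_tail_config()
reduced_tail, remaining = build_tail_config(_reduced_tail=True, num_classes=21)
print(default_tail)
print(reduced_tail, remaining)
```

Because the option is read with pop and a default of False, callers who never pass it get the full-width tail, and the leading underscore signals that the keyword is not part of the public API.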
Summary:
* Making _segm_resnet() generic and reusable.
* Adding fcn and deeplabv3 directly on mobilenetv3 backbone.
* Adding tests for segmentation models.
* Rename is_strided to _is_cn.
* Add dilation support on MobileNetV3 for Segmentation.
* Add Lite R-ASPP with MobileNetV3 backbone.
* Add pretrained model weights.
* Removing model fcn_mobilenet_v3_large.
* Adding docs and imports.
* Fixing typo and readme.

Reviewed By: datumbox
Differential Revision: D26156380
fbshipit-source-id: e62528b52728804a40da79c1311562a7f1c2afbd
Adding MobileNetV3 models for Semantic Segmentation (resolution 520):
Lite R-ASPP with Dilated MobileNetV3 Large Backbone
Heavily optimized for speed. Good for actual mobile usage.
Weight checkpoint:
Validate:
Accuracy metrics:
Speed Benchmark:
0.3278 sec per image on CPU
DeepLabV3 with Dilated MobileNetV3 Large Backbone
Offers good balance between speed and accuracy, significantly faster than the FCN model with a resnet50 backbone without sacrificing too much accuracy.
Weight checkpoint:
Validate:
Accuracy metrics:
Speed Benchmark:
0.5869 sec per image on CPU
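The notes above do not say how the per-image timings were measured. A minimal sketch of one way to get a comparable "seconds per image" figure follows; the seconds_per_image helper, the warmup count, and the dummy predict function are all assumptions for illustration, and a real run would pass the model's forward call and real inputs instead.

```python
import time


def seconds_per_image(predict, images, warmup=1):
    """Average wall-clock seconds per call of predict over images."""
    # Run a few untimed calls first so one-off setup costs are excluded.
    for image in images[:warmup]:
        predict(image)

    start = time.perf_counter()
    for image in images:
        predict(image)
    elapsed = time.perf_counter() - start
    return elapsed / len(images)


def dummy_predict(image):
    # Placeholder workload standing in for a model forward pass.
    return sum(image)


images = [list(range(1000)) for _ in range(10)]
avg = seconds_per_image(dummy_predict, images)
print(f"{avg:.6f} sec per image")
```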