Should pooling regions be identical to convolution regions? #1318
Description
Pooling and convolution both aggregate inputs over rectangular regions defined by kernel size, stride, and padding parameters. (In fact, average pooling is convolution with a fixed filter, and max pooling is a special case of "tropical convolution".)
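To make the parenthetical concrete, here is a minimal 1-D sketch in plain Python (not Caffe code; all function names are hypothetical and chosen for illustration). It assumes no padding and only evaluates regions that lie fully inside the input. Average pooling matches an ordinary convolution with a uniform filter of weight `1/k`, and max pooling matches a max-plus ("tropical") convolution with an all-zero filter.

```python
def conv1d(x, w, stride=1):
    """Ordinary sliding-window sum of products (no kernel flip, as in most deep learning libraries)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]


def tropical_conv1d(x, w, stride=1):
    """Max-plus analogue: products become sums, sums become max."""
    k = len(w)
    return [max(x[i + j] + w[j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]


def avg_pool1d(x, k, stride=1):
    """Average over each window of length k."""
    return [sum(x[i:i + k]) / k for i in range(0, len(x) - k + 1, stride)]


def max_pool1d(x, k, stride=1):
    """Maximum over each window of length k."""
    return [max(x[i:i + k]) for i in range(0, len(x) - k + 1, stride)]


x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
k, stride = 2, 2

# Average pooling == convolution with a fixed uniform filter.
avg_matches = avg_pool1d(x, k, stride) == conv1d(x, [1.0 / k] * k, stride)

# Max pooling == tropical convolution with an all-zero filter.
max_matches = max_pool1d(x, k, stride) == tropical_conv1d(x, [0.0] * k, stride)
```

Both helpers visit exactly the same set of windows here, which is the consistency this issue asks about.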
However, Caffe currently uses different sets of regions (and consequently produces different output sizes) for pooling and convolution. Convolution regions are never allowed to extend outside the padded input, while pooling is still performed when a strided pooling region extends off the end of the input.
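The difference can be sketched as a floor versus a ceiling in the output-size computation. The snippet below is a simplified Python model, not Caffe's actual code: the function names are hypothetical, it handles one spatial dimension, and it assumes `input_size + 2 * pad >= kernel`. The clipping step for padded pooling reflects my reading of the pooling layer setup and should be treated as an assumption.

```python
import math


def conv_output_size(input_size, kernel, stride=1, pad=0):
    """Floor: only regions fully inside the padded input are counted."""
    return (input_size + 2 * pad - kernel) // stride + 1


def pool_output_size(input_size, kernel, stride=1, pad=0):
    """Ceiling: a last region that runs off the end is still counted."""
    out = math.ceil((input_size + 2 * pad - kernel) / stride) + 1
    # Assumption: with padding, drop the last region if it would start
    # entirely inside the padding rather than inside the input.
    if pad > 0 and (out - 1) * stride >= input_size + pad:
        out -= 1
    return out


# Input 6, kernel 3, stride 2: the two rules disagree.
conv_small = conv_output_size(6, 3, stride=2)   # 2
pool_small = pool_output_size(6, 3, stride=2)   # 3

# Input 55, kernel 3, stride 2 (an AlexNet-like pooling shape):
# the stride divides evenly, so both rules agree.
conv_even = conv_output_size(55, 3, stride=2)   # 27
pool_even = pool_output_size(55, 3, stride=2)   # 27
```

The two functions agree whenever `input_size + 2 * pad - kernel` is a multiple of `stride`, which is why the discrepancy tends to go unnoticed.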
This inconsistency is a nuisance when exact sizes need to be computed and can vary (due to #594). (The issue doesn't normally come up with networks like AlexNet, which ensure that their pooling regions never hit this edge case.)
Should the behavior be made consistent, or is there a good reason for its current state?
(See also #988, but note that the expression presented there differs from both of the current behaviors.)