Clarify interpolation algorithms for resample2d #816
fdwr merged 9 commits into webmachinelearning:main from inexorabletash:resample-algos
fdwr
left a comment
Nice. Thanks for improving the spec. 🙏
<div class="note">
The specific sampling algorithms are based on those widely used in existing Machine Learning frameworks. For example, when performing {{MLInterpolationMode/linear}} resampling from the following *[4, 4]* input tensor (considering only spatial dimensions):
The specific sampling algorithms are based on those widely used in existing Machine Learning frameworks.
Note
Historically, some ML libraries got this wrong: they stretched the centers of the input corner pixels to the centers of the output corner pixels, rather than aligning the corner extents (treating each pixel as a whole box rectangle rather than just a point sample), which graphics experts know is incorrect 😉 and gives you poor results. Imaging libraries like OpenCV do the right thing, though, and thankfully newer versions of TF and PyTorch have fixed this behavior by default, e.g. #1 #2.
(no action - resolve me)
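To make the difference between the two conventions concrete, here is a minimal 1-D sketch. It is not the spec text: the function names `source_coordinate` and `resample_linear_1d` are hypothetical, the `align_corners` flag name is borrowed purely for illustration, and clamping at the edges is an assumption of this sketch.

```python
import math


def source_coordinate(out_index, in_size, out_size, align_corners):
    """Map an output pixel index to a (fractional) input coordinate."""
    if align_corners:
        # Center-to-center: the corner pixel centers of input and output coincide.
        if out_size == 1:
            return 0.0
        return out_index * (in_size - 1) / (out_size - 1)
    # Half-pixel centers: the corner *extents* coincide, so each pixel is
    # treated as a box whose center sits at index + 0.5.
    return (out_index + 0.5) * in_size / out_size - 0.5


def resample_linear_1d(values, out_size, align_corners=False):
    """Linearly resample a 1-D list of numbers (hypothetical helper)."""
    in_size = len(values)
    result = []
    for out_index in range(out_size):
        x = source_coordinate(out_index, in_size, out_size, align_corners)
        # Assumption: coordinates outside the input are clamped to the edge.
        x = min(max(x, 0.0), in_size - 1)
        x0 = math.floor(x)
        x1 = min(x0 + 1, in_size - 1)
        t = x - x0
        result.append(values[x0] * (1 - t) + values[x1] * t)
    return result


print(resample_linear_1d([0.0, 10.0], 4))                      # half-pixel centers
print(resample_linear_1d([0.0, 10.0], 4, align_corners=True))  # center-to-center
```

With the half-pixel mapping, the two interior outputs sit a quarter of the way in from each input pixel center; with the center-to-center mapping, the whole output is stretched so that its end samples land exactly on the input end samples.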
If we want to say more in the spec we can!
Updated comment with visualization - think it would help? I should probably recreate it from scratch to avoid directly reusing Jacob Richeimer's figure https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/.
I'm thinking it will be hard to capture more of the history here without it turning into an essay equivalent to these linked resources. Maybe we should just link to these blog posts? @anssiko - any thoughts on non-normative links to potentially ephemeral resources?
No need to stall on this aspect. Happy to merge if you say go.
huningxin
left a comment
LGTM with nits, thanks much!
Co-authored-by: Dwayne Robinson <[email protected]>
Co-authored-by: Ningxin Hu <[email protected]>
If it still looks good, @fdwr, can you squash-merge?
SHA: 4c34b9e Reason: push, by fdwr Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
👍
It seems this is not what Chrome on Windows does: nearest-neighbor of [0, 1, 2, 3, 4, 5] to the shape 21 gives [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5] and for nearest-neighbor of [0, 1, 2, 3, 4, 5] to the shape 3: Some testing of all input sizes and output sizes between 1 and 51 shows that Chrome does something like


This gives formal definitions for the `nearest-neighbor` and `linear` interpolation modes. The definitions are based on text given by @fdwr and a baseline implementation by @BruceDai, and were independently verified.

Resolves #358