Is your feature request related to a problem? Please describe.
This would be a useful tool to have in the box, both on its own and as part of a larger algorithm such as #315.
A similar issue, for more context:
kornia/kornia#702
Describe the solution you'd like
It would be nice to have an implementation that can operate on an arbitrary number of spatial dimensions and channels.
Because of the high computational load, I think dropping down to C++ and CUDA could be worth the extra work.