A common pattern in PyTorch is to have two implementations of a function with different signatures:
- func: upsample_nearest1d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> Tensor
- func: upsample_nearest1d(Tensor self, int[1] output_size, float? scales=None) -> Tensor
Typically, one of these functions is implemented in terms of the other:
Tensor upsample_nearest1d(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::upsample_nearest1d(input, osize, scale_w);
}
Now, there is a very irritating problem with upsample_nearest1d as written here: it necessitates two dispatches, once to the wrapper function (shown above), and then once again when we call at::upsample_nearest1d. Alternatively, we could write multiple copies of the wrapper function and bypass the second dispatch (using #49505):
Tensor upsample_nearest1d_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cpu::upsample_nearest1d(input, osize, scale_w);
}
But this is irritating, and in the worst case has to be done per backend (CPU, CUDA) and per variant (out, functional, inplace). Oof!
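To make the blow-up concrete, here is a sketch of two more of the copies we would end up writing by hand. The at::cpu::/at::cuda:: namespaces are the dispatch-bypass entry points from #49505, but the name and argument order of the out variant below are my assumption, for illustration only.

// Hand-written copy for CUDA (sketch; mirrors the CPU copy above).
Tensor upsample_nearest1d_cuda(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cuda::upsample_nearest1d(input, osize, scale_w);
}

// Hand-written copy for the CPU out variant (sketch; the exact name and
// argument order of at::cpu::upsample_nearest1d_out are assumptions).
Tensor& upsample_nearest1d_out_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors,
    Tensor& out) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cpu::upsample_nearest1d_out(out, input, osize, scale_w);
}

...and so on for every remaining (backend, variant) pair.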
What you would like to do, instead, is describe how to transform the (functional) input arguments from the wrapper function to the real function, and then automatically generate all of the variants.
I'm a little uncertain what the parameters of this transformation should be. The easiest way to implement the transformation is to insert C++ code directly into native_functions.yaml, and then generate the multiple copies directly from this code:
- func: upsample_nearest1d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> Tensor
  structured_wrapper: |
    auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
    auto scale_w = get_scale_value(scale_factors, 0);
    return upsample_nearest1d(input, osize, scale_w);
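The code generator could then stamp out one copy of this body per backend and per variant, substituting the appropriate direct call target. As a sketch, the generated CPU copy might look like the following (hypothetical generated output; the naming scheme is my assumption):

// Hypothetical codegen output for the CPU backend (sketch only).
Tensor upsample_nearest1d_vec_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  // The unqualified call in the yaml snippet is rewritten into a direct
  // call to the backend entry point, skipping the second dispatch.
  return at::cpu::upsample_nearest1d(input, osize, scale_w);
}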
This would set a new precedent that it is OK to put C++ code inside native_functions.yaml. Maybe you do not like it, and would like the conversion code to live in C++. Unfortunately, I'm not too sure how to do this: recall that the class hierarchy looks like:
upsample_nearest1d
+- structured_upsample_nearest1d_cpu
   +- structured_upsample_nearest1d_cpu_out
   +- structured_upsample_nearest1d_cpu_inplace
   +- structured_upsample_nearest1d_cpu_functional
+- structured_upsample_nearest1d_cuda
   +- structured_upsample_nearest1d_cuda_out
   +- structured_upsample_nearest1d_cuda_inplace
   +- structured_upsample_nearest1d_cuda_functional
There is no logical place to interpose an adapter in the class hierarchy here.
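To see why, consider a rough sketch of the shape of these generated classes (simplified; the real generated code carries more machinery, and the base class spelled out here is only indicative). Every level of the hierarchy already commits to the translated signature, so an adapter that accepts the untranslated .vec arguments has nowhere to slot in:

// Simplified sketch of the generated structured-kernel classes.
// The meta class fixes the post-translation signature...
struct structured_upsample_nearest1d : public at::impl::MetaBase {
  void meta(const Tensor& self, IntArrayRef output_size,
            c10::optional<double> scales);
};

// ...and each backend/variant subclass works with that same signature.
struct structured_upsample_nearest1d_cpu
    : public structured_upsample_nearest1d {
  void impl(const Tensor& self, IntArrayRef output_size,
            c10::optional<double> scales, const Tensor& out);
};

// A .vec adapter would have to accept
// (c10::optional<IntArrayRef>, c10::optional<ArrayRef<double>>)
// and translate them before meta() runs, but no class in this hierarchy
// ever sees the untranslated arguments, so there is no place to put it.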