A common pattern in PyTorch is to have two implementations of a function with different signatures:
- func: upsample_nearest1d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> Tensor
- func: upsample_nearest1d(Tensor self, int[1] output_size, float? scales=None) -> Tensor
Typically, one of these functions is implemented in terms of the other:
Tensor upsample_nearest1d(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::upsample_nearest1d(input, osize, scale_w);
}
Now, there is a very irritating problem with upsample_nearest1d as written here: it necessitates two dispatches, once to the wrapper function (shown above), and then once again when we call at::upsample_nearest1d. Alternatively, we could write multiple copies of the wrapper function and bypass the second dispatch (using #49505):
Tensor upsample_nearest1d_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cpu::upsample_nearest1d(input, osize, scale_w);
}
But this is irritating, and in the worst case has to be done per backend (CPU, CUDA) and per variant (out, functional, inplace). Oof!
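To make the blow-up concrete, here is a sketch of two more of the copies we would end up writing by hand. The at::cpu::/at::cuda:: namespaces are the dispatch-bypass entry points from #49505, but the name and argument order of the out variant below are my assumption, for illustration only.

// Hand-written copy for CUDA (sketch; mirrors the CPU copy above).
Tensor upsample_nearest1d_cuda(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cuda::upsample_nearest1d(input, osize, scale_w);
}

// Hand-written copy for the CPU out variant (sketch; the exact name and
// argument order of at::cpu::upsample_nearest1d_out are assumptions).
Tensor& upsample_nearest1d_out_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors,
    Tensor& out) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  return at::cpu::upsample_nearest1d_out(out, input, osize, scale_w);
}

...and so on for every remaining (backend, variant) pair.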
What you would like to do, instead, is describe how to transform the (functional) input arguments from the wrapper function to the real function, and then automatically generate all of the variants.
I'm a little uncertain what the parameters of this transformation should be. The easiest way to implement the transformation is to insert C++ code directly into native_functions.yaml, and then generate the multiple copies directly from this code:
- func: upsample_nearest1d.vec(Tensor input, int[]? output_size, float[]? scale_factors) -> Tensor
  structured_wrapper: |
    auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
    auto scale_w = get_scale_value(scale_factors, 0);
    return upsample_nearest1d(input, osize, scale_w);
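The code generator could then stamp out one copy of this body per backend and per variant, substituting the appropriate direct call target. As a sketch, the generated CPU copy might look like the following (hypothetical generated output; the naming scheme is my assumption):

// Hypothetical codegen output for the CPU backend (sketch only).
Tensor upsample_nearest1d_vec_cpu(
    const Tensor& input,
    c10::optional<IntArrayRef> output_size,
    c10::optional<ArrayRef<double>> scale_factors) {
  auto osize = compute_output_size(input.sizes(), output_size, scale_factors);
  auto scale_w = get_scale_value(scale_factors, 0);
  // The unqualified call in the yaml snippet is rewritten into a direct
  // call to the backend entry point, skipping the second dispatch.
  return at::cpu::upsample_nearest1d(input, osize, scale_w);
}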
This would set a new precedent that it is OK to put C++ code inside native_functions.yaml. Maybe you do not like it, and would like the conversion code to live in C++. Unfortunately, I'm not too sure how to do this: recall that the class hierarchy looks like:
upsample_nearest1d
+- structured_upsample_nearest1d_cpu
   +- structured_upsample_nearest1d_cpu_out
   +- structured_upsample_nearest1d_cpu_inplace
   +- structured_upsample_nearest1d_cpu_functional
+- structured_upsample_nearest1d_cuda
   +- structured_upsample_nearest1d_cuda_out
   +- structured_upsample_nearest1d_cuda_inplace
   +- structured_upsample_nearest1d_cuda_functional
There is no logical place to interpose an adapter in the class hierarchy here.
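To see why, consider a rough sketch of the shape of these generated classes (simplified; the real generated code carries more machinery, and the base class spelled out here is only indicative). Every level of the hierarchy already commits to the translated signature, so an adapter that accepts the untranslated .vec arguments has nowhere to slot in:

// Simplified sketch of the generated structured-kernel classes.
// The meta class fixes the post-translation signature...
struct structured_upsample_nearest1d : public at::impl::MetaBase {
  void meta(const Tensor& self, IntArrayRef output_size,
            c10::optional<double> scales);
};

// ...and each backend/variant subclass works with that same signature.
struct structured_upsample_nearest1d_cpu
    : public structured_upsample_nearest1d {
  void impl(const Tensor& self, IntArrayRef output_size,
            c10::optional<double> scales, const Tensor& out);
};

// A .vec adapter would have to accept
// (c10::optional<IntArrayRef>, c10::optional<ArrayRef<double>>)
// and translate them before meta() runs, but no class in this hierarchy
// ever sees the untranslated arguments, so there is no place to put it.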