Memory Format support for Resnet models #23403

@VitalyFedyunin

Description

We define a 4D tensor as stored in the channels last memory format when the dimension order is NCHW and C-strides < W-strides < H-strides < N-strides (if any dimension has size 1, that dimension's stride is not taken into account).

A channels last contiguous tensor is a channels last tensor that occupies a contiguous memory block, so x.is_contiguous(memory_format=torch.channels_last) checks whether a tensor is channels last contiguous.
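
For illustration, a minimal Python sketch of both definitions (assuming a PyTorch build with memory format support):

import torch

# Logical dimension order stays NCHW; only the memory layout becomes NHWC.
x = torch.randn(2, 3, 4, 5).contiguous(memory_format=torch.channels_last)

print(x.stride())  # (60, 1, 15, 3): C-stride < W-stride < H-stride < N-stride
print(x.is_contiguous())                                   # False
print(x.is_contiguous(memory_format=torch.channels_last))  # True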

The goal of the experiment is to use the channels last memory format in all operators of the Resnet models (https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) and to measure the performance gains on Volta devices with the cuDNN library available.

This experiment requires:

  1. Update operator kernels to follow this rule: if any of an operator's inputs is a channels last tensor, all outputs should also be in the channels last memory format.
  2. For better performance gains, update DataLoader to output channels last tensors (see the sketch after this list).
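
As a hypothetical sketch of the second point (channels_last_collate is an assumed name, not an existing API), the conversion could happen in a custom collate function:

import torch
from torch.utils.data import DataLoader

def channels_last_collate(batch):
    # Hypothetical collate_fn: stack the samples, then convert the image
    # batch to the channels last memory format before it reaches the model.
    images = torch.stack([sample[0] for sample in batch])
    targets = torch.tensor([sample[1] for sample in batch])
    return images.contiguous(memory_format=torch.channels_last), targets

# loader = DataLoader(dataset, batch_size=32, collate_fn=channels_last_collate)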

To avoid changing the model itself and, more importantly, to bring this optimization to existing saved models, we need to introduce the following changes.

Writing memory-format-aware operators requires the special functions introduced in #23391:

auto memory_format = input_tensor.suggest_memory_format();
auto output_tensor = at::empty(output_shape, memory_format);

switch (memory_format) {
  case MemoryFormat::ChannelsLast: {
    // If the kernel requires a memory-contiguous tensor, make the input
    // contiguous in the channels last format first.
    auto input_cl_contiguous =
        input_tensor.contiguous(MemoryFormat::ChannelsLast);
    // .... kernel code
    break;
  }
  case MemoryFormat::Contiguous: {
    // .... standard kernel
    break;
  }
  default:
    TORCH_CHECK(
        false,
        "Unsupported memory format. Supports only ChannelsLast, Contiguous");
}
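
From the Python side, the rule from point 1 above looks like this (expected behavior once the operator kernels are updated; the actual output format depends on the build):

import torch

conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(1, 3, 32, 32).contiguous(memory_format=torch.channels_last)

out = conv(x)
# A channels last input should yield a channels last output.
print(out.is_contiguous(memory_format=torch.channels_last))  # expected: True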

Notes

  • Resnet calls x = x.reshape(x.size(0), -1) before the linear layers; we are going to update the reshape and view code to convert the tensor's memory format to torch.contiguous_format at this step (see the sketch after this list).
  • Making old models/libraries work faster also requires BC sacrifices. For example, empty_like (and all _like operators) will return a channels last tensor if the input is channels last; the same applies to to, clone, and resize_as. We are thinking about the ability to control the suggest_memory_format behavior via a global variable.
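
A minimal sketch of the semantics both notes propose (this illustrates the intended behavior; what a given PyTorch build actually does may differ):

import torch

x = torch.randn(8, 512, 7, 7).contiguous(memory_format=torch.channels_last)

# _like operators would preserve the input's memory format.
y = torch.empty_like(x)
print(y.is_contiguous(memory_format=torch.channels_last))  # expected: True

# reshape before the linear layers would hand back a contiguous_format tensor.
flat = x.reshape(x.size(0), -1)
print(flat.is_contiguous())  # expected: True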

Metadata

Labels

module: cudnn (related to torch.backends.cudnn and cuDNN support)
module: memory format (memory format/layout related issues/changes: channels_last, nhwc)
triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
