Currently, the PyTorch C++ API is missing many `torch::nn` layers that are available in the Python API. As part of the Python/C++ API parity work, we would like to add the following `torch::nn` modules and utilities to the C++ API:
Containers
- Module (TODO: some APIs are missing in C++, e.g. `register_forward_hook` / `register_forward_pre_hook`)
- Sequential (@ShahriarSS)
- ModuleList (@ShahriarSS)
- ModuleDict (@ejguan Implement C++ ModuleDict #47707)
- ParameterList (@yyn19951228 Impl for ParameterList #41259)
- ParameterDict (@yyn19951228 Python/C++ API Parity: Add impl and tests for ParameterDict #40654)
Convolution layers
- Conv1d (@yf225 C++/Python API parity for Conv{1,2,3}d layers, and add F::conv{1,2,3}d functionals #28917)
- Conv2d (@yf225 C++/Python API parity for Conv{1,2,3}d layers, and add F::conv{1,2,3}d functionals #28917)
- Conv3d (@yf225 C++/Python API parity for Conv{1,2,3}d layers, and add F::conv{1,2,3}d functionals #28917)
- ConvTranspose1d (@nuka137 C++ API: torch::nn::ConvTranspose{1,2,3}d #29721)
- ConvTranspose2d (@nuka137 C++ API: torch::nn::ConvTranspose{1,2,3}d #29721)
- ConvTranspose3d (@nuka137 C++ API: torch::nn::ConvTranspose{1,2,3}d #29721)
- Unfold (@jon-tow C++ API parity: Unfold #27809)
- Fold (@ShahriarSS C++ Fold nn module #24160)
Pooling layers
- MaxPool1d (@ShahriarSS C++ MaxPool Module #24860)
- MaxPool2d (@ShahriarSS C++ MaxPool Module #24860)
- MaxPool3d (@ShahriarSS C++ MaxPool Module #24860)
- MaxUnpool1d (@pbelevich C++ API parity: MaxUnpool1d #26896)
- MaxUnpool2d (@pbelevich C++ API parity: MaxUnpool2d #26915)
- MaxUnpool3d (@pbelevich C++ API parity: MaxUnpool3d #27027)
- AvgPool1d (@ShahriarSS C++ Average Pool Module #25800)
- AvgPool2d (@ShahriarSS C++ Average Pool Module #25800)
- AvgPool3d (@ShahriarSS C++ Average Pool Module #25800)
- FractionalMaxPool2d (@yf225 [C++ API] torch::nn::FractionalMaxPool{2,3}d module and functional #29933)
- FractionalMaxPool3d (@yf225 [C++ API] torch::nn::FractionalMaxPool{2,3}d module and functional #29933)
- LPPool1d (@nuka137 C++ API: torch::nn::LPPool1d #27800)
- LPPool2d (@nuka137 C++ API: torch::nn::LPPool2d #28492)
- AdaptiveMaxPool1d (@pbelevich C++ API parity: AdaptiveMaxPool1d #26755)
- AdaptiveMaxPool2d (@pbelevich C++ API parity: AdaptiveMaxPool2d #26772)
- AdaptiveMaxPool3d (@pbelevich C++ API parity: AdaptiveMaxPool3d #26775)
- AdaptiveAvgPool1d (@pbelevich C++ API parity: AdaptiveAvgPool1d #26808)
- AdaptiveAvgPool2d (@pbelevich C++ API parity: AdaptiveAvgPool2d #26818)
- AdaptiveAvgPool3d (@pbelevich C++ API parity: AdaptiveAvgPool3d #26819)
Padding layers
- ReflectionPad1d (@yf225 C++ nn::ReflectionPad1d and nn::ReflectionPad2d #28538)
- ReflectionPad2d (@yf225 C++ nn::ReflectionPad1d and nn::ReflectionPad2d #28538)
- ReplicationPad1d (@yf225 C++ nn::ReplicationPad{1,2,3}d #28539)
- ReplicationPad2d (@yf225 C++ nn::ReplicationPad{1,2,3}d #28539)
- ReplicationPad3d (@yf225 C++ nn::ReplicationPad{1,2,3}d #28539)
- ZeroPad2d (@yf225 C++ nn::ZeroPad2d #28540)
- ConstantPad1d (@yf225 C++ nn::ConstantPad{1,2,3}d #28541)
- ConstantPad2d (@yf225 C++ nn::ConstantPad{1,2,3}d #28541)
- ConstantPad3d (@yf225 C++ nn::ConstantPad{1,2,3}d #28541)
Non-linear activations (weighted sum, nonlinearity)
- ELU (@pbelevich C++ API parity: ELU #27028)
- Hardshrink (@pbelevich C++ API parity: Hardshrink #27035)
- Hardtanh (@pbelevich C++ API parity: Hardtanh #27038)
- LeakyReLU (@pbelevich C++ API parity: LeakyReLU #27059)
- LogSigmoid (@pbelevich C++ API parity: LogSigmoid #27060)
- MultiheadAttention (@pbelevich C++ API parity: MultiheadAttention #27309)
- PReLU (@pbelevich C++ API parity: PReLU #27429)
- ReLU (@pbelevich C++ API parity: ReLU #27435)
- ReLU6 (@pbelevich C++ API parity: ReLU6 #27436)
- RReLU (@pbelevich C++ API parity: RReLU #27437)
- SELU (@jon-tow C++ API parity: SELU #27434)
- CELU (@pbelevich C++ API parity: CELU #27487)
- GELU (@BIT-silence Add torch.nn.GELU for GELU activation #28944)
- Sigmoid (@pbelevich C++ API parity: Sigmoid #27488)
- Softplus (@pbelevich C++ API parity: Softplus #27489)
- Softshrink (@pbelevich C++ API parity: Softshrink #27534)
- Softsign (@pbelevich C++ API parity: Softsign #27535)
- Tanh (@pbelevich C++ API parity: Tanh #27536)
- Tanhshrink (@pbelevich C++ API parity: Tanhshrink #27537)
- Threshold (@pbelevich C++ API parity: Threshold #27538)
- GLU (@yf225 [C++ API] torch::nn::GLU and F::glu #29922)
- SiLU (@heitorschueroff Added SiLU activation function #41034)
Non-linear activations (other)
- Softmin (@nuka137 C++ API: torch::nn::Softmin #27459)
- Softmax (@nuka137 C++ API: torch::nn::Softmax #27446)
- Softmax2d (@nuka137 C++ API: torch::nn::Softmax2d #27509)
- LogSoftmax (@nuka137 C++ API: torch::nn::LogSoftmax #27462)
- AdaptiveLogSoftmaxWithLoss (@mansoorcheema [C++ API] AdaptiveLogSoftmaxWithLoss #29076)
Normalization layers
- BatchNorm1d (@nuka137 C++ API: torch::nn::BatchNorm1d #28176)
- BatchNorm2d (@nuka137 C++ API: torch::nn::BatchNorm{2,3}d #28936)
- BatchNorm3d (@nuka137 C++ API: torch::nn::BatchNorm{2,3}d #28936)
- GroupNorm (@yf225 [C++ API] torch::nn::GroupNorm and F::group_norm #29920)
- SyncBatchNorm (Needs distributed data parallel support)
- InstanceNorm1d (@divyanshsinghvi [C++ API] InstanceNorm{1,2,3}d #28790)
- InstanceNorm2d (@divyanshsinghvi [C++ API] InstanceNorm{1,2,3}d #28790)
- InstanceNorm3d (@divyanshsinghvi [C++ API] InstanceNorm{1,2,3}d #28790)
- LayerNorm (@anjali411 torch::nn::LayerNorm #28032)
- LocalResponseNorm (@mansoorcheema Local response norm #28759)
- CrossMapLRN2d (@lsrock1 C++ parity, nn::CrossMapLRN2d #29039)
Recurrent layers
- RNN (@yf225 [C++ API] RNN / GRU / LSTM layer refactoring #34322)
- LSTM (@yf225 [C++ API] RNN / GRU / LSTM layer refactoring #34322)
- GRU (@yf225 [C++ API] RNN / GRU / LSTM layer refactoring #34322)
- RNNCell (@yf225 [C++ API] RNNCell / LSTMCell / GRUCell layers #34400)
- LSTMCell (@yf225 [C++ API] RNNCell / LSTMCell / GRUCell layers #34400)
- GRUCell (@yf225 [C++ API] RNNCell / LSTMCell / GRUCell layers #34400)
Transformer layers
- Transformer (@glaringlee C++ APIs Transformer NN Module Top Layer #44333)
- TransformerEncoder (@glaringlee C++ APIs TransformerEncoder #43187)
- TransformerDecoder (@VinodSKumar Python/C++ API Parity: TransformerDecoder #42886)
- TransformerEncoderLayer (@glaringlee C++ API TransformerEncoderLayer #42633)
- TransformerDecoderLayer (@VinodSKumar Python/C++ API Parity: TransformerDecoderLayer #42717)
Linear layers
- Identity (@jon-tow Add C++ nn::Identity #26713)
- Linear (@pbelevich [BC-breaking] C++ API parity: Linear #27382)
- Bilinear (@MJ10 [C++] Add nn.Bilinear to C++ Frontend #26082)
- Flatten (@mrsalehi Add nn::Flatten to C++ Frontend #28072)
- Unflatten (@heitorschueroff Implemented torch::nn::Unflatten in libtorch #42613)
Dropout layers
- Dropout (@pbelevich C++ API parity: Dropout, Dropout2d, Dropout3d #29761)
- Dropout2d (@pbelevich C++ API parity: Dropout, Dropout2d, Dropout3d #29761)
- Dropout3d (@pbelevich C++ API parity: Dropout, Dropout2d, Dropout3d #29761)
- AlphaDropout (@suyash458 C++/Python API Parity: add AlphaDropout and FeatureAlphaDropout #28424)
- FeatureAlphaDropout (@suyash458 C++/Python API Parity: add AlphaDropout and FeatureAlphaDropout #28424)
Sparse layers
- Embedding (@anjali411 Implement torch.nn.Embedding / EmbeddingBag in PyTorch C++ API #26358)
- EmbeddingBag (@anjali411 Implement torch.nn.Embedding / EmbeddingBag in PyTorch C++ API #26358)
Distance functions
- CosineSimilarity (@jon-tow [C++ API] Distance module #26424)
- PairwiseDistance (@jon-tow [C++ API] Distance module #26424)
Loss functions
- L1Loss (@ShahriarSS [C++ API] L1Loss module #25902)
- MSELoss (@ShahriarSS Adding MSELoss, KLDivLoss and BCELoss to C++ front-end #27156)
- CrossEntropyLoss (@PyExtreme C++ API parity: NLLLoss & CrossEntropyLoss #29812)
- CTCLoss (@pbelevich C++ API parity: CTCLoss #28654)
- NLLLoss (@PyExtreme C++ API parity: NLLLoss & CrossEntropyLoss #29812)
- PoissonNLLLoss (@pbelevich C++ API parity: PoissonNLLLoss #28755)
- KLDivLoss (@ShahriarSS Adding MSELoss, KLDivLoss and BCELoss to C++ front-end #27156)
- BCELoss (@ShahriarSS Adding MSELoss, KLDivLoss and BCELoss to C++ front-end #27156)
- BCEWithLogitsLoss (@pbelevich C++ API parity: BCEWithLogitsLoss #28783)
- MarginRankingLoss (@pbelevich C++ API parity: MarginRankingLoss #29000)
- HingeEmbeddingLoss (@jon-tow Add C++ torch::nn::HingeEmbeddingLoss #27101)
- MultiLabelMarginLoss (@CarMiranda [C++ API parity] Multi Label Margin loss #27659)
- SmoothL1Loss (@CarMiranda [C++ API parity] Smooth L1 loss #27661)
- SoftMarginLoss (@CarMiranda [C++ API parity] Soft Margin loss #27660)
- MultiLabelSoftMarginLoss (@CarMiranda [C++ API parity] Multi-Label Soft Margin loss #27669)
- CosineEmbeddingLoss (@jon-tow Add C++ torch::nn::CosineEmbeddingLoss #27345)
- MultiMarginLoss (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.MultiMarginLoss #27198) (@CarMiranda Implement C++ API torch::nn::MultiMarginLoss. #27424)
- TripletMarginLoss (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.TripletMarginLoss #27197) (@PyExtreme [C++ API Parity] - TripletMarginLoss #27713)
Vision layers
- PixelShuffle (@CarMiranda [C++ API parity] PixelShuffle module and functional #28140)
- Upsample (along with `torch::nn::functional::interpolate`) (@jon-tow C++ API parity: Upsample #28413)
Utilities
- clip_grad_norm_ (@rohan-varma [c++ api] Add clip_grad_norm_ to c++ api #26140)
- clip_grad_value_ (@jokerkeny Add C++ API clip_grad_value_ for nn:utils #28736)
- parameters_to_vector (@lsrock1 C++ parity, convert_parameters #29267)
- vector_to_parameters (@lsrock1 C++ parity, convert_parameters #29267)
- weight_norm (@bernsm3, has dependency on `Module._forward_pre_hooks`)
- remove_weight_norm (@bernsm3, has dependency on `Module._forward_pre_hooks`)
- spectral_norm (has dependency on `Module._forward_pre_hooks`)
- remove_spectral_norm (has dependency on `Module._forward_pre_hooks`)
- PackedSequence (@yf225 [C++ API] Add PackedSequence / pack_padded_sequence / pad_packed_sequence / pack_sequence #33652)
- pack_padded_sequence (@yf225 [C++ API] Add PackedSequence / pack_padded_sequence / pad_packed_sequence / pack_sequence #33652)
- pad_packed_sequence (@yf225 [C++ API] Add PackedSequence / pack_padded_sequence / pad_packed_sequence / pack_sequence #33652)
- pad_sequence (@suyash458 C++/Python API Parity: add pad_sequence #32387)
- pack_sequence (@yf225 [C++ API] Add PackedSequence / pack_padded_sequence / pad_packed_sequence / pack_sequence #33652)
torch.nn.functional
- normalize (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.functional.normalize #27048) (@gtamba Add nn::functional::normalize() to C++ Frontend #27280)
- gumbel_softmax (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.functional.gumbel_softmax #27078) (@Naresh1318 torch.nn.functional.gumbel_softmax #27078 #28121)
- one_hot (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.functional.one_hot #27081) (@bzinodev Implement C++ API version of torch.nn.functional.one_hot (#27081) #27177)
- pdist (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.functional.pdist #27082) (@jon-tow Add C++ torch::nn::functional::pdist #27122)
- grid_sample (@lsrock1 C++ parity, grid_sample functional #28354)
- affine_grid (please see details in [Contributor Welcome] Implement C++ API version of torch.nn.functional.affine_grid #27196) (@jon-tow Add C++ torch::nn::functional::affine_grid #27263)
- gelu (@anjali411 [C++ API Parity, F::gelu] #28433)
Implementation Notes:
- For `torch.nn` modules, the Python and C++ implementations must differ only in language-specific syntax; their data member fields, control flow, and logic must be exactly the same.
- You might see that some of the `torch.nn` modules call the corresponding `torch.nn.functional` functions. When implementing the C++ version of those modules, we should also add the corresponding `torch::nn::functional` functions, in order to preserve the call structure.
- When you are adding a new C++ `torch::nn` module (a skeleton sketch follows this sub-list):
  - The new `torch::nn` module must subclass from `public Cloneable<ModuleName>`. For example, `class TORCH_API LinearImpl : public Cloneable<LinearImpl>` (in `torch/csrc/api/include/torch/nn/modules/linear.h`).
  - How do we decide which file to put the new C++ `torch::nn` module in? Answer: we should look at where the Python version of the module is located and mirror that file structure. For example, the Python version of `torch.nn.Linear` lives in `torch/nn/modules/linear.py`, so the C++ version `torch::nn::Linear` should live in `torch/csrc/api/include/torch/nn/modules/linear.h`.
  - If you add a new module header, you must also add the include statement for the new header in `torch/csrc/api/include/torch/nn/modules.h`.
  - If you add a new module `.cpp` file, you must also add it to `caffe2/CMakeLists.txt` and `tools/build_variables.py`. (Please search for other modules in those files to see how this should be done.)
  - You must add tests for the new module in `test/cpp/api/modules.cpp`. In particular, make sure the module's `pretty_print` is tested and outputs the same value as the Python version.
  - If the module is templatized (e.g. `MaxPoolImpl`), you must add explicit template instantiation in the module's `.cpp` file (e.g. search for `template class` in `torch/csrc/api/src/nn/modules/pooling.cpp` to see how this is done).
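  A minimal sketch of that skeleton, using a hypothetical `MyScale` module (the layer and all `MyScale*` names are illustrative, not a module from the list above):

  ```cpp
  #include <torch/torch.h>

  // Hypothetical module: scales the input elementwise by a learned weight.
  class MyScaleImpl : public torch::nn::Cloneable<MyScaleImpl> {
   public:
    explicit MyScaleImpl(int64_t num_features) : num_features_(num_features) {
      reset();
    }

    // Cloneable requires reset(); it must (re)create all parameters/buffers.
    void reset() override {
      weight = register_parameter("weight", torch::ones({num_features_}));
    }

    // Must print the same representation as the Python version's repr.
    void pretty_print(std::ostream& stream) const override {
      stream << "MyScale(num_features=" << num_features_ << ")";
    }

    torch::Tensor forward(const torch::Tensor& input) {
      return input * weight;
    }

    torch::Tensor weight;

   private:
    int64_t num_features_;
  };

  // Generates the `MyScale` module holder that wraps `MyScaleImpl`.
  TORCH_MODULE(MyScale);
  ```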
- When you are adding a new C++ `torch::nn::functional` function (a sketch follows this sub-list):
  - How do we decide which file to put the new C++ `torch::nn::functional` function in? Answer: we should look at where the corresponding `torch::nn` module is located. For example, `torch::nn::PairwiseDistance` lives in `torch/csrc/api/include/torch/nn/modules/distance.h` (which, following the rule above, is determined by `torch/nn/modules/distance.py`), so `torch::nn::functional::pairwise_distance` should live in `torch/csrc/api/include/torch/nn/functional/distance.h`.
  - You must add tests for the new functional in `test/cpp/api/functional.cpp`.
  - If you add a new header file for the functional, you must add the include statement for the header file in `torch/csrc/api/include/torch/nn/functional.h`.
  - The name of the functional must match the Python version.
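  Continuing the hypothetical `MyScale` example, the matching functional might be sketched as follows (the header path and all names are illustrative):

  ```cpp
  // Hypothetical file: torch/csrc/api/include/torch/nn/functional/myscale.h
  #pragma once

  #include <torch/types.h>

  namespace torch {
  namespace nn {
  namespace functional {

  // The name must match the (hypothetical) Python functional F.my_scale.
  inline Tensor my_scale(const Tensor& input, const Tensor& weight) {
    return input * weight;
  }

  } // namespace functional
  } // namespace nn
  } // namespace torch
  ```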
- When you are adding new module/functional options (a sketch follows this sub-list):
  - If the functional options only has optional arguments, the functional's function signature should have `options = {}`, to allow users to call the functional without passing options. And we must add a test for this.
  - If the functional/module options only has one non-optional argument, we need an implicit constructor for the options that takes only the non-optional argument (i.e. `/* implicit */ Options(value_type value);`), so that we are able to do `functional(input, options_arg_value)` / `auto m = Module(options_arg_value)` instead of `functional(input, Options(options_arg_value))` / `auto m = Module(Options(options_arg_value))`. And we must add a test for this. (See `DropoutOptions` in `torch/csrc/api/include/torch/nn/options/dropout.h` as an example.)
  - For an optional Tensor argument, we should use `torch::Tensor` (with `Tensor()` as the default "null" value), and don't use `c10::optional<Tensor>`.
  - If the options is templatized (e.g. `MaxPoolOptions` in `torch/csrc/api/include/torch/nn/options/pooling.h`), you must add explicit template instantiation in the options' `.cpp` file (search for `template struct` in `torch/csrc/api/src/nn/options/pooling.cpp` to see how this is done).
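  A sketch of those options conventions for the hypothetical `MyScale` module (`DropoutOptions` is the real reference; all `MyScale*` names are illustrative):

  ```cpp
  #include <torch/arg.h>
  #include <torch/types.h>

  struct MyScaleOptions {
    // Implicit single-argument constructor, so users can write
    // MyScale(4) instead of MyScale(MyScaleOptions(4)).
    /* implicit */ MyScaleOptions(int64_t num_features)
        : num_features_(num_features) {}

    TORCH_ARG(int64_t, num_features);

    // Optional Tensor argument: a default-constructed Tensor() is the "null" value.
    TORCH_ARG(torch::Tensor, weight) = torch::Tensor();
  };

  // A functional whose options are all optional defaults them, e.g.:
  // inline Tensor my_functional(const Tensor& input, MyFunctionalOptions options = {});
  ```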
- Do not use `torch::IntArrayRef`; use `std::vector<int64_t>` instead.
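  For example, a list-valued options field stores an owning `std::vector<int64_t>` (mirroring how `LayerNormOptions` stores its `normalized_shape` field):

  ```cpp
  // Owning storage: safe to keep inside an options struct.
  TORCH_ARG(std::vector<int64_t>, normalized_shape);
  // torch::IntArrayRef is a non-owning view over someone else's storage,
  // so storing it in an options struct risks dangling references.
  ```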
How do I run tests?
- Follow https://github.com/pytorch/pytorch#from-source to build your branch of PyTorch from source.
- To test `test/cpp/api/modules.cpp`, run `./build/bin/test_api --gtest_filter=ModulesTest* --gtest_stack_trace_depth=10 --gmock_verbose=info` (a sketch of such a test follows below).
- To test `test/cpp/api/functional.cpp`, run `./build/bin/test_api --gtest_filter=FunctionalTest* --gtest_stack_trace_depth=10 --gmock_verbose=info`.
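As a rough sketch, a `test/cpp/api/modules.cpp` test for the hypothetical `MyScale` module above could look like this (the `ModulesTest` fixture and the `c10::str`-based `pretty_print` check follow the existing tests in that file):

```cpp
TEST_F(ModulesTest, MyScale) {
  MyScale model(4);
  auto input = torch::ones({2, 4});
  auto output = model(input);

  ASSERT_EQ(output.sizes(), std::vector<int64_t>({2, 4}));
  ASSERT_TRUE(output.allclose(input));  // weight is initialized to ones

  // pretty_print output must match the Python version's repr.
  ASSERT_EQ(c10::str(model), "MyScale(num_features=4)");
}
```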
Please ask for @yf225's review when you open a PR to add a torch::nn module from this list.
Tracking post on PyTorch forum: https://discuss.pytorch.org/t/55650
cc @yf225