C++ APIs TransformerEncoder #43187
Conversation
💊 CI failures summary (Dr. CI): as of commit e70c50b, there are no failures yet.
Differential Revision: [D23182770](https://our.internmc.facebook.com/intern/diff/D23182770)
```cpp
  transformer_decoder_layer_test_helper(true);
}

void transformer_decoder_layer_test_helper_gelu(bool is_cuda) {
```
Just wondering why we need this helper specifically for gelu? Any idea for relu?
This mirrors what we have in test_nn.py.
I merged them into one test function for the encoder layer, but the decoder layer was contributed by an external user who strictly followed test_nn.py. I will refactor this a bit later.
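For reference, a minimal sketch of what such a merged helper could look like, parameterizing the activation instead of duplicating the test body. This is illustrative only, not the PR's code; the option and module names assume the C++ transformer API in this stack.

```cpp
#include <torch/torch.h>

// Hypothetical merged helper: the only difference between the relu and gelu
// variants is the activation set on the layer options.
void transformer_decoder_layer_smoke_test(bool is_cuda, bool use_gelu) {
  torch::Device device(is_cuda ? torch::kCUDA : torch::kCPU);

  auto options =
      torch::nn::TransformerDecoderLayerOptions(/*d_model=*/4, /*nhead=*/2)
          .dropout(0.0);
  if (use_gelu) {
    options.activation(torch::kGELU);
  }

  torch::nn::TransformerDecoderLayer layer(options);
  layer->to(device);

  auto tgt = torch::rand({2, 3, 4}, device);     // (T, N, E)
  auto memory = torch::rand({5, 3, 4}, device);  // (S, N, E)
  auto out = layer->forward(tgt, memory);
  TORCH_CHECK(out.sizes() == tgt.sizes(), "unexpected output shape");
}
```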
```cpp
    : d_model_(d_model), nhead_(nhead) {}

TransformerEncoderOptions::TransformerEncoderOptions(
```
Should we publish the transformer C++ API later? The transformer model in general comes with a decoder module, which I guess will be published later.
@zhangguanheng66 The current situation is that we have two people working on the transformer. The decoder will be in another PR. We are pushing this first so it is easy for the decoder developer to sync, and we will have the Transformer top-level module next week.
zou3519 left a comment:
(Not a full review) Asking some questions.
```cpp
  Tensor forward(
      const Tensor& src,
      const Tensor& src_mask = {},
```
The second argument is called `mask` in Python (pytorch/torch/nn/modules/transformer.py, line 167 at cbdaa20):

```python
def forward(self, src: Tensor, mask: Optional[Tensor] = None, src_key_padding_mask: Optional[Tensor] = None) -> Tensor:
```

I do prefer src_mask because it makes it clear what is getting masked...
@zhangguanheng66 I changed the name to src_mask for the same reason as @zou3519. Agree?
I think it makes sense to call them src_mask and tgt_mask, along with src_key_padding_mask and tgt_key_padding_mask (consistent with the Python ones).
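For illustration, a minimal usage sketch of the C++ encoder layer with both mask arguments. Shapes and defaults here mirror the Python semantics; treat it as a sketch, not the PR's test code.

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  torch::nn::TransformerEncoderLayer layer(
      torch::nn::TransformerEncoderLayerOptions(/*d_model=*/8, /*nhead=*/2));

  auto src = torch::rand({4, 2, 8});  // (S, N, E)

  // Additive attention mask of shape (S, S); all zeros means "attend
  // everywhere", standing in for Python's None default.
  auto src_mask = torch::zeros({4, 4});

  // Key-padding mask of shape (N, S); false keeps the position.
  auto src_key_padding_mask = torch::zeros({2, 4}, torch::kBool);

  // Both mask arguments default to empty tensors, matching Python's None.
  auto out = layer->forward(src, src_mask, src_key_padding_mask);
  std::cout << out.sizes() << std::endl;
  return 0;
}
```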
```cpp
// a. No way to know whether module in AnyModule has api to reset_parameters, so replace instead
// b. Allow user to add/delete normalization module when reset parameters
```
When does reset_parameters() get called? Is it a user API or something that is used in the framework?
@zou3519 As I mentioned before, this is a convenience function for resetting parameters within the module. It is not necessary to call it, and I believe it won't be called in most cases.
Hmm, so the thing I am confused about here is:
- There's no reset_parameters() or _reset_parameters on TransformerEncoderLayer in Python. Some of our Python nn.Modules do have a reset_parameters() that re-initializes parameters.

And one potential concern: let's say you have an optimizer that takes in a list of parameters (disclaimer: I have no clue how optimizers in C++ work). Is the following possible?
- I create a TransformerEncoderLayer in C++.
- I pass the parameters of the TransformerEncoderLayer to an optimizer (in C++).
- Then, I call reset_parameters() on the TransformerEncoderLayer.
- The optimizer holds onto references to the original parameters (and not the new ones!).
This is not possible, since the underlying C++ module is a shared pointer, and the optimizer doesn't clone the module but just shallow-copies it, so this case should be OK.
But reset_parameters() itself is tricky, I admit that. Even in the Python implementation, reset_parameters() has different usage in different modules, is not always there, and does not always get called. There is a PR trying to standardize it by making reset_parameters() virtual and moving it to the base class, but that change doesn't make sense, since half of the modules don't have it and the virtual call adds extra cost. So currently, reset_parameters() is just an optional function.
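To make the scenario concrete, here is a sketch of the sequence described above, assuming reset_parameters() is exposed on the C++ encoder module as discussed in this PR; it is not a definitive statement of the landed API.

```cpp
#include <torch/torch.h>

int main() {
  // Build a TransformerEncoder from layer options.
  torch::nn::TransformerEncoder encoder(torch::nn::TransformerEncoderOptions(
      torch::nn::TransformerEncoderLayerOptions(/*d_model=*/8, /*nhead=*/2),
      /*num_layers=*/2));

  // The optimizer is constructed with the module's parameter tensors.
  torch::optim::SGD optimizer(encoder->parameters(), /*lr=*/0.01);

  // Re-initialize parameters after the optimizer was created. Per the
  // discussion above, the module is held by shared pointer and the optimizer
  // does not clone it, so this sequence is expected to be fine.
  encoder->reset_parameters();

  // A training-style step still runs against the module's parameters.
  optimizer.zero_grad();
  auto out = encoder->forward(torch::rand({4, 2, 8}));  // (S, N, E)
  out.sum().backward();
  optimizer.step();
  return 0;
}
```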
```cpp
// This constructor will create a new TransformerEncoderLayer obj based on passed in encoder_layer_options.
TransformerEncoderOptions(const TransformerEncoderLayerOptions& encoder_layer_options, int64_t num_layers);
```
This also keeps a shallow copy of the data in encoder_layer, right?
@zou3519
No, TransformerEncoder will use encoder_layer_options as input, allocating a new TransformerEncoderLayer.
This is the major difference between these two constructors.
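To illustrate the difference, a sketch of the two construction paths. The option and constructor names follow this stack; treat it as a sketch rather than the final API.

```cpp
#include <torch/torch.h>

int main() {
  auto layer_opts =
      torch::nn::TransformerEncoderLayerOptions(/*d_model=*/8, /*nhead=*/2);

  // Path 1: from layer options. The encoder allocates a brand-new
  // TransformerEncoderLayer internally from these options.
  torch::nn::TransformerEncoder enc_from_options(
      torch::nn::TransformerEncoderOptions(layer_opts, /*num_layers=*/3));

  // Path 2: from an already-constructed layer module.
  torch::nn::TransformerEncoderLayer layer(layer_opts);
  torch::nn::TransformerEncoder enc_from_layer(
      torch::nn::TransformerEncoderOptions(layer, /*num_layers=*/3));

  auto src = torch::rand({4, 2, 8});  // (S, N, E)
  auto out = enc_from_options->forward(src);
  return 0;
}
```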
Oh I see now, thanks for the clarification
Codecov Report

```
@@             Coverage Diff              @@
##   gh/glaringlee/25/base   #43187   +/- ##
=============================================
  Coverage       69.40%    69.40%
=============================================
  Files             378       378
  Lines           46606     46606
=============================================
  Hits            32346     32346
  Misses          14260     14260
```

Continue to review the full report at Codecov.
@glaringlee merged this pull request in 48e08f8.
Stack from ghstack:
Differential Revision: D23182770