Python/C++ API Parity: TransformerDecoderLayer #42717
Conversation
glaringlee left a comment
@VinodSKumar Thank you so much for making this change!
Can you tell me how you sent the pull request to the pytorch master branch? Your PR did not trigger our CI (continuous integration) tests, so neither you nor I can tell whether your code runs under different environments.
Here is an example PR that triggered our CI tests: check PR 42742 (prepend https://github.com/pytorch/pytorch/pull/). You can see at the bottom that it triggered many CI tests covering different environments, whereas your PR triggered only lint checks and some Jenkins checks, so I am wondering how you created your PR against the pytorch master branch.
Can you please rebase your code? I checked in the encoder layer, so you don't need CMakeLists.txt and enum.h/cpp, I guess.
Force-pushed from a95acb5 to 7587be2
💊 CI failures summary and remediations: as of commit 54a0872 (more details on the Dr. CI page), 💚 looks good so far, there are no failures yet. (This comment was automatically generated by Dr. CI and has been revised 26 times.)
Force-pushed from 7587be2 to c3e004b
@glaringlee Thank you for your feedback and guidance. I updated the CUDA test cases as per the comment and fixed the reset-parameters implementation. I am not sure whether I am missing something when rebasing and pushing the changes to git; the git log output is as below. The CI is still not getting triggered; please help me find out why.
@VinodSKumar Yep, this is strange to me as well. I will figure it out and update here for you later. Thank you!
This looks exactly like my workflow for rebasing/updating PRs.
@VinodSKumar are you by any chance watching your fork of the repo on CircleCI? |
@malfet Could you please point me to a link on how to check it? Maybe there are some logs which can help me resolve it.
@VinodSKumar You basically just need to go to CircleCI.com, log into your account, and see whether you are following your forked pytorch repo. CircleCI's people already checked for you: you are not following anything. We are still looking into why. cc @malfet
@VinodSKumar |
Force-pushed from 0608d1d to 91b0206
@glaringlee There was a bug in a CUDA test case: in one of the test cases the tensor options were missing. I have fixed it and pushed, and the CI now seems to be running. In case we need to take a look at the webhook log, the push was done at the following time. One more point on CI: the third_party/tensorpipe checkout is not successful, due to which some of the checks are failing.
@VinodSKumar |
91b0206 to
2386c89
Compare
@glaringlee @malfet Thank you for your support and guidance; the CI is now successful.
@VinodSKumar Can you rebase your code again, due to some fixes on our end? Then I am going to approve this change.
For the TransformerDecoder work, I am currently thinking about what to support regarding the input. In the Python version, the input for both TransformerEncoder and TransformerDecoder contains a TransformerEncoderLayer and a TransformerDecoderLayer respectively. If we let TransformerEncoderOptions and TransformerDecoderOptions accept an EncoderLayer and DecoderLayer as input, then we need to separate the header files for TransformerEncoder/Decoder and TransformerEncoderLayer/DecoderLayer, since TransformerEncoderOptions/DecoderOptions would need to include the TransformerEncoderLayer/DecoderLayer module's header, and this would cause cyclic inclusion if we put everything together. The other option is that we don't support passing the TransformerEncoderLayer/DecoderLayer module as an input to TransformerEncoder/Decoder; instead, we pass TransformerEncoderLayerOptions/DecoderLayerOptions in, since we just clone the passed-in EncoderLayer/DecoderLayer anyway, and in that case we don't need to split the header files. I am still thinking about this; if you have opinions, we can discuss them here. I will have a draft PR within the next 2 days and will let you know in #37756. Thanks again for your help!
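For illustration only, here is a minimal, self-contained C++ sketch of the two alternatives weighed above. The struct names (DecoderOptionsWithModule, DecoderOptionsWithLayerOptions) and the stand-in layer types are invented for this example, not the actual torch::nn declarations; they just show why alternative A forces a header split while alternative B does not.

```cpp
#include <cstdint>
#include <utility>

// Stand-in types so the sketch compiles without LibTorch; the real classes live
// in torch::nn and carry many more options.
struct TransformerDecoderLayerOptions {
  std::int64_t d_model;
  std::int64_t nhead;
};

struct TransformerDecoderLayer {
  TransformerDecoderLayerOptions options;
};

// Alternative A: the decoder options hold the layer *module* itself. In the real
// codebase the options header would then have to include the layer module's
// header, which forces the encoder/decoder and the layer classes into separate
// files to avoid cyclic inclusion.
struct DecoderOptionsWithModule {
  DecoderOptionsWithModule(TransformerDecoderLayer layer, std::int64_t num_layers)
      : layer(std::move(layer)), num_layers(num_layers) {}
  TransformerDecoderLayer layer;  // would be cloned num_layers times by the decoder
  std::int64_t num_layers;
};

// Alternative B: the decoder options hold only the layer *options*; the decoder
// builds the layers itself, so no header split is needed, but an existing layer
// instance cannot be passed in.
struct DecoderOptionsWithLayerOptions {
  DecoderOptionsWithLayerOptions(TransformerDecoderLayerOptions layer_options,
                                 std::int64_t num_layers)
      : layer_options(layer_options), num_layers(num_layers) {}
  TransformerDecoderLayerOptions layer_options;
  std::int64_t num_layers;
};

int main() {
  DecoderOptionsWithModule a(TransformerDecoderLayer{{512, 8}}, /*num_layers=*/6);
  DecoderOptionsWithLayerOptions b(TransformerDecoderLayerOptions{512, 8}, /*num_layers=*/6);
  return (a.num_layers == b.num_layers) ? 0 : 1;
}
```

The downside of alternative B, raised in the reply below, is that an already-constructed layer cannot be reused; it can only be rebuilt from its options.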
Force-pushed from 2386c89 to 91eefcc
@VinodSKumar Thanks for the rebase. We might need another rebase later; I will post here.
Sure, will rebase once the issue in master is resolved.
@glaringlee Yes, regarding the decoder implementation, we would need to split the transformer files to avoid a cyclic dependency. The downside of using the layer options as parameters is that the encoder/decoder must be created from scratch; in scenarios where users want to reuse an existing encoder/decoder or encoder layer/decoder layer, they would not be able to do so. If this matters in scenarios like transfer learning or continued training, then we need to keep supporting the encoder/decoder layers as parameters. Would it be alright to split the files into transformer_layer, transformer_coder and transformer?
@VinodSKumar Let's split the files. You don't need to split this one; I will split it for you after your code has landed. But for TransformerDecoder, please write both the module and the options into transformerencoder.h; the cpp file can be a single one.
Currently, even if we pass in an encoder/decoder layer, people still cannot use it. The reason is that in the C++ implementation, when you call clone(), what we currently do is create an empty module, e.g. an empty encoder/decoder layer, and re-initialize all its member variables. I am not sure why, since this predates my taking over the C++ frontend codebase; I guess it is due to the nature of C++, where it is not easy to duplicate a module in a generic way, but we have had no complaints about this implementation so far. I talked to the owner of the Python version, and he has never seen an example where an encoder layer's or decoder layer's forward() is called before passing it into a TransformerEncoder/TransformerDecoder. So I think we can just split the file for now and provide two constructors for the options: one taking the encoder-layer/decoder-layer options (and creating a new module), the other taking a reference to an encoder layer/decoder layer. There is not much difference between the two right now.
Your draft is similar to my draft, but for norm we should use AnyModule. I will have my draft ready on Thursday since I have an urgent PR review tomorrow, and will approve this one after the rebase. If you have Slack or email, we can communicate there; otherwise, let's use PR #42886.
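As a rough sketch of the two-constructor idea above, using simplified stand-in types rather than the real torch::nn classes (the member layout and names here are assumptions for illustration only):

```cpp
#include <cstdint>

// Simplified stand-ins, not the actual torch::nn classes.
struct TransformerDecoderLayerOptions {
  std::int64_t d_model;
  std::int64_t nhead;
};

struct TransformerDecoderLayer {
  explicit TransformerDecoderLayer(TransformerDecoderLayerOptions opts) : options(opts) {}
  TransformerDecoderLayerOptions options;
};

struct TransformerDecoderOptions {
  // Constructor 1: build the decoder purely from layer options; the decoder
  // creates a fresh layer (and its clones) from these.
  TransformerDecoderOptions(TransformerDecoderLayerOptions layer_options,
                            std::int64_t num_layers)
      : layer_options(layer_options), num_layers(num_layers) {}

  // Constructor 2: accept an existing layer by reference; since clone() in the
  // C++ frontend re-initializes a module from its options anyway, only the
  // layer's options need to be kept here.
  TransformerDecoderOptions(const TransformerDecoderLayer& layer, std::int64_t num_layers)
      : layer_options(layer.options), num_layers(num_layers) {}

  TransformerDecoderLayerOptions layer_options;
  std::int64_t num_layers;
};

int main() {
  TransformerDecoderLayerOptions layer_opts{/*d_model=*/512, /*nhead=*/8};
  TransformerDecoderLayer layer(layer_opts);

  TransformerDecoderOptions from_options(layer_opts, /*num_layers=*/6);
  TransformerDecoderOptions from_layer(layer, /*num_layers=*/6);
  return (from_options.num_layers == from_layer.num_layers) ? 0 : 1;
}
```

Because clone() rebuilds a module from its options anyway, the constructor taking an existing layer ends up nearly equivalent to the one taking its options, which is why there is not much difference between the two for now.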
glaringlee left a comment
@VinodSKumar see my previous comments, thanks
Force-pushed from 91eefcc to 54a0872
@glaringlee After the rebase, all the checks in CI have passed.
@VinodSKumar |
glaringlee left a comment
LGTM now
facebook-github-bot left a comment
@glaringlee has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@glaringlee merged this pull request in 830423b.
Fixes #37756