Import MultiheadAttention to PyTorch #18334
Conversation
Force-pushed from f3c2eb6 to c5f7615
soumith left a comment:
Looks like a WIP PR. Next time, prefix the title with [WIP] so that reviewers don't end up reviewing prematurely :)

Thanks @soumith. I will address your review comments soon.
Force-pushed from c5f7615 to 4a604c7
Force-pushed from 4a604c7 to 35eb09a
Force-pushed from 35eb09a to 0eb643f
@zhangguanheng66 - just because it deserves explicit mention: since this is a straight-up port from fairseq, before making any major changes to the code, make sure you write a ton of tests so we don't get lost.
Force-pushed from 0eb643f to c3e60e1
Force-pushed from c3e60e1 to a438789
@cpuhrsch @soumith Thanks for the feedback. I updated the code accordingly. Functions that are not tested have been removed. A unit test has been added in test.nn. An additional test was conducted (see D14577966 for more details). Instead of fairseq.MultiheadAttention, torch.nn.MultiheadAttention is used, and the corresponding unit test in pytorch_translate works fine.
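For reference, a minimal shape check along the lines of that unit test looks roughly like the following (a sketch only; the sizes are illustrative and not taken from the actual test in test.nn):

```python
import torch
import torch.nn as nn

# Illustrative sizes, not the ones used in the actual test.
embed_dim, num_heads = 16, 4
tgt_len, src_len, batch = 5, 7, 3

mha = nn.MultiheadAttention(embed_dim, num_heads)

# Inputs use the (seq_len, batch, embed_dim) layout expected by this module.
query = torch.randn(tgt_len, batch, embed_dim)
key = torch.randn(src_len, batch, embed_dim)
value = torch.randn(src_len, batch, embed_dim)

attn_output, attn_weights = mha(query, key, value)

assert attn_output.shape == (tgt_len, batch, embed_dim)
# Returned weights are averaged over heads: (batch, tgt_len, src_len).
assert attn_weights.shape == (batch, tgt_len, src_len)
```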
cpuhrsch left a comment:
I'm approving this under the assumption that the most recent comments will also be resolved.
Pinging @soumith in case something is still missing.
Force-pushed from a438789 to ea31a70
If you don't mind, before this lands I'd like to page some folks in the NLP community to make sure that they don't need any more features from this.
soumith left a comment:
The documentation for forward is still missing, and forward takes a lot of options. Please flesh it out.
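For context, the options in question (key_padding_mask, need_weights, attn_mask) can be exercised roughly as follows. This is a sketch with made-up sizes; mask dtypes and exact semantics have shifted slightly across PyTorch versions, so check the docs for your version:

```python
import torch
import torch.nn as nn

embed_dim, num_heads = 16, 4
tgt_len, src_len, batch = 5, 7, 3
mha = nn.MultiheadAttention(embed_dim, num_heads)

query = torch.randn(tgt_len, batch, embed_dim)
key = torch.randn(src_len, batch, embed_dim)
value = torch.randn(src_len, batch, embed_dim)

# key_padding_mask: (batch, src_len); True marks padded source positions to ignore.
key_padding_mask = torch.zeros(batch, src_len, dtype=torch.bool)
key_padding_mask[:, -1] = True  # e.g. treat the last source position as padding

# attn_mask: (tgt_len, src_len); an additive float mask on the attention scores.
attn_mask = torch.zeros(tgt_len, src_len)

attn_output, attn_weights = mha(
    query, key, value,
    key_padding_mask=key_padding_mask,
    need_weights=True,   # also return head-averaged attention weights
    attn_mask=attn_mask,
)
```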
@soumith Sure. Let me know if any new features are necessary. I will work on the documentation for the forward function at the same time. @srush @kyunghyuncho @myleott @glample More unit tests on your end are welcome (I have two unit tests on my side).
@mansimov @jasonleeinf care to take a quick look at this PR?
myleott left a comment:
I didn't review the logic carefully, but let's test this in fairseq. Ideally we'll replace the multihead attention implementation in fairseq with this one.
Force-pushed from 30dab32 to d5931d5
Force-pushed from d5931d5 to f7b9bab
facebook-github-bot left a comment:
@zhangguanheng66 is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Import MultiheadAttention into the core PyTorch framework. Users can now import MultiheadAttention directly from torch.nn. See "Attention Is All You Need" for more details on the MultiheadAttention function.
Pull Request resolved: pytorch#18334
Differential Revision: D14577966
fbshipit-source-id: b18d945ea461c07948d2f33f5b497ca51591d0ce
Force-pushed from f7b9bab to c25c547
The doc string is there, but it seems to be missing the …
@zhangguanheng66 merged this pull request in 4b20fc8.
Missing documentation for …
We had a PR to update the attn_mask. See here (#20071). |
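For readers wondering what attn_mask looks like in practice, one common pattern is an additive causal mask like the sketch below; the exact accepted dtypes and semantics are what #20071 refined, so defer to the docs for your PyTorch version:

```python
import torch

# Additive causal mask for self-attention: entries above the diagonal are -inf,
# so position i can only attend to positions <= i once the mask is added to the
# attention scores.
seq_len = 5
attn_mask = torch.triu(torch.full((seq_len, seq_len), float('-inf')), diagonal=1)
```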
Summary: Import MultiheadAttention into the core PyTorch framework. Users can now import MultiheadAttention directly from torch.nn. See "Attention Is All You Need" for more details on the MultiheadAttention function.
Pull Request resolved: pytorch#18334
Differential Revision: D14577966
Pulled By: zhangguanheng66
fbshipit-source-id: 756c0deff623f3780651d9f9a70ce84516c806d3
What is the intuition behind adding the "MultiheadAttention" block under activation.py? @zhangguanheng66