attn_mask in nn.MultiheadAttention is additive #21518

@mttk

📚 Documentation

The docs should probably mention that the attn_mask argument of nn.MultiheadAttention is an additive mask (-inf masks a position), rather than the more common multiplicative mask (0 masks a position). Perhaps a value check should also be enforced (all values should be 0 or -inf; otherwise print a warning?).
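To make the difference concrete, here is a minimal plain-Python sketch of the additive convention, where the mask is added to the raw attention scores before the softmax. It is not PyTorch's implementation; `softmax`, `apply_additive_mask`, and `check_additive_mask` are hypothetical helper names used only for illustration, and the value check is just one possible reading of the suggestion above.

```python
import math
import warnings


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_additive_mask(scores, mask):
    """Add the mask to the raw scores, then normalize (additive convention)."""
    return softmax([s + m for s, m in zip(scores, mask)])


def check_additive_mask(mask):
    """Hypothetical check: warn if a mask holds values other than 0 or -inf."""
    ok = all(m == 0.0 or m == float("-inf") for m in mask)
    if not ok:
        warnings.warn("additive mask should contain only 0 and -inf")
    return ok


scores = [2.0, 1.0, 0.5]
NEG_INF = float("-inf")

# Correct additive mask: 0 keeps a position, -inf hides it.
masked = apply_additive_mask(scores, [0.0, NEG_INF, 0.0])

# Pitfall: a 1/0 "keep/drop" mask passed where an additive mask is expected.
# Adding 1 to the kept scores and 0 to the "dropped" one hides nothing.
leaky = apply_additive_mask(scores, [1.0, 0.0, 1.0])
```

With the additive mask, the middle position receives exactly zero weight, while the 1/0 mask leaves it with a nonzero weight. That silent failure is what a value check along the lines of `check_additive_mask` would be meant to flag.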

cc @brianjo @mruberry @albanD @jbschlosser @zhangguanheng66

Labels

module: docs (Related to our documentation, both in docs/ and docblocks)
module: nn (Related to torch.nn)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
