
Conversation

@jstjohn
Contributor

@jstjohn jstjohn commented May 15, 2022

Changes to AlibiPositionalBias

Alibi weights before the change were only appropriate if an upper-triangular (causal) mask is applied to the q·k dot product:
[screenshot: Alibi bias matrix before the change]

Alibi weights after the change are appropriate for bidirectional attention without a mask, and should be equivalent to the previous behavior once a causal mask is applied:
[screenshot: Alibi bias matrix after the change]
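As a minimal sketch of the post-change behavior (illustrative only; `alibi_bias` and `slopes` are not the library's actual names), the bias is built from `-|i - j|`, which is maximal (zero) on the diagonal and decays in both directions, so it is sensible even without a mask:

```python
import torch

def alibi_bias(i_len, j_len, slopes):
    # slopes: (heads,) per-head Alibi slopes (e.g. a geometric sequence 1/2, 1/4, ...)
    i_pos = torch.arange(i_len).unsqueeze(-1)       # (i, 1)
    j_pos = torch.arange(j_len).unsqueeze(0)        # (1, j)
    dist = -(j_pos - i_pos).abs()                   # (i, j): 0 on the diagonal, negative elsewhere
    return dist * slopes.view(-1, 1, 1)             # (heads, i, j), added to the q·k logits

bias = alibi_bias(4, 4, torch.tensor([0.5, 0.25]))  # (2, 4, 4)
```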

Changes to LearnedAlibiPositionalBias

Alibi weights before the change were not maximal on the diagonal when the full, unmasked attention matrix was presented, so they were only usable with an upper-triangular mask:
[screenshot: learned Alibi bias matrix before the change]
After the change, the Alibi weights in the learned module match those of the base module.
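A hypothetical sketch of the same idea for the learned variant (the class name and its interface are illustrative, not the module's real API): the only learned quantity is a per-head slope, and it is applied to the symmetric distance `-|i - j|`, so the bias stays maximal on the diagonal whether or not a mask is used:

```python
import torch
from torch import nn

class LearnedAlibiBiasSketch(nn.Module):
    # Illustrative sketch: one learned slope per head, applied to -|i - j|
    # so the learned bias has the same diagonal-maximal layout as the base module.
    def __init__(self, heads):
        super().__init__()
        self.slopes = nn.Parameter(torch.ones(heads))

    def forward(self, i_len, j_len):
        i_pos = torch.arange(i_len).unsqueeze(-1)       # (i, 1)
        j_pos = torch.arange(j_len).unsqueeze(0)        # (1, j)
        dist = -(j_pos - i_pos).abs().float()           # (i, j): 0 on the diagonal
        return dist * self.slopes.view(-1, 1, 1)        # (heads, i, j)
```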

@lucidrains
Owner

@jstjohn ohh yea, this makes sense, but i do already account for it at https://github.com/lucidrains/x-transformers/blob/main/x_transformers/x_transformers.py#L797

the main reason i kept the base Alibi the way it is is that Ofir's original code was written that way

@jstjohn
Contributor Author

jstjohn commented May 15, 2022 via email

@jstjohn
Contributor Author

jstjohn commented May 15, 2022

But then again, since the bidirectional and non-bidirectional versions should be equivalent once masking is applied, it seems easiest to drop that option and just implement it in the general way?
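The equivalence claim can be checked directly: on the positions a causal mask keeps (`j <= i`), the symmetric bias `-|i - j|` equals the one-directional bias `j - i`, so masking the general form recovers the causal behavior. A quick standalone check:

```python
import torch

i = torch.arange(6).unsqueeze(-1)       # (6, 1) query positions
j = torch.arange(6).unsqueeze(0)        # (1, 6) key positions
symmetric = -(j - i).abs()              # bidirectional bias
causal_style = j - i                    # one-directional bias
keep = j <= i                           # positions a causal mask allows
assert torch.equal(symmetric[keep], causal_style[keep])
```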

@lucidrains
Owner

@jstjohn yes that is true, let's go for your way, thank you for the PR!
