Add block mask utility support for batches and heads > 1 #130227

Chillee · 2024-07-08T01:19:48Z

Stack from ghstack (oldest at bottom):

Add scale kwarg to FlexAttention (and some changes that get FlexAttention numerics to be as accurate as FA2) #130250
-> Add block mask utility support for batches and heads > 1 #130227

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]

pytorch-bot · 2024-07-08T01:19:50Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130227

✅ You can merge normally! (1 Unrelated Failure)

As of commit ed9a3e9 with merge base 6875179:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / before-test / llm-retrieval (gh) (matched llm-retrieval rule in flaky-rules.json)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: e722003 Pull Request resolved: #130227

Chillee · 2024-07-08T16:14:59Z

aten/src/ATen/native/IndexingUtils.h

 }

-static C10_UNUSED void checkIndexTensorTypes(IOptTensorListRef indices, bool allow_int=false) {
+static C10_UNUSED void checkIndexTensorTypes(IOptTensorListRef indices) {


This function is only used in three places, and all of them either set `allow_int` to true or should set it to true.

Chillee · 2024-07-08T17:51:37Z

@pytorchbot merge

pytorchmergebot · 2024-07-08T17:53:14Z

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

yanboliang · 2024-07-08T18:08:35Z

@pytorchbot merge

…tion numerics to be as accurate as FA2) (#130250) Pull Request resolved: #130250 Approved by: https://github.com/drisspg ghstack dependencies: #130160, #130106, #130224, #130227

facebook-github-bot · 2024-07-09T19:36:47Z

This pull request was exported from Phabricator. Differential Revision: D59498662

…lexAttention numerics to be as accurate as FA2) (#130250)" This reverts commit 3e48d92. Reverted #130250 on behalf of https://github.com/izaitsevfb due to depends on #130227 which needs to be reverted ([comment](#130250 (comment)))

izaitsevfb · 2024-07-09T22:33:09Z

@pytorchbot revert -m "breaks internal builds, please see D59498662" -c ghfirst

 error: too many arguments to function call, expected single argument 'indices', have 2 arguments
  at::native::checkIndexTensorTypes(indices, /*allow_int*/ true);
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                        ^~~~
buck-out/v2/gen/fbcode/795a8ff4e6618da8/caffe2/__aten-headers-cpu__/buck-headers/ATen/native/IndexingUtils.h:51:24: note: 'checkIndexTensorTypes' declared here
static C10_UNUSED void checkIndexTensorTypes(IOptTensorListRef indices) {
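
The error above comes from a caller outside this PR that still passes the removed `allow_int` argument. A minimal sketch of one way to keep such call sites compiling is a forwarding overload that accepts and ignores the old flag. This is an illustration only, not necessarily what the follow-up fix did; `IndexKind`, `IndexKinds`, and `checkIndexKinds` are hypothetical stand-ins for the real tensor types and `checkIndexTensorTypes`.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the dtype of an index tensor.
enum class IndexKind { Long, Int, Byte, Bool, Float };
using IndexKinds = std::vector<IndexKind>;

// New-style signature: integer index tensors are always accepted,
// so there is no allow_int flag any more.
inline void checkIndexKinds(const IndexKinds& indices) {
  for (IndexKind kind : indices) {
    if (kind == IndexKind::Float) {
      throw std::invalid_argument("index tensors must be integral or boolean");
    }
  }
}

// Compatibility overload: accepts the old flag, ignores it, and forwards.
// Existing two-argument call sites keep compiling unchanged.
inline void checkIndexKinds(const IndexKinds& indices, bool /*allow_int*/) {
  checkIndexKinds(indices);
}
```

With both overloads present, a call such as `checkIndexKinds(indices, /*allow_int*/ true)` resolves to the second function and behaves exactly like the one-argument form, so the overload can be dropped later once all callers are migrated.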

pytorchmergebot · 2024-07-09T22:34:29Z

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot · 2024-07-09T22:34:41Z

@Chillee your PR has been successfully reverted.

…0227)" This reverts commit 6413998. Reverted #130227 on behalf of https://github.com/izaitsevfb due to breaks internal builds, please see D59498662 ([comment](#130227 (comment)))

…tion numerics to be as accurate as FA2) (#130250) After this PR, our numerical error is within 3% of FA2 for forward and gradients. Prior, for `dq` our numerical error was 30% higher. I also added a `PRESCALE_QK` kernel option that increases perf by about 3-4% but incurs about 20-30% more numerical error. ![image](https://github.com/pytorch/pytorch/assets/6355099/7b5ff44e-219b-4a05-8a1b-2a0182c01ab2) Pull Request resolved: #130250 Approved by: https://github.com/drisspg ghstack dependencies: #130227

) Pull Request resolved: pytorch#130227 Approved by: https://github.com/yanboliang

Add block mask utility support for batches and heads > 1

a88d7b9

[ghstack-poisoned]

Chillee requested review from albanD, eqy, jbschlosser and mikaylagawarecki as code owners July 8, 2024 01:19

Chillee mentioned this pull request Jul 8, 2024

Fix indexing twice with score_mod #130224

Closed

github-actions bot requested a review from ezyang July 8, 2024 01:20

Update on "Add block mask utility support for batches and heads > 1"

d6ab95d

[ghstack-poisoned]

pytorch-bot bot added the module: inductor label Jul 8, 2024

Chillee added a commit that referenced this pull request Jul 8, 2024

Add block mask utility support for batches and heads > 1

7edf6da

ghstack-source-id: e722003 Pull Request resolved: #130227

ezyang removed their request for review July 8, 2024 02:43

Chillee requested review from drisspg and yanboliang July 8, 2024 05:40

Chillee added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 8, 2024

Chillee commented Jul 8, 2024

Chillee mentioned this pull request Jul 8, 2024

Add scale kwarg to FlexAttention (and some changes that get FlexAttention numerics to be as accurate as FA2) #130250

Closed

yanboliang approved these changes Jul 8, 2024

pytorchmergebot added the merging label Jul 8, 2024

pytorchmergebot removed the merging label Jul 8, 2024

yanboliang added the topic: not user facing topic category label Jul 8, 2024

pytorchmergebot added the merging label Jul 8, 2024

pytorchmergebot removed the merging label Jul 8, 2024

Chillee mentioned this pull request Jul 9, 2024

Try removing indexing limitation from templates #130308

Closed

Chillee mentioned this pull request Jul 9, 2024

Forward fix revert for BC breakage #130365

Closed

facebook-github-bot added the fb-exported label Jul 9, 2024

pytorchmergebot added the Reverted label Jul 9, 2024

pytorchmergebot reopened this Jul 9, 2024

Chillee added 2 commits July 9, 2024 19:27

pytorchmergebot closed this in a7715e3 Jul 10, 2024

github-actions bot deleted the gh/chillee/318/head branch August 10, 2024 01:58
