Skip to content

Conversation

@davidberard98
Copy link
Contributor

@davidberard98 davidberard98 commented Nov 1, 2024

Stack from ghstack (oldest at bottom):

In upstream triton, triton-lang/triton#4589 introduces overflow checks. However, overflow checks likely add some overhead, and have some correctness bugs at the moment (e.g. triton-lang/triton#5033). Let's set sanitize_overflow=False but keep debug=True so that we can keep using device_assert but without the additional asserts added by sanitize_overflow.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

@pytorch-bot
Copy link

pytorch-bot bot commented Nov 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139502

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit efb94d4 with merge base 560a070 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

davidberard98 added a commit that referenced this pull request Nov 1, 2024
ghstack-source-id: edc60a6
Pull Request resolved: #139502
@davidberard98 davidberard98 added ciflow/rocm Trigger "default" config CI on ROCm release notes: inductor labels Nov 1, 2024
@davidberard98 davidberard98 changed the title [inductor] set sanitize_overflow=False for triton [inductor] set sanitize_overflow=False for triton kernels Nov 1, 2024
"num_warps": compile_meta["num_warps"],
"num_stages": compile_meta["num_stages"],
"debug": compile_meta["debug"],
"sanitize_overflow": False, # turn off additional asserts added for overflow checks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this backward-compatible with the current pinned triton version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look too closely but:

  • relying on PyTorch CI to catch it if not.
  • I did test with AMD w/ new version of triton, which doesn't have sanitize_overflow defined in HIPOptions, and it doesn't fail.

@davidberard98 davidberard98 added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 1, 2024
@davidberard98
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

rahulsingh-intel pushed a commit to rahulsingh-intel/pytorch that referenced this pull request Nov 5, 2024
…9502)

In upstream triton, triton-lang/triton#4589 introduces overflow checks. However, overflow checks likely add some overhead, and have some correctness bugs at the moment (e.g. triton-lang/triton#5033). Let's set `sanitize_overflow=False` but keep `debug=True` so that we can keep using device_assert but without the additional asserts added by `sanitize_overflow`.

Pull Request resolved: pytorch#139502
Approved by: https://github.com/bertmaher
@github-actions github-actions bot deleted the gh/davidberard98/344/head branch December 2, 2024 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/trunk Trigger trunk jobs on your pull request Merged module: inductor release notes: inductor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[triton 3.2] inductor cumsum w/ upstream triton fails accuracy and device asserts

3 participants