Conversation
@SamGinzburg (Contributor) commented Oct 17, 2024

This is a PR to fix the following issue: #115344

In short, passing None as an argument to a user-defined Triton kernel under torch.compile would cause problems:

Short repro of the specific bug fixed here:

```
import triton
import triton.language as tl

@triton.autotune( # E: Untyped decorator makes function "sin_kernel" untyped  [misc]
    configs=[
        triton.Config({'BLOCK_SIZE': 32}, num_stages=5, num_warps=2),
        triton.Config({'BLOCK_SIZE': 64}, num_stages=4, num_warps=4),
    ],
    key=['n_elements']
)
@triton.jit # E: Untyped decorator makes function "sin_kernel" untyped  [misc]
def sin_kernel( # E: Function is missing a return type annotation  [no-untyped-def]
    in_ptr0,
    out_ptr,
    n_elements,
    BLOCK_SIZE: "tl.constexpr",
):
    pid = tl.program_id(axis=0)
    block_start = pid * BLOCK_SIZE
    offsets = block_start + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    if in_ptr0 is not None:
        x = tl.load(in_ptr0 + offsets, mask=mask)
    else:
        x = 0.
    output = tl.sin(x)
    tl.store(out_ptr + offsets, output, mask=mask)

import torch

def sin_triton(x, out):
    n_elements = out.numel()
    sin_kernel[(n_elements,)](x, out, n_elements)

x = torch.randn(65, device="cuda")
out = torch.empty_like(x)
out_compiled = torch.empty_like(x)

sin_triton_compiled = torch.compile(fullgraph=True)(sin_triton)

for first in (x, None):
    sin_triton(first, out)
    sin_triton_compiled(first, out_compiled)
    torch.testing.assert_close(out, out_compiled)
```
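A quick gloss on the final loop (my own note, not part of the original report): for each value of `first`, the eager launch and the compiled launch should both match a plain-PyTorch reference. A minimal sketch of that reference, using a hypothetical helper name:
```
import torch

def reference(first, out):
    # Eager-PyTorch equivalent of sin_kernel: when `first` is None the kernel
    # takes the `x = 0.` branch, so every stored element is sin(0.0) == 0.0.
    if first is None:
        return torch.zeros_like(out)
    return torch.sin(first)
```
The `first is None` iteration is the one this PR is about; the tensor iteration already worked before the fix.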

I've added a unit test to the tests in "test/inductor/test_triton_kernels.py" to catch this issue in the future.
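Some context on why None arguments are awkward here (my reading of the underlying issue, not a description of this PR's actual implementation): Triton resolves `if in_ptr0 is not None:` at specialization/compile time, so passing None effectively selects a different compiled variant of the kernel. Any machinery that wraps and caches user-defined Triton kernel launches therefore has to treat the None-ness of each argument as part of the kernel's specialization. A minimal sketch of that invariant, with hypothetical names:
```
def specialization_key(kernel_name, args):
    # Hypothetical helper (not PyTorch's actual code): record which argument
    # positions are None, since the None and non-None cases compile to
    # different kernel variants.
    none_positions = tuple(i for i, a in enumerate(args) if a is None)
    return (kernel_name, none_positions)

# Calls that differ only in passing None must not reuse the same compiled variant:
assert specialization_key("sin_kernel", (None, "out", 65)) != specialization_key("sin_kernel", ("x", "out", 65))
```
The real handling lives in Dynamo/Inductor's support for user-defined Triton kernels; the sketch only illustrates the kind of invariant the new unit test exercises.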

topic: not user facing

================================================================================
Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

Differential Revision: [D64615061](https://our.internmc.facebook.com/intern/diff/D64615061)

@SamGinzburg requested a review from zou3519 as a code owner, October 17, 2024 21:06

pytorch-bot bot commented Oct 17, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138260

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 2 Unrelated Failures

As of commit cbc1bc0 with merge base de51ed8:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.


linux-foundation-easycla bot commented Oct 17, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@SamGinzburg (Contributor, Author) commented:

@pytorchbot label "topic: not user facing"

@pytorch-bot bot added the "topic: not user facing" label, Oct 17, 2024
SamGinzburg added a commit that referenced this pull request Oct 17, 2024
add test

fewer failing tests

more tests passing

tests passing

lint

ghstack-source-id: f089650
Pull Request resolved: #138260
@aakhundov (Contributor) left a comment

Thanks, @SamGinzburg! Looks good overall. Added some minor comments and questions.

@aakhundov (Contributor) commented:

cc @zou3519 as this fixes our long-standing None arg issue :)

SamGinzburg added a commit that referenced this pull request Oct 18, 2024
add test

fewer failing tests

more tests passing

tests passing

lint

ghstack-source-id: 749f8ff
Pull Request resolved: #138260

@SamGinzburg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@aakhundov requested a review from eellison, October 18, 2024 19:05
@aakhundov requested a review from oulgen, October 18, 2024 19:06
@pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request), Oct 18, 2024
SamGinzburg added a commit that referenced this pull request Oct 18, 2024
add test

fewer failing tests

more tests passing

tests passing

lint

ghstack-source-id: 4cb2b4f
Pull Request resolved: #138260

@SamGinzburg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@SamGinzburg merged commit d74f5bd into gh/SamGinzburg/5/base, Oct 21, 2024
@SamGinzburg (Contributor, Author) commented:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Can't merge closed PR #138260

SamGinzburg added a commit that referenced this pull request Oct 21, 2024
…138469)

Revert "Bugfix for passing None args to user defined Triton kernel (#138260)"

This reverts commit d74f5bd.
@aakhundov (Contributor) commented:

This PR was merged manually by mistake. Resubmitted in #138472.

Labels: ciflow/inductor, ciflow/trunk, module: inductor, topic: not user facing