
Conversation

@PaliC (Collaborator) commented Mar 18, 2025

This PR adds a Triton backend to KernelBench. To invoke it, simply add backend="triton" when calling the following four scripts (use them as normal otherwise):

  • scripts/generate_and_eval_single_sample_modal.py
  • scripts/generate_and_eval_single_sample.py
  • scripts/generate_samples.py
  • scripts/eval_from_generations.py
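For example, an invocation might look like this (illustrative; the scripts take pydra-style key=value config args, so check the exact keys against each script's config): `python scripts/generate_and_eval_single_sample.py dataset_src="huggingface" level=1 problem_id=1 backend="triton"`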

This PR also adds an {error_type}_name field to the eval JSON. This makes classifying errors (especially for Triton) much easier. From the error log alone, it often isn't obvious what an error is (e.g., you might get `at 37:15:\n h_start = pooled_row * stride - padding\n w_start = pooled_col * stride - padding\n\n # Initialize the max value\n max_val = tl.full((1,), float('-inf'), tl.float32)\n\n # Itera...`), but if the error name is triton.compiler.errors.UnsupportedLanguageConstruct, it's a lot more obvious.
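A minimal sketch of what recording the error class name next to the raw message can look like (illustrative only; the helper and JSON keys here are stand-ins, the PR's actual fields follow the {error_type}_name pattern):

```python
eval_result = {}

def run_candidate_kernel():
    # Hypothetical stand-in for compiling and executing a generated kernel.
    raise ValueError("stand-in failure")

try:
    run_candidate_kernel()
except Exception as e:
    eval_result["compilation_error"] = str(e)
    # Fully qualified class name, e.g. "triton.compiler.errors.UnsupportedLanguageConstruct"
    eval_result["compilation_error_name"] = f"{type(e).__module__}.{type(e).__qualname__}"
```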

Testing: I've tested the four scripts in both the Triton and CUDA variants, and they seem to work normally (except for scripts/generate_and_eval_single_sample_modal.py, which should be equivalent to scripts/generate_and_eval_single_sample.py).

Todo:

  • This PR is a little crude, as it just adds the logic (and linting; sorry, Meta's IDE does this automatically). There are other things left, like updating the README and a bit of refactoring (using kernel instead of cuda, for example), but I will leave those to a follow-up PR, as this one is already 1000 lines.
  • Add few-shot examples for Triton
  • Add CoT for Triton

Below is the GitHub Copilot-generated summary, which is honestly pretty useful for navigating large PRs.

==========================================================================================
This pull request includes several changes to improve code readability and add new functionality to the scripts/eval_from_generations.py and scripts/generate_and_eval_single_sample.py files. The most notable changes include reformatting code for better readability, adding a new backend configuration option, and enhancing error logging.


@george-mako

@simonguozirui Any update on when this will be merged into main?

@ai-nikolai

@PaliC is this still up to date, or have there been big changes in KernelBench since this PR was drafted?


@simonguozirui (Collaborator) commented Oct 3, 2025

@AffectionateCurry and I are back working to merge this.
The key consideration is to design code paths and a modular structure that allow for future support of other programming languages (tile-lang, ThunderKittens, HIP, NKI, CuTe, CUTLASS).

For JIT-compiled languages this is quite easy to do; for frameworks that require building and linking, it is much more complicated.
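One possible shape for that modular design (a sketch under assumptions, not the PR's actual code): JIT-style backends register a prompt constructor keyed by name, so the eval scripts never branch on specific languages; build-and-link frameworks would additionally need a compile/link step.

```python
from typing import Callable, Dict

PROMPT_CONSTRUCTORS: Dict[str, Callable[[str], str]] = {}

def register_backend(name: str):
    # Decorator that records a prompt constructor under a backend name.
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        PROMPT_CONSTRUCTORS[name] = fn
        return fn
    return decorator

@register_backend("triton")
def triton_prompt(ref_arch_src: str) -> str:
    # Placeholder prompt text; the real templates live in the prompt constructors.
    return f"Rewrite the following PyTorch module using Triton kernels:\n{ref_arch_src}"

def get_prompt(ref_arch_src: str, backend: str) -> str:
    if backend not in PROMPT_CONSTRUCTORS:
        raise ValueError(f"Unsupported backend: {backend}")
    return PROMPT_CONSTRUCTORS[backend](ref_arch_src)
```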

@ai-nikolai

@simonguozirui and @AffectionateCurry - amazing work.

I am currently actively looking into KernelBench and extensions to it myself. Let me know if you are open to collaboration.


@simonguozirui (Collaborator) left a comment


Great work @AffectionateCurry @nathanjpaek!
Look through my comments and see what you all think. I like many of the abstractions you all put in (along with the changes @PaliC made).

Can we ensure the new changes don't break the existing CUDA pipeline? Test CUDA / Triton / CuTe on both the local L40S lab machine and Modal cloud execution.

    anthropic
    modal
    numpy
    openai

Why do we remove requirements.txt? We should keep it; we can think about using uv later, but let's not get rid of it here.

@nathanjpaek don't we also need to add tilelang here?


     with open(eval_file_path, "w") as f:
    -    json.dump(eval_results, f)
    +    json.dump(eval_results, f, indent=4)

great check

     elif config.dataset_src == "local":
    -    problem_idx_in_dataset = config.problem_id - 1  # due to dataset list being 0-indexed locally
    +    problem_idx_in_dataset = (
    +        config.problem_id - 1

@pythonomar22 this is something we will get rid of with your new benchmark data class, so we don't have to deal with all these nasty off-by-one issues.
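For illustration, a benchmark data class could own the indexing convention so the subtraction appears exactly once (hypothetical; the actual design is up to the new data class):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProblemRef:
    problem_id: int  # 1-indexed, as used in configs and file names

    @property
    def dataset_index(self) -> int:
        # 0-indexed position in the locally loaded dataset list.
        return self.problem_id - 1
```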


    # Use appropriate prompt constructor based on backend
    if config.backend == "cuda":
        custom_prompt = prompt_generate_custom_cuda_from_prompt_template(ref_arch_src)

@AffectionateCurry I see what you mean here now. We can refactor this later with a better prompt template!

        custom_prompt = get_prompt_for_backend(ref_arch_src, config.backend)
    else:
        raise ValueError(
            f"Unsupported backend: {config.backend}. Must be 'cuda', 'triton', or 'cute'."

Nice catch here. We shall update the README before the GPU MODE hackathon to list these as available options.

        deleted manually by the caller.
        This is a hack that is needed for triton code, as compile/exec do not play well
        with the @triton.jit decorator.

We did this for some of the multi-turn KernelBench experiments too, so we might need to support this for the CUDA code path as well.
Right now, @AffectionateCurry, you should state that this is only invoked for alternative backends.
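For context, a minimal sketch of the temp-file workaround (names are illustrative; the assumption is that @triton.jit needs to read real source off disk via inspect, which fails for strings run through plain exec()):

```python
import importlib.util
import tempfile

def load_model_from_source(src: str):
    # Write the generated code to a real .py file so decorators that inspect
    # source (like @triton.jit) can find it.
    tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False)
    tmp.write(src)
    tmp.close()
    spec = importlib.util.spec_from_file_location("candidate_kernel", tmp.name)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the generated code
    # Assumes the generated source defines ModelNew; the temp file must be
    # deleted manually by the caller.
    return module.ModelNew, tmp.name
```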

        return ModelNew, temp_file


    # def load_tilelang_model(

we can do this in a later PR

    # Create a new module based on that spec
    temp_module = importlib.util.module_from_spec(spec)
    # Execute the code in the module's namespace
    spec.loader.exec_module(temp_module)

How safe is this, haha? We should really understand it [in case of any reward hacking].
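Not something this PR does, but one common mitigation is to execute the untrusted generated module in a separate process, so a hostile or crashing kernel cannot corrupt the evaluator itself. A rough sketch (the harness script name is hypothetical):

```python
import subprocess
import sys

def run_isolated(kernel_path: str, timeout_s: int = 300) -> subprocess.CompletedProcess:
    # Hypothetical harness that imports the kernel file and runs correctness checks.
    return subprocess.run(
        [sys.executable, "run_eval_harness.py", kernel_path],
        capture_output=True, text=True, timeout=timeout_s,
    )
```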

    import os
    from .utils import read_file

    """

great first step, we will replace this with something a bit more modular later on!

        return tensor.to(device=device)

    # Apply backend-specific dtype casting for float tensors
    # if backend.lower() == "tilelang":

Did we write this function just for tilelang?

In general, I actually quite like this abstraction; we can do some checks, etc.
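A sketch of what the casting hook under discussion might look like if the tilelang branch were enabled (assumption: tilelang kernels expect half-precision float inputs, which is why the cast exists):

```python
import torch

def cast_for_backend(tensor: torch.Tensor, backend: str) -> torch.Tensor:
    # Backend-specific dtype casting for float tensors; other backends pass through.
    if backend.lower() == "tilelang" and tensor.is_floating_point():
        return tensor.to(dtype=torch.float16)
    return tensor
```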


@simonguozirui (Collaborator) commented Oct 23, 2025

@nathanjpaek thanks for adding a one-shot example for CuTe; we can figure out how to do tilelang support in another PR.

@AffectionateCurry and I have checked the current state of the PR work for cuda, triton, and cute, on both local and Modal execution.

We will make prompt construction and eval logic more clean across backends in future PRs.

Great job @nathanjpaek @AffectionateCurry on your first PR! Thank you to @PaliC @msaroufim @Zacharias030 and the PyTorch team for your help!

siddagra pushed a commit to siddagra/KernelBench that referenced this pull request Nov 10, 2025
…ce#35)

* triton_backend_v2

* fix eval bugs

* fix issues

* revert eval

* remove traceback

* remove cot

* improve eval

* looked over pr and added future support for other languages

* updated requirements

* added back requirements.txt

* add cute one shot addition example

* remove unnecessary files and redo requirements

* let's see if that fixes it

* fix config in file suggested soksoerey

* move natalia's old file into change log

---------

Co-authored-by: AffectionateCurry <[email protected]>
Co-authored-by: nathanjpaek <[email protected]>
Co-authored-by: Simon Guo <[email protected]>
