Simplified Thunderkittens Port #107

Willy-Chan · 2025-12-18T23:36:18Z

Adds ThunderKittens backend support, using a simplified load_inline() implementation.

This turns out to be much simpler than using separate files, but has its own tradeoffs and implications.

Also, users don't have to put the TK repo in the root directory; it is cloned automatically into the Modal container.

src/kernelbench/prompts/prompts.toml

willhu-jpg · 2025-12-31T18:24:42Z

We should add instructions to the README for installing the TK dir in local environments. I think there's also a CUDA version mismatch in the current TK repo: it's hardcoded to 12.6 if we rely on the TK instructions to call 'source env.src'.

Willy-Chan · 2026-01-01T03:53:30Z

> We should add instructions to the README for installing the TK dir in local environments. I think there's also a CUDA version mismatch in the current TK repo: it's hardcoded to 12.6 if we rely on the TK instructions to call 'source env.src'.

@willhu-jpg Added instructions to the README, lmk what you think. Also, could you clarify the CUDA version point? Like the tk-v2 branch uses a different version than kernelbench right now?

willhu-jpg · 2026-01-02T02:11:15Z

README changes look good. For the CUDA version, check out the env.src in the TK repo. The typical setup is to call "source env.src" to set all the environment variables, but they have the CUDA version hardcoded to 12.6.

willhu-jpg

changes look good otherwise! let's merge it :)

willhu-jpg · 2026-01-02T02:10:14Z

README.md

+tk_root = os.environ.get("THUNDERKITTENS_ROOT", "/root/ThunderKittens")
+```
+
+This allows the kernel to include the right TK primitives.s


Sounds good, fixed the typo. @simonguozirui TK works on CUDA versions 12.6 through 12.9. Right now the Modal scripts use CUDA 12.8 by default, which should be fine. Locally, users set CUDA_HOME themselves; TK's env.src defaults to 12.6. Lmk if this arrangement sounds fine.

That's awesome, we will likely stick to CUDA 12.8 for the Modal container. I’ll add a note that local users should either override CUDA_HOME or modify env.src to point to the appropriate CUDA version on their local dev box.
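
The CUDA arrangement discussed in this thread could be guarded with a small local check. This is a minimal sketch, assuming CUDA_HOME ends in a versioned directory such as /usr/local/cuda-12.8 (a common layout, not something this PR guarantees). The helper names are hypothetical, and the 12.6 to 12.9 bounds are simply the range quoted in the comment above.

```python
import re

# Supported range quoted in this thread (TK works on CUDA 12.6 - 12.9).
TK_MIN_CUDA = (12, 6)
TK_MAX_CUDA = (12, 9)


def cuda_version_from_home(cuda_home):
    """Extract (major, minor) from a path like /usr/local/cuda-12.8.

    Hypothetical helper: returns None when the path has no version suffix.
    """
    match = re.search(r"cuda-(\d+)\.(\d+)/?$", cuda_home)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_tk_compatible(cuda_home):
    """True if the version encoded in cuda_home is inside the quoted range."""
    version = cuda_version_from_home(cuda_home)
    return version is not None and TK_MIN_CUDA <= version <= TK_MAX_CUDA
```

A local setup script could call `is_tk_compatible(os.environ["CUDA_HOME"])` before compiling and print a hint to override CUDA_HOME or edit env.src when it returns False.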

simonguozirui · 2026-01-03T01:21:06Z

README.md

+
+This allows the kernel to include the right TK primitives.
+
+*NOTE*: Right now, all generated ThunderKittens kernels are required to be in datatype format BF16. FP16 support is TBD.


Should be good for now! Most TK kernels from the TK paper were done in BF16. We will add a code path to restrict it.

simonguozirui · 2026-01-03T01:22:05Z

scripts/eval_from_generations.py

+    # thunderkittens requires bf16 and H100 GPU
+    if backend == "thunderkittens":
+        config.precision = "bf16"
+        config.gpu = "H100"


technically blackwell and H200 too but we can worry about that later (i will address that in the enforcing H100 vs 200 PR)
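
The override in the diff above can be read as a small, testable helper. This is a sketch only: `EvalConfig` is a hypothetical stand-in for the script's real config object, and its default values are placeholders, not KernelBench defaults.

```python
from dataclasses import dataclass


@dataclass
class EvalConfig:
    # Hypothetical stand-in for the real config; defaults are placeholders.
    backend: str = "cuda"
    precision: str = "fp32"
    gpu: str = "L40S"


def apply_backend_constraints(config):
    """Force the settings the ThunderKittens backend currently requires.

    Mirrors the diff: bf16 precision and an H100 GPU. Other backends are
    left untouched.
    """
    if config.backend == "thunderkittens":
        config.precision = "bf16"
        config.gpu = "H100"
    return config
```

Keeping the rule in one function would make it easier to relax later, for example once H200 and Blackwell are allowed as mentioned in the comment above.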

simonguozirui · 2026-01-03T01:26:24Z

src/kernelbench/prompts/model_new_ex_add_thunderkittens.py

+        "--expt-extended-lambda",
+        "-DKITTENS_HOPPER",
+        "-DKITTENS_BLACKWELL",
+        "-diag-suppress=20012",


that's a smart way of doing it
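
For readers unfamiliar with how such flags reach the compiler, here is a minimal sketch of collecting them for `torch.utils.cpp_extension.load_inline`, which accepts `extra_cuda_cflags` and `extra_include_paths` keyword arguments. The function name `tk_compile_args` is hypothetical, only the four flags visible in the diff are listed (the real file may pass more), and the `<tk_root>/include` layout is an assumption about the TK checkout.

```python
import os


def tk_compile_args(tk_root=None):
    """Collect nvcc flags and include paths for a ThunderKittens kernel."""
    if tk_root is None:
        # Same lookup and default as the README snippet in this PR.
        tk_root = os.environ.get("THUNDERKITTENS_ROOT", "/root/ThunderKittens")
    return {
        "extra_cuda_cflags": [
            "--expt-extended-lambda",
            "-DKITTENS_HOPPER",
            "-DKITTENS_BLACKWELL",
            "-diag-suppress=20012",
        ],
        # Assumed layout: TK headers live under <tk_root>/include.
        "extra_include_paths": [os.path.join(tk_root, "include")],
    }
```

The returned dict could be unpacked into the call, e.g. `load_inline(name=..., cpp_sources=..., cuda_sources=..., **tk_compile_args())`.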

simonguozirui · 2026-01-03T01:41:19Z

README.md

+If you plan on using `scripts/generate_and_eval_single_sample.py` using `backend=thunderkittens`, make sure to git clone the ThunderKittens repo and you set the following environment variable to point to your local ThunderKittens directory:
+
+```bash
+export THUNDERKITTENS_ROOT=/Users/willychan/Desktop/projects/KernelBench/ThunderKittens


fix to be generic and i will add some comments etc
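
On the consuming side, a path-agnostic lookup might look like the sketch below. It assumes the same THUNDERKITTENS_ROOT variable and /root/ThunderKittens default shown in the README diff; the function names are hypothetical and the directory check does not validate the checkout's contents.

```python
import os


def resolve_tk_root(env=None):
    """Return the ThunderKittens directory, preferring THUNDERKITTENS_ROOT."""
    env = os.environ if env is None else env
    root = env.get("THUNDERKITTENS_ROOT", "/root/ThunderKittens")
    # Expand "~" so a value like ~/ThunderKittens also works.
    return os.path.abspath(os.path.expanduser(root))


def has_tk_checkout(tk_root):
    """Heuristic: the directory exists (its contents are not inspected)."""
    return os.path.isdir(tk_root)
```

With this, the README example can use any placeholder path, since nothing in the lookup depends on a particular user's directory layout.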

simonguozirui · 2026-01-03T01:56:36Z

Validate TK example execution both locally on H200 and modal H200

python3 scripts/run_and_check.py   ref_origin=local   ref_arch_src_path=src/kernelbench/prompts/model_ex_add_thunderkittens.py   kernel_src_path=src/kernelbench/prompts/model_new_ex_add_thunderkittens.py   eval_mode=<local, modal>   gpu=H200   backend=thunderkittens precision=bf16

We will use a matrix add example rather than a vector add example for ThunderKittens, similar to our last attempt, so the TK example can leverage TK primitives and the TK programming model.

simonguozirui

Validated it works! I will add more comments in the next PR since I don't have push access to this branch on Willy's repo.

Thank you so much @Willy-Chan for making TK work easily for KernelBench (with the load_inline format rather than the TK Python binding interface). This has been long awaited since Simran's and my first attempt before NeurIPS '24 (branch thunderkittens), and thank you for the many attempts (#101, #104). Many thanks to @willhu-jpg and @simran-arora for your advice and feedback too!

* run and check works with new TK format
* generate_and_eval_single_sample working with TK backend
* eval from generations changes
* generate_samples working
* generate_single_examples adapted
* updated git cloning of TK repo: using the tk-v2 branch
* removed unnecessary pip installs now that uv works
* fixed bug to prompts.toml pointing to correct file
* README explaining how to run with TK locally
* nit typo fix
* current commits only support bf16, clarification to README

Willy-Chan requested review from simonguozirui and willhu-jpg December 18, 2025 23:36

Willy-Chan mentioned this pull request Dec 18, 2025

ThunderKittens (w/ Blackwell) Support #104

Closed

Willy-Chan added 6 commits December 31, 2025 12:05

* run and check works with new TK format (4bc4c95)
* generate_and_eval_single_sample working with TK backend (06506a2)
* eval from generations changes (45eae0a)
* generate_samples working (7b38e6f)
* generate_single_examples adapted (3db725b)
* updated git cloning of TK repo: using the tk-v2 branch (faf0935)

Willy-Chan force-pushed the simplified_tk_port branch from 1ce1f1b to faf0935 December 31, 2025 17:10

simonguozirui mentioned this pull request Dec 31, 2025

[Roadmap] Fall 2025 KernelBench Maintenance + Improvement Plan #74

Open

28 tasks

willhu-jpg reviewed Dec 31, 2025

src/kernelbench/prompts/prompts.toml (outdated, resolved)

Willy-Chan added 3 commits December 31, 2025 22:04

* removed unnecessary pip installs now that uv works (3ed42ab)
* fixed bug to prompts.toml pointing to correct file (dfe5df8)
* README explaining how to run with TK locally (e7e61ee)

willhu-jpg self-requested a review January 2, 2026 03:03

willhu-jpg approved these changes Jan 2, 2026


Willy-Chan added 2 commits January 1, 2026 22:20

* nit typo fix (e492af5)
* current commits only support bf16, clarification to README (d7fee2c)

simonguozirui reviewed Jan 3, 2026


simonguozirui approved these changes Jan 3, 2026


simonguozirui merged commit f393682 into ScalingIntelligence:main Jan 3, 2026

