Add runtime kernel validation for SageAttention #437

Merged
leszko merged 1 commit into main from fix/sageattention-kernel-validation
Feb 12, 2026

Conversation

@yondonfu
Contributor

Summary

  • Adds a runtime CUDA kernel probe that runs right after the SageAttention import succeeds, catching incompatible precompiled kernels (e.g. cudaErrorNoKernelImageForDevice) at import time rather than during inference
  • Allocates small test tensors, calls sageattn(), then calls torch.cuda.synchronize() to surface asynchronous CUDA errors
  • On failure, sets SAGEATTN_AVAILABLE = False with a clear diagnostic message, allowing automatic fallback to Flash Attention
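The probe described above could look roughly like the sketch below. This is an illustration, not the merged code: the function name `probe_sageattention`, the tensor shape, the dtype, and the exact wording of the returned reasons are assumptions. Only `sageattn()`, `torch.cuda.synchronize()`, `SAGEATTN_AVAILABLE`, and `DISABLE_SAGEATTENTION` come from the PR description.

```python
import os


def probe_sageattention():
    """Return (available, reason). Hypothetical helper, not the PR's actual code."""
    # Honor the opt-out before touching CUDA at all.
    if os.environ.get("DISABLE_SAGEATTENTION") == "1":
        return False, "disabled via DISABLE_SAGEATTENTION"

    try:
        import torch
        from sageattention import sageattn
    except Exception as exc:  # missing package, or a broken binary wheel
        return False, f"sageattention import failed: {exc}"

    if not torch.cuda.is_available():
        return False, "CUDA is not available"

    try:
        # Small tensors in (batch, heads, seq_len, head_dim) layout; the shape
        # and dtype here are assumptions chosen to keep the probe cheap.
        q = torch.randn(1, 2, 64, 64, dtype=torch.float16, device="cuda")
        k = torch.randn_like(q)
        v = torch.randn_like(q)
        sageattn(q, k, v)
        # Kernel launches are asynchronous, so an incompatible kernel image
        # may only raise once the device is synchronized.
        torch.cuda.synchronize()
    except Exception as exc:
        return False, f"sageattention kernels are not compatible with this GPU: {exc}"

    return True, "ok"


SAGEATTN_AVAILABLE, SAGEATTN_REASON = probe_sageattention()
```

Returning a reason alongside the flag keeps the diagnostic message next to the decision, so the caller can log it once when choosing the fallback.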

Test plan

  • uv run daydream-scope --no-browser starts without errors on a machine with compatible SageAttention
  • On a machine with incompatible SageAttention kernels (e.g. a precompiled wheel missing the GPU's sm arch), the probe fails gracefully and prints "sageattention kernels are not compatible with this GPU."
  • DISABLE_SAGEATTENTION=1 still bypasses SageAttention before the probe runs
  • Inference works via Flash Attention fallback when probe fails
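The failure and fallback cases in the test plan can be exercised without a GPU by injecting stand-in callables. The sketch below is a hypothetical harness: `run_probe`, the fake callables, and the backend names are assumptions made for illustration, and the simulated error is raised at the synchronize step to mimic an asynchronous CUDA failure.

```python
def run_probe(attn_fn, synchronize):
    """Run a kernel call followed by a sync; report whether both succeeded."""
    try:
        attn_fn()
        synchronize()
    except Exception:
        print("sageattention kernels are not compatible with this GPU.")
        return False
    return True


def noop():
    # Stand-in for a kernel launch or sync that succeeds.
    pass


def failing_sync():
    # Stand-in for an async CUDA error that only surfaces on synchronize.
    raise RuntimeError("CUDA error: no kernel image is available for execution on the device")


compatible = run_probe(noop, noop)
incompatible = run_probe(noop, failing_sync)

# Hypothetical backend selection based on the probe result.
backend = "sageattention" if incompatible else "flash_attention"
```

Here the launch itself "succeeds" and only the synchronize step raises, which is the situation the probe's explicit `torch.cuda.synchronize()` call is meant to catch.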

🤖 Generated with Claude Code

@yondonfu yondonfu marked this pull request as draft February 11, 2026 00:01
@leszko leszko force-pushed the fix/sageattention-kernel-validation branch from fce29be to 65315ae Compare February 12, 2026 13:26
@leszko leszko marked this pull request as ready for review February 12, 2026 13:27
Collaborator

@leszko leszko left a comment

Updated the PR and tested on both 4090 and 5090. Works fine now. Merging.

@leszko leszko merged commit 737ca6b into main Feb 12, 2026
6 checks passed
@yondonfu yondonfu deleted the fix/sageattention-kernel-validation branch February 12, 2026 15:35