# cuDNN Frontend v1.21.0 Release Notes by Anerudhan · Pull Request #213 · NVIDIA/cudnn-frontend

Anerudhan · 2026-03-25T01:37:58Z

cuDNN Frontend v1.21.0 is the recommended version for [cuDNN 9.20.0](https://docs.nvidia.com/deeplearning/cudnn/backend/latest/release-notes.html#cudnn-9-20-0) and later releases.

## General Improvements 🚀

- Dropped the frontend library's dependency on the CUDA driver API, enabling builds without direct CUDA driver linkage.

## Open-Source Kernels

Added new kernels for GEMM fusions:

- **[Grouped GEMM + GLU](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/discrete_grouped_gemm/grouped_gemm_glu):** Unified grouped GEMM GLU API supporting dense and discrete MoE weight layouts with optional bias.
- **[Grouped GEMM + dGLU](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/discrete_grouped_gemm/grouped_gemm_dglu):** Unified grouped GEMM dGLU backward API supporting dense and discrete MoE weight layouts with optional bias.
- **[Discrete Grouped GEMM + SwiGLU](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/discrete_grouped_gemm/discrete_grouped_gemm_swiglu):** Per-expert-pointer SwiGLU grouped GEMM for MoE workloads without weight packing.
- **[Discrete Grouped GEMM + dSwiGLU](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/discrete_grouped_gemm/discrete_grouped_gemm_dswiglu):** Per-expert-pointer dSwiGLU backward grouped GEMM for MoE workloads without weight packing. Uses the dSwiGLU/dGeGLU backward epilogue.
- **[Grouped GEMM + dSwiGLU](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/gemm_dswiglu):** dSwiGLU activation fused with grouped GEMM.
- **[Grouped GEMM + Quant](https://github.com/NVIDIA/cudnn-frontend/tree/main/python/cudnn/grouped_gemm/grouped_gemm_quant):** Grouped GEMM with output quantization for MoE FC2/dFC1 workloads.
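For readers unfamiliar with the terminology, the following is a minimal pure-Python reference sketch of what a grouped GEMM fused with SwiGLU computes in an MoE layer. It is not the cuDNN Frontend API and does not reflect how the kernels are implemented. All function names are hypothetical, and the gate/up layout (gate in the first half of the output columns, up projection in the second half) is an assumption made for illustration; the actual kernels may use a different layout. The "discrete" case is modeled as one separate weight matrix per expert, and the "dense" case as a single stacked matrix that is split before use.

```python
import math


def matmul(a, b):
    """Row-major product of a (m x k) and b (k x n), both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu(h):
    """Assumed layout: first half of each row is the gate, second half the up projection."""
    half = len(h[0]) // 2
    return [[silu(row[j]) * row[half + j] for j in range(half)] for row in h]


def grouped_gemm_swiglu(tokens, group_sizes, expert_weights):
    """Reference grouped GEMM + SwiGLU.

    tokens:         rows already sorted by expert
    group_sizes:    group_sizes[e] consecutive rows belong to expert e
    expert_weights: one (k x 2n) matrix per expert, passed as separate
                    objects (the per-expert-pointer style, no packing)
    """
    out, start = [], 0
    for size, weight in zip(group_sizes, expert_weights):
        if size:  # an expert may receive no tokens
            out.extend(swiglu(matmul(tokens[start:start + size], weight)))
        start += size
    return out


def unpack_dense(packed, num_experts):
    """Split a dense stack (experts concatenated along rows) into per-expert matrices."""
    k = len(packed) // num_experts
    return [packed[e * k:(e + 1) * k] for e in range(num_experts)]


# Two experts, hidden size k = 2, output width n = 2 (so each weight is 2 x 4).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w0 = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 2.0]]
w1 = [[0.5, 0.5, 1.0, 1.0], [0.5, 0.5, 1.0, 1.0]]

discrete_out = grouped_gemm_swiglu(tokens, [2, 1], [w0, w1])
dense_out = grouped_gemm_swiglu(tokens, [2, 1], unpack_dense(w0 + w1, 2))
```

Under these assumptions both layouts produce the same result; the difference is only in how the weights are handed over, which is why avoiding a packing step can matter when expert weights already live in separate buffers.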

Anerudhan force-pushed the 1.21.0-rc branch from 0a84ea9 to 5b8c8b4 on March 25, 2026, 01:59

Anerudhan merged commit 7b9b711 into main Mar 25, 2026
1 check passed


Anerudhan merged 1 commit into main from 1.21.0-rc
