support qk norm for append attn #3145
Conversation
Thanks for your contribution!
    metadata.kv_signal_data_list[layer.layer_id],
    getattr(layer, "q_norm_weight", None),
    getattr(layer, "k_norm_weight", None),
    getattr(layer, "rms_norm_eps"),
Doesn't this need a default value set here?
done
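For context, a minimal sketch of what the reviewer is suggesting, assuming the attention layer may not always define `rms_norm_eps`. The attribute names mirror the excerpt above, but the fallback value shown is a common RMSNorm default and a placeholder, not necessarily the value the PR ended up using:

```python
# Hypothetical sketch: supply a fallback epsilon when the layer does not define one.
q_norm_weight = getattr(layer, "q_norm_weight", None)
k_norm_weight = getattr(layer, "k_norm_weight", None)
rms_norm_eps = getattr(layer, "rms_norm_eps", 1e-6)  # 1e-6 is a placeholder default
```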
    const uint32_t elem_nums =
        use_neox_style ? bsz * (num_heads + 2 * kv_num_heads) * dim_head / 2
                       : bsz * (num_heads + 2 * kv_num_heads) * dim_head;
    constexpr int HEAD_DIM = 128;
Shouldn't a check be added here? dim_head values other than 128 are not supported.
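A hedged sketch of the kind of guard being asked for, written at the Python call site rather than inside the CUDA kernel. The function and constant names are illustrative, not the PR's actual signatures; the real check could equally be a C++-side enforce before launching the kernel:

```python
SUPPORTED_HEAD_DIM = 128  # the excerpt above hard-codes HEAD_DIM = 128

def check_qk_norm_head_dim(dim_head: int) -> None:
    """Reject head dims the fused qk-norm path cannot handle (illustrative only)."""
    if dim_head != SUPPORTED_HEAD_DIM:
        raise NotImplementedError(
            f"qk norm in append attention only supports dim_head == {SUPPORTED_HEAD_DIM}, "
            f"got {dim_head}"
        )
```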
fastdeploy/utils.py (outdated)
    instance_key = (cls, frozenset(kwargs.items()))
    if instance_key not in instances:
        instances[instance_key] = cls(*args, **kwargs)
    return instances[instance_key]
This change doesn't fall within the scope of qk norm, and the current ep engine implementation doesn't need this modified.
Done
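For readers without the full diff, the fragment in fastdeploy/utils.py looks like part of a keyword-aware singleton helper; the reconstruction below is purely illustrative of the pattern under discussion, and its name and structure are assumptions rather than the repository's actual code:

```python
def singleton(cls):
    """Cache one instance per (class, kwargs) combination -- illustrative sketch only."""
    instances = {}

    def get_instance(*args, **kwargs):
        # frozenset(kwargs.items()) makes the keyword arguments hashable, so calls
        # with different (hashable) kwargs get different cached instances.
        # Note that positional args are not part of the key, matching the excerpt.
        instance_key = (cls, frozenset(kwargs.items()))
        if instance_key not in instances:
            instances[instance_key] = cls(*args, **kwargs)
        return instances[instance_key]

    return get_instance
```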
Force-pushed from b09f64e to ee996e3.
Please add a comment in the attention layer explaining that use_qk_norm applies qk_norm after RoPE. Other open-source models appear to apply qk_norm before RoPE, so this difference needs to be called out explicitly.
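To make the ordering difference concrete, a schematic sketch is shown below. `apply_rope` and `rms_norm` are placeholders for the real kernels; only the order of operations matters here, not the actual FastDeploy API:

```python
# Convention used by this PR (per the review comment): normalize AFTER RoPE.
def qk_norm_after_rope(q, k, q_w, k_w, eps, apply_rope, rms_norm):
    q, k = apply_rope(q, k)
    return rms_norm(q, q_w, eps), rms_norm(k, k_w, eps)

# Convention reportedly used by most open-source models: normalize BEFORE RoPE.
def qk_norm_before_rope(q, k, q_w, k_w, eps, apply_rope, rms_norm):
    q, k = rms_norm(q, q_w, eps), rms_norm(k, k_w, eps)
    return apply_rope(q, k)
```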
Force-pushed from 37b6618 to ab37724.
done
Support RMS norm computation on q and k after RoPE in append attention.
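As a reference for what the fused kernel computes, here is a plain NumPy sketch of per-head RMS norm applied to q and k after RoPE. It follows the standard RMSNorm formula and is an approximation for illustration, not a copy of the CUDA implementation; the tensor shapes and helper names in the usage comments are assumptions:

```python
import numpy as np

def rms_norm_per_head(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """RMS-normalize the last (head_dim) axis: x / sqrt(mean(x^2) + eps) * weight."""
    variance = np.mean(np.square(x), axis=-1, keepdims=True)
    return x / np.sqrt(variance + eps) * weight

# Usage sketch (shapes assumed): q is [tokens, num_heads, head_dim],
# k is [tokens, kv_num_heads, head_dim]; both are normalized after RoPE.
# q, k = apply_rope(q, k)                          # RoPE first (this PR's convention)
# q = rms_norm_per_head(q, q_norm_weight, eps)     # then RMS norm on q
# k = rms_norm_per_head(k, k_norm_weight, eps)     # and on k
```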