
Conversation

@rsmallblue
Collaborator

Support RMS norm computation on q and k after RoPE in append attention.

@paddle-bot
paddle-bot bot commented Aug 1, 2025

Thanks for your contribution!

@rsmallblue changed the title from "support qk norm" to "support qk norm for append attn" on Aug 1, 2025
metadata.kv_signal_data_list[layer.layer_id],
getattr(layer, "q_norm_weight", None),
getattr(layer, "k_norm_weight", None),
getattr(layer, "rms_norm_eps"),
Collaborator

Shouldn't a default value be set here?

Collaborator Author

done
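For reference, a minimal sketch of the resolved pattern, assuming the call site looks like the excerpt above; the epsilon fallback value here is an illustrative assumption, not necessarily what the merged PR uses:

# Sketch: fetch the qk-norm attributes with explicit fallbacks so layers
# that were built without qk norm still work.
q_norm_weight = getattr(layer, "q_norm_weight", None)
k_norm_weight = getattr(layer, "k_norm_weight", None)
# Assumed fallback; the actual default epsilon may differ in the repo.
rms_norm_eps = getattr(layer, "rms_norm_eps", 1e-6)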

const uint32_t elem_nums =
use_neox_style ? bsz * (num_heads + 2 * kv_num_heads) * dim_head / 2
: bsz * (num_heads + 2 * kv_num_heads) * dim_head;
constexpr int HEAD_DIM = 128;
Collaborator

Don't we need a check here? dim_head values other than 128 are not supported.
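A minimal Python-side sketch of such a guard, placed before the kernel launch; the function name and error message are hypothetical, and the merged code may enforce this on the C++ side instead:

def check_qk_norm_head_dim(dim_head: int) -> None:
    # The fused kernel above compiles HEAD_DIM as a constant 128,
    # so any other head dim must be rejected up front.
    if dim_head != 128:
        raise ValueError(
            f"qk norm in append attention requires dim_head == 128, got {dim_head}"
        )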

instance_key = (cls, frozenset(kwargs.items()))
if instance_key not in instances:
instances[instance_key] = cls(*args, **kwargs)
return instances[instance_key]
Collaborator

This change is out of scope for q/k norm, and the current EP engine implementation doesn't require modifying this.

Collaborator Author

> This change is out of scope for q/k norm, and the current EP engine implementation doesn't require modifying this.

Done
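For context, the excerpt above keys a singleton cache on the keyword arguments as well as the class. A self-contained sketch of that pattern, with the decorator scaffolding reconstructed around the excerpt (it may differ from the actual file, which this comment asked to revert):

def singleton(cls):
    instances = {}

    def get_instance(*args, **kwargs):
        # frozenset makes kwargs hashable and order-insensitive, so the
        # same class constructed with different kwargs gets its own instance.
        instance_key = (cls, frozenset(kwargs.items()))
        if instance_key not in instances:
            instances[instance_key] = cls(*args, **kwargs)
        return instances[instance_key]

    return get_instance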

@rsmallblue force-pushed the qk_norm branch 4 times, most recently from b09f64e to ee996e3 on August 4, 2025 13:51
@yuanlehome
Collaborator

Please add a comment at the attention layer noting that use_qk_norm performs qk_norm after RoPE; other open-source models appear to do qk_norm before RoPE, so this difference needs to be called out explicitly.

@rsmallblue force-pushed the qk_norm branch 3 times, most recently from 37b6618 to ab37724 on August 5, 2025 05:50
@rsmallblue
Collaborator Author

> Please add a comment at the attention layer noting that use_qk_norm performs qk_norm after RoPE; other open-source models appear to do qk_norm before RoPE, so this difference needs to be called out explicitly.

done
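A sketch of the kind of note requested above, as it might appear in the attention layer; everything except the use_qk_norm flag itself is an assumption for illustration:

class Attention:
    """Append attention layer.

    Note: use_qk_norm applies RMS norm to q and k AFTER RoPE. Most
    open-source models apply qk norm BEFORE RoPE, so keep this
    difference in mind when porting weights or comparing outputs.
    """

    def __init__(self, use_qk_norm: bool = False, rms_norm_eps: float = 1e-6):
        self.use_qk_norm = use_qk_norm
        self.rms_norm_eps = rms_norm_eps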

@Jiang-Jia-Jun merged commit 7ce00e5 into PaddlePaddle:develop on Aug 5, 2025
11 of 14 checks passed
megemini added a commit to megemini/FastDeploy that referenced this pull request Aug 6, 2025