
Conversation

@RichardWooSJTU
Collaborator

Add lm head bias for EP

if self.linear_bias_key is None:
    logits = paddle.matmul(logits, self.weight)
else:
    logits = paddle.incubate.nn.functional.fused_linear(logits, self.weight, self.bias)
Collaborator

Do these two calls dispatch to different underlying OPs? Will there be a performance diff?

Collaborator Author

Yes, they differ: paddle.matmul calls cuBLAS, while paddle.incubate.nn.functional.fused_linear calls the fused cublasLt kernel, which folds the bias addition into the GEMM epilogue. In theory this performs better than invoking a separate add op.
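For reference, both branches compute the same logits; only the kernel count differs. A minimal NumPy sketch of the dispatch (NumPy stands in for Paddle here, and the shapes are illustrative assumptions, not the model's actual dimensions):

```python
import numpy as np

def compute_logits(hidden, weight, bias=None):
    # Mirrors the PR's branch: no bias -> plain matmul (cuBLAS path);
    # bias present -> fused linear (cublasLt fuses the bias add into the
    # GEMM epilogue, avoiding a separate elementwise-add kernel launch).
    if bias is None:
        return np.matmul(hidden, weight)
    # NumPy has no fused kernel; this line just models the same math.
    return np.matmul(hidden, weight) + bias

hidden = np.ones((2, 4))   # illustrative batch of hidden states
weight = np.ones((4, 3))   # illustrative lm_head weight shard
bias = np.full(3, 0.5)     # illustrative lm_head bias shard
print(compute_logits(hidden, weight))        # all 4.0
print(compute_logits(hidden, weight, bias))  # all 4.5
```

On GPU the fused path saves one kernel launch and one round trip through memory for the logits tensor, which is where the expected performance gain comes from.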

@paddle-bot

paddle-bot bot commented Aug 4, 2025

Thanks for your contribution!

    is_bias=False,
)
if self.bias_key is not None:
    self.bias = self.create_parameter(
Collaborator

Shouldn't whether lm_head has a bias be independent of the use_ep logic? It should be generic.

Collaborator Author

Under TP we use the ColumnLinear class, which performs this check internally. The reason EP does not reuse ColumnLinear is that this path directly passes in the num_ranks info obtained from fleet. Merging the EP and TP paths could be done by checking ep_rank and tp_rank in fd_config; due to limited bandwidth we did not do that refactor here.
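What the reviewer suggests, making bias creation depend only on the bias key rather than on the EP branch, could look like the toy sketch below. All names here (the LMHead class shape, the create_parameter stand-in) are illustrative assumptions, not FastDeploy's actual API:

```python
# Hypothetical sketch: bias existence depends only on bias_key, so the
# same code path serves both the EP and TP configurations.
class LMHead:
    def __init__(self, shard_vocab_size, bias_key=None):
        self.bias_key = bias_key
        # Generic check, independent of use_ep vs ColumnLinear.
        self.bias = self.create_parameter(shard_vocab_size) if bias_key else None

    def create_parameter(self, size):
        # Toy stand-in for paddle.nn.Layer.create_parameter.
        return [0.0] * size

head = LMHead(8, bias_key="lm_head.bias")
print(head.bias is not None)  # True
print(LMHead(8).bias is None)  # True
```

This matches the author's note that a real unification would read ep_rank and tp_rank from fd_config instead of passing rank info from fleet.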

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 1e9a8e8 into PaddlePaddle:develop Aug 5, 2025
11 of 14 checks passed