Conversation

Collaborator

@DrRyanHuang DrRyanHuang commented Jul 21, 2025

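This PR proceeds in three steps:

  • 114c5eb (#2937) streamlines the operators in part of forward. It uses boolean indexing to remove the if statement, which fixes the fragmented small operators in the later part of forward. However, because boolean indexing is introduced, performance drops (in the case where all tokens are text tokens and there are no image tokens).
  • 10b36e6 (#2937) removes unnecessary operators from part of forward.
  • e545864 (#2937) fixes the performance regression introduced by the first commit by moving the boolean indexing into the later custom operator extract_text_token_output_kernel. The cause of the regression is that boolean indexing in Paddle is implemented as a non_zero + gather_nd combination, and non_zero involves a GPU -> CPU step that is relatively time-consuming. In addition, with the boolean indexing removed, performance improves slightly.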
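
For readers unfamiliar with the decomposition mentioned in the third step, here is a minimal sketch of the idea. It uses NumPy rather than Paddle, and it is not the code from this PR or from extract_text_token_output_kernel. The function names (`bool_index`, `nonzero_then_gather`, `masked_overwrite`) and the sample tensors are hypothetical. It only illustrates that boolean indexing is equivalent to a nonzero step followed by a gather, that the result size depends on the mask's contents, and that a fixed-shape masked write avoids materialising the index list.

```python
import numpy as np


def bool_index(x, mask):
    """Boolean indexing: the number of returned rows depends on the mask values."""
    return x[mask]


def nonzero_then_gather(x, mask):
    """The same selection written as two explicit steps."""
    # Step 1: find the indices of True entries. The length of idx is only
    # known after inspecting the data.
    idx = np.nonzero(mask)[0]
    # Step 2: gather the rows at those indices.
    return np.take(x, idx, axis=0)


def masked_overwrite(x, update, mask):
    """Fixed-shape alternative (hypothetical): choose per row, no index list."""
    return np.where(mask[:, None], update, x)


# Hypothetical data: 4 tokens with hidden size 3; token 1 is not a text token.
hidden = np.arange(12, dtype=np.float32).reshape(4, 3)
is_text = np.array([True, False, True, True])

selected = bool_index(hidden, is_text)
gathered = nonzero_then_gather(hidden, is_text)
merged = masked_overwrite(hidden, hidden * 2, is_text)
```

In this sketch the output shape of `masked_overwrite` is always that of its input, so no step needs to know how many entries of the mask are set. Whether the actual custom operator works this way is an assumption here, not something stated in the PR description.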

cc @zyfncg @SigureMo

paddle-bot bot commented Jul 21, 2025

Thanks for your contribution!

Collaborator

@xiaoxiaohehe001 xiaoxiaohehe001 left a comment

LGTM

ming1753 previously approved these changes Jul 21, 2025
Collaborator

@ming1753 ming1753 left a comment

LGTM

Collaborator

@ming1753 ming1753 left a comment

LGTM

Collaborator

@gongshaotian gongshaotian left a comment

LGTM

@gongshaotian gongshaotian merged commit 94264bb into PaddlePaddle:develop Aug 1, 2025
11 of 14 checks passed
@DrRyanHuang DrRyanHuang deleted the rm_forward_if branch August 1, 2025 09:28

4 participants