Conversation

@littledgg
Contributor

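Fixes the CUDA graph replay failures (including CUDA error 700 and garbled output) that occurred because the attention operator took different branches. The stress test now runs to completion. The original operator may still contain many branches that are never executed, so further changes are needed; this is the smallest change that works for now.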

Comment on lines +490 to 491
if (num_chunks_this_seq <= 0) {
o_base_ptr_int8 = out + o_offset;
Collaborator

Can this branch simply be deleted?

@paddle-bot

paddle-bot bot commented Jul 31, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Jul 31, 2025
@littledgg
Contributor Author

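In custom_ops/gpu_ops/append_attn/append_attention_func.cuh, merge_multi_chunks_decoder_kernel needs to be changed from

if (num_chunks_this_seq <= 1) {
  return;
}

to

if (num_chunks_this_seq <= 0) {
  return;
}

However, multi-batch runs still have a problem; I will submit everything together once that has been tracked down.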
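To illustrate why the guard matters, here is a minimal host-side C++ sketch of the merge step. It is a CPU model, not the actual kernel: `ChunkPartial`, `merge_chunks`, and `min_chunks_to_merge` are hypothetical names, and the merge uses the usual log-sum-exp combination of per-chunk softmax results, which may differ in detail from what `merge_multi_chunks_decoder_kernel` does. It also assumes that, after this change, a single-chunk sequence writes its result to the temporary buffer instead of directly to `out`, so the merge step must still run to produce the final output.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of one (sequence, head) slice; not the real kernel.
struct ChunkPartial {
  float max_logit;         // max attention logit seen inside the chunk
  float exp_sum;           // sum of exp(logit - max_logit) inside the chunk
  std::vector<float> out;  // chunk output, already normalized by exp_sum
};

// Merges per-chunk partial results into `out`.
// `min_chunks_to_merge` models the guard:
//   2 behaves like `if (num_chunks_this_seq <= 1) return;`
//   1 behaves like `if (num_chunks_this_seq <= 0) return;`
// Returns false when the guard fires, in which case `out` is left untouched.
bool merge_chunks(const std::vector<ChunkPartial>& chunks,
                  int min_chunks_to_merge, std::vector<float>& out) {
  const int num_chunks = static_cast<int>(chunks.size());
  if (num_chunks < min_chunks_to_merge) return false;

  // Global max across chunks keeps the exponentials numerically stable.
  float m = -INFINITY;
  for (const auto& c : chunks) m = std::fmax(m, c.max_logit);

  float denom = 0.0f;
  std::vector<float> acc(out.size(), 0.0f);
  for (const auto& c : chunks) {
    // Weight of this chunk after rescaling to the global max.
    const float w = c.exp_sum * std::exp(c.max_logit - m);
    denom += w;
    for (std::size_t i = 0; i < acc.size(); ++i) acc[i] += w * c.out[i];
  }
  for (std::size_t i = 0; i < acc.size(); ++i) out[i] = acc[i] / denom;
  return true;
}
```

In this model, calling `merge_chunks` with a single chunk and `min_chunks_to_merge = 2` leaves `out` unwritten, while `min_chunks_to_merge = 1` copies the single chunk's result through the same code path that multi-chunk sequences use. Under the stated assumption, that is the behavior the `<= 0` guard is meant to give.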
