Contributor

@EnflameGCU EnflameGCU commented Jul 3, 2025

Adds GCU hardware support, including:

  • build system
  • platforms
  • gcu ops
  • gcu_model_runner/gcu_worker
  • attention
  • moe
  • quantization
  • activation/normalization/linear/rotary_embedding/sampler
  • other workflow adaptations

Currently supported models:

  • ERNIE-4.5-300B-A47B
  • ERNIE-4.5-21B-A3B
  • ERNIE-4.5-0.3B

For detailed installation and usage instructions, see the usage guide.

paddle-bot bot commented Jul 3, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Jul 3, 2025
@EnflameGCU EnflameGCU force-pushed the support_gcu_platform branch 9 times, most recently from 9b02b05 to 08a285b Compare July 4, 2025 08:41
causal=self.causal,
)

# res = self.native_sdpa_impl(
Collaborator

Please remove this unused code.

Contributor Author

@EnflameGCU EnflameGCU Jul 7, 2025

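The native SDPA implementation in mem_efficient_attn_backend is mainly used for accuracy comparison during debugging. We would like to keep this code so that we can quickly switch to it locally when bringing up a new network. The switch is now controlled by a variable.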
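
For readers unfamiliar with this pattern, here is a minimal sketch of how a reference SDPA path can be kept behind a switch. It is not the code from this PR: the names `USE_NATIVE_SDPA`, `native_sdpa`, `attention`, and `fused_impl` are hypothetical, and plain Python lists stand in for device tensors so that the example runs anywhere.

```python
import math
import os

# Hypothetical switch name; the PR only says the choice is "controlled by a variable".
USE_NATIVE_SDPA = os.getenv("USE_NATIVE_SDPA", "0") == "1"


def native_sdpa(q, k, v, causal=True):
    """Reference scaled dot-product attention on nested lists.

    q, k, v have shape [seq_len][head_dim]. With causal=True, q and k are
    assumed to have the same sequence length.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, q_row in enumerate(q):
        # With a causal mask, position i only attends to positions <= i.
        limit = i + 1 if causal else len(k)
        scores = [
            scale * sum(a * b for a, b in zip(q_row, k[j])) for j in range(limit)
        ]
        # Numerically stable softmax: subtract the row maximum before exp.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * v[j][d] for j, w in enumerate(weights)) / total
                for d in range(len(v[0]))
            ]
        )
    return out


def attention(q, k, v, causal=True, fused_impl=None, use_native=USE_NATIVE_SDPA):
    """Use the reference path when requested, otherwise the kernel in fused_impl."""
    if use_native or fused_impl is None:
        return native_sdpa(q, k, v, causal=causal)
    return fused_impl(q, k, v, causal=causal)
```

In this sketch, setting `USE_NATIVE_SDPA=1` in the environment (or passing `use_native=True`) routes calls through the reference path, so its output can be compared against the device kernel without editing the backend.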

Collaborator

@vivienfanghuagood vivienfanghuagood left a comment

Please add some detailed information to the PR:

  • the list of supported models and tutorials
  • a user guide for building and installing GCU FastDeploy

@EnflameGCU
Contributor Author

Please add some detailed information to the PR:

  • the list of supported models and tutorials
  • a user guide for building and installing GCU FastDeploy

Already provided; see the first comment of this PR.

@EnflameGCU EnflameGCU force-pushed the support_gcu_platform branch from 08a285b to 592e800 Compare July 7, 2025 02:30
def insert_prefill_inputs(self, req_dicts: List[Request]):
"""
Process inputs for prefill tasks and insert it to share_inputs buffer
TODO(gongshaotian): Refactor this func
Collaborator

Several comments are attributed to gongshaotian; please update them to reflect the actual situation.

Contributor Author

Updated.

norm_out = self.norm_func(
x, residual_input, self.ln_weight, self.eps
)
# if residual_input is not None:
Collaborator

These commented-out lines should be removed.

Contributor Author

Removed.

raise NotImplementedError(
"prefix_caching is not support by GCUModelRunner."
)
# cache_kvs_list = []
Collaborator

The commented-out code needs to be removed.

Contributor Author

Removed.

@EnflameGCU EnflameGCU force-pushed the support_gcu_platform branch 3 times, most recently from e24c60b to 55598cf Compare July 7, 2025 09:31
@EnflameGCU EnflameGCU force-pushed the support_gcu_platform branch from c4a7a46 to 0ad9ae0 Compare July 8, 2025 03:35
@yongqiangma yongqiangma merged commit d0f4d6b into PaddlePaddle:develop Jul 8, 2025
3 checks passed

7 participants