[GCU] Support gcu platform #2702
Thanks for your contribution!
9b02b05 to 08a285b
    causal=self.causal,
)

# res = self.native_sdpa_impl(
Please remove this unused code.
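The native SDPA implementation in mem_efficient_attn_backend is mainly used for accuracy comparison during debugging. We would like to keep this code so that it is easy to switch locally when bringing up a new network. It is now toggled through a variable.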
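For illustration, a reference attention path of this kind can be sketched as below. This is a minimal pure-Python sketch, not the PR's code: `USE_NATIVE_SDPA`, `native_sdpa`, `max_abs_diff`, and `attention` are hypothetical names, and the thread does not say which variable actually controls the switch.

```python
import math
import os

# Hypothetical switch name; the PR does not state which variable it uses.
USE_NATIVE_SDPA = os.environ.get("USE_NATIVE_SDPA", "0") == "1"


def native_sdpa(q, k, v, causal=False):
    """Reference scaled dot-product attention on nested lists.

    q: [seq_q][dim], k: [seq_k][dim], v: [seq_k][dim_v].
    The causal case assumes seq_q == seq_k.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, q_row in enumerate(q):
        # With causal masking, query i only attends to keys 0..i.
        limit = i + 1 if causal else len(k)
        scores = [
            scale * sum(a * b for a, b in zip(q_row, k[j]))
            for j in range(limit)
        ]
        # Numerically stable softmax over the visible keys.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[j][d] for j, w in enumerate(weights)) / total
            for d in range(len(v[0]))
        ])
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D lists."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def attention(q, k, v, causal, fast_impl):
    """Use the reference path when the switch is on, else the fast kernel."""
    if USE_NATIVE_SDPA:
        return native_sdpa(q, k, v, causal=causal)
    return fast_impl(q, k, v, causal=causal)
```

When debugging a new network, the output of the optimized kernel can be passed to `max_abs_diff` together with the `native_sdpa` result and checked against a tolerance.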
vivienfanghuagood
left a comment
Please add some detailed information to the PR:
- a list of supported models and tutorials
- a user guide for building and installing GCU FastDeploy
Provided. See the first comment of the PR for details.
08a285b to 592e800
def insert_prefill_inputs(self, req_dicts: List[Request]):
    """
    Process inputs for prefill tasks and insert it to share_inputs buffer
    TODO(gongshaotian): Refactor this func
Several comments are attributed to gongshaotian. Please update them to reflect the actual situation.
Updated.
norm_out = self.norm_func(
    x, residual_input, self.ln_weight, self.eps
)
# if residual_input is not None:
These commented-out lines should be deleted.
Deleted.
raise NotImplementedError(
    "prefix_caching is not support by GCUModelRunner."
)
# cache_kvs_list = []
The commented-out code needs to be deleted.
Deleted.
e24c60b to 55598cf
baseline: e7fa57e
c4a7a46 to 0ad9ae0
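Added
GCU hardware support, including:
- platforms
- gcu ops
- gcu_model_runner/gcu_worker
- attention
- moe
- quantization
- activation/normalization/linear/rotary_embedding/sampler
Currently supported models:
For detailed installation and usage, see the user guide.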