Conversation

@ckl117 (Collaborator) commented Sep 23, 2025

Cherry-pick from 2.2 PR #4115

  • Optimize the per_token_quant_fp8 kernel, improving its performance by 50%
  • Support Wfp8Afp8MoEMethod (channel-wise weight quantization)
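Channel-wise weight quantization keeps one FP8 scale per output channel (row) of each weight matrix, instead of a single scale for the whole tensor. A minimal numpy sketch of that math, assuming a `[out_channels, in_channels]` weight layout; the function name and shapes are illustrative, not FastDeploy's internal API, and real kernels emit actual FP8 values rather than clipped floats:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quant_weight_channel_wise(w: np.ndarray, eps: float = 1e-12):
    """Simulate FP8 quantization of a [out_channels, in_channels] weight
    with one scale per output channel (row)."""
    # per-row max-abs, guarded against all-zero rows
    scale = np.maximum(np.abs(w).max(axis=1, keepdims=True), eps) / FP8_E4M3_MAX
    q = np.clip(w / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale  # dequantize as q * scale

w = np.random.randn(16, 64).astype(np.float32)
q, s = quant_weight_channel_wise(w)
assert np.abs(q).max() <= FP8_E4M3_MAX
```

A per-channel scale tracks the dynamic range of each output channel separately, which loses less precision than a single per-tensor scale when channel magnitudes differ widely.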
python -m fastdeploy.entrypoints.openai.api_server \
    --model ${model_path} \
    --max-model-len 32768 \
    --max-num-seqs 128 \
    --tensor-parallel-size 1 \
    --load_choices "default_v1" \
    --quantization wfp8afp8
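For reference, the per_token_quant_fp8 kernel computes one scale per token (row) of the activation tensor. A hedged numpy emulation of that computation, assuming max-abs scaling to the FP8 E4M3 range; the actual optimized kernel runs on GPU and produces real FP8 outputs, while this only simulates the scaling and clipping:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def per_token_quant_fp8(x: np.ndarray, eps: float = 1e-12):
    """One scale per token: max-abs over the hidden dimension,
    then scale into the FP8 range and clip."""
    scale = np.maximum(np.abs(x).max(axis=-1, keepdims=True), eps) / FP8_E4M3_MAX
    q = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scale  # dequantize as q * scale

x = np.random.randn(4, 128).astype(np.float32)
q, s = per_token_quant_fp8(x)
assert np.allclose(q * s, x, atol=1e-4)
```

Because every token gets its own scale, outlier tokens do not force the quantization range of the whole batch, which is why per-token quantization is the usual choice for FP8 activations.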

@paddle-bot

paddle-bot bot commented Sep 23, 2025

Thanks for your contribution!

@ckl117 ckl117 changed the title Dev moe wfp8afp8 [OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 Sep 24, 2025
@ckl117 ckl117 merged commit 7c1fd19 into PaddlePaddle:develop Sep 24, 2025
26 of 28 checks passed