@ZhangYulongg
Collaborator

Add the fastdeploy bench CLI:

  • latency: offline inference latency
    Example:
    fastdeploy bench latency --model /ModelData/ERNIE-4.5-0.3B-Paddle

  • serve: throughput and latency of served (online) inference
    Example:
    The parameters are the same as those of the benchmarks script.
    fastdeploy bench serve \
    --backend openai-chat \
    --label test \
    --model EB45T \
    --host 0.0.0.0 \
    --port 42688 \
    --dataset-name EBChat \
    --hyperparameter-path test.yaml \
    --percentile-metrics ttft,tpot,itl,e2el,s_ttft,s_itl,s_e2el,s_decode,input_len,s_input_len,output_len \
    --metric-percentiles 80,95,99,99.9,99.95,99.99 \
    --dataset-path ./filtered_sharedgpt_2000_input_1136_output_200_fd.json \
    --num-prompts 10 \
    --max-concurrency 10
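
For readers unfamiliar with the --metric-percentiles option, here is a minimal sketch of how a comma-separated list of percentile levels could be applied to per-request measurements such as ttft. It is an illustration only: the helper names (`percentile`, `summarize`, `parse_percentiles`), the sample values, and the linear-interpolation method are assumptions, not the code in this PR, and the real tool may compute its percentiles differently.

```python
from typing import Dict, List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) using linear interpolation.

    Hypothetical helper; the interpolation method is an assumption.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    # Fractional index of the requested percentile in the sorted data.
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def parse_percentiles(spec: str) -> List[float]:
    """Parse a comma-separated string such as "80,95,99" into floats."""
    return [float(item) for item in spec.split(",") if item.strip()]


def summarize(samples: Sequence[float], levels: Sequence[float]) -> Dict[str, float]:
    """Map each requested level to its percentile, keyed like "p99.9"."""
    return {f"p{q:g}": percentile(samples, q) for q in levels}


# Made-up time-to-first-token samples in milliseconds (not real results).
ttft_ms = [12.0, 15.0, 11.0, 30.0, 14.0, 13.0, 18.0, 16.0, 17.0, 90.0]
levels = parse_percentiles("80,95,99,99.9,99.95,99.99")
print(summarize(ttft_ms, levels))
```

With only 10 samples, as in the --num-prompts 10 example, the high levels (99.9 and above) all fall between the two largest measurements, so a larger prompt count is needed for the tail percentiles to be informative.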

@paddle-bot
paddle-bot bot commented Sep 17, 2025

Thanks for your contribution!

@paddle-bot added the "contributor" (External developers) label Sep 17, 2025
@EmmonsCurse EmmonsCurse merged commit 5532e8a into PaddlePaddle:develop Sep 22, 2025
31 of 39 checks passed