[fix] qwen output inconsistency when top_p=0 #3634

liyonghua0910 · 2025-08-26T13:23:46Z

Problem description

With the Qwen-2-7b-Instruct model deployed and top_p=0 set on the request, sending the same request twice in a row produces different outputs.

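Root cause

The diff comes mainly from the apply_penalty_multi_scores step. The only input that differs between the two requests is sampling_metadata.pre_token_ids.

Specifically:

First request: pre_token_ids=[198, 39814, -1, -1, -1, -1, -1, -1, -1, -1, ...
Second request: pre_token_ids=[198, 39814, 11, 1588, 525, 2326, 5837, 323, 862, 92999, 1447, 16, 13, ...
As shown, pre_token_ids was not reset to -1 when the second request was run, so it still holds tokens left over from the first request.

The code in custom_ops/gpu_ops/token_penalty_multi_scores.cu does not use cur_len to mask out the invalid trailing values. Instead, it relies on pre_ids[cur_len: ] having been set to a negative value (such as -1) in advance for the computation to be correct.

The V1 Scheduler, however, has no logic to reset pre_token_ids to -1 when a request is prefilled. V0 does have this logic.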
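
To illustrate why stale trailing entries matter, here is a toy sketch in pure Python. It is not the CUDA kernel from token_penalty_multi_scores.cu: the function names, the tiny vocabulary, and the penalty rule (a common repetition-penalty convention) are all assumptions made for illustration. It only shows that a routine which skips negative ids, rather than masking by cur_len, gives different results when the tail of the buffer was not reset.

```python
from typing import List


def _scale(logit: float, penalty: float) -> float:
    # Assumed convention: shrink positive logits, push negative ones further down.
    return logit / penalty if logit > 0 else logit * penalty


def penalize_trusting_sentinel(logits: List[float], pre_ids: List[int],
                               penalty: float) -> List[float]:
    """Scan the whole buffer and skip only negative ids (hypothetical helper)."""
    seen = {t for t in pre_ids if t >= 0}
    return [_scale(x, penalty) if i in seen else x for i, x in enumerate(logits)]


def penalize_masked(logits: List[float], pre_ids: List[int], cur_len: int,
                    penalty: float) -> List[float]:
    """Same rule, but only the first cur_len entries are considered valid."""
    seen = {t for t in pre_ids[:cur_len] if t >= 0}
    return [_scale(x, penalty) if i in seen else x for i, x in enumerate(logits)]


def argmax(xs: List[float]) -> int:
    return max(range(len(xs)), key=xs.__getitem__)


logits = [1.0, 2.0, 3.0, 4.0, -1.0, 0.5]
clean_ids = [1, 3, -1, -1, -1]   # tail reset to -1
stale_ids = [1, 3, 2, 4, 0]      # tail still holds a previous request's tokens
cur_len = 2

clean_out = penalize_trusting_sentinel(logits, clean_ids, 2.0)
stale_out = penalize_trusting_sentinel(logits, stale_ids, 2.0)
masked_out = penalize_masked(logits, stale_ids, cur_len, 2.0)

# The stale tail changes which token has the highest score.
print(argmax(clean_out), argmax(stale_out), argmax(masked_out))
```

In this sketch the stale buffer penalizes tokens that the current request never produced, so the highest-scoring token changes. Masking by cur_len would be an alternative way to make the result independent of the tail, but that is not the approach this PR takes.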

Fix

Add logic to initialize pre_token_ids in the insert_tasks_v1 method.
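
The idea of the fix can be sketched as follows. This is a minimal illustration in pure Python, assuming a per-slot history buffer stored as a list of lists; `reset_pre_token_ids`, `record_token`, and `PAD_ID` are hypothetical names and do not reflect the actual code added to insert_tasks_v1.

```python
from typing import List

PAD_ID = -1  # sentinel the penalty step is assumed to treat as "no token"


def reset_pre_token_ids(pre_token_ids: List[List[int]], slot: int) -> None:
    """Overwrite one slot's history row with the sentinel, in place."""
    row = pre_token_ids[slot]
    row[:] = [PAD_ID] * len(row)


def record_token(pre_token_ids: List[List[int]], slot: int, step: int,
                 token_id: int) -> None:
    """Store a generated token at the given decode step."""
    pre_token_ids[slot][step] = token_id


# Two request slots, each with room for six generated tokens.
buffer = [[PAD_ID] * 6 for _ in range(2)]

# First request fills slot 0 with five tokens.
for step, tok in enumerate([198, 39814, 11, 1588, 525]):
    record_token(buffer, 0, step, tok)

# Second request reuses slot 0: reset the row at prefill time, then decode.
reset_pre_token_ids(buffer, 0)
for step, tok in enumerate([198, 39814]):
    record_token(buffer, 0, step, tok)

print(buffer[0])
```

With the reset in place, the row for the second request contains only its own tokens followed by -1, which matches the state the first request saw at the same decode step.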

paddle-bot · 2025-08-26T13:23:53Z

Thanks for your contribution!


codecov-commenter · 2025-08-27T04:30:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@e645db3). Learn more about missing BASE report.

Additional details and impacted files

@@            Coverage Diff            @@
##             develop   #3634   +/-   ##
=========================================
  Coverage           ?   0.00%           
=========================================
  Files              ?       3           
  Lines              ?       3           
  Branches           ?       0           
=========================================
  Hits               ?       0           
  Misses             ?       3           
  Partials           ?       0

Flag	Coverage Δ
diff	`0.00% <ø> (?)`


* [fix] qwen output inconsistency when top_p=0 * [fix] remove decode pre_id code

[fix] qwen output inconsistency when top_p=0

224251a

paddle-bot bot added the contributor External developers label Aug 26, 2025

liyonghua0910 added 3 commits August 26, 2025 13:35

[fix] remove decode pre_id code

ec549d1

Merge remote-tracking branch 'upstream/develop' into develop_fix-qwen…

7a43a92

…-topp

Merge remote-tracking branch 'upstream/develop' into develop_fix-qwen…

4804508

…-topp

Merge branch 'develop' into develop_fix-qwen-topp

1711698

Jiang-Jia-Jun merged commit b2afdf4 into PaddlePaddle:develop Aug 27, 2025
13 of 18 checks passed

liyonghua0910 added a commit to liyonghua0910/FastDeploy that referenced this pull request Aug 27, 2025

[fix] qwen output inconsistency when top_p=0 (PaddlePaddle#3634)

6bd38ea

* [fix] qwen output inconsistency when top_p=0 * [fix] remove decode pre_id code

Jiang-Jia-Jun pushed a commit that referenced this pull request Aug 28, 2025

[fix] qwen output inconsistency when top_p=0 (#3634) (#3662)

6545994

* [fix] qwen output inconsistency when top_p=0 * [fix] remove decode pre_id code

handsomecoderyang pushed a commit to handsomecoderyang/FastDeploy that referenced this pull request Aug 28, 2025

[fix] qwen output inconsistency when top_p=0 (PaddlePaddle#3634)

0716246

* [fix] qwen output inconsistency when top_p=0 * [fix] remove decode pre_id code

liyonghua0910 deleted the develop_fix-qwen-topp branch September 17, 2025 02:39
