[XPU] xpu support think length limit #4539

ddchenhao66 · 2025-10-22T07:33:01Z

XPU now supports limiting the thinking length, using a kernel to improve performance.

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

- Add at least a tag in the PR title.
  - Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
  - You can add new tags based on the PR content, but the semantics must be clear.
- Format your code, run pre-commit before commit.
- Add unit tests. Please write the reason in this PR if no unit tests.
- Provide accuracy results.
- If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

CLAassistant · 2025-10-22T07:33:07Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

ddchenhao66 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

paddle-bot · 2025-10-22T07:33:10Z

Thanks for your contribution!

yuanlehome · 2025-10-22T07:47:24Z

custom_ops/xpu_ops/src/ops/limit_thinking_content_length_v1.cc

+
+}
+
+PD_BUILD_OP(limit_thinking_content_length_v1)


Doesn't this need to be changed to PD_BUILD_STATIC_OP? The same applies below.

yuanlehome · 2025-10-22T07:48:26Z

fastdeploy/worker/xpu_model_runner.py

+        max_think_lens = share_inputs["max_think_lens"]
+        step_idx = share_inputs["step_idx"]
+        limit_think_status = share_inputs["limit_think_status"]
+        print(f"ch66 limit_strategy:{limit_strategy}")


Delete this debug print.

iosmers · 2025-10-22T08:05:03Z

fastdeploy/worker/xpu_model_runner.py

+            else:
+                # Disable thinking
+                self.share_inputs["max_think_lens"][idx : idx + 1, :] = -1
+                self.share_inputs["limit_think_status"][idx : idx + 1, :] = 0


Could self.share_inputs["limit_think_status"][idx : idx + 1, :] = 0 be merged?

iosmers · 2025-10-22T08:05:28Z

fastdeploy/worker/xpu_model_runner.py

+                if request.get("enable_thinking", False) and request.get("reasoning_max_tokens", None) is not None:
+                    # Enable thinking
+                    self.share_inputs["max_think_lens"][idx : idx + 1, :] = request.get("reasoning_max_tokens")
+                    self.share_inputs["limit_think_status"][idx : idx + 1, :] = 0


Could self.share_inputs["limit_think_status"][idx : idx + 1, :] = 0 be merged?
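
One way to read the merge suggestion: both quoted branches write 0 to limit_think_status, so that assignment can be hoisted out of the if/else and only max_think_lens differs per branch. The following is a minimal sketch, not the PR's code. The helper name set_think_limit is hypothetical, and plain nested lists stand in for the real share_inputs tensors, so the indexing differs from the `[idx : idx + 1, :]` slices in the diff.

```python
def set_think_limit(share_inputs, idx, request):
    """Hypothetical helper: configure the think-length limit for slot idx."""
    reasoning_max_tokens = request.get("reasoning_max_tokens", None)
    if request.get("enable_thinking", False) and reasoning_max_tokens is not None:
        # Enable thinking with an explicit token budget.
        max_think_len = reasoning_max_tokens
    else:
        # Disable the limit; -1 is the value used in the quoted diff.
        max_think_len = -1
    share_inputs["max_think_lens"][idx][0] = max_think_len
    # Written once for both branches instead of once per branch.
    share_inputs["limit_think_status"][idx][0] = 0


# Stand-in buffers for two request slots (assumed shape: [batch, 1]).
share_inputs = {
    "max_think_lens": [[-1], [-1]],
    "limit_think_status": [[7], [7]],
}
set_think_limit(share_inputs, 0, {"enable_thinking": True, "reasoning_max_tokens": 128})
set_think_limit(share_inputs, 1, {"enable_thinking": False})
```

After these two calls, slot 0 carries a budget of 128, slot 1 carries -1, and both status entries are reset to 0.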

iosmers · 2025-10-22T08:07:35Z

custom_ops/xpu_ops/src/plugin/src/kernel/kunlun3cpp/limit_thinking_content_length_v2.xpu

+          // Force-replace the current token with the end-of-thinking token
+          next_token_lm = line_break_id;
+          limit_think_status_lm = 2;
+        }


Should we add an else {} here, with some debug information?

Not needed; the else case requires no action.
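
To illustrate why no else branch is needed, here is a simplified per-sequence sketch of a think-length limiter of this general shape. It is not the kernel from this PR. The function name, the think_end_id parameter, the status encoding (0 = thinking, 2 = reply phase), and the `step >= max_think_len` threshold are all assumptions. The quoted v2 kernel uses line_break_id and may track more states.

```python
def limit_thinking_step(next_token, max_think_len, step, status, think_end_id):
    """Return (next_token, status) for one sequence after one decode step."""
    # A negative limit means the limit is disabled for this sequence.
    if max_think_len < 0:
        return next_token, status
    # Assumed meaning of status 2: already in the reply phase, nothing to do.
    if status == 2:
        return next_token, status
    # Still thinking and over budget: force the end-of-thinking token.
    if status < 1 and step >= max_think_len:
        next_token = think_end_id
        status = 1
    # The end-of-thinking token (forced or generated) starts the reply phase.
    if status < 2 and next_token == think_end_id:
        status = 2
    # No else branch: in every other case token and status pass through unchanged.
    return next_token, status


# Budget of 10 tokens, end-of-thinking token id 99 (both made up for the example).
within_budget = limit_thinking_step(5, 10, 3, 0, 99)
over_budget = limit_thinking_step(5, 10, 10, 0, 99)
```

In this sketch a sequence under budget keeps its sampled token, while one at or past the budget has its token overwritten and its status advanced. Every other case falls through untouched, which matches the reply that the else case needs no action.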

hong19860320 · 2025-10-22T08:13:08Z

custom_ops/xpu_ops/src/plugin/src/wrapper/limit_thinking_content_length_v1.cpp

+
+    WRAPPER_DUMP(ctx);
+    if (ctx->dev().type() == api::kCPU) {
+        assert(false);


Please add the CPU wrapper implementation later when you have time.

hong19860320 · 2025-10-22T08:13:19Z

custom_ops/xpu_ops/src/plugin/src/wrapper/limit_thinking_content_length_v2.cpp

+    WRAPPER_DUMP_PARAM2(ctx,line_break_id,bs);
+    WRAPPER_DUMP(ctx);
+    if (ctx->dev().type() == api::kCPU) {
+        assert(false);


cqulilujia

While you're at it, please add the binding in pybind.cc.

yuanlehome

LGTM

XiaoguangHu01

LGTM

hong19860320

LGTM

yuanlehome reviewed Oct 22, 2025

[XPU] xpu support think length limit

25ea457

ddchenhao66 force-pushed the length_limit_1 branch from 2576384 to 25ea457 Compare October 22, 2025 07:55

ddchenhao66 requested review from hong19860320 and yuanlehome October 22, 2025 07:55

iosmers reviewed Oct 22, 2025

hong19860320 previously approved these changes Oct 22, 2025

cqulilujia reviewed Oct 22, 2025

yuanlehome previously approved these changes Oct 22, 2025

yuanlehome mentioned this pull request Oct 22, 2025

Optimizing the performance of think length limit using custom operators #4279

Merged

DDDivano previously approved these changes Oct 22, 2025

XiaoguangHu01 previously approved these changes Oct 22, 2025

qingqing01 approved these changes Oct 22, 2025

qingqing01 previously approved these changes Oct 22, 2025

ddchenhao66 dismissed stale reviews from qingqing01, XiaoguangHu01, DDDivano, yuanlehome, and hong19860320 via 25d4408 October 22, 2025 13:32

Merge branch 'develop' into length_limit_1

31fc218

ddchenhao66 force-pushed the length_limit_1 branch from 25d4408 to 31fc218 Compare October 22, 2025 13:58

[XPU] xpu c++ code files format

6ee222e

hong19860320 approved these changes Oct 23, 2025

yuanlehome approved these changes Oct 23, 2025

qingqing01 approved these changes Oct 23, 2025

DDDivano approved these changes Oct 23, 2025

ddchenhao66 requested a review from XiaoguangHu01 October 23, 2025 06:35

jeff41404 approved these changes Oct 23, 2025

EmmonsCurse added the skip-ci: coverage label Oct 23, 2025

EmmonsCurse merged commit 5443b2c into PaddlePaddle:develop Oct 23, 2025
28 of 39 checks passed


11 participants
