[Feature] support logits processors #4515

liyonghua0910 · 2025-10-21T08:01:47Z

Motivation

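A LogitsProcessor (LP below) sits between the model's output logits and the sampling strategy in the processing chain. It applies arbitrary transformations or constraints to the logits of the token about to be generated, then hands them to the sampler (top-k / top-p / temperature, etc.) to pick the final token. Its input is the unprocessed logits plus optional inference state (for example, statistics on already generated tokens); its output is the processed logits.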

An LP is a stateful batch processor, integrated as a submodule of the sampler. A standard LP must provide the update_state and apply interfaces.

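update_state - runs at the start of execute_model; updates the state the LP maintains internally from the current inference batch, and may store statistics, indices, tensors, and so on;
apply - runs after model.forward and before sampling; modifies the logits according to the current LP state.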

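This standardized interface lets users define arbitrary custom LPs. Users need to understand how FD maintains inference state internally, write the LP state update code in update_state, and write the logits modification code in apply.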
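The two-method contract described above can be sketched in plain Python. This is only an illustration under stated assumptions: the class names, the method signatures, and the list-of-dicts batch format are hypothetical and are not FD's actual interface, which operates on tensors and FD's own shared inference state.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchLogitsProcessor(ABC):
    """Hypothetical stand-in for the abstract LP interface (not FD's real class)."""

    @abstractmethod
    def update_state(self, batch: List[dict]) -> None:
        """Refresh internal state from the current batch, before the forward pass."""

    @abstractmethod
    def apply(self, logits: List[List[float]]) -> List[List[float]]:
        """Modify logits (one row per request) after forward, before sampling."""


class SketchLogitBias(SketchLogitsProcessor):
    """Adds a per-request bias to selected token ids."""

    def __init__(self) -> None:
        self.biases: List[Dict[int, float]] = []

    def update_state(self, batch: List[dict]) -> None:
        # Keep one bias table per request; requests without args get an empty table.
        self.biases = [
            {int(tok): float(b) for tok, b in req.get("logit_bias", {}).items()}
            for req in batch
        ]

    def apply(self, logits: List[List[float]]) -> List[List[float]]:
        # Add each request's bias to its own row, in place.
        for row, bias in zip(logits, self.biases):
            for token_id, value in bias.items():
                row[token_id] += value
        return logits


lp = SketchLogitBias()
lp.update_state([{"logit_bias": {"0": 100.0}}, {}])
biased = lp.apply([[0.5, 1.5, -2.0], [0.5, 1.5, -2.0]])
print(biased)
```

Only the first request carries a bias, so only the first row changes. Keeping the state per request is what allows one processor instance to serve a mixed batch.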

Modifications

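config.py/args_utils.py - add the --logits-processors service startup argument and a new config entry, referenced via fd_config.structured_outputs_config.logits_processors
sampling_params.py/protocol.py - add the logits_processors_args request parameter
sample/meta_data.py - pass logits_processors into the sampler as an attribute of sampling_metadata
fastdeploy/model_executor/logits_processor/
- __init__.py - provides the LP instantiation functions and the namespace for the logits_processor module
- base.py - provides the abstract LP class interface; custom LPs must inherit from this class
- builtin.py - built-in LP implementations; currently only LogitBiasLogitsProcessor (applies a logits bias to the specified token_ids)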

Usage or Command

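When deploying the service, add the --logits-processors argument to the launch command to specify the logits processors the service can support. The specified LPs are instantiated at engine startup. Each LP string passed in must be a valid FQCN (Fully Qualified Class Name), in module.path:ClassName format; for a built-in LP the module path is not needed and ClassName alone is enough.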

python -m fastdeploy.entrypoints.openai.api_server \
    --model $your_model_path \
    --port 8580 \
    --metrics-port 8581 \
    --engine-worker-queue-port 8582 \
    --cache-queue-port 8583 \
    --logits-processors LogitBiasLogitsProcessor dotted.path.to.your.module:YourCustomLogitsProcessor

Note: the --logits-processors argument determines which logits processors the service can support, and the order in which they are listed is the order in which they execute.
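Resolving a module.path:ClassName string to a class can be sketched with the standard library's importlib. The function names and the built-in registry below are hypothetical, chosen for illustration; the loader in fastdeploy/model_executor/logits_processor/__init__.py may work differently.

```python
import importlib

# Hypothetical registry of built-in processors, keyed by bare class name.
BUILTIN_SKETCH = {}


def resolve_processor_class(spec, builtins=BUILTIN_SKETCH):
    """Resolve 'module.path:ClassName' or a bare built-in 'ClassName' to a class."""
    if ":" not in spec:
        try:
            return builtins[spec]
        except KeyError:
            raise ValueError(f"unknown built-in logits processor: {spec!r}") from None
    module_path, _, class_name = spec.partition(":")
    if not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {spec!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def resolve_all(specs):
    # List order is preserved, mirroring the rule that CLI order is execution order.
    return [resolve_processor_class(s) for s in specs]


# A standard-library class is used here only to exercise the resolver.
cls = resolve_processor_class("collections:OrderedDict")
ordered = resolve_all(["collections:OrderedDict", "fractions:Fraction"])
print(cls.__name__, [c.__name__ for c in ordered])
```

Resolving every string at startup rather than per request means a mistyped path fails when the service launches, not in the middle of serving traffic.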

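When sending a request, add the logits_processors_args parameter to the request body to control whether a specific LP is applied to that request. The arguments passed must match those the LP class is defined to accept: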

curl -X POST "http://0.0.0.0:8580/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "鲁迅是谁"}
  ],
  "logits_processors_args": {
    "logit_bias": {"0": 100.0},
    "enable_your_custom_logits_processors": true
  }
}'
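The same request can be assembled from Python using only the standard library. This is a sketch: build_chat_request is a hypothetical helper, and the host and port are simply copied from the curl example above. JSON object keys must be strings, so the token id in logit_bias is written as "0".

```python
import json
import urllib.request


def build_chat_request(base_url, messages, logits_processors_args=None):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    body = {"messages": messages}
    if logits_processors_args:
        # Omit the field entirely for plain requests.
        body["logits_processors_args"] = logits_processors_args
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request(
    "http://0.0.0.0:8580",
    [{"role": "user", "content": "Who is Lu Xun?"}],
    {"logit_bias": {"0": 100.0}},
)
payload = json.loads(req.data)

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```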

Accuracy Tests

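Using the built-in LogitBiasLogitsProcessor as an example: first send a plain request, then a request with a bias (a very large bias on the logits of token_id=0, the ! character), and finally another plain request to verify that the answer is still correct: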

curl -X POST "http://0.0.0.0:$1/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "鲁迅是谁"}
  ],
  "stream": false,
  "top_p": 0.0
}'

curl -X POST "http://0.0.0.0:$1/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "鲁迅是谁"}
  ],
  "stream": false,
  "top_p": 0.0,
  "logits_processors_args": {
    "logit_bias": {"0": 1000.0}
  }
}'

curl -X POST "http://0.0.0.0:$1/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "鲁迅是谁"}
  ],
  "top_p": 0.0,
  "stream": false
}'

Output:

{"id":"chatcmpl-0dfc6083-c0ce-4f94-beaf-9a23c72e865b","object":"chat.completion","created":1761280673,"model":"/root/paddlejob/workspace/env_run/liyonghua/models/Qwen/Qwen2-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"鲁迅，原名周樟寿，后改名为周树人，字豫山，后改豫才，“鲁迅”是他1918年发表《狂人日记》时所用的笔名，也是他影响最为广泛的笔名，浙江绍兴人。著名文学家、思想家、民主战士，五四新文化运动的重要参与者，中国现代文学的奠基人。\n\n鲁迅的作品包括杂文、短篇小说、评论、散文、翻译作品。对于“鲁迅”这个名字，人们熟知的是他的文学创作，尤其是那些深刻揭示社会现实和人性弱点的作品，如《呐喊》、《彷徨》、《故事新编》等。他的文字犀利、深刻，对封建制度、旧道德、国民性等问题进行了无情的批判，同时也对进步青年寄予了深切的期望。鲁迅不仅是中国现代文学的巨匠，也是中国文化界的杰出人物，其思想和作品对后世产生了深远的影响。","multimodal_content":null,"reasoning_content":null,"tool_calls":null,"prompt_token_ids":null,"completion_token_ids":null,"prompt_tokens":null,"completion_tokens":null},"logprobs":null,"draft_logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":21,"total_tokens":221,"completion_tokens":200,"prompt_tokens_details":{"cached_tokens":0}}}

{"id":"chatcmpl-91f6066f-c27b-436d-a129-011ec2e7a539","object":"chat.completion","created":1761280674,"model":"/root/paddlejob/workspace/env_run/liyonghua/models/Qwen/Qwen2-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!","multimodal_content":null,"reasoning_content":null,"tool_calls":null,"prompt_token_ids":null,"completion_token_ids":null,"prompt_tokens":null,"completion_tokens":null},"logprobs":null,"draft_logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":21,"total_tokens":2048,"completion_tokens":2027,"prompt_tokens_details":{"cached_tokens":0}}}

{"id":"chatcmpl-99e83644-02f2-48db-91db-b60f60ac048e","object":"chat.completion","created":1761280692,"model":"/root/paddlejob/workspace/env_run/liyonghua/models/Qwen/Qwen2-7B-Instruct","choices":[{"index":0,"message":{"role":"assistant","content":"鲁迅，原名周樟寿，后改名为周树人，字豫山，后改豫才，“鲁迅”是他1918年发表《狂人日记》时所用的笔名，也是他影响最为广泛的笔名，浙江绍兴人。著名文学家、思想家、民主战士，五四新文化运动的重要参与者，中国现代文学的奠基人。\n\n鲁迅的作品包括杂文、短篇小说、评论、散文、翻译作品。对于“鲁迅”这个名字，人们熟知的是他的文学创作，尤其是那些深刻揭示社会现实和人性弱点的作品，如《呐喊》、《彷徨》、《故事新编》等。他的文字犀利、深刻，对封建制度、旧道德、国民性等问题进行了无情的批判，同时也对进步青年寄予了深切的期望。鲁迅不仅是中国现代文学的巨匠，也是中国文化界的杰出人物，其思想和作品对后世产生了深远的影响。","multimodal_content":null,"reasoning_content":null,"tool_calls":null,"prompt_token_ids":null,"completion_token_ids":null,"prompt_tokens":null,"completion_tokens":null},"logprobs":null,"draft_logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":21,"total_tokens":221,"completion_tokens":200,"prompt_tokens_details":{"cached_tokens":0}}}
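The all-"!" second response is what an additive bias of this size would be expected to produce. As a rough sketch, assume the bias $b$ is added to the raw logits $z$ before the softmax; where exactly this happens relative to other sampling steps is not stated here. The probability of token 0 is then:

```latex
p_0 \;=\; \frac{e^{z_0 + b}}{e^{z_0 + b} + \sum_{j \neq 0} e^{z_j}}
    \;=\; \frac{1}{1 + \sum_{j \neq 0} e^{\,z_j - z_0 - b}}
```

With $b = 1000$, every exponent $z_j - z_0 - b$ is hugely negative for any realistic logit gap, so $p_0 \approx 1$ at every step and token 0 is always chosen. That is consistent with the second response containing only "!" and running to 2027 completion tokens, while the first and third responses, sent without a bias, are identical.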

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2025-10-21T08:01:53Z

Thanks for your contribution!

Copilot

Pull Request Overview

This PR introduces a logits processor framework that allows users to apply custom transformations to model logits before sampling. The implementation includes a base class for processors, a built-in LogitBiasLogitsProcessor for token bias adjustment, and integration throughout the inference pipeline.

Key changes:

Added abstract LogitsProcessor base class and LogitBiasLogitsProcessor implementation
Integrated logits processors into the sampling pipeline with state management via update_state and apply methods
Added --logits-processors CLI parameter and logits_processors_args request parameter for runtime configuration

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

File	Description
fastdeploy/model_executor/logits_processor/base.py	Defines abstract LogitsProcessor interface
fastdeploy/model_executor/logits_processor/builtin.py	Implements LogitBiasLogitsProcessor for token bias adjustment
fastdeploy/model_executor/logits_processor/__init__.py	Provides factory functions for loading and instantiating processors
fastdeploy/config.py	Adds logits_processors field to StructuredOutputsConfig
fastdeploy/engine/args_utils.py	Adds --logits-processors CLI argument
fastdeploy/engine/engine.py	Passes logits processor configuration to workers
fastdeploy/engine/sampling_params.py	Adds logits_processors_args field with validation
fastdeploy/entrypoints/openai/protocol.py	Adds logits_processors_args to API request models
fastdeploy/worker/worker_process.py	Adds --logits-processors argument to worker parser
fastdeploy/worker/gpu_model_runner.py	Initializes processors and updates state before model forward
fastdeploy/model_executor/layers/sample/meta_data.py	Adds logits_processors field to SamplingMetadata
fastdeploy/model_executor/layers/sample/sampler.py	Applies logits processors in sampling pipeline and renames SamplerProcessor to GuidedDecoding
fastdeploy/engine/sched/resource_manager_v1.py	Removes metrics updates (unrelated change)
tests/model_executor/test_logits_processor.py	Adds comprehensive unit tests for LogitBiasLogitsProcessor

Jiang-Jia-Jun · 2025-10-27T02:36:25Z

For the usage documentation, refer to how vLLM/SGLang describe this, and add the corresponding docs to FD as well.

[feat] provide an interface for logits processors and a builtin Logit…

22e4ddc

…BiasLogitsProcessor

liyonghua0910 added 6 commits October 21, 2025 16:41

[chore] fix code style

7a6ae48

[fix] add unit test & fix existing bugs

71a98d2

[feat] add engine/worker arg --logits-processors

60a1096

Merge remote-tracking branch 'upstream/develop' into develop+logits_p…

ea294f3

…rocessor

[fix] redefine user args as logits_processors_args and fix some bugs

9af8d6c

Merge remote-tracking branch 'upstream/develop' into develop+logits_p…

5baafa5

…rocessor

liyonghua0910 marked this pull request as ready for review October 24, 2025 03:23

[fix] fix test_sampler

e45f3a9

Jiang-Jia-Jun requested a review from Copilot October 24, 2025 08:29

Copilot AI reviewed Oct 24, 2025

yuanlehome reviewed Oct 24, 2025

fastdeploy/engine/sched/resource_manager_v1.py (resolved)

fastdeploy/model_executor/layers/sample/sampler.py (outdated, resolved)

liyonghua0910 and others added 5 commits October 24, 2025 17:12

Update fastdeploy/model_executor/logits_processor/builtin.py

09e2815

Co-authored-by: Copilot <[email protected]>

Update fastdeploy/model_executor/logits_processor/__init__.py

5bba682

Co-authored-by: Copilot <[email protected]>

Update tests/model_executor/test_logits_processor.py

e164010

Co-authored-by: Copilot <[email protected]>

[fix] fix typo

d655fbd

Update fastdeploy/engine/sampling_params.py

6309efd

Co-authored-by: Copilot <[email protected]>

kevincheng2 reviewed Oct 24, 2025

fastdeploy/model_executor/logits_processor/builtin.py (outdated, resolved)

[fix] fix bracelet

b788655

liyonghua0910 added 6 commits October 27, 2025 18:50

[chore] redefine logits processor interface: pass the entire share_in…

d2f125e

…puts into LP, do not copy share_inputs and logits

[doc] add docs

785a448

Merge remote-tracking branch 'upstream/develop' into develop+logits_p…

eba07f5

…rocessor

[fix] fix logit bias processor not applied when decoding is too fast …

71fa9dc

…& add docs and tests

[fix] fix redundant code

876dfd3

[feat] skip apply() if no bias is specified

0493a3b

Jiang-Jia-Jun approved these changes Oct 28, 2025

yuanlehome approved these changes Oct 28, 2025

Jiang-Jia-Jun merged commit a012e36 into PaddlePaddle:develop Oct 28, 2025
24 of 28 checks passed

Jiang-Jia-Jun added the skip-ci: coverage label Oct 28, 2025

4 participants
