[Feature] Support logprobs_mode #4567
…,processed_logits]
Thanks for your contribution!
    model_group.add_argument(
        "--logprobs-mode",
        type=str,
        default=EngineArgs.logprobs_mode,
        help="Indicates the content returned in the logprobs.",
    )
Validation needs to be added so that an invalid value for this parameter is rejected with an error.
Also, is the logprobs-related configuration aligned with vLLM?
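Validation for invalid values has been added. The configuration is aligned with vLLM, except that computing logprobs from post-processed logits is not yet supported in the speculative decoding scenario.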
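For illustration, the kind of validation discussed above could be sketched as follows. This is a minimal standalone example, not the code merged in this PR: the names `LOGPROBS_MODES`, `validate_logprobs_mode`, and `build_parser` are hypothetical, and the default is hard-coded here instead of being read from `EngineArgs.logprobs_mode`.

```python
import argparse

# Allowed values, as listed in the PR description.
LOGPROBS_MODES = ("raw_logprobs", "processed_logprobs", "raw_logits", "processed_logits")


def validate_logprobs_mode(value: str) -> str:
    """Return the value unchanged if it is a supported mode, otherwise raise."""
    if value not in LOGPROBS_MODES:
        # argparse turns ArgumentTypeError into a usage error and exits with code 2.
        raise argparse.ArgumentTypeError(
            f"invalid logprobs mode {value!r}; expected one of: {', '.join(LOGPROBS_MODES)}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser holding only the --logprobs-mode argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--logprobs-mode",
        type=validate_logprobs_mode,
        default="raw_logprobs",  # assumption: mirrors the default stated in the PR description
        help="Indicates the content returned in the logprobs.",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["--logprobs-mode", "bogus"])` ends with an argparse usage error instead of passing the bad value on to the engine. Passing `choices=LOGPROBS_MODES` to `add_argument` would be a shorter alternative with similar behavior.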
gongshaotian left a comment
LGTM
Motivation
Support specifying the content returned in logprobs by adding the startup parameter --logprobs-mode. The default is raw_logprobs.
Options: raw_logprobs, processed_logprobs, raw_logits, processed_logits.
Raw means the values before applying logit processors, like bad words. Processed means the values after applying such processors.
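The difference between the four modes can be sketched in plain Python. This is an illustrative model only and does not reflect the sampler code in this PR: `log_softmax`, `mask_bad_words`, and `select_logprobs_output` are hypothetical helpers, and a bad-words mask stands in for whatever logit processors are configured.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def mask_bad_words(logits, bad_token_ids):
    """Example logit processor: ban the given token ids by setting them to -inf."""
    return [float("-inf") if i in bad_token_ids else x for i, x in enumerate(logits)]


def select_logprobs_output(raw_logits, bad_token_ids, mode):
    """Return the values that each logprobs mode would report for one decoding step."""
    processed_logits = mask_bad_words(raw_logits, bad_token_ids)
    if mode == "raw_logits":
        return list(raw_logits)
    if mode == "raw_logprobs":
        return log_softmax(raw_logits)
    if mode == "processed_logits":
        return processed_logits
    if mode == "processed_logprobs":
        return log_softmax(processed_logits)
    raise ValueError(f"unsupported logprobs mode: {mode!r}")
```

In this sketch a banned token still has a finite value under `raw_logprobs`, while under `processed_logprobs` it is reported as `-inf` and the remaining probability mass is renormalized over the other tokens.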
TODO: Support processed logprobs in speculative decoding.
Modifications
Usage or Command
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.