@kevincheng2 kevincheng2 commented Jul 8, 2025

  • Support structured output for multimodal and thinking models
  • Support structured output in offline inference
  • Add test cases for structured output

paddle-bot bot commented Jul 8, 2025

Thanks for your contribution!

@kevincheng2 kevincheng2 changed the title [vl] mm and thinking model support structred output [Feature] mm and thinking model support structred output Jul 8, 2025

@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch from d07f737 to 72de4a3 Compare July 11, 2025 06:41
@Jiang-Jia-Jun Jiang-Jia-Jun requested a review from Copilot July 12, 2025 16:08
Copilot AI left a comment

Pull Request Overview

This PR adds structured output support for multimodal and thinking models by combining guided decoding with reasoning parsers, and extends the same support to offline inference.

  • Introduce a new --reasoning_parser CLI argument and propagate it through configuration to model runners.
  • Extend the sampling and guided decoding pipeline: updated Sampler, guided backend interfaces, and skip-index logic.
  • Enhance SamplingParams with GuidedDecodingParams and document offline inference usage for structured outputs.
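The first bullet above (a CLI flag propagated into configuration) can be illustrated with a minimal, self-contained sketch. Only the flag name `--reasoning_parser` comes from the PR summary; `RunnerConfig`, `build_parser`, `config_from_args`, and the parser name used in the usage note are hypothetical stand-ins, not FastDeploy's actual classes or values.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunnerConfig:
    """Hypothetical stand-in for the config object handed to a model runner."""

    reasoning_parser: Optional[str] = None


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="worker process (sketch)")
    # Flag name taken from the PR summary; default and help text are assumptions.
    parser.add_argument(
        "--reasoning_parser",
        type=str,
        default=None,
        help="Name of the reasoning parser; omit for non-thinking models.",
    )
    return parser


def config_from_args(argv: List[str]) -> RunnerConfig:
    """Parse argv and copy the flag into the (hypothetical) runner config."""
    args = build_parser().parse_args(argv)
    return RunnerConfig(reasoning_parser=args.reasoning_parser)
```

Calling `config_from_args(["--reasoning_parser", "example-parser"])` yields a config whose `reasoning_parser` is `"example-parser"`, and calling it with an empty list leaves the field as `None`.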

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| fastdeploy/worker/worker_process.py | Add --reasoning_parser CLI arg and integrate it into FDConfig. |
| fastdeploy/worker/vl_gpu_model_runner.py | Initialize guided backend and reasoning parser; update guided decoding flow in the GPU model runner. |
| fastdeploy/model_executor/layers/sample/sampler.py | Enhance Sampler to support reasoning parsing and skip indices when masking tokens. |
| fastdeploy/engine/sampling_params.py | Introduce GuidedDecodingParams in SamplingParams for offline structured inference. |
| docs/features/structured_outputs.md | Add offline inference examples for structured output using GuidedDecodingParams. |
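The "skip indices when masking tokens" idea in the Sampler row can be sketched as follows. This is an illustration under an assumption, not the PR's code: it assumes skip indices name batch rows that should bypass the constraint mask (for example, requests still producing reasoning text). The function `apply_token_mask` and its arguments are hypothetical and use plain Python lists rather than tensors.

```python
import math
from typing import List, Sequence, Set


def apply_token_mask(
    logits: List[List[float]],
    allowed_tokens: Sequence[Set[int]],
    skip_idx: Set[int],
) -> List[List[float]]:
    """Mask disallowed tokens per row; rows listed in skip_idx are left untouched."""
    masked = []
    for row, (row_logits, allowed) in enumerate(zip(logits, allowed_tokens)):
        if row in skip_idx:
            # Assumed meaning of a skip index: no constraint applies to this row yet.
            masked.append(list(row_logits))
            continue
        masked.append(
            [
                value if token in allowed else -math.inf
                for token, value in enumerate(row_logits)
            ]
        )
    return masked
```

Setting disallowed logits to negative infinity gives those tokens zero probability after softmax, while skipped rows keep their original distribution.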
Comments suppressed due to low confidence (3)

fastdeploy/worker/vl_gpu_model_runner.py:145

  • The code checks for guided_json, guided_regex, guided_grammar, and structural_tag but does not handle guided_choice from GuidedDecodingParams. Add support for guided_choice to ensure all constraint types are honored.
        elif request.guided_grammar is not None:
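One way to cover every constraint type, including `guided_choice`, is to iterate over the field names instead of writing one `elif` per type. The sketch below is hypothetical: `GuidedRequest` and `select_constraint` are illustrative names, the fields mirror those listed in the review comment, and the priority order is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class GuidedRequest:
    """Hypothetical request shape; field names follow the review comment."""

    guided_json: Optional[Any] = None
    guided_regex: Optional[str] = None
    guided_choice: Optional[List[str]] = None
    guided_grammar: Optional[str] = None
    structural_tag: Optional[str] = None


# Assumed priority order; the real runner may check these in a different order.
CONSTRAINT_KINDS = (
    "guided_json",
    "guided_regex",
    "guided_choice",
    "guided_grammar",
    "structural_tag",
)


def select_constraint(request: GuidedRequest) -> Optional[Tuple[str, Any]]:
    """Return (kind, value) for the first constraint that is set, else None."""
    for kind in CONSTRAINT_KINDS:
        value = getattr(request, kind)
        if value is not None:
            return kind, value
    return None
```

With this shape, adding a new constraint type means adding one field and one tuple entry, so a type is less likely to be missed in the dispatch.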

fastdeploy/engine/engine.py:1049

  • The code references self.cfg.reasoning_parser, but reasoning_parser is not defined on the engine config object. It should likely reference self.cfg.model_config.reasoning_parser.
            f" --reasoning_parser {self.cfg.reasoning_parser}")

fastdeploy/worker/vl_gpu_model_runner.py:152

  • Using request.get(...) may not work if request is not a dict-like object. Consider using getattr(request, 'enable_thinking', True) to access the attribute safely.
            enable_thinking=request.get("enable_thinking", True),
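The concern in this comment is that `.get(...)` exists on mappings but not on arbitrary objects. A small helper that handles both cases is sketched below; `read_field` is a hypothetical name, and whether the actual request object exposes `.get` is not stated in this thread.

```python
from collections.abc import Mapping
from typing import Any


def read_field(request: Any, name: str, default: Any = None) -> Any:
    """Read `name` from a mapping or a plain object, falling back to `default`."""
    if isinstance(request, Mapping):
        return request.get(name, default)
    return getattr(request, name, default)
```

For example, `read_field(request, "enable_thinking", True)` returns the stored value for either a dict or an attribute-style object, and returns `True` when the field is absent.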

@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch 2 times, most recently from aac8503 to 04c2f3c Compare July 17, 2025 12:44
Jiang-Jia-Jun previously approved these changes Jul 18, 2025
@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch 3 times, most recently from 2ef373a to 69fc3a2 Compare July 18, 2025 11:29
@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch from 69fc3a2 to 6bd3676 Compare July 29, 2025 11:19
@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch from 0429910 to 3e9bba5 Compare August 5, 2025 09:16
@kevincheng2 kevincheng2 force-pushed the mm_structred_output branch from aec275d to 278d3bd Compare August 8, 2025 08:34
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@d37331f).

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #2749   +/-   ##
==========================================
  Coverage           ?   39.02%           
==========================================
  Files              ?       11           
  Lines              ?      123           
  Branches           ?       19           
==========================================
  Hits               ?       48           
  Misses             ?       69           
  Partials           ?        6           
Flag Coverage Δ
diff 39.02% <ø> (?)

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 1908465 into PaddlePaddle:develop Sep 2, 2025
14 of 17 checks passed
3 participants