Skip to content

fix: retry when LLM returns text instead of calling final_response#3850

Merged
jlowin merged 4 commits intomainfrom
strawgate/retry-text-response-3847
Apr 12, 2026
Merged

fix: retry when LLM returns text instead of calling final_response#3850
jlowin merged 4 commits intomainfrom
strawgate/retry-text-response-3847

Conversation

@strawgate
Copy link
Copy Markdown
Collaborator

@strawgate strawgate commented Apr 11, 2026

Fixes #3847

Problem

sample_impl at line 685 raises RuntimeError immediately when structured output is expected but the LLM returns text instead of calling final_response. No retry, no nudge — just a crash.

This contrasts with validation errors (lines 657-679) which append feedback to history and retry. Some models occasionally ignore tool_choice="required" and return text, especially on later iterations of the tool loop after gathering information.

Fix

Add a retry loop (capped at 3 attempts) that appends a nudge message asking the model to use the final_response tool, matching the existing pattern for validation error retries.

_MAX_TEXT_RESPONSE_RETRIES = 3

After exhausting retries, raises RuntimeError with the attempt count.

Tests

3 tests: retry-then-succeed, exceed-max-retries, and no-retry-when-result_type-is-None.

🤖 Generated with Claude Code

Instead of raising RuntimeError immediately when the LLM returns a text
response instead of calling the `final_response` tool for structured
output, retry up to 3 times with an explicit nudge message asking the
model to use the tool. This mirrors the existing retry behavior for
validation errors but with a separate, smaller cap.

Fixes #3847

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@marvin-context-protocol marvin-context-protocol Bot added bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality. DON'T MERGE PR is not ready for merging. Used by authors to prevent premature merging. labels Apr 11, 2026
@strawgate strawgate changed the title [DNM] Retry when LLM returns text instead of calling final_response Retry when LLM returns text instead of calling final_response Apr 11, 2026
Tests cover:
- Text response followed by successful final_response (retry works)
- Text response exceeding max retries (raises RuntimeError)
- Nudge message appended to history on retry
- No retry when result_type is None (text is valid)

Addresses review feedback from PR review tool (v1 flagged missing tests as high severity).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@strawgate strawgate changed the title Retry when LLM returns text instead of calling final_response fix: retry when LLM returns text instead of calling final_response Apr 11, 2026
Remove test_nudge_message_in_history (implementation detail).
Reduce boilerplate in remaining 3 tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@marvin-context-protocol
Copy link
Copy Markdown
Contributor

Test Failure Analysis

Summary: The Run static analysis workflow failed because ruff format reformatted a line in the new test file — the PR was pushed without running the formatter locally.

Root Cause: In tests/client/test_sampling_result_types.py, the _tool_reply helper constructs a ToolUseContent with all four arguments on a single line. ruff format enforces one-argument-per-line style when a call exceeds the line length limit, and it reformatted the call automatically — causing prek to exit with code 1 because hooks modified files.

Suggested Solution: Run ruff format (or uv run prek run --all-files) locally and commit the result. The only change needed is in tests/client/test_sampling_result_types.py around line 468:

# Before (current)
ToolUseContent(
    type="tool_use", id="c1", name="final_response", input={"value": value}
)

# After (ruff-formatted)
ToolUseContent(
    type="tool_use",
    id="c1",
    name="final_response",
    input={"value": value},
)
Detailed Analysis

Failed hook: ruff format (all others — validate-pyproject, prettier, ruff-check, ty, loq, codespell — passed)

Workflow run: #24292594795, job static_analysis

Relevant log excerpt:

ruff format..............................................................[41mFailed[49m
...
hint: Some hooks made changes to the files.
If you are seeing this message in CI, reproduce locally with: `prek run --all-files`

All changes made by hooks:
diff --git a/tests/client/test_sampling_result_types.py b/tests/client/test_sampling_result_types.py
index 52c246b..1aaccdf 100644
--- a/tests/client/test_sampling_result_types.py
+++ b/tests/client/test_sampling_result_types.py
@@ -465,7 +465,10 @@ class TestTextResponseRetry:
             role="assistant",
             content=[
                 ToolUseContent(
-                    type="tool_use", id="c1", name="final_response", input={"value": value}
+                    type="tool_use",
+                    id="c1",
+                    name="final_response",
+                    input={"value": value},
                 )
             ],
             model="m",
Related Files
  • tests/client/test_sampling_result_types.py — the test file added by this PR; the _tool_reply helper at ~line 461 needs its ToolUseContent(...) constructor call reformatted by ruff

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@strawgate strawgate removed the DON'T MERGE PR is not ready for merging. Used by authors to prevent premature merging. label Apr 11, 2026
strawgate added a commit to strawgate/fastmcp that referenced this pull request Apr 12, 2026
The merge of PrefectHQ#3850 and PrefectHQ#3851 dropped the variable initialization.
text_response_retries was used at line 707 but never assigned.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown
Member

@jlowin jlowin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mirrors the existing validation-retry convention and handles the real UX cliff where some models ignore tool_choice=required. Bounded at 3 retries.

@jlowin jlowin merged commit f26c8fa into main Apr 12, 2026
8 checks passed
@jlowin jlowin deleted the strawgate/retry-text-response-3847 branch April 12, 2026 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Sampling loop crashes immediately when LLM returns text instead of calling final_response

2 participants