Fix RetryMiddleware not retrying tool errors#3858
Merged
Conversation
strawgate
added a commit
that referenced
this pull request
Apr 11, 2026
- Delete entirely commented-out test_run_server.py (99 lines dead code) - Fix test_pydantic_model_with_stringified_json_no_strict: replace try/except-both-branches-pass with clear pytest.raises assertion - Fix test_path_traversal_blocked: remove dead assertions after pytest.raises (lines after raise never execute) Error handling middleware test fixes are in a separate PR (#3858) which also fixes the underlying RetryMiddleware bug. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
jlowin
pushed a commit
that referenced
this pull request
Apr 12, 2026
- Delete entirely commented-out test_run_server.py (99 lines dead code) - Fix test_pydantic_model_with_stringified_json_no_strict: replace try/except-both-branches-pass with clear pytest.raises assertion - Fix test_path_traversal_blocked: remove dead assertions after pytest.raises (lines after raise never execute) Error handling middleware test fixes are in a separate PR (#3858) which also fixes the underlying RetryMiddleware bug. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
jlowin
pushed a commit
that referenced
this pull request
Apr 12, 2026
- Delete entirely commented-out test_run_server.py (99 lines dead code) - Fix test_pydantic_model_with_stringified_json_no_strict: replace try/except-both-branches-pass with clear pytest.raises assertion - Fix test_path_traversal_blocked: remove dead assertions after pytest.raises (lines after raise never execute) Error handling middleware test fixes are in a separate PR (#3858) which also fixes the underlying RetryMiddleware bug. 🤖 Generated with Claude Code Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
FastMCP wraps tool exceptions as ToolError(...) from <original>, so _should_retry() only saw ToolError and never matched the retryable ConnectionError/TimeoutError in __cause__. Now checks error.__cause__ in addition to the error itself, matching the pattern already used by ErrorHandlingMiddleware. 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
6b506c9 to
5899eed
Compare
- Move FastMCP/Client imports to module level in test_error_handling.py per the project's 'all imports at top' rule. - Strengthen test_retry_middleware_with_permanent_failures: assert call_count == 1 to prove the non-retryable error actually doesn't retry (was passing whether retries fired or not). - Add a one-line note on _should_retry explaining the single-level __cause__ assumption so future middleware doesn't accidentally re-wrap and hide the real error type.
strawgate
added a commit
that referenced
this pull request
Apr 14, 2026
Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]> Co-authored-by: Jeremiah Lowin <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
RetryMiddlewarenever retried tool calls because FastMCP wraps tool exceptions asToolError(...) from <original>before control returns to the middleware. The middleware's_should_retry()checkedisinstance(error, retry_exceptions)but receivedToolError, not the originalConnectionError.The fix: also check
error.__cause__. This matches the pattern already used byErrorHandlingMiddleware._transform_error()(line 92).Why wasn't this caught by tests?
The existing
test_retry_middleware_with_transient_failureshad zero assertions. It used a random fail rate (0.7) so some calls succeeded on the first try, creating the illusion that retries worked. The test passed whether retries fired or not.What's in this PR
Source fix (4 lines):
_should_retry()now checkserror.__cause__in addition to the error itself.New unit test:
test_should_retry_checks_cause_chain— verifies_should_retryreturns True forToolErrorwrappingConnectionError, False forToolErrorwrappingValueError, False for bareToolError.New integration test:
test_retry_actually_retries_through_server_pipeline— creates a tool that fails twice then succeeds, verifies the tool is called exactly 3 times (2 retries + 1 success). This is deterministic, not random.Replaced broken test: The old random-based test is replaced by the deterministic one above.
Closes #3855
🤖 Generated with Claude Code