Conversation
Test Failure Analysis
**Summary:** The only failing job is Tests: Python 3.10 on windows-latest. A test in

**Root Cause:** This is a pre-existing flaky Windows runner issue. The same

**Suggested Solution:** No action needed on this PR. The failure is a known intermittent Windows CI flake, not introduced by these changes. Re-running the failed job should be sufficient.

**Detailed Analysis**

Failing job: Tests: Python 3.10 on windows-latest (job 70933497777)

Failure pattern: The test runner reached the 19th test in

Relevant log excerpt:

Evidence this is pre-existing flakiness:
PR changes are unrelated to the failure:
Related Files
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 27c3b2b85a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8eb32536a4
- Filter thought parts (`part.thought=True`) from response content instead of leaking them as TextContent in `_response_to_result_with_tools`
- Include `finish_reason` in error messages when no content is found, so safety-filtered responses (SAFETY, RECITATION, etc.) are distinguishable
- Add specific error message for thinking-only responses in `_response_to_create_message_result`

Fixes #3846

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
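The first bullet's bug can be illustrated with a minimal sketch. The `Part` class here is a hypothetical stand-in for the SDK's content part type, not the real class: a thought part still carries text, so `if part.text:` alone cannot exclude it.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Part:
    """Minimal stand-in for a google-genai content part (hypothetical)."""
    text: Optional[str] = None
    thought: bool = False

parts = [Part(text="internal reasoning", thought=True), Part(text="final answer")]

# Buggy filter: `if p.text` is truthy for thought parts too, so they leak.
leaky = [p.text for p in parts if p.text]
# Fixed filter: check `thought` before accepting the text.
clean = [p.text for p in parts if p.text and not p.thought]

assert leaky == ["internal reasoning", "final answer"]  # thought text leaks
assert clean == ["final answer"]                        # thought text filtered
```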
Tests cover:
- Thought parts filtered from tool-path responses
- Thought-only responses produce descriptive errors
- Safety-filtered responses include finish_reason in error
- Normal responses (text + function calls) unaffected

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
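Tests along these lines might be sketched as below. `Part`, `MockResponse`, and `convert` are simplified stand-ins invented for illustration; the real tests construct actual `GenerateContentResponse` objects.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Part:
    """Simplified stand-in for a google-genai content part (hypothetical)."""
    text: Optional[str] = None
    thought: bool = False

@dataclass
class MockResponse:
    """Hypothetical stand-in for GenerateContentResponse; only the
    fields this sketch reads."""
    parts: list[Part] = field(default_factory=list)
    finish_reason: str = "STOP"

def convert(response: MockResponse) -> str:
    """Toy conversion mirroring the fixed behavior under test."""
    visible = [p.text for p in response.parts if p.text and not p.thought]
    if not visible:
        if response.parts and all(p.thought for p in response.parts):
            raise ValueError("Response contained only thinking (thought) parts")
        raise ValueError(f"No content in response: {response.finish_reason}")
    return "".join(visible)

# Thought parts are filtered; only visible text survives.
assert convert(MockResponse(parts=[Part(text="hidden", thought=True),
                                   Part(text="ok")])) == "ok"

# A safety-filtered (empty) response names its finish_reason in the error.
try:
    convert(MockResponse(parts=[], finish_reason="SAFETY"))
except ValueError as e:
    assert "SAFETY" in str(e)
```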
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Remove unused `ty: ignore` comments from lines where `isinstance()` narrows the type, and add correct `ty: ignore[invalid-argument-type]` and `ty: ignore[not-subscriptable]` comments on lines in newly added test functions where ty cannot infer that the union type is a list. Also apply a ruff format fix in `test_task_return_types.py`. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Addresses review feedback: any() would misclassify mixed responses (thought + function_call) as thinking-only, hiding the real error. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
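The `any()`/`all()` distinction called out here can be shown in a few lines; `Part` is again a hypothetical stand-in for the SDK type.

```python
from dataclasses import dataclass

@dataclass
class Part:
    """Hypothetical stand-in for an SDK content part."""
    thought: bool = False
    has_function_call: bool = False

mixed = [Part(thought=True), Part(has_function_call=True)]  # thought + function_call
thinking_only = [Part(thought=True), Part(thought=True)]

# any() would flag the mixed response as thinking-only, hiding the real error:
assert any(p.thought for p in mixed)
# all() flags only responses where every part is a thought:
assert not all(p.thought for p in mixed)
assert all(p.thought for p in thinking_only)
```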
c005b0a to 187823c
Fixes #3846
Problem
Three related bugs in `GoogleGenaiSamplingHandler`'s response conversion:

- **Thought parts leak as TextContent.** Line 364 has a comment "Skip thought parts" but only checks `if part.text:`, which is truthy for thought parts too. Internal reasoning text appears in MCP responses.
- **Thinking-only responses crash.** When all parts have `thought=True`, `response.text` is `None` (the Google SDK filters thoughts), causing `ValueError: No content in response: STOP` with no indication of why.
- **Safety errors are unhelpful.** The tool-path error says "No content in response from completion" without including `finish_reason`. A safety-filtered response is indistinguishable from any other empty response.

Fix
- Skip parts with `part.thought == True` in `_response_to_result_with_tools`
- Include `finish_reason` in the tool-path error message
- Raise a specific error for thinking-only responses in `_response_to_create_message_result`

Tests

7 new unit tests covering thought filtering, safety errors, and normal response passthrough. Tests construct mock `GenerateContentResponse` objects directly; no API calls.

🤖 Generated with Claude Code
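For illustration, the kind of error message the tool-path change aims for might look like this. The helper name and exact wording are hypothetical, not the handler's actual code.

```python
from typing import Optional

def describe_empty_response(finish_reason: Optional[str]) -> str:
    """Build an error message that keeps finish_reason visible, so a
    SAFETY- or RECITATION-filtered response is distinguishable from an
    ordinary empty one. Illustrative only."""
    reason = finish_reason or "UNKNOWN"
    return f"No content in response from completion (finish_reason={reason})"

print(describe_empty_response("SAFETY"))
# No content in response from completion (finish_reason=SAFETY)
```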