
Conversation

@daniel-lxs daniel-lxs commented Nov 26, 2025

Summary

Fixes #9597

This PR fixes a race condition in the model cache that caused "model ID not found" errors when the API returned an empty response.

Root Cause

When the OpenRouter API (or other providers) failed silently and returned {} instead of throwing an error, the empty response was being cached to both memory and disk. This overwrote valid cached data, causing subsequent model lookups to fail with "model not found" and fall back to defaults.

Changes

1. Empty response protection

  • getModels(): Only cache non-empty API responses (>0 models)
  • refreshModels(): Return the existing cache when the API returns an empty response (see the sketch below)
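
A minimal sketch of the guard in getModels(), assuming placeholder helpers (fetchModelsFromProvider, readDiskCache, writeDiskCache) rather than the actual modelCache.ts internals:

```typescript
// Sketch only: the helper functions below are illustrative stand-ins,
// not the real modelCache.ts implementation.
type ModelRecord = Record<string, { contextWindow?: number; maxTokens?: number }>

const memoryCache = new Map<string, ModelRecord>()

async function fetchModelsFromProvider(provider: string): Promise<ModelRecord> {
	return {} // placeholder for the provider-specific fetcher
}
async function readDiskCache(provider: string): Promise<ModelRecord | undefined> {
	return undefined // placeholder for the disk-cache read
}
async function writeDiskCache(provider: string, models: ModelRecord): Promise<void> {
	// placeholder for the disk-cache write
}

export async function getModels(provider: string): Promise<ModelRecord> {
	const cached = memoryCache.get(provider)
	if (cached && Object.keys(cached).length > 0) {
		return cached
	}

	const fresh = await fetchModelsFromProvider(provider)

	// Only cache non-empty responses so a silent `{}` failure
	// cannot overwrite previously valid data in memory or on disk.
	if (Object.keys(fresh).length > 0) {
		memoryCache.set(provider, fresh)
		await writeDiskCache(provider, fresh)
		return fresh
	}

	// Empty response: fall back to whatever was cached before.
	return cached ?? (await readDiskCache(provider)) ?? {}
}
```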

2. In-flight request deduplication

Added an inFlightRefresh Map to track ongoing refresh requests. When multiple calls to refreshModels() happen concurrently for the same provider, they now share the same promise instead of racing against each other.
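
A sketch of the promise-sharing pattern, again with placeholder names (doRefresh stands in for the real fetch-and-cache path in modelCache.ts):

```typescript
// Sketch of in-flight deduplication; names are illustrative, not the real internals.
type ModelRecord = Record<string, unknown>

async function doRefresh(provider: string): Promise<ModelRecord> {
	return {} // placeholder for the real fetch-and-cache path
}

const inFlightRefresh = new Map<string, Promise<ModelRecord>>()

export async function refreshModels(provider: string): Promise<ModelRecord> {
	// Reuse the promise of an already-running refresh for this provider
	// instead of starting a second, racing request.
	const pending = inFlightRefresh.get(provider)
	if (pending) {
		return pending
	}

	const refresh = doRefresh(provider).finally(() => {
		// Clear the entry whether the refresh succeeded or failed,
		// so later calls can start a fresh request.
		inFlightRefresh.delete(provider)
	})

	inFlightRefresh.set(provider, refresh)
	return refresh
}
```

With this pattern, concurrent callers all await the same promise, so at most one refresh request per provider is in flight at a time.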

3. Telemetry

Added a new MODEL_CACHE_EMPTY_RESPONSE telemetry event to track when the API returns an empty response, which helps identify problematic API behavior in production.
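
For illustration, the capture could sit in the empty-response branch; the telemetry client shape below is an assumption, not the extension's actual telemetry API:

```typescript
// Hedged sketch: the event name matches this PR, but the client interface
// and `telemetry` instance are assumed stand-ins.
interface TelemetryClient {
	capture(event: string, properties?: Record<string, unknown>): void
}

declare const telemetry: TelemetryClient

function onEmptyModelResponse(provider: string): void {
	// Record that the provider returned an empty model list so the
	// frequency of this failure mode can be tracked in production.
	telemetry.capture("MODEL_CACHE_EMPTY_RESPONSE", { provider })
}
```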

Testing

  • Added 8 new test cases covering empty response handling
  • All 99 tests in the fetchers test suite pass
  • All 21 tests in modelCache.spec.ts pass

Files Changed

  • packages/types/src/telemetry.ts - Added telemetry event
  • src/api/providers/fetchers/modelCache.ts - Core fix
  • src/api/providers/fetchers/__tests__/modelCache.spec.ts - Tests

Important

Fixes caching of empty API responses in modelCache.ts and adds telemetry for tracking, with concurrency improvements and new tests.

  • Behavior:
    • getModels(): Only caches non-empty API responses.
    • refreshModels(): Returns existing cache if API response is empty.
  • Concurrency:
    • Introduces inFlightRefresh Map to deduplicate concurrent refreshModels() calls.
  • Telemetry:
    • Adds MODEL_CACHE_EMPTY_RESPONSE event in telemetry.ts to track empty API responses.
  • Testing:
    • Adds 8 test cases for empty response handling in modelCache.spec.ts.
  • Files Changed:
    • telemetry.ts: Adds telemetry event.
    • modelCache.ts: Implements core fix and concurrency handling.
    • modelCache.spec.ts: Adds tests.

This description was created by Ellipsis for c02fe6b.

- Added protection against caching empty API responses in getModels() and refreshModels()
- Added in-flight request tracking to prevent concurrent refresh calls from racing
- Added telemetry event MODEL_CACHE_EMPTY_RESPONSE to track these occurrences
- Added comprehensive tests for empty cache protection

Fixes #9597
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels on Nov 26, 2025
@roomote

roomote bot commented Nov 26, 2025


All issues resolved. The error logging has been successfully added to the refreshModels() catch block.

  • Add error logging in refreshModels() catch block to aid debugging when API failures occur

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label on Nov 26, 2025
Addresses PR review feedback to improve production debugging.
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Nov 26, 2025
@hannesrudolph hannesrudolph added the PR - Needs Review label and removed the Issue/PR - Triage label on Nov 26, 2025
@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label on Nov 26, 2025
@mrubens mrubens merged commit f388919 into main Nov 26, 2025
16 checks passed
@mrubens mrubens deleted the fix/model-cache-empty-response-9597 branch November 26, 2025 21:59
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Nov 26, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Nov 26, 2025
mini2s added a commit to zgsm-ai/costrict that referenced this pull request Nov 27, 2025
* fix: include mcpServers in getState() for auto-approval (RooCodeInc#9199)

* fix: replace rate-limited badges with badgen.net (RooCodeInc#9200)

* Batch settings updates from the webview to the extension host (RooCodeInc#9165)

Co-authored-by: Roo Code <[email protected]>

* fix: Apply updated API profile settings when provider/model unchanged (RooCodeInc#9208) (RooCodeInc#9210)

fix: apply updated API profile settings when provider/model unchanged (RooCodeInc#9208)

* fix: migrate Issue Fixer to REST + ProjectsV2 (RooCodeInc#9207)

* fix(issue-fixer): migrate to REST for issue/comments and add ProjectsV2; remove Projects Classic mentions

* Update .roo/rules-issue-fixer/4_github_cli_usage.xml

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

* Update .roo/rules-issue-fixer/4_github_cli_usage.xml

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

---------

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

* Migrate conversation continuity to plugin-side encrypted reasoning items (Responses API) (RooCodeInc#9203)

* Migrate conversation continuity to plugin-side encrypted reasoning items (Responses API)

Summary
We moved continuity off OpenAI servers and now maintain conversation state locally by persisting and replaying encrypted reasoning items. Requests are stateless (store=false) while retaining the performance/caching benefits of the Responses API.

Why
This aligns with how Roo manages context and simplifies our Responses API implementation while keeping all the benefits of continuity, caching, and latency improvements.

What changed
- All OpenAI models now use the Responses API; system instructions are passed via the top-level instructions field; requests include store=false and include=["reasoning.encrypted_content"].
- We persist encrypted reasoning items (type: "reasoning", encrypted_content, optional id) into API history and replay them on subsequent turns.
- Reasoning summaries default to summary: "auto" when supported; text.verbosity only when supported.
- Atomic persistence via safeWriteJson.

Removed
- previous_response_id flows, suppressPreviousResponseId/skipPrevResponseIdOnce, persistGpt5Metadata(), and GPT‑5 response ID metadata in UI messages.

Kept
- taskId and mode metadata for cross-provider features.

Result
- ZDR-friendly, stateless continuity with equal or better performance and a simpler codepath.

* fix(webview): remove unused metadata prop from ReasoningBlock render

* Responses API: retain response id for troubleshooting (not continuity)

Continuity is stateless via encrypted reasoning items that we persist and replay. We now capture the top-level response id in OpenAiNativeHandler and persist the assistant message id into api_conversation_history.json solely for debugging/correlation with provider logs; it is not used for continuity or control flow.

Also: silence request-body debug logging to avoid leaking prompts.

* remove DEPRECATED tests

* chore: remove unused Task types file to satisfy knip CI

* fix(task): properly type cleanConversationHistory and createMessage args in Task to address Dan's review

* chore: add changeset for v3.31.2 (RooCodeInc#9216)

* Changeset version bump (RooCodeInc#9217)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Matt Rubens <[email protected]>

* rename: sliding-window -> context-management; truncateConversationIfNeeded -> manageContext (RooCodeInc#9206)

* Fix: Roo Anthropic input token normalization (avoid double-count) (RooCodeInc#9224)

* OpenAI Native: gate encrypted_content include; remove gpt-5-chat-latest verbosity flag (fixes RooCodeInc#9225) (RooCodeInc#9231)

openai-native: include reasoning.encrypted_content only when reasoningEffort is set; prevent Responses API error on non-reasoning models. types: remove supportsVerbosity from gpt-5-chat-latest to avoid invalid verbosity error. Fixes RooCodeInc#9225

* docs: remove Contributors section from README files (RooCodeInc#9198)

Co-authored-by: Roo Code <[email protected]>

* Release v3.31.3 (RooCodeInc#9232)

* Changeset version bump (RooCodeInc#9233)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Matt Rubens <[email protected]>

* Add native tool call support (RooCodeInc#9159)

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Consistently use Package.name for better support of the nightly app (RooCodeInc#9240)

* fix: resolve 400 error with native tools on OpenRouter (RooCodeInc#9238)

* fix: change tool_choice from required to auto for native protocol (RooCodeInc#9242)

* docs: include PR numbers in release guide (RooCodeInc#9236)

* Add enum support to configuration schema (RooCodeInc#9247)

* refactor(task): switch to <feedback> wrapper to prevent focus drift after context-management event (condense/truncate) (RooCodeInc#9237)

* refactor(task): wrap initial user message in <feedback> instead of <task> to prevent focus drift after context-management

Rationale: After a successful context-management event, framing the next user block as feedback reduces model focus drift. Mentions parsing already supports <feedback>, and tool flows (attemptCompletion, responses) are aligned. No change to loop/persistence.

* refactor(mentions): drop <task> parsing; standardize on <feedback>; update tests

* fix: Filter native tools by mode restrictions (RooCodeInc#9246)

* fix: filter native tools by mode restrictions

Native tools are now filtered based on mode restrictions before being sent to the API, matching the behavior of XML tools. Previously, all native tools were sent to the API regardless of mode, causing the model to attempt using disallowed tools.

Changes:
- Created filterNativeToolsForMode() and filterMcpToolsForMode() utility functions
- Extracted filtering logic from Task.ts into dedicated module
- Applied same filtering approach used for XML tools in system prompt
- Added comprehensive test coverage (10 tests)

Impact:
- Model only sees tools allowed by current mode
- No more failed tool attempts due to mode restrictions
- Consistent behavior between XML and Native protocols
- Better UX with appropriate tool suggestions per mode

* refactor: eliminate repetitive tool checking using group-based approach

- Add getAvailableToolsInGroup() helper to check tools by group instead of individually
- Refactor filterNativeToolsForMode() to reuse getToolsForMode() instead of duplicating logic
- Simplify capabilities.ts by using group-based checks (60% reduction)
- Refactor rules.ts to use group helper (56% reduction)
- Remove debug console.log statements
- Update tests and snapshots

Benefits:
- Eliminates code duplication
- Leverages existing TOOL_GROUPS structure
- More maintainable - new tools in groups work automatically
- All tests passing (26/26)

* fix: add fallback to default mode when mode config not found

Ensures the agent always has functional tools even if:
- A custom mode is deleted while tasks still reference it
- Mode configuration becomes corrupted
- An invalid mode slug is provided

Without this fallback, the agent would have zero tools (not even
ask_followup_question or attempt_completion), completely breaking it.

* Fix broken share button (RooCodeInc#9253)

fix(webview-ui): make Share button popover work by forwarding ref in LucideIconButton

- Convert LucideIconButton to forwardRef so Radix PopoverTrigger(asChild) receives a focusable element
- Enables Share popover and shareCurrentTask flow
- Verified with ShareButton/TaskActions Vitest suites

* Add GPT-5.1 models and clean up reasoning effort logic (RooCodeInc#9252)

* Reasoning effort: capability-driven; add disable/none/minimal; remove GPT-5 minimal special-casing; document UI semantics; remove temporary logs

* Remove Unused supportsReasoningNone

* Roo reasoning: omit field on 'disable'; UI: do not flip enableReasoningEffort when selecting 'disable'

* Update packages/types/src/model.ts

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

* Update webview-ui/src/components/settings/SimpleThinkingBudget.tsx

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

---------

Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>

* fix: make line_ranges optional in read_file tool schema (RooCodeInc#9254)

The OpenAI tool schema required both 'path' and 'line_ranges' in FileEntry,
but the TypeScript type definition marks lineRanges as optional. This caused
the AI to fail when trying to read files without specifying line_ranges.

Changes:
- Updated read_file tool schema to only require 'path' parameter
- line_ranges remains available but optional, matching TypeScript types
- Aligns with implementation which treats lineRanges as optional throughout

Fixes issue where read_file tool kept failing with missing parameters.

* fix: prevent consecutive user messages on streaming retry (RooCodeInc#9249)

* feat(openai): OpenAI Responses: model-driven prompt caching and generic reasoning options refactor (RooCodeInc#9259)

* revert out of scope changes from RooCodeInc#9252 (RooCodeInc#9258)

* Revert "refactor(task): switch to <feedback> wrapper to prevent focus drift after context-management event (condense/truncate)" (RooCodeInc#9261)

* Release v3.32.0 (RooCodeInc#9264)

* Changeset version bump (RooCodeInc#9265)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Matt Rubens <[email protected]>

* [FIX] Fix OpenAI Native handling of encrypted reasoning blocks to prevent error when condensing (RooCodeInc#9263)

* fix: prevent duplicate tool_result blocks in native protocol mode for read_file (RooCodeInc#9272)

When read_file encountered errors (e.g., file not found), it would call
handleError() which internally calls pushToolResult(), then continue to
call pushToolResult() again with the final XML. In native protocol mode,
this created two tool_result blocks with the same tool_call_id, causing
400 errors on subsequent API calls.

This fix replaces handleError() with task.say() for error notifications.
The agent still receives error details through the XML in the single
final pushToolResult() call.

This change works for both protocols:
- Native: Only one tool_result per tool_call_id (fixes duplicate issue)
- XML: Only one text block with complete XML (cleaner than before)

Agent visibility preserved: Errors are included in the XML response
sent to the agent via pushToolResult().

Tests: All 44 tests passing. Updated test to verify say() is called.

* Fix duplicate tool blocks causing 'tool has already been used' error (RooCodeInc#9275)

* feat(openai-native): add abort controller for request cancellation (RooCodeInc#9276)

* Disable XML parser for native tool protocol (