fix(model): propagate provider model properties in fallback resolution #25641

Closed

mcaxtr wants to merge 1 commit into openclaw:main from mcaxtr:fix/model-fallback-reasoning

fix(model): propagate provider model properties in fallback resolution#25641
mcaxtr wants to merge 1 commit intoopenclaw:mainfrom
mcaxtr:fix/model-fallback-reasoning

Conversation

@mcaxtr
Contributor

@mcaxtr mcaxtr commented Feb 24, 2026

Summary

  • Problem: The generic fallback in resolveModel() hardcodes reasoning: false and reads contextWindow/maxTokens from models[0] instead of from the matched model
  • Why it matters: Ollama models configured with reasoning: true produce minimal thinking (~30 tokens) instead of deep reasoning (200+ tokens), and multiple models under the same provider get the wrong context limits
  • What changed: Find the matched model by ID in the provider config and propagate all of its properties (reasoning, input, cost, contextWindow, maxTokens, name, api)
  • What did NOT change (scope boundary): Forward-compat fallbacks, OpenRouter fallback, model registry, auth storage, error messages
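The change described above can be sketched roughly as follows. This is an illustrative sketch only: the type and function names (ProviderModel, resolveFallback) and the default constants are assumptions, not the actual identifiers in src/agents/pi-embedded-runner/model.ts.

```typescript
// Sketch of the fixed fallback logic; names and defaults are illustrative.
interface ProviderModel {
  id: string;
  name?: string;
  api?: string;
  reasoning?: boolean;
  contextWindow?: number;
  maxTokens?: number;
}

const DEFAULT_CONTEXT_WINDOW = 8192;
const DEFAULT_MAX_TOKENS = 4096;

function resolveFallback(
  requestedId: string,
  models: ProviderModel[],
): ProviderModel {
  // Match by ID, trimming both sides for whitespace safety.
  const matched = models.find((m) => m.id.trim() === requestedId.trim());
  // Before the fix: reasoning was hardcoded to false, and
  // contextWindow/maxTokens were read from models[0].
  // After the fix: propagate properties from the matched model,
  // falling back to defaults when the requested ID is not configured.
  return {
    id: requestedId.trim(),
    name: matched?.name ?? requestedId.trim(),
    api: matched?.api,
    reasoning: matched?.reasoning ?? false,
    contextWindow: matched?.contextWindow ?? DEFAULT_CONTEXT_WINDOW,
    maxTokens: matched?.maxTokens ?? DEFAULT_MAX_TOKENS,
  };
}
```

With a provider carrying two models, the matched one (not models[0]) supplies the limits, and an unmatched ID falls through to defaults rather than inheriting the first model's properties.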

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Users running Ollama models (or other custom provider models) with reasoning: true in their config will now correctly receive deep reasoning output with reasoning.effort forwarded to the API. Users with multiple models configured under a custom provider will now get the correct context window and max tokens for the requested model instead of inheriting from the first model in the array.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: All platforms
  • Runtime/container: Node 22+
  • Model/provider: Ollama (or any custom provider with multiple models)
  • Integration/channel (if any): Any
  • Relevant config (redacted):
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434/v1",
        "api": "openai-responses",
        "models": [{
          "id": "thinking-model",
          "reasoning": true,
          "contextWindow": 128000
        }]
      }
    }
  }
}

Steps

  1. Configure Ollama provider with reasoning: true model
  2. Set agents.defaults.thinkingDefault to high
  3. Send a message requiring reasoning
  4. Check session history for thinking output

Expected

  • reasoning.effort: "high" included in API request
  • Model produces detailed reasoning (200+ tokens)
  • Context window matches config value (128000)

Actual

Before fix:

  • reasoning.effort not forwarded (the reasoning field was hardcoded to false)
  • Minimal thinking output (~30 tokens)
  • Wrong context window if multiple models configured

After fix:

  • reasoning.effort correctly forwarded
  • Deep reasoning output
  • Correct context window from matched model

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Added 3 new tests in src/agents/pi-embedded-runner/model.test.ts:

  • Fallback does not bleed contextWindow from unrelated models[0]
  • Fallback propagates reasoning from matched provider model
  • Fallback uses correct model when provider has multiple models

All tests pass (20 total: 17 existing + 3 new).

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:

    • All model resolution tests pass (20 tests)
    • Build completes successfully (pnpm build)
    • Lint and format checks pass (pnpm check)
    • Fix applies to both reasoning and contextWindow/maxTokens bugs
  • Edge cases checked:

    • Unmatched model (not in provider's models array) → uses defaults
    • Matched model with reasoning → propagates reasoning setting
    • Multiple models in provider → selects correct one by ID
    • ID trimming for whitespace safety
  • What you did not verify:

    • End-to-end testing with real Ollama instance and reasoning-capable model
    • Performance impact measurement
    • Interaction with all provider types in production

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Users with affected configs will immediately get correct behavior after upgrade.

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert commit 0260d60 or cherry-pick the inverse
  • Files/config to restore: src/agents/pi-embedded-runner/model.ts (remove matchedModel lookup, restore hardcoded reasoning: false and models[0] reads)
  • Known bad symptoms reviewers should watch for: Model not receiving reasoning parameter, wrong context windows in logs, models not found that were previously working

Risks and Mitigations

  • Risk: ID matching might fail if model IDs have unexpected whitespace
    • Mitigation: Uses .trim() on both sides of comparison
  • Risk: Breaking change if someone relied on models[0] behavior
    • Mitigation: This was clearly a bug; correct behavior is to match by ID. Any existing usage was unintentional.
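The whitespace mitigation amounts to a trimmed comparison on both sides of the ID check. A minimal sketch (the helper name is hypothetical; the actual change inlines the comparison in the model lookup):

```typescript
// Hypothetical helper illustrating the trimmed ID comparison.
function idsMatch(configuredId: string, requestedId: string): boolean {
  // Trim both sides so stray whitespace in config or request
  // does not prevent a match.
  return configuredId.trim() === requestedId.trim();
}
```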

Greptile Summary

Fixes model property propagation in fallback resolution by matching models by ID. Before this fix, the generic fallback hardcoded reasoning: false and read contextWindow/maxTokens from the first model in the provider's array (models[0]), causing incorrect behavior for Ollama and other custom providers with reasoning-capable models or multiple models with different limits.

Confidence Score: 5/5

  • Safe to merge with minimal risk
  • The fix correctly addresses a clear bug in model property propagation. All changes are well-tested with 3 new comprehensive test cases covering the exact scenarios described in the PR. The implementation uses defensive programming with proper fallbacks, and the logic is straightforward: find the matched model by ID and propagate its properties instead of hardcoding defaults or reading from an unrelated model. No breaking changes or risky operations.
  • No files require special attention

Last reviewed commit: 0260d60

@mcaxtr mcaxtr force-pushed the fix/model-fallback-reasoning branch from 0260d60 to b559298 on February 27, 2026 at 15:50
@vincentkoc
Member

Thanks for this PR and the detailed tests.

Context after merge of #29205:

  • The contextWindow/maxTokens fallback selection bug is now covered there.
  • The reasoning propagation part (reasoning from matched provider model) still appears unresolved in main.

If you want to keep this alive, the cleanest path is to re-scope this PR to only the remaining reasoning-propagation delta, then it should be much easier to review/merge.

@vincentkoc
Member

Thanks again for the report and patch here.

This is now fully covered by merged PRs:

Closing this as superseded to keep one canonical implementation path, with credit preserved to this PR and issue #25636.

@vincentkoc vincentkoc closed this Feb 28, 2026

Labels

agents (Agent runtime and tooling), size: S


Development

Successfully merging this pull request may close these issues.

Model fallback ignores reasoning and uses wrong contextWindow/maxTokens

2 participants