
fix: use provider-qualified key in MODEL_CACHE for context window lookup#15632

Closed
linwebs wants to merge 2 commits into openclaw:main from linwebs:fix/model-cache-provider-qualified-key

Conversation


@linwebs linwebs commented Feb 13, 2026

Summary

  • Problem: When multiple providers define the same model ID with different context windows, /status and compaction logic show/use the wrong context window for the non-last-loaded provider.
  • Why it matters: Users running e.g. provider-1/claude-4.6-sonnet (200k) and provider-2/claude-4.6-sonnet (1M) would see both show 1.0m, causing premature compaction on the 1M provider and incorrect status display on the 200k one.
  • What changed: applyConfiguredContextWindows now writes only bare model-id keys to MODEL_CACHE. Call sites with provider info (get-reply-directives, memory-flush, agent-runner-memory) now call resolveContextTokensForModel with the full cfg, which scans models.providers directly to disambiguate same-named models across providers. resolveContextTokens in model-selection.ts reverted to remove the now-unused provider param.
  • What did NOT change: Discovery cache entries (OpenRouter slash-path keys), overall context resolution priority chain, API contracts, config schema.
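The lookup order described above can be sketched as follows. This is a hedged illustration only, not OpenClaw's actual code: the type shapes and the exact signature of `resolveContextTokensForModel` are assumptions based on this summary.

```typescript
// Hypothetical shapes for illustration; OpenClaw's real types differ.
type ModelEntry = { contextWindow?: number };
type ProviderEntry = { models?: Record<string, ModelEntry> };
type Config = { models?: { providers?: Record<string, ProviderEntry> } };

// Bare model-id -> context tokens, as populated by applyConfiguredContextWindows.
const MODEL_CACHE = new Map<string, number>();

// When the provider is known, scan cfg.models.providers directly so same-named
// models stay disambiguated; otherwise fall back to the bare cache key.
function resolveContextTokensForModel(
  cfg: Config,
  provider: string | undefined,
  modelId: string,
): number | undefined {
  if (provider) {
    const win = cfg.models?.providers?.[provider]?.models?.[modelId]?.contextWindow;
    if (win !== undefined) return win;
  }
  return MODEL_CACHE.get(modelId);
}
```

With two providers sharing a model ID, the config scan returns each provider's own window, and the bare-key cache is only consulted when no provider match exists.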

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Gateway / orchestration
  • Memory / storage

Linked Issue/PR

User-visible / Behavior Changes

  • /status now displays the correct context window when the same model ID exists across multiple providers with different limits.
  • Compaction threshold is now computed from the actual provider's configured context window instead of the last-written cache entry.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux (Ubuntu 24.04)
  • Runtime/container: Node 22, local gateway
  • Model/provider: provider-1/claude-4.6-sonnet (200k), provider-2/claude-4.6-sonnet (1M)
  • Relevant config: Two providers sharing claude-4.6-sonnet and claude-4.6-opus with different contextWindow values

Steps

  1. Configure two providers with the same model ID but different contextWindow in openclaw.json
  2. Run /model provider-1/claude-4.6-sonnet, then /status
  3. Run /model provider-2/claude-4.6-sonnet, then /status
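The two-provider setup in step 1 could look roughly like this in openclaw.json (key names inferred from the summary's mention of models.providers and contextWindow; the actual schema may differ):

```json
{
  "models": {
    "providers": {
      "provider-1": {
        "models": {
          "claude-4.6-sonnet": { "contextWindow": 200000 }
        }
      },
      "provider-2": {
        "models": {
          "claude-4.6-sonnet": { "contextWindow": 1000000 }
        }
      }
    }
  }
}
```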

Expected

  • provider-1/claude-4.6-sonnet: tokens xx/200k
  • provider-2/claude-4.6-sonnet: tokens xx/1.0m

Actual (after fix)

  • provider-1/claude-4.6-sonnet: tokens 19k/200k (9%)
  • provider-2/claude-4.6-sonnet: tokens 18k/1.0m (2%)

Evidence

  • Trace/log snippets (TUI status bar confirmed above)

Human Verification (required)

  • Verified scenarios: Both claude-4.6-sonnet and claude-4.6-opus tested across two providers via TUI. Context window displayed correctly in status bar for each provider. Messages sent to both providers successfully.
  • Edge cases checked: Confirmed MODEL_CACHE bare-key entries for discovery (OpenRouter slash-path IDs) are not affected by the change; resolveContextTokensForModel falls through to bare cache key when config scan finds no match.
  • What I did not verify: Agents using contextTokens global override; OpenRouter provider with slash-containing model IDs in live environment.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

(Both Greptile review threads from the previous commit are resolved. The issues they identified — unguarded undefined/model lookup and provider mismatch in persistRunSessionUsage — no longer exist in the current code because the qualified-key-in-cache approach has been replaced entirely.)

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: git revert f5dcd2409
  • Files/config to restore: None
  • Known bad symptoms: If context windows regress to always showing the last-loaded provider's value, this commit is the first place to check.

Risks and Mitigations

  • Risk: resolveContextTokensForModel scans cfg.models.providers on every call instead of reading from cache.
    • Mitigation: Config is an in-memory object; scan cost is O(providers × models) and negligible compared to API roundtrip latency.


@greptile-apps greptile-apps bot left a comment


14 files reviewed, 2 comments



greptile-apps bot commented Feb 13, 2026

Additional Comments (1)

src/auto-reply/reply/followup-runner.ts (lines 198-216)
Persisted provider mismatch
providerUsed is computed from runResult.meta.agentMeta?.provider ?? fallbackProvider and used to compute contextTokensUsed, but persistRunSessionUsage is called with providerUsed: fallbackProvider (not the providerUsed variable). If the agent meta provider differs from the fallback provider, the session store will persist modelProvider for one provider while contextTokensUsed came from another, which can reintroduce incorrect context-window resolution on subsequent /status/session views.

```ts
        await persistRunSessionUsage({
          storePath,
          sessionKey,
          usage,
          lastCallUsage: runResult.meta.agentMeta?.lastCallUsage,
          promptTokens,
          modelUsed,
          providerUsed,
          contextTokensUsed,
          logLabel: "followup",
        });
```


linwebs commented Feb 13, 2026

This PR fixes the issue described in #11629. MODEL_CACHE now uses provider-qualified keys (provider/model) so same-named models across different providers retain their correct context window values.

Closes #11629

@openclaw-barnacle openclaw-barnacle bot added gateway Gateway runtime commands Command implementations agents Agent runtime and tooling size: S labels Feb 13, 2026
sauerdaniel added a commit to sauerdaniel/openclaw that referenced this pull request Feb 14, 2026

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added stale Marked as stale due to inactivity and removed stale Marked as stale due to inactivity labels Feb 21, 2026

@openclaw-barnacle openclaw-barnacle bot added the stale Marked as stale due to inactivity label Mar 12, 2026
…ll sites

applyConfiguredContextWindows now stores provider-qualified keys
(e.g. "rdsec/claude-4.6-sonnet") in MODEL_CACHE alongside the bare model
id fallback, using normalizeProviderId so the key format is consistent with
resolveContextTokensForModel's lookup logic.

All call sites that have provider info now prefer provider-qualified lookups:
- agent-runner, followup-runner, directive-handling.persist: switched to
  resolveContextTokensForModel (direct config scan + qualified cache)
- cron/isolated-agent/run, gateway/session-utils, commands/sessions:
  same migration to resolveContextTokensForModel
- resolveContextTokens (model-selection), resolveMemoryFlushContextWindowTokens
  (memory-flush): added optional provider param with normalizeProviderId-based
  qualified lookup before bare id fallback (no cfg available at these sites)
- get-reply-directives, agent-runner-memory: pass provider through to the
  above utilities

Matches the provider-aware approach already applied to session-store.ts and
status.ts in upstream. Fixes incorrect context window display and compaction
timing when the same model id is configured across multiple providers with
different context limits.
@linwebs linwebs force-pushed the fix/model-cache-provider-qualified-key branch from 9f4f541 to f5dcd24 on March 13, 2026 at 16:05
applyConfiguredContextWindows now writes only bare model-id keys to
MODEL_CACHE, reserving the provider-qualified key space exclusively for
raw discovery entries (e.g. OpenRouter slash-paths).

Call sites that need per-provider disambiguation (get-reply-directives,
memory-flush, agent-runner-memory) now call resolveContextTokensForModel
with the full cfg so they scan models.providers directly, rather than
relying on a qualified cache key that was never reliably populated.

resolveContextTokens in model-selection.ts reverted to remove the now-
unused provider param; resolveMemoryFlushContextWindowTokens accepts cfg
and delegates to resolveContextTokensForModel.
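The bare-key-only cache write this commit describes can be sketched roughly as below. The function name is from the PR, but the signature and types here are hypothetical; the real implementation differs.

```typescript
// Hypothetical config shape for illustration only.
type ProvidersCfg = Record<string, { models?: Record<string, { contextWindow?: number }> }>;

const MODEL_CACHE = new Map<string, number>();

// Write only bare model-id keys; provider-qualified (slash-path) keys are left
// to raw discovery entries such as OpenRouter's. Last write wins here, which
// is exactly why call sites that know the provider must scan the config
// directly instead of trusting this cache.
function applyConfiguredContextWindows(providers: ProvidersCfg): void {
  for (const providerCfg of Object.values(providers)) {
    for (const [modelId, model] of Object.entries(providerCfg.models ?? {})) {
      if (model.contextWindow !== undefined) {
        MODEL_CACHE.set(modelId, model.contextWindow);
      }
    }
  }
}
```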
@linwebs linwebs force-pushed the fix/model-cache-provider-qualified-key branch from f5dcd24 to 80b8160 on March 13, 2026 at 16:18

linwebs commented Mar 13, 2026

Revised approach in 80b8160: instead of writing provider-qualified keys into MODEL_CACHE (which corrupted discovery entries for OpenRouter-style slash-path model IDs), call sites that need per-provider disambiguation now call resolveContextTokensForModel with the full cfg and scan models.providers directly.

Both previous Greptile review threads are resolved — the issues they flagged no longer exist in the new code.

CI failure in src/cron/isolated-agent.delivers-response-has-heartbeat-ok-but-includes.test.ts is pre-existing in upstream main and unrelated to this PR.

@greptileai review


greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR fixes a multi-provider context-window collision bug where two providers sharing the same model ID (e.g. claude-4.6-sonnet) would both display and compute the context window of whichever provider was loaded last. The fix has two parts: applyConfiguredContextWindows is changed to write only bare model-id keys to MODEL_CACHE (preventing synthetic config writes from corrupting the slash-keyed discovery entries), and a new resolveContextTokensForModel helper is introduced that scans cfg.models.providers directly when a provider is known, completely bypassing the stale bare-key cache for provider-qualified lookups.

  • All ten call sites (get-reply-directives, agent-runner, followup-runner, agent-runner-memory, memory-flush, directive-handling.persist, cron/isolated-agent/run, sessions, session-utils) are consistently migrated to resolveContextTokensForModel.
  • followup-runner.ts additionally corrects providerUsed to prefer the actual post-fallback provider from agentMeta rather than always using fallbackProvider.
  • resolveContextTokens in model-selection.ts is now dead code — its only call site was removed in this PR and the function can be deleted.
  • The lookupContextTokens mock in run.test-harness.ts is now only reached via resolveContextTokensForModel's fallback path; tests that rely on a fixed 128 k return value should also mock resolveContextTokensForModel directly to remain reliable.

Confidence Score: 4/5

  • Safe to merge — the logic is correct and human-verified; minor dead code and a potentially stale test mock are the only remaining loose ends.
  • The core fix is well-reasoned and cleanly implemented across all call sites. The resolveContextTokensForModel function correctly handles the provider-qualified vs. bare-key distinction with appropriate guards. Two minor housekeeping items lower the score slightly: resolveContextTokens in model-selection.ts is now dead code and should be removed, and the lookupContextTokens mock in run.test-harness.ts may not reliably cover the changed code path in run.ts, potentially leaving a test gap.
  • src/auto-reply/reply/model-selection.ts (dead resolveContextTokens export), src/cron/isolated-agent/run.test-harness.ts (stale mock)

Comments Outside Diff (2)

  1. src/auto-reply/reply/model-selection.ts, lines 603-610

    resolveContextTokens is now dead code

    After this PR removes its only import in get-reply-directives.ts, resolveContextTokens has no remaining callers in the codebase (confirmed via grep — only the definition in this file remains). It can be deleted to avoid confusion for future contributors who might wonder whether it's intentionally kept.

  2. src/cron/isolated-agent/run.test-harness.ts, lines 124-130

    Stale mock may silently pass through real config scan

    run.ts now calls resolveContextTokensForModel (real implementation, since ...actual is spread) instead of lookupContextTokens directly. The mocked lookupContextTokens is only reached by resolveContextTokensForModel as a late fallback — after resolveConfiguredProviderContextWindow scans the test's cfg. If a test's cfg object has a matching contextWindow entry, the mock is never reached and the expected 128 000 may not be returned.

    Consider adding resolveContextTokensForModel to the mock so the return value is stable regardless of the cfg fixture:

    ```ts
    vi.mock("../../agents/context.js", async (importOriginal) => {
      const actual = await importOriginal<typeof import("../../agents/context.js")>();
      return {
        ...actual,
        lookupContextTokens: vi.fn().mockReturnValue(128000),
        resolveContextTokensForModel: vi.fn().mockReturnValue(128000),
      };
    });
    ```

Last reviewed commit: 80b8160

@openclaw-barnacle openclaw-barnacle bot removed the stale Marked as stale due to inactivity label Mar 14, 2026

@openclaw-barnacle openclaw-barnacle bot added the stale Marked as stale due to inactivity label Mar 19, 2026
@openclaw-barnacle
Copy link
Copy Markdown

Closing due to inactivity.
If you believe this PR should be revived, post in #pr-thunderdome-dangerzone on Discord to talk to a maintainer.
That channel is the escape hatch for high-quality PRs that get auto-closed.


Labels

agents Agent runtime and tooling commands Command implementations gateway Gateway runtime size: S stale Marked as stale due to inactivity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MODEL_CACHE silently overwrites context windows on provider conflicts (last-write-wins)

1 participant