
fix: refactor context window cache to be provider-scoped #34511

Closed
cyberincome-openclaw wants to merge 15 commits into openclaw:main from cyberincome-openclaw:fix/provider-scoped-context

Conversation

@cyberincome-openclaw

  • Key MODEL_CACHE by provider::modelId to prevent cross-provider poisoning
  • Update lookupContextTokens to support scoped and unscoped lookups
  • Update call sites to pass provider metadata where available
  • Fixes issue where Gemini/Codex limits were truncated by third-party proxies

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: OpenClaw's global capability cache is keyed only by modelId, causing "cache poisoning" where a low limit from a proxy provider (e.g., GitHub Copilot at 128k) overwrites the native limit (e.g., Google API at 1M+) for the same model ID.
  • Why it matters: Users hit context-window overflows and reasoning degradation on high-capacity models because the system incorrectly caps their token budget.
  • What changed: Refactored MODEL_CACHE to use composite provider::modelId keys and updated core resolution logic to be provider-aware.
  • What did NOT change (scope boundary): No changes were made to the underlying provider discovery logic or token counting math; only the precedence and storage of the resulting limits were modified.
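The composite-key scheme above can be sketched minimally. The names (`MODEL_CACHE`, `lookupContextTokens`) follow the PR summary, but the normalization and fallback details here are assumptions, not the exact OpenClaw implementation:

```typescript
// Hypothetical sketch of the provider-scoped cache described in this PR.
const MODEL_CACHE = new Map<string, number>();

function cacheKey(modelId: string, provider?: string): string {
  const id = modelId.toLowerCase().trim();
  return provider ? `${provider.toLowerCase().trim()}::${id}` : id;
}

function lookupContextTokens(modelId: string, provider?: string): number | undefined {
  if (provider) {
    const scoped = MODEL_CACHE.get(cacheKey(modelId, provider));
    if (scoped !== undefined) return scoped;
  }
  // Legacy fallback: bare modelId key for callers without provider metadata.
  return MODEL_CACHE.get(cacheKey(modelId));
}

// Two providers exposing the same model id no longer clobber each other.
MODEL_CACHE.set(cacheKey("gemini-3-flash-preview", "google"), 1_048_576);
MODEL_CACHE.set(cacheKey("gemini-3-flash-preview", "github-copilot"), 128_000);
console.log(lookupContextTokens("gemini-3-flash-preview", "google")); // 1048576
console.log(lookupContextTokens("gemini-3-flash-preview", "github-copilot")); // 128000
```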

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes # (No official issue number yet, discovered via runtime audit)
  • Related #

User-visible / Behavior Changes

  • Context Window Restoration: Models like Gemini and Codex will now correctly report and use their full native context windows (e.g., 1,048,576 tokens) even if lower-limit variants of the same model exist in the provider catalog.
  • Override Persistence: Manual overrides in sessions.json will no longer be overwritten by "fail-safe" discovery logic during session hydration.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux (WSL2)
  • Runtime/container: Docker (OpenClaw official image)
  • Model/provider: google/gemini-3-flash-preview
  • Integration/channel (if any): Direct / Telegram
  • Relevant config (redacted): Both google (API) and google-gemini-cli (OAuth) configured alongside github-copilot.

Steps

  1. Configure an agent to use Gemini via Google API.
  2. Ensure GitHub Copilot is also configured (which exposes a 128k variant of the same Gemini model).
  3. Start a session and observe context usage via /status.

Expected

  • Context window should be 1,048,576.

Actual

  • Context window was capped at 128,000, leading to a 221%+ overflow status.

Evidence

Attach at least one:

  • Failing test/log before + passing after (Observed via session_status tool)
  • Trace/log snippets (Provided in investigation report)
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Confirmed that google::gemini... correctly pulls 1M+ tokens while github-copilot::gemini... maintains its 128k limit within the same running instance.
  • Edge cases checked: Verified that legacy lookups (calls without a provider string) still fallback to the bare modelId key to prevent breaking third-party plugins.
  • What you did not verify: I did not verify behavior on niche local providers like Ollama/vLLM, though the logic is generic and should apply.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert changes to src/agents/context.ts and the associated runner/utility files.
  • Files/config to restore: src/agents/context.ts, src/agents/pi-embedded-runner/run.ts, src/auto-reply/reply/followup-runner.ts, etc.
  • Known bad symptoms reviewers should watch for: Unexpected undefined context window limits if a provider string is mismatched (mitigated by legacy fallback logic).

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: Mismatched provider strings could cause a miss in the scoped cache.
  • Mitigation: A robust fallback to the legacy modelId-only lookup is implemented to ensure a limit is always returned.
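To illustrate the mitigation, normalizing both sides of the key turns casing or whitespace mismatches ("Google" vs "google") into hits rather than misses. This is a hypothetical sketch; the helper names are not from the PR:

```typescript
// Normalize both write and read paths so provider-string variants agree.
const normalize = (s: string) => s.toLowerCase().trim();
const scopedKey = (provider: string, modelId: string) =>
  `${normalize(provider)}::${normalize(modelId)}`;

const cache = new Map<string, number>([
  [scopedKey("google", "gemini-3-flash-preview"), 1_048_576],
]);
// A differently-cased, padded provider string still resolves the same entry.
console.log(cache.get(scopedKey(" Google ", "Gemini-3-Flash-Preview"))); // 1048576
```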

@openclaw-barnacle bot added labels: gateway (Gateway runtime), commands (Command implementations), agents (Agent runtime and tooling), size: S — Mar 4, 2026

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92f6ee2f57

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 217 to 219
// Fallback to legacy behavior where we check for a bare modelId match.
// This supports cases where the provider is not yet known or for generic aliases.
return MODEL_CACHE.get(modelId);

P2 Badge Normalize bare-key fallback before cache lookup

applyDiscoveredContextWindows and applyConfiguredContextWindows now write bare model keys in normalized lowercase form, but the fallback path in lookupContextTokens still does MODEL_CACHE.get(modelId) with the raw string. When a caller omits provider (or a provider mismatch forces fallback) and the model id is not already lowercase/trimmed, the cache miss now drops to DEFAULT_CONTEXT_TOKENS, which regresses the backward-compatible unscoped lookup behavior this change intends to keep.


Author

please review the latest commits, this issue should now be fixed.

@greptile-apps
Contributor

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR refactors the global context-window cache from bare modelId keys to composite provider::modelId keys, fixing the cross-provider "cache poisoning" where a low-limit proxy variant (e.g. GitHub Copilot at 128k) could overwrite the native limit (e.g. Google API at 1M+) for the same model ID. The call-site changes are thorough and correctly thread provider through the entire call chain.

Issues found:

  • Critical — require() in ESM module (src/agents/context-window-guard.ts, line 46): The project uses "type": "module", making require undefined at runtime. The call require("./context.js") inside the resolveContextWindowInfo IIFE will throw a ReferenceError whenever fromModelsConfig is falsy — which is the common case. This needs to become a static import at the top of the file.

  • Logic — Bare key skipped for provider-less models (src/agents/context.ts, lines 34–36): The new guard if (!model?.id || !model?.provider) means discovered models without a provider field are silently excluded from all cache paths (both scoped and bare). Before this PR, bare-key coverage was always written regardless of provider. Because discoverModels is dynamically imported as unknown, TypeScript cannot catch if models lack provider at runtime — any such entries will now be invisible to lookupContextTokens.

  • Logic — Inconsistent bare-key replacement policy in applyConfiguredContextWindows (src/agents/context.ts, line 89): applyDiscoveredContextWindows uses "smallest wins" for the bare key to keep budgeting fail-safe. applyConfiguredContextWindows uses unconditional set() (last-write-wins) for the same bare key, so a larger configured limit from a second provider can silently overwrite a smaller one.

Confidence Score: 1/5

  • Not safe to merge — a require() call in an ESM module will throw a ReferenceError at runtime on virtually every code path through resolveContextWindowInfo.
  • The require("./context.js") on line 46 of context-window-guard.ts is an unconditional runtime crash in this ESM project. It will surface the moment any agent session without an explicit modelsConfig override calls resolveContextWindowInfo — which is the standard path. Two additional logic issues (provider-less models being silently dropped from all cache paths, and inconsistent bare-key replacement policy) further reduce confidence, though those are less immediately catastrophic.
  • src/agents/context-window-guard.ts (ESM require crash) and src/agents/context.ts (bare-key consistency issues) require attention before merging.

Last reviewed commit: adc752e


// Fallback to legacy behavior where we check for a bare modelId match.
// This supports cases where the provider is not yet known or for generic aliases.
return MODEL_CACHE.get(modelId);
Contributor

Bare-key fallback uses unnormalized modelId

Both applyDiscoveredContextWindows and applyConfiguredContextWindows now store bare keys with .toLowerCase().trim() normalization, but this fallback lookup passes the raw, unnormalized modelId string directly to Map.get(). Any caller passing a model ID with uppercase letters or surrounding whitespace will silently miss the cache, even though the key exists under the normalized form. The scoped key lookup (earlier in the same function) correctly normalizes both sides of the key, making this an inconsistency introduced by this PR.

Suggested change:
- return MODEL_CACHE.get(modelId);
+ return MODEL_CACHE.get(modelId.toLowerCase().trim());
Path: src/agents/context.ts, line 219

params.sessionEntry ??
(params.sessionKey ? params.sessionStore?.[params.sessionKey] : undefined);
const contextWindowTokens = resolveMemoryFlushContextWindowTokens({
provider: params.followupRun.run.provider ?? params.defaultProvider,
Contributor

defaultProvider not declared in params type

params.defaultProvider is referenced here, but defaultProvider was never added to the runMemoryFlushIfNeeded params type (defined at lines 250–264). defaultModel is present in the type and works correctly on the next line, but its provider counterpart was omitted. TypeScript will flag this as a type error, and at JavaScript runtime the expression silently evaluates to undefined, so the provider fallback never actually takes effect—the context window lookup always falls back to the bare model ID key.

The fix is to add defaultProvider?: string; to the function's params type:

Suggested change:

export async function runMemoryFlushIfNeeded(params: {
  cfg: OpenClawConfig;
  followupRun: FollowupRun;
  promptForEstimate?: string;
  sessionCtx: TemplateContext;
  opts?: GetReplyOptions;
  defaultModel: string;
  defaultProvider?: string; // ← add this
  agentCfgContextTokens?: number;
Path: src/auto-reply/reply/agent-runner-memory.ts, line 291

@cyberincome-openclaw
Author

cyberincome-openclaw commented Mar 4, 2026

please review the latest commits, this issue should now be fixed.
@codex
@greptile-apps


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c956475517


const contextTokensUsed =
agentCfgContextTokens ??
lookupContextTokens(modelUsed) ??
lookupContextTokens(modelUsed, fallbackProvider) ??

P2 Badge Use resolved provider for followup context cache lookup

This lookup now uses fallbackProvider, but the run result already exposes the actual provider in runResult.meta?.agentMeta?.provider (and other runners prioritize that value). When those differ (for example, if the embedded runner reports a provider different from the selected fallback candidate), this line queries the wrong scoped cache key and can fall back to an unrelated bare/default context window, which skews token utilization and compaction decisions for followup turns.


@cyberincome-openclaw
Author

please review the latest commits, all issues should now be fixed.
@codex


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e1d2baf91a


Comment on lines +47 to +49
const cachedTokens = lookupContextTokens(params.modelId, params.provider);
if (cachedTokens) {
return { tokens: cachedTokens, source: "model" as const };

P1 Badge Prefer model metadata over fallback cache in guard

This cache lookup runs before modelContextWindow, but lookupContextTokens falls back to the bare model-id key when a scoped provider::model key is missing (src/agents/context.ts fallback path). On cold start or when only another provider defines the same model id, cachedTokens can come from the wrong provider and override a correct modelContextWindow, causing false low-context warnings/blocks for otherwise valid runs.



@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 636ca8ca38


Comment on lines +47 to +49
const cachedTokens = lookupContextTokens(params.modelId, params.provider);
if (cachedTokens) {
return { tokens: cachedTokens, source: "model" as const };

P1 Badge Prefer runtime model context window before cache fallback

This cache lookup now runs before modelContextWindow, so when lookupContextTokens misses the scoped key and falls back to a bare model key, it can return a limit learned from a different provider and override the actual runtime model metadata for this run. In resolveContextWindowInfo, that incorrect cached value is then used for warnings/blocks, which can falsely block valid models (for example on cold cache or provider-id mismatch) even when params.modelContextWindow has the correct window.
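The precedence Codex argues for can be sketched as follows. The parameter names here are assumptions based on the review comments, not the actual `resolveContextWindowInfo` signature:

```typescript
// Prefer runtime model metadata, then the cache, then the default.
type Resolved = { tokens: number; source: "runtime" | "cache" | "default" };

function resolveWindow(p: {
  modelContextWindow?: number; // live metadata for this run
  cachedTokens?: number; // lookupContextTokens result (may be a bare-key fallback)
  defaultTokens: number;
}): Resolved {
  if (p.modelContextWindow) return { tokens: p.modelContextWindow, source: "runtime" };
  if (p.cachedTokens) return { tokens: p.cachedTokens, source: "cache" };
  return { tokens: p.defaultTokens, source: "default" };
}

// A stale 128k bare-key cache entry no longer overrides correct runtime metadata.
console.log(
  resolveWindow({ modelContextWindow: 1_048_576, cachedTokens: 128_000, defaultTokens: 32_000 }),
); // { tokens: 1048576, source: "runtime" }
```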



@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: adc752ec09


}

// Try a scoped lookup in the model cache (composite key: provider::modelId)
const { lookupContextTokens } = require("./context.js");

P1 Badge Replace CommonJS require with ESM-safe lookup import

resolveContextWindowInfo now calls require("./context.js") on the hot path; in Node ESM execution contexts (the repo ships with "type": "module"), this throws before the guard can return a context window when modelsConfig has no direct match, which can break agent runs that rely on this guard path. Use an ESM-safe approach (for example a top-level import or createRequire) so the guard works consistently across Node runtimes.


}) {
for (const model of params.models) {
if (!model?.id) {
if (!model?.id || !model?.provider) {

P2 Badge Preserve bare-key discovery for providerless model entries

The new !model?.provider guard skips discovered models that only provide id + contextWindow, so those entries no longer populate even the legacy bare model key; in environments using older/partial registries this regresses to DEFAULT_CONTEXT_TOKENS instead of the discovered limit, reintroducing incorrect budgeting for unscoped lookups. Keep writing the bare-key cache entry when provider is missing.


@cyberincome-openclaw
Author

@codex
@greptileai


Comment on lines +46 to +47
const { lookupContextTokens } = require("./context.js");
const cachedTokens = lookupContextTokens(params.modelId, params.provider);
Contributor

require() throws ReferenceError in ESM module

This package is declared as "type": "module" in package.json, making all .ts/.js files native ESM. In ESM, require is not defined — calling it here will throw a ReferenceError: require is not defined at runtime on any code path where fromModelsConfig is falsy (i.e., the vast majority of calls, since most sessions have no per-model config override).

The fix is to use a static import at the top of the file instead:

import { lookupContextTokens } from "./context.js";

Then remove the require call entirely and call lookupContextTokens directly inside the IIFE.

Path: src/agents/context-window-guard.ts, lines 46–47

Comment on lines +86 to +89
params.cache.set(`${normalizedProvider}::${normalizedModelId}`, contextWindow);

// Set bare key (last one wins or smallest wins? Config should probably override discovery)
params.cache.set(normalizedModelId, contextWindow);
Contributor

Bare key in applyConfiguredContextWindows uses last-write-wins, not smallest-wins

applyDiscoveredContextWindows explicitly implements "smallest wins" for the bare key (lines 53–58) to keep budgeting fail-safe when the provider is unknown. But applyConfiguredContextWindows uses a plain cache.set() with no comparison, meaning if two providers in config list the same modelId with different limits, the last one iterated wins — regardless of which limit is smaller.

This inconsistency means that a larger configured limit from a second provider can silently overwrite a smaller one in the bare-key cache, undermining the fail-safe rationale documented in the comment above it.

Suggested change:
  params.cache.set(`${normalizedProvider}::${normalizedModelId}`, contextWindow);
- // Set bare key (last one wins or smallest wins? Config should probably override discovery)
- params.cache.set(normalizedModelId, contextWindow);
+ // Set bare key – prefer the smaller window so budgeting stays fail-safe
+ // when the provider is unknown, consistent with applyDiscoveredContextWindows.
+ const existingBare = params.cache.get(normalizedModelId);
+ if (existingBare === undefined || contextWindow < existingBare) {
+   params.cache.set(normalizedModelId, contextWindow);
+ }
Path: src/agents/context.ts, lines 86–89

Comment on lines 34 to 36
for (const model of params.models) {
if (!model?.id) {
if (!model?.id || !model?.provider) {
continue;
Contributor

Models without provider now silently skip all cache paths (regression risk)

Before this PR, a model entry with no provider still populated the bare-key cache via the original applyDiscoveredContextWindows logic. After this change the early continue at line 35 means such entries are entirely bypassed — neither the scoped nor the bare key is written for them.

discoverModels is dynamically imported and cast through unknown as ModelRegistryLike, so TypeScript cannot verify at compile time whether every returned entry actually carries a provider string. If any discovered models lack that field (e.g. because the underlying PI model-discovery layer doesn't attach a provider to every entry), those models will silently produce zero cache coverage — a regression from the previous behaviour where at least the bare key was populated.

The safest fix is to guard only the scoped-key write behind the provider truthiness check, while always writing the bare key regardless of whether provider is present. This preserves the full fallback coverage that existed before the refactor.

Path: src/agents/context.ts, lines 34–36

@jalehman self-assigned this on Mar 4, 2026
@jalehman
Contributor

jalehman commented Mar 4, 2026

Thanks for the provider-scoped cache refactor; the direction is solid. Before merge, please fix a blocker in src/agents/context-window-guard.ts: require("./context.js") fails in the current ESM TS environment (Cannot find module './context.js') and breaks resolveContextWindowInfo. An ESM-safe import (typically top-level import) should resolve this. I reproduced it with:

scripts/pr review-tests 34511 src/agents/context-window-guard.test.ts src/agents/context.lookup.test.ts
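An ESM-safe replacement for the flagged `require("./context.js")` is a static top-level import, as the reviewers note. Where a lazy CommonJS-style load is genuinely needed (for example, to break an import cycle), Node's `createRequire` also works; this is a runnable sketch using a builtin module rather than the project's own files:

```typescript
import { createRequire } from "node:module";

// In real ESM code you would typically pass import.meta.url; a filesystem
// path (here the working directory) also satisfies createRequire.
const cjsRequire = createRequire(process.cwd() + "/");
const os = cjsRequire("node:os");
console.log(typeof os.platform()); // "string"
```

A static `import { lookupContextTokens } from "./context.js";` at the top of context-window-guard.ts remains the simplest fix if no cycle is involved.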

cyberincome-openclaw and others added 10 commits March 6, 2026 05:55
- Key MODEL_CACHE by provider::modelId to prevent cross-provider poisoning
- Update lookupContextTokens to support scoped and unscoped lookups
- Update call sites to pass provider metadata where available
- Fixes issue where Gemini/Codex limits were truncated by third-party proxies
- Update resolveContextWindowInfo to attempt a scoped lookup in the model cache
- Prevents the guard from falling back to poisoned global limits or default tokens when a native limit is known for the provider
- Normalize bare modelId in lookupContextTokens fallback path
- Add defaultProvider to runMemoryFlushIfNeeded params type
- Add defaultProvider to runReplyAgent params type
- Pass defaultProvider from runPreparedReply to runReplyAgent
- Pass defaultProvider from runReplyAgent to runMemoryFlushIfNeeded
- Use runResult.meta?.agentMeta?.provider if available, falling back to fallbackProvider
- Ensures scoped cache lookups use the most accurate provider metadata
…cache

- Strip provider prefixes from model IDs during cache storage and lookup
- Implement 'Largest Wins' strategy for bare model ID cache to prevent proxy poisoning
- Align test fixtures with new normalization and priority logic
@cyberincome-openclaw force-pushed the fix/provider-scoped-context branch from 4519d30 to 48a8c08 on March 6, 2026 06:36

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48a8c0813f



// Set bare key (Largest wins globally to avoid proxy poisoning)
const existingBare = params.cache.get(normalizedModelId);
if (existingBare === undefined || contextWindow > existingBare) {

P2: Let configured context window override bare cache entry

In applyConfiguredContextWindows, the bare-key update only writes when contextWindow > existingBare, so a discovered larger value is never replaced by a smaller explicit config value. That means providerless lookups (and scoped lookups that fall back to bare in lookupContextTokens) can ignore a user-configured cap and keep using the discovered value instead. This regresses config override behavior for fallback paths and can inflate budgeting/guard decisions when models.providers[*].models[].contextWindow is intentionally lower.
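The distinction the comment asks for could look like the sketch below: only discovered values use "largest wins", while an explicit config value always overwrites the bare key. The function names echo the PR's `applyConfiguredContextWindows`/discovery split, but this is an assumed shape, not the real code.

```typescript
type Cache = Map<string, number>;

// Discovered limits are best-effort: keep the largest seen so a proxy's
// low limit cannot shrink the bare entry.
function applyDiscoveredContextWindow(cache: Cache, modelId: string, tokens: number): void {
  const existing = cache.get(modelId);
  if (existing === undefined || tokens > existing) cache.set(modelId, tokens);
}

// Configured limits are authoritative: always overwrite, even when the
// user intentionally configures a value smaller than what was discovered.
function applyConfiguredContextWindow(cache: Cache, modelId: string, tokens: number): void {
  cache.set(modelId, tokens);
}
```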


@jalehman (Contributor) left a comment


Changes Requested

Thank you for this contribution! We've reviewed this PR and have some feedback before it can move forward.

Preserve hierarchical model IDs during config lookup

normalizeModelId is currently stripping the first path segment unconditionally. That breaks slash-containing provider-local model IDs such as OpenRouter identifiers like anthropic/claude-opus-4-6, which means provider-scoped config entries no longer match during context resolution. Please adjust the lookup flow so provider-local model IDs are preserved when matching configured limits and cache keys.
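One way to address this is to strip the leading segment only when it matches the provider being scoped, so provider-local IDs that legitimately contain a slash survive. This is a sketch of the requested behavior, assuming a `normalizeModelId(modelId, provider)` signature that may differ from the actual code.

```typescript
// Strip a "provider/" prefix only when it names the scoping provider,
// preserving hierarchical IDs like OpenRouter's "anthropic/claude-opus-4-6".
function normalizeModelId(modelId: string, provider?: string): string {
  const lower = modelId.toLowerCase().trim();
  if (provider) {
    const prefix = `${provider.toLowerCase().trim()}/`;
    if (lower.startsWith(prefix)) return lower.slice(prefix.length);
  }
  return lower;
}
```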

Keep explicit configured limits authoritative on the bare fallback key

The provider-aware cache refactor changed the legacy bare fallback behavior so discovered limits can win over an explicit configured override. That regresses the existing contract and is already visible in the targeted failing test around models.providers overrides. Please restore the fallback behavior so an explicit configured limit still overrides discovered cache values for the bare-model entry.


Once these are addressed, we'll re-review. Feel free to ask questions if anything is unclear.


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6fca3f3568



// 1. Scoped lookup (highest precedence)
if (provider) {
const scopedKey = `${provider.toLowerCase().trim()}::${normalizedModelId}`;

P1: Canonicalize provider-scoped cache keys before writing

applyDiscoveredContextWindows stores scoped entries with the raw normalized model id (provider::model.id), so provider-prefixed ids become doubly prefixed (for example github-copilot::github-copilot/gemini-3-flash). Runtime lookups use split refs (provider + provider-stripped model, via parseModelRef in src/agents/model-selection.ts), so lookupContextTokens misses the scoped key and falls back to the bare key, which can pull another provider’s limit and reintroduce cross-provider context-window poisoning.
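A canonicalizing key builder avoids the double prefix by stripping the provider segment before composing the scoped key. The helper name below is illustrative (the real code routes through `parseModelRef`), so treat this as a sketch of the fix, not the actual API.

```typescript
// Build "provider::model" with the provider prefix removed from the model
// id, so stored keys match the split refs used at lookup time and
// "github-copilot::github-copilot/gemini-3-flash" can never be written.
function scopedCacheKey(provider: string, modelId: string): string {
  const p = provider.toLowerCase().trim();
  let m = modelId.toLowerCase().trim();
  if (m.startsWith(`${p}/`)) m = m.slice(p.length + 1); // drop "provider/"
  return `${p}::${m}`;
}
```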


This ensures that scoped lookups hit correctly even when the model ID contains a provider prefix (e.g., github-copilot/gemini-3-flash).

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 477ea17192


Comment on lines +45 to +47
const cachedTokens = lookupContextTokens(params.modelId, params.provider);
if (cachedTokens) {
return { tokens: cachedTokens, source: "model" as const };

P1: Prefer runtime model context window over cache fallback

resolveContextWindowInfo now returns lookupContextTokens(...) before considering modelContextWindow, so a scoped-cache miss can pull a bare-model fallback from another provider and override the actual runtime window reported for this run. In mixed-provider setups (or cold/warming cache states), this can produce false low-context warnings/blocks even when params.modelContextWindow is valid for the selected provider, so the runtime model metadata should be checked before cache fallback.
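The precedence the reviewer suggests is: runtime-reported window first, then cache, then a default. A minimal sketch, assuming a `resolveContextWindowInfo` params shape and a 128k default that are both illustrative rather than taken from the codebase:

```typescript
interface ResolveParams {
  modelId: string;
  provider?: string;
  modelContextWindow?: number; // window reported at runtime for this run
  lookup: (modelId: string, provider?: string) => number | undefined;
}

function resolveContextWindowInfo(params: ResolveParams): { tokens: number; source: string } {
  // 1. Trust the runtime-reported window for the selected provider first.
  if (params.modelContextWindow && params.modelContextWindow > 0) {
    return { tokens: params.modelContextWindow, source: "runtime" };
  }
  // 2. Fall back to the (possibly bare-keyed) cache.
  const cached = params.lookup(params.modelId, params.provider);
  if (cached) return { tokens: cached, source: "model" };
  // 3. Last resort: a conservative default (value here is illustrative).
  return { tokens: 128000, source: "default" };
}
```

This keeps a cold or mis-keyed cache from overriding valid per-run metadata in mixed-provider setups.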


@chatgpt-codex-connector

💡 Codex Review

const scopedKey = `${normalizeProviderId(provider)}::${model.id}`;

P2: Normalize discovered model IDs before writing scoped keys

lookupContextTokens lowercases modelId before every cache read, but applyDiscoveredContextWindows writes discovered IDs verbatim into provider-scoped keys. If discovery returns mixed-case IDs (for example MiniMax-M2.5-style IDs), provider-aware lookups will miss the scoped entry and fall back to bare/default limits, which can under/over-estimate context windows for runs that rely on discovery rather than explicit models.providers[*].models[].contextWindow config.
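Folding case once at write time makes reads (which lowercase) and writes agree. A sketch under assumed names, since the real write path lives in `applyDiscoveredContextWindows`:

```typescript
// Lowercase both halves of the scoped key at write time so a discovery
// result like "MiniMax-M2.5" is found by the lowercased lookup path.
function writeScopedEntry(
  cache: Map<string, number>,
  provider: string,
  discoveredId: string,
  tokens: number,
): void {
  const key = `${provider.toLowerCase().trim()}::${discoveredId.toLowerCase().trim()}`;
  cache.set(key, tokens);
}
```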


@jalehman (Contributor)

Sorry I didn't get back to this one soon enough. Looks like it's been resolved by #36389 now. Thank you though!

@jalehman closed this on Mar 31, 2026

Labels

agents (Agent runtime and tooling) · commands (Command implementations) · gateway (Gateway runtime) · size: M
