Context window cache uses min-across-providers for same model id, causing incorrect /status display #35976

@akari-musubi

Summary

/status shows an incorrect (lower) context window for models that appear under multiple providers in the built-in catalog. The context window cache in src/agents/context.ts uses only the model id as cache key, so when the same model id exists under different providers with different context limits, the smallest value wins.

Reproduction

  1. Configure an agent with google-gemini-cli/gemini-3.1-pro-preview as the primary model
  2. Run /status
  3. Context line shows 0/128k instead of 0/1024k

Root Cause

In src/agents/context.ts, the cache population loop iterates over all providers and uses model.id as the cache key:

// L44-48
const existing = params.cache.get(model.id);
// When multiple providers expose the same model id with different limits,
// prefer the smaller window so token budgeting is fail-safe (no overestimation).
if (existing === undefined || contextWindow < existing) {
    params.cache.set(model.id, contextWindow);
}

gemini-3.1-pro-preview appears in the pi-ai built-in catalog under multiple providers:

  • github-copilot → contextWindow: 128,000
  • google-gemini-cli → contextWindow: 1,048,576

Since the cache takes the minimum, 128k is stored.
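The collision described above can be reproduced in isolation. The following is a minimal sketch, not the actual code from src/agents/context.ts: the `CatalogModel` shape and the `populateFlatCache` helper are hypothetical stand-ins, and only the comparison logic and the two catalog values are taken from this report.

```typescript
// Hypothetical, simplified catalog entry; the real pi-ai catalog shape may differ.
interface CatalogModel {
  provider: string;
  id: string;
  contextWindow: number;
}

// Mirrors the logic quoted above: key by model id only, keep the smaller window.
function populateFlatCache(models: CatalogModel[]): Map<string, number> {
  const cache = new Map<string, number>();
  for (const model of models) {
    const existing = cache.get(model.id);
    if (existing === undefined || model.contextWindow < existing) {
      cache.set(model.id, model.contextWindow);
    }
  }
  return cache;
}

// The two entries reported in this issue.
const catalog: CatalogModel[] = [
  { provider: "github-copilot", id: "gemini-3.1-pro-preview", contextWindow: 128_000 },
  { provider: "google-gemini-cli", id: "gemini-3.1-pro-preview", contextWindow: 1_048_576 },
];

const flatCache = populateFlatCache(catalog);

// Both providers collapse into one entry, and the smaller limit wins.
console.log(flatCache.get("gemini-3.1-pro-preview")); // 128000
```

Because the provider is discarded when the key is built, the result is the same regardless of the order in which providers are iterated.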

Impact

  • Display only: /status shows the wrong context window
  • No runtime impact: pi-embedded-runner/model.ts resolves contextWindow from the correct provider config at runtime, so actual context capacity and compaction thresholds are unaffected

Workaround

Manually set contextWindow in config to override the cache:

{
  "agents": {
    "defaults": {
      "models": {
        "google-gemini-cli/gemini-3.1-pro-preview": {
          "contextWindow": 1048576
        }
      }
    }
  }
}

Suggested Fix

Change the cache key from model.id to provider/model.id (or equivalent composite key) so that different providers with different context limits for the same model id do not collide. The lookupContextTokens caller would then need to pass the provider as well.
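A composite key could look roughly like the sketch below. All names here (`ProviderModel`, `compositeKey`, `populateProviderCache`, `lookupByProvider`) are hypothetical and chosen for illustration; the real `lookupContextTokens` signature is not shown in this report, so how the provider would be threaded through is an assumption.

```typescript
// Hypothetical, simplified catalog entry; not the real catalog type.
interface ProviderModel {
  provider: string;
  id: string;
  contextWindow: number;
}

// Assumed key format: "<provider>/<model id>".
function compositeKey(provider: string, modelId: string): string {
  return `${provider}/${modelId}`;
}

// One entry per (provider, model id) pair, so limits no longer collide.
function populateProviderCache(models: ProviderModel[]): Map<string, number> {
  const cache = new Map<string, number>();
  for (const model of models) {
    cache.set(compositeKey(model.provider, model.id), model.contextWindow);
  }
  return cache;
}

// Provider-aware lookup; returns undefined when the pair is unknown.
function lookupByProvider(
  cache: Map<string, number>,
  provider: string,
  modelId: string,
): number | undefined {
  return cache.get(compositeKey(provider, modelId));
}

const providerCache = populateProviderCache([
  { provider: "github-copilot", id: "gemini-3.1-pro-preview", contextWindow: 128_000 },
  { provider: "google-gemini-cli", id: "gemini-3.1-pro-preview", contextWindow: 1_048_576 },
]);

console.log(lookupByProvider(providerCache, "google-gemini-cli", "gemini-3.1-pro-preview")); // 1048576
console.log(lookupByProvider(providerCache, "github-copilot", "gemini-3.1-pro-preview")); // 128000
```

With this layout the "prefer the smaller window" rule is no longer needed for the collision case, though callers that only know a bare model id would still need some fallback policy.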

Alternatively, resolveContextTokensForModel could consult the model registry directly (which is provider-aware) instead of the flat cache.
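The registry-based alternative might be sketched as follows, assuming the model reference arrives as a "provider/model-id" string split at the first slash. `RegistryEntry`, `findInRegistry`, and `resolveContextTokensSketch` are hypothetical names, not the project's API; the flat cache is kept only as a fallback for references that carry no known provider.

```typescript
// Hypothetical provider-aware registry entry.
interface RegistryEntry {
  provider: string;
  id: string;
  contextWindow: number;
}

function findInRegistry(
  entries: RegistryEntry[],
  provider: string,
  modelId: string,
): RegistryEntry | undefined {
  return entries.find((e) => e.provider === provider && e.id === modelId);
}

// Prefer the provider-aware registry; fall back to the flat, id-keyed cache.
function resolveContextTokensSketch(
  entries: RegistryEntry[],
  fallbackCache: Map<string, number>,
  modelRef: string,
): number | undefined {
  const slash = modelRef.indexOf("/");
  if (slash > 0) {
    const hit = findInRegistry(entries, modelRef.slice(0, slash), modelRef.slice(slash + 1));
    if (hit !== undefined) {
      return hit.contextWindow;
    }
  }
  // No provider given, or provider not in the registry: use the conservative value.
  const bareId = slash > 0 ? modelRef.slice(slash + 1) : modelRef;
  return fallbackCache.get(bareId);
}

const registryEntries: RegistryEntry[] = [
  { provider: "github-copilot", id: "gemini-3.1-pro-preview", contextWindow: 128_000 },
  { provider: "google-gemini-cli", id: "gemini-3.1-pro-preview", contextWindow: 1_048_576 },
];
const conservativeCache = new Map<string, number>([["gemini-3.1-pro-preview", 128_000]]);

console.log(
  resolveContextTokensSketch(registryEntries, conservativeCache, "google-gemini-cli/gemini-3.1-pro-preview"),
); // 1048576
console.log(resolveContextTokensSketch(registryEntries, conservativeCache, "gemini-3.1-pro-preview")); // 128000
```

Keeping the minimum-based cache as the fallback preserves the fail-safe behavior described in the original code comment for callers that cannot supply a provider.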
