bug(context-indicator): older sessions show '100' / '890% used' on load — #1356 fix didn't cover GET /api/session path, falls back to cumulative input_tokens vs 128K default #1436

@nesquena-hermes

Summary

The context-window indicator shows "100" on the ring with the tooltip claiming "890% used (context exceeded), 1.2M / 131.1k tokens used" for older sessions, even when the active model has a 1M context window. Reproduces consistently for sessions Nathan opened today (May 1 2026, v0.50.253+).

This is a regression-shaped bug in older / loaded sessions. #1356 (closed Apr 30) fixed the same symptom on the live SSE path but didn't cover the GET /api/session load path that older sessions go through.

Verification — #1356 fixes ARE live in master (v0.50.253)

Per the "fix shipped but user reports it still happens" protocol, I verified each #1356 claim against current master:

#1356 claim | Status | Location
api/streaming.py SSE payload falls back to get_model_context_length() when compressor doesn't supply it | ✅ Present | api/streaming.py:2333-2342
api/streaming.py falls back to s.last_prompt_tokens to avoid cumulative input_tokens | ✅ Present | api/streaming.py:2347-2350
static/ui.js tracks rawPct separately and shows "(context exceeded)" when > 100 | ✅ Present | static/ui.js:1283-1285, 1313

All three fixes are live. The bug Nathan reports is a different code path the fix didn't address.

Root cause — the GET /api/session load path has no fallback

When an older session is loaded (clicking it in the sidebar), the chain is:

  1. api/routes.py:1302-1304 returns context_length, threshold_tokens, last_prompt_tokens straight from the persisted Session object:

    "context_length": getattr(s, "context_length", 0) or 0,
    "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
    "last_prompt_tokens": getattr(s, "last_prompt_tokens", 0) or 0,

    For sessions that pre-date #1318 (when these fields were added to Session), all three are 0.

  2. static/sessions.js:512-519 then calls _syncCtxIndicator with those zeros:

    _syncCtxIndicator({
      input_tokens:      _pick(u.input_tokens,      _s.input_tokens),     // cumulative
      last_prompt_tokens:_pick(u.last_prompt_tokens,_s.last_prompt_tokens), // 0 for old sessions
      context_length:    _pick(u.context_length,    _s.context_length),    // 0 for old sessions
      ...
    });
  3. static/ui.js:1269, 1273 then triggers two cascading fallbacks:

    const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0;  // → cumulative
    const ctxWindow = usage.context_length || DEFAULT_CTX;                  // → 131,072 (128K)
  4. Result: rawPct = cumulative input_tokens / 131,072 ≈ 890% (the tooltip rounds the token count up to "1.2M"), capped to 100 on the ring per :1284 (Math.min(100, rawPct)), with the tooltip showing the raw 890% per :1313.

This is exactly the symptom #1356 fixed on the SSE path, repeated on the load path because the same fallback was never added to api/routes.py.
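
The cascade above can be reproduced with a few lines of arithmetic. This is only a sketch that models the two static/ui.js fallbacks in Python; the function name `indicator` and the token count 1,166,500 are assumptions, chosen so that the count rounds to the "1.2M" in the tooltip.

```python
DEFAULT_CTX = 128 * 1024  # 131,072: the 128K default named in this issue


def indicator(usage: dict) -> tuple[int, float]:
    """Return (ring_value, raw_pct) using the current fallback chain."""
    # Fallback 1: a missing last_prompt_tokens becomes cumulative input_tokens.
    prompt_tok = usage.get("last_prompt_tokens") or usage.get("input_tokens") or 0
    # Fallback 2: a missing context_length becomes the 128K default.
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    raw_pct = 100 * prompt_tok / ctx_window
    # The ring is capped at 100; the tooltip shows the raw percentage.
    return min(100, round(raw_pct)), raw_pct


# Hypothetical pre-#1318 session: only the cumulative counter is populated.
old_session = {"input_tokens": 1_166_500, "last_prompt_tokens": 0, "context_length": 0}
ring, raw_pct = indicator(old_session)
print(ring, round(raw_pct))  # 100 890
```

Both fallbacks have to fire to produce the reported numbers: the numerator is a cumulative counter and the denominator is a default unrelated to the active model.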

Why the SSE-path fix doesn't help

After loading a session, the indicator stays at "100 / 890%" until a new message is sent and the SSE done payload arrives. Users hitting older sessions on page load see the broken number and never reach the SSE path that would correct it. Worse — the tooltip is misleading enough ("Compress now to free context") that some users will compress unnecessarily.

Fix shape

Mirror the SSE-path fallback in two places so the load path matches:

Backend — api/routes.py:1302-1304

Add the same get_model_context_length() fallback that api/streaming.py:2333-2342 uses:

# Resolve effective values with model-metadata fallback so older sessions
# (pre-#1318) that have context_length=0 still get a meaningful indicator.
_persisted_cl = getattr(s, "context_length", 0) or 0
_persisted_lpt = getattr(s, "last_prompt_tokens", 0) or 0
if not _persisted_cl:
    try:
        from agent.model_metadata import get_model_context_length
        _persisted_cl = get_model_context_length(
            getattr(s, "model", "") or effective_model or "",
            "",  # base_url not stored on Session
        ) or 0
    except Exception:
        pass

raw = s.compact() | {
    ...
    "context_length": _persisted_cl,
    "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
    "last_prompt_tokens": _persisted_lpt,
}
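
To keep the fallback testable without the web layer, the resolution step could be pulled into a small helper. The sketch below is one possible shape, not the project's code: `resolve_context_length`, `fake_lookup`, and the model name `example-model-1m` are hypothetical, and the real get_model_context_length() takes a second base_url argument (per the snippet above), so a caller would wrap it before passing it in.

```python
from typing import Callable, Optional


def resolve_context_length(
    persisted: Optional[int],
    model: str,
    lookup: Callable[[str], Optional[int]],
) -> int:
    """Prefer the persisted value; otherwise ask the model-metadata lookup."""
    if persisted:
        return persisted
    if not model:
        return 0
    try:
        return lookup(model) or 0
    except Exception:
        # A metadata failure degrades to "unknown" rather than breaking session load.
        return 0


# Hypothetical stand-in for the metadata lookup; the value is made up.
_FAKE_WINDOWS = {"example-model-1m": 1_000_000}


def fake_lookup(model: str) -> Optional[int]:
    return _FAKE_WINDOWS.get(model)


print(resolve_context_length(0, "example-model-1m", fake_lookup))
```

Injecting the lookup keeps the helper free of import-time dependencies, so a regression test can cover the "persisted value is 0" case directly.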

Frontend — static/ui.js:1269 (defense in depth)

When last_prompt_tokens is missing, don't fall back to cumulative input_tokens. Show "·" or "no last-prompt data" instead:

// Old:
const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0;
// New:
const promptTok = usage.last_prompt_tokens || 0;
const hasPromptTok = !!promptTok;
// (existing branch already shows "tokens used" when !hasPromptTok)

The current behavior of falling back to cumulative input_tokens was a workaround for missing last_prompt_tokens in pre-#1318 sessions, but the cumulative number is fundamentally wrong for "context window % used" — it conflates input across all turns. Better to show "no data" than wrong data.

This second change is the more important one: it prevents this specific failure mode from recurring even if a future code path forgets the backend fallback.
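
A quick way to see why the frontend change matters on its own: even with the backend resolving a 1M window, a cumulative count of 1.2M would still read as roughly 120% under the old fallback. The sketch below models the proposed behaviour in Python; `context_pct` is a hypothetical name, and returning `None` stands in for whatever "no data" state the UI renders.

```python
from typing import Optional

DEFAULT_CTX = 128 * 1024  # 131,072: the 128K default named in this issue


def context_pct(usage: dict) -> Optional[float]:
    """Raw percentage of the window used, or None without a per-prompt count."""
    prompt_tok = usage.get("last_prompt_tokens") or 0
    if not prompt_tok:
        # No fallback to cumulative input_tokens: report "no data" instead.
        return None
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    return 100 * prompt_tok / ctx_window


# Hypothetical old session after the backend fix alone: the window is resolved,
# but only the cumulative counter exists.
old = {"input_tokens": 1_200_000, "last_prompt_tokens": 0, "context_length": 1_000_000}
print(context_pct(old))  # None
```

Once a new turn supplies last_prompt_tokens, the same function returns a real percentage again, so the "no data" state only lasts until the next message.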

Test shape

Add to tests/test_session_load_context_length.py:

def test_get_session_returns_context_length_fallback_for_older_session(...):
    # Build a Session with context_length=0 (pre-#1318)
    s = Session(...)
    s.context_length = 0
    s.last_prompt_tokens = 0
    s.input_tokens = 1_200_000  # cumulative
    s.model = "claude-sonnet-4.6"  # 1M context
    s.save()

    resp = GET /api/session?session_id=...
    assert resp["context_length"] == 1_000_000  # resolved from model_metadata
    # NOT 0, NOT 128K

Plus a frontend test in the QA harness that loads an older session and confirms the ring doesn't show "100" against an artificial 128K default.

Reported by

@AvidFuturist in Discord (webui channel, May 1 2026):

  • "the 100 comes up way too often in context circle now. We must prevent that from happening to older sessions or whatever incorrectly causes this now"
  • Reproduces in "all my chats today" — consistent with older sessions cached in browser/localStorage where s.context_length was never persisted

Screenshot: tooltip shows "Context window — 890% used (context exceeded), 1.2M / 131.1k tokens used" with the ring capped at 100.

Scope

api/routes.py (~10 LOC) + static/ui.js (~3 LOC). Both are surgical. Add a regression test alongside.

This is a P1 user-facing regression — the indicator is broken for every older session as soon as users open them today. Worth a hotfix release rather than waiting for the next batch.

Metadata

Labels: bug, sprint-candidate
