Summary
The context-window indicator shows "100" on the ring with the tooltip claiming "890% used (context exceeded), 1.2M / 131.1k tokens used" for older sessions, even when the active model has a 1M context window. Reproduces consistently for sessions Nathan opened today (May 1 2026, v0.50.253+).
This is a regression-shaped bug in older / loaded sessions — #1356 (closed Apr 30) fixed the same symptom on the live SSE path but didn't cover the GET /api/session load path that older sessions go through.
Verification — #1356 fixes ARE live in master (v0.50.253)
Per the "fix shipped but user reports it still happens" protocol, I verified each #1356 claim against current master:
| #1356 claim | Status | Location |
|---|---|---|
| api/streaming.py SSE payload falls back to get_model_context_length() when compressor doesn't supply it | ✅ Present | api/streaming.py:2333-2342 |
| api/streaming.py falls back to s.last_prompt_tokens to avoid cumulative input_tokens | ✅ Present | api/streaming.py:2347-2350 |
| static/ui.js tracks rawPct separately and shows "(context exceeded)" when > 100 | ✅ Present | static/ui.js:1283-1285, 1313 |
All three fixes are live. The bug Nathan reports is a different code path the fix didn't address.
Root cause — the GET /api/session load path has no fallback
When an older session is loaded (clicking it in the sidebar), the chain is:
- api/routes.py:1302-1304 returns context_length, threshold_tokens, last_prompt_tokens straight from the persisted Session object:

      "context_length": getattr(s, "context_length", 0) or 0,
      "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
      "last_prompt_tokens": getattr(s, "last_prompt_tokens", 0) or 0,

  For sessions that pre-date #1318 (when these fields were added to Session), all three are 0.
- static/sessions.js:512-519 then calls _syncCtxIndicator with those zeros:

      _syncCtxIndicator({
        input_tokens: _pick(u.input_tokens, _s.input_tokens), // cumulative
        last_prompt_tokens: _pick(u.last_prompt_tokens, _s.last_prompt_tokens), // 0 for old sessions
        context_length: _pick(u.context_length, _s.context_length), // 0 for old sessions
        ...
      });

- static/ui.js:1269, 1273 then triggers two cascading fallbacks:

      const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0; // → cumulative
      const ctxWindow = usage.context_length || DEFAULT_CTX; // → 131,072 (128K)

- Result: rawPct = 1.2M / 131K = ~890%, capped to 100 on the ring per :1284 (Math.min(100, rawPct)), with the tooltip showing the raw 890% per :1313.
This is exactly the symptom #1356 fixed on the SSE path, repeated on the load path because the same fallback was never added to api/routes.py.
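The chain above can be reproduced numerically. The sketch below is a Python model of the two fallbacks quoted from static/ui.js, not the actual frontend code. The function name `indicator_state` is hypothetical, and the token count of 1,166,000 is an assumed value chosen so that it displays as roughly "1.2M" and lands on the reported 890%.

```python
DEFAULT_CTX = 131_072  # 128K default quoted from static/ui.js


def indicator_state(usage: dict) -> tuple:
    """Return (ring value, raw percent) using the current fallback chain."""
    # A missing last_prompt_tokens falls through to cumulative input_tokens.
    prompt_tok = usage.get("last_prompt_tokens") or usage.get("input_tokens") or 0
    # A missing context_length falls through to the 128K default.
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    raw_pct = prompt_tok / ctx_window * 100
    # The ring is capped at 100; the tooltip shows the raw value.
    return min(100, round(raw_pct)), raw_pct


# Pre-#1318 session as returned by GET /api/session: both newer fields are 0.
old_session = {"input_tokens": 1_166_000, "last_prompt_tokens": 0, "context_length": 0}
ring, raw_pct = indicator_state(old_session)
print(ring, round(raw_pct))  # prints: 100 890
```

Both fallbacks have to fire for the 890% figure to appear: the numerator becomes the cumulative count and the denominator becomes the 128K default.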
Why the SSE-path fix doesn't help
After loading a session, the indicator stays at "100 / 890%" until a new message is sent and the SSE done payload arrives. Users hitting older sessions on page load see the broken number and never reach the SSE path that would correct it. Worse — the tooltip is misleading enough ("Compress now to free context") that some users will compress unnecessarily.
Fix shape
Mirror the SSE-path fallback in two places so the load path matches:
Backend — api/routes.py:1302-1304
Add the same get_model_context_length() fallback that api/streaming.py:2333-2342 uses:
# Resolve effective values with model-metadata fallback so older sessions
# (pre-#1318) that have context_length=0 still get a meaningful indicator.
_persisted_cl = getattr(s, "context_length", 0) or 0
_persisted_lpt = getattr(s, "last_prompt_tokens", 0) or 0
if not _persisted_cl:
    try:
        from agent.model_metadata import get_model_context_length
        _persisted_cl = get_model_context_length(
            getattr(s, "model", "") or effective_model or "",
            "",  # base_url not stored on Session
        ) or 0
    except Exception:
        pass
raw = s.compact() | {
    ...
    "context_length": _persisted_cl,
    "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
    "last_prompt_tokens": _persisted_lpt,
}
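To try the fallback in isolation, here is a minimal, self-contained sketch. It does not import the real agent.model_metadata; the lookup is passed in as a parameter instead. The names `resolve_context_length` and `fake_lookup`, and the context sizes in the table, are hypothetical and only illustrate the intended behavior.

```python
from types import SimpleNamespace
from typing import Callable


def resolve_context_length(
    session, lookup: Callable[[str, str], int], default_model: str = ""
) -> int:
    """Prefer the persisted value; otherwise ask the model-metadata lookup."""
    persisted = getattr(session, "context_length", 0) or 0
    if persisted:
        return persisted
    model = getattr(session, "model", "") or default_model or ""
    try:
        return lookup(model, "") or 0  # base_url is not stored on Session
    except Exception:
        # A failed lookup must not break session loading; 0 means "unknown".
        return 0


def fake_lookup(model: str, base_url: str) -> int:
    # Hypothetical stand-in for get_model_context_length().
    table = {"claude-sonnet-4.6": 1_000_000}
    return table[model]  # raises KeyError for unknown models


old = SimpleNamespace(context_length=0, model="claude-sonnet-4.6")
new = SimpleNamespace(context_length=200_000, model="claude-sonnet-4.6")
unknown = SimpleNamespace(context_length=0, model="some-unlisted-model")

print(resolve_context_length(old, fake_lookup))      # resolved from the lookup
print(resolve_context_length(new, fake_lookup))      # persisted value wins
print(resolve_context_length(unknown, fake_lookup))  # lookup failed, stays 0
```

Passing the lookup in as an argument is a choice made for this sketch so it runs without the project; the proposed patch imports the function inline as shown above.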
Frontend — static/ui.js:1269 (defense in depth)
When last_prompt_tokens is missing, don't fall back to cumulative input_tokens. Show "·" or "no last-prompt data" instead:
// Old:
const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0;
// New:
const promptTok = usage.last_prompt_tokens || 0;
const hasPromptTok = !!promptTok;
// (existing branch already shows "tokens used" when !hasPromptTok)
The current behavior of falling back to cumulative input_tokens was a workaround for missing last_prompt_tokens in pre-#1318 sessions, but the cumulative number is fundamentally wrong for "context window % used" — it conflates input across all turns. Better to show "no data" than wrong data.
This second change is the more important one: it prevents this specific failure mode from recurring even if a future code path forgets the backend fallback.
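One way to see why the frontend change matters: with only the backend fallback in place, a pre-#1318 session still has last_prompt_tokens == 0, so the cumulative count would be divided by the correctly resolved window. The Python model below compares the old and proposed rules. It is not the real ui.js logic; `pct_used`, the `allow_cumulative` flag, and the token count are assumptions made for illustration.

```python
from typing import Optional

DEFAULT_CTX = 131_072  # 128K default quoted from static/ui.js


def pct_used(usage: dict, allow_cumulative: bool) -> Optional[float]:
    """Percent of the context window used, or None when it cannot be known."""
    prompt_tok = usage.get("last_prompt_tokens") or 0
    if not prompt_tok and allow_cumulative:
        # Old rule: fall back to the cumulative input token count.
        prompt_tok = usage.get("input_tokens") or 0
    if not prompt_tok:
        # Proposed rule: report "no data" rather than a wrong number.
        return None
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    return prompt_tok / ctx_window * 100


# Old session after the backend fix: window resolved, last prompt still unknown.
usage = {"input_tokens": 1_166_000, "last_prompt_tokens": 0, "context_length": 1_000_000}
print(pct_used(usage, allow_cumulative=True))   # still above 100
print(pct_used(usage, allow_cumulative=False))  # None
```

Under these assumed numbers the old rule still reports more than 100% against a 1M window, so the ring would stay at 100 even after the backend change.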
Test shape
Add to tests/test_session_load_context_length.py:
def test_get_session_returns_context_length_fallback_for_older_session(...):
    # Build a Session with context_length=0 (pre-#1318)
    s = Session(...)
    s.context_length = 0
    s.last_prompt_tokens = 0
    s.input_tokens = 1_200_000  # cumulative
    s.model = "claude-sonnet-4.6"  # 1M context
    s.save()
    resp = GET /api/session?session_id=...
    assert resp["context_length"] == 1_000_000  # resolved from model_metadata
    # NOT 0, NOT 128K
Plus a frontend test in the QA harness that loads an older session and confirms the ring doesn't show "100" against an artificial 128K default.
Reported by
@AvidFuturist in Discord (webui channel, May 1 2026):
- "the 100 comes up way too often in context circle now. We must prevent that from happening to older sessions or whatever incorrectly causes this now"
- Reproduces in "all my chats today", consistent with older sessions cached in browser/localStorage where s.context_length was never persisted
Screenshot: tooltip shows "Context window — 890% used (context exceeded), 1.2M / 131.1k tokens used" with the ring capped at 100.
Scope
api/routes.py (~10 LOC) + static/ui.js (~3 LOC). Both are surgical. Add a regression test alongside.
This is a P1 user-facing regression — the indicator is broken for every older session as soon as users open them today. Worth a hotfix release rather than waiting for the next batch.
Cross-links
- [...] context_length field to Session), #1348 "fix(streaming): fallback to model_metadata for context_length when compressor missing (#1318 follow-up)" (follow-up to #1318 "fix: context window indicator not updating + responsive sidebar + chat area width at 769px": model_metadata fallback in session save), #1349 "fix(ui): show context indicator percentage without explicit context_length" (UI: show percentage without explicit context_length), #437 "WebUI context indicator can prefer stale session usage over latest usage on session load" (older, closed: stale session usage)
- [...] context_length=0 persisted; need a runtime fallback when loading them, mirroring the SSE-path fallback that #1356 ("fix: context window indicator shows 100% when context_length is missing from SSE payload") added