bug(context-indicator): older sessions show '100' / '890% used' on load — #1356 fix didn't cover GET /api/session path, falls back to cumulative input_tokens vs 128K default

## Summary

The context-window indicator shows "100" on the ring with the tooltip claiming "**890% used (context exceeded), 1.2M / 131.1k tokens used**" for older sessions, even when the active model has a 1M context window. Reproduces consistently for sessions Nathan opened today (May 1 2026, v0.50.253+).

This is a regression-shaped bug in **older / loaded sessions** — `#1356` (closed Apr 30) fixed the same symptom on the **live SSE path** but didn't cover the **GET /api/session load path** that older sessions go through.

## Verification — #1356 fixes ARE live in master (v0.50.253)

Per the "fix shipped but user reports it still happens" protocol, I verified each #1356 claim against current master:

| #1356 claim | Status | Location |
|---|---|---|
| `api/streaming.py` SSE payload falls back to `get_model_context_length()` when compressor doesn't supply it | ✅ Present | `api/streaming.py:2333-2342` |
| `api/streaming.py` falls back to `s.last_prompt_tokens` to avoid cumulative `input_tokens` | ✅ Present | `api/streaming.py:2347-2350` |
| `static/ui.js` tracks `rawPct` separately and shows "(context exceeded)" when > 100 | ✅ Present | `static/ui.js:1283-1285, 1313` |

All three fixes are live. The bug Nathan reports is a **different code path** the fix didn't address.

## Root cause — the GET /api/session load path has no fallback

When an older session is **loaded** (clicking it in the sidebar), the chain is:

1. **`api/routes.py:1302-1304`** returns `context_length`, `threshold_tokens`, `last_prompt_tokens` straight from the persisted Session object:
   ```python
   "context_length": getattr(s, "context_length", 0) or 0,
   "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
   "last_prompt_tokens": getattr(s, "last_prompt_tokens", 0) or 0,
   ```
   For sessions that pre-date `#1318` (when these fields were added to `Session`), all three are `0`.

2. **`static/sessions.js:512-519`** then calls `_syncCtxIndicator` with those zeros:
   ```js
   _syncCtxIndicator({
     input_tokens:      _pick(u.input_tokens,      _s.input_tokens),     // cumulative
     last_prompt_tokens:_pick(u.last_prompt_tokens,_s.last_prompt_tokens), // 0 for old sessions
     context_length:    _pick(u.context_length,    _s.context_length),    // 0 for old sessions
     ...
   });
   ```

3. **`static/ui.js:1269, 1273`** then triggers two cascading fallbacks:
   ```js
   const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0;  // → cumulative
   const ctxWindow = usage.context_length || DEFAULT_CTX;                  // → 131,072 (128K)
   ```

4. Result: `rawPct ≈ 1.17M / 131,072 ≈ 890%` (the tooltip rounds the token count to 1.2M), capped to 100 on the ring per `:1284` (`Math.min(100, rawPct)`), with the tooltip showing the raw 890% per `:1313`.

This is **exactly** the symptom #1356 fixed on the SSE path, repeated on the load path because the same fallback was never added to `api/routes.py`.
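
To make the arithmetic concrete, here is a minimal Python model of the two cascading fallbacks from step 3. It is a sketch, not the actual `ui.js` code: the function name `indicator`, the rounding step, and the cumulative count `1_166_500` (chosen so that it displays as roughly 1.2M) are assumptions for illustration.

```python
DEFAULT_CTX = 128 * 1024  # 131,072, the 128K default from step 3


def indicator(usage: dict) -> tuple[int, int]:
    """Return (ring value, raw percent) using the current fallback chain."""
    # Fallback 1: a missing last_prompt_tokens falls through to the cumulative count.
    prompt_tok = usage.get("last_prompt_tokens") or usage.get("input_tokens") or 0
    # Fallback 2: a missing context_length falls through to the 128K default.
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    raw_pct = round(prompt_tok / ctx_window * 100)  # rounding is an assumption
    return min(100, raw_pct), raw_pct


# Pre-#1318 session: both newer fields are 0, only the cumulative count exists.
old_session = {"input_tokens": 1_166_500, "last_prompt_tokens": 0, "context_length": 0}
print(indicator(old_session))  # (100, 890)
```

Both fallbacks have to fire for the 890% figure to appear, which is why supplying either a real `context_length` or a real `last_prompt_tokens` changes the result.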

## Why the SSE-path fix doesn't help

After loading a session, the indicator stays at "100 / 890%" until a new message is sent and the SSE `done` payload arrives. Users hitting older sessions on page load see the broken number and never reach the SSE path that would correct it. Worse — the tooltip is misleading enough ("Compress now to free context") that some users will compress unnecessarily.

## Fix shape

Fix the load path in **two places**: mirror the SSE-path fallback on the backend, and stop the frontend from substituting cumulative `input_tokens`:

### Backend — `api/routes.py:1302-1304`

Add the same `get_model_context_length()` fallback that `api/streaming.py:2333-2342` uses:

```python
# Resolve effective values with model-metadata fallback so older sessions
# (pre-#1318) that have context_length=0 still get a meaningful indicator.
_persisted_cl = getattr(s, "context_length", 0) or 0
_persisted_lpt = getattr(s, "last_prompt_tokens", 0) or 0
if not _persisted_cl:
    try:
        from agent.model_metadata import get_model_context_length
        _persisted_cl = get_model_context_length(
            getattr(s, "model", "") or effective_model or "",
            "",  # base_url not stored on Session
        ) or 0
    except Exception:
        pass

raw = s.compact() | {
    ...
    "context_length": _persisted_cl,
    "threshold_tokens": getattr(s, "threshold_tokens", 0) or 0,
    "last_prompt_tokens": _persisted_lpt,
}
```
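
The fallback rule can also be isolated into a small pure function, which makes it easy to unit-test without a real `Session` or the real model metadata. This is a sketch under stated assumptions: `resolve_context_length`, the injected `lookup` callable, and the stub table with its model name are hypothetical and are not part of `api/routes.py` or `agent.model_metadata`.

```python
from typing import Callable, Optional


def resolve_context_length(
    persisted: Optional[int],
    model: str,
    lookup: Callable[[str], Optional[int]],
) -> int:
    """Prefer the persisted value; otherwise ask the model-metadata lookup."""
    if persisted:
        return persisted
    try:
        return lookup(model) or 0
    except Exception:
        # Metadata lookup is best-effort; 0 lets the caller apply its own default.
        return 0


# Hypothetical stand-in for the real metadata lookup.
STUB_CONTEXT_LENGTHS = {"example-1m-model": 1_000_000}

print(resolve_context_length(0, "example-1m-model", STUB_CONTEXT_LENGTHS.get))
print(resolve_context_length(200_000, "example-1m-model", STUB_CONTEXT_LENGTHS.get))
```

Injecting the lookup keeps the function free of import-time dependencies, so the same rule could in principle be shared by the load path and the SSE path.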

### Frontend — `static/ui.js:1269` (defense in depth)

When `last_prompt_tokens` is missing, **don't** fall back to cumulative `input_tokens`. Show "·" or "no last-prompt data" instead:

```js
// Old:
const promptTok = usage.last_prompt_tokens || usage.input_tokens || 0;
// New:
const promptTok = usage.last_prompt_tokens || 0;
const hasPromptTok = !!promptTok;
// (existing branch already shows "tokens used" when !hasPromptTok)
```

The current behavior of falling back to cumulative `input_tokens` was a workaround for missing `last_prompt_tokens` in pre-#1318 sessions, but the cumulative number is fundamentally wrong for "context window % used" — it conflates input across all turns. Better to show "no data" than wrong data.

This second change is the more important one: it prevents this specific failure mode from recurring even if a future code path forgets the backend fallback.
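
The intended frontend rule can be modelled in a few lines of Python to show how it behaves for an older session. This is an illustrative sketch only: `context_percent` is a hypothetical name, and returning `None` stands in for whatever "no data" rendering the UI chooses.

```python
from typing import Optional

DEFAULT_CTX = 128 * 1024  # same 128K default the UI uses today


def context_percent(usage: dict) -> Optional[int]:
    """Return percent of the context window used, or None when it is unknown."""
    prompt_tok = usage.get("last_prompt_tokens") or 0
    if not prompt_tok:
        # No per-prompt figure: report "no data" rather than using cumulative input_tokens.
        return None
    ctx_window = usage.get("context_length") or DEFAULT_CTX
    return round(prompt_tok / ctx_window * 100)


old_session = {"input_tokens": 1_166_500, "last_prompt_tokens": 0, "context_length": 0}
print(context_percent(old_session))  # None
```

With this rule an older session shows no percentage until the first SSE `done` payload supplies a real `last_prompt_tokens`, instead of showing a misleading one.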

## Test shape

Add to `tests/test_session_load_context_length.py`:

```python
def test_get_session_returns_context_length_fallback_for_older_session(...):
    # Build a Session with context_length=0 (pre-#1318)
    s = Session(...)
    s.context_length = 0
    s.last_prompt_tokens = 0
    s.input_tokens = 1_200_000  # cumulative
    s.model = "claude-sonnet-4.6"  # 1M context
    s.save()

    resp = GET /api/session?session_id=...
    assert resp["context_length"] == 1_000_000  # resolved from model_metadata
    # NOT 0, NOT 128K
```

Plus a frontend test in the QA harness that loads an older session and confirms the ring doesn't show "100" against an artificial 128K default.

## Reported by

@AvidFuturist in Discord (webui channel, May 1 2026):
- "the 100 comes up way too often in context circle now. We must prevent that from happening to older sessions or whatever incorrectly causes this now"
- Reproduces in "all my chats today" — consistent with older sessions cached in browser/localStorage where `s.context_length` was never persisted

Screenshot: tooltip shows "Context window — 890% used (context exceeded), 1.2M / 131.1k tokens used" with the ring capped at 100.

## Cross-links

- **Predecessor:** #1356 (closed Apr 30 2026) — same symptom on SSE path, fix did not cover load path
- Related: #1318 (added `context_length` field to Session), #1348 (#1318 follow-up — model_metadata fallback in session save), #1349 (UI: show percentage without explicit context_length), #437 (older closed: stale session usage)
- Same shape: pre-#1318 sessions have `context_length=0` persisted; need a runtime fallback when loading them, mirroring the SSE-path fallback that #1356 added

## Scope

`api/routes.py` (~10 LOC) + `static/ui.js` (~3 LOC). Both are surgical. Add a regression test alongside.

This is a P1 user-facing regression — the indicator is broken for **every older session** as soon as users open them today. Worth a hotfix release rather than waiting for the next batch.

