Summary
The cost/token display attached to the last assistant message shows cumulative session totals (S.session.input_tokens, S.session.output_tokens, S.session.estimated_cost). The figure attached to message 5 is therefore the total cost of the entire conversation so far, not the cost of that turn, which is misleading.
Proposed behaviour
Two acceptable options:
Option A: Show per-turn cost from the usage object delivered with each done SSE event, attached to that specific message bubble. Each message shows only what it cost.
Option B: Move the cumulative total out of the message bubble entirely and into the context ring / topbar badge. The composer footer context indicator already shows live token counts — cost could live there instead.
Either approach eliminates the misleading "last message has cumulative total" pattern.
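To make the difference between the two figures concrete, here is a minimal sketch that formats a per-turn label (Option A) and a cumulative label (Option B) from the same data. The helper names formatUsage and sumUsage are hypothetical, the field names on each usage object are assumed to mirror S.session, and the per-turn numbers are invented so that they add up to the totals in the example under "Current behaviour".

```javascript
// Sketch only: formatUsage and sumUsage are hypothetical helpers, and the
// usage field names are assumed to match S.session.

// Render one usage object in the same style as the existing display.
function formatUsage(usage) {
  const tokensIn = usage.input_tokens.toLocaleString("en-US");
  const tokensOut = usage.output_tokens.toLocaleString("en-US");
  const cost = usage.estimated_cost.toFixed(4);
  return `${tokensIn} in · ${tokensOut} out · ~$${cost}`;
}

// Add up per-turn usage objects into a cumulative session total.
function sumUsage(turns) {
  return turns.reduce(
    (total, u) => ({
      input_tokens: total.input_tokens + u.input_tokens,
      output_tokens: total.output_tokens + u.output_tokens,
      estimated_cost: total.estimated_cost + u.estimated_cost,
    }),
    { input_tokens: 0, output_tokens: 0, estimated_cost: 0 }
  );
}

// Invented per-turn values for illustration.
const turns = [
  { input_tokens: 1900, output_tokens: 400, estimated_cost: 0.0019 },
  { input_tokens: 2317, output_tokens: 432, estimated_cost: 0.0022 },
];

// Option A: each bubble gets its own label.
const perTurnLabels = turns.map(formatUsage);

// Option B: one cumulative label for the context ring / topbar badge.
const sessionLabel = formatUsage(sumUsage(turns));

console.log(perTurnLabels, sessionLabel);
```

With Option A each bubble would carry its own entry from perTurnLabels; with Option B only sessionLabel would be shown, outside the message list.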
Current behaviour
[user msg 1]
[assistant reply 1]
[user msg 2]
[assistant reply 2] ← shows "4,217 in · 832 out · ~$0.0041" (TOTAL, not this turn)
Implementation notes
- The per-turn usage object arrives in the done SSE event and is currently stored in S.lastUsage but only used to sync the context ring
- S.lastUsage could also be used to set a per-turn cost annotation on that specific message
- The context ring / topbar already shows live totals so moving the cumulative number there wouldn't lose information
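One way the per-turn annotation could be wired up is sketched below. This is not the current handler: the function name onDone, the message.turnUsage field, and the shape of the usage object are all assumptions made for illustration. Only S.lastUsage comes from the notes above.

```javascript
// Sketch only: onDone and message.turnUsage are hypothetical names, and the
// usage field names are assumed to mirror S.session.

// Called when the done SSE event for an assistant message arrives.
function onDone(S, message, usage) {
  // Existing behaviour described in the notes: keep the latest usage
  // so the context ring can be synced from it.
  S.lastUsage = usage;

  // Proposed addition: store a copy on the message itself so its bubble
  // can show the cost of this turn rather than the session total.
  message.turnUsage = { ...usage };
  return message;
}

// Example with invented values.
const S = { lastUsage: null };
const reply = { role: "assistant", text: "..." };
onDone(S, reply, { input_tokens: 2317, output_tokens: 432, estimated_cost: 0.0022 });
console.log(reply.turnUsage);
```

Copying the usage object onto the message means later turns that overwrite S.lastUsage do not change the figure already shown on earlier bubbles.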
Identified in Sprint 24 planning notes.