Fix cost overcounting — deduplicate streaming JSONL entries #74
Description
Summary
Our cost calculation inflates costs by ~2x compared to Claude Code's /cost command. The root cause is that Claude Code writes multiple JSONL entries per API response during streaming, each with incrementally accumulating output_tokens. Our calculateMetrics() sums every entry without deduplication, double-counting tokens and cost.
Example: in a session where /cost reports ~$3.60, our UI shows ~$6.00 (roughly 1.7x).
Root Cause Analysis
Problem 1: Streaming Duplicate Entries (primary — ~2x inflation)
Claude Code streams API responses and writes intermediate JSONL entries that share the same requestId but have incrementally increasing output_tokens. Our parser (src/main/utils/jsonl.ts:250-272) treats each entry independently and sums them all.
Fix: Keep only the last entry per requestId (the final, complete usage). The AssistantEntry type already has requestId (src/main/types/jsonl.ts:179) — we just don't use it for dedup.
This is the same approach as ryoppippi/ccusage#835, which addresses the same class of bug in ccusage.
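A minimal sketch of the last-entry-wins dedup described above. The type and function names (AssistantEntrySketch, dedupeByRequestId, totalOutputTokens) are hypothetical and trimmed down for illustration; they are not the actual shapes in src/main/types/jsonl.ts. The sketch also assumes entries without a requestId should be kept untouched.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface UsageSketch {
  input_tokens: number;
  output_tokens: number;
}

interface AssistantEntrySketch {
  requestId?: string;
  usage: UsageSketch;
}

// Keep only the last entry seen for each requestId (last-entry-wins).
// Assumption: entries without a requestId cannot be matched, so they are kept.
function dedupeByRequestId(entries: AssistantEntrySketch[]): AssistantEntrySketch[] {
  const lastIndex = new Map<string, number>();
  entries.forEach((entry, i) => {
    if (entry.requestId !== undefined) lastIndex.set(entry.requestId, i);
  });
  return entries.filter(
    (entry, i) => entry.requestId === undefined || lastIndex.get(entry.requestId) === i,
  );
}

// Sum output tokens over whatever entries are passed in.
function totalOutputTokens(entries: AssistantEntrySketch[]): number {
  return entries.reduce((sum, e) => sum + e.usage.output_tokens, 0);
}
```

With streamed entries of 10, 50, and 120 output tokens for one request, the naive sum is 180, while the deduplicated sum is 120, the final cumulative value.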
Problem 2: Cache Creation Token Overlap (secondary)
When parent and subagent sessions share prompt caches, both JSONL files may report cache_creation_input_tokens for the same cached content. Our sessionAnalyzer.ts:385 skips sidechain messages for the parent, but subagent files independently report overlapping cache creation tokens.
Problem 3: Incomplete UsageMetadata Type
Our UsageMetadata (src/main/types/jsonl.ts:80-85) only captures 4 flat fields. The actual JSONL contains additional fields we ignore:
- cache_creation.ephemeral_5m_input_tokens
- cache_creation.ephemeral_1h_input_tokens
- service_tier
These may be needed for accurate cost calculation.
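One way the expanded type could look, as a sketch under assumptions: the four flat fields shown are assumed to be the ones UsageMetadata already captures, the new fields are assumed optional, and the names UsageMetadataSketch, CacheCreationSketch, and cacheCreationBreakdown are hypothetical.

```typescript
// Sketch only. Assumption: these are the sub-fields listed in the issue,
// modeled as optional because not every entry is known to carry them.
interface CacheCreationSketch {
  ephemeral_5m_input_tokens?: number;
  ephemeral_1h_input_tokens?: number;
}

interface UsageMetadataSketch {
  // Assumed to be the four flat fields captured today.
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
  // Additional fields named in the issue.
  cache_creation?: CacheCreationSketch;
  service_tier?: string;
}

// Split cache-creation tokens by TTL bucket when the sub-object is present.
// Returns undefined otherwise so callers can fall back to the flat total.
function cacheCreationBreakdown(
  usage: UsageMetadataSketch,
): { fiveMinute: number; oneHour: number } | undefined {
  if (usage.cache_creation === undefined) return undefined;
  return {
    fiveMinute: usage.cache_creation.ephemeral_5m_input_tokens ?? 0,
    oneHour: usage.cache_creation.ephemeral_1h_input_tokens ?? 0,
  };
}
```

Keeping the new fields optional means existing parsing code and fixtures that only supply the flat fields would still type-check.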
Files to Modify
| File | Change |
|---|---|
| src/main/utils/jsonl.ts | Add requestId-based dedup in calculateMetrics() and parseJsonlFile() |
| src/main/types/jsonl.ts | Expand UsageMetadata with cache sub-object and service_tier |
| src/renderer/utils/sessionAnalyzer.ts | Apply same dedup when computing parent cost |
| test/main/utils/costCalculation.test.ts | Add dedup test cases |
| test/shared/utils/pricing.test.ts | Verify no regression |
Key Decisions
- Dedup strategy: last-entry-wins by requestId. This matches the ccusage approach and ensures we get the final, complete token counts
- Scope to streaming dedup first: cache overlap (Problem 2) and UsageMetadata expansion (Problem 3) can be follow-up work
- No server-reported cost exists in JSONL — all cost is client-computed from pricing tables, so dedup is essential
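To illustrate why client-computed cost makes dedup essential, here is a rough sketch of a pricing-table calculation. The rates used in the usage note are placeholders, not real prices, and RateSketch, TokenTotals, and computeCostUsd are hypothetical names rather than the shared pricing module's API.

```typescript
// Placeholder per-million-token rates in USD; NOT real pricing.
interface RateSketch {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface TokenTotals {
  input_tokens: number;
  output_tokens: number;
}

// Cost is linear in token counts: tokens / 1e6 * rate, summed per token kind.
function computeCostUsd(tokens: TokenTotals, rates: RateSketch): number {
  return (
    (tokens.input_tokens / 1e6) * rates.inputPerMTok +
    (tokens.output_tokens / 1e6) * rates.outputPerMTok
  );
}
```

For example, with placeholder rates of 3 and 15 USD per million tokens, 1,000,000 input tokens and 500,000 output tokens come to 10.5. Because the formula is linear, any tokens summed twice upstream are billed twice in the displayed total, and there is no server-reported figure to correct against.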
Related
- PR #73 (Unify cost calculation with shared pricing module): discovered during review
- ryoppippi/ccusage#835 — same dedup fix for ccusage (unmerged)
- ryoppippi/ccusage#313 — subagent token tracking discussion
cc @KaustubhPatange — thank you for your research identifying this discrepancy and pointing us to the ccusage findings. Your investigation of the ~2x cost inflation directly led to uncovering these root causes.