fix: deduplicate streaming costs and unify cost source of truth (#79)

holstein13 wants to merge 13 commits into matt1398:main
Conversation
Add 5 test cases in a Streaming Deduplication describe block verifying that calculateMetrics() deduplicates streaming entries sharing the same requestId. Two tests (duplicate and triple requestIds) intentionally fail as the dedup logic has not been implemented yet (red phase). Co-Authored-By: Claude Opus 4.6 <[email protected]>
…dedup The `continue` statement was skipping the entire loop iteration for streaming duplicates, not just the cost accumulation block. This caused non-cost metadata (tool calls, thinking blocks, git branches, model switches) to be silently dropped for intermediate streaming entries. Replace with an `isLastForRequest` conditional that wraps only the cost accumulation logic, allowing the rest of the loop body to execute for all messages. Co-Authored-By: Claude Opus 4.6 <[email protected]>
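The shape of that fix can be sketched as follows. This is a minimal model with a hypothetical `Msg` type; the real code operates on `ParsedMessage` and tracks more metadata (thinking blocks, git branches, model switches):

```typescript
// Minimal sketch: metadata is counted for every entry, but cost is only
// accumulated from the last streaming entry per requestId.
interface Msg {
  requestId?: string;
  costUsd: number;
  toolCalls: number;
}

function analyze(messages: Msg[]): { totalCost: number; totalToolCalls: number } {
  // Index of the last message per requestId (the final, complete usage).
  const lastIndexByRequestId = new Map<string, number>();
  messages.forEach((m, i) => {
    if (m.requestId) lastIndexByRequestId.set(m.requestId, i);
  });

  let totalCost = 0;
  let totalToolCalls = 0;
  messages.forEach((m, i) => {
    // Non-cost metadata runs for ALL entries (this is what the `continue`
    // statement was incorrectly skipping).
    totalToolCalls += m.toolCalls;

    // Only the cost accumulation is gated on isLastForRequest.
    const isLastForRequest =
      !m.requestId || lastIndexByRequestId.get(m.requestId) === i;
    if (isLastForRequest) totalCost += m.costUsd;
  });
  return { totalCost, totalToolCalls };
}

const result = analyze([
  { requestId: 'r1', costUsd: 0.01, toolCalls: 1 }, // partial stream entry
  { requestId: 'r1', costUsd: 0.03, toolCalls: 1 }, // final entry for r1
  { costUsd: 0.02, toolCalls: 0 },                  // no requestId
]);
// totalCost = 0.03 + 0.02 = 0.05; totalToolCalls = 2 (both entries counted)
```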
Instead of recomputing parent cost by accumulating per-message costs in the renderer loop, use the authoritative costUsd from detail.metrics (computed by calculateMetrics() in the main process). Per-model stats.costUsd still computed from tokens for the report table breakdown. Also update the basic session test fixture to supply metrics.costUsd since parentCost now reads from there instead of accumulating from the loop. Co-Authored-By: Claude Opus 4.6 <[email protected]>
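A rough before/after illustration of that change, with stubbed types standing in for the real `SessionDetail` shape (the field names beyond `metrics.costUsd` are assumptions):

```typescript
// Stubbed shapes for illustration only; the real types live in the app.
interface Metrics { costUsd: number }
interface SessionDetail { metrics: Metrics; messages: { costUsd: number }[] }

// Before: parent cost recomputed by summing per-message costs in the renderer,
// which double-counts streaming duplicates.
const recompute = (d: SessionDetail) =>
  d.messages.reduce((sum, m) => sum + m.costUsd, 0);

// After: the renderer trusts the deduplicated value calculateMetrics()
// already produced in the main process, so panel and header always agree.
const authoritative = (d: SessionDetail) => d.metrics.costUsd;

const detail: SessionDetail = {
  metrics: { costUsd: 0.05 }, // deduplicated total from the main process
  messages: [{ costUsd: 0.03 }, { costUsd: 0.03 }], // streaming duplicate
};
console.log(recompute(detail));     // → 0.06 (inflated by the duplicate)
console.log(authoritative(detail)); // → 0.05
```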
Remove stale comment claiming proc.metrics.costUsd is not populated upstream — it IS populated by calculateMetrics() in the main process. Use the authoritative value directly instead of recomputing via calculateMessageCost(). Co-Authored-By: Claude Opus 4.6 <[email protected]>
The lastRequestIdIndex map and isLastForRequest conditional were streaming dedup logic that is no longer needed since parentCost now comes from detail.metrics.costUsd. Per-model token stats are display-only, so slight inflation from streaming duplicates is acceptable. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary of Changes

Hello @holstein13, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request addresses cost overcounting caused by duplicate streaming JSONL entries, as well as inconsistencies between cost displays. It introduces a deduplication mechanism, unifies the cost source of truth, and fixes a bug that caused incorrect loop iterations. The changes ensure accurate cost calculation and consistent cost reporting across the application.
📝 Walkthrough

Implements streaming deduplication of JSONL entries by `requestId`.
🧹 Nitpick comments (1)
test/main/utils/costCalculation.test.ts (1)
746-781: Test title mentions "system messages" but fixture only includes a user message.The test is titled "should not deduplicate user or system messages" but only exercises the user message path. Consider adding a system message to the fixture or adjusting the title to "should not deduplicate user messages (no requestId)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/main/utils/costCalculation.test.ts` around lines 746 - 781, The test "should not deduplicate user or system messages" only includes a user ParsedMessage and misses the system path; update the fixture or the test title: either add a system ParsedMessage (type: 'system', uuid like 'msg-s1', timestamp, content, usage tokens) to the messages array and assert calculateMetrics still counts its tokens, or rename the test to "should not deduplicate user messages (no requestId)" so the title matches the existing messages and expectations for calculateMetrics.
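For illustration, here is a sketch of what the suggested system-message fixture could look like, using a stubbed message shape. The real `ParsedMessage` type is defined in `src/main/types/messages.ts`; the field names below follow the review prompt and are otherwise assumptions:

```typescript
// Stubbed fixture shape for illustration (not the real ParsedMessage type).
interface UsageStub { input_tokens: number; output_tokens: number }
interface MsgStub {
  type: 'user' | 'system' | 'assistant';
  uuid: string;
  timestamp: string;
  content: string;
  requestId?: string;
  usage?: UsageStub;
}

const systemMessage: MsgStub = {
  type: 'system',
  uuid: 'msg-s1',
  timestamp: '2026-02-24T00:00:00Z',
  content: 'system prompt',
  usage: { input_tokens: 10, output_tokens: 0 },
};

// Deduplication keys on requestId, so a system message without one must
// always be counted rather than deduped.
const shouldDedup = (m: MsgStub) => Boolean(m.requestId);
console.log(shouldDedup(systemMessage)); // → false
```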
📒 Files selected for processing (6)

- docs/plans/2026-02-24-unify-cost-source-of-truth-design.md
- src/main/types/messages.ts
- src/main/utils/jsonl.ts
- src/renderer/utils/sessionAnalyzer.ts
- test/main/utils/costCalculation.test.ts
- test/renderer/utils/sessionAnalyzer.test.ts
Code Review
This is a great pull request that addresses an important issue of cost overcounting and inconsistency. The introduction of requestId for deduplicating streaming messages is a solid approach, and unifying the cost calculation under a single source of truth in calculateMetrics is a significant improvement for data consistency across the application. The code is well-structured, and the addition of comprehensive unit tests for both the deduplication logic and the new source-of-truth behavior is excellent.
I have one suggestion in src/main/utils/jsonl.ts to make the deduplication logic more robust by preserving the relative order of messages, which could prevent potential issues in the future.
Overall, this is a high-quality contribution that significantly improves the accuracy of cost metrics.
```ts
// Deduplicate streaming entries: multiple JSONL lines can share the same
// requestId with incrementally accumulating output_tokens. Keep only the
// last entry per requestId (the final, complete usage).
const lastByRequestId = new Map<string, ParsedMessage>();
const dedupedMessages: ParsedMessage[] = [];

for (const msg of messages) {
  if (msg.requestId) {
    lastByRequestId.set(msg.requestId, msg);
  } else {
    dedupedMessages.push(msg);
  }
}
dedupedMessages.push(...lastByRequestId.values());
```
The current deduplication logic separates messages with and without a requestId and then appends the deduplicated messages with a requestId to the end. This alters the relative chronological order of messages, which could introduce subtle bugs if any downstream logic relies on this order.
A more robust approach is to perform deduplication while preserving the original relative message order. You can achieve this by iterating through the messages array backwards and using a Set to track seen requestIds. This ensures that only the last instance of a message with a given requestId is kept, and the overall message order is maintained.
Suggested change:

```ts
// Deduplicate streaming entries: multiple JSONL lines can share the same
// requestId with incrementally accumulating output_tokens. Keep only the
// last entry per requestId (the final, complete usage), while preserving
// the relative order of messages.
const dedupedMessages: ParsedMessage[] = [];
const seenRequestIds = new Set<string>();
for (let i = messages.length - 1; i >= 0; i--) {
  const msg = messages[i];
  if (msg.requestId) {
    if (!seenRequestIds.has(msg.requestId)) {
      seenRequestIds.add(msg.requestId);
      dedupedMessages.unshift(msg);
    }
  } else {
    dedupedMessages.unshift(msg);
  }
}
```
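As a runnable sketch of the reviewer's order-preserving approach, using a push-then-reverse variant (which avoids the O(n²) cost of repeated `unshift`); the minimal `Msg` shape is an assumption, since the real code uses `ParsedMessage`:

```typescript
// Order-preserving last-entry-wins dedup by requestId.
interface Msg { requestId?: string; id: string }

function dedup(messages: Msg[]): Msg[] {
  const deduped: Msg[] = [];
  const seen = new Set<string>();
  // Walk backwards so the LAST entry per requestId wins.
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.requestId) {
      if (seen.has(msg.requestId)) continue; // duplicate: drop it
      seen.add(msg.requestId);
    }
    deduped.push(msg);
  }
  // push + reverse restores original relative order in O(n).
  deduped.reverse();
  return deduped;
}

console.log(dedup([
  { id: 'a', requestId: 'r1' },
  { id: 'b' },
  { id: 'c', requestId: 'r1' }, // final entry for r1: kept
]).map(m => m.id)); // → ['b', 'c']
```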
Closing in favor of #77, which landed first with a cleaner approach to per-model stats. The unified cost source-of-truth concept from this PR (renderer consuming `detail.metrics.costUsd` from `calculateMetrics()`) remains a good candidate for a follow-up.
Summary
Fixes #74 — Cost overcounting from streaming JSONL duplicates and mismatched cost displays between the cost analysis panel and chat header.
- Multiple JSONL lines can share the same `requestId` but incrementally accumulate `output_tokens`. Added `requestId` to `ParsedMessage`, populated it during parsing, and deduped in `calculateMetrics()` using last-entry-wins — eliminates ~2x cost inflation.
- `sessionAnalyzer.ts` now consumes `detail.metrics.costUsd` (parent) and `proc.metrics.costUsd` (subagent) from `calculateMetrics()` instead of independently recomputing costs. The cost analysis panel and chat header now show identical numbers.
- Fixed a `continue` statement in `sessionAnalyzer.ts` that skipped the entire loop iteration (dropping tool calls, thinking blocks, git branches) instead of just the cost block for streaming duplicates.
- Removed dead streaming dedup code (`lastRequestIdIndex`, `isLastForRequest`) and a stale comment claiming `proc.metrics.costUsd` is not populated upstream.

What's still open
Per-model token stats in `sessionAnalyzer` are now slightly inflated by streaming duplicates since the dedup was removed from that loop. This is explicitly acknowledged in code comments as acceptable — per-model stats are display-only approximations for the report table, and the authoritative totals from `calculateMetrics()` are unaffected. Good candidate for a follow-up ticket.

Test plan

- `costCalculation.test.ts`
- `sessionAnalyzer.test.ts`
- `pnpm typecheck` clean
- `pnpm lint:fix` — 0 errors

Acknowledgments
Thanks to @KaustubhPatange for his help figuring this out — the numbers look much more reasonable now.
🤖 Generated with Claude Code