
fix: deduplicate streaming costs and unify cost source of truth #79

Closed
holstein13 wants to merge 13 commits into matt1398:main from holstein13:feat/fix-cost-overcounting-dedup

Conversation

@holstein13 (Contributor) commented Feb 24, 2026

Summary

Fixes #74 — Cost overcounting from streaming JSONL duplicates and mismatched cost displays between the cost analysis panel and chat header.

  • Streaming dedup: Claude Code writes multiple JSONL entries per API response with the same requestId but incrementally accumulating output_tokens. Added requestId to ParsedMessage, populated it during parsing, and deduped in calculateMetrics() using last-entry-wins — eliminates ~2x cost inflation
  • Unified cost source of truth: sessionAnalyzer.ts now consumes detail.metrics.costUsd (parent) and proc.metrics.costUsd (subagent) from calculateMetrics() instead of independently recomputing costs. The cost analysis panel and chat header now show identical numbers
  • Bug fix: Fixed a continue statement in sessionAnalyzer.ts that skipped the entire loop iteration (dropping tool calls, thinking blocks, git branches) instead of just the cost block for streaming duplicates
  • Cleanup: Removed dead dedup code (lastRequestIdIndex, isLastForRequest) and stale comment claiming proc.metrics.costUsd is not populated upstream
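As an illustration of the last-entry-wins behavior described above, here is a minimal sketch (names and types are hypothetical simplifications; the real ParsedMessage interface in src/main/types/messages.ts has more fields):

```typescript
// Hypothetical minimal shape of a parsed message for illustration only.
interface Msg {
  requestId?: string;
  outputTokens: number;
}

// Last-entry-wins: later entries for the same requestId overwrite earlier
// ones in the Map, so only the final (complete) usage per API response
// survives. Messages without a requestId are kept untouched.
function dedupeByRequestId(messages: Msg[]): Msg[] {
  const lastByRequestId = new Map<string, Msg>();
  const withoutId: Msg[] = [];
  for (const msg of messages) {
    if (msg.requestId) {
      lastByRequestId.set(msg.requestId, msg);
    } else {
      withoutId.push(msg);
    }
  }
  return [...withoutId, ...lastByRequestId.values()];
}

// Streaming entries for one API response: output_tokens accumulate
// incrementally, so naive summing counts 100 + 250 + 400 = 750 instead
// of the true final 400 — roughly the ~2x inflation described above.
const entries: Msg[] = [
  { requestId: "req_1", outputTokens: 100 },
  { requestId: "req_1", outputTokens: 250 },
  { requestId: "req_1", outputTokens: 400 },
  { outputTokens: 50 }, // a message with no requestId is kept as-is
];
const deduped = dedupeByRequestId(entries);
```

After dedup, only the 400-token final entry and the 50-token untagged message remain, for a total of 450 output tokens.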

What's still open

Per-model token stats in sessionAnalyzer are now slightly inflated by streaming duplicates since the dedup was removed from that loop. This is explicitly acknowledged in code comments as acceptable — per-model stats are display-only approximations for the report table, and the authoritative totals from calculateMetrics() are unaffected. Good candidate for a follow-up ticket.

Test plan

  • 12 new streaming dedup tests in costCalculation.test.ts
  • 3 new cost source-of-truth tests in sessionAnalyzer.test.ts
  • 836 tests pass, 0 failures
  • pnpm typecheck clean
  • pnpm lint:fix — 0 errors
  • Manual verification: cost analysis panel and chat header show matching costs

Acknowledgments

Thanks to @KaustubhPatange for his help figuring this out — the numbers look much more reasonable now.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added streaming deduplication to eliminate duplicate messages and ensure accurate usage tracking.
  • Bug Fixes

    • Improved cost calculation accuracy by sourcing from authoritative upstream values, reducing potential recalculation discrepancies.
  • Tests

    • Added comprehensive tests for streaming deduplication behavior and cost calculation verification.

holstein13 and others added 13 commits February 24, 2026 10:05
Add 5 test cases in a Streaming Deduplication describe block verifying
that calculateMetrics() deduplicates streaming entries sharing the same
requestId. Two tests (duplicate and triple requestIds) intentionally
fail as the dedup logic has not been implemented yet (red phase).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…dedup

The `continue` statement was skipping the entire loop iteration for
streaming duplicates, not just the cost accumulation block. This caused
non-cost metadata (tool calls, thinking blocks, git branches, model
switches) to be silently dropped for intermediate streaming entries.

Replace with an `isLastForRequest` conditional that wraps only the cost
accumulation logic, allowing the rest of the loop body to execute for
all messages.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
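A minimal sketch of the bug and the fix described in this commit (all names are hypothetical, not the PR's actual code):

```typescript
// Hypothetical simplified entry shape for illustration only.
interface Entry {
  isLastForRequest: boolean;
  costUsd: number;
  toolCalls: number;
}

// Buggy shape: `continue` skips the ENTIRE iteration for intermediate
// streaming entries, silently dropping non-cost metadata.
function analyzeBuggy(entries: Entry[]): { cost: number; tools: number } {
  let cost = 0;
  let tools = 0;
  for (const e of entries) {
    if (!e.isLastForRequest) continue; // the bug: skips metadata too
    cost += e.costUsd;
    tools += e.toolCalls;
  }
  return { cost, tools };
}

// Fixed shape: only the cost accumulation is gated; the rest of the
// loop body runs for every entry.
function analyzeFixed(entries: Entry[]): { cost: number; tools: number } {
  let cost = 0;
  let tools = 0;
  for (const e of entries) {
    if (e.isLastForRequest) {
      cost += e.costUsd;
    }
    tools += e.toolCalls; // now recorded for intermediate entries too
  }
  return { cost, tools };
}

const sample: Entry[] = [
  { isLastForRequest: false, costUsd: 0.01, toolCalls: 1 }, // intermediate streaming entry
  { isLastForRequest: true, costUsd: 0.03, toolCalls: 1 }, // final entry for the request
];
```

With the buggy loop the intermediate entry's tool call is lost (tools = 1); with the fix both tool calls are counted (tools = 2) while the cost still comes only from the final entry.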
Instead of recomputing parent cost by accumulating per-message costs in the
renderer loop, use the authoritative costUsd from detail.metrics (computed
by calculateMetrics() in the main process). Per-model stats.costUsd still
computed from tokens for the report table breakdown.

Also update the basic session test fixture to supply metrics.costUsd since
parentCost now reads from there instead of accumulating from the loop.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Remove stale comment claiming proc.metrics.costUsd is not populated
upstream — it IS populated by calculateMetrics() in the main process.
Use the authoritative value directly instead of recomputing via
calculateMessageCost().

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The lastRequestIdIndex map and isLastForRequest conditional were
streaming dedup logic that is no longer needed since parentCost now
comes from detail.metrics.costUsd. Per-model token stats are
display-only, so slight inflation from streaming duplicates is
acceptable.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@gemini-code-assist

Summary of Changes

Hello @holstein13, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the issue of cost overcounting due to duplicate streaming JSONL entries and inconsistencies in cost displays. It introduces a deduplication mechanism, unifies the cost source of truth, and fixes a bug that caused incorrect loop iterations. The changes ensure accurate cost calculation and consistent cost reporting across the application.

Highlights

  • Cost Deduplication: Implemented deduplication of streaming JSONL entries by using the requestId to identify and remove duplicate entries, preventing cost overcounting.
  • Unified Cost Source: Consolidated the cost source of truth to calculateMetrics() in sessionAnalyzer.ts, ensuring consistent cost display across the cost analysis panel and chat header.
  • Bug Fix: Corrected a continue statement in sessionAnalyzer.ts to prevent skipping entire loop iterations, ensuring proper processing of tool calls, thinking blocks, and git branches.
  • Testing: Added 12 new streaming deduplication tests in costCalculation.test.ts and 3 new cost source-of-truth tests in sessionAnalyzer.test.ts to ensure the changes are robust.
Changelog
  • src/main/types/messages.ts
    • Added requestId to ParsedMessage interface for streaming deduplication.
  • src/main/utils/jsonl.ts
    • Populated requestId during parsing of chat history entries.
  • src/renderer/utils/sessionAnalyzer.ts
    • Modified analyzeSession to use detail.metrics.costUsd and proc.metrics.costUsd for parent and subagent costs, respectively.
    • Removed redundant cost calculation logic.
  • test/main/utils/costCalculation.test.ts
    • Added tests for streaming deduplication to ensure accurate cost calculation.
  • test/renderer/utils/sessionAnalyzer.test.ts
    • Added tests to verify that detail.metrics.costUsd and proc.metrics.costUsd are used as the authoritative cost sources.
Activity
  • Addressed cost overcounting from streaming JSONL duplicates.
  • Resolved mismatched cost displays between the cost analysis panel and chat header.
  • Implemented streaming deduplication using requestId in calculateMetrics().
  • Unified cost source of truth by consuming detail.metrics.costUsd and proc.metrics.costUsd from calculateMetrics() in sessionAnalyzer.ts.
  • Fixed a bug in sessionAnalyzer.ts that skipped entire loop iterations.
  • Removed dead deduplication code and stale comments.

@coderabbitai coderabbitai bot added the bug (Something isn't working) and documentation (Improvements or additions to documentation) labels on Feb 24, 2026
@coderabbitai

coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

Implements streaming deduplication of JSONL entries by requestId to fix cost overcounting in sessions with multiple API responses. Adds requestId field to message types, deduplicates metrics by keeping only the last entry per request, and centralizes cost calculation to use authoritative metrics as the single source of truth rather than re-computing from per-message data.

Changes

  • Design Documentation (docs/plans/2026-02-24-unify-cost-source-of-truth-design.md):
    New design document describing the unification of cost calculation, removal of redundant cost computations, and clarification that authoritative costs come from metrics in the main process and subagents.
  • Type Definitions (src/main/types/messages.ts):
    Added optional requestId?: string field to the ParsedMessage interface to track streaming deduplication metadata within parsed messages.
  • JSONL Parsing & Deduplication (src/main/utils/jsonl.ts):
    Implemented streaming deduplication by building a last-by-requestId map; metrics calculations now operate on deduplicated messages to eliminate duplicate token counting, while parsing remains streaming-friendly.
  • Cost Source of Truth (src/renderer/utils/sessionAnalyzer.ts):
    Replaced per-message cost accumulation with authoritative costs sourced from detail.metrics.costUsd (parent) and proc.metrics.costUsd (subagents); removed incremental cost updates during per-message processing.
  • Deduplication Tests (test/main/utils/costCalculation.test.ts):
    Added a comprehensive test suite covering last-entry-wins deduplication for duplicate requestIds, distinct requestId handling, legacy messages without requestId, and cost/token calculation across multiple streaming entries.
  • Cost Verification Tests (test/renderer/utils/sessionAnalyzer.test.ts):
    Added a test block verifying authoritative cost sources: parent cost matches detail.metrics.costUsd, subagent cost uses proc.metrics.costUsd, and total cost is the sum of both, with fallback to 0 when cost data is missing.

Suggested labels

bug, documentation

🚥 Pre-merge checks (✅ 2 passed)

  • Linked Issues check: ✅ Passed. The PR successfully addresses all primary objectives from issue #74: it implements requestId-based streaming deduplication (last-entry-wins) to eliminate ~2x cost inflation, unifies the cost source of truth by having the renderer consume calculateMetrics() output, and adds comprehensive test coverage for the dedup logic.
  • Out of Scope Changes check: ✅ Passed. All changes are scoped to issue #74 objectives. The design document addition is documentation-only and supports the primary fix. Secondary issues (cache overlap, UsageMetadata expansion) are correctly deferred as follow-up work, not introduced here.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
test/main/utils/costCalculation.test.ts (1)

746-781: Test title mentions "system messages" but fixture only includes a user message.

The test is titled "should not deduplicate user or system messages" but only exercises the user message path. Consider adding a system message to the fixture or adjusting the title to "should not deduplicate user messages (no requestId)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/main/utils/costCalculation.test.ts` around lines 746 - 781, The test
"should not deduplicate user or system messages" only includes a user
ParsedMessage and misses the system path; update the fixture or the test title:
either add a system ParsedMessage (type: 'system', uuid like 'msg-s1',
timestamp, content, usage tokens) to the messages array and assert
calculateMetrics still counts its tokens, or rename the test to "should not
deduplicate user messages (no requestId)" so the title matches the existing
messages and expectations for calculateMetrics.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3c4fa6 and 6aa6eb8.

📒 Files selected for processing (6)
  • docs/plans/2026-02-24-unify-cost-source-of-truth-design.md
  • src/main/types/messages.ts
  • src/main/utils/jsonl.ts
  • src/renderer/utils/sessionAnalyzer.ts
  • test/main/utils/costCalculation.test.ts
  • test/renderer/utils/sessionAnalyzer.test.ts


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is a great pull request that addresses an important issue of cost overcounting and inconsistency. The introduction of requestId for deduplicating streaming messages is a solid approach, and unifying the cost calculation under a single source of truth in calculateMetrics is a significant improvement for data consistency across the application. The code is well-structured, and the addition of comprehensive unit tests for both the deduplication logic and the new source-of-truth behavior is excellent.

I have one suggestion in src/main/utils/jsonl.ts to make the deduplication logic more robust by preserving the relative order of messages, which could prevent potential issues in the future.

Overall, this is a high-quality contribution that significantly improves the accuracy of cost metrics.

Comment on lines +231 to +244
// Deduplicate streaming entries: multiple JSONL lines can share the same
// requestId with incrementally accumulating output_tokens. Keep only the
// last entry per requestId (the final, complete usage).
const lastByRequestId = new Map<string, ParsedMessage>();
const dedupedMessages: ParsedMessage[] = [];

for (const msg of messages) {
  if (msg.requestId) {
    lastByRequestId.set(msg.requestId, msg);
  } else {
    dedupedMessages.push(msg);
  }
}
dedupedMessages.push(...lastByRequestId.values());


medium

The current deduplication logic separates messages with and without a requestId and then appends the deduplicated messages with a requestId to the end. This alters the relative chronological order of messages, which could introduce subtle bugs if any downstream logic relies on this order.

A more robust approach is to perform deduplication while preserving the original relative message order. You can achieve this by iterating through the messages array backwards and using a Set to track seen requestIds. This ensures that only the last instance of a message with a given requestId is kept, and the overall message order is maintained.

Suggested change

Current:

// Deduplicate streaming entries: multiple JSONL lines can share the same
// requestId with incrementally accumulating output_tokens. Keep only the
// last entry per requestId (the final, complete usage).
const lastByRequestId = new Map<string, ParsedMessage>();
const dedupedMessages: ParsedMessage[] = [];
for (const msg of messages) {
  if (msg.requestId) {
    lastByRequestId.set(msg.requestId, msg);
  } else {
    dedupedMessages.push(msg);
  }
}
dedupedMessages.push(...lastByRequestId.values());

Suggested:

// Deduplicate streaming entries: multiple JSONL lines can share the same
// requestId with incrementally accumulating output_tokens. Keep only the
// last entry per requestId (the final, complete usage), while preserving
// the relative order of messages.
const dedupedMessages: ParsedMessage[] = [];
const seenRequestIds = new Set<string>();
for (let i = messages.length - 1; i >= 0; i--) {
  const msg = messages[i];
  if (msg.requestId) {
    if (!seenRequestIds.has(msg.requestId)) {
      seenRequestIds.add(msg.requestId);
      dedupedMessages.unshift(msg);
    }
  } else {
    dedupedMessages.unshift(msg);
  }
}
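For reference, the order-preserving variant suggested above can be exercised standalone with a simplified message shape (the label field is a hypothetical addition for illustration, not part of ParsedMessage):

```typescript
// Hypothetical simplified message shape for illustration only.
interface Msg {
  requestId?: string;
  label: string;
}

// Order-preserving last-entry-wins dedup, as in the suggestion above:
// walk backwards, keep the first entry seen per requestId (i.e. the
// chronologically last one), and unshift so relative order survives.
function dedupePreservingOrder(messages: Msg[]): Msg[] {
  const deduped: Msg[] = [];
  const seenRequestIds = new Set<string>();
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.requestId) {
      if (!seenRequestIds.has(msg.requestId)) {
        seenRequestIds.add(msg.requestId);
        deduped.unshift(msg);
      }
    } else {
      deduped.unshift(msg);
    }
  }
  return deduped;
}

const stream: Msg[] = [
  { requestId: "a", label: "a-partial" }, // intermediate, dropped
  { label: "user-1" },                    // no requestId, kept in place
  { requestId: "a", label: "a-final" },
  { requestId: "b", label: "b-final" },
];
```

One design note: unshift on an array is linear, so the backwards walk is quadratic overall; pushing in the backwards pass and calling reverse() once at the end would behave identically in linear time.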

@holstein13
Contributor Author

Closing in favor of #77 which landed first with a cleaner approach to per-model stats. The unified cost source-of-truth concept from this PR (renderer consuming detail.metrics.costUsd instead of recomputing) is tracked as a follow-up in a separate issue.


Labels

  • bug: Something isn't working
  • documentation: Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix cost overcounting — deduplicate streaming JSONL entries

1 participant