feat: streaming tool stats + token usage throttling #9926
Merged
e310dbd to e52cd96
mrubens
reviewed
Dec 9, 2025
- Throttle TaskTokenUsageUpdated emissions to 2-second intervals (~80-90% reduction)
- Stream toolUsage alongside tokenUsage in real-time updates
- Force final emission on task completion/abort to capture latest stats
- Update evals UI to display live tool stats for running tasks
- Add comprehensive unit tests for throttle logic

This reduces system load (Redis/SSE/DB) while providing real-time tool visibility and preventing data loss on task timeout/abort.
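The throttle-plus-final-flush behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in Task.ts: only TOKEN_USAGE_EMIT_INTERVAL_MS and the 2-second interval come from the PR, while createThrottledUsageEmitter, UsagePayload, and the injectable clock are hypothetical names introduced for the example.

```typescript
// Interval taken from the PR description; everything else here is illustrative.
const TOKEN_USAGE_EMIT_INTERVAL_MS = 2000;

// Hypothetical payload shape; the real event carries tokenUsage and toolUsage.
type UsagePayload = { totalTokensIn: number; totalTokensOut: number };

function createThrottledUsageEmitter(
  emit: (payload: UsagePayload) => void,
  // Injectable clock so the sketch can be exercised without real timers.
  now: () => number = () => Date.now(),
) {
  let lastEmitAt: number | undefined;
  let pending: UsagePayload | undefined;

  // Emit the pending payload (if any) and remember when that happened.
  function flush(at: number): void {
    if (pending === undefined) return;
    emit(pending);
    pending = undefined;
    lastEmitAt = at;
  }

  return {
    // Record the latest usage; emit only if the interval has elapsed.
    update(payload: UsagePayload): void {
      pending = payload;
      const at = now();
      if (lastEmitAt === undefined || at - lastEmitAt >= TOKEN_USAGE_EMIT_INTERVAL_MS) {
        flush(at);
      }
    },
    // Bypass the throttle, e.g. on task completion or abort.
    emitFinal(): void {
      flush(now());
    },
  };
}
```

Updates that arrive inside the 2-second window are held rather than dropped, so a forced final emission on completion or abort still delivers the most recent values.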
- Add ToolUsage type import and use proper types in ClineProvider event handlers
- Make emitFinalTokenUsageUpdate() public in Task.ts for reuse
- Use emitFinalTokenUsageUpdate() in AttemptCompletionTool for consistent behavior
- Add final token usage emission in handlePartial for completion with commands
- Add hasToolUsageChanged() helper function to getApiMetrics.ts
- Add toolUsageSnapshot property to Task.ts to track tool usage changes
- Update saveClineMessages() to emit when either token OR tool usage changes
- Update emitFinalTokenUsageUpdate() to also check tool usage changes
- Add comprehensive tests for tool usage change detection

This ensures final tool usage stats are captured on task abort even if token usage hasn't changed (e.g., when task is aborted before API request completes but tools were already executed).
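One way such a change check against a snapshot could look is sketched below. The function name hasToolUsageChanged comes from the PR, but its real signature is not shown there; the parameter order and the ToolUsage shape (per-tool attempts and failures counters) are assumptions made for illustration.

```typescript
// Assumed shape: per-tool counters keyed by tool name.
type ToolUsage = Record<string, { attempts: number; failures: number }>;

// Returns true when `current` differs from the last emitted `snapshot`.
function hasToolUsageChanged(current: ToolUsage, snapshot: ToolUsage | undefined): boolean {
  // With no snapshot yet, any recorded tool counts as a change.
  if (snapshot === undefined) return Object.keys(current).length > 0;

  const entries = Object.entries(current);
  if (entries.length !== Object.keys(snapshot).length) return true;

  // Same number of tools: look for a new tool name or a changed counter.
  return entries.some(([name, cur]) => {
    const prev = snapshot[name];
    return prev === undefined || prev.attempts !== cur.attempts || prev.failures !== cur.failures;
  });
}
```

A caller would emit when either the token-usage check or this check reports a change, which covers the abort case where tools ran but no API request completed.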
…tedAt in toolColumns deps

- Add emitFinalTokenUsageUpdate mock to clineC and clineB in nested-delegation-resume test
- Add usageUpdatedAt to toolColumns useMemo dependency array for proper recomputation when streaming tool usage updates
When tasks time out, there's a race condition where the DB might not have the latest stats when the frontend refetches. This fix adds fallback logic to use streaming values (from the in-memory Maps) when DB values are empty or missing.

Changes:
- taskMetrics useMemo: prefer DB values but fall back to streaming if empty
- toolColumns useMemo: prefer DB values but fall back to streaming if empty
- stats useMemo: same fallback logic for aggregate tool usage
- tool cells in table: prefer DB values but fall back to streaming if missing
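The "prefer DB, fall back to streaming" rule can be expressed as a small pure helper, sketched here. The helper name preferDbToolUsage and the ToolUsage shape are hypothetical; the PR applies the rule inside several useMemo hooks rather than through a shared function like this.

```typescript
// Assumed shape: per-tool counters keyed by tool name.
type ToolUsage = Record<string, { attempts: number; failures: number }>;

// Hypothetical helper: use the persisted value when it has data,
// otherwise the value held in the in-memory streaming map.
function preferDbToolUsage(
  db: ToolUsage | null | undefined,
  streaming: ToolUsage | undefined,
): ToolUsage {
  if (db && Object.keys(db).length > 0) return db;
  return streaming ?? {};
}
```

Treating an empty object the same as a missing one matters for the timeout race, where the row exists but has not yet received the final stats.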
After a task is aborted due to timeout, the extension rehydrates the task with a new instance. This new instance has empty toolUsage, and if it emits any TaskTokenUsageUpdated events, they would overwrite the final metrics that were saved before the abort. This fix adds a check to ignore TaskTokenUsageUpdated events once taskAbortedAt is set, preserving the final metrics from before the abort.
Instead of ignoring TaskTokenUsageUpdated events after TaskAborted, accumulate tool usage data using a MAX strategy. This ensures:
- Empty rehydrated data won't overwrite existing: max(5, 0) = 5
- Legitimate restart with additional work is captured: max(5, 8) = 8

This approach is more robust than simply ignoring post-abort events, as it handles both spurious rehydration and legitimate restart scenarios.
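A per-tool MAX merge along these lines might look like the sketch below. The function name mergeToolUsageMax and the ToolUsage shape (attempts and failures per tool) are assumptions for illustration; the PR does not show the actual merge code.

```typescript
// Assumed shape: per-tool counters keyed by tool name.
type ToolUsage = Record<string, { attempts: number; failures: number }>;

// Hypothetical helper: keep the larger counter per tool, so an empty
// rehydrated payload cannot lower values recorded before the abort.
function mergeToolUsageMax(existing: ToolUsage, incoming: ToolUsage): ToolUsage {
  const merged: ToolUsage = {};
  // Duplicate names are harmless: the same entry is simply recomputed.
  const names = Object.keys(existing).concat(Object.keys(incoming));
  for (const name of names) {
    const a = existing[name];
    const b = incoming[name];
    merged[name] = {
      attempts: Math.max(a?.attempts ?? 0, b?.attempts ?? 0),
      failures: Math.max(a?.failures ?? 0, b?.failures ?? 0),
    };
  }
  return merged;
}
```

The trade-off of a MAX merge is that counters can never decrease, which suits cumulative stats but would hide a genuine reset to a lower count.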
Labels
Enhancement
New feature or request
lgtm
This PR has been approved by a maintainer
PR - Needs Review
size:XL
This PR changes 500-999 lines, ignoring generated files.
Summary
Implements two related improvements to the evals system:
Changes
Part 1: Token Usage Throttling
- … TOKEN_USAGE_EMIT_INTERVAL_MS = 2000)
- … saveClineMessages() to only emit when 2+ seconds have elapsed
Part 2: Streaming Tool Usage
- … TaskTokenUsageUpdated event signature to include toolUsage parameter
Testing
- … Task.throttle.test.ts)
Expected Results
Important
This PR introduces token usage throttling and real-time tool usage stats streaming, reducing system load and enhancing task visibility.
- … TaskTokenUsageUpdated emissions in Task class to every 2 seconds, reducing emissions by 80-90%.
- … TaskTokenUsageUpdated event signature to include toolUsage.
- … Task class to include debouncedEmitTokenUsage for throttling in Task.ts.
- … ClineProvider and API classes to handle new TaskTokenUsageUpdated event signature.
- … hasToolUsageChanged() function in getApiMetrics.ts.
- … Task.throttle.test.ts.

This description was created for 452a0f1.