Conversation
🦋 Changeset detected
Latest commit: 8de8bf1. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Coverage Report
Extension coverage: base branch 45%, PR branch 46% ✅ Coverage increased or remained the same
Webview coverage: base branch 11%, PR branch 11% ✅ Coverage increased or remained the same
Overall assessment: ✅ Test coverage has been maintained or improved
Last updated: 2025-05-02T04:42:45.827765
Caution
Changes requested ❌
Reviewed everything up to e0db81c in 2 minutes and 46 seconds. Click for details.
- Reviewed 1190 lines of code in 7 files
- Skipped 1 file when reviewing
- Skipped posting 8 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/api/providers/gemini.ts:96
- Draft comment:
Consider using async/await for the cache creation flow instead of the promise chain for improved readability. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a code quality suggestion that could improve readability. However, this is background processing code that intentionally doesn't await completion. The promise chain actually makes the asynchronous, non-blocking nature of this code more explicit. Using async/await might make it less clear that this is intentionally not awaited. The comment assumes async/await is always more readable, but in this case the promise chain better expresses the intent of background processing. The current implementation clearly shows the flow of background operations. While async/await can improve readability in many cases, here the promise chain better communicates the non-blocking nature of the cache creation process. The comment should be deleted as the current promise chain implementation is actually more appropriate for this background processing code.
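The pattern under discussion can be sketched as a fire-and-forget promise chain. This is a minimal illustration, not the actual gemini.ts code: `createCache` is a hypothetical stand-in for the real `caches.create` call.

```typescript
// Hypothetical stand-in for the SDK's cache-creation call.
function createCache(contents: string): Promise<{ name: string }> {
  return Promise.resolve({ name: `cache-${contents.length}` });
}

let lastCacheName: string | undefined;

// The promise chain is deliberately NOT awaited: the main request proceeds
// while the cache is written in the background, and the chain makes the
// non-blocking intent explicit.
function startBackgroundCacheWrite(contents: string): void {
  createCache(contents)
    .then((cache) => {
      lastCacheName = cache.name;
    })
    .catch((err) => {
      // Swallow errors so a cache failure never breaks the main request.
      console.error("cache creation failed:", err);
    });
}
```

The caller invokes `startBackgroundCacheWrite(...)` and continues immediately; rewriting this with `async/await` would invite an accidental `await` that serializes the request behind the cache write.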
2. src/utils/cost.ts:14
- Draft comment:
Ensure the default prices and tiered adjustments are well documented; returning undefined when inputPrice or outputPrice is missing may lead to silent failures. - Reason this comment was not posted:
Comment did not seem useful. Confidence it is useful = 50% <= threshold 50%. The comment suggests that returning undefined could lead to silent failures, which is a valid concern. However, it also asks to ensure documentation, which is not allowed. I need to determine if the comment is more about the potential issue or the documentation.
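The silent-failure concern can be illustrated with a small sketch. This is not the actual cost.ts implementation; the interface and function name here are assumed for illustration.

```typescript
// Illustrative pricing shape; field names are assumptions, not the real cost.ts types.
interface ModelPricing {
  inputPrice?: number;  // dollars per million input tokens
  outputPrice?: number; // dollars per million output tokens
}

function calculateCostSafe(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  if (pricing.inputPrice === undefined || pricing.outputPrice === undefined) {
    // Surface the missing-price case loudly instead of letting undefined
    // propagate silently into downstream totals.
    console.warn("calculateCostSafe: pricing incomplete, returning undefined");
    return undefined;
  }
  return (
    (pricing.inputPrice / 1_000_000) * inputTokens +
    (pricing.outputPrice / 1_000_000) * outputTokens
  );
}
```

Callers then treat `undefined` as "cost unknown" explicitly rather than accidentally summing it as `NaN`.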
3. webview-ui/src/components/settings/ApiOptions.tsx:230
- Draft comment:
The createDropdown function is large and tightly coupled; consider extracting it into a separate component to improve readability and maintainability. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. webview-ui/src/components/settings/ApiOptions.tsx:1815
- Draft comment:
Ensure that callback URLs in getOpenRouterAuthUrl are URL-encoded to avoid issues with special characters in the uriScheme. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
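The encoding concern can be sketched as follows. This is an assumed shape of the helper, not the actual ApiOptions.tsx code: the point is only that the callback URL embedded as a query parameter should be percent-encoded so special characters in the `uriScheme` survive.

```typescript
// Hypothetical sketch: the real getOpenRouterAuthUrl may differ.
function getOpenRouterAuthUrl(uriScheme: string): string {
  const callback = `${uriScheme}://auth/openrouter`;
  // encodeURIComponent protects ":", "/", and any unusual scheme characters.
  return `https://openrouter.ai/auth?callback_url=${encodeURIComponent(callback)}`;
}
```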
5. src/api/providers/gemini.ts:67
- Draft comment:
The caching eligibility check uses a multiplier of 4 on CONTEXT_CACHE_TOKEN_MINIMUM. It isn’t immediately clear why 4 is chosen. Consider adding a brief comment explaining the rationale behind this multiplier. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the comment raises a valid point about documentation, this seems like a minor implementation detail. The code already has good comments explaining the high-level purpose of the cache eligibility check. The '4' multiplier appears to be a tuning parameter based on the character-per-token estimation used elsewhere in the code. Adding another comment here would add noise without significant value. I might be undervaluing the importance of documenting magic numbers. Future maintainers might waste time trying to understand why 4 was chosen. The relationship to the 4 characters-per-token estimation is clear from the codebase context, and the exact multiplier value isn't critical - it's just a rough heuristic for when caching might be worthwhile. The comment should be deleted. The multiplier is a tuning parameter whose exact value isn't critical, and its relationship to token estimation is clear from context.
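The heuristic being debated can be written out in a few lines. This is a sketch under the stated assumption that content length is estimated at roughly 4 characters per token; the constant's value here is illustrative, not the real one.

```typescript
// Illustrative value; the real CONTEXT_CACHE_TOKEN_MINIMUM may differ.
const CONTEXT_CACHE_TOKEN_MINIMUM = 4096;

function isCacheEligible(contents: string): boolean {
  // ~4 characters per token is the rough estimate used elsewhere in the
  // codebase, so the character-length threshold is the token minimum * 4.
  return contents.length >= CONTEXT_CACHE_TOKEN_MINIMUM * 4;
}
```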
6. src/api/providers/vertex.ts:28
- Draft comment:
For model IDs that include 'claude', the implementation is left as a comment. Consider adding a TODO or placeholder to explicitly handle or warn about this incomplete branch. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The comment appears to be incorrect in its premise. The code shows clear signs of intentional refactoring where Gemini handling is extracted while Claude handling remains. The "..." comment is likely a placeholder in the diff to avoid showing unchanged code. This is not a case of missing implementation, but rather unchanged implementation. I could be wrong about the "..." - maybe it really does mean the implementation is missing. The comment could be pointing out a real issue. The surrounding context strongly suggests this is just a diff simplification - the comments on lines 29-30 explicitly state "Your existing Claude implementation remains unchanged" and "This preserves your current Claude-specific functionality". The comment should be deleted as it incorrectly assumes missing implementation where the code is actually just preserving existing functionality.
7. webview-ui/src/components/settings/ApiOptions.tsx:1944
- Draft comment:
The 'Supports browser use' item uses the same 'supportsImages' property as 'Supports Images'. If these represent distinct features, consider introducing a separate flag to avoid confusion. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. src/utils/cost.ts:26
- Draft comment:
The tiered pricing lookup in calculateApiCostInternal relies on tiers sorted by contextWindow. While the fallback uses the last tier, consider adding a brief comment to clarify that the highest tier should use Infinity as the upper limit for robust matching. - Reason this comment was not posted:
Comment looked like it was already resolved.
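The tiered-lookup convention the comment refers to can be sketched like this. Types and names are assumptions for illustration, not the real cost.ts definitions; the key idea is that tiers sorted ascending by `contextWindow`, with the top tier using `Infinity`, make the lookup total.

```typescript
// Illustrative tier shape; the real cost.ts types may differ.
interface PriceTier {
  contextWindow: number; // upper bound (inclusive) for this tier
  inputPrice: number;    // dollars per million tokens
}

function findTier(tiers: PriceTier[], promptTokens: number): PriceTier {
  // First tier whose upper bound covers the prompt. An Infinity top tier
  // guarantees a match, so the last-tier fallback is purely defensive.
  return tiers.find((t) => promptTokens <= t.contextWindow) ?? tiers[tiers.length - 1];
}
```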
Workflow ID: wflow_vhZhppHNlCHqwb5F
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Caution
Changes requested ❌
Reviewed 6f1a41f in 2 minutes and 11 seconds. Click for details.
- Reviewed 235 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 4 draft comments; view those below
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/api/providers/vertex.ts:45
- Draft comment:
Consider adding an explicit type annotation for the 'stream' variable to improve clarity and type safety. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% Adding an explicit 'any' type would actually reduce type safety by opting out of TypeScript's type inference. The stream variable's type should be correctly inferred from the AnthropicVertex SDK's return type. Adding 'any' would hide potential type errors and make the code less safe. Perhaps there's a specific reason why type inference isn't working well here that I'm missing? Maybe the SDK types are incomplete? Even if there are SDK type issues, explicitly typing as 'any' would be worse than letting TypeScript infer what it can. If there are type issues, they should be fixed in the SDK or by using a more specific type. The comment should be deleted as it suggests adding an 'any' type which would reduce type safety rather than improve it.
2. src/api/providers/vertex.ts:74
- Draft comment:
The message mapping logic for ephemeral cache control (lines 74–112) is very similar to the one in the default case (lines 133–144). Consider refactoring this into a helper function to reduce duplication. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%.
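The suggested refactor could look something like the following sketch, with simplified stand-in types rather than the real Anthropic SDK shapes: one helper maps the messages and optionally marks the final user turn with ephemeral cache control, so both branches share it.

```typescript
// Simplified stand-ins for the SDK message types (illustrative only).
interface SimpleMessage {
  role: "user" | "assistant";
  content: string;
  cacheControl?: { type: "ephemeral" };
}

function mapMessages(
  messages: { role: "user" | "assistant"; content: string }[],
  useCacheControl: boolean,
): SimpleMessage[] {
  const lastUserIndex = messages.map((m) => m.role).lastIndexOf("user");
  return messages.map((m, i) => ({
    ...m,
    // Only the final user turn gets the ephemeral marker, and only when
    // the caller opts in; the default branch passes useCacheControl=false.
    ...(useCacheControl && i === lastUserIndex
      ? { cacheControl: { type: "ephemeral" as const } }
      : {}),
  }));
}
```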
3. src/api/providers/vertex.ts:43
- Draft comment:
Variable 'budget_tokens' is written in snake_case while most other variables use camelCase. Confirm if this naming is intentional (e.g. matching API expectations) or consider standardizing for consistency. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%.
4. src/api/providers/vertex.ts:170
- Draft comment:
In the 'message_stop' case, only a break from the switch is used. If 'message_stop' signifies the end of the stream, consider exiting the loop to prevent processing any further chunks. - Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%.
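The control-flow suggestion can be demonstrated with a labeled break, which exits the surrounding `for await` loop rather than just the `switch`. The stream and chunk types here are invented for illustration, not the real vertex.ts code.

```typescript
// Minimal stand-in for a streaming response (illustrative only).
type StreamChunk = { type: "text"; text: string } | { type: "message_stop" };

async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { type: "text", text: "hello" };
  yield { type: "message_stop" };
  yield { type: "text", text: "ignored" }; // should never be processed
}

async function collectText(): Promise<string> {
  let out = "";
  outer: for await (const chunk of fakeStream()) {
    switch (chunk.type) {
      case "text":
        out += chunk.text;
        break;
      case "message_stop":
        break outer; // exit the loop itself, not just the switch
    }
  }
  return out;
}
```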
Workflow ID: wflow_lDunSZLwIJ6IXIsq
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
This is a great improvement. I like how the cache information is shown; it will be tremendously useful to see how the cache grows over time.
But I'm curious: can the default TTL be configured?
Cache information will be shown when using which providers? OpenRouter, Gemini?
@pashpashpash Here's the problem and the fix. At the moment every user turn triggers two separate network calls:
Because caches.create is executed on every turn, the full prompt is written to the cache over and over again. With the default TTL of 15 minutes this costs roughly $4.50 per million tokens per hour, i.e. $1.125 for every million tokens kept for 15 minutes, a charge that the user never sees in the normal token accounting.

Recommended changes
- On subsequent turns, if the cache no longer exists because its TTL expired, just create a new one.
- Stable cache key
- Cost visibility: we may never get a perfect breakdown from the API, but we can:

The net effect: we create the cache once, keep extending it, avoid dozens of unused caches, and eliminate the hidden double spend on every turn.
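The proposed flow — create once per task, then extend the TTL on later turns — can be sketched as follows. The cache store and names here are hypothetical stand-ins, not the real Gemini caches API.

```typescript
// Hypothetical in-memory stand-in for the provider's cache registry.
interface CacheEntry {
  name: string;
  expiresAt: number;
}

const cacheByTask = new Map<string, CacheEntry>();
const TTL_MS = 15 * 60 * 1000; // the default 15-minute TTL mentioned above

function getOrCreateCache(taskId: string, now: number): CacheEntry {
  const existing = cacheByTask.get(taskId);
  if (existing && existing.expiresAt > now) {
    // Cache still alive: extend its TTL instead of recreating it, which
    // avoids rewriting the full prompt (and paying for it) every turn.
    existing.expiresAt = now + TTL_MS;
    return existing;
  }
  // Expired or missing: create a fresh cache keyed by the stable task ID.
  const fresh: CacheEntry = { name: `cache-${taskId}`, expiresAt: now + TTL_MS };
  cacheByTask.set(taskId, fresh);
  return fresh;
}
```

Keying by a stable per-task ID is what prevents the "dozens of unused caches" accumulation: every turn of the same task resolves to the same entry.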
Force-pushed d871446 to 2024be7
Force-pushed 2024be7 to f3c07f3
This PR works, but @pashpashpash and I would still like to be vigilant.
We have tried our best to show the ongoing pricing of cache usage and to make the best use of the available cache to save on costs; given how often we hit the same parts of the cache over and over, this PR makes sense.
This PR has been tested on Vertex and Gemini for caching.
dcbartlett
left a comment
I'm going to review this more when I have access to my computer. The changes commented on should happen for now; I might have more later.
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>


Description
Improved pricing calculation for Gemini and Vertex, plus more robust caching and cache tracking for Gemini and Vertex.
Test Procedure
Type of Change
Pre-flight Checklist
- Tests pass (npm test) and code is formatted and linted (npm run format && npm run lint)
- Changeset added via npm run changeset (required for user-facing changes)

Screenshots
Additional Notes
Important
Enhance Gemini and Vertex API integration with improved caching strategy, accurate cost calculation, and updated UI components.
- GeminiHandler in gemini.ts, creating a single cache per task and reusing it for subsequent turns.
- updateCacheContent() to handle cache updates and deletions, ensuring non-blocking operations.
- calculateCost() and getTaskOngoingCosts().
- GeminiHandler to VertexHandler in vertex.ts for handling Gemini models.
- ApiHandlerOptions in api.ts to include new configuration options for Gemini and Vertex.
- TaskHeader.tsx to display cache and cost information.
- ApiOptions.tsx to include new settings for Gemini and Vertex APIs.
- node-cache dependency in package.json for caching support.
- api.ts to reflect new cost structures.

This description was created by Ellipsis for f3c07f3. You can customize this summary. It will automatically update as commits are pushed.