Add media resolution high to gemini 3 hybrid agent #1465
🦋 Changeset detected
Latest commit: 5e00581
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 3 packages
No issues found across 2 files
Greptile Summary
Added
Critical Issue Found:
Changes:
Confidence Score: 1/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client
participant V3AgentHandler
participant LLMClient
participant GoogleAPI
Client->>V3AgentHandler: execute() or stream()
V3AgentHandler->>V3AgentHandler: prepareAgent()
V3AgentHandler->>LLMClient: getLanguageModel()
LLMClient-->>V3AgentHandler: wrappedModel
V3AgentHandler->>V3AgentHandler: Check modelId.includes("gemini-3")
alt Model is gemini-3
V3AgentHandler->>LLMClient: generateText/streamText with providerOptions
Note over V3AgentHandler,LLMClient: providerOptions: { google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" } }
LLMClient->>GoogleAPI: Request with MEDIA_RESOLUTION_HIGH
GoogleAPI-->>LLMClient: Response
else Model is not gemini-3
V3AgentHandler->>LLMClient: generateText/streamText without providerOptions
LLMClient->>GoogleAPI: Standard request
GoogleAPI-->>LLMClient: Response
end
LLMClient-->>V3AgentHandler: Result
V3AgentHandler-->>Client: AgentResult/AgentStreamResult
1 file reviewed, 2 comments
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.

# Releases

## @browserbasehq/[email protected]

### Patch Changes

- [#1461](#1461) [`0f3991e`](0f3991e) Thanks [@tkattkat](https://github.com/tkattkat)! - Move hybrid mode out of experimental
- [#1433](#1433) [`e0e22e0`](e0e22e0) Thanks [@tkattkat](https://github.com/tkattkat)! - Put hybrid mode behind experimental
- [#1456](#1456) [`f261051`](f261051) Thanks [@shrey150](https://github.com/shrey150)! - Invoke page.hover for agent move action
- [#1473](#1473) [`e021674`](e021674) Thanks [@shrey150](https://github.com/shrey150)! - Add safety confirmation support for OpenAI + Google CUA
- [#1399](#1399) [`6a5496f`](6a5496f) Thanks [@tkattkat](https://github.com/tkattkat)! - Ensure cua agent is killed when stagehand.close is called
- [#1436](#1436) [`fea1700`](fea1700) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix auto-load key for act/extract/observe parametrized models on api
- [#1439](#1439) [`5b288d9`](5b288d9) Thanks [@tkattkat](https://github.com/tkattkat)! - Remove base64 from agent actions array ( still present in messages object )
- [#1408](#1408) [`e822f5a`](e822f5a) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - allow for act() cache hit when variable values change
- [#1472](#1472) [`638efc7`](638efc7) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: agent cache not refreshed on action failure
- [#1424](#1424) [`a890f16`](a890f16) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: "Error: -32000 Failed to convert response to JSON: CBOR: stack limit exceeded"
- [#1418](#1418) [`934f492`](934f492) Thanks [@miguelg719](https://github.com/miguelg719)! - Cleanup handlers and bus listeners on close
- [#1430](#1430) [`bd2db92`](bd2db92) Thanks [@shrey150](https://github.com/shrey150)! - Fix CUA model coordinate translation
- [#1465](#1465) [`51e0170`](51e0170) Thanks [@miguelg719](https://github.com/miguelg719)! - Add media resolution high provider option to gemini 3 hybrid agent
- [#1431](#1431) [`05f5580`](05f5580) Thanks [@tkattkat](https://github.com/tkattkat)! - Update the cache handling for agent
- [#1432](#1432) [`f56a9c2`](f56a9c2) Thanks [@tkattkat](https://github.com/tkattkat)! - Deprecate cua: true in favor of mode: "cua"
- [#1406](#1406) [`b40ae11`](b40ae11) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for hovering with coordinates ( page.hover )
- [#1407](#1407) [`0d2b398`](0d2b398) Thanks [@tkattkat](https://github.com/tkattkat)! - Clean up page methods
- [#1412](#1412) [`cd01f29`](cd01f29) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: load GOOGLE_API_KEY from .env
- [#1462](#1462) [`a734fca`](a734fca) Thanks [@shrey150](https://github.com/shrey150)! - fix: correctly pass userDataDir to chrome launcher
- [#1466](#1466) [`b342acf`](b342acf) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - move playwright to optional dependencies
- [#1440](#1440) [`2987cd1`](2987cd1) Thanks [@tkattkat](https://github.com/tkattkat)! - [Feature] support excluding tools from agent
- [#1455](#1455) [`dfab1d5`](dfab1d5) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - update aisdk client to better enforce structured output with deepseek models
- [#1428](#1428) [`4d71162`](4d71162) Thanks [@tkattkat](https://github.com/tkattkat)! - Add "hybrid" mode to stagehand agent

## @browserbasehq/[email protected]

### Minor Changes

- [#1459](#1459) [`abb3469`](abb3469) Thanks [@monadoid](https://github.com/monadoid)! - Added building of binaries
- [#1457](#1457) [`5fc1281`](5fc1281) Thanks [@monadoid](https://github.com/monadoid)! - First changeset for stagehand-server
- [#1469](#1469) [`d634d45`](d634d45) Thanks [@monadoid](https://github.com/monadoid)! - Bump to test binary builds

### Patch Changes

- Updated dependencies \[[`0f3991e`](0f3991e), [`e0e22e0`](e0e22e0), [`f261051`](f261051), [`e021674`](e021674), [`6a5496f`](6a5496f), [`fea1700`](fea1700), [`5b288d9`](5b288d9), [`e822f5a`](e822f5a), [`638efc7`](638efc7), [`a890f16`](a890f16), [`934f492`](934f492), [`bd2db92`](bd2db92), [`51e0170`](51e0170), [`05f5580`](05f5580), [`f56a9c2`](f56a9c2), [`b40ae11`](b40ae11), [`0d2b398`](0d2b398), [`cd01f29`](cd01f29), [`a734fca`](a734fca), [`b342acf`](b342acf), [`2987cd1`](2987cd1), [`dfab1d5`](dfab1d5), [`4d71162`](4d71162)]:
  - @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- [#1373](#1373) [`cadd192`](cadd192) Thanks [@tkattkat](https://github.com/tkattkat)! - Update screenshot collector in agent evals cli
- Updated dependencies \[[`0f3991e`](0f3991e), [`e0e22e0`](e0e22e0), [`f261051`](f261051), [`e021674`](e021674), [`6a5496f`](6a5496f), [`fea1700`](fea1700), [`5b288d9`](5b288d9), [`e822f5a`](e822f5a), [`638efc7`](638efc7), [`a890f16`](a890f16), [`934f492`](934f492), [`bd2db92`](bd2db92), [`51e0170`](51e0170), [`05f5580`](05f5580), [`f56a9c2`](f56a9c2), [`b40ae11`](b40ae11), [`0d2b398`](0d2b398), [`cd01f29`](cd01f29), [`a734fca`](a734fca), [`b342acf`](b342acf), [`2987cd1`](2987cd1), [`dfab1d5`](dfab1d5), [`4d71162`](4d71162)]:
  - @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
why
This setting improves the performance of Gemini 3 models in computer-use applications (recommended by DeepMind).
what changed
Added a `providerOption` to Google models (filtered specifically to gemini-3) that sets MEDIA_RESOLUTION_HIGH (nit: ULTRA_HIGH is not supported by aisdk yet).
test plan
Summary by cubic
Set the Google provider option mediaResolution to MEDIA_RESOLUTION_HIGH for Gemini 3 models in the hybrid agent to improve computer-use performance.
Applied conditionally when the modelId includes "gemini-3" for both step execution and streaming paths.
Written for commit 5e00581. Summary will update automatically on new commits.
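The conditional described in this PR can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the helper name `mediaResolutionOptions`, the `GoogleProviderOptions` type, the example model ids, and the `baseArgs` object are all assumptions made for the example. Only the `modelId.includes("gemini-3")` check and the `providerOptions: { google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" } }` shape come from the PR discussion.

```typescript
// Hypothetical type for illustration; the PR only confirms this one value.
type GoogleProviderOptions = {
  google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" };
};

// Hypothetical helper (not from the PR diff): returns the extra call
// arguments for Gemini 3 models, or an empty object for everything else.
function mediaResolutionOptions(
  modelId: string,
): { providerOptions?: GoogleProviderOptions } {
  // The PR filters on the model id containing "gemini-3".
  if (!modelId.includes("gemini-3")) {
    return {};
  }
  return {
    providerOptions: {
      google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" },
    },
  };
}

// Spreading the result into the arguments of both the generateText and
// streamText calls would keep the two paths in sync.
// The model ids below are example values only.
const baseArgs = { prompt: "Click the login button" };
const geminiArgs = {
  ...baseArgs,
  ...mediaResolutionOptions("google/gemini-3-pro-preview"),
};
const otherArgs = {
  ...baseArgs,
  ...mediaResolutionOptions("openai/gpt-4.1"),
};
```

Returning an empty object for non-matching models means the `providerOptions` key is omitted entirely rather than being set to `undefined`, which matches the "without providerOptions" branch in the sequence diagram.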