update cua agents key & system prompt handling #1125
🦋 Changeset detected
Latest commit: a63a9f6
The changes in this PR will be included in the next version bump.
Greptile Overview
Summary
Fixed key action handling for OpenAI CUA agent and enabled custom system prompt support for Google CUA agent.
Key Changes:
- All key actions in keypress handlers now map keys through the `mapKeyToPlaywright` function to ensure compatibility with Playwright
- Google CUA agent now respects custom system prompts passed via `userProvidedInstructions`, overriding the default system prompt
- Proper changeset included for version tracking
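The key-mapping step can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `KEY_MAP`, `mapKeySketch`, and `toPlaywrightChord` are hypothetical names, and the table is an assumed subset that stands in for whatever `mapKeyToPlaywright` actually covers.

```typescript
// Hypothetical stand-in for Stagehand's mapKeyToPlaywright.
// The real function's mapping table may differ; this subset is an assumption.
const KEY_MAP: Record<string, string> = {
  ENTER: "Enter",
  ESC: "Escape",
  CTRL: "Control",
  ALT: "Alt",
  SHIFT: "Shift",
  TAB: "Tab",
  BACKSPACE: "Backspace",
};

// Map one agent-provided key name to a Playwright key name.
// Keys that are not in the table are passed through unchanged.
function mapKeySketch(key: string): string {
  return KEY_MAP[key.toUpperCase()] ?? key;
}

// Map every key in a keypress action, then join with "+",
// which is the chord syntax that Playwright's keyboard.press() accepts.
function toPlaywrightChord(keys: string[]): string {
  return keys.map(mapKeySketch).join("+");
}

// In a handler, the result would then be passed to page.keyboard.press(...).
console.log(toPlaywrightChord(["CTRL", "a"]));
console.log(toPlaywrightChord(["ENTER"]));
```

Mapping each key before joining matters for chords: joining first and mapping the combined string would leave names such as `CTRL` untranslated.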
Confidence Score: 5/5
- This PR is safe to merge with minimal risk
- The changes are well-scoped bug fixes that improve key action handling and add missing functionality for custom system prompts. The `mapKeyToPlaywright` utility function already exists and is being used elsewhere in the codebase. The system prompt change simply adds conditional logic to use custom instructions when provided. All changes include proper changeset documentation.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| lib/agent/GoogleCUAClient.ts | 5/5 | Updated system prompt handling to use custom instructions when provided, falling back to default prompt |
| lib/handlers/cuaAgentHandler.ts | 5/5 | Fixed keypress handling by mapping keys through mapKeyToPlaywright function before sending to Playwright |
Sequence Diagram
sequenceDiagram
    participant Agent as CUA Agent
    participant Handler as CuaAgentHandler
    participant Mapper as mapKeyToPlaywright
    participant Page as Playwright Page
    Agent->>Handler: keypress action with keys array
    Handler->>Mapper: map each key (e.g., "ENTER")
    Mapper-->>Handler: return Playwright key (e.g., "Enter")
    Handler->>Handler: join mapped keys with "+"
    Handler->>Page: keyboard.press(mappedKeys)
    Note over Agent,Handler: Google CUA Client System Prompt
    Agent->>Agent: initializeHistory(instruction)
    alt userProvidedInstructions exists
        Agent->>Agent: use userProvidedInstructions
    else no custom instructions
        Agent->>Agent: use buildGoogleCUASystemPrompt().content
    end
    Agent->>Agent: add system prompt to history
2 files reviewed, no comments
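The system prompt branch described in the review can be sketched like this. It is an assumption-laden illustration rather than the contents of `GoogleCUAClient.ts`: `buildDefaultPromptSketch`, `resolveSystemPrompt`, `initializeHistorySketch`, and the `HistoryEntry` shape are hypothetical, and the default prompt text is a placeholder.

```typescript
// Hypothetical history entry shape; the real client's types may differ.
interface HistoryEntry {
  role: "system" | "user";
  text: string;
}

// Hypothetical stand-in for buildGoogleCUASystemPrompt().
// The placeholder content below is not the real default prompt.
function buildDefaultPromptSketch(): { content: string } {
  return { content: "DEFAULT_SYSTEM_PROMPT_PLACEHOLDER" };
}

// Prefer the caller's instructions when provided, else fall back to the default.
// Treating a whitespace-only string as "not provided" is an assumption of this sketch.
function resolveSystemPrompt(userProvidedInstructions?: string): string {
  const custom = userProvidedInstructions?.trim();
  return custom ? custom : buildDefaultPromptSketch().content;
}

// Build the initial history: system prompt first, then the user's instruction.
function initializeHistorySketch(
  instruction: string,
  userProvidedInstructions?: string,
): HistoryEntry[] {
  return [
    { role: "system", text: resolveSystemPrompt(userProvidedInstructions) },
    { role: "user", text: instruction },
  ];
}

console.log(initializeHistorySketch("Open the pricing page", "Only use the keyboard."));
```

Resolving the prompt in one small function keeps the fallback rule in a single place, so the history-building code does not need to know which source the prompt came from.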
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to v2, this PR will be updated.

# Releases

## @browserbasehq/[email protected]

### Patch Changes

- [#1275](#1275) [`a372b3c`](a372b3c) Thanks [@miguelg719](https://github.com/miguelg719)! - Remove process exit on signal handler
- [#1143](#1143) [`fc06d40`](fc06d40) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add logger param to external aisdk client
- [#1137](#1137) [`2dbac99`](2dbac99) Thanks [@miguelg719](https://github.com/miguelg719)! - Add haiku 4.5 computer use support
- [#1116](#1116) [`b419fc3`](b419fc3) Thanks [@tkattkat](https://github.com/tkattkat)! - patch stagehand agent api support
- [#1362](#1362) [`f26333e`](f26333e) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - use CDP to find scrollable nodes instead of injected JS
- [#1125](#1125) [`cbff109`](cbff109) Thanks [@tkattkat](https://github.com/tkattkat)! - update cua agents key & system prompt handling
- [#1363](#1363) [`223e158`](223e158) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add causedBy to StagehandDefaultError
- [#1123](#1123) [`f426ba5`](f426ba5) Thanks [@tkattkat](https://github.com/tkattkat)! - Add pageUrl & timestamp to agent actions
- [#1365](#1365) [`2f71b02`](2f71b02) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - export getAccessibilityTree()
- [#1366](#1366) [`e098b0d`](e098b0d) Thanks [@miguelg719](https://github.com/miguelg719)! - Update finding scrollable nodes using CDP

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies \[[`a372b3c`](a372b3c), [`fc06d40`](fc06d40), [`2dbac99`](2dbac99), [`b419fc3`](b419fc3), [`f26333e`](f26333e), [`cbff109`](cbff109), [`223e158`](223e158), [`f426ba5`](f426ba5), [`2f71b02`](2f71b02), [`e098b0d`](e098b0d)]:
  - @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies \[[`a372b3c`](a372b3c), [`fc06d40`](fc06d40), [`2dbac99`](2dbac99), [`b419fc3`](b419fc3), [`f26333e`](f26333e), [`cbff109`](cbff109), [`223e158`](223e158), [`f426ba5`](f426ba5), [`2f71b02`](2f71b02), [`e098b0d`](e098b0d)]:
  - @browserbasehq/[email protected]

<!-- This is an auto-generated description by cubic. -->

---

## Summary by cubic

Publish Stagehand 2.5.3 and bump evals (1.1.3) and examples (1.0.12). Improves stability, CDP-based scrolling, and adds better logging and error context.

- **New Features**
  - Haiku 4.5 computer-use support.
  - Export getAccessibilityTree().
  - Logger param for the external AISDK client.
  - Page URL and timestamp on agent actions.
  - causedBy on StagehandDefaultError for richer error context.
- **Bug Fixes**
  - Detect scrollable nodes via CDP (removed injected JS).
  - Refined CDP scrollable node detection.
  - Do not exit the process on signal handler.
  - Patch agent API support and update CUA key/system prompt handling.

<sup>Written for commit 6b62062. Summary will update automatically on new commits.</sup>

<!-- End of auto-generated description by cubic. -->

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
why
Currently, the OpenAI CUA agent handles keypress actions incorrectly.
Currently, there is no way to pass a custom system prompt to the Google CUA agent.
what changed
test plan
Tested locally with the Google and OpenAI CUA agents.
Fixes #1122