-
Notifications
You must be signed in to change notification settings - Fork 1.3k
update agent cache handling #1431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: 29d57a3 The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 issues found across 10 files
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="packages/core/lib/v3/agent/tools/type.ts">
<violation number="1" location="packages/core/lib/v3/agent/tools/type.ts:65">
P2: The replay step is silently skipped when xpath is not available. Previously, the step was always recorded. This could cause incomplete replay caches with no logging or warning. Consider adding an else clause to log when xpath cannot be determined, or fall back to recording with coordinates.</violation>
</file>
<file name="packages/core/lib/v3/agent/tools/click.ts">
<violation number="1" location="packages/core/lib/v3/agent/tools/click.ts:61">
P1: Silent failure when xpath is unavailable - click succeeds but no replay step is recorded. The old code always recorded a replay step, but now if `xpath` is null/undefined/empty, caching is silently skipped. Consider adding a fallback to record using coordinates when xpath is unavailable, or at least log a warning.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
Greptile SummaryThis PR enhances agent cache handling by converting coordinate-based actions (click, type, dragAndDrop, clickAndHold, fillFormVision) to XPath-based deterministic actions for reliable cache replay. The implementation extracts Key changes:
Architecture: Trade-offs: Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Agent as Agent Tool (click/type/etc)
participant Page as Page (Understudy)
participant XPath as ensureXPath Util
participant Recording as v3.recordAgentReplayStep
participant Cache as AgentCache
participant Replay as Cache Replay
Note over Agent,Replay: Recording Phase (First Execution)
Agent->>Page: click(x, y, {returnXpath: true})
Page-->>Agent: xpath string
Agent->>XPath: ensureXPath(xpath)
XPath-->>Agent: normalized xpath with "xpath=" prefix (or null)
alt xpath is valid
Agent->>Recording: recordAgentReplayStep({type: "act", actions: [{selector: xpath, method: "click", ...}]})
Recording->>Cache: store step with xpath-based Action
else xpath is null/invalid
Note over Agent,Recording: Skip recording (silent failure)
end
Note over Agent,Replay: Replay Phase (Cache Hit)
Cache->>Replay: replayAgentActStep(step)
Replay->>Page: takeDeterministicAction(action with xpath selector)
Page->>Page: Locate element by xpath and execute action
Note over Page: Action replayed deterministically using selector
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9 files reviewed, 2 comments
pirate
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved but I would recommend recording the typing step even if we cant get a normalized xpath, typing into the page without a specific focus is still better than not typing at all
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @browserbasehq/[email protected] ### Patch Changes - [#1461](#1461) [`0f3991e`](0f3991e) Thanks [@tkattkat](https://github.com/tkattkat)! - Move hybrid mode out of experimental - [#1433](#1433) [`e0e22e0`](e0e22e0) Thanks [@tkattkat](https://github.com/tkattkat)! - Put hybrid mode behind experimental - [#1456](#1456) [`f261051`](f261051) Thanks [@shrey150](https://github.com/shrey150)! - Invoke page.hover for agent move action - [#1473](#1473) [`e021674`](e021674) Thanks [@shrey150](https://github.com/shrey150)! - Add safety confirmation support for OpenAI + Google CUA - [#1399](#1399) [`6a5496f`](6a5496f) Thanks [@tkattkat](https://github.com/tkattkat)! - Ensure cua agent is killed when stagehand.close is called - [#1436](#1436) [`fea1700`](fea1700) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix auto-load key for act/extract/observe parametrized models on api - [#1439](#1439) [`5b288d9`](5b288d9) Thanks [@tkattkat](https://github.com/tkattkat)! - Remove base64 from agent actions array ( still present in messages object ) - [#1408](#1408) [`e822f5a`](e822f5a) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - allow for act() cache hit when variable values change - [#1472](#1472) [`638efc7`](638efc7) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: agent cache not refreshed on action failure - [#1424](#1424) [`a890f16`](a890f16) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: "Error: -32000 Failed to convert response to JSON: CBOR: stack limit exceeded" - [#1418](#1418) [`934f492`](934f492) Thanks [@miguelg719](https://github.com/miguelg719)! - Cleanup handlers and bus listeners on close - [#1430](#1430) [`bd2db92`](bd2db92) Thanks [@shrey150](https://github.com/shrey150)! - Fix CUA model coordinate translation - [#1465](#1465) [`51e0170`](51e0170) Thanks [@miguelg719](https://github.com/miguelg719)! - Add media resolution high provider option to gemini 3 hybrid agent - [#1431](#1431) [`05f5580`](05f5580) Thanks [@tkattkat](https://github.com/tkattkat)! - Update the cache handling for agent - [#1432](#1432) [`f56a9c2`](f56a9c2) Thanks [@tkattkat](https://github.com/tkattkat)! - Deprecate cua: true in favor of mode: "cua" - [#1406](#1406) [`b40ae11`](b40ae11) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for hovering with coordinates ( page.hover ) - [#1407](#1407) [`0d2b398`](0d2b398) Thanks [@tkattkat](https://github.com/tkattkat)! - Clean up page methods - [#1412](#1412) [`cd01f29`](cd01f29) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: load GOOGLE_API_KEY from .env - [#1462](#1462) [`a734fca`](a734fca) Thanks [@shrey150](https://github.com/shrey150)! - fix: correctly pass userDataDir to chrome launcher - [#1466](#1466) [`b342acf`](b342acf) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - move playwright to optional dependencies - [#1440](#1440) [`2987cd1`](2987cd1) Thanks [@tkattkat](https://github.com/tkattkat)! - [Feature] support excluding tools from agent - [#1455](#1455) [`dfab1d5`](dfab1d5) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - update aisdk client to better enforce structured output with deepseek models - [#1428](#1428) [`4d71162`](4d71162) Thanks [@tkattkat](https://github.com/tkattkat)! - Add "hybrid" mode to stagehand agent ## @browserbasehq/[email protected] ### Minor Changes - [#1459](#1459) [`abb3469`](abb3469) Thanks [@monadoid](https://github.com/monadoid)! - Added building of binaries - [#1457](#1457) [`5fc1281`](5fc1281) Thanks [@monadoid](https://github.com/monadoid)! - First changeset for stagehand-server - [#1469](#1469) [`d634d45`](d634d45) Thanks [@monadoid](https://github.com/monadoid)! - Bump to test binary builds ### Patch Changes - Updated dependencies \[[`0f3991e`](0f3991e), [`e0e22e0`](e0e22e0), [`f261051`](f261051), [`e021674`](e021674), [`6a5496f`](6a5496f), [`fea1700`](fea1700), [`5b288d9`](5b288d9), [`e822f5a`](e822f5a), [`638efc7`](638efc7), [`a890f16`](a890f16), [`934f492`](934f492), [`bd2db92`](bd2db92), [`51e0170`](51e0170), [`05f5580`](05f5580), [`f56a9c2`](f56a9c2), [`b40ae11`](b40ae11), [`0d2b398`](0d2b398), [`cd01f29`](cd01f29), [`a734fca`](a734fca), [`b342acf`](b342acf), [`2987cd1`](2987cd1), [`dfab1d5`](dfab1d5), [`4d71162`](4d71162)]: - @browserbasehq/[email protected] ## @browserbasehq/[email protected] ### Patch Changes - [#1373](#1373) [`cadd192`](cadd192) Thanks [@tkattkat](https://github.com/tkattkat)! - Update screenshot collector in agent evals cli - Updated dependencies \[[`0f3991e`](0f3991e), [`e0e22e0`](e0e22e0), [`f261051`](f261051), [`e021674`](e021674), [`6a5496f`](6a5496f), [`fea1700`](fea1700), [`5b288d9`](5b288d9), [`e822f5a`](e822f5a), [`638efc7`](638efc7), [`a890f16`](a890f16), [`934f492`](934f492), [`bd2db92`](bd2db92), [`51e0170`](51e0170), [`05f5580`](05f5580), [`f56a9c2`](f56a9c2), [`b40ae11`](b40ae11), [`0d2b398`](0d2b398), [`cd01f29`](cd01f29), [`a734fca`](a734fca), [`b342acf`](b342acf), [`2987cd1`](2987cd1), [`dfab1d5`](dfab1d5), [`4d71162`](4d71162)]: - @browserbasehq/[email protected] Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
why
add support for keys cache
what changed
save click , and type actions in act cache for agent
test plan
ran on sign in example with click type and keys , re ran using cache
Summary by cubic
Makes agent actions deterministic and cacheable by recording XPath-based “act” steps for clicks, typing, drag-and-drop, click-and-hold, and form fills, and adds replay support for key events. This improves cache hit rates and makes replays more reliable.
Written for commit f62c537. Summary will update automatically on new commits.