Add support for Fara 7b #1304
🦋 Changeset detected
Latest commit: de165e1
The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
Greptile Overview
Greptile Summary
Added support for Microsoft's Fara-7B model with a new
Key Changes:
Issues Found:
Confidence Score: 3/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant User
participant Stagehand
participant AgentProvider
participant MicrosoftCUAClient
participant OpenAI API
participant V3CuaAgentHandler
participant Browser
User->>Stagehand: agent({cua: true, model: "microsoft/fara-7b"})
Stagehand->>AgentProvider: getClient(modelName, clientOptions)
AgentProvider->>AgentProvider: Check explicit provider or map "fara-7b" to "microsoft"
AgentProvider->>MicrosoftCUAClient: new MicrosoftCUAClient(type, modelName, instructions, options)
MicrosoftCUAClient->>MicrosoftCUAClient: Initialize OpenAI client with baseURL/apiKey
MicrosoftCUAClient->>MicrosoftCUAClient: Calculate resized viewport using smart_resize
MicrosoftCUAClient-->>Stagehand: Return agent instance
User->>Stagehand: agent.execute({instruction, maxSteps})
Stagehand->>V3CuaAgentHandler: execute(options)
loop For each step (up to maxSteps)
V3CuaAgentHandler->>MicrosoftCUAClient: executeStep(logger, isFirstRound)
MicrosoftCUAClient->>Browser: captureScreenshot()
Browser-->>MicrosoftCUAClient: base64 screenshot
MicrosoftCUAClient->>MicrosoftCUAClient: Add screenshot to conversation history
MicrosoftCUAClient->>MicrosoftCUAClient: reconstructHistory() + generateSystemPrompt()
MicrosoftCUAClient->>OpenAI API: chat.completions.create(messages)
OpenAI API-->>MicrosoftCUAClient: XML response with <tool_call>
MicrosoftCUAClient->>MicrosoftCUAClient: parseThoughtsAndAction(response)
MicrosoftCUAClient->>MicrosoftCUAClient: convertFunctionCallToAction(functionCall)
MicrosoftCUAClient->>MicrosoftCUAClient: Transform coordinates (resized→original viewport)
alt action is "type" with coordinates
MicrosoftCUAClient->>MicrosoftCUAClient: Expand to [click, type, press_enter]
else action is "terminate"
MicrosoftCUAClient->>MicrosoftCUAClient: Mark as completed
else other actions
MicrosoftCUAClient->>MicrosoftCUAClient: Pass through as-is
end
MicrosoftCUAClient->>V3CuaAgentHandler: Return actions & completion status
loop For each action
V3CuaAgentHandler->>V3CuaAgentHandler: executeAction(action)
V3CuaAgentHandler->>Browser: Perform action (click, type, scroll, etc.)
Browser-->>V3CuaAgentHandler: Action completed
end
alt completed or maxSteps reached
V3CuaAgentHandler-->>User: Return AgentResult
end
end
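The per-step handling in the diagram (parse the `<tool_call>`, map coordinates from the resized screenshot back to the real viewport, expand a `type` call that carries coordinates) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the type shapes, the `toOriginalViewport` and `isPoint` helpers, the assumption that the `<tool_call>` body is JSON with `name` and `arguments`, and the `action` / `coordinate` / `text` argument names are all hypothetical.

```typescript
// Hypothetical shapes for illustration only; Stagehand's real types may differ.
type Viewport = { width: number; height: number };
type FunctionCall = { name: string; arguments: Record<string, unknown> };
type Action = { type: string; [key: string]: unknown };

// Split a model response into free-text thoughts and the tool call.
// Assumption: the <tool_call> body is JSON with "name" and "arguments".
function parseThoughtsAndAction(response: string): {
  thoughts: string;
  functionCall: FunctionCall;
} {
  const match = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/.exec(response);
  if (match === null) {
    throw new Error("No <tool_call> block found in model response");
  }
  const parsed = JSON.parse(match[1] ?? "") as Partial<FunctionCall>;
  if (typeof parsed.name !== "string") {
    throw new Error("Tool call is missing a name");
  }
  return {
    thoughts: response.slice(0, match.index).trim(),
    functionCall: { name: parsed.name, arguments: parsed.arguments ?? {} },
  };
}

// Type guard for an [x, y] pair.
function isPoint(value: unknown): value is [number, number] {
  return (
    Array.isArray(value) &&
    value.length === 2 &&
    value.every((n) => typeof n === "number")
  );
}

// Scale a point from the resized screenshot back to the real viewport.
function toOriginalViewport(
  point: [number, number],
  resized: Viewport,
  original: Viewport,
): [number, number] {
  const [x, y] = point;
  return [
    Math.round((x * original.width) / resized.width),
    Math.round((y * original.height) / resized.height),
  ];
}

// Turn one function call into browser actions plus a completion flag.
function convertFunctionCallToActions(
  call: FunctionCall,
  resized: Viewport,
  original: Viewport,
): { actions: Action[]; completed: boolean } {
  const { action, coordinate, ...rest } = call.arguments;
  if (typeof action !== "string") {
    throw new Error("Tool call has no action");
  }
  if (action === "terminate") {
    return { actions: [], completed: true };
  }
  const point = isPoint(coordinate)
    ? toOriginalViewport(coordinate, resized, original)
    : undefined;
  if (action === "type" && point !== undefined) {
    // Focus the field, enter the text, then submit with Enter.
    return {
      actions: [
        { type: "click", x: point[0], y: point[1] },
        { type: "type", text: rest["text"] },
        { type: "keypress", keys: ["Enter"] },
      ],
      completed: false,
    };
  }
  // Every other action passes through, with coordinates rescaled if present.
  const coords = point !== undefined ? { x: point[0], y: point[1] } : {};
  return { actions: [{ type: action, ...rest, ...coords }], completed: false };
}
```

Keeping the rescaling in one helper means every coordinate-bearing action goes through the same mapping, so a change to the resize policy only touches one place.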
Additional Comments (1)
packages/core/lib/v3/handlers/v3CuaAgentHandler.ts, lines 414-423
logic: missing handler for the pause_and_memorize_fact action; it will fall through to the default case and fail
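One possible shape for the fix, sketched under assumptions: the `ActionExecutor` class, the `AgentAction` type, the `fact` field, and the result object are hypothetical stand-ins and not the actual `V3CuaAgentHandler` code. The point is only that the action gets an explicit case, so it never reaches the failing default branch.

```typescript
// Hypothetical action and result shapes; the real handler's types may differ.
type AgentAction = { type: string; [key: string]: unknown };
type ActionResult = { success: boolean; error?: string };

class ActionExecutor {
  // Facts the model asked to remember; nothing is done in the browser for them.
  readonly memorizedFacts: string[] = [];

  execute(action: AgentAction): ActionResult {
    switch (action.type) {
      case "pause_and_memorize_fact": {
        // Assumption: the fact text arrives in a "fact" field.
        const fact = action["fact"];
        if (typeof fact === "string") {
          this.memorizedFacts.push(fact);
        }
        return { success: true };
      }
      case "click":
      case "type":
      case "scroll":
        // Browser interaction elided in this sketch.
        return { success: true };
      default:
        // Unknown actions still fail loudly.
        return {
          success: false,
          error: `Unsupported action type: ${action.type}`,
        };
    }
  }
}
```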
8 files reviewed, 2 comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.

# Releases

## @browserbasehq/[email protected]

### Patch Changes

- [#1388](#1388) [`605ed6b`](605ed6b) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix multiple click event dispatches on CDP and Anthropic CUA handling (double clicks)
- [#1400](#1400) [`34e7e5b`](34e7e5b) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - don't write base64 encoded screenshots to disk when caching agent actions
- [#1345](#1345) [`943d2d7`](943d2d7) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for aborting / stopping an agent run & continuing an agent run using messages from prior runs
- [#1334](#1334) [`0e95cd2`](0e95cd2) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for google vertex provider
- [#1410](#1410) [`d4237e4`](d4237e4) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: include extract in stagehand.history()
- [#1315](#1315) [`86975e7`](86975e7) Thanks [@tkattkat](https://github.com/tkattkat)! - Add streaming support to agent through stream:true in the agent config
- [#1304](#1304) [`d5e119b`](d5e119b) Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for Microsoft's Fara-7B
- [#1346](#1346) [`4e051b2`](4e051b2) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: don't attach to targets twice
- [#1327](#1327) [`6b5a3c9`](6b5a3c9) Thanks [@miguelg719](https://github.com/miguelg719)! - Informed error parsing from api
- [#1335](#1335) [`bb85ad9`](bb85ad9) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add support for page.addInitScript()
- [#1331](#1331) [`88d28cc`](88d28cc) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: page.evaluate() now works with scripts injected via context.addInitScript()
- [#1316](#1316) [`45bcef0`](45bcef0) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for callbacks in stagehand agent
- [#1374](#1374) [`6aa9d45`](6aa9d45) Thanks [@miguelg719](https://github.com/miguelg719)! - Fix key action mapping in Anthropic CUA
- [#1330](#1330) [`d382084`](d382084) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: make act, extract, and observe respect user defined timeout param
- [#1336](#1336) [`1df08cc`](1df08cc) Thanks [@tkattkat](https://github.com/tkattkat)! - Patch agent on api
- [#1358](#1358) [`2b56600`](2b56600) Thanks [@tkattkat](https://github.com/tkattkat)! - Add support for 4.5 opus in cua agent

## @browserbasehq/[email protected]

### Patch Changes

- [#1364](#1364) [`ca0630e`](ca0630e) Thanks [@tkattkat](https://github.com/tkattkat)! - Update model handling in agent evals cli
- Updated dependencies \[[`605ed6b`](605ed6b), [`34e7e5b`](34e7e5b), [`943d2d7`](943d2d7), [`0e95cd2`](0e95cd2), [`d4237e4`](d4237e4), [`86975e7`](86975e7), [`d5e119b`](d5e119b), [`4e051b2`](4e051b2), [`6b5a3c9`](6b5a3c9), [`bb85ad9`](bb85ad9), [`88d28cc`](88d28cc), [`45bcef0`](45bcef0), [`6aa9d45`](6aa9d45), [`d382084`](d382084), [`1df08cc`](1df08cc), [`2b56600`](2b56600)]:
  - @browserbasehq/[email protected]

## @browserbasehq/[email protected]

### Patch Changes

- Updated dependencies \[[`605ed6b`](605ed6b), [`34e7e5b`](34e7e5b), [`943d2d7`](943d2d7), [`0e95cd2`](0e95cd2), [`d4237e4`](d4237e4), [`86975e7`](86975e7), [`d5e119b`](d5e119b), [`4e051b2`](4e051b2), [`6b5a3c9`](6b5a3c9), [`bb85ad9`](bb85ad9), [`88d28cc`](88d28cc), [`45bcef0`](45bcef0), [`6aa9d45`](6aa9d45), [`d382084`](d382084), [`1df08cc`](1df08cc), [`2b56600`](2b56600)]:
  - @browserbasehq/[email protected]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
why
Adds support for Microsoft's new Fara-7B model.
what changed
test plan