feat(browser): browser automation skill via playwright-mcp integration

## Summary

Zeph's current web capabilities are limited to static HTTP scraping (`web_scrape` / `fetch`). This covers about 50% of real-world use cases. Modern SPAs (React/Vue/Next.js), authenticated workflows, dynamic content, and form interactions require a real browser.

This issue proposes integrating browser automation via `@playwright/mcp` as an optional MCP server, complemented by a new `browser` skill that teaches the LLM a cost-aware escalation strategy.

## Current state

`WebScrapeExecutor` (`crates/zeph-tools/src/scrape.rs`):
- Static HTML + CSS selector extraction
- HTTPS-only + SSRF protection
- Fast and token-efficient

Gaps:
- No JS rendering — SPAs return empty/broken HTML
- No interaction — cannot click, type, authenticate
- Blocked by Cloudflare and bot-detection

## Proposed solution

### Phase 1 — MCP config + skill (zero new Rust code)

Add `@playwright/mcp` as an optional pre-configured `[[mcp.servers]]` entry:

```toml
[[mcp.servers]]
id = "browser"
transport = "stdio"
command = "npx"
args = ["@playwright/mcp@latest", "--headless"]
```

Or via Docker (headless, no Node.js on host):

```bash
docker run -d -p 8931:8931 mcr.microsoft.com/playwright/mcp cli.js \
  --headless --browser chromium --no-sandbox --port 8931
```

Create `.zeph/skills/browser/SKILL.md` with a decision tree:

| Scenario | Tool |
|---|---|
| Static HTML | `web_scrape` (fast, no overhead) |
| SPA / JS-rendered page | `browser_navigate` + `browser_snapshot` |
| Form fill / login flow | `browser_click` + `browser_type` |
| Visual capture | `browser_take_screenshot` |
| JS data extraction | `browser_evaluate` |

Key playwright-mcp tools to expose (core group only, 19 tools):
- `browser_navigate`, `browser_snapshot` (accessibility tree — token-efficient)
- `browser_click`, `browser_type`, `browser_hover`, `browser_select_option`
- `browser_evaluate` (arbitrary JS execution)
- `browser_take_screenshot`, `browser_console_messages`, `browser_wait_for`
- Tab management: `browser_new_tab`, `browser_close_tab`, `browser_tab_list`

### Phase 2 — BrowserConfig + init wizard

- Add `[browser]` config section to `zeph-tools/src/config.rs`
- Wire into `--init` wizard: detect Node.js/Docker, offer auto-config
- Wire into `--migrate-config` for adding `[browser]` defaults

Proposed config schema:
```toml
[browser]
enabled = false
transport = "stdio"           # "stdio" | "http"
command = "npx"
args = ["@playwright/mcp@latest", "--headless"]
url = ""                      # for http transport
caps = []                     # optional: ["vision", "pdf"]
max_tabs = 5
```

### Phase 3 — Native BrowserExecutor (optional)

If MCP latency or Node.js dependency is unacceptable: implement `crates/zeph-tools/src/browser.rs` as a native `ToolExecutor` using a Rust WebDriver/CDP crate. Only pursue if Phase 1 proves insufficient.

## Why playwright-mcp

- Maintained by Microsoft (Playwright team); GitHub Copilot and Claude Code use it natively
- Both stdio and HTTP/SSE transports — directly compatible with Zeph's rmcp client
- **Accessibility snapshot mode** (default): structured refs, ~4x fewer tokens than screenshot approach
- Official Docker image for headless deployment
- Apache 2.0 license

Alternatives evaluated:
- `@modelcontextprotocol/server-puppeteer` — **deprecated** (archived May 2025)
- `browserbase` — paid cloud, vendor lock-in
- `browsermcp.io` — desktop Chrome extension only, not headless
- `rust-browser-mcp` — community project, immature, WebDriver limitations

## Acceptance criteria

- [ ] `[[mcp.servers]]` example for playwright-mcp in `config.toml.example` / docs
- [ ] `.zeph/skills/browser/SKILL.md` with escalation decision tree and tool usage guide
- [ ] `[browser]` config section in `BrowserConfig` struct
- [ ] `--init` wizard detects Node.js/Docker and offers browser auto-config
- [ ] `--migrate-config` adds `[browser]` defaults
- [ ] Docs: `docs/src/configuration.md` browser section
- [ ] Live session test: navigate to a JS-rendered page, extract content via `browser_snapshot`

## Open questions

1. Should browser state (cookies, localStorage) persist across agent turns?
2. Screenshots: inline base64 in `ToolOutput` or written to `.local/` + referenced by path?
3. Token budget: should `browser_snapshot` output be summarized before feeding to LLM on large pages?
4. SSRF: browser can access internal network — should the same SSRF rules from `WebScrapeExecutor` apply via skill constraints?

## Research notes

Full research report: `.local/reports/browser-skill-research.md`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(browser): browser automation skill via playwright-mcp integration #2186

Summary

Current state

Proposed solution

Phase 1 — MCP config + skill (zero new Rust code)

Phase 2 — BrowserConfig + init wizard

Phase 3 — Native BrowserExecutor (optional)

Why playwright-mcp

Acceptance criteria

Open questions

Research notes

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Scenario	Tool
Static HTML	`web_scrape` (fast, no overhead)
SPA / JS-rendered page	`browser_navigate` + `browser_snapshot`
Form fill / login flow	`browser_click` + `browser_type`
Visual capture	`browser_take_screenshot`
JS data extraction	`browser_evaluate`

feat(browser): browser automation skill via playwright-mcp integration #2186

Description

Summary

Current state

Proposed solution

Phase 1 — MCP config + skill (zero new Rust code)

Phase 2 — BrowserConfig + init wizard

Phase 3 — Native BrowserExecutor (optional)

Why playwright-mcp

Acceptance criteria

Open questions

Research notes

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions