name: browser description: "Chrome browser control: open pages, take ref snapshots, click, type, screenshot. Requires cli-jaw server running." metadata: { "openclaw": { "emoji": "🌐", "requires": { "bins": ["cli-jaw"], "system": ["Google Chrome"] }, "install": [ { "id": "brew-cliclick", "kind": "brew", "formula": "cliclick", "bins": ["cliclick"], "label": "Install cliclick (optional, for coordinate-based clicks)", }, ], }, }

Browser Control

Control Chrome through cli-jaw browser commands. Use ref-based snapshots to identify page elements, then click/type by ref ID.

This skill follows the newer 30_browser workflow shape, adapted for the server-backed cli-jaw browser runtime. Commands that are not implemented in the current cli-jaw runtime are separated under Planned Runtime Delta and must not be used as current commands.

Prerequisites

cli-jaw serve must be running.
Google Chrome must be installed.
playwright-core must be installed in the cli-jaw project.

Quick Start

cli-jaw browser start --agent                  # Automation session (headless, no visible test window)
cli-jaw browser start                          # Interactive browser (manual only)
cli-jaw browser start --headless               # Manual headless mode (server/CI/WSL)
cli-jaw browser navigate "https://example.com" # Go to URL
cli-jaw browser snapshot --interactive         # Interactive elements with ref IDs
cli-jaw browser click e3                       # Click ref e3
cli-jaw browser type e5 "hello" --submit       # Type + Enter
cli-jaw browser screenshot                     # Save screenshot path

Core Workflow

Always follow this pattern:

snapshot --interactive -> act by ref or key -> snapshot -> verify

Use a fresh snapshot after navigation, reload, tab changes, or any action that substantially changes the page. Ref IDs belong to the latest usable snapshot and can go stale.

Current Commands

These commands are implemented in the current cli-jaw browser runtime.

Browser Management

cli-jaw browser start [--port <auto>] [--headless] [--agent]
cli-jaw browser stop
cli-jaw browser status
cli-jaw browser reset [--force]

--agent enables an automated headless session.
Plain browser start is for user-requested interactive browsing.
reset clears the browser profile and screenshots; use only when the user explicitly wants a reset or you have confirmed it.

Observe

cli-jaw browser snapshot
cli-jaw browser snapshot --interactive
cli-jaw browser snapshot --interactive --max-nodes 30 --json
cli-jaw browser screenshot
cli-jaw browser screenshot --full-page
cli-jaw browser screenshot --ref e5
cli-jaw browser screenshot --json
cli-jaw browser screenshot --clip 0 0 320 180 --json
cli-jaw browser text
cli-jaw browser text --format html
cli-jaw browser get-dom --selector ".card" --max-chars 2000 --json
cli-jaw browser console --json --limit 20
cli-jaw browser network --json --limit 20

Snapshot Output Example

e1   link       "Gmail"
e2   link       "Images"
e3   textbox    "Search"           <- To type here: type e3 "query"
e4   button     "Google Search"    <- To click: click e4
e5   button     "I'm Feeling Lucky"

Act

cli-jaw browser click e3
cli-jaw browser click e3 --double
cli-jaw browser click e3 --right
cli-jaw browser type e3 "hello"
cli-jaw browser type e3 "hello" --submit
cli-jaw browser press Enter
cli-jaw browser press Escape
cli-jaw browser press Tab
cli-jaw browser hover e5
cli-jaw browser mouse-click 400 300
cli-jaw browser mouse-click 400 300 --double
cli-jaw browser select e7 "option1"
cli-jaw browser drag e3 e5
cli-jaw browser move-mouse 400 300
cli-jaw browser mouse-down
cli-jaw browser mouse-up --right

Navigate and Inspect

cli-jaw browser navigate "https://example.com"
cli-jaw browser open "https://example.com"
cli-jaw browser tabs
cli-jaw browser tabs --json
cli-jaw browser active-tab --json
cli-jaw browser tab-switch 2
cli-jaw browser reload
cli-jaw browser resize 1440 900
cli-jaw browser scroll --x 0 --y 1000
cli-jaw browser wait-for-selector ".toast-success" --timeout 30000
cli-jaw browser wait-for-text "Dashboard" --timeout 30000
cli-jaw browser evaluate "document.title"

evaluate is a top-level browser diagnostic command. Do not expose arbitrary user-provided JavaScript through higher-level vendor workflows such as web-ai.

Common Workflows

AI Web Workflows

For ChatGPT web-ai workflows, use the web-ai skill. The browser skill owns primitive page control; web-ai owns structured question rendering, active-tab safety, and response baseline handling.

Standalone agbrowse Alternative

When the user explicitly wants to drive a single Chrome instance (for example: keep one logged-in profile open, avoid running both cli-jaw serve and a second CDP session), the same browser commands are available through the standalone agbrowse CLI (npm install -g agbrowse). The flag surface is identical; only the binary prefix changes.

`cli-jaw browser` form	`agbrowse` form
`cli-jaw browser start --agent`	`agbrowse start`
`cli-jaw browser status`	`agbrowse status`
`cli-jaw browser navigate "<url>"`	`agbrowse navigate "<url>"`
`cli-jaw browser snapshot --interactive`	`agbrowse snapshot --interactive`
`cli-jaw browser click e3`	`agbrowse click e3`
`cli-jaw browser type e5 "hello" --submit`	`agbrowse type e5 "hello" --submit`
`cli-jaw browser screenshot`	`agbrowse screenshot`
`cli-jaw browser tabs`	`agbrowse tabs`
`cli-jaw browser stop`	`agbrowse stop`

Only switch when the user explicitly asks for the standalone path. Do not run cli-jaw browser and agbrowse against the same --port simultaneously — the second start will reuse the first CDP and the persisted state files can collide. For the web-ai layer, see the corresponding Standalone agbrowse Alternative section in the web-ai skill.

Web Search

cli-jaw browser start --agent
cli-jaw browser navigate "https://www.google.com"
cli-jaw browser snapshot --interactive
cli-jaw browser type e3 "search query" --submit
cli-jaw browser snapshot --interactive
cli-jaw browser click e7

Form Filling

cli-jaw browser snapshot --interactive
cli-jaw browser type e1 "John Doe"
cli-jaw browser type e2 "john@example.com"
cli-jaw browser click e3
cli-jaw browser snapshot

Read Page Content

cli-jaw browser navigate "https://news.ycombinator.com"
cli-jaw browser text
cli-jaw browser text --format html
cli-jaw browser snapshot --interactive

Planned Runtime Delta

The copied 30_browser reference documents a richer command surface. These are planned cli-jaw browser parity targets, not current commands unless the runtime has been upgraded in a later PRD.

Observe and Diagnostics

cli-jaw browser console --clear --reload --duration 3000
cli-jaw browser network --reload --duration 1000
cli-jaw browser wait 2000

Actions

cli-jaw browser resize 0 0 --fullscreen
cli-jaw browser scroll down
cli-jaw browser scroll up --amount 1000

Navigation and Sync

wait-for <ref> is deprecated in the reference design because refs are snapshot-scoped. Prefer selector/text waits.

Recovery Strategy

If something goes wrong, stop and inspect state before the next action.

snapshot fails -> take screenshot for visual inspection.
Ref not found -> re-run snapshot --interactive; refs can go stale.
Async UI not ready -> use wait-for-selector or wait-for-text.
CDP connection fails -> report the exact error, then use status; only stop/start when that is the selected recovery path.
Chrome/profile is truly stuck -> ask before reset unless the user already requested destructive reset.
DOM ref unavailable -> use the vision-click skill only after confirming no usable ref exists.

Environment Variables

Variable	Description
`CHROME_HEADLESS=1`	Enable headless mode for manual starts.
`CHROME_NO_SANDBOX=1`	Disable Chrome sandbox for Docker/CI only.

The default CDP port is derived from the cli-jaw server port. Use cli-jaw browser start --port <port> only when you need an explicit override.

Headless Mode

cli-jaw browser start --headless
cli-jaw browser start --agent
CHROME_HEADLESS=1 cli-jaw browser start

Use --agent for automation. It avoids popping a visible browser window.

Troubleshooting

Symptom	Cause	Fix
CDP connection refused	Chrome not started or wrong port	`cli-jaw browser status`, then start with the expected port
Windows only opens test browser	Chrome singleton absorbed launch	Close all Chrome windows, then use `start --agent`
Headless CDP not opening	headless not requested in GUI-less env	Add `--headless` or use `--agent`
Port conflict	another process owns the CDP port	choose a different `--port`
Snapshot too large	page has many nodes	planned: `--max-nodes`; current: use `--interactive`

Notes

Ref IDs are short-lived and should be treated as latest-snapshot scoped.
Always re-run snapshot --interactive after navigation or major page changes.
Prefer --interactive for token budget.
Screenshots save to ~/.cli-jaw/screenshots/.
start --agent should be the default for agent automation.
Non-DOM elements such as Canvas, WebGL, cross-origin iframes, and custom UI should use the vision-click skill only as an explicit fallback.

ナビゲーション

Skillsとは？

リンク

browser