OpenChrome

Harness-Engineered Browser Automation
The MCP server that drives and guides AI agents through a real Chrome.

English · 한국어

What it is

OpenChrome is an MCP server that controls your real, already-logged-in Chrome through the CDP — no middleware, no separate browser, no re-authentication. One Chrome process, many isolated tabs, ~300 MB for 20 parallel lanes.

It is harness-engineered: the server doesn't just expose browser APIs, it wraps them with a hint engine, a circuit breaker, an automatic-recovery runtime, and token-efficient page serialization — so the agent makes fewer mistakes, recovers without "thinking", and burns far fewer tokens.

You: compare "AirPods Pro" prices across Amazon, eBay, Walmart, Best Buy

AI:  [4 parallel lanes, already authenticated everywhere]
     Best Buy $179 · Amazon $189 · Walmart $185 · eBay $172
     2.4s — live pages, past bot detection

	Traditional (Playwright et al.)	OpenChrome
5-site task	~250s (login each)	~3s (parallel)
Memory	~2.5 GB (5 browsers)	~300 MB (1 Chrome)
Re-auth	every run	never
Bot detection	flagged	invisible (real Chrome)

Quick start

Install and point your MCP client at it — one command:

npm install -g openchrome-mcp

openchrome setup                       # Claude Code
openchrome setup --client codex        # Codex CLI
npx openchrome-mcp setup --client opencode   # OpenCode

Restart your MCP client. That's it — Chrome auto-launches on first tool call.

Manual MCP config (Cursor / VS Code / Windsurf / others)

{
  "mcpServers": {
    "openchrome": {
      "command": "openchrome",
      "args": ["serve", "--auto-launch"]
    }
  }
}

Run openchrome update later to refresh the CLI and client config.

Prefer no terminal? A one-click desktop app (macOS / Windows / Linux, beta) runs the server with no Node.js setup.

What you can do with it

Ask your agent in plain language — these all map to OpenChrome tools:

Parallel research — "screenshot AWS billing, GCP, Stripe, Datadog at once" → 4 lanes, one Chrome, already authenticated.
Authenticated scraping — crawl dashboards and member-only pages using your existing login. No credentials in config.
Form & flow automation — fill, click, navigate multi-step flows; the agent gets corrective hints when a step drifts.
Production UI debugging — oc_performance_insights / oc_vitals for LCP/CLS, console_capture, oc_devtools_url to attach live DevTools.
Site monitoring & diffing — oc_evidence_bundle snapshots + oc_diff for deterministic before/after (DOM, screenshot pHash, network, console).
Crawling — async crawl_start / crawl_status / crawl_cancel jobs with cursor pagination.
Verifiable runs — oc_assert checks page state against an Outcome Contract (pass / fail / inconclusive) instead of guessing.

The default surface is ~110 tools across navigation, interaction, reading, extraction, parallel workflows, contracts, skills, recovery, and diagnostics. Full catalogue: docs/agent/capability-map.md.

Using it conveniently

Drive it from the shell — no MCP host needed

The CLI can call the MCP surface directly. Great for scripts, CI, and debugging:

oc run navigate --arg url=https://example.com
oc run read_page --arg mode=dom --json
oc navigate https://example.com      # positional sugar for common tools
oc click ref_5

Declarative scenarios with `oc playbook`

Write a YAML scenario where every step is one tool call with an inline Outcome Contract — deterministic, no LLM judgement:

oc playbook run scenario.yaml --vars url=https://iana.org --out report.md

See docs/cli/playbook.md.

Keep one browser warm — HTTP daemon mode

Run OpenChrome as a long-lived daemon so multiple clients (Claude Code + CI + a dashboard) share one Chrome process, and the server outlives whatever launched it (Docker, systemd, CI):

openchrome serve --http 3100 --auth-token <token> --idle-timeout 30m
curl -s http://127.0.0.1:3100/health

One Chrome process, tabs isolated per session. Without --idle-timeout it stays up until stopped; with it, it self-exits after the idle window. Full guide: docs/getting-started/http-daemon.md.

Diagnose your environment

openchrome doctor      # Node, disk, Chrome binary/port, orphans, perms, locks
openchrome check       # verify CLI + runtime wiring

Token-efficient page reads

read_page mode="dom" serializes the page into a compact text form — ~5–15x fewer tokens than the raw DOM. Each element carries an affordance marker so the agent knows the type at a glance:

# [142]<input type="search" .../> ★      ← # text input
$ [156]<button type="submit"/>Search ★   ← $ button / control
@ [289]<a href="/home"/>Home ★           ← @ link   (% = visual target)

[backendNodeId] identifiers are stable for the node's lifetime — pass 142, node_142, or ref_N to any action tool. oc_observe goes further: it returns a ready-to-act numbered list in one call instead of read_page → query_dom → inspect → interact.

Why agents fail less on OpenChrome

The bottleneck in browser automation is the LLM thinking between steps — every wrong guess costs 10–15s of inference. OpenChrome's harness cuts that loop:

Subsystem	What it does
Hint engine (30+ rules)	Catches error→recovery patterns and corrects the agent before mistakes cascade. Promotes repeated patterns to permanent rules.
Recovery runtime	Deterministic, bounded recovery for a tool call — recover in-server, no LLM round-trip (pilot tier).
Ralph engine	7-strategy interaction waterfall: AX click → CSS → CDP coords → JS → keyboard → raw mouse → human escalation.
3-level circuit breaker	Element / page / global — stops the agent burning tokens on permanently broken elements.
Outcome classifier	Reports what actually happened after a click (SUCCESS / SILENT_CLICK / WRONG_ELEMENT).
49 reliability mechanisms	8 defense layers from process lifecycle to MCP gateway — no single failure hangs the server. See `docs/architecture.md`.

Result on a typical 5-site task: ~80% fewer LLM calls, ~80x faster wall time, ~5x cheaper.

Other capabilities worth knowing

Parallel sessions — 1 Chrome, N tabs/lanes; workerId + profileDirectory give per-client isolation. Multiple MCP clients can share tabs safely.
Anti-bot / Turnstile — 3-tier auto-fallback (headless → stealth → real headed Chrome) bypasses CDN/WAF blocks. Turnstile guide.
Interactive login — headed by default since the launcher runs visible; complete 2FA/CAPTCHA once, reuse the persistent profile after.
Session persistence — --persist-storage saves cookies + localStorage atomically for headless reuse.
Shadow DOM — open + closed roots via CDP-pierced reads; __pierce() / __openchrome.querySelectorAllDeep() helpers in javascript_tool.
Element intelligence — find elements by natural language (AX-first, CSS fallback, Korean role keywords built in: "버튼" → button).
Core / pilot tiers — core is on by default and preserves the stable surface; --pilot opts into contract runtime, handoff persistence, voting, and the skill curator.

Server & headless deployment

openchrome serve --server-mode     # headless + auto-launch + server defaults

Works in CI/CD and containers with no login — navigation, scraping, screenshots, forms, and parallel workflows all run in clean sessions. A production Dockerfile is included (docker build -t openchrome . && docker run openchrome).

Authentication (per-tenant API keys, JWT/OAuth, shared token): docs/auth.md. Transport stability policy: docs/transport-lifecycle.md.

Documentation

Topic	Link
Architecture & reliability layers	`docs/architecture.md`
Getting started walkthrough	`docs/getting-started.md`
Full tool catalogue	`docs/agent/capability-map.md`
CLI & playbook	`docs/cli.md` · `docs/cli/playbook.md`
HTTP daemon mode	`docs/getting-started/http-daemon.md`
Research recipes	`docs/recipes/README.md`
Latest release notes	`docs/releases/v1.12.4.md`

Development

git clone https://github.com/shaun0927/openchrome.git
cd openchrome
npm install && npm run build && npm test

Lint before submitting source changes: npm run lint -- --max-warnings=0 (or npm run lint:changed -- --base origin/develop for changed files only). PRs target the develop branch.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 2,761 Commits
.claude		.claude
.github/workflows		.github/workflows
assets		assets
benchmark		benchmark
cli		cli
config		config
deploy		deploy
desktop		desktop
docs		docs
extension		extension
native-host		native-host
scripts		scripts
src		src
tests		tests
.dependency-cruiser.cjs		.dependency-cruiser.cjs
.dockerignore		.dockerignore
.e2e-state.json		.e2e-state.json
.eslintrc.json		.eslintrc.json
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.ko.md		README.ko.md
README.md		README.md
SECURITY.md		SECURITY.md
jest.ci.config.js		jest.ci.config.js
jest.config.js		jest.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.cli.json		tsconfig.cli.json
tsconfig.json		tsconfig.json
tsconfig.test.json		tsconfig.test.json
webpack.config.js		webpack.config.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

OpenChrome

What it is

Quick start

What you can do with it

Using it conveniently

Drive it from the shell — no MCP host needed

Declarative scenarios with `oc playbook`

Keep one browser warm — HTTP daemon mode

Diagnose your environment

Token-efficient page reads

Why agents fail less on OpenChrome

Other capabilities worth knowing

Server & headless deployment

Documentation

Development

License

About

Uh oh!

Releases 87

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

OpenChrome

What it is

Quick start

What you can do with it

Using it conveniently

Drive it from the shell — no MCP host needed

Declarative scenarios with oc playbook

Keep one browser warm — HTTP daemon mode

Diagnose your environment

Token-efficient page reads

Why agents fail less on OpenChrome

Other capabilities worth knowing

Server & headless deployment

Documentation

Development

License

About

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 87

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Declarative scenarios with `oc playbook`

Packages