PlaywrightLLM is a Ruby helper that combines the conversational capabilities of RubyLLM with a Playwright-powered Chromium browser so you can browse, click, and scrape websites through the same tooling used by large language models.
- Persistent Playwright bridge. `PlaywrightLLM::Browser` starts the Node.js launcher in `js/launcher.js`, keeps Chromium open on CDP port 9222, and exposes environment variables for headless mode and user agents.
- Reusable browser tools. The Ruby tools in `lib/playwright_llm/tools/*` wrap the scripts in `js/tools/`, letting a model call `Navigate`, `SlimHtml`, `Click`, `FullHtml`, or even the generic `Executor` without writing JavaScript.
- LLM-centric agent. `PlaywrightLLM::Agent` wires up a `RubyLLM::Chat`, registers the browser tools, normalizes garbled tool names, and exposes a simple `launch`, `ask`, and `close` lifecycle.
- Ruby `>= 3.2.0` (see `playwright_llm.gemspec`).
- Node.js (Playwright 1.56.1 prefers Node 18+).
- Playwright dependencies. Run `npm install` and `npx playwright install chromium` so the JavaScript helpers can launch Chromium.
- API keys. Provide `OPENROUTER_API_KEY` and/or `GEMINI_API_KEY` via ENV so the default CLI content providers can talk to OpenRouter or Gemini. If you also want to use the Parallel.ai web search tool, set `PARALLEL_API_KEY` in your environment. This key is used by the `PlaywrightLLM::Tools::ParallelSearch` tool to call `https://api.parallel.ai/v1beta/search`. You must add this tool to the agent manually.
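Because a missing key usually surfaces only once the first request fails, it can help to check the environment up front. The following is a minimal sketch, not part of PlaywrightLLM: the helper names `configured_provider_keys` and `ensure_provider_key!` are hypothetical, and it assumes that one of the two provider keys is enough to proceed.

```ruby
# Hypothetical pre-flight check; these helpers are NOT part of PlaywrightLLM.
PROVIDER_KEYS = %w[OPENROUTER_API_KEY GEMINI_API_KEY].freeze

# Returns the provider key names that are set to a non-blank value.
# `env` is any Hash-like object; it defaults to the process environment.
def configured_provider_keys(env = ENV)
  PROVIDER_KEYS.reject { |name| env[name].to_s.strip.empty? }
end

# Raises ArgumentError unless at least one provider key is available
# (assumption: a single provider key is sufficient).
def ensure_provider_key!(env = ENV)
  return unless configured_provider_keys(env).empty?

  raise ArgumentError, "Set #{PROVIDER_KEYS.join(' or ')} before starting the agent"
end
```

Calling `ensure_provider_key!` before building the agent turns a late API failure into an immediate, readable error.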
```shell
bundle install                    # or ./bin/setup
npm install                       # pulls in Playwright
npx playwright install chromium   # downloads the browser binaries
```

```ruby
require "logger"
require "ruby_llm"
require "playwright_llm"

logger = Logger.new($stdout)
logger.level = Logger::DEBUG

RubyLLM.configure do |config|
  config.openrouter_api_key = ENV["OPENROUTER_API_KEY"]
  config.gemini_api_key = ENV["GEMINI_API_KEY"]
  config.default_model = "google/gemini-2.5-flash-preview-09-2025"
  config.use_new_acts_as = true
  config.logger = logger
end

PlaywrightLLM.configure do |config|
  config.logger = logger
  config.headless = true
  config.user_agent = ENV["PLAYWRIGHT_LLM_USER_AGENT"]
end
```

- `logger` defaults to an `INFO` logger printing to STDOUT; replace it with your own if you need `DEBUG` or file logging.
- `headless` defaults to `true`; pass `false` or use `--no-headless` in the CLI to open a visible browser.
- `user_agent` is optional; if set, it is forwarded to the Node launcher (`PLAYWRIGHT_LLM_USER_AGENT`).
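To illustrate how these settings could reach the Node launcher, here is a rough sketch of turning them into environment variables. The helper `launcher_env` is hypothetical, and the `"true"`/`"false"` string format for `PLAYWRIGHT_LLM_HEADLESS` is an assumption; the gem's own code may encode the value differently.

```ruby
# Sketch only: builds the environment a launcher process might receive.
# ASSUMPTION: PLAYWRIGHT_LLM_HEADLESS is passed as the string "true" or "false".
def launcher_env(headless: true, user_agent: nil)
  env = { "PLAYWRIGHT_LLM_HEADLESS" => headless ? "true" : "false" }
  # Only forward the user agent when one was actually configured.
  env["PLAYWRIGHT_LLM_USER_AGENT"] = user_agent if user_agent && !user_agent.empty?
  env
end
```

A hash like this can be given as the first argument to `Process.spawn`, which is one plausible way to start `js/launcher.js` with the right settings.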
```ruby
agent = PlaywrightLLM::Agent.from_provider_model(
  provider: "openrouter",
  model: "google/gemini-2.5-flash-preview-09-2025"
)
agent.launch
response = agent.ask("Navigate to https://example.com and summarize the header")
puts response.content
agent.close
```

- Always call `agent.launch` before asking so the CDP browser is running.
- Use `agent.with_instructions(<<~INSTRUCTIONS)` to steer the tool usage before you `launch`.
- Wrap the agent in a `begin`/`ensure` block so the background browser process is killed even when the prompt raises.
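The `begin`/`ensure` advice above can be wrapped in a small helper. This is a sketch under stated assumptions: `with_launched_agent` is a hypothetical name, not a method the gem provides, and it works with any object that responds to `launch`, `ask`, and `close`.

```ruby
# Hypothetical helper (not part of PlaywrightLLM): launches the agent,
# yields it to the block, and always closes it afterwards.
def with_launched_agent(agent)
  agent.launch
  begin
    yield agent
  ensure
    # Runs whether the block returned normally or raised,
    # so the background browser process is not left behind.
    agent.close
  end
end

# Example usage with the agent built above:
#
#   with_launched_agent(agent) do |a|
#     puts a.ask("Navigate to https://example.com and summarize the header").content
#   end
```

Exceptions raised inside the block still propagate to the caller; the helper only guarantees that `close` runs first.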
| Tool | Description |
|---|---|
| `PlaywrightLLM::Tools::Navigate` | Navigates to a URL and logs the HTTP status code. |
| `PlaywrightLLM::Tools::SlimHtml` | Returns cleaned HTML split in 80 000-character chunks (`page:` selects the chunk). |
| `PlaywrightLLM::Tools::Click` | Clicks a CSS selector, waits for `networkidle`, and reports the resulting URL/status. |
| `PlaywrightLLM::Tools::FullHtml` | Extracts the full HTML inside a selector (bodies are blocked to keep payloads manageable). |
| `PlaywrightLLM::Tools::ParallelSearch` | Calls the Parallel.ai search API with a single search query and returns structured JSON results. Not included by default. |
All tools depend on the Chromium session started by `js/launcher.js`, which the Ruby browser process creates before the tool scripts run.
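To make the `SlimHtml` paging concrete, the sketch below shows one way fixed-size chunking with a `page:` argument could work. It is an illustration in Ruby, not the gem's code (the real logic lives in the JavaScript helpers). The function name `html_page`, the 1-based page numbering, and the returned hash shape are all assumptions.

```ruby
CHUNK_SIZE = 80_000 # characters per chunk, as stated in the tool table

# Returns the requested chunk of `html` plus paging metadata.
# ASSUMPTIONS: pages are 1-based and out-of-range pages raise an error.
def html_page(html, page: 1, chunk_size: CHUNK_SIZE)
  # An empty document still counts as a single (empty) page.
  total = [(html.length.to_f / chunk_size).ceil, 1].max
  raise ArgumentError, "page must be between 1 and #{total}" unless (1..total).cover?(page)

  start = (page - 1) * chunk_size
  { page: page, total_pages: total, html: html[start, chunk_size] }
end
```

With this scheme a model can request `page: 1`, read `total_pages`, and fetch further chunks only when it needs them.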
If you already maintain a `RubyLLM::Chat` instance (for example, to reuse streaming callbacks), initialize the agent with `PlaywrightLLM::Agent.from_chat(rubyllm_chat: chat)` so it registers the browser tools on your existing chat client.
- `js/launcher.js` launches Chromium via Playwright, honors `PLAYWRIGHT_LLM_HEADLESS` and `PLAYWRIGHT_LLM_USER_AGENT`, and keeps the browser alive so Ruby tools can connect over `localhost:9222`.
- The scripts in `js/tools/` connect with `chromium.connectOverCDP('http://localhost:9222')`, reuse the first context/page, and log helpful `PLWLLM_LOG:` lines that the Ruby helpers forward to the configured `Logger`.
- `js/slim_html.js` encapsulates the DOM cleanup, pagination, and ID helpers used by the Ruby `SlimHtml` tool.
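The log forwarding described above can be pictured as a filter over a tool script's output. This is a minimal sketch, not the gem's implementation: `forward_tool_logs` is a hypothetical helper, and it assumes that log lines begin with `PLWLLM_LOG:` followed by free-form text and that they are forwarded at `DEBUG` level.

```ruby
require "logger"

LOG_PREFIX = "PLWLLM_LOG:"

# Sends prefixed lines to `logger` and returns the remaining output.
# ASSUMPTIONS: the text after the prefix is free-form, and DEBUG is the
# level used for forwarded lines.
def forward_tool_logs(output, logger)
  payload = []
  output.each_line do |raw|
    line = raw.chomp
    if line.start_with?(LOG_PREFIX)
      logger.debug(line.delete_prefix(LOG_PREFIX).strip)
    else
      payload << line
    end
  end
  payload.join("\n")
end
```

Separating log lines from the payload this way keeps diagnostic noise out of the text that is handed back to the model.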
- Run `bundle exec ruby bin/one_shot.rb [options] PROMPT` for a single-turn request.
- Options:
  - `--provider PROVIDER` (default: `openrouter`).
  - `--model MODEL` (default: `google/gemini-2.5-flash-preview-09-2025`).
  - `--[no-]headless` toggles the browser mode.
  - `--user-agent USER_AGENT` overrides the browser identity.
- Honors `OPENROUTER_API_KEY`, `GEMINI_API_KEY`, and `PLAYWRIGHT_LLM_USER_AGENT` environment variables.
- Logs token counts (`input_tokens`, `output_tokens`, `cached_tokens`) when your logger is in `DEBUG`.
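For readers who want to build a similar command-line entry point, the documented flags map naturally onto Ruby's standard `OptionParser`. The sketch below is not the contents of `bin/one_shot.rb`; `parse_cli` is a hypothetical function, and falling back to `PLAYWRIGHT_LLM_USER_AGENT` for the default user agent is an assumption about how the environment variable is honored.

```ruby
require "optparse"

# Sketch of a parser for the documented flags; NOT the actual bin/one_shot.rb.
# Returns [options_hash, prompt_string].
def parse_cli(argv)
  options = {
    provider: "openrouter",
    model: "google/gemini-2.5-flash-preview-09-2025",
    headless: true,
    user_agent: ENV["PLAYWRIGHT_LLM_USER_AGENT"] # assumed fallback
  }

  parser = OptionParser.new do |opts|
    opts.banner = "Usage: one_shot.rb [options] PROMPT"
    opts.on("--provider PROVIDER", "LLM provider") { |v| options[:provider] = v }
    opts.on("--model MODEL", "Model identifier") { |v| options[:model] = v }
    opts.on("--[no-]headless", "Run Chromium headless") { |v| options[:headless] = v }
    opts.on("--user-agent USER_AGENT", "Browser user agent") { |v| options[:user_agent] = v }
  end

  # parse (without !) leaves argv untouched and returns the non-option words.
  prompt_words = parser.parse(argv)
  [options, prompt_words.join(" ")]
end
```

The `--[no-]headless` form gives both `--headless` and `--no-headless` from a single declaration, which matches the flag spelling listed above.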
- Starts an interactive chat REPL, letting you craft multi-line prompts (finish with an empty line) and type `exit`/`quit` to end.
- Accepts the same `--[no-]headless` and `--user-agent` flags and picks sensible defaults if they are omitted.
- The CLI currently buffers replies until the model finishes; enable the streaming branch manually if you need chunked output.
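The multi-line input convention described above (an empty line ends the prompt, `exit` or `quit` ends the session) could be implemented roughly as follows. This is a sketch with a hypothetical `read_prompt` helper, not the REPL's actual code; ignoring blank lines before the prompt starts and matching `exit`/`quit` case-insensitively are assumptions.

```ruby
# Reads one multi-line prompt from `input` (any IO-like object).
# Returns the prompt text, or nil when the user typed exit/quit or input ended.
def read_prompt(input = $stdin)
  lines = []
  while (raw = input.gets)
    line = raw.chomp
    # exit/quit only counts as a command on the first line of a prompt.
    return nil if lines.empty? && %w[exit quit].include?(line.strip.downcase)

    if line.strip.empty?
      break unless lines.empty? # an empty line finishes a non-empty prompt
      next                      # assumption: skip blank lines before the prompt
    end
    lines << line
  end
  lines.empty? ? nil : lines.join("\n")
end
```

A REPL loop can then call `read_prompt` repeatedly and stop as soon as it returns `nil`.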
- Run `bin/setup` (= `bundle install`).
- Use `bin/console` to experiment interactively; bundler and Playwright are already wired in.
- Update `lib/playwright_llm/version.rb` before a release.
- Publish with `bundle exec rake release` (builds the gem, tags it, and pushes to RubyGems). Install locally with `bundle exec rake install`.
- There are no automated tests yet. If you add some, explain how to run them in this README.
MIT (see LICENSE.txt).