Usage
Use ./lightpanda help for all options.
Fetch a webpage
./lightpanda fetch --obey-robots --dump html https://demo-browser.lightpanda.io/campfire-commerce/
INFO http : navigate . . . . . . . . . . . . . . . . . . . . [+0ms]
url = https://demo-browser.lightpanda.io/campfire-commerce/
method = GET
reason = address_bar
body = false
INFO browser : executing script . . . . . . . . . . . . . . [+196ms]
src = https://demo-browser.lightpanda.io/campfire-commerce/script.js
kind = javascript
cacheable = true
INFO http : request complete . . . . . . . . . . . . . . . . [+223ms]
source = xhr
url = https://demo-browser.lightpanda.io/campfire-commerce/json/product.json
status = 200
INFO http : request complete . . . . . . . . . . . . . . . . [+234ms]
source = xhr
url = https://demo-browser.lightpanda.io/campfire-commerce/json/reviews.json
status = 200
<!DOCTYPE html>
Options
fetch command options
--dump Dumps document to stdout.
Argument must be 'html', 'markdown', 'semantic_tree', or 'semantic_tree_text'.
Defaults to no dump.
--strip-mode Comma-separated list of tag groups to remove from
the dump, e.g. --strip-mode js,css
- "js" includes script and link[as=script, rel=preload]
- "ui" includes img, picture, video, css and svg
- "css" includes style and link[rel=stylesheet]
- "full" includes js, ui and css
--with-base Add a <base> tag in dump. Defaults to false.
--with-frames Includes the contents of iframes. Defaults to false.
--wait-ms Wait time in milliseconds.
Defaults to 5000.
--wait-until Wait until the specified event.
Supported events: load, domcontentloaded, networkidle, done.
Defaults to 'done'.
--insecure-disable-tls-host-verification
Disables host verification on all HTTP requests. This is an
advanced option which should only be set if you understand
and accept the risk of disabling host verification.
--obey-robots
Fetches and obeys the robots.txt (if available) of the web pages
we make requests towards.
Defaults to false.
--http-proxy The HTTP proxy to use for all HTTP requests.
A username:password can be included for basic authentication.
Defaults to none.
--proxy-bearer-token
The <token> to send for bearer authentication with the proxy
Proxy-Authorization: Bearer <token>
--http-max-concurrent
The maximum number of concurrent HTTP requests.
Defaults to 10.
--http-max-host-open
The maximum number of open connections to a given host:port.
Defaults to 4.
--http-connect-timeout
The time, in milliseconds, for establishing an HTTP connection
before timing out. 0 means it never times out.
Defaults to 0.
--http-timeout
The maximum time, in milliseconds, the transfer is allowed
to complete. 0 means it never times out.
Defaults to 10000.
--http-max-response-size
Limits the acceptable response size for any request
(e.g. XHR, fetch, script loading, ...).
Defaults to no limit.
--log-level The log level: debug, info, warn, error or fatal.
Defaults to warn.
--log-format The log format: pretty or logfmt.
Defaults to logfmt.
--log-filter-scopes
Filter out too verbose logs per scope:
http, unknown_prop, event, ...
--user-agent-suffix
Suffix to append to the Lightpanda/X.Y User-Agent
--web-bot-auth-key-file
Path to the Ed25519 private key PEM file.
--web-bot-auth-keyid
The JWK thumbprint of your public key.
--web-bot-auth-domain
Your domain, e.g. yourdomain.com
See also how to configure proxy.
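The fetch options above combine into a single argument list. As a rough sketch, a Node.js script could assemble and validate that list before spawning the binary. The `buildFetchArgs` helper and its option names (`obeyRobots`, `dump`, `stripMode`, `waitUntil`, `waitMs`) are hypothetical and not part of Lightpanda; only flags documented above are emitted.

```javascript
// Hypothetical helper: builds the argument list for `lightpanda fetch`.
// The accepted values mirror the ones listed in the fetch command options.
const DUMP_MODES = ['html', 'markdown', 'semantic_tree', 'semantic_tree_text'];
const WAIT_EVENTS = ['load', 'domcontentloaded', 'networkidle', 'done'];

function buildFetchArgs(url, opts = {}) {
  const args = ['fetch'];
  if (opts.obeyRobots) args.push('--obey-robots');
  if (opts.dump !== undefined) {
    if (!DUMP_MODES.includes(opts.dump)) {
      throw new Error(`unsupported --dump value: ${opts.dump}`);
    }
    args.push('--dump', opts.dump);
  }
  if (Array.isArray(opts.stripMode) && opts.stripMode.length > 0) {
    // --strip-mode takes a comma-separated list of tag groups.
    args.push('--strip-mode', opts.stripMode.join(','));
  }
  if (opts.waitUntil !== undefined) {
    if (!WAIT_EVENTS.includes(opts.waitUntil)) {
      throw new Error(`unsupported --wait-until value: ${opts.waitUntil}`);
    }
    args.push('--wait-until', opts.waitUntil);
  }
  if (opts.waitMs !== undefined) args.push('--wait-ms', String(opts.waitMs));
  // The URL goes last, as in the fetch example above.
  args.push(url);
  return args;
}
```

The resulting array can be handed to Node's `child_process.execFile('./lightpanda', args, callback)`, which avoids shell quoting issues with URLs.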
CDP server
To control Lightpanda with Chrome DevTools Protocol (CDP) clients like Playwright or Puppeteer, you need to start the browser as a CDP server.
./lightpanda serve --obey-robots --host 127.0.0.1 --port 9222
INFO app : server running . . . . . . . . . . . . . . . . . [+0ms]
address = 127.0.0.1:9222
serve command options
--host Host of the CDP server
Defaults to "127.0.0.1"
--port Port of the CDP server
Defaults to 9222
--advertise-host
The host to advertise, e.g. in the /json/version response.
Useful, for example, when --host is 0.0.0.0.
Defaults to --host value
--cdp-max-connections
Maximum number of simultaneous CDP connections.
Defaults to 16.
--cdp-max-pending-connections
Maximum pending connections in the accept queue.
Defaults to 128.
--insecure-disable-tls-host-verification
Disables host verification on all HTTP requests. This is an
advanced option which should only be set if you understand
and accept the risk of disabling host verification.
--obey-robots
Fetches and obeys the robots.txt (if available) of the web pages
we make requests towards.
Defaults to false.
--http-proxy The HTTP proxy to use for all HTTP requests.
A username:password can be included for basic authentication.
Defaults to none.
--proxy-bearer-token
The <token> to send for bearer authentication with the proxy
Proxy-Authorization: Bearer <token>
--http-max-concurrent
The maximum number of concurrent HTTP requests.
Defaults to 10.
--http-max-host-open
The maximum number of open connections to a given host:port.
Defaults to 4.
--http-connect-timeout
The time, in milliseconds, for establishing an HTTP connection
before timing out. 0 means it never times out.
Defaults to 0.
--http-timeout
The maximum time, in milliseconds, the transfer is allowed
to complete. 0 means it never times out.
Defaults to 10000.
--http-max-response-size
Limits the acceptable response size for any request
(e.g. XHR, fetch, script loading, ...).
Defaults to no limit.
--log-level The log level: debug, info, warn, error or fatal.
Defaults to warn.
--log-format The log format: pretty or logfmt.
Defaults to logfmt.
--log-filter-scopes
Filter out too verbose logs per scope:
http, unknown_prop, event, ...
--user-agent-suffix
Suffix to append to the Lightpanda/X.Y User-Agent
--web-bot-auth-key-file
Path to the Ed25519 private key PEM file.
--web-bot-auth-keyid
The JWK thumbprint of your public key.
--web-bot-auth-domain
Your domain, e.g. yourdomain.com
See also how to configure proxy.
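CDP clients connect through a WebSocket URL, and the --advertise-host option above refers to a /json/version response. In Chrome that response carries a `webSocketDebuggerUrl` field; the sketch below assumes Lightpanda's response exposes the same field, which is worth checking against your build. The helper names `wsEndpointFromVersion` and `discoverWsEndpoint` are hypothetical.

```javascript
// Extracts the WebSocket endpoint from a parsed /json/version body.
// Assumption: the body has a `webSocketDebuggerUrl` string, as in Chrome.
function wsEndpointFromVersion(body) {
  const endpoint = body && body.webSocketDebuggerUrl;
  if (typeof endpoint !== 'string' || !/^wss?:\/\//.test(endpoint)) {
    throw new Error('no webSocketDebuggerUrl in /json/version response');
  }
  return endpoint;
}

// Queries a running `lightpanda serve` instance.
// Relies on the global fetch available in Node.js 18 and later.
async function discoverWsEndpoint(host = '127.0.0.1', port = 9222) {
  const res = await fetch(`http://${host}:${port}/json/version`);
  if (!res.ok) throw new Error(`unexpected status ${res.status}`);
  return wsEndpointFromVersion(await res.json());
}
```

The returned URL can then be passed as `browserWSEndpoint` to `puppeteer.connect` or as the argument to `chromium.connectOverCDP`.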
Connect with Puppeteer
Once the CDP server is started, you can run a Puppeteer
script by configuring the browserWSEndpoint.
'use strict'
import puppeteer from 'puppeteer-core'
// Use browserWSEndpoint to pass Lightpanda's CDP server address.
const browser = await puppeteer.connect({
  browserWSEndpoint: "ws://127.0.0.1:9222",
})
// The rest of your script remains the same.
const context = await browser.createBrowserContext()
const page = await context.newPage()
// Dump all the links from the page.
await page.goto('https://wikipedia.com/')
const links = await page.evaluate(() => {
  return Array.from(document.querySelectorAll('a')).map(row => {
    return row.getAttribute('href')
  })
})
console.log(links)
await page.close()
await context.close()
await browser.disconnect()
Connect with Playwright
Try Lightpanda with Playwright by using
chromium.connectOverCDP to connect.
import { chromium } from 'playwright-core';
// Use connectOverCDP to pass Lightpanda's CDP server address.
const browser = await chromium.connectOverCDP('ws://127.0.0.1:9222');
// The rest of your script remains the same.
const context = await browser.newContext({});
const page = await context.newPage();
await page.goto('https://wikipedia.com/');
const title = await page.locator('h1').textContent();
console.log(title);
await page.close();
await context.close();
await browser.close();
Connect with Chromedp
Use Lightpanda with Chromedp, a Go client for CDP servers.
package main
import (
	"context"
	"log"
	"github.com/chromedp/chromedp"
)
func main() {
	// Connect to the running Lightpanda CDP server rather than launching a browser.
	allocatorContext, cancel := chromedp.NewRemoteAllocator(context.Background(),
		"ws://127.0.0.1:9222", chromedp.NoModifyURL,
	)
	defer cancel()
	// Create a browser context on top of the remote allocator.
	ctx, cancel := chromedp.NewContext(allocatorContext)
	defer cancel()
	var title string
	if err := chromedp.Run(ctx,
		chromedp.Navigate("https://wikipedia.com/"),
		chromedp.Title(&title),
	); err != nil {
		log.Fatalf("Failed getting page's title: %v", err)
	}
	log.Println("Got title of:", title)
}
MCP server
Starts an MCP (Model Context Protocol) server over stdio.
./lightpanda mcp
Tools
| Name | Description |
|---|---|
| goto | Navigate to a specified URL and load the page in memory so it can be reused later for info extraction |
| markdown | Get the page content in markdown format. If a url is provided, it navigates to that url first. |
| links | Extract all links in the opened page. If a url is provided, it navigates to that url first. |
| evaluate | Evaluate JavaScript in the current page context. If a url is provided, it navigates to that url first. |
| semantic_tree | Get the page content as a simplified semantic DOM tree for AI reasoning. If a url is provided, it navigates to that url first. |
| interactiveElements | Extract interactive elements from the opened page. If a url is provided, it navigates to that url first. |
| structuredData | Extract structured data (like JSON-LD, OpenGraph, etc) from the opened page. If a url is provided, it navigates to that url first. |
| detectForms | Detect all forms on the page and return their structure including fields, types, and required status. If a url is provided, it navigates to that url first. |
| click | Click on an interactive element. Returns the current page URL and title after the click. |
| fill | Fill text into an input element. Returns the filled value and current page URL and title. |
| scroll | Scroll the page or a specific element. Returns the scroll position and current page URL and title. |
| waitForSelector | Wait for an element matching a CSS selector to appear in the page. Returns the backend node ID of the matched element. |
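Because the server speaks MCP over stdio, a client drives the tools above by writing newline-delimited JSON-RPC 2.0 messages to the process's stdin. The sketch below only builds those messages. The helper names, the client name, and the `2024-11-05` protocol version string are illustrative choices, and the `url` argument name is an assumption drawn from the tool descriptions; a `tools/list` request would return the actual input schema.

```javascript
// Serializes one JSON-RPC 2.0 request as a single line,
// which is how the MCP stdio transport frames messages.
function mcpRequest(id, method, params) {
  return JSON.stringify({ jsonrpc: '2.0', id, method, params }) + '\n';
}

// Builds a tools/call request for one of the tools in the table above.
// Assumption: tool inputs are passed as named arguments such as `url`.
function toolCall(id, name, args = {}) {
  return mcpRequest(id, 'tools/call', { name, arguments: args });
}

const messages = [
  // Handshake: initialize request, then the initialized notification (no id).
  mcpRequest(1, 'initialize', {
    protocolVersion: '2024-11-05',
    capabilities: {},
    clientInfo: { name: 'example-client', version: '0.0.1' },
  }),
  JSON.stringify({ jsonrpc: '2.0', method: 'notifications/initialized' }) + '\n',
  // Load a page, then ask for its markdown rendering.
  toolCall(2, 'goto', { url: 'https://demo-browser.lightpanda.io/campfire-commerce/' }),
  toolCall(3, 'markdown'),
];
```

Each string in `messages` could be written in order to the stdin of a spawned `./lightpanda mcp` process, with responses read line by line from its stdout.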
Options
--insecure-disable-tls-host-verification
Disables host verification on all HTTP requests. This is an
advanced option which should only be set if you understand
and accept the risk of disabling host verification.
--obey-robots
Fetches and obeys the robots.txt (if available) of the web pages
we make requests towards.
Defaults to false.
--http-proxy The HTTP proxy to use for all HTTP requests.
A username:password can be included for basic authentication.
Defaults to none.
--proxy-bearer-token
The <token> to send for bearer authentication with the proxy
Proxy-Authorization: Bearer <token>
--http-max-concurrent
The maximum number of concurrent HTTP requests.
Defaults to 10.
--http-max-host-open
The maximum number of open connections to a given host:port.
Defaults to 4.
--http-connect-timeout
The time, in milliseconds, for establishing an HTTP connection
before timing out. 0 means it never times out.
Defaults to 0.
--http-timeout
The maximum time, in milliseconds, the transfer is allowed
to complete. 0 means it never times out.
Defaults to 10000.
--http-max-response-size
Limits the acceptable response size for any request
(e.g. XHR, fetch, script loading, ...).
Defaults to no limit.
--log-level The log level: debug, info, warn, error or fatal.
Defaults to warn.
--log-format The log format: pretty or logfmt.
Defaults to logfmt.
--log-filter-scopes
Filter out too verbose logs per scope:
http, unknown_prop, event, ...
--user-agent-suffix
Suffix to append to the Lightpanda/X.Y User-Agent
--web-bot-auth-key-file
Path to the Ed25519 private key PEM file.
--web-bot-auth-keyid
The JWK thumbprint of your public key.
--web-bot-auth-domain
Your domain, e.g. yourdomain.com
Claude Desktop / Cursor / Windsurf
Add to your MCP host configuration:
- Claude Desktop: Settings > Developer > Edit Config
- Cursor: .cursor/mcp.json in your project
- Windsurf: Cascade MCP settings
{
  "mcpServers": {
    "lightpanda": {
      "command": "/path/to/lightpanda",
      "args": ["mcp"]
    }
  }
}
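When the host configuration already lists other servers, the lightpanda entry has to be merged into the existing `mcpServers` object rather than replacing the file. A minimal sketch of that merge, assuming the configuration has already been parsed into a plain object; `addLightpandaServer` is a hypothetical helper and `binaryPath` stands in for the location of your own binary.

```javascript
// Returns a copy of an MCP host config with a "lightpanda" entry added.
// Existing servers are preserved and the input object is not mutated.
function addLightpandaServer(config, binaryPath) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers || {}),
      lightpanda: { command: binaryPath, args: ['mcp'] },
    },
  };
}
```

Serializing the result with `JSON.stringify(updated, null, 2)` produces a file in the same shape as the configuration above.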