Scrape the modern web
Lightpanda is the open-source browser made for headless usage.
Fast scraping and web automation with minimal memory footprint.
Execution time: ~7 times faster
Memory peak: ~9 times less memory used
Featured in Lightpanda
Ultra-low memory footprint
Blazingly Fast
Instant startup
Keep your daily tools
Playwright, Puppeteer
Preview demo
On this mock e-commerce webpage, important information such as product name, price, features, and reviews is fetched through XHR requests.
We tested it locally with 3 different tools:
- cURL: cannot execute JavaScript, so it does not retrieve the data
- Chrome headless: accurate data, but slow and heavy (180 MB RAM)
- Lightpanda: same output as Chrome, but 60x faster while using 12x less memory
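To make the cURL result concrete, here is a minimal Python sketch of what a client that does not run JavaScript sees on such a page. The HTML snippet, the element ids and the `/api/product.json` path are hypothetical stand-ins, not the actual demo page; the point is only that the fields stay empty until a script fills them in.

```python
from html.parser import HTMLParser

# Hypothetical static HTML, as a plain HTTP client would receive it.
# The product fields are empty; a script fills them in after an XHR call.
STATIC_HTML = """
<html><body>
  <h1 id="product-name"></h1>
  <span id="product-price"></span>
  <script>
    var xhr = new XMLHttpRequest();
    xhr.open('GET', '/api/product.json');
    xhr.onload = function () {
      var p = JSON.parse(xhr.responseText);
      document.getElementById('product-name').textContent = p.name;
      document.getElementById('product-price').textContent = p.price;
    };
    xhr.send();
  </script>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text content of every element that has an id attribute."""

    def __init__(self):
        super().__init__()
        self.text = {}
        self._stack = []  # id (or None) of each currently open element

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        self._stack.append(element_id)
        if element_id is not None:
            self.text[element_id] = ""

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only keep text that sits directly inside an element with an id.
        if self._stack and self._stack[-1] is not None:
            self.text[self._stack[-1]] += data.strip()


def extract_static_fields(html):
    """Parse the HTML without executing any script and return id -> text."""
    parser = TextById()
    parser.feed(html)
    return parser.text


fields = extract_static_fields(STATIC_HTML)
print(fields)
```

Both fields come back as empty strings because the script is never executed. A browser that runs JavaScript would perform the XHR call and fill them in.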
JavaScript execution is mandatory for the modern web
Back in the good old days, grabbing a webpage was as easy as making an HTTP request, cURL-like. That is no longer possible, because JavaScript is everywhere, like it or not.
- Ajax, Single Page App, Infinite loading, “click to display”, instant search, etc.
- JS web frameworks: React, Vue, Angular & others
Chrome is not the right tool
So if we need JavaScript, why not use a real web browser? Let’s take a huge desktop application, hack it, and run it on the server, right? That means hundreds of instances of Chrome if you use it at scale. Are you sure it’s such a good idea?
- Heavy on RAM and CPU, expensive to run
- Hard to package, deploy and maintain at scale
- Bloated, lots of features are not useful in headless usage
Lightpanda is built for performance
If we want both JavaScript and performance in a real headless browser, we need to start from scratch. Not yet another iteration of Chromium, but really from a blank page. Crazy, right? But that’s what we did: enter Lightpanda.
- Not based on Chromium, Blink or WebKit
- Low-level system programming language (Zig) with optimisations in mind
- Opinionated, no rendering
Timeline
Q2 2022: Beginning of the project
2022-2023: Development phase
Feb 2024: Private Alpha release
Q2 2024: Open source and public beta
Q4 2024: Cloud version
Frequently Asked Questions
- Can you explain the benchmarks? Is it a fair comparison?
We all love benchmarks but we also know how difficult it can be to have a fair comparison. That is why it was very important for us to be transparent about our protocol for the benchmark.
What are we testing?
A mock e-commerce web page making an XHR call with a JSON list of products and reviews, updating the DOM with the fetched data. The code is available on the GitHub repository.
How are we testing it?
By launching both binaries (Google Chrome v122 and Lightpanda) and dumping the resulting web page (i.e. with the DOM modifications).
What metrics are you looking for?
Execution time and peak memory.
What tools are you using?
We use Hyperfine to measure execution time. It lets us launch the tests multiple times, which reduces the impact of warmup and removes the best and worst iterations. For peak memory we use GNU Time with the -v option.
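For readers who want to try a comparable measurement without those tools, here is a rough Python sketch of the same idea: run a command several times, drop the best and worst runs, and read the peak resident memory of the child processes. It is not our benchmark script and does not replace Hyperfine or GNU Time. The `benchmark` helper and the placeholder command are assumptions for illustration; substitute the actual browser command line you want to measure. The `resource` module is only available on Unix-like systems.

```python
import resource
import statistics
import subprocess
import sys
import time


def benchmark(command, runs=10, warmup=2):
    """Run `command` repeatedly and return timing and peak-memory figures."""
    # Warmup runs are executed but not timed.
    for _ in range(warmup):
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)

    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)

    # Drop the fastest and the slowest run before averaging.
    trimmed = sorted(durations)[1:-1] if runs > 2 else durations

    # ru_maxrss is the largest resident set size among all child processes
    # this script has waited for so far, warmup runs included
    # (kilobytes on Linux, bytes on macOS).
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss

    return {
        "mean_s": statistics.mean(trimmed),
        "runs_kept": len(trimmed),
        "peak_rss": peak_rss,
    }


# Placeholder command (an assumption): a trivial Python child process
# standing in for the browser binary that dumps the page.
PLACEHOLDER_COMMAND = [sys.executable, "-c", "print('dumped DOM')"]

result = benchmark(PLACEHOLDER_COMMAND, runs=5, warmup=1)
print(result)
```

Because `RUSAGE_CHILDREN` reports a single maximum across every child, each binary should be measured in a separate invocation of the script to keep the memory figures apart.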
It is a pretty basic benchmark and we intend to improve it as Lightpanda grows, adding new metrics and protocols (e.g. testing parallel executions in server mode).
- Is it open source? How can I get it?
Yes - Lightpanda is open-source under the AGPL-v3 license.
The source code is available on our GitHub repository.
For this project we have also developed our own Zig-to-JavaScript runtime, available under the Apache 2.0 license.
- What libraries are you using?
For JavaScript execution we use the V8 engine (Chromium, Node) for state-of-the-art performance and compatibility.
We plan to add a lighter non-JIT JavaScript engine (probably Fabrice Bellard’s QuickJS) as an alternative, to drastically reduce binary size and memory usage. It will also allow embedding scenarios (as a WASM module, a lib and so on).
For HTML parsing and DOM tree manipulation we use libhubbub and libdom from the NetSurf project.
Our I/O event loop is based on the TigerBeetle one.
- Who is behind Lightpanda?
We are a young company based in Paris, co-founded by Francis Bouvier and Pierre Tachoire. We have been working on this project for the past 2 years.
Francis is a software developer and entrepreneur, former CTO and co-founder of an e-commerce startup (BlueBoard, sold in 2020 to ChannelAdvisor, NYSE: ECOM).
Pierre is a software developer, former software engineer at BlueBoard and ChannelAdvisor.
- Are you going to have a commercial offer?
Yes - our plan is to have cloud-based and on-premise versions of Lightpanda with support, SLAs and additional tools.