Hi HN, I’d like to share an open-source project we’ve been working on for a while: Browser4.

The motivation came from a recurring frustration: most browser automation tools (Playwright, Selenium, Puppeteer) are excellent for human-written scripts, but start to show friction when used as a core execution layer for AI agents or at very high concurrency.

So instead of building “another wrapper around Playwright”, we experimented with a different direction: designing a browser engine where AI agents are first-class citizens.

### What Browser4 is

Browser4 is a browser automation engine built on native Chrome DevTools Protocol (CDP), with a focus on:

* Coroutine-safe concurrency (designed to run many browser sessions in parallel)
* Agent-oriented APIs (navigation, interaction, extraction as composable actions)
* Hybrid extraction: ML-agent-driven extraction + LLM extraction + structured selectors + an SQL-like DOM query language (X-SQL)
* Low-level control without Playwright-style abstraction overhead

It’s written in Kotlin/JVM, mainly because we needed predictable concurrency behavior and long-running stability under load.

The project is fully open-source (Apache 2.0).

### What it’s not

* It’s not a drop-in Playwright replacement.
* It’s not a no-code RPA tool.
* It’s not “LLM magic”: LLMs sit outside the browser engine.

Browser4 intentionally stays close to the browser execution layer and leaves planning/reasoning to external agent loops.

### Current use cases we’re testing

* Large-scale web data extraction
* Agentic workflows (search → navigate → extract → summarize)
* Price / content monitoring with frequent revisits
* High-concurrency crawling where browser startup and context switching are bottlenecks

On a single machine, we can sustain very high daily page visits, though we’re still validating benchmarks across different workloads.

### Open questions (where I’d love feedback)

* For agentic systems, does it make sense to bypass Playwright entirely and work closer to CDP?
* Where do you see the biggest pain points when combining LLMs with browser automation today?
* Is the JVM a reasonable choice here, or is Python still the better tradeoff despite its concurrency limits?
* What abstractions would you want in a browser engine built for AI agents?
### Links

* GitHub: https://github.com/platonai/browser4
* Website (light overview): https://browser4.io

Happy to answer technical questions or hear criticism, especially from people running browser automation or agent systems in production.

Thanks for reading.
Sequential execution with realistic timing delays is actually necessary for our use case. But I can see how other agent applications would benefit from true concurrency.
Are you handling session isolation between concurrent agents? That seems like it would be critical for avoiding state pollution.
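To make the isolation question concrete, here is a rough sketch of the property I have in mind. It is not Browser4's API: `SessionState`, `run_agent`, and `run_all` are hypothetical names, the "page load" is a placeholder, and I used Python's `asyncio` only because it is quick to read. Each agent gets its own state object, and a semaphore caps how many visits run at once.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical per-agent state, standing in for cookies, storage, etc."""
    agent_id: str
    cookies: dict = field(default_factory=dict)
    visited: list = field(default_factory=list)


async def run_agent(agent_id: str, urls: list[str], limit: asyncio.Semaphore) -> SessionState:
    # Each agent creates its own state; nothing is shared between agents.
    state = SessionState(agent_id=agent_id)
    for url in urls:
        async with limit:  # cap the number of simultaneous "page visits"
            await asyncio.sleep(0)  # placeholder for a real page load
            state.cookies["last_url"] = url
            state.visited.append(url)
    return state


async def run_all(jobs: dict[str, list[str]], max_parallel: int = 2) -> dict[str, SessionState]:
    limit = asyncio.Semaphore(max_parallel)
    results = await asyncio.gather(
        *(run_agent(agent_id, urls, limit) for agent_id, urls in jobs.items())
    )
    return {state.agent_id: state for state in results}


if __name__ == "__main__":
    jobs = {"a": ["https://example.com/1"], "b": ["https://example.com/2"]}
    for agent_id, state in asyncio.run(run_all(jobs)).items():
        print(agent_id, state.visited)
```

In a real engine the state object would presumably map to something like a separate browser context or profile per agent, but that part is a guess on my side.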