| Thank you. It would be great if you could post it for me:

Title: Show HN: A Desktop App That Lets AI Control Your Browser For You

Link: https://github.com/upwindchange/Autai

Hey everyone,

I've been working on *Autai*, an open-source desktop app (Electron) that uses AI agents to automate your browser. You type what you want, and the AI opens a real browser and does it for you. It also works as a regular LLM chat interface, similar to Cherry Studio.

*Links:*

GitHub: [https://github.com/upwindchange/autai](https://github.com/upwindchange/autai)

Downloads: [https://github.com/upwindchange/autai/releases](https://github.com/upwindchange/autai/releases)

*Why this project:*
When I saw the Browser-Use project (the Python one), I had a "wow" moment and told myself this would be the future of AI helping people with everyday tasks. But when I tried it, it felt very geeky: you have to drive a Chrome browser through CDP debug mode, and the Python code runs in a terminal. I wondered whether I could build an easy-to-use version of Browser-Use with a nice interface and good human-in-the-loop interaction.

I explored a lot of options, e.g. building on top of Cherry Studio or LM Studio, or bundling the Python runtime and the whole Browser-Use project into the app. In the end, every shortcut came with a tradeoff, so I decided to write my own from scratch. So here I am after close to a year of engineering and trials, with the first POC release.

I wrote the front-end with assistant-ui and the back-end with the AI SDK. Plenty of other packages are involved; have a look at the package.json file in the project.

*Relationship to Browser-Use:*

You can think of this project as a TypeScript rewrite with its own agent, its own front-end, etc. More precisely, the CDP tool from the Browser-Use project is rewritten as the TypeScript CDP tool in this project. I wrote everything else.

This is THE project where you won't call an Electron app bloated, because I am using it as a browser and making full use of its Blink + Node + CDP debugger engines.

*What it can do:*

- *Browser Automation*: "Add these items to my Target cart", "Book a flight from NYC to SF on Friday", "Fill out this form". The AI plans the steps and executes them in a real browser session.
- *Research Mode*: Ask a question and it searches the web, reads multiple sources, and gives you a synthesized answer. No more 20-tab skimming.
- *Multi-session*: Run multiple browser tasks in parallel.
- *100+ AI providers, 4,000+ models*: Works with OpenAI, Anthropic, Google, DeepSeek, xAI, Ollama (local), and many more. Bring your own API key.

*You stay in control:*

The AI pauses for CAPTCHAs, logins, and payments and hands control back to you. There's a split view so you can watch everything the AI does in real time.

*Other nice touches:*

- Auto-tagged conversations with search and filtering
- Syntax highlighting, math rendering, Mermaid diagrams in AI responses
- Image and file attachments
- Dark/light mode
- English and Chinese UI

*Project status:*

Autai is in *active alpha development* and evolving fast. I'm heads-down building right now, so issues and feature requests are closed for the alpha phase; they'll open up once it reaches beta. That said, feedback and thoughts are always welcome here in the comments.

MIT licensed. Happy to answer questions about how it works or what's coming next. |