by lsb 509 days ago
I’ve been running half-billion-parameter models comfortably in a web browser, especially with WebGPU, and you can definitely run billion-parameter LLMs in the browser. It becomes a heavyweight browser app, but if your main cost is running ML models, you can pretty easily serve static files from a directory and let clients’ browsers do the heavy lifting. Feel free to reach out if you have questions. Happy to help; I’ve been working on language web apps as well.
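To make the "serve static files from a directory" part concrete, here is a minimal sketch using only Python's standard library. The function name `make_static_server` and the `./dist` folder are hypothetical, chosen for illustration; the comment above does not say which server or layout was actually used.

```python
import functools
import http.server


def make_static_server(directory, host="127.0.0.1", port=8000):
    """Build an HTTP server that serves files from `directory`.

    The server does no inference itself: it only hands out the page,
    scripts, and model weight files, and the browser does the rest.
    """
    # Bind the stock file-serving handler to the chosen directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # A threading server lets several downloads proceed at once.
    return http.server.ThreadingHTTPServer((host, port), handler)


if __name__ == "__main__":
    # "./dist" is an assumed folder holding index.html and the model files.
    server = make_static_server("./dist")
    print("Serving on http://%s:%d" % server.server_address)
    server.serve_forever()
```

This is meant for local experiments; Python's documentation advises against using `http.server` in production, where an ordinary static host or CDN fills the same role.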