by brucethemoose2 878 days ago
Running "smart" LLMs locally takes a lot of RAM, a lot of compute, and a lot of disk space.

It produces a considerable amount of heat unless it's run on an NPU, which basically doesn't happen on desktops at the moment.

Hot loading and unloading the model can be slow, even from an SSD.

Users often multitask with Chrome in the background, and I think many would be very displeased to find Chrome bogging down their computer for reasons they may not be aware of.

Theoretically, Google could run a very small (under 2B parameters?) LLM with very fast quantization, and maybe even work out how to use desktop NPUs, but that would be one heck of an engineering feat to deploy at the scale of Chrome.
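
For a rough sense of scale, here is a back-of-envelope sketch of the size argument. It assumes weights dominate memory and ignores the KV cache, activations, and runtime overhead, so real usage would be higher. The function name is hypothetical, and the bit widths are illustrative rather than anything Google has announced.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width, and
    nothing else (KV cache, activations, runtime) is counted.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Illustrative bit widths: half precision, 8-bit, and 4-bit quantization.
    for bits in (16, 8, 4):
        print(f"2B params at {bits}-bit: ~{weight_memory_gib(2, bits):.2f} GiB")
```

Under these assumptions, a 2B-parameter model quantized to 4 bits needs a bit under 1 GiB for weights alone, versus roughly 3.7 GiB at 16 bits.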

1 comment

Honestly, that sounds extremely feasible, especially for a feature that isn't on by default. The one the parent comment references in Arc isn't on by default either. Also, Chrome eating up system resources is already a meme, and they've been working on using less by putting inactive tabs to sleep.