HN Mirror

by EagnaIonat 3 hours ago

With a dedicated GPU, the lag is in transferring data to the GPU. You don't have that lag on ARM machines with unified memory, like Apple silicon, because the CPU and GPU share the same memory.

But it really depends on what you want to do. A recent MLX-optimised model will run fine and at decent speeds. Granite4.1 (a few months old), for example, takes up 2GB of memory, is insanely fast, and gives good results compared with much bigger models like gpt-oss-120b (a year old). It even runs at good speeds on an M1 Mac.
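
To put the transfer lag in rough numbers, here is a back-of-envelope sketch in Python. The 2GB figure is the model size quoted above; the 32 GB/s link speed is my own assumption (roughly the theoretical peak of a PCIe 4.0 x16 slot), and `transfer_seconds` is just a hypothetical helper name. It only estimates a single copy of the weights at peak bandwidth, so treat it as a lower bound rather than a measurement.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Estimate the time to copy `size_gb` gigabytes over a link
    that sustains `bandwidth_gb_per_s` gigabytes per second."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


MODEL_GB = 2.0        # model size quoted in the comment
PCIE_GB_PER_S = 32.0  # ASSUMPTION: approximate PCIe 4.0 x16 peak, not measured

one_copy = transfer_seconds(MODEL_GB, PCIE_GB_PER_S)
print(f"One copy of the weights to a dedicated GPU: ~{one_copy * 1000:.1f} ms at best")
# With unified memory there is no such copy to make in the first place.
```

Real-world throughput is usually below the theoretical peak, and any data that has to move back and forth during a run adds to this.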

The models are only getting better.