Hacker News
by dcchambers 478 days ago
Historically, no. Ollama and similar tools have only used the CPU and GPU.

That said, there are efforts being made to use the NPU. See: https://github.com/Anemll/Anemll - you can now run small models directly on your Apple Silicon Mac's NPU.

It doesn't give better performance, but it's massively more power efficient than using the GPU.