by Labo333 1152 days ago
Can you explain why they have a "significantly lower energy usage"? Thanks!
1 comment

Yes. Llama.cpp runs inference on the CPU. MPS (Metal Performance Shaders) is Apple's framework for running compute on the MacBook's GPU, so it gives you access to highly performant cores that can do the computation. When inference runs on the GPU, there's no (less?) energy wasted on general-purpose compute work. :)
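
To make the energy argument a bit more concrete, here's a rough sketch of how you could compare the two: energy per token is just average power draw divided by throughput. The function name `energy_per_token` and the numbers below are my own placeholders for illustration, not measurements of llama.cpp or MPS.

```python
def energy_per_token(avg_power_watts: float, tokens_per_second: float) -> float:
    """Joules spent per generated token: power (J/s) divided by throughput (tokens/s)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return avg_power_watts / tokens_per_second


# Placeholder numbers for illustration only; they are NOT measurements.
cpu_joules = energy_per_token(avg_power_watts=30.0, tokens_per_second=10.0)
gpu_joules = energy_per_token(avg_power_watts=20.0, tokens_per_second=20.0)
print(f"CPU: {cpu_joules:.2f} J/token, GPU: {gpu_joules:.2f} J/token")
```

If you want real numbers, you'd have to measure power draw and tokens per second on your own machine and plug them in; the point is only that a device that finishes the same work faster at similar or lower power uses less energy overall.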