
Yes. Llama.cpp uses the CPU to do inference. MPS (Metal Performance Shaders) is the way to run on the MacBook's GPU, which has lots of highly parallel cores that are well suited to this kind of computation. When inference runs on the GPU, there's no (less?) energy wasted on general-purpose compute work. :)
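
For what it's worth, here is a minimal sketch of what "use MPS if it's there" can look like, assuming PyTorch (the comment above doesn't name a framework; `mps` is the device name PyTorch uses for the Apple GPU backend). `pick_device` is a hypothetical helper name, and it falls back to the CPU when PyTorch or MPS isn't available.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch can use the Apple GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch may or may not be installed
    except ImportError:
        return "cpu"
    # torch.backends.mps is missing in older PyTorch builds, so look it up safely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"running inference on: {device}")
```

You would then move the model and its inputs over with something like `model.to(device)`, so the same script still runs on machines without an Apple GPU.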