| HN Mirror

	by vessenes 1152 days ago
	Yes. Llama.cpp uses the CPU to do inference. MPS (Metal Performance Shaders) is Apple's GPU compute backend on the MacBook, so it gives you highly performant GPU cores to do the computation on. When inference is done on the GPU, there's no (less?) energy wasted on general-purpose compute work. :)
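
	For anyone who wants to check which path their own machine would take, here is a minimal sketch. It assumes PyTorch, which isn't mentioned above and is only used for illustration; `pick_device` is a made-up helper name, not part of llama.cpp or PyTorch. It falls back to the CPU when PyTorch is missing or no Apple GPU is visible.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch can see an Apple GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is used only as an illustration
    except ImportError:
        return "cpu"  # no PyTorch installed, so stay on the CPU

    # torch.backends.mps does not exist before PyTorch 1.12, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

	On an Apple Silicon MacBook with a recent PyTorch build this should report the GPU backend; anywhere else it reports the CPU.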