	by int_19h 1153 days ago
	The GPU isn't actually used by llama.cpp. What makes it that much faster is that the workload, on either CPU or GPU, is very memory-intensive, so it benefits greatly from fast RAM. And Apple is using LPDDR5 with very high bandwidth for this shared memory stuff. It's still noticeably slower than a GPU, though.
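
	A rough way to see why memory speed dominates: if generating each token means reading every weight once, then memory bandwidth divided by model size caps the tokens per second. The sketch below only does that arithmetic. The function name, the model size, and both bandwidth figures are made-up illustrative assumptions, not measurements of any real machine or of llama.cpp.

```python
def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on generation speed, assuming every weight is read once per token.

    This ignores compute time, caches, and KV-cache traffic, so real numbers
    will be lower. It is a back-of-the-envelope bound, not a benchmark.
    """
    if model_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("model size and bandwidth must be positive")
    return bandwidth_bytes_per_s / model_bytes


# Hypothetical model: 4 GB of quantized weights (assumed, for illustration only).
MODEL_BYTES = 4e9

# Hypothetical memory systems (assumed bandwidths, not vendor specs).
SYSTEMS = [
    ("slower RAM, 50 GB/s", 50e9),
    ("faster RAM, 400 GB/s", 400e9),
]

for label, bandwidth in SYSTEMS:
    bound = max_tokens_per_second(MODEL_BYTES, bandwidth)
    print(f"{label}: at most {bound:.1f} tokens/s")
```

	Under these assumptions the bound scales linearly with bandwidth, which is the point being made: the same model on memory that is 8x faster can generate up to 8x more tokens per second, whether the math runs on CPU or GPU.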