	by atwrk 107 days ago
	Local LLM inference is all about memory bandwidth, and an M4 Pro only has about the same bandwidth as a Strix Halo or DGX Spark. That's why the older Ultras are popular with the local LLM crowd.
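
To put rough numbers on the bandwidth argument, here is a back-of-the-envelope sketch. It assumes token generation is purely bandwidth-bound, i.e. every generated token streams all model weights from memory once, so tokens per second is at most bandwidth divided by model size. The bandwidth values are approximate vendor-quoted peaks and the 40 GB model size is a hypothetical example, not figures from this thread; real throughput will be lower.

```python
# Rough ceiling on decode speed for a memory-bandwidth-bound LLM.
# Assumption: each generated token reads every weight from memory exactly once,
# so tokens/s <= bandwidth / model size. Compute, KV cache traffic, and
# achievable (vs. peak) bandwidth are ignored.

# Approximate vendor-quoted peak bandwidth in GB/s (illustrative assumptions).
BANDWIDTH_GB_PER_S = {
    "M4 Pro": 273,
    "Strix Halo": 256,
    "DGX Spark": 273,
    "M2 Ultra": 800,
}

# Hypothetical model footprint in GB, e.g. a ~70B model quantized to ~4.5 bits/weight.
MODEL_SIZE_GB = 40


def max_tokens_per_second(bandwidth_gb_per_s: float, model_size_gb: float) -> float:
    """Return the bandwidth-bound upper limit on generated tokens per second."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_per_s / model_size_gb


if __name__ == "__main__":
    for name, bandwidth in BANDWIDTH_GB_PER_S.items():
        limit = max_tokens_per_second(bandwidth, MODEL_SIZE_GB)
        print(f"{name:>10}: at most ~{limit:.1f} tokens/s")
```

Under these assumptions the three ~256-273 GB/s machines land within a few percent of each other, while an Ultra-class chip gets roughly three times the ceiling, which is the gap the comment is pointing at.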