| HN Mirror

	by bick_nyers 543 days ago
	A dense 70B model will be slower, since DeepSeek is an MoE that only activates about 37B parameters per token. That's what makes CPU inference remotely feasible here.
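
A rough back-of-the-envelope sketch of why the active parameter count matters, assuming decode is memory-bandwidth bound and each active weight is read once per token. The bandwidth and quantization figures are made-up placeholders, not measurements, and the function name is just for illustration:

```python
def decode_tokens_per_second(active_params_billions, bytes_per_param, mem_bandwidth_gb_s):
    """Upper bound on decode speed if every active weight is read once per token."""
    gb_read_per_token = active_params_billions * bytes_per_param
    return mem_bandwidth_gb_s / gb_read_per_token


# Hypothetical values for illustration only, not benchmarks.
BANDWIDTH_GB_S = 200.0   # assumed CPU memory bandwidth
BYTES_PER_PARAM = 1.0    # assumed 8-bit quantized weights

moe = decode_tokens_per_second(37, BYTES_PER_PARAM, BANDWIDTH_GB_S)    # MoE, 37B active
dense = decode_tokens_per_second(70, BYTES_PER_PARAM, BANDWIDTH_GB_S)  # dense 70B

print(f"MoE, 37B active: {moe:.1f} tok/s upper bound")
print(f"Dense 70B:       {dense:.1f} tok/s upper bound")
print(f"Ratio:           {moe / dense:.2f}x")
```

Under these assumptions the ratio is just 70/37, so the bandwidth and quantization choices cancel out. It ignores compute limits, KV cache reads, and the fact that the full MoE still has to fit in RAM.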