| HN Mirror

	by pama 490 days ago
	I am sure DeepSeek did optimize the inference cost of R1. They have not yet released an efficient, scaled-down MoE version of it, i.e., an R1-mini.