| HN Mirror


	by rajveerb 13 days ago
	As someone who has spent quite a lot of time on inference, I would add a small note: deployment looks very different for MoE models than for dense models, so it is more nuanced than "inference memory reqs remain the same". Memory requirements can be very different for MoE-style models.
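
	A rough sketch of one reason the memory picture can differ, assuming the usual MoE setup where every expert's weights stay resident while only a few experts run per token. All sizes below are made-up illustrative numbers, not figures for any real model, and the helper names are hypothetical. It counts weights only and ignores KV cache, activations, and any expert offloading or sharding.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB (2 bytes/param ~ fp16/bf16)."""
    return num_params * bytes_per_param / 2**30


def moe_param_counts(shared_params: int, params_per_expert: int,
                     num_experts: int, experts_per_token: int) -> tuple[int, int]:
    """Return (total, active) parameter counts for a simplified MoE model.

    total  = everything that must be stored to serve the model
    active = parameters actually used for a single token
    """
    total = shared_params + num_experts * params_per_expert
    active = shared_params + experts_per_token * params_per_expert
    return total, active


# Hypothetical configuration, chosen only to make the gap visible.
shared = 2_000_000_000        # attention, embeddings, etc. (assumed)
per_expert = 5_000_000_000    # parameters in each expert (assumed)

total, active = moe_param_counts(shared, per_expert,
                                 num_experts=8, experts_per_token=2)

resident_gib = weight_memory_gib(total)   # what the MoE model has to keep loaded
dense_gib = weight_memory_gib(active)     # a dense model with the same per-token compute

print(f"MoE weights resident:      {resident_gib:.1f} GiB")
print(f"Dense model of active size: {dense_gib:.1f} GiB")
```

	Under these assumptions the per-token compute tracks the active parameter count, while the weight memory tracks the total, which is why a single "memory stays the same" statement does not cover both model styles.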