| HN Mirror



	by chaoyu 1097 days ago
	Smaller models are likely cheaper to run inference on and don't necessarily need the latest GPUs. Larger language models tend to perform better across a wider range of tasks. But for a specific enterprise use case, either distilling a large model or using a large model to help train a smaller one can be quite helpful in getting things to production, where you may need cost-efficiency and lower latency.
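
	For anyone unfamiliar with distillation, here is a rough sketch of the usual soft-label loss: the small "student" model is trained to match the temperature-softened output distribution of the large "teacher". This is only an illustration, not something from a specific library. The function names, the example logits, and the temperature of 2.0 are all assumptions, and a real setup would compute this inside a training framework and typically mix it with the ordinary task loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the common convention for keeping gradient
    magnitudes comparable across temperatures. The default of 2.0
    is an arbitrary choice for this sketch.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


# Hypothetical logits for a 3-class task, made up for illustration.
teacher = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]
bad_student = [0.5, 1.0, 4.0]

print(distillation_loss(good_student, teacher))
print(distillation_loss(bad_student, teacher))
```

	The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so the student that ranks the classes like the teacher gets the lower value.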