HN Mirror



	by gwern 5 hours ago
	Ensembling is neither compute-efficient nor parameter-efficient, so compression per se is a terrible application. (This is related to why people train ever-larger LLMs, such as a single 10T-parameter LLM, rather than 100 GPT-3-scale LLMs.)

1 comment

Yeah.