Hacker News
by
gwern
5 hours ago
Ensembling is neither compute- nor parameter-efficient, so compression per se is a terrible application. (This is related to why people train ever-larger LLMs, like a single 10T-parameter LLM, rather than 100 GPT-3-scale LLMs.)
1 comment
SubiculumCode
3 hours ago
Yeah.