by typest 1182 days ago
This is not really true. The Chinchilla paper showed that a 4% difference in loss between Chinchilla and Gopher led Chinchilla to blow Gopher out of the water at most tasks, including 30x performance in physics.

Empirically, LLMs have been shown to have emergent abilities that appear at different loss levels. So a 10% difference could really matter.

1 comment

That ten percent is not loss; it is parameter count.