Y
Hacker News
new
|
ask
|
show
|
jobs
by
merizian
543 days ago
Because of mup [0] and scaling laws, you can test ideas empirically on smaller models, with some confidence they will transfer to the larger model.
[0]
https://arxiv.org/abs/2203.03466