by newfocogi
480 days ago
I think this is the correct take. There are other axes to scale on AND I expect we'll see smaller and smaller models approach this level of pre-trained performance. But I believe massive pre-training gains have clearly hit diminishing returns (until I see evidence otherwise).