by slowtrek
458 days ago
We got better and better models when we threw more and more compute at them? I gotta work on my snarkiness. Seriously, that's pretty good empirical evidence. The smaller models we get are all some kind of distillation or student model of a larger model, so none of them can claim not to be the result of large compute.