by llambada
1129 days ago
The Chinchilla paper only addresses the contrived use case of a model that is trained once and never used for inference. Since most of the real-world compute cost will be in inference, Chinchilla seems to offer little practical guidance.
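
A rough back-of-the-envelope sketch of that argument, assuming the common approximations of about 6·N·D FLOPs to train a dense model with N parameters on D tokens and about 2·N FLOPs per token at inference. Neither approximation comes from this comment, the function names are purely illustrative, and the 70B / 1.4T figures are the approximate Chinchilla model size and token count:

```python
# Back-of-the-envelope comparison of training vs. inference compute.
# Assumptions (not from the comment): training costs ~6 * N * D FLOPs and
# inference costs ~2 * N FLOPs per token for a dense transformer.

def training_flops(n_params: float, n_train_tokens: float) -> float:
    """Approximate total training compute (forward + backward pass)."""
    return 6.0 * n_params * n_train_tokens


def inference_flops(n_params: float, n_served_tokens: float) -> float:
    """Approximate total inference compute (forward pass only)."""
    return 2.0 * n_params * n_served_tokens


def breakeven_served_tokens(n_train_tokens: float) -> float:
    """Served tokens at which inference compute equals training compute.

    Solving 2 * N * T = 6 * N * D gives T = 3 * D, independent of N.
    """
    return 3.0 * n_train_tokens


if __name__ == "__main__":
    n_params = 70e9          # roughly Chinchilla-sized model
    n_train_tokens = 1.4e12  # roughly Chinchilla's training token count

    print(f"training FLOPs:   {training_flops(n_params, n_train_tokens):.3e}")
    print(f"break-even tokens: {breakeven_served_tokens(n_train_tokens):.3e}")
```

Under these assumptions, inference compute overtakes training compute once the model has served about three times as many tokens as it was trained on, which is why a training-only optimum may not be the deployment optimum.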