by YetAnotherNick
1025 days ago
I think they should focus more on finetuning. Finetuning a pretrained model is almost always better than pretraining from scratch, even if the pretraining dataset is very different from the finetuning dataset. If I could train a 30B model for $10 on a few tens of millions of tokens (basically proportional to the current rate), I would definitely use it.