by vineyardmike
693 days ago
> this also means that because we've exhausted the human generated content by now as means of training LLMs, new models will start getting trained with mostly the output of other LLMs

There is also a rapidly growing industry of people whose job is to write content to train LMs on. I totally expect this to be a growing source of training data at the frontier, instead of more generic crap from the internet. Smaller models will probably continue to be trained on the output of bigger models, however.