by mrtranscendence
1093 days ago
There are such datasets, and AI companies absolutely pay to have data curated. But I suspect it would be unimaginably expensive to create a dataset from scratch with enough tokens to train a model with hundreds of billions of parameters while paying every participant fairly.
I wonder what an LLM trained on Google code and internal documents would look like.