by svc0
974 days ago
They are most certainly being fed LLM content. However, I think this "model collapse" narrative is oversold. Here are some things to keep in mind:

(1) Real content is not generated via a purely synthetic loop: humans use generative AI in complex ways, intermixing human-generated and AI-generated content. Imagine a person who writes the first draft of an essay, then uses ChatGPT to rewrite parts of it. There are certainly many human additions, modifications, and stylistic flourishes in the result.

(2) The most dramatic effects of model collapse were seen when training multiple generations of AI agents, each on content generated by the previous agent. This is a very academic scenario.

(3) These models already consume a lot of junk. RLHF is aimed at eliminating the resulting junk responses. I am not aware of any research that explores how the full training cycle is affected when RLHF is employed.

Also, there is a lot of training material out there that was not used by the original GPT-3 model. The primary limitation is hardware.
Edit: well, look at that. I'm not saying this was generated, but it might as well have been. These "learn from these repos" posts are everywhere now.
https://dev.to/triggerdotdev/17-javascript-repositories-to-b...