It won't be long before (if it isn't already happening) LLMs start training on text they wrote previously, and it becomes the largest echo chamber the internet has ever seen.
That could be damaging, or it could produce interesting results if it behaves more like an evolutionary algorithm than like entropy. That is, if it can iterate and improve on itself, instead of just taking in all information and treating it equally, we'll get something interesting.
I’m pretty sure model-generated text is already part of the training loop, even when it isn’t coming from the internet. It is definitely used for fine-tuning and distillation. As for how LLM producers avoid model collapse: they curate and filter.
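To make "curate and filter" a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the names `curate`, `normalize`, and `toy_score`, the 0.5 threshold, and the unique-word heuristic are all made up here, and real pipelines presumably use far more elaborate deduplication and learned quality classifiers. It only shows the general shape: drop near-identical samples, then keep what scores above a bar instead of treating all text equally.

```python
import hashlib


def normalize(text):
    # Lowercase and collapse whitespace so trivially different copies match.
    return " ".join(text.lower().split())


def toy_score(text):
    # Hypothetical quality heuristic: fraction of distinct words.
    # Highly repetitive text scores low. A stand-in for a real classifier.
    words = text.split()
    if not words:
        return 0.0
    return len({w.lower() for w in words}) / len(words)


def curate(samples, score=toy_score, min_score=0.5):
    # Keep the first copy of each normalized sample that meets the score bar.
    seen = set()
    kept = []
    for text in samples:
        key = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if key in seen:
            continue  # exact duplicate after normalization
        seen.add(key)
        if score(text) >= min_score:
            kept.append(text)
    return kept


if __name__ == "__main__":
    samples = [
        "Curate and filter the data.",
        "curate  and filter the data.",
        "spam spam spam spam",
    ]
    print(curate(samples))
```

The selection step is what would make this loop look more like an evolutionary algorithm than an echo chamber: only samples that pass the scoring function get fed back in.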