Too expensive maybe, or just not effective anymore now that they've used up all the available training data. New data is generated slowly and is massively poisoned with AI-generated content, so it might be useless.
That's a lie people repeat because they want it to be true.
People evaluate dataset quality over time. There's no evidence that datasets from 2022 onwards perform any worse than ones from before 2022. There is some weak evidence of the opposite effect, cause unknown.
It's easy to make "model collapse" happen in lab conditions, but in real-world circumstances it fails to materialize.
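For anyone wondering what the "lab conditions" version looks like, here is a toy sketch, not a claim about how any real model is trained. It assumes the simplest possible setup: a "model" that is just a Gaussian, refit each generation only on a small sample drawn from the previous generation's fit, with no fresh real data mixed in. The function name `simulate_collapse` and its parameters are made up for illustration.

```python
import random
import statistics


def simulate_collapse(generations=500, sample_size=10, seed=0):
    """Refit a Gaussian on its own samples, generation after generation.

    Returns the fitted standard deviation at every generation,
    starting with the original "real data" distribution (std = 1.0).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the original distribution
    stds = [sigma]
    for _ in range(generations):
        # Train only on the previous generation's output (assumption:
        # no real data is ever added back in).
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        stds.append(sigma)
    return stds


if __name__ == "__main__":
    stds = simulate_collapse()
    for gen in (0, 100, 250, 500):
        print(f"generation {gen}: std = {stds[gen]:.3e}")
```

In this closed loop the fitted spread tends to shrink toward zero, which is the toy version of collapse. It says nothing about the real-world case, where synthetic data is mixed with human-written data and filtered.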
If OpenAI really are hitting a wall on scaling up, then the AI bubble will burst sooner than many are expecting.