This is the model conflating correlation with causation. Perhaps with more data the spurious correlations would disappear, but the 'right' way is to make models learn causal world models.
Well, I think the future of LLMs is not just the pure LLM but the agentic one: LLMs with deterministic tools to ferret out specifics. We're only getting started here, but the results will be far better than what we get today.
Agentic LLMs provide value by themselves, to be sure, but they could also be part of learning a causal model. That's how humans do it: by interacting with the world.
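
To make the "interacting with the world" point concrete, here is a toy sketch, not anything from an actual LLM or agent system. It assumes a made-up world in which a hidden variable `z` drives both `x` and `y`, while `x` has no effect on `y`. The functions `observe` and `intervene` and all the numbers are hypothetical, chosen only for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def observe(n, rng):
    """Passive data: a hidden confounder z drives both x and y."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        xs.append(z + rng.gauss(0, 0.5))
        ys.append(z + rng.gauss(0, 0.5))
    return xs, ys


def intervene(n, rng):
    """Active data: the agent picks x itself, so x no longer depends on z."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        xs.append(rng.gauss(0, 1))        # x is set by the agent
        ys.append(z + rng.gauss(0, 0.5))  # y still depends only on z
    return xs, ys


rng = random.Random(0)  # fixed seed so reruns give the same numbers
obs_corr = pearson(*observe(20_000, rng))
int_corr = pearson(*intervene(20_000, rng))
print(f"observational correlation:  {obs_corr:.2f}")
print(f"interventional correlation: {int_corr:.2f}")
```

In the passive data `x` and `y` look strongly related; once the agent sets `x` itself, the relationship falls to roughly zero, which is the signal that `x` was never causing `y` in this toy world.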