This isn't accurate. These models are trained and tuned to be correct; it's not a random occurrence.
Or are we talking about a higher probability of correctness?
https://arxiv.org/abs/2210.13382