Hacker News new | ask | show | jobs
by orbital-decay 71 days ago
>That the model already decided on a solution, upon first seeing the problem, and the reasoning is post hoc rationalization?

Yes it plans ahead, but with significant uncertainty until it actually outputs these tokens and converges on a definite trajectory, so it's not a useless filler - the closer it is to a given point, the more certain it is about it, kind of similar to what happens explicitly in diffusion models. And it's not all that happens, it's just one of many competing phenomena.