by Imnimo 1209 days ago
While it's possible that it could repeat the calculations that arrived at "banana" (in a loose sense: the contents of the context window have shifted, so strictly speaking the exact calculations in the next forward pass will necessarily be different), there's nothing enforcing this. It's coming at it fresh, and even if it had output "a" on the basis that "banana" was a likely continuation, it could just as well decide on "plantain" given the new context. That's what I mean when I say it can't plan two steps ahead: any planning it does is lost by the time it gets to that second token.
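
To make that concrete, here's a toy sketch of the decoding loop as I'm describing it. Everything in it is made up for illustration: forward_pass is a hypothetical stand-in for one pass of the network (not a real model or library API), and the "plan" it returns is just a label for whatever continuation that pass had in mind. The point is only that the loop carries the emitted tokens forward and nothing else.

```python
from typing import List, Tuple


def forward_pass(context: List[str]) -> Tuple[str, str]:
    """Toy stand-in for one forward pass (hypothetical, not a real model).

    Returns (next_token, internal_plan). The plan is the continuation this
    pass "had in mind"; the decoding loop below never feeds it back in.
    """
    if context[-1] == "eat":
        # "a" is emitted because "banana" looks like a likely continuation.
        return "a", "banana"
    if context[-1] == "a":
        # A fresh pass over the new context: nothing forces "banana" here.
        return "plantain", "."
    return ".", ""


def generate(prompt: List[str], steps: int) -> Tuple[List[str], List[str]]:
    """Greedy token-by-token loop; only emitted tokens persist between steps."""
    tokens = list(prompt)
    discarded_plans = []
    for _ in range(steps):
        token, plan = forward_pass(tokens)  # sees the tokens so far, nothing else
        tokens.append(token)                # the token is carried forward
        discarded_plans.append(plan)        # the plan is recorded only for display
    return tokens, discarded_plans


tokens, plans = generate(["I", "want", "to", "eat"], steps=2)
print(tokens)
print(plans)
```

In this toy, the first pass picks "a" with "banana" in mind, but the second pass only sees the tokens ending in "a" and is free to land on "plantain".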

Further, the amount of actual planning (or thinking) it can do in a single forward pass is quite limited compared to what can be done over the course of a long output. That's why tricks like "let's think step-by-step" are so powerful. If it could plan out the entire response in one forward pass, it could equally output the answer directly. But the depth of the network limits multi-step reasoning. Holding the persistent long-term plan of "a sophisticated manipulator" (as the article calls it) seems clearly impossible.