by gsjbjt
1181 days ago
The point is that LLMs can’t backtrack after deciding on a token. So the probability at least one token along a long generation will lead you down the wrong path does indeed increase as the sequence gets longer (especially since we typically sample from these things), whereas humans can plan their outputs in advance, revise/refine, etc.

But more to the point, in the deck provided, LeCun's point is _not_ about backtracking per se. The highlighted / red text on the preceding slide is:
> LLMs have no knowledge of the underlying reality
> They have no common sense & they can't plan their answer

Now, we generally generate from LLMs by sampling strictly forward, one token at a time, but it isn't hard to use essentially the same structure to generate tokens conditioned on both the preceding and the following sequence. If you ran generation for tokens 1...n, and then ran m iterations of re-sampling an internal token i based on (1..i-1, i+1..n), it would sometimes "fix" issues created by the initial generation pass. It would also sometimes introduce new issues in places that were fine in the original generation. Process-wise, it would look a lot like MCMC at generation time.
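
To make the procedure concrete, here is a rough sketch using a toy stand-in for the model. Everything in it is an assumption for illustration: `VOCAB`, the `BIGRAM` table, and the function names are made up, and a bigram table is obviously nothing like an LLM. The one convenient property of the toy is that conditioning token i on (1..i-1, i+1..n) reduces to conditioning on its two neighbours, so the re-sampling step is an exact Gibbs update.

```python
import random

# Hypothetical toy "model": a bigram transition table over a tiny vocabulary.
# BIGRAM[prev][nxt] plays the role of p(nxt | prev).
VOCAB = ["a", "b", "c"]
BIGRAM = {
    "a": {"a": 0.1, "b": 0.8, "c": 0.1},
    "b": {"a": 0.1, "b": 0.1, "c": 0.8},
    "c": {"a": 0.8, "b": 0.1, "c": 0.1},
}


def sample_forward(n, rng):
    """Ordinary generation: sample tokens 1..n left to right, never revisiting."""
    tokens = [rng.choice(VOCAB)]
    while len(tokens) < n:
        probs = BIGRAM[tokens[-1]]
        weights = [probs[t] for t in VOCAB]
        tokens.append(rng.choices(VOCAB, weights=weights)[0])
    return tokens


def resample_token(tokens, i, rng):
    """Re-sample internal token i given its left and right context.

    For a bigram model, p(x_i | rest) is proportional to
    p(x_i | x_{i-1}) * p(x_{i+1} | x_i).
    """
    left, right = tokens[i - 1], tokens[i + 1]
    weights = [BIGRAM[left][t] * BIGRAM[t][right] for t in VOCAB]
    return rng.choices(VOCAB, weights=weights)[0]


def refine(tokens, m, rng):
    """Run m re-sampling steps on randomly chosen internal positions."""
    tokens = list(tokens)  # leave the caller's draft untouched
    for _ in range(m):
        i = rng.randrange(1, len(tokens) - 1)  # internal positions only
        tokens[i] = resample_token(tokens, i, rng)
    return tokens


rng = random.Random(0)
draft = sample_forward(8, rng)
revised = refine(draft, m=20, rng=rng)
print("draft:  ", " ".join(draft))
print("revised:", " ".join(revised))
```

Note that each re-sampling step only draws from the same table the forward pass used. It can swap an unlikely token for a likelier one (or the reverse), but it has no source of information beyond what the model already encodes.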
The ability to "backtrack" does _not_ on its own add knowledge of reality, common sense, or "planning".
When a human edits, they're reconciling their knowledge of the world and their intended impact on their expected audience, neither of which the LLM has.