Hacker News
by abeppu 1181 days ago
Humans can backtrack, but the probability of a "correct" output is still (1-epsilon)^n. Not only can any token introduce an error, but the human author will not perfectly catch errors they have previously introduced. The epsilon ought to be lower for humans, but it's not zero.
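To put rough numbers on (1-epsilon)^n, here is a minimal sketch. It assumes each token errs independently with the same probability epsilon, which is a simplification, and the epsilon and n values are picked purely for illustration:

```python
def p_correct(eps: float, n: int) -> float:
    """Probability that all n tokens are error-free, assuming an
    independent per-token error rate eps (a simplifying assumption)."""
    return (1.0 - eps) ** n


# Illustrative values only; neither rate is a measured figure.
for eps in (0.01, 0.001):
    for n in (100, 1000):
        print(f"eps={eps}, n={n}: p_correct={p_correct(eps, n):.4f}")
```

Lowering epsilon buys a lot, but for any nonzero epsilon the probability still decays toward zero as n grows.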

But more to the point, in the deck provided, LeCun's point is _not_ about backtracking per se. The highlighted / red text on the preceding slide is:

> LLMs have no knowledge of the underlying reality
> They have no common sense & they can't plan their answer

Now, we generally generate from LLMs by sampling forward, one token at a time, but it isn't hard to use essentially the same structure to generate tokens conditioned on both preceding and following sequences. If you ran generation for tokens 1...n, and then ran m iterations of re-sampling internal token i based on (1..i-1, i+1..n), it would sometimes "fix" issues created in the initial generation pass. It would also sometimes introduce new issues in places that were fine after the original generation. Process-wise, it would look a lot like MCMC at generation-time.
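A rough sketch of that procedure, with a toy bigram table standing in for the LLM. Everything here (`VOCAB`, `GOOD_PAIRS`, `weight`, `generate`) is a hypothetical stand-in for illustration, not any real model's API. Under a bigram model, conditioning token i on (1..i-1, i+1..n) reduces to conditioning on its two neighbours:

```python
import random

# Toy stand-in for an LLM: a hand-written bigram preference table (hypothetical).
VOCAB = ["the", "cat", "sat", "on", "mat"]
GOOD_PAIRS = {("the", "cat"), ("cat", "sat"), ("sat", "on"),
              ("on", "the"), ("the", "mat")}


def weight(a, b):
    """Unnormalized weight of token b directly following token a."""
    return 5.0 if (a, b) in GOOD_PAIRS else 0.2


def sample_forward(n, rng):
    """Left-to-right pass: each token is drawn given only its predecessor."""
    seq = [rng.choice(VOCAB)]
    while len(seq) < n:
        w = [weight(seq[-1], t) for t in VOCAB]
        seq.append(rng.choices(VOCAB, weights=w)[0])
    return seq


def resample_token(seq, i, rng):
    """Redraw token i in place, conditioned on both of its neighbours."""
    w = []
    for t in VOCAB:
        score = 1.0
        if i > 0:
            score *= weight(seq[i - 1], t)
        if i < len(seq) - 1:
            score *= weight(t, seq[i + 1])
        w.append(score)
    seq[i] = rng.choices(VOCAB, weights=w)[0]


def generate(n, m, seed=0):
    """Forward pass for n tokens, then m single-token resampling steps."""
    rng = random.Random(seed)
    seq = sample_forward(n, rng)
    for _ in range(m):
        resample_token(seq, rng.randrange(n), rng)
    return seq


print(" ".join(generate(n=8, m=50, seed=1)))
```

Each resampling step is just another draw from the same model, so it can repair a low-weight pair or break a good one; nothing in the loop consults anything outside the model.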

The ability to "backtrack" does _not_ on its own add knowledge of reality, common sense, or "planning".

When a human edits, they're reconciling their knowledge of the world and their intended impact on their expected audience, neither of which the LLM has.

1 comment

>> LLMs have no knowledge of the underlying reality
>> They have no common sense & they can't plan their answer

If his arguments are entirely based on this, then they're not fully correct:

- GPT-style language models try to build a model of the world: https://arxiv.org/abs/2210.13382

- GPT-style language models end up internally implementing a mini "neural network training algorithm" (gradient descent fine-tuning for given examples): https://arxiv.org/abs/2212.10559