> There's not much signal here, just basic facts about LLMs and then leaps to very bold statements.

The article wasn't supposed to be informative for people who already know how LLMs work. Like the title said, I just wanted to write down some thoughts.

> This is just silly. Humans forget things all the time! If I want to remember something I write it down.

The opposite was never stated. Human memory is of course selective.

> Here is an interesting experiment I use to help people understand next token prediction. Think of a simple math problem in your head, maybe 3 digit by 2 digit multiplication. Then speak out every single thought you have while solving it.

Now a point I'm happy to discuss! The process of solving it is actually quite autoregressive-like, but this is also an example of a common pitfall with LLMs: they rely purely on pattern matching because they don't have an internal representation of what they are really dealing with (algebra). But we all know that. The main question is whether LLMs taught to reason actually show that they have this kind of representation. I'd say they still work very differently; even for tasks that seem trivial to humans, reasoning LLMs will make a lot of mistakes before arriving at a plausible-sounding result. Because they were trained to reason, there's now a higher chance that the plausible-sounding result is actually correct. But this property is actually quite interesting once applied to complex tasks that would take humans too much time and be overwhelming, and that's where they shine as powerful tools.
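To make the "speak out every single thought" experiment a bit more concrete, here is a minimal Python sketch. It is only an illustration I am assuming for the sake of discussion, not anything from the article: the function name `long_multiply_trace` is hypothetical, and it assumes non-negative integers. Each step is written out as text, and the next step builds on the running total produced so far, which is the autoregressive-like flavor being discussed.

```python
def long_multiply_trace(a: int, b: int) -> tuple[int, list[str]]:
    """Multiply a by b one digit of b at a time, recording each step as text.

    Hypothetical helper for illustration; assumes a >= 0 and b >= 0.
    """
    steps: list[str] = []
    total = 0
    # Walk the digits of b from least significant to most significant.
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        # Partial product for this digit, shifted by its place value.
        partial = a * digit * 10 ** place
        # Each new step depends on the running total "spoken" so far.
        total += partial
        steps.append(
            f"{a} x {digit} x 10^{place} = {partial}; running total = {total}"
        )
    return total, steps


result, steps = long_multiply_trace(347, 26)
for line in steps:
    print(line)
print("answer:", result)
```

The trace is exact by construction here; the point of contention in the thread is whether a model that produces a similar-looking trace also has the underlying representation of the arithmetic.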
Like a lot of my coworkers analyzing a production bug? I would agree if the statement were that LLMs are underpowered compared to a human brain today, but I'm not seeing evidence that humans reason in a way that can't be correctly modeled.
From your article and comments, it sounds like the take is something like "humans don't actually reason autoregressively", which could be true (I don't know enough to say), but that is sort of like saying physics models aren't really how nature works: ultimately LLMs are executable models of the world, it's even in the name.