| There are limitations with LLMs but nobody is being clear about them. The overall state of LLMs can be distilled into 3 points: 1. LLMs can produce output that is equal in intelligence and creativity to what humans produce. They can even produce output that is objectively better than what humans produce. This EVEN applies to novel responses that are completely absent from the training set. This is the main reason why there's so much hype around LLMs right now. 2. The main problem is that LLMs can't produce good output consistently. Sometimes the output is better, sometimes it's the same, sometimes it's worse. LLMs sometimes "hallucinate", they are sometimes inconsistent, and they have obvious memory problems. But none of these problems completely preclude the LLM from being able to produce output that is objectively better than or the same as human-level reasoning... it's just not doing this consistently. 3. Nobody fully understands the internal state of LLMs. We have limited understanding of what's going on here. We can understand inputs and outputs but the internal thought process is not completely understood. Thus we can only make limited statements about how an LLM thinks. Nobody can make a statement that LLMs obviously have zero understanding of the world, and nobody can make a statement that LLMs are just stochastic parrots, because we don't really get what's going on internally. We only have output from LLMs that is remarkably novel and intelligent and output from LLMs that is incredibly stupid and inconsistent. The data does not point towards a definitive conclusion, it only points towards possibilities. There's actually a cargo cult around downplaying AI. There are people who say the AI is clearly a stochastic parrot, and they point to the intention of the algorithm behind the LLM. Yes, the algorithm at the lowest level can be thought of as a next-token predictor. But this is just a low-level explanation. 
It's like saying a computer system is simply a Turing machine executing simplistic instructions from a tape roll, when such instructions can form things like games and 3D simulations of entire open worlds. The high-level characteristics of this AI are something we currently cannot understand. Yes, we built a text predictor, but something else that was not expected came out as an emergent property, and this emergent property is something we still cannot make a definitive statement about. What does the future hold? What follows is my personal opinion on this matter: I believe we will never be able to make a definitive statement about LLMs or even AGI. We will never be able to fully understand these things, and instead AGI will come about from a series of trials, errors and accidents. What we build will largely come about as an art and as unexpected emergent properties of trying different things. I believe this for two reasons. The first reason is philosophical. I hold the somewhat blurry belief that a complex intelligence cannot fully comprehend something that is equal in complexity to itself. We can only partially understand complexity equal to our own by symbolically abstracting parts away, but not everything can be abstracted like this. Sometimes true understanding involves comprehension of the entire complex crystal without abstracting any part of it away. I believe that the concept of "intelligence" is such a crystal, but that's just a guess. The second reason is scientific. We've had physical instances of complex intelligence right in front of our eyes that we can touch, manipulate and influence for decades. The human brain and other animal brains have been studied extensively, and we have consistently remained far from any form of true understanding. Given our failure to understand the human brain even when it's right in front of us, I'd say we're unlikely to ever completely understand LLMs either. |
That's a bad analogy, none of those things are emergent behavior.
We can debate whether what an LLM does is "emergent", but that's basically a matter of definition and isn't very interesting.
In reality, what's most surprising is that so much of what we say is explainable as next-token prediction. It's not the other way around: we're showing how predictable we are, rather than how smart the AI is. But it's clear to me that the differences are in the outlying cases. AI doesn't extrapolate outside its training data, and even if it gets (100-\alpha)% of its output right, there is always some remaining \alpha% that's not in the training data, and that is what differentiates pattern matching or fancy key-value lookup (which is how we know AI works) from whatever intelligence is.
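To make the "key-value lookup" picture concrete, here is a toy sketch. It is not a model of how a real LLM works: it is a bigram next-token predictor that is literally a dictionary from a token to counts of the tokens that followed it. The corpus, the function names `train_bigram` and `predict_next`, and the choice to return `None` for unseen tokens are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens followed it in the training text."""
    table = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        table[current][following] += 1
    return table


def predict_next(table, token):
    """Return the most frequent follower of `token`, or None if none was seen."""
    followers = table.get(token)  # .get() avoids creating new keys on lookup
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat slept".split()
table = train_bigram(corpus)

print(predict_next(table, "the"))  # a token seen in training
print(predict_next(table, "dog"))  # a token outside the training data
```

A pure lookup like this has nothing to say about "dog", which is the \alpha% case in its crudest form. Whether real LLMs differ from this in kind or only in scale is exactly what the thread is arguing about.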