by bondarchuk
563 days ago
Maybe that would be a suitable working definition of general intelligence, and props to you for even giving a definition at all (in contrast to TFA). However, your definition seems almost tailor-made to exclude present and near-future AI (and, I suspect, is motivated by exactly that). Current AI works by being trained on large amounts of existing data. If current AI were real intelligence, we would be sad; therefore real intelligence is the opposite of intelligence trained on large amounts of data. Having said that, one can also make the case that LLMs start from a limited set of capabilities and, via exploration, acquire a rich competence. It's just that these are linguistic abilities, and the exploration is exploration of a linguistic environment. Maybe the real intelligence is the friends we made along the way, i.e. the general class of algorithms roughly called "backpropagation and gradient descent on a very high-dimensional neural network".
|
I think you can get to the core of it by considering the evolutionary benefit of intelligence - what beneficial behavioral capability has been optimized - which comes down to being able to utilize past experience to predict/plan future outcomes, rather than being locked into reactive behavior patterns like simpler animals.
LLMs, trained to predict based on past "experience", might (perhaps charitably) be considered to exhibit some intelligence, but where they notably fail is in situations where better prediction (utilization of prior experience) requires a process more similar to search with backtracking than a linear application of rules derived from the training data - i.e. in the areas of reasoning and planning.
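To make the distinction concrete, here is a toy sketch of the difference between a single forward pass that commits to each choice and a search that can backtrack. It is only an analogy under my own assumptions: the subset-sum task and the function names `greedy_single_pass` and `backtracking_search` are made up for illustration, and nothing here models what an LLM actually computes.

```python
from typing import List, Optional


def greedy_single_pass(numbers: List[int], target: int) -> Optional[List[int]]:
    """One left-to-right pass: commit to every choice, never revisit it."""
    chosen: List[int] = []
    remaining = target
    for n in numbers:
        if n <= remaining:  # fixed rule: take anything that still fits
            chosen.append(n)
            remaining -= n
    return chosen if remaining == 0 else None


def backtracking_search(
    numbers: List[int], target: int, start: int = 0
) -> Optional[List[int]]:
    """Depth-first search that undoes a choice when it leads to a dead end.

    Assumes all numbers are positive integers.
    """
    if target == 0:
        return []
    for i in range(start, len(numbers)):
        n = numbers[i]
        if n <= target:
            rest = backtracking_search(numbers, target - n, i + 1)
            if rest is not None:
                return [n] + rest
            # Dead end: drop n and try the next candidate instead.
    return None


numbers = [7, 5, 4]
greedy_result = greedy_single_pass(numbers, 9)   # commits to 7, then gets stuck
search_result = backtracking_search(numbers, 9)  # abandons 7, finds 5 + 4
```

The single pass takes 7 first and can never recover, while the search discards that choice and finds 5 + 4. The price is that the search may explore many branches, which is the kind of compute a fixed-depth pass does not have available.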
You can try to put lipstick on the pig by adding RL-based post-training or wrapping the LLM in an agentic loop, trying to extract more value out of the training data and gain some semblance of reasoning, but at the end of the day it's still a pig - at heart just an expert system, not a cognitive architecture.
Another obvious limitation of LLMs is that they are just a repository of canned knowledge/rules, with no ability to learn from "runtime" experience, and therefore lacking the ability to learn to handle novel problems by experimentation and adaptation to failure.
The limited intelligence of LLMs is firmly baked into their architecture - the transformer being just a pass-through model - as well as into the way they are trained: by SGD rather than by an algorithm capable of continuous incremental learning.
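As a rough illustration of the "no runtime learning" point, here is a toy sketch comparing a predictor whose parameters are frozen after training with one that keeps adjusting as new observations arrive. Everything in it is an assumption for the sake of the example: `FrozenPredictor`, `OnlinePredictor`, the `rate` parameter and the data are hypothetical, and the update rule is a simple exponential moving average, not a proposal for what a real continual-learning algorithm would look like.

```python
from typing import Iterable, List


class FrozenPredictor:
    """Learns an estimate once from training data, then never changes."""

    def __init__(self, training_data: List[float]) -> None:
        self.estimate = sum(training_data) / len(training_data)

    def predict(self) -> float:
        return self.estimate

    def observe(self, value: float) -> None:
        pass  # runtime experience is ignored


class OnlinePredictor(FrozenPredictor):
    """Same starting point, but nudges its estimate after every observation."""

    def __init__(self, training_data: List[float], rate: float = 0.5) -> None:
        super().__init__(training_data)
        self.rate = rate  # hypothetical step size for the moving average

    def observe(self, value: float) -> None:
        self.estimate += self.rate * (value - self.estimate)


def total_error(predictor: FrozenPredictor, stream: Iterable[float]) -> float:
    """Predict first, then reveal the true value; sum the absolute errors."""
    error = 0.0
    for value in stream:
        error += abs(predictor.predict() - value)
        predictor.observe(value)
    return error


training = [1.0, 1.0, 1.0]  # the world as seen during training
shifted = [5.0] * 6         # the world after it changes at runtime

frozen_error = total_error(FrozenPredictor(training), shifted)
online_error = total_error(OnlinePredictor(training), shifted)
```

The frozen predictor repeats the same mistake on every step after the shift, while the online one shrinks its error each time it is wrong. Whether anything like this scales to something LLM-sized is a separate question that the sketch does not address.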