by versteegen
1227 days ago
Why think only about "incremental" improvement? People aren't just making slight tweaks: new papers are published at a remarkable rate, trying significantly different architectures, training methods, etc., and that steady progress leads to ever more impressive results. How can you assume this direction of research will lead nowhere? OK, ignore everyone who doesn't understand the technology. Of those who do, I'm utterly amazed how pessimistic many are that this "isn't capable" of leading to AGI. Probably not Transformers specifically, but LLMs show that intelligence is remarkably easy. You don't even need to put anything in the neural architecture designed to perform reasoning tasks; they can be learnt regardless, because Transformers are flexible enough to learn to emulate computation (Turing machines) with bounded space and time, going beyond the famous result that 2-layer MLPs are universal function approximators.
|
LLMs show that language is remarkably easy. Ever since GPT-3 was released, I've been convinced that language comprehension isn't nearly as big a component of general intelligence as people make it out to be. This makes some intuitive sense: I recall a tabloid writer saying that they simply turn off their brain and start spinning up paragraphs.
But so far, I haven't seen any of these models perform logical reasoning, beyond basic memorization and reasoning by analogy. They can tell you all day what their "reasoning process" is, but the actual content of any step is simply something that looks like it would fit in that step. Where do you derive this confidence that advanced logical reasoning is a natural capability of transformer models? (Being capable of emulating finite Turing machines is hardly impressive: any sufficiently large finite circuit can do that.)