|
Sure, but for our cortex not to be able to learn language without becoming specialized for it, while a transformer can, implies that a transformer is a more powerful prediction engine than our cortex, which seems unlikely. If you look at the cortex as a whole, what seems most striking is how uniform it is - the same basic architecture of six layers of neurons with a specific pattern of inter-layer connectivity. It seems nature has come up with a universal prediction architecture. Maybe Wernicke's area is fine-tuned for language, but to characterize it as a specialized "language organ" seems a bit of a stretch. Let's note too that language is only a million years old, while the cortex itself is hundreds of millions of years old, yet has this mostly uniform architecture that evidently works just as well for vision as for hearing, etc. So, sure, the ability of the ridiculously simple transformer architecture to learn language (and many other prediction tasks you throw at it) doesn't PROVE that the brain didn't have to evolve a highly specialized way to do it, but it certainly seems highly suggestive that it didn't. Since we now have an existence proof that a very simple architecture, not specialized for language, can learn language, it seems the onus is now on Chomsky to put some meat on the bones of his claim (without evidence) that the general cortical architecture is incapable of this without a high degree of specialization. |
On the contrary, it's extremely likely that we can engineer something better than random evolution. And we have: calculators are better at adding and subtracting than humans are, bikes are faster than running, etc.
> If you look at the cortex as a whole, what seems most striking is how uniform it is - the same basic architecture of six layers of neurons with a specific pattern of inter-layer connectivity. It seems nature has come up with a universal prediction architecture. Maybe Wernicke's area is fine-tuned for language, but to characterize it as a specialized "language organ" seems a bit of a stretch. Let's note too that language is only a million years old, while the cortex itself is hundreds of millions of years old, yet has this mostly uniform architecture that evidently works just as well for vision as for hearing, etc.
Language is ~100,000 years old, not a million. The brain is in fact not uniform: if you damage Broca's area, you lose language capability. And to say that all cognitive function is the same general algorithm ignores the obvious fact that the brain performs different functions, and it doesn't answer the question of how and why this is so. There is a lot of cognitive behavior that is not understood; if you want to explain those behaviors, you implicitly have to distinguish them from one another.
> So, sure, the ability of the ridiculously simple transformer architecture to learn language (and many other prediction tasks you throw at it) doesn't PROVE that the brain didn't have to evolve a highly specialized way to do it, but it certainly seems highly suggestive that it didn't.
It doesn't. Moro showed in a series of experiments that humans have difficulty learning non-hierarchical languages and use different parts of the brain to do so, which is highly suggestive that language is specialized.
> Since we now have an existence proof that a very simple architecture, not specialized for language, can learn language, it seems the onus is now on Chomsky to put some meat on the bones of his claim (without evidence) that the general cortical architecture is incapable of this without a high degree of specialization.
He has provided evidence and arguments, some of which I have pointed to above. Maybe you should actually read or listen to him.