> The human-designed architecture of an LLM makes no such prediction; but after training, the overall system including the learned weights absolutely does, or else it couldn't generate valid language.

It makes a prediction about whatever language(s) are in the training data, but it doesn’t make any (substantial) predictions about general constraints on human languages. It really seems that you’re missing the absolutely fundamental goal of Chomsky’s research program here. Remember that whole “universal grammar” thingy?

> I'm an expert in the structure of the Japanese language, but I'm unable to hold a basic conversation. Would you not feel some doubt?

I expect anyone learning Japanese as a second language will get a chuckle out of this one. It’s in fact a common scenario. You can learn a lot about the grammar of a language, but conversation requires the ability to use that knowledge immediately and fluidly in a wide variety of situations. It is like the difference between “knowing how to solve a differential equation” and being able to answer 50 questions within an hour in a physics exam.

> I see your statement that Chomsky isn't attempting to model the "many non-linguistic cognitive systems", but those don't seem to cause the LLM any trouble.

Of course they don’t, because researchers creating LLMs are (in the vast majority of cases) not attempting to model any particular cognitive system; they have engineering goals, not scientific ones. You seem to be stuck in the view that Chomsky is somehow trying and completely failing to do the thing that LLMs do successfully. This certainly makes for a good straw man (if Chomsky had the same goals, then yeah, he never got anywhere), but it’s a misunderstanding of his research program.

> "he is deliberately choosing not to produce any result evaluable by a person who hasn't spent years studying his theories"

You could say this of many perfectly respectable fields. Andrew Wiles has not produced any result evaluable by me or by almost anyone else. It would certainly take me a lot more than “a few years” of study to evaluate his work. I’m afraid there are no intellectual shortcuts. If you want to evaluate Chomsky’s work, you will have to at least read it, and maybe even think about it a bit too! It seems a bit churlish to whine about that. All you are being deprived of by opting out of this time investment is the opportunity to make informed criticisms of his work on the internet. (The good news is that generative linguistics is actually pretty accessible, and one year of part-time study would probably be enough to get the lay of the land.)
Fermat wrote the theorem in the margin long before Wiles was born. There is no question that many people tried and failed to prove it. There is no question that Wiles succeeded, because the skill required to verify a proof is much less than the skill required to generate it. I haven't done so myself; but lots of other people have, and there is no dispute by any skilled person that his proof is correct. So I believe that Wiles has accomplished something significant.
I don't think Chomsky has any similar accomplishment. I roughly understand the grandiose final goal; I just see no evidence that he has made any progress towards it. Everything that I'd see as an interesting intermediate goal is dismissed as out of scope, especially when others achieve it. On the rare occasion that Chomsky has made externally intelligible predictions on the range of human language, they've been falsified anthropologically. I assume you followed the dispute on Pirahã, which I believe clarified that features like recursion were in fact optional, rendering the theory safely non-falsifiable again.
So what's his progress? Everything that I see turns inward, valuable only within the framework that he himself constructed. Anyone can build such a framework, so that's not an accomplishment. Convincing others to spend years of their lives on that framework is a sort of achievement, but it's not a scientific one; homeopathy has many practitioners.
> I expect anyone learning Japanese as a second language will get a chuckle out of this one. It’s in fact a common scenario.
I think this view is just as wrong applied to a human as to a model. A beginning language student probably knows a lot more grammar rules than a native speaker, but their inability to converse doesn't come from their inability to quickly apply them. It comes from the fact that those rules capture only a small amount of the structure of natural language. You seem to acknowledge this yourself--if nothing Chomsky is working on would help a machine generate language, then it wouldn't help a human either. This also explains my teachers' usual advice to stop studying and converse as best I could, watch movies, etc.
Humans clearly learn language in a more structured way than LLMs do (since they don't need trillions of tokens), but they learn primarily from exposure, picking up partial structure with many exceptions. I don't think that's surprising, since most other things "designed" in an evolutionary manner have that same messy form. LLMs have succeeded spectacularly in modeling that, in the usual ML or mathematical sense of "modeling".
It's thus strange to me to see them dismissed as a source of insight into natural language. I guess most experts in LLMs are busy becoming billionaires right now; but if anything resembling Chomsky's universal grammar ever does get found to exist, then I'd guess it will be extracted computationally from models trained on corpora of different languages rather than from any human insight, in the same way that the Big Five personality traits fall out of a PCA.