| You may wish to read the paper above. But if you want a quick proof:
1. A thought is a representation of a situation.
2. A representation generates entailments of that situation.
3. Language is a many-to-one translation from these representations to symbols.
4. Understanding language is reversing these symbols into thoughts (i.e., reprs).
So,
5. If agent A understands sentence X, then A forms the relevant representation of X.
6. If an agent has a representation S, it can state entailments of S (e.g., counterfactuals).
Now, split X into Xc = "canonical descriptions of S" and trivial permutations Xp (s.t. distribution of Xc, Xp is low, but the tokens of Xp are common). Form entailments of X, say Y -- sentences that are canonically implied by the truth of X.
7. If the LLM understood that X entails Y, it would be via constructing the repr S -- which entails Y regardless of which sentence in X was used.
8. Train an LLM on Xc and its accuracy on judging Y entailed by Xp is random.
9. Since using Xp sentences causes it to fail, it does not predict Y via S.
QED.
And we can say,
1. Appearing to judge Y entailed by X is possible via simple sampling of (X, Y) in historical cases.
2. LLMs are just such a sampling.
So,
3. + Inference to the best explanation:
4. LLMs sample historical cases rather than form representations.
Incidentally, "sampling of historical cases" is already something we knew -- so this entire argument is basically unnecessary, and only necessary because PhDs have been turned into start-up hype men. |
Why? This is obviously wrong in the general case. For that to be true, Xp and Xc would have to have no statistical relationship whatsoever, which is statistically virtually impossible.
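
To make the objection concrete, here is a toy sketch. It assumes a purely statistical judge that never builds any representation and just copies the label of the most similar training pair, with similarity measured by unigram and bigram overlap. Every name and sentence below is made up for illustration; it is not taken from the paper and says nothing about how a real LLM behaves.

```python
def features(sentence):
    """Surface features only: the set of unigrams and bigrams."""
    words = sentence.lower().split()
    bigrams = {" ".join(pair) for pair in zip(words, words[1:])}
    return set(words) | bigrams

def similarity(a, b):
    """Jaccard overlap between the surface features of two sentences."""
    fa, fb = features(a), features(b)
    return len(fa & fb) / len(fa | fb)

# Hypothetical "canonical" pairs (Xc, Y) with entailment labels.
CANONICAL = [
    ("the cat is on the mat", "the mat is under the cat", True),
    ("the cat is on the mat", "the cat is under the mat", False),
    ("the cup is in the box", "the box contains the cup", True),
    ("the cup is in the box", "the cup contains the box", False),
]

# Hypothetical "trivial permutations" Xp of the same premises.
PARAPHRASED = [
    ("on the mat sits the cat", "the mat is under the cat", True),
    ("on the mat sits the cat", "the cat is under the mat", False),
    ("inside the box is the cup", "the box contains the cup", True),
    ("inside the box is the cup", "the cup contains the box", False),
]

def judge(x, y, train=CANONICAL):
    """Return the label of the training pair with the highest overlap."""
    best = max(train, key=lambda t: similarity(x, t[0]) + similarity(y, t[1]))
    return best[2]

def accuracy(cases, train=CANONICAL):
    """Fraction of cases where the overlap judge matches the given label."""
    hits = sum(judge(x, y, train) == label for x, y, label in cases)
    return hits / len(cases)

if __name__ == "__main__":
    print(accuracy(PARAPHRASED))
```

On this toy data the overlap judge gets the paraphrased premises right, simply because Xp still shares tokens with Xc. Step 8 would need that overlap to carry no signal at all, which is the part that seems implausible in general.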