| They are missing the forest for the trees. The "emergent phenomena" can be trivially explained by the input they are giving it. They are not using a dataset that contains equal amounts of "correct" and "incorrect" responses. They are using datasets of human communication, which are obviously filtered toward "correct" data. We get things wrong occasionally, but that is quite rare relative to what we get right. We can't even structure a sentence without getting something correct! If you feed a dog good food, is it really a surprise that the dog is healthy? You never fed it poison! The language model is only returning semantic relationships. The "emergent phenomenon" is that most semantic relationships in human communication just happen to also be logical relationships. But the language model doesn't know that. In no way does it interact with logic. It only interacts with semantics. If anyone actually bothered to train an instance of GPT or whatever on poisoned data (i.e., nonsensical stories), you would see those emergent phenomena disappear. But no one is writing the nonsensical stories in the first place, so such a dataset does not exist. |
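
To make the argument concrete, here is a rough toy sketch, not an experiment on GPT or any real language model. It assumes a deliberately trivial "model": a frequency table that returns the most common continuation it has seen for a prompt. The names `make_corpus`, `train`, `accuracy`, and the `poison_rate` parameter are all hypothetical and exist only for this illustration.

```python
import random
from collections import Counter, defaultdict


def make_corpus(n, poison_rate, seed=0):
    """Build n (prompt, completion) pairs of the form 'a + b =' -> 'c'.

    A poison_rate fraction of the pairs get a deliberately wrong sum,
    standing in for the "nonsensical stories" in the argument above.
    """
    rng = random.Random(seed)
    corpus = []
    for _ in range(n):
        a, b = rng.randint(0, 9), rng.randint(0, 9)
        c = a + b
        if rng.random() < poison_rate:
            # Replace the true sum with any other value in 0..18.
            c = rng.choice([x for x in range(19) if x != a + b])
        corpus.append((f"{a} + {b} =", str(c)))
    return corpus


def train(corpus):
    """Count which completion follows each prompt. No arithmetic is done here."""
    counts = defaultdict(Counter)
    for prompt, completion in corpus:
        counts[prompt][completion] += 1
    return counts


def accuracy(model):
    """Share of seen prompts whose most frequent completion is the true sum."""
    correct = 0
    total = 0
    for a in range(10):
        for b in range(10):
            prompt = f"{a} + {b} ="
            if prompt in model:
                total += 1
                prediction = model[prompt].most_common(1)[0][0]
                correct += prediction == str(a + b)
    return correct / total if total else 0.0


clean_model = train(make_corpus(2000, poison_rate=0.0))
poisoned_model = train(make_corpus(2000, poison_rate=1.0))

print("clean data:   ", accuracy(clean_model))
print("poisoned data:", accuracy(poisoned_model))
```

The table never adds anything; it only repeats what followed each prompt most often. Whether it looks like it "does arithmetic" depends entirely on what the corpus contained. A lookup table is far simpler than a neural language model, so this only illustrates the shape of the claim and says nothing about how a real model would behave on poisoned data.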