|
Or maybe they hallucinate so convincingly because they do understand, but they don't understand much? What is this distinction you make, "output that mimics the output of someone who understands, but does not understand itself"? Imagine you are learning a foreign language. The Common European Framework of Reference for Languages (CEFR) grades people by skill from A1 (beginner) through A2, B1, B2, C1, to C2 (fluent). At the start you are repeating phrases you don't understand, imitating someone who does, but you cannot change the phrases at all because you don't know more words, and you cannot change the grammar because you don't understand it. Call this a chatbot with hard-coded strings it can print. After a while, you can fit some other words into the basic sentences. Call this a chatbot which has template strings and lists of nouns it can substitute in, Eliza style: "tell me about {X}" where X comes from a fixed list of {your mother, your childhood, yourself}. After a bit longer you can make basic use of grammar. If you get to fluent you can make arbitrary sentences with arbitrary words, see new words you have never seen before and guess how they will conjugate, whether they are polite or slang from the context, what they might mean from other languages, and use them probably correctly. ChatGPT can make new sentences in English and new words, and it can make plausible explanations of words it has never seen before. Make up a word like "carstool" and it can say something like: that word does not exist, but if it did it could be a compound of 'car' and 'stool', like 'carseat', a car with a stool for a seat. Ask it to make up new compound words and it can say English does not have compound words made of four words, but if it did, an example might be trainticketprintingmachine (a machine for printing train tickets). That is something a complete beginner in a foreign language could never do until they gained some understanding. Something that an Eliza chatbot could never do. |
ChatGPT is a language model and therefore generates text strictly from start to end, one token at a time, with each successive token sampled from a probability distribution over the possible next tokens.
It does not form a mental model or understanding of what you feed into it. It is a mathematical model that outputs token probabilities, and then some form of sampling picks the next token (I forget exactly how).
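To make the "probabilities, then sampling" step concrete, here is a minimal sketch of one common scheme, temperature sampling over a softmax. I am not claiming this is what ChatGPT actually uses; `toy_model`, its four-word vocabulary, and its scores are hypothetical stand-ins for a real network.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

def generate(model, prompt_tokens, max_new_tokens, rng=random):
    """Append tokens one at a time; each step sees everything so far."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        vocab, logits = model(tokens)
        tokens.append(sample_next_token(vocab, logits, rng=rng))
    return tokens

def toy_model(tokens):
    """Hypothetical model: returns fixed scores and ignores the context."""
    vocab = ["the", "cat", "sat", "."]
    logits = [2.0, 1.0, 0.5, 0.1]
    return vocab, logits

if __name__ == "__main__":
    print(generate(toy_model, ["the"], 5))
```

A lower temperature concentrates the probability on the highest-scoring token, and a higher one flattens the distribution, which is one way the same model can give different answers to the same prompt.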
It reuses the expressions of understanding found in its training data but never forms new understanding. It can fabricate new words because tokens represent pieces of words rather than whole words. For each new token it outputs, it sees a window of the preceding tokens, so it can mimic nearly every instance of a real human reflecting on what they have already said.
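As a rough illustration of the "bits and pieces" point, here is a toy greedy longest-match splitter. Real tokenizers are typically learned from data (byte-pair encoding, for example) and this is not ChatGPT's tokenizer; the `VOCAB` set below is a made-up assumption chosen to fit the examples in the parent comment.

```python
# Hypothetical subword vocabulary, invented for this example.
VOCAB = {"car", "stool", "seat", "train", "ticket", "print", "ing", "machine"}

def tokenize(word, vocab=VOCAB):
    """Split a word into known pieces, always taking the longest match.

    Characters that match no piece are emitted one at a time.
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # fallback: single character
            i += 1
    return pieces

if __name__ == "__main__":
    print(tokenize("carstool"))
    print(tokenize("trainticketprintingmachine"))
```

With this vocabulary, "carstool" splits into "car" and "stool", so a word that never appeared whole in the training data still maps onto pieces the model has seen many times.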
> Something that a complete beginner in a foreign language could never do until they gained some understanding. Something that an Eliza chatbot could never do.
Because they aren't language models trained on terabytes/petabytes of data. They haven't memorized every pattern on the open Internet and integrated it into a coherent mathematical model.
ChatGPT is extremely impressive as a language model but it does not understand in the same way a human or an AGI could.