Hacker News
by jodrellblank 1230 days ago
Or maybe they hallucinate so convincingly because they do understand, but they don't understand much? What is this distinction you make: "output that mimics the output of someone who understands, but does not understand itself"?

Imagine you are learning a foreign language. The Common European Framework of Reference for Languages (CEFR) grades people's skill from A1 (beginner) through A2, B1, B2, C1, to C2 (fluent). At the start you are repeating phrases you don't understand, imitating someone who does, but you cannot change the phrases at all because you don't know more words, and you cannot change the grammar because you don't understand it. Call this a chatbot with hard-coded strings it can print.

After a while, you can fit some other words into the basic sentences. Call this a chatbot which has template strings and lists of nouns it can substitute in, Eliza-style: "tell me about {X}" where X comes from a fixed list of {your mother, your childhood, yourself}. After a bit longer you can make basic use of grammar. If you get to fluent, you can make arbitrary sentences with arbitrary words, see new words you have never seen before and guess how they will conjugate, whether they are polite or slang from the context, what they might mean from other languages, and use them probably correctly.
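To make the first two stages concrete, here is a minimal sketch of a hard-coded chatbot and an Eliza-style template chatbot. The phrase list, the function names, and the single template are assumptions made up for illustration; the real ELIZA used a larger set of pattern-matching rules.

```python
import random

# Stage 1: fixed phrases only; nothing in them can be varied.
HARD_CODED = ["Hello.", "How are you?", "Goodbye."]

# Stage 2: one Eliza-style template with a fixed list of substitutions.
TEMPLATE = "Tell me about {X}."
TOPICS = ["your mother", "your childhood", "yourself"]


def hard_coded_reply(turn: int) -> str:
    """Return a canned phrase, cycling through the fixed list."""
    return HARD_CODED[turn % len(HARD_CODED)]


def template_reply(rng: random.Random) -> str:
    """Fill the single template slot with one of the known topics."""
    return TEMPLATE.format(X=rng.choice(TOPICS))


def all_possible_template_replies() -> list[str]:
    """Enumerate every sentence the template bot can ever produce."""
    return [TEMPLATE.format(X=topic) for topic in TOPICS]
```

However long you talk to it, the template bot can only ever say the three sentences returned by `all_possible_template_replies()`.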

ChatGPT can make new sentences in English and new words, and it can give plausible explanations of words it has never seen before. Make up a word like "carstool" and it can say something like: that word does not exist, but if it did it could be a compound word of 'car' and 'stool', like 'carseat': a car with a stool for a seat. Ask it to make up new compound words and it can say that English does not have compound words made of four words, but if it did, an example might be trainticketprintingmachine (a machine for printing train tickets). That is something a complete beginner in a foreign language could never do until they gained some understanding, and something an Eliza chatbot could never do.

1 comment

> Or maybe they hallucinate so convincingly because they do understand, but they don't understand much? What is this distinction you make: "output that mimics the output of someone who understands, but does not understand itself"?

ChatGPT is a language model and therefore generates text strictly from start to end, one token at a time, with each successive token picked from a probability distribution over possible tokens.

It does not form a mental model or understanding of what you feed into it. It is a mathematical model that outputs token probabilities, and then some form of sampling picks the next token (I forget exactly how).
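As a rough illustration of "outputs token probabilities, then sampling picks the next token", here is a minimal sketch of softmax with temperature followed by a weighted random draw. This is one common scheme, not a statement of how ChatGPT actually samples; `score_fn` is a hypothetical stand-in for the model, and all names here are invented for the example.

```python
import math
import random


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, rng, temperature=1.0):
    """Draw one token, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


def generate_tokens(score_fn, vocab, prompt, n_tokens, rng):
    """Append tokens one at a time; each step sees all earlier tokens.

    score_fn is a hypothetical stand-in for the model: it takes the
    tokens so far and returns one raw score per vocabulary entry.
    """
    tokens = list(prompt)
    for _ in range(n_tokens):
        logits = score_fn(tokens)
        tokens.append(sample_next_token(vocab, logits, rng))
    return tokens
```

A lower temperature concentrates probability on the highest-scoring token, and a higher one flattens the distribution.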

It re-uses the communication of understanding in its training data but never forms new understanding. It can fabricate new words and such because tokens don't represent entire words but rather bits and pieces of them. For each new token it outputs, it sees however many previous tokens fit in its window, so it can mimic nearly every instance of a real human reflecting on what they have already said.
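The point about tokens being pieces of words can be sketched with a toy tokenizer. The vocabulary below is a hypothetical hand-picked set, and greedy longest-match is a simplification chosen for brevity; production tokenizers learn their pieces from data and split text differently.

```python
# Hypothetical toy vocabulary; real models learn tens of thousands of pieces.
VOCAB = {"car", "stool", "seat", "train", "ticket", "print", "ing", "machine"}
VOCAB |= set("abcdefghijklmnopqrstuvwxyz")  # single letters as a fallback


def tokenize(word: str) -> list[str]:
    """Greedy longest-match split of a lowercase word into known pieces."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until a piece matches.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No piece, not even a single letter, matches at position i.
            raise ValueError(f"cannot tokenize {word[i]!r}")
    return pieces
```

With this vocabulary, a never-before-seen word like "carstool" is still made of familiar pieces, which is why a model working at this level can both read and emit it.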

> Something that a complete beginner in a foreign language could never do until they gained some understanding. Something that an Eliza chatbot could never do.

Because they aren't language models trained on terabytes/petabytes of data. They haven't memorized every pattern on the open Internet and integrated it into a coherent mathematical model.

ChatGPT is extremely impressive as a language model but it does not understand in the same way a human or an AGI could.

Not in the way a human or AGI could, but it does understand some things in some way. Yes, it's trained on TB/PB of data; maybe that's why it can. Maybe it's a mathematical model that outputs token probabilities, and that's why it can.

It seems like you're arguing that because it functions in some particular way, it can't show intelligence or understanding: that it may look like a duck and quack like a duck, but it's really just a pile of meat and feathers, so it can never be a true duck. What am I doing when I learn "idiomatic Python" or "design patterns" or what "rude words" are, except being trained on patterns and mimicking other people? I can transfer patterns from one domain to another; so can ChatGPT. I can give an explanation of the pattern I followed; so can ChatGPT. I can notice someone using a pattern wrong and correct them; so can ChatGPT. I can misuse a pattern, have someone explain it to me, and correct myself; so can ChatGPT. I can draw inferences from context, from things unsaid or obliquely referenced; so can ChatGPT.

> "It re-uses the communication of understanding in its training data but never forms new understanding."

Look, here it is forming new understanding; asking it to do some APL: https://i.imgur.com/D3GbwOh.png

It gave the wrong answer, I explained in English how to get the right answer, and it corrected itself and gave the right answer. That is new understanding, at least in the short term. If that's "just mimicking understanding" then maybe all I'm doing when I hear an explanation is mimicking understanding.

A trivial Markov chain can't generate anything like ChatGPT can, and that's a difference worth attention.
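For comparison, a trivial word-level Markov chain fits in a few lines. This is a minimal sketch with invented function names: each next word depends only on the single previous word, so it can only replay word pairs that appeared in its training text.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict[str, list[str]]:
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def markov_generate(chain, start, max_words, rng):
    """Walk the chain from a start word, choosing successors at random.

    Stops early if the current word never had a successor in the text.
    """
    out = [start]
    while len(out) < max_words and out[-1] in chain:
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)
```

Something like `markov_generate(build_chain(corpus), "the", 20, random.Random())` produces locally plausible word pairs but has no way to explain a made-up word or correct itself after feedback.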