| They neither understand nor reason. They don’t know what they’re going to say, they only know what has just been said. Language models don’t output a response, they output a single token. We’ll use token==word shorthand: When you ask “What is the capital of France?” it actually only outputs: “The” That’s it. Truly, that IS the final output. It is literally a one-way algorithm that outputs a single word. It has no knowledge, memory, and it’s doesn’t know what’s next. As far as the algorithm is concerned it’s done! It outputs ONE token for any given input. Now, if you start over and put in “What is the capital of France? The” it’ll output “ “. That’s it. Between your two inputs were a million others, none of them have a plan for the conversation, it’s just one token out for whatever input. But if you start over yet again and put in “What is the capital of France? The “ it’ll output “capital”. That’s it. You see where this is going? Then someone uttered the words that have built and destroyed empires: “what if I automate this?” And so it was that the output was piped directly back into the input, probably using AutoHotKey. But oh no, it just kept adding one word at a time until it ran of memory. The technology got stuck there for a while, until someone thought “how about we train it so that <DONE> is an increasingly likely output the longer the loop goes on? Then, when it eventually says <DONE>, we’ll stop pumping it back into the input and send it to the user.” Booya, a trillion dollars for everyone but them. It’s truly so remarkable that it gets me stuck in an infinite philosophical loop in my own head, but seeing how it works the idea of ‘think’, ‘reason’, ‘understand’ or any of those words becomes silly. It’s amazing for entirely different reasons. |