by cgearhart
1177 days ago
It’s a fallacy to describe what the machine does as “thinking” just because thinking is the only process you know for achieving the same outcome. When you prompt the model with an input for which you expect a particular correct output, you are assuming there exists some completed sequence of tokens that is correct; if that weren’t true, you either wouldn’t ask or you wouldn’t blame the model for being wrong. Now imagine a machine that takes in your input and in one step produces the entire correct answer. In all nontrivial cases there are many more _incorrect_ possible outputs than correct ones, so this appears to be a difficult task. But would you say such a machine is “thinking”? Would you still consider it thinking if we could describe the process mathematically as drawing a sample from the output space? That it draws the correct sample implies it has an accurate probability model of the output space conditioned on your input. Does this require “thought”? GPT is just like this machine except that, instead of producing the output in one step, its inference process is autoregressive, so the tokens come out one at a time instead of all at once. (Note that BERT-style transformers _do_ spit out the whole answer at once.) It’s possible that this is all that humans do. Perhaps we are mistaken about “thinking” altogether: perhaps the machine thinks (like a human), or perhaps humans do not think (like the machine). In either case I do feel confident that human and machine are not applying the same mechanism; the jury is still out on whether we’re applying the same process.
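
To make the one-step vs. autoregressive comparison concrete, here is a toy sketch. Everything in it is made up for illustration: a two-token vocabulary, outputs of length two, and a hypothetical table `JOINT` standing in for the probability of each output given one fixed input. It is not how GPT or BERT is implemented; it only shows that sampling the whole output at once and sampling it token by token (via the chain rule) describe the same distribution.

```python
import random

VOCAB = ("a", "b")   # hypothetical two-token vocabulary
LENGTH = 2           # every output is exactly two tokens long

# Hypothetical p(output | input) for one fixed input; the values are invented.
JOINT = {
    ("a", "a"): 0.70,
    ("a", "b"): 0.10,
    ("b", "a"): 0.15,
    ("b", "b"): 0.05,
}

def prefix_prob(prefix):
    """Total probability of all complete outputs that start with `prefix`."""
    return sum(p for seq, p in JOINT.items() if seq[:len(prefix)] == prefix)

def next_token_probs(prefix):
    """p(next token | prefix), derived from the joint table."""
    denom = prefix_prob(prefix)
    return {tok: prefix_prob(prefix + (tok,)) / denom for tok in VOCAB}

def sample_one_step(rng):
    """The 'one-step machine': draw the entire output in a single sample."""
    seqs = list(JOINT)
    return rng.choices(seqs, weights=[JOINT[s] for s in seqs], k=1)[0]

def sample_autoregressive(rng):
    """The autoregressive machine: draw one token at a time given the prefix."""
    prefix = ()
    while len(prefix) < LENGTH:
        probs = next_token_probs(prefix)
        tok = rng.choices(VOCAB, weights=[probs[t] for t in VOCAB], k=1)[0]
        prefix += (tok,)
    return prefix

def autoregressive_prob(seq):
    """Probability the token-by-token sampler assigns to a complete output."""
    p = 1.0
    for t in range(len(seq)):
        p *= next_token_probs(seq[:t])[seq[t]]
    return p

if __name__ == "__main__":
    rng = random.Random(0)
    print("one step:      ", sample_one_step(rng))
    print("autoregressive:", sample_autoregressive(rng))
    for seq, p in JOINT.items():
        print(seq, p, autoregressive_prob(seq))
```

In this toy the two samplers differ only in mechanism: the product of the per-token conditionals matches the joint probability of each complete output, up to floating-point rounding.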