by Imnimo
1212 days ago
> The character it has built for itself is extremely suspicious when you examine how it behaves closely. And I don't think Microsoft has created this character on purpose.
The thing doesn't even have a persistent thought from one token to the next - every output is a fresh prediction using only the text before it. In what sense can we meaningfully say that it has "built [a character] for itself"? It can't even plan two tokens ahead.
Using all the tokens before it. I think too many people assume that "word prediction model" implies "Markov chain from the 90s" and are calming themselves with a false sense of security from that impression.
"It just predicts the next token based on the previous tokens" doesn't really tell us a lot, because it leaves completely open how it does the prediction - and that algorithm can be arbitrarily complex.
> It can't even plan two tokens ahead.
No, but it can look two tokens back. E.g., you could imagine an algorithm that formulates a longer response in memory, then only returns the first token from it and "forgets" the rest - and repeats this for each token. That would allow the model to "think ahead" and still match the "API" of only predicting the next token with the only persistent state being the output.
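To make that thought experiment concrete, here is a minimal Python sketch. It is only an illustration of the interface argument, not a claim about how any real language model works internally. The names `next_token`, `generate`, `toy_planner`, and `TARGET` are all made up for this example, and the "planner" is a trivial stand-in for whatever arbitrarily complex algorithm sits behind the prediction.

```python
from typing import Callable, List

# A "planner" maps the tokens so far to a full drafted continuation.
# This type is an assumption made for the sketch.
Planner = Callable[[List[str]], List[str]]

EOS = "<eos>"  # hypothetical end-of-sequence marker


def next_token(context: List[str], plan_continuation: Planner) -> str:
    """Expose only a next-token API while planning further ahead internally."""
    plan = plan_continuation(context)  # draft a longer response "in memory"
    # Return just the first drafted token; the rest of the plan is discarded.
    return plan[0] if plan else EOS


def generate(prompt: List[str], plan_continuation: Planner,
             max_tokens: int = 20) -> List[str]:
    """Standard autoregressive loop: the only persistent state is the output."""
    context = list(prompt)
    for _ in range(max_tokens):
        token = next_token(context, plan_continuation)
        if token == EOS:
            break
        context.append(token)
    return context


# Toy planner (hypothetical): it always "intends" to finish one fixed sentence.
TARGET = "the cat sat on the mat".split()


def toy_planner(context: List[str]) -> List[str]:
    # Re-derive the whole remaining plan from the visible tokens alone.
    return TARGET[len(context):]


if __name__ == "__main__":
    print(" ".join(generate([], toy_planner)))
```

From the outside, `next_token` looks exactly like "predict one token from the previous tokens", yet every call drafts the whole remaining sentence before throwing all but the first token away. The plan is never stored; it is simply recomputed from the output so far on each step.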