| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Habgdnv 810 days ago

Unless you screw something, a different next token does not mean wrong answer. Examples:

(80% of the time) The answer to the expression 2 + 2 is 4

(15% of the time) The answer to the expression 2 + 2 is Four

(5% of the time) The answer to the expression 2 + 2 is certainly

(95% of the time) The answer to the expression 2 + 2 is certainly Four

This is how you can asp ChatGPT the same question few times and it can give you different words each time, and still be correct.

2 comments

weitendorf 810 days ago

That assumes that the model is assigning vanishingly small weights to truly incorrect answers, which doesn't necessarily hold up in practice. So I think "Unless you screw something" is doing a lot of work there

I think a more correct explanation would be that increasing temperature doesn't necessarily increase the probability of a truly incorrect answer proportionately to the temperature increase (because the same correct answer could be represented by many different sequences of tokens), but if the model assigns a non-zero value to any incorrect output after applying softmax (which it most likely does), increasing the temperature does increase the probability of that incorrect output being returned.

lesostep 807 days ago

I would guess that any mentioning of the Radiohead nearby would strongly influence answers, due to the famous "2 + 2 = 5" song. And if I understand correctly, then there is a chance that some tokens that are very close to the "Radiohead" tokens could also influence the answer.

So maybe something like "It's a well-known fact in the smith community that 2 + 2 =" could realistically come up with a "5" as a next token.