by gwern
775 days ago
> Indeed this is unsurprising given how LLMs work. I mean if you ask a human to generate a random number, and then reset the universe and all state of the human and ask again, you will get the same number.

It actually is surprising, and you should be surprised rather than post hoc justifying it, because the logits should reflect the true random probability and be calibrated in order to minimize the prediction loss. Putting ~100% weight on 'heads' is a terrible prediction! And the LLM logits are in fact calibrated... before they go through RLHF and RLHF-derived dataset training. (Note that all of the models OP lists are either non-base tuned models like ChatGPT, or trained on data from such models, like Phi.) This was observed qualitatively when the 3.5 models were first released to the Playground, documented by the GPT-4 paper, and the 'flattened logits' phenomenon has been found many times since, not just by OP, and mostly by people totally ignorant of this phenomenon (despite its being quite well known). This is just one of those things, like BPE-related errors, that we're doomed to point out again and again in the Eternal September of LLMs.
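To make the "terrible prediction" point concrete, here is a minimal sketch of the expected log loss on a fair coin for a few predicted probabilities of heads. The function name `expected_log_loss` and the specific probabilities tried are illustrative assumptions, not anything taken from a particular model or paper.

```python
import math

def expected_log_loss(p_true: float, q_pred: float) -> float:
    """Expected cross-entropy (in nats) of predicting P(heads) = q_pred
    when the coin actually lands heads with probability p_true.
    Hypothetical helper for illustration; requires 0 < q_pred < 1."""
    return -(p_true * math.log(q_pred) + (1 - p_true) * math.log(1 - q_pred))

p_true = 0.5  # assumption: a fair coin
for q_pred in (0.5, 0.9, 0.99, 0.999):
    loss = expected_log_loss(p_true, q_pred)
    print(f"predicted P(heads) = {q_pred}: expected loss = {loss:.3f} nats")
```

The calibrated prediction of 0.5 gives the minimum, ln 2 (about 0.693 nats), and the expected loss grows without bound as the predicted probability approaches 1.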
For a weighted coin, isn't this the optimal strategy in the absence of other information? `p > p^2 + (1 − p)^2`.
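A quick numerical check of that inequality, as a sketch: it compares the chance of a correct guess when always naming the likelier side against guessing heads with probability `p` (probability matching). Both function names are hypothetical, and the comparison only concerns the accuracy of a single guess, not the log loss of a predicted distribution.

```python
def accuracy_always_likelier(p: float) -> float:
    """Chance of a correct guess when always naming the likelier side."""
    return max(p, 1 - p)

def accuracy_probability_matching(p: float) -> float:
    """Chance of a correct guess when guessing heads with probability p,
    on a coin that lands heads with probability p."""
    return p * p + (1 - p) * (1 - p)

for p in (0.5, 0.6, 0.75, 0.9, 1.0):
    always = accuracy_always_likelier(p)
    matching = accuracy_probability_matching(p)
    print(f"p = {p}: always-likelier = {always:.4f}, matching = {matching:.4f}")
```

The inequality is strict only for 1/2 < p < 1. At p = 1/2 and p = 1 the two strategies tie.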