Hacker News
by scrollop 840 days ago
Temperature 1 - It answered 1 sister:

https://i.imgur.com/7gI1Vc9.png

Temperature 0 - It answered 0 sisters:

https://i.imgur.com/iPD8Wfp.png

1 comments

By virtue of increasing randomness, we got the correct answer once ... a monkey at a typewriter will also spit out the correct answer occasionally. Temperature 0 is the correct evaluation.
So your theory would have it that if you repeated the question at temp 1 it would give the wrong answer more often than the correct answer?
There's no theory.

It's just that in real-life usage it is extremely uncommon to query the model stochastically many times and take the most common answer. Temperature 0 gives the "best" answer in the sense that it picks the most likely token at each step of the completion.
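For anyone who wants to poke at the difference, here is a minimal sketch of the two strategies being compared: greedy selection (temperature 0) versus sampling at a temperature and taking the most common result. The answer labels and the `logits` values are invented for illustration and are not taken from any model; `pick` and `majority_vote` are hypothetical helper names.

```python
import math
import random
from collections import Counter

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick(logits, temperature, rng):
    """Return an index: the argmax at temperature 0, otherwise a random sample."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def majority_vote(logits, temperature, n, rng):
    """Sample n times and return the most frequently chosen index."""
    votes = Counter(pick(logits, temperature, rng) for _ in range(n))
    return votes.most_common(1)[0][0]

# Assumed, made-up scores for three candidate answers (not real model output).
answers = ["0 sisters", "1 sister", "2 sisters"]
logits = [2.0, 1.5, 0.1]

rng = random.Random(0)
greedy = answers[pick(logits, 0, rng)]
voted = answers[majority_vote(logits, 1.0, 1000, rng)]
print("greedy:", greedy)
print("majority of 1000 samples at temperature 1:", voted)
```

In this single-step toy the majority vote lands on the same answer as the greedy pick, because both favor the highest-probability option. Over a multi-token completion that is not guaranteed: picking the most likely token at every step does not necessarily produce the most common final answer.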

> Temperature 0 is the correct evaluation.

In theory maybe, but I don't think it is in practice. It feels like each model has its own quasi-optimal temperature and other settings at which it performs vastly better. Sort of like a particle filter that must do random sampling to find the optimal solution.
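One way to check this rather than go by feel is to sweep the temperature and measure how often the answer is right at each setting. A rough sketch follows, assuming a pluggable `ask(temperature, rng)` function; `toy_ask` is a hypothetical stand-in with invented scores, not a real model API, so the numbers it produces say nothing about any actual model.

```python
import math
import random

def toy_ask(temperature, rng):
    """Hypothetical stand-in for a model call; returns an answer string."""
    answers = ["0", "1", "2"]
    logits = [2.0, 1.5, 0.1]  # invented scores, not measured from any model
    if temperature == 0:
        # Temperature 0: always return the highest-scoring answer.
        return answers[logits.index(max(logits))]
    weights = [math.exp(x / temperature) for x in logits]
    return rng.choices(answers, weights=weights, k=1)[0]

def sweep(ask, correct, temperatures, trials, seed=0):
    """Return {temperature: fraction of trials where ask() gave `correct`}."""
    rng = random.Random(seed)
    results = {}
    for t in temperatures:
        hits = sum(ask(t, rng) == correct for _ in range(trials))
        results[t] = hits / trials
    return results

results = sweep(toy_ask, correct="1", temperatures=[0, 0.5, 1.0, 1.5], trials=2000)
for t, acc in results.items():
    print(f"temperature={t}: accuracy={acc:.2f}")
```

To test a real model, replace `toy_ask` with a function that sends the prompt at the given temperature and returns the parsed answer; the sweep itself stays the same.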

Here's a quick analysis of the model vs its peers:

https://www.youtube.com/watch?v=ReO2CWBpUYk