| HN Mirror



	by Der_Einzige 162 days ago
	Min_p author here: I’m convinced that the whole field critically misunderstands temperature (i.e., limiting temperature to 2 is very harmful for diverse generation). Articles like this are excellent and very cool. Hacking your LLM inference engine to enable cool sampling tricks is the definition of AI research/engineering. We need more of this and less prompt grifting.

2 comments

wolttam 162 days ago

Okay, something just tweaked in my brain. Do higher temperatures essentially unlock additional paths for a model to go down when solving a particular problem? Therefore, for some particularly tricky problems, you could perform many evaluations at a high temperature in hopes that the model happens to take the correct approach in one of those evaluations.

Edit: What seems to break is how high temperature /continuously/ acts to make the model's output less stable. It seems like it could be useful to use a high temperature until it's evident the model has started a new approach, and then start sampling at a lower temperature from there.
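
A minimal sketch of the "many evaluations at a high temperature" idea, assuming you have some way to check a candidate answer. The names `first_success`, `generate`, and `is_correct` are hypothetical placeholders, not part of any real inference library: `generate(temperature, rng)` stands in for one sampled model completion, and `is_correct(candidate)` for whatever verifier the problem allows (unit tests, an exact-match check, and so on).

```python
import random


def first_success(generate, is_correct, n, temperature, seed=0):
    """Draw up to n samples at a fixed temperature; return the first that passes.

    generate(temperature, rng) -> candidate   (hypothetical model call)
    is_correct(candidate) -> bool             (hypothetical verifier)
    Returns (candidate, attempts_used), or (None, n) if every sample fails.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    for attempt in range(1, n + 1):
        candidate = generate(temperature, rng)
        if is_correct(candidate):
            return candidate, attempt
    return None, n
```

This only helps when a verifier exists; without one, you still have to pick among the n samples some other way.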


wongarsu 162 days ago

Decaying temperature might be a good approach. Generate the first token at a high temperature (like 20), then for each subsequent token multiply the temperature by 0.9 (or some other scaling factor) until you reach your steady-state target temperature.
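
A rough sketch of that schedule, using only the Python standard library. The start value (20) and decay factor (0.9) come from the comment; the target of 0.8 and both function names are assumptions chosen for illustration, and `logits` stands in for whatever next-token scores your inference engine exposes.

```python
import math
import random


def decayed_temperature(step, start=20.0, decay=0.9, target=0.8):
    """Temperature for the token at 0-based index `step`.

    Geometric decay from `start`, never dropping below `target`
    (the steady-state value; 0.8 is an assumed default).
    """
    return max(target, start * decay ** step)


def sample_with_temperature(logits, temperature, rng=random):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

In a decoding loop you would call `decayed_temperature(i)` for the i-th generated token and pass the result to `sample_with_temperature`. With these defaults the temperature reaches the 0.8 floor after about 31 tokens.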


GRiMe2D 162 days ago

I think yes. Recently I was experimenting with NEAT and HyperNEAT solutions and found this site. At the bottom it explains how novelty yields far better solutions. I would assume that a reasonably high temperature may also result in more interesting solutions from an LLM.

https://blog.lunatech.com/posts/2024-02-29-the-neat-algorith...


bjourne 162 days ago

Correct me if I'm wrong, but the problem is that it is almost impossible to evaluate sampling methods. You can't just look at perplexity and conclude that A is better than B, so you need large-scale, expensive human evaluations. Even if you have those, it is difficult to extrapolate the results, since which sampling method works best depends on the dataset(s).


programjames 162 days ago

I think you can try maximizing the free energy E[reward] + temperature*entropy?
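
Written out, under the assumption of a finite set of outputs $x$, a reward $r(x)$, and a temperature $T > 0$ (the symbols $p$, $r$, $T$, $H$ are notation introduced here, not from the comment), the objective and its well-known maximizer are:

```latex
\max_{p}\; F(p) \;=\; \mathbb{E}_{x \sim p}\,[\,r(x)\,] \;+\; T\,H(p),
\qquad
H(p) \;=\; -\sum_{x} p(x)\,\log p(x)

p^{*}(x) \;=\; \frac{\exp\!\big(r(x)/T\big)}{\sum_{x'} \exp\!\big(r(x')/T\big)}
```

If $r$ is taken to be the model's own logits, $p^{*}$ is exactly the temperature-scaled softmax, so ordinary temperature sampling already maximizes this objective token by token. Whether a sequence-level reward exists that makes the objective a useful quality measure is a separate question.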


bjourne 162 days ago

How do you know that generates high quality text?


programjames 158 days ago

It generalizes better, so it ought to produce higher quality text.
