by jjmarr 247 days ago

Google's models at temperature 2 and top_p 1 still produce output that makes sense, so that setting doesn't work for me. I want to turn the knob to 5 or 10.
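For readers unfamiliar with the knob, here is a minimal sketch of what temperature does, assuming the common definition where logits are divided by the temperature before the softmax. The four logits are made up for illustration, and `softmax_with_temperature` and `entropy_bits` are hypothetical helper names, not any vendor's API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits: higher means a flatter, more surprising distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical logits for a 4-token vocabulary (illustration only).
logits = [5.0, 3.0, 1.0, -2.0]

for t in (1.0, 2.0, 10.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy_bits(probs), 3))
```

As the temperature rises, probability mass moves from the top token toward the tail, which is why a cap of 2 limits how strange the output can get.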

I'd guess SOTA models don't allow temperatures high enough because the results would scare people and could be offensive.

I usually run about 0.05 below the temperature at which the model starts spouting an incoherent mess of Chinese characters, zalgo, and spam email obfuscation.

Also, I really hate top_p. The best writing is when a single token is so unexpected, it changes the entire sentence. top_p artificially caps that level of surprise, which is great for a deterministic business process but bad for creative writing.
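To make the "cap on surprise" concrete, here is a rough sketch of nucleus (top_p) filtering as it is commonly described: keep the most likely tokens until their cumulative probability reaches top_p, drop the rest, and renormalize. The probabilities are invented for illustration and `top_p_filter` is a hypothetical helper; real inference stacks may differ in tie-breaking and in the order filters are applied.

```python
def top_p_filter(probs, top_p):
    """Return {token_index: renormalized_prob} for the smallest set of
    most-likely tokens whose cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Hypothetical next-token distribution; index 3 is the rare, surprising token.
probs = [0.866, 0.117, 0.016, 0.001]

print(top_p_filter(probs, 0.9))   # only the two most likely tokens survive
print(top_p_filter(probs, 1.0))   # every token, including the rare one, stays
```

With top_p below 1, the rare token can never be sampled no matter how lucky the draw, which is the behavior being objected to here.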

top_p feels like the strategy Noam Chomsky described: "strictly limit the spectrum of acceptable opinion, but allow very lively debate within that spectrum".

1 comment

int_19h 247 days ago

Google's models are just generally more resilient to high temps and high top_p than some others. OTOH you really don't want to run Qwen3 with top_p=1.0...
