


	by siliconc0w 505 days ago
	It seems like it would be easy to upgrade existing benchmarks to include uncertainty as a dimension. Then, if a model is less certain, it could spend more time reasoning or route to a bigger model.

1 comment

mrciffa 505 days ago

Exactly! Uncertainty is critical to correctly evaluating LLM performance, and we don't need reasoning models to spend thousands of tokens on simple questions.
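
A rough sketch of the routing idea in this thread, not anyone's actual implementation: treat the model's probabilities over candidate answers as an uncertainty signal, and only escalate (more reasoning tokens or a bigger model) when that signal is high. The names `normalized_entropy` and `route`, the labels `"answer_directly"` and `"escalate"`, and the 0.5 threshold are all hypothetical choices for illustration.

```python
import math
from typing import Sequence


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of an answer distribution, scaled to [0, 1].

    0 means all mass is on one answer; 1 means the answers are equally likely.
    """
    if len(probs) < 2:
        return 0.0
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalize in case the inputs do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    return entropy / math.log(len(probs))


def route(probs: Sequence[float], threshold: float = 0.5) -> str:
    """Hypothetical policy: answer cheaply when confident, escalate otherwise."""
    if normalized_entropy(probs) <= threshold:
        return "answer_directly"
    return "escalate"


# A confident distribution stays on the cheap path; a flat one escalates.
print(route([0.97, 0.01, 0.01, 0.01]))
print(route([0.25, 0.25, 0.25, 0.25]))
```

Entropy is only one possible uncertainty measure; a benchmark that scores uncertainty would also need to check that the signal is calibrated before using it to route.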
