| HN Mirror


by vark90 606 days ago

The idea behind semantic entropy (estimating the entropy of a distribution over semantic units rather than over individual sequences in the output space) is great, but it is somewhat naive in that it treats these semantic units as well-defined partitions of the output space. There is a further generalization of this approach [1] that performs soft clustering of sampled outputs based on a similar notion of semantic equivalence between them.
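For readers who want the hard-partition version made concrete, here is a minimal sketch. It assumes the discrete variant that uses cluster frequencies among the samples rather than summed sequence likelihoods. The names `cluster_by_equivalence`, `semantic_entropy` and `toy_equivalent` are hypothetical, and the string-matching equivalence check is only a stand-in for a real semantic equivalence test (for example, one based on an entailment model).

```python
import math
from typing import Callable, List, Sequence


def cluster_by_equivalence(
    samples: Sequence[str],
    equivalent: Callable[[str, str], bool],
) -> List[List[int]]:
    """Greedy hard clustering: a sample joins the first cluster whose
    representative it is equivalent to, otherwise it starts a new cluster."""
    clusters: List[List[int]] = []
    for i, text in enumerate(samples):
        for cluster in clusters:
            if equivalent(samples[cluster[0]], text):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def semantic_entropy(
    samples: Sequence[str],
    equivalent: Callable[[str, str], bool],
) -> float:
    """Entropy (in nats) of the empirical distribution over semantic clusters."""
    if not samples:
        raise ValueError("need at least one sampled output")
    clusters = cluster_by_equivalence(samples, equivalent)
    probs = [len(c) / len(samples) for c in clusters]
    return -sum(p * math.log(p) for p in probs)


def toy_equivalent(a: str, b: str) -> bool:
    # Assumption: a placeholder for a real semantic equivalence check.
    def normalize(s: str) -> str:
        return s.strip().lower().rstrip(".")
    return normalize(a) == normalize(b)


if __name__ == "__main__":
    sampled = ["Paris", "paris.", "Paris", "Lyon"]
    print(semantic_entropy(sampled, toy_equivalent))
```

Note that every sample is forced into exactly one cluster here, which is the "well-defined partition" assumption in question.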

But even with this in mind, there are caveats. We recently published [2] a comprehensive benchmark of SOTA approaches to estimating the uncertainty of LLMs, and reported that while these semantic-aware methods do perform very well in many cases, on other tasks simple baselines, such as the average entropy of the token distributions, perform on par with or better than complex techniques.
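As an illustration of how simple such a baseline is, here is a minimal sketch of average token entropy. It assumes you already have the model's next-token probability distribution at each generated position. The function names and the toy distributions are hypothetical and are not taken from the benchmark's code.

```python
import math
from typing import Sequence


def token_entropy(dist: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    # Zero-probability tokens contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def mean_token_entropy(token_dists: Sequence[Sequence[float]]) -> float:
    """Average the per-position entropies over the generated sequence."""
    if not token_dists:
        raise ValueError("need at least one generated token")
    return sum(token_entropy(d) for d in token_dists) / len(token_dists)


if __name__ == "__main__":
    # Toy 3-token vocabulary: a confident step followed by an uncertain one.
    dists = [
        [0.98, 0.01, 0.01],
        [0.40, 0.35, 0.25],
    ]
    print(mean_token_entropy(dists))
```

A higher value means the model was, on average, less certain about each token it produced. It needs only a single generation and no clustering step.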

We have also developed an open-source Python library [3] (still in early development) that offers implementations of all modern uncertainty estimation (UE) techniques applicable to LLMs. It allows easy benchmarking of uncertainty estimation methods as well as estimating output uncertainty for models deployed in production.

[1] https://arxiv.org/abs/2307.01379

[2] https://arxiv.org/abs/2406.15627

[3] https://github.com/IINemo/lm-polygraph