|
|
|
|
|
by philipodonnell
80 days ago
|
|
Is the difficulty that in high entropy situations, you can’t really tell whether it’s because the model is uncertain, or because of the options are so semantically similar that it doesn’t matter which one you choose? Like pure synonyms. |
|
You can't look at a single token distribution, though. There are many legitimate high-confidence, high-accuracy cases in which many tokens could come next. For example, the first token of a paragraph. You need to look at pools of entropies over segments of the output or the whole output sequence.
Although there's a correlation between uncertainty and hallucinations or inaccuracies, there's no guarantee. This is a challenging area that we're monitoring the latest literature for and contributing where we can.