Hacker News
by hakuseki 1136 days ago
> It seems to me all we need here is a measure of confidence for the result averaged over the entire answer. Low confidence is a guess/hallucination.

Even if the model knows the exact answer to the question, there may be many distinct ways of phrasing the answer. This would also lead to low confidence in any particular phrasing.
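To put rough numbers on that, here is a tiny sketch. The figures are made up purely for illustration, and `per_phrasing_confidence` is a hypothetical helper, not anything a real model exposes. It assumes the probability mass on the correct answer is split evenly across its phrasings.

```python
def per_phrasing_confidence(total_mass: float, n_phrasings: int) -> float:
    """Probability of one specific wording, assuming the mass on the
    correct answer is spread evenly over n_phrasings wordings."""
    return total_mass / n_phrasings

# Hypothetical numbers: the model puts 90% of its mass on the right answer,
# but there are 10 equally likely ways to word it.
p_single = per_phrasing_confidence(0.9, 10)
print(f"{p_single:.2f}")  # each individual phrasing looks like a low-confidence guess
```

So a model that is 90% sure of the fact could still score any single wording at around 9%.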

2 comments

That should be okay, though: even with 10 good answers, the reported score is still that of the best one chosen. I think the GPTs use beam search, which projects out a "beam" (it looks more like a tree to me) of probable answers, each scored by its accumulated token probabilities, and then just picks the highest-scoring one.

https://towardsdatascience.com/foundations-of-nlp-explained-...

In this case, it doesn't matter how wide the beam is or how many possible answers there are; the score is still the accumulated token probabilities of the best branch.
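A minimal sketch of that scoring, for anyone who wants to see it concretely. Everything here is an assumption for illustration: `TOY_MODEL` is a hand-written lookup table standing in for a real language model, the probabilities are invented, and this says nothing about what any GPT actually does at decoding time.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. Numbers are invented for illustration.
TOY_MODEL = {
    (): {"Paris": 0.5, "The": 0.5},
    ("Paris",): {"<eos>": 1.0},
    ("The",): {"capital": 1.0},
    ("The", "capital"): {"is": 1.0},
    ("The", "capital", "is"): {"Paris": 0.9, "Lyon": 0.1},
    ("The", "capital", "is", "Paris"): {"<eos>": 1.0},
    ("The", "capital", "is", "Lyon"): {"<eos>": 1.0},
}

def beam_search(model, beam_width=2, max_len=6):
    """Return finished sequences as (tokens, log_prob), best first."""
    beams = [((), 0.0)]  # (prefix, accumulated log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, p in model[seq].items():
                extended = (seq + (tok,), score + math.log(p))
                if tok == "<eos>":
                    finished.append(extended)
                else:
                    candidates.append(extended)
        if not candidates:
            break
        # Keep only the beam_width highest-scoring unfinished branches.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished

best_tokens, best_logprob = beam_search(TOY_MODEL)[0]
print(best_tokens, round(math.exp(best_logprob), 2))
```

In this toy table the winning branch is the bare "Paris" with probability 0.5. The reported score is that single branch's accumulated probability, whatever the other branches say.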

However, others have noted in the thread that RLHF might hurt this approach severely, for example by scoring polite responses highly even when they are false. You would then need access to the pre-RLHF model to get any idea of the true likelihood.

Ah, interesting, that does begin to explain how this might be more difficult than it initially appears. Could there be some way to define... proximity of different possible responses, and sum the confidence over all the nearby possibilities?
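One very crude way to sketch that idea, under loud assumptions: "proximity" here is just word-overlap (Jaccard similarity) after dropping a few stop words, the threshold is arbitrary, and the candidate responses and their probabilities are invented. A real system would presumably need some semantic notion of closeness instead; the helper names below are all hypothetical.

```python
STOP_WORDS = {"the", "is", "a", "of"}  # hypothetical, deliberately tiny list

def content_words(text):
    """Lowercased words of a response, minus stop words."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}

def jaccard(a, b):
    """Word-overlap similarity in [0, 1] between two responses."""
    wa, wb = content_words(a), content_words(b)
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)

def pooled_confidence(responses, threshold=0.5):
    """Greedily group responses that are close to a group's first
    (most probable) member, summing probabilities within each group."""
    clusters = []  # each entry: [representative_text, total_probability]
    for text, prob in sorted(responses, key=lambda r: r[1], reverse=True):
        for cluster in clusters:
            if jaccard(text, cluster[0]) >= threshold:
                cluster[1] += prob
                break
        else:
            clusters.append([text, prob])
    return clusters

# Invented candidate responses with invented probabilities.
responses = [
    ("Paris", 0.5),
    ("The capital is Paris", 0.45),
    ("The capital is Lyon", 0.05),
]
for text, total in pooled_confidence(responses):
    print(f"{text!r}: {total:.2f}")
```

With these numbers the two "Paris" phrasings pool to about 0.95, while the "Lyon" response stays on its own at 0.05. The hard part is the similarity function, since word overlap would happily merge "is Paris" with "is not Paris".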