by hakuseki
1136 days ago
> It seems to me all we need here is a measure of confidence for the result averaged over the entire answer. Low confidence is a guess/hallucination. Even if the model knows the exact answer to the question, there may be many distinct ways of phrasing the answer. This would also lead to low confidence in any particular phrasing.

https://towardsdatascience.com/foundations-of-nlp-explained-...
In this case, it doesn't matter how wide the beam is or how many possible answers there are; the score is still the accumulated token probabilities of the best branch.
However, others in the thread have noted that RLHF might hurt this approach severely, for example by scoring polite responses highly even when the answer is false. In that case you would need access to the pre-RLHF model to get any idea of its true likelihood.
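
To make the beam score concrete, here is a rough sketch in Python. Everything in it is made up for illustration: `TOY_MODEL` is a hypothetical lookup table standing in for a real language model, and `beam_search` is a minimal version, not any particular library's implementation. It just shows that the reported score is the sum of token log-probabilities along the best branch, whatever the beam width.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. Both branches give the same correct answer,
# phrased two different ways.
TOY_MODEL = {
    (): {"Paris": 0.5, "The": 0.5},
    ("Paris",): {"<eos>": 1.0},
    ("The",): {"capital": 1.0},
    ("The", "capital"): {"is": 1.0},
    ("The", "capital", "is"): {"Paris": 1.0},
    ("The", "capital", "is", "Paris"): {"<eos>": 1.0},
}


def beam_search(model, beam_width, max_len=10):
    """Return (tokens, log_prob) for the best finished branch."""
    beams = [((), 0.0)]  # (prefix, accumulated log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for token, p in model.get(prefix, {}).items():
                # Accumulate the token probability in log space.
                candidates.append((prefix + (token,), logp + math.log(p)))
        if not candidates:
            break
        # Keep only the top `beam_width` candidates at this step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, logp in candidates[:beam_width]:
            if seq[-1] == "<eos>":
                finished.append((seq, logp))
            else:
                beams.append((seq, logp))
        if not beams:
            break
    return max(finished, key=lambda c: c[1])


for width in (1, 3):
    tokens, logp = beam_search(TOY_MODEL, width)
    print(width, tokens, math.exp(logp))
```

In this toy table the model "knows" the answer with certainty, yet the best branch only scores a probability of about 0.5 because the mass is split between two phrasings. That is the effect the quoted comment describes, and widening the beam does not change it.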