by ffsm8 25 days ago
Isn't that precisely the reason why we introduced the term hallucination? Because LLMs have historically always made up bullshit if they cannot answer directly... If they've now nailed this so that the model declines to respond instead of responding incorrectly, then a lot of previously unusable use cases would become feasible.

So I feel like that's exactly the right metric and the way to track it wrt hallucinations.

2 comments

I had a buddy in high school who was notorious for doing the same thing. (He's now a senior director at a Big 4 consultancy. :) )
Do you mind expanding a little more?
They had a buddy who used to lie a lot when they were younger… now they get paid for it
The point is that it's not a useful metric on its own. For example, redirecting from /dev/null also achieves a zero hallucination rate.

We want the hallucination rate to decrease while the overall answer rate of queries remains sufficiently high. For more specifics, look into ROC and AUC.
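
To make that trade-off concrete, here is a rough sketch, not anyone's actual eval harness. It assumes each query comes with a model confidence score and a correctness label (both hypothetical, as are the `records` data and the `tradeoff` helper), and that the model abstains whenever its confidence is below a threshold. Sweeping the threshold gives an (answer rate, hallucination rate) pair per setting, which is the same idea as tracing out a ROC-style curve.

```python
from typing import List, Tuple

# Hypothetical data: (model confidence, whether the answer was correct).
records: List[Tuple[float, bool]] = [
    (0.9, True),
    (0.8, True),
    (0.6, False),
    (0.4, True),
    (0.2, False),
]


def tradeoff(
    records: List[Tuple[float, bool]], thresholds: List[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, answer_rate, hallucination_rate) for each threshold.

    The model "answers" only when confidence >= threshold, otherwise it abstains.
    Hallucination rate is defined here as wrong answers / answered queries,
    and as 0.0 when nothing is answered (the /dev/null case).
    """
    total = len(records)
    rows = []
    for t in thresholds:
        answered = [ok for conf, ok in records if conf >= t]
        answer_rate = len(answered) / total
        wrong = sum(1 for ok in answered if not ok)
        hallucination_rate = wrong / len(answered) if answered else 0.0
        rows.append((t, answer_rate, hallucination_rate))
    return rows


# A threshold above 1.0 means the model never answers.
for t, ans, hall in tradeoff(records, [0.0, 0.5, 0.7, 1.1]):
    print(f"threshold={t:.1f}  answer_rate={ans:.2f}  hallucination_rate={hall:.2f}")
```

The last threshold is the degenerate case: zero hallucinations, but also zero answers. A useful comparison between models looks at the whole curve (or an area-under-curve style summary of it) rather than the hallucination number alone.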