by mattyyeung 777 days ago
Two possibilities:

(1) if the <title> contents (a unique reference string) don't match, then it's trivially detected. Typically the query is re-run (non-determinism comes in handy sometimes) or, if problems persist, we show an error message to the doctor.

(2) if a valid <title> is hallucinated, then the wrong quote is indeed displayed on the blue background. It's still a verbatim quote, but it is up to the user to handle this.
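To make case (1) concrete, here is a rough sketch of the check, not our actual code. It assumes the model's response carries the reference string inside a <title> tag and that the verbatim quotes live in a lookup table keyed by that string; the names `resolve_quote`, `answer_with_retry`, `quotes_by_ref` and `run_query` are made up for illustration.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical format: the model emits <title>REFERENCE</title> in its response.
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def resolve_quote(response: str, quotes_by_ref: Dict[str, str]) -> Optional[str]:
    """Return the stored verbatim quote for the reference in `response`.

    Returns None when the tag is missing or the reference is unknown,
    which is the trivially detectable case (1).
    """
    match = TITLE_RE.search(response)
    if match is None:
        return None
    return quotes_by_ref.get(match.group(1).strip())


def answer_with_retry(
    run_query: Callable[[], str],
    quotes_by_ref: Dict[str, str],
    max_attempts: int = 3,
) -> str:
    """Re-run the (non-deterministic) query until a known reference comes back."""
    for _ in range(max_attempts):
        quote = resolve_quote(run_query(), quotes_by_ref)
        if quote is not None:
            return quote
    # Problems persisted: the caller shows an error message instead of a quote.
    raise LookupError("no valid reference after retries")
```

Note that this only covers case (1). A hallucinated but valid reference, as in case (2), passes the lookup and returns a real quote, so it has to be caught by the reader.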

In testing, when we have maliciously shown the wrong quote, users seem to be able to identify it easily. It seems "irrelevant" is easier to detect than "wrong".


The Galactica training paper from FAIR investigated citation hallucination quite thoroughly; if you haven't seen it, it's probably worth a look. Trained-in hashes of citations were much more reliable than a natural-language representation.