| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by fryz 462 days ago

Yeah great question

We based our hallucination detection on "groundedness" on a claim-by-claim basis, which evaluates whether the LLM response can be cited in provided context (eg: message history, tool calls, retrieved context from a vector DB, etc.)

We split the response into multiple claims, determine if a claim needs to be evaluated (eg: and isn't just some boilerplate) and then check to see if the claim is referenced in the context.