|
|
|
|
|
by fryz
414 days ago
|
|
Yeah great question We based our hallucination detection on "groundedness" on a claim-by-claim basis, which evaluates whether the LLM response can be cited in provided context (eg: message history, tool calls, retrieved context from a vector DB, etc.) We split the response into multiple claims, determine if a claim needs to be evaluated (eg: and isn't just some boilerplate) and then check to see if the claim is referenced in the context. |
|