	by _jonas 409 days ago
	This is why I built a startup for automated real-time trustworthiness scoring of LLM responses (https://help.cleanlab.ai/tlm/). Tools to mitigate unchecked hallucination are critical for high-stakes AI applications across finance, insurance, medicine, and law. At many enterprises I work with, even straightforward AI for customer support is too unreliable without a trust layer for detecting and remediating hallucinations.

insane_dreamer 409 days ago

Who is watching the watchers?

How do we know the TLM is any more accurate than the LLM (especially if it's not trained on any local data)? If determining veracity were that simple, LLMs would just incorporate a fact-checking stage.

_jonas 407 days ago

You might be thinking of LLM-as-a-judge, where one simply asks another LLM to fact-check the response. Indeed, that is very unreliable because of LLM hallucinations, which are the problem we are trying to mitigate in the first place.

TLM is instead an uncertainty estimation technique applied to LLMs, not another LLM.
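
To make the distinction concrete, here is a minimal Python sketch of one generic form of uncertainty estimation: sample the same model several times and measure how much the answers agree. This is an illustration only, not a description of how TLM works internally. The function names `agreement_confidence` and `normalized_entropy` are hypothetical, and the sampled answers are hard-coded stand-ins for real LLM outputs.

```python
from collections import Counter
import math


def agreement_confidence(samples):
    """Return (most common answer, fraction of samples that agree with it)."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    # Normalize lightly so trivial formatting differences do not count as disagreement.
    normalized = [s.strip().lower() for s in samples]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)


def normalized_entropy(samples):
    """Entropy of the answer distribution scaled to [0, 1]; 0 means full agreement."""
    normalized = [s.strip().lower() for s in samples]
    counts = Counter(normalized)
    n = len(normalized)
    if len(counts) <= 1:
        return 0.0
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    # Maximum entropy occurs when all n samples differ, which gives log(n).
    return entropy / math.log(n)


# Hypothetical answers sampled from the same model for the same prompt.
samples = ["Paris", "paris ", "Lyon", "Paris"]
answer, confidence = agreement_confidence(samples)
print(answer, confidence, normalized_entropy(samples))
```

Unlike a judge model, nothing here asks an LLM whether the answer is true. The score comes only from how consistent the sampled outputs are, so low agreement flags a response as less trustworthy.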