by kundan_s__r 152 days ago

Whether hallucination “happens often” depends heavily on the task domain and how you define correctness. In a simple conversational question about general knowledge, an LLM might be right more often than not, but in complex domains like cloud config, compliance, law, or system design, even a single confidently wrong answer can be catastrophic.

The real risk isn’t frequency averaged across all use cases; it’s the impact when a wrong answer does occur. That’s why a model’s apparent confidence isn’t a good proxy for correctness: models generate fluent text whether they know the right answer or not.
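
To make the frequency-versus-impact point concrete, here is a rough sketch of the arithmetic. Every number below is made up for illustration, and `expected_cost` is a hypothetical helper, not a measured result or an established metric.

```python
# Hypothetical numbers, for illustration only: none of these are measurements.
def expected_cost(error_rate: float, cost_per_error: float, n_queries: int) -> float:
    """Expected total cost of wrong answers over n_queries."""
    return error_rate * cost_per_error * n_queries

# Casual trivia: wrong fairly often, but each mistake is cheap.
trivia = expected_cost(error_rate=0.10, cost_per_error=1.0, n_queries=1000)

# Cloud config: wrong rarely, but each mistake is expensive.
cloud_config = expected_cost(error_rate=0.01, cost_per_error=50_000.0, n_queries=1000)

# The domain with the lower error rate still carries the larger expected cost.
print(trivia < cloud_config)
```

Under these assumed numbers, the domain with one tenth the error rate ends up far more costly, which is why an averaged accuracy figure says little about risk.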

A better question to ask is: does this output satisfy the contract you intended for your use case? If not, it’s unfit for production regardless of overall accuracy rates.
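
As a minimal sketch of what checking a contract could look like, assume the model was asked to emit a JSON storage config. The field names (`region`, `public_access`), the allow-list, and `contract_violations` are all hypothetical, chosen only to illustrate the idea.

```python
import json

# Hypothetical contract: allowed regions for this deployment.
ALLOWED_REGIONS = {"us-east-1", "eu-west-1"}

def contract_violations(raw_output: str) -> list[str]:
    """Return the ways the model output breaks the contract (empty list = passes)."""
    try:
        config = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(config, dict):
        return ["output is not a JSON object"]

    violations = []
    if config.get("region") not in ALLOWED_REGIONS:
        violations.append("region is missing or not in the allow-list")
    # Require an explicit false; a missing field does not count as safe.
    if config.get("public_access") is not False:
        violations.append("public_access must be explicitly false")
    return violations
```

The check ignores how fluent or confident the output sounds. A well-formed answer such as `{"region": "us-east-1", "public_access": true}` still fails, and that is the kind of confidently wrong output the contract is meant to catch.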