Hacker News
by stevoh 551 days ago
Most of the leading models are currently pretty bad at estimating the accuracy of their own responses. On difficult questions they either give confident wrong answers or, if you express skepticism, swing back and forth in how confident they are, to the point where the response is useless. I know this is one of the limitations of how LLMs work, but I feel like it is one of the blind spots right now with respect to how dangerous the models can be socially and in work settings.

At minimum, the companies should do a better job of training users on how much to rely on their models and how to prompt in a way that produces a rigorous response. Unfortunately, doing that reduces the hype around LLMs, so they have little incentive to work on the problem.