by emp17344 147 days ago
Well, hallucinations have been identified as an issue since the inception of LLMs, so this doesn’t appear true.
2 comments

Hallucinations are more or less a solved problem for me ever since I made a simple harness to have Codex/Claude check their own work with static typechecking.
But there aren’t very many domains where this type of verification is even possible.
Then you apply LLMs in domains where things can be checked.
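
For readers wondering what such a harness might look like, here is a minimal sketch in Python. It is not the commenter's actual setup: `repair_loop`, `check_with_mypy`, and the `generate` callback are hypothetical names, mypy is assumed as the typechecker, and the model call is left to whatever client you use. The idea is only that the checker's error output is fed back to the model until the candidate passes or the round budget runs out.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def check_with_mypy(source: str) -> str:
    """Return the typechecker's complaints for `source`, or "" if it passes.

    Assumption: mypy is installed for the current interpreter.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(source)
        result = subprocess.run(
            [sys.executable, "-m", "mypy", "--strict", str(path)],
            capture_output=True,
            text=True,
        )
    if result.returncode == 0:
        return ""
    # Never report success when the checker itself failed to run.
    return (result.stdout + result.stderr) or "type check failed"


def repair_loop(
    generate: Callable[[str], str],
    check: Callable[[str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask `generate` for code until `check` accepts it.

    `generate` receives the previous round's error text ("" on the first
    round) and returns a new candidate. Returns None if no candidate
    passes within `max_rounds`.
    """
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate(feedback)
        feedback = check(candidate)
        if not feedback:
            return candidate
    return None
```

In use, `generate` would wrap a call to the model with the error text appended to the prompt, and `check` would be `check_with_mypy`. This only catches what the type system can express, which is the limitation the reply above points at.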

Indeed, I expect to see a huge push into formally verified software, just because sound mathematical proofs provide an excellent verifier to put into an LLM harness. Just see how successful Aristotle has been at math; it could be applied to coding too.

Maybe Lean will become the new Python

https://harmonic.fun/news#blog-post-verina-bench-sota
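
As a toy illustration of a proof acting as the verifier, here is a small Lean 4 sketch. It is not taken from the linked post: `double` and `double_eq_two_mul` are made-up names, and it assumes a recent Lean 4 toolchain where the `omega` tactic is available. The file only compiles if the stated property really holds, so a model cannot hallucinate its way past the checker.

```lean
-- A tiny function a model might be asked to write.
def double (n : Nat) : Nat := n + n

-- The specification, stated as a theorem. If the proof does not check,
-- Lean rejects the file, whoever (or whatever) wrote it.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```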

  "LLMs reliably fail at abstraction."
  "This limitation will go away soon."
  "Hallucinations haven't."
  "I found a workaround for that."
  "That doesn't work for most things."
  "Then don't use LLMs for most things."

    "Autocomplete is great!"
    "It doesn't work in bash."
    "Then don't use it in bash."
I don't see what's wrong with this argument, and I certainly don't see it as a proof that the particular technology is actually useless, as you seem to be suggesting.
Um, yes? Except ‘most things’ are not much at all by volume.
I mean, hallucinations are 95% better now than the first time I heard the term and experienced them in this context. To claim otherwise is simply shifting the goalposts. No one is saying it's perfect or will be perfect, just that there has been steady progress and there likely will continue to be for the foreseeable future.