| HN Mirror



	by emp17344 195 days ago
	Well, hallucinations have been identified as an issue since the inception of LLMs, so this doesn’t appear true.

2 comments

johnfn 195 days ago

Hallucinations are more or less a solved problem for me ever since I made a simple harness to have Codex/Claude check its work by using static typechecking.
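For readers wondering what a harness like this might look like, here is a minimal sketch. It is not johnfn's actual code: the names `command_checker` and `generate_until_clean` and the retry limit are hypothetical, and the model call is left as a plain callable you would supply yourself. The idea is just to run a checker on each candidate and feed any errors back into the next attempt.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List, Optional, Sequence

# Hypothetical types for this sketch:
# a generator takes feedback from the previous attempt (None on the first
# call) and returns source code; a checker returns a list of error messages
# (an empty list means the code passed).
Generator = Callable[[Optional[str]], str]
Checker = Callable[[str], List[str]]


def command_checker(command: Sequence[str], suffix: str = ".py") -> Checker:
    """Build a checker that writes code to a temp file and runs `command` on it."""

    def check(code: str) -> List[str]:
        with tempfile.TemporaryDirectory() as tmp:
            path = Path(tmp) / f"candidate{suffix}"
            path.write_text(code)
            result = subprocess.run(
                [*command, str(path)], capture_output=True, text=True
            )
        if result.returncode == 0:
            return []
        output = (result.stdout + result.stderr).strip()
        return output.splitlines() or ["checker failed with no output"]

    return check


def generate_until_clean(
    generate: Generator, check: Checker, max_attempts: int = 3
) -> Optional[str]:
    """Ask for code, check it, and retry with the errors as feedback.

    Returns the first candidate that passes, or None if none did.
    """
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        code = generate(feedback)
        errors = check(code)
        if not errors:
            return code
        feedback = "\n".join(errors)
    return None
```

Assuming a typechecker such as mypy is installed, `command_checker(["mypy"])` would give a checker to pass in; `generate` would wrap whatever model API you use. Note this only catches what the checker can see, which is the limitation discussed in the replies.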


emp17344 195 days ago

But there aren’t very many domains where this type of verification is even possible.


nextaccountic 195 days ago

Then you apply LLMs in domains where things can be checked.

Indeed, I expect to see a huge push into formally verified software, just because sound mathematical proofs provide an excellent verifier to put into an LLM harness. Just see how successful Aristotle has been at math; it could be applied to coding too.

Maybe Lean will become the new Python

https://harmonic.fun/news#blog-post-verina-bench-sota


filoeleven 195 days ago

  "LLMs reliably fail at abstraction."
  "This limitation will go away soon."
  "Hallucinations haven't."
  "I found a workaround for that."
  "That doesn't work for most things."
  "Then don't use LLMs for most things."


johnfn 190 days ago

    "Autocomplete is great!"
    "It doesn't work in bash"
    "Then don't use it in bash."

I don't see what's wrong with this argument, and I certainly don't see it as a proof that the particular technology is actually useless, as you seem to be suggesting.


baq 195 days ago

Um, yes? Except ‘most things’ are not much at all by volume.


w0m 195 days ago

I mean, hallucinations are 95% better now than the first time I heard the term and experienced them in this context. To claim otherwise is simply shifting the goalposts. No one is saying it's perfect or will be perfect, just that there has been steady progress and there likely will continue to be for the foreseeable future.
