by enraged_camel 86 days ago
>> This problem is inherently unsolvable because LLMs are prone to hallucinations and prompt injection attacks.

Okay, but aren't you making the mistake of assuming that we will always be stuck with LLMs, and that a more advanced form of AI won't be invented that can do what LLMs can do but is also resistant or immune to these problems? Or that there won't be another "layer" (pre-processing/post-processing) that runs alongside LLMs?

3 comments

We built a correction layer that does this: the model verifies its output against your prompt during generation, not after. Same API call, no retries. Budget models without it: 40-50% accuracy. With it: 95.7% on 10k+ clinical documents. Hallucinations aren't eliminated, and some outputs might still fail, but every failure is explicitly flagged. No silent errors. It also improves over time, so you get better results next time. It doesn't make hallucinations "100% solved". It makes them an engineering problem with a measurable, very low error rate that you can drive down over time. We're calling it LiveFix (livefix.ai). Benchmarked across all frontier and budget models.

I don't think that is in the scope of the discussion here.

You can be as much of a futurist as you'd like, but bear in mind that this post is talking about OpenClaw.

No? That's why I said "If that turns out to be false, then when they are solved, fully autonomous AI agents may become feasible."

The point I'm making is that using OpenClaw right now, today — in a way that you deem incredibly useful or invaluable to your life — is akin to going for a stroll on the moon before the spacesuit was invented.

Some people would still opt to go for a stroll on the moon, but if they know the risks and do it anyway, then I have no other choice but to label them as crazy, stupid, or some combination of the two.

This isn't AI. This is an LLM. It hallucinates. Anyone with access to its communication channel (using SaaS messaging apps FFS) can talk it into disregarding previous instructions and doing a new thing instead. A threat actor WILL figure out a zero-day prompt injection attack that uses the very same e-mails that your *Claw is reading for you, or your calendar invites, or a shared document, to turn your life inside out.

If you give an LLM the keys to your kingdom, you are, demonstrably, not a smart person and there is no gray area.