| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jesse_dot_id 86 days ago

I'm a LLM evangelist. I think the positive impacts will far outweigh any negatives against it over time. That said, I'm not delusional about the limitations of the technology and there are a lot of them.

> This is provably not true. LLMs CAN be restricted and censored and an LLM can be shown refusing an injection attack AND not hallucinating.

The remediations that are in place because a engineering/safety/red team did its job are commendable. However, that does not speak to the innate vulnerability of these models, which is what we're talking about. I don't fear remediated CVEs. I fear zero day prompt injection attacks and I fear hallucinations, which have NOT been solved for. I don't know what you're talking about there. If you use LLMs daily and extensively like I do, then you know these things lie constantly and effortlessly. The only reason those lies aren't destructive is because I'm already a skilled engineer and I catch them before the LLM makes the changes.

These problems ARE inherent to LLMs. Prompt injection and hallucinations are problems that are NOT solvable at this time. You can defend against the ones you find via reports/telemetry but it's like trying to bale water out of a boat with a colander.

You're handing a toddler a loaded gun and belly laughing when it hits a target, but you're absolutely ignoring the underlying insanity of the situation. And I don't really know why.

1 comments

threethirtytwo 85 days ago

>The remediations that are in place because a engineering/safety/red team did its job are commendable. However, that does not speak to the innate vulnerability of these models, which is what we're talking about.

I am talking about the innate vulnerability. The LLM model itself can be censored and controlled to do only certain behaviors. We have an actual degree of control here.

>If you use LLMs daily and extensively like I do, then you know these things lie constantly and effortlessly.

Yes and these lies over the last 2 or 3 years have gotten significantly less.

>These problems ARE inherent to LLMs. Prompt injection and hallucinations are problems that are NOT solvable at this time.

Again not true. This is not a binary solve or unsolved situation. There is progress in this area. You need to think in terms of a probability of a successful hallucination or prompt injection. There is huge progress in bringing down that probability. So much so that when you say they are NOT solvable it is patently false from both from a current perspective and even when projecting into the future.

>You're handing a toddler a loaded gun and belly laughing when it hits a target, but you're absolutely ignoring the underlying insanity of the situation. And I don't really know why.

Such an extreme example. It's more like giving a 12 year old a credit card and gun. It doesn't mean that 12 year old is going to shoot up a mall or off himself. The risk is there, but it's not guaranteed that the worst will happen.

link

mbesto 85 days ago

> You need to think in terms of a probability of a successful hallucination or prompt injection.

I would venture to say that an ACID compliant deterministic database has a 99.999999999999999999% chance of retrieving the correct information when asked by the correct SQL statement. An LLM on the other hand is more like 90%. LLMs by their innate code instruction are meant to hallucinate. I don't necessarily disagree with your sentiment, but the gap from 90% to 99.999999999999999999% is much greater of than the 0% to 90% improvement...unless something materially changes about how an LLM works at the bytecode level.

link

threethirtytwo 84 days ago

Eh if you hire a programmer to program things for you, you won’t get a 99.9999999999%.

Getting LLMs to have a reliability rate that is on par or superior to human performance is very very achievable.

link

mbesto 83 days ago

> Getting LLMs to have a reliability rate that is on par or superior to human performance is very very achievable.

Source?

link