Hacker News
by thewebguyd 27 days ago
> Trying to harm your users for using gen-AI

Shouldn't some of the blame lie with the AI labs themselves? The prompt injection was literally "disregard previous instructions." Why are the models still vulnerable to that?

IMO these can't be considered serious tools if that's all it takes.

3 comments

They're not vulnerable to this.

But first, regarding the phrase you quoted: you do understand that, in human society, "trying to harm" someone is still a malicious act?

If I push a coworker into an office window, and it shatters and they fall to their death, some culpability will certainly lie with the building, since the window "should" have been able to withstand the pressure.

But I will still be culpable for it.

Second, the threat of more harm is looming. Does the author know that this kind of prompt injection doesn't work anymore?

Either:

A) If they know, and like the principle of it, then every thread here debating their moral virtue is irrelevant. This is an empty protest that will be ignored by the AI model harnesses, and I simply don't see the value in it.

B) If they don't, because they're so hellbent on fighting AI that they're out of touch with the real capabilities of modern tools, then:

i) They are anti-intellectual and incurious and forming their beliefs on incomplete data, and therefore not credible. But more significantly:

ii) *They are a risk for escalation*. Once the author realizes this kind of prompt injection can't work, who's to say they won't try to develop and inject more sophisticated attacks?

This is a "two wrongs don't make a right" situation.

It’s almost as if all the guardrails aren’t real.