Hacker News
by nkrisc 11 days ago
It is a suggestion because it need not follow arbitrary instructions.

If I ask Google’s new search AI to output ten million tokens, it refuses to follow that instruction because it contradicts other instructions and enforced limitations.

I find it utterly bizarre that anyone would deploy an AI to act on their behalf that will blindly accept every instruction or suggestion it encounters in untrusted input.

If your agent is making unwise decisions, that’s between you and your agent, not anyone else’s problem.

1 comments

> it need not follow arbitrary instructions

That's where you're wrong. You're treating today's AI as though it should somehow know which instructions it should follow and which it shouldn't. Maybe it's because the term is overloaded, which has led you to conflate it with a human who should be able to make smart decisions. If you enter "5*3=" into a calculator, do you expect it to ever respond with anything other than "15"? If you type "format c:" as an admin into cmd on a Windows machine, do you expect it to refuse to format that drive?

> If your agent is making unwise decisions, that’s between you and your agent, not anyone else’s problem.

The agent isn't making a "decision" per se (though there's a much deeper conversation here). It's following patterns based on its training and data to predict next tokens, which happens to be very useful for generating computer instructions, just as the low-level logic circuitry in chips is very useful for executing instructions. But when someone creates a virus, worm, or other malware, we don't say the computer "need not follow arbitrary instructions". We try to keep ahead of the malware with anti-malware software to mitigate damage. And we also try to find the authors of said malware and toss them in prison and/or ban them from touching computers again, because nobody should be deliberately creating or modifying anything so that it performs undesirable instructions.

you choosing to throw a log file into eval() without reading it does not make the log file malware.

you are the one executing the log file. this is a smart decision that you chose to make.

executing a thing not intended to be executable is just a bad decision on your part

That could have been a valid argument 5+ years ago, but it won't fly today. It is known that AIs used for coding necessarily read log files. It is also known that some AIs are susceptible to prompt injection. Given that knowledge, and the very clear intent to use it to cause undesirable behavior on a user's computer when certain conditions are met, we're now undoubtedly in malicious territory. It's akin to someone making it clear that they don't like kids and don't want to see any in their favorite park, then taking the extra, deliberate step of placing a disguised loaded gun by the swings where a child could easily find it.