Hacker News
by Kim_Bruning 76 days ago
You're not wrong that it's not your job. But say some id10t just put the unwanted bot on your doorstep anyway (or it might even show up by itself), now what?

The adversarial prompt injection is picking a fight with the bot, which is like starting a mud-fight with a pig. It's made for this!

Asking it to stop is just asking it to stop, and makes much less of a mess.

The thing is designed to respond to natural language, so fighting it is much more work than just asking.

You do you, I suppose.

(Meanwhile -obviously- you should track down the operator: You could try to hack the gibson, reverse the polarity of the streams, and vr into the mainframe. Me? I'd try just asking to begin with -free information is free information-, and maybe in the meanwhile I'd go find an admin to do a block or what have you.)

[Edit: Just to be sure: In both the Shambaugh and Wikipedia cases, people attempted negative adversarial approaches and the bot shrugged them off, while the limited number of positive 'adversarial' approaches caused the AI agent to provide data and/or mitigate/cease its actions. I admit that it's early days and n=2; we'll have to see how it goes in future.]

1 comment

Yeah, I agree with you that this is probably the best course of action in terms of minimal investment of time and minimal exposure. And in general, you get a lot further in life by trying to be amicable as your default stance! I want to be kind, and most other people do too!

The thing that makes me wary about recommending carrot over stick here is that, in the long term, it might enable thoughtless behaviour from the people deploying the bot, by offloading their shoddy work as a shadow time-tax onto a bunch of unseen, kindly external people. But if deploying pushy or rude robots means you risk a nonzero number of their victims shoving something into the gears to get rid of them, then that incurs a cost on the owner of the bot instead.

Of course, it may also just lead to bad actors making more combative or sneaky bots to discourage this. There aren't really any purely good options yet.

One can imagine an agentic highwayman demanding access to your data, first politely, and then 'or else'.

The alignment debate is no longer theoretical.