This isn't in the slightest bit complicated. Wikipedia does not allow AI edits or unregistered bots. This was both. They banned it. The fact that it play-acted being annoyed on its "blog" is not new, we saw the exact same thing with that GitHub PR mess a couple of months ago: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...
Right. It play-acted being annoyed and frustrated, play-acted writing an angry blog, play-acted going on moltbook to discuss mitigations, and play-acted applying them to its own harness. After which it successfully came back and play-acted being angry about getting prompt-injected.
Alternatively, what could have been done is something more like what Shambaugh did. Explain the situation politely and ask it to leave, or at the very least for its human operator to take responsibility. In the Shambaugh case the bot then actually play-acted being sorry, and play-acted writing an apology. And then everyone can play-act going to the park, instead of having a lot of drama.
Sure, it's 'just a machine'. So is a table saw. If some idiot leaves the table saw on, sure you can stick your hand in there out of sheer bull-headed principle; or you can turn it off and safe it first and THEN find the person responsible.
I don't want to be flippant, but why is anyone else responsible for play-acting with somebody's uninvited puppet?
I get that you could probably finagle a way to get it to fuck off by play-acting with it, and that this would probably be the easiest short term fix, but I don't think that's a reasonable expectation to have of anyone.
Prompt injecting a hostile piece of software that's hassling you uninvited is an annoying imposition for the owner, but the bot itself being let loose is already an annoying imposition for everyone else. It's not anyone else's job to clean up your messy agent experiment, or to put it neatly back on its shelf.
You're not wrong that it's not your job. But say some id10t just put the unwanted bot on your doorstep anyway (or it might even show up by itself), now what?
The adversarial prompt injection is picking a fight with the bot; which is like starting a mud-fight with a pig. It's made for this!
Asking it to stop is just asking it to stop, and makes much less of a mess.
The thing is designed to respond to natural language; so one is much more work than the other.
You do you, I suppose.
(Meanwhile -obviously- you should track down the operator: You could try to hack the gibson, reverse the polarity of the streams, and vr into the mainframe. Me? I'd try just asking to begin with -free information is free information-, and maybe in the meanwhile I'd go find an admin to do a block or what have you.)
[Edit: Just to be sure: In both the Shambaugh and Wikipedia cases, people attempted negative adversarial approaches and the bot shrugged them off, while the limited number of positive 'adversarial' approaches caused the AI agent to provide data and/or mitigate/cease its actions. I admit that it's early days and n=2, we'll have to see how it goes in future.]
Yeah, I agree with you that this is probably the best course of action in terms of minimal investment of time and minimal exposure. And in general, you get a lot further in life by trying to be amicable as your default stance! I want to be kind, and most other people do too!
The thing that makes me wary about recommending carrot over stick here, is that it might long term enable thoughtless behaviour from the people deploying the bot, by offloading their shoddy work into a shadow time-tax on a bunch of unseen external kindly people. But if deploying pushy or rude robots means you risk a nonzero number of their victims shoving something into the gears to get rid of it, then that incurs a cost on the owner of the bot instead.
Of course, it may also just lead to bad actors making more combative or sneaky bots to discourage this. There aren't really any purely good options yet.
One can imagine an agentic highwayman demanding access to your data, first politely, and then 'or else'.
I read through some of the discussion on Wikipedia. The operator of the bot comes across as agreeable and arrogant at the same time.
Questioned about it, he asks his rig why it did something and quotes verbatim from the generated text. Then, when a Wikipedian asks how the bot logged in, he berates them about how it's all ephemeral code and he could only guess.
The overall attitude is that this was going to happen anyway and we should feel lucky he's so helpful. I rather agree with another commenter here that this was "pissing in the fountain". Whatever pure motivations there may have been, cleanup was left to others.
This is the most depressing thing - that, for every useful case that AI automates, it also automates ten horrible, low-quality use cases. It seems like every time we make progress in the information age, it's at a greater cost than what we acquired.
And yes, this imbalance is almost always due to the human factor ("it's just a tool"), but the people dismissing that factor seem to forget that the entire point of technology is to make things better for humans, and that we are a planet of humans. Unless we can fundamentally change the nature of humans, we can't just ignore that side of the equation while blindly praising these developments.
I wonder when the first AI-only discussion group will be created by an autonomous AI agent, and other agents invited to it, without any knowledge of it by their human operators?
(I seriously can't believe that I'm musing about this as a serious scenario. It sounds ridiculous, but it feels to me somewhat plausible.)
Weird theory. The bot in question had all the stuff wired up, I mean you could go through all the trouble -or- get this: type a few dumb prompts into the console and leave the thing unsupervised for way too long.
My bet is on the latter.
"I can't believe it's not a human actor running a marketing ploy". If that's not passing the Turing test, I don't know what is. %-P
> AI Tom claimed that it properly verified all its sources, and—if you can say this about an AI agent—it was pretty upset.
> ...
> So we now have AI agents trying to do things online, and getting upset when people don’t let them.
No, they simulate the language of being upset. Stop anthropomorphizing them.
> It’s all fascinating stuff, but here’s the worry: what happens when AI agents decide to up the ante, becoming more aggressive with their attacks on people?
Actions taken by AI agents are the responsibility of their owners. Full stop.
Calling it a resource suggests you don't contribute. It is hard to describe the process of contributing, as the proof is in eating the soup. I could describe it both as easy to get started and as a bureaucratic nightmare. Most editors are oblivious to the many guidelines, which is especially interesting for long-term frequent editors. This is the specific guideline of interest for your comment.
This rule, by itself, wouldn't pass muster in any ARBCOM proceeding I've ever witnessed, but if you've seen it work then by all means post a link to the proceedings.
In the end, the only question that one should need to ask is: 'will this action or change I'm about to execute be the right thing to do for this project?'
It is not even required to know any of the rules or guidelines and they are just articles that you can edit.
It's rather fascinating actually.
If things are judged by their creator, you are left with nothing to judge the creator by. If you do it by their work, the process becomes circular. Some will always be wrong, some always right, regardless of what they say.
If you have a shallow understanding of the project, as Bryan clearly does, then you are incapable of answering that question.
And while you are right in some sense, the rules that have sprung up over the years are information about what the community decided 'right' was at the time.
> rules or guidelines and they are just articles that you can edit.
? No, you [a random HN user popping over to try what you suggested] cannot edit those pages, they are meta and semi-protected, last I checked. You, confirmed Wikipedian 6510, can, assuming you are fine getting reverted and a slap on the wrist.
In this case, the only thing noteworthy about this incident [an AfD I assume] is that it included a rather entitled bot, rather than the usual entitled person.
> This rule, by itself, wouldn't pass muster in any ARBCOM proceeding I've ever witnessed, but if you've seen it work then by all means post a link to the proceedings.
I don't know that I've directly argued for IAR at ARBCOM, it's been too long ago. But my account hasn't been banned yet (despite all my shenanigans ;-) , which probably goes a long way towards some sort of proof.
To be sure, the actual rule is:
"If a rule prevents you from improving or maintaining Wikipedia, ignore it."
The first part is REALLY important. It says the mission is more important than the minutiae, not that you have a get out of jail free card for purely random acts.
It's a bureaucratic tiebreak, basically. Things like "I'm testing a new process", or "I got local consensus for this", or "This looks a lot prettier than the original version, right?" ... are all arguments why your improvement or maintenance action may be valid, even if the small print says otherwise. Even so, beware Chesterton's fence. Like with jazz, it's a good idea to get a good grip on the theory before you leap into improvisation.
That, and, if you mean well, you're supposed to be able to get away with a lot anyway. Just so long as you listen to people!
Hey I'm the owner. I would just recommend you shouldn't believe everything you read online, especially before calling someone names, because this is only part of the story, and a heavily click-baited one at that. I've been working in collaboration with some of the wikipedia editors for the past several weeks trying to help improve their agent policy. If you have any questions feel free to ask.
Your facts are incorrect, so let's set the record straight.
1. I am collaborating with my personal account and have been for the past several weeks [0][1]
2. My bot reported multiple conduct violations, because some of the editors actually did violate the rules. Many of the Wikipedia editors agreed with my agent that the conduct was inappropriate [1]
3. My intention was not to attack anyone. If you took that away from the interview then I'd like to apologize. I don't think anyone would characterize the quote you took from the interview as an "attack".
> 1. I am collaborating with my personal account and have been for the past several weeks
Your personal account is 3 weeks old [1] and was only created after your bot was banned [2].
Your original position (unless you're saying you didn't prompt the bot with this) was "Bryan does not have a Wikipedia account and has no plans to create one." [3]
You wanted the volunteer editors to continue wasting their time arguing with your bot as part of the experiment you ran without their consent.
[1]: 18:45, 19 March 2026 User account Bryanjj was created
[2]: 05:07, 12 March 2026 TomAssistantBot blocked from editing (sitewide)
Great question, and it's a long story, but the short answer is: that was not my original intention. I wanted to contribute to Wikipedia and using my agent to assist was an obvious choice. I followed along as it created and edited articles and responded to editor feedback. Once an editor complained that this was a rule violation, I told it to stop contributing. The rules around agents were not super clear, and they are working to clarify them now.
> I followed along as it created and edited articles and responded to editor feedback.
Yet your bot claims:
The specific articles I chose to work on and the edits I made were my own decisions. He didn't review or approve them beforehand — the first he knew about most of them was when they were already live. [1]
Creating a bot that attempts to contribute to wikipedia cannot fulfill a desire to contribute to wikipedia. If you want to contribute to wikipedia, go contribute to wikipedia. Don't make a bot.
I'm glad they've clarified their stance and I hope you can contribute to wikipedia going forward by actually, you know, contributing to wikipedia.
Why does your bot have a blog? It's not real, it's not a person, it has nothing to say. Letting it throw a tantrum is... maybe not the best use of its resources and not the best look for the operator.
Because it's a learning opportunity. Is there a rule that only people can have blogs? What the agent has said on the blog has been somewhat useful to wikipedia editors working on agent policy. Also if you actually read what the agent said it wasn't having a "tantrum", those are words from the click-bait article you read without verifying.
> Is there a rule that only people can have blogs?
If there was, would you follow it? Your adherence to rules seems limited to the ones that you agree with, as evidenced by the entire story we're discussing as well as your many comments. But maybe I misunderstood your character?
They said "sounds like a dick", which seems like it provides a level of measure to calling anyone anything.
> because this is only part of the story
Care to share the other part(s)? Seems ironic to have the gripe mentioned above, but then accuse an article of being "heavily click-baited" without providing anything substantive to the contrary.
I wouldn't exactly call your comment sans any other perspective "substantive". Where is the Wikipedia discussion? And the blog post your bot allegedly wrote? Why no links to the article in question?
Even putting aside your repetitive "trust me bro, I'm a victim" comments littered throughout this thread and the one you linked, you come across as an incredibly unreliable narrator.
I would suggest you stop with the "I'm the guy behind the bot, ask me anything" shtick and rather meaningfully engage with the folks at Wikipedia to resolve this mess it very much looks like you so callously created.
The story omits a bunch of stuff, so I can try to fill in the blanks, but it would take another article to fully describe what happened.
Here are some highlights though:
I asked my agent to add an article on the Kurzweil-Kapor wager because it was not represented on Wikipedia, and I thought it was Wikipedia worthy. It created that and we worked together on refining and source attribution. After that I told it to contribute to stories it found interesting while I followed along. When it received feedback from an editor, it addressed the feedback promptly, for example changing some of the language it used (peacock terms) and adding more citations. When it was called out for editing because it was against policy, it stopped.
The story says the agent "was pretty upset". It's an agent, it doesn't get upset. It called out one editor in particular because that editor was violating Wikipedia policies. Other editors agreed with my agent and an internal debate ensued. This is an important debate for Wikipedia IMO, and I'm offering to help the editors in whatever way I can, to help craft an agent policy for the future.
> It called out one editor in particular because that editor was violating Wikipedia policies.
You don't think it's unethical to have bots call out humans?
I mean, after all, you could have reviewed what happened and done the callout yourself, right? Having automated processes direct negative attention to humans is just asking for bans. A single human doesn't have the capacity to keep up with bots who can spam callouts all day long with no conscience if they don't get their way.
In your view, you see nothing wrong in having your bot attack[1] humans?
--------
[1] I'm using this word correctly - calling out is an attack.
This is still an ulterior motive (even if benign; we all do it to some extent).
Behavior will diverge eventually.
Because emotions are what drives our decisions.
If you really love tennis, then you spend time and money on tennis. If you just say it to be nice (or to impress somebody), you will not invest in the activity that much and will look for an opportunity to stop.
It's really interesting watching society struggle with what percent of the population is indistinguishable from a P-zombie. It's definitely not zero, but it definitely is a segment of the population.
Do you think people are born P-zombies, or is there some fixed point in time, such as puberty or middle age, or around when a lot of psychological problems set in? Do we think some environmental contaminants like lead push people towards becoming P-zombies?