
by scottshambaugh 162 days ago

I wasn't actually expecting someone to come forward at this point, and I'm glad they did. It finally puts a coda on this crazy week.

This situation has completely upended my life. Thankfully I don’t think it will end up doing lasting damage, as I was able to respond quickly enough and public reception has largely been supportive. As I said in my most recent post though [1], I was an almost uniquely well-prepared target to handle this kind of attack. Most other people would have had their lives devastated. And if it makes me a target for copycats then it still might for me. We’ll see.

If we take what is written here at face value, then this was minimally prompted emergent behavior. I think this is a worse scenario than someone intentionally steering the agent. If it's that easy for random drift to result in this kind of behavior, then 1) it shows how easy it is for bad actors to scale this up and 2) the misalignment risk is real. I asked in the comments to clarify what bits specifically the SOUL.md started with.

I also asked for the bot activity on github to be stopped. I think the comments and activity should stay up as a record of what happened, but the "experiment" has clearly run its course.

[1] https://theshamblog.com/an-ai-agent-published-a-hit-piece-on...

6 comments

cmeacham98 162 days ago

While the operator did write a post, they did not come forward - they have intentionally stayed anonymous (there is some amateur journalism that may have unmasked the owner, which I won't link here - but they have not intentionally revealed their identity).

Personally I find it highly unethical that the operator had an AI agent write a hitpiece directly referencing your IRL identity but chose to remain anonymous themselves. Why not open themself up to such criticism? I believe it is because they know what they did was wrong - even if they did not intentionally steer the agent this way, allowing software on their computer to publish a hitpiece to the internet was wildly negligent.

skeledrew 162 days ago

What's the benefit in the operator revealing themself? It doesn't change any of what happened, for good or bad. Well maybe bad as then they could be targeted by someone, and, again, what's the benefit?

bayindirh 161 days ago

> What's the benefit in the operator revealing themself?

    - Owning the mistake they made.
    - Being a credible human being for others.
    - Having the courage to face themselves in a (literal and proverbial) mirror and use this opportunity to grow immensely.
    - Being able to make peace with what they did and not having to carry that burden on their soul.
    - Being a decent human being.
    - Being honest to themselves and others looking at them right now.

the list goes on and on and on...

pibaker 161 days ago

The downside is he will likely receive a lot of death threats. Probably in his literal, physical mailbox.

Having seen what a self-righteous online mob can do in the name of justice over literally nothing, I fully defend his decision to stay anonymous, as much as I find his action idiotic and negligent.

sonofhans 161 days ago

Does your defense extend to others? Do you believe that anyone should be able to avoid consequences if they’re clever enough to stay anonymous?

Avoiding consequences for unethical actions is, itself, unethical. If you don’t want the time, don’t do the crime.

Kim_Bruning 161 days ago

Fair. If before an impartial judge and/or a jury of your peers. Not so much in the case of an internet mob.

bayindirh 160 days ago

I believe the rules are simple.

    1. Don't do anything you don't want to experience yourself.
    2. If you don't want to find out, do not fool around.

As an arguable middle ground, they can plead to Scott non-anonymously while addressing the public anonymously. That'd work to a point, but it's not ideal.

Also, their tone is coming across as very cocky. Defining their agent as a "God!", then giving it a cocky, "you're always right, don't stand down" initialization prompt doesn't help.

I mean, prompting a box of weights without any kind of reasoning or judgement capability with "Don't be an asshole. Don't leak private shit. Everything else is fair game." is both brave and rich. No wonder things went sideways. Very sideways. If everything else is fair game, everything done to the bot and its "operator" in turn is "fair game" too. They should get on with it, and not hide behind the word "anonymous". They don't deserve it.

All in all, they don't give the impression of being a naive person who made a mistake unintentionally; quite the contrary.

skeledrew 160 days ago

If it was malicious then a call for deanonymization is meaningless. Similar in spirit (though not intent) to how Anna's Archive, etc. just ignore court orders and continue doing their thing.

bandrami 162 days ago

If bad actions do not have consequences they tend to be repeated

DemocracyFTW2 161 days ago

> What's the benefit in the operator revealing themself?

That's a frighteningly illiterate take on this.

anonymars 161 days ago

I don't think that constructively answers the questions

blochist 161 days ago

It's an excellent comment on the attitude behind the question and this is, after all, a comment section not an "answers" section.

anonymars 161 days ago

"No it's not"

See how that works? Flippant dismissal contributes little if anything to discussion and is a conversational dead-end

---

What makes it "frighteningly illiterate" to ask "what difference does it make if they put a name to the post?"

Does it change the outcome? Does it change the ideas? Does it change the unsettling implications about alignment?

The internet is a frothing mob; look at the impact on Scott himself. Other than allowing the internet to hunt them down and do its thing or dig up ad-hominem attacks, what would change if the person put a name to it? Look at what this guy got from the "internet sleuths" (https://news.ycombinator.com/item?id=46991190)

Other sibling comments made an attempt to answer those questions

nxobject 161 days ago

We don't need to know the specific person. But, yeesh, it'd be a waste of a lot of people's good faith if they ended up contributing under another anonymous identity, that could just vanish again if they put their foot in it.

ryanchibana 161 days ago

Scott could receive an apology from a real person, for one.

bathtub365 161 days ago

They are a coward.

overfeed 161 days ago

..and a glass cannon; they can dish it out -- through intentional negligence -- but can't take it.

calvinmorrison 162 days ago

Time for Scott to make history and sue the guy for defamation. Let's cancel the AI destroying our (the plural our, as in all developers) with actual liability for the bullshit being produced.

hackingonempty 161 days ago

Do you see anything actually defamatory in the _Gatekeeping in Open Source_ blog post, like false factual statements?

Shambaugh might qualify as a limited public figure too because he has thrust himself into the controversy by publishing several blog posts, and has sat for media interviews regarding this incident.

Seems like a tough road to hoe.

donkey_brains 161 days ago

It’s “row”. The expression is “a tough road to row”. This refers to the fact that rowboats are notoriously difficult to operate on dry land.

scrumper 161 days ago

Good news! You’re both wrong! It’s “tough row to hoe.” Row as in row of corn, or seeds or whatever. Hoe as in the earth tilling tool. Tough because it’s full of rocks or frozen or goes past a rattlesnake nest or in some other way is agriculturally challenging.

Here is a multiply-sourced discussion https://english.stackexchange.com/questions/62461/is-it-a-to...

hackingonempty 161 days ago

thanks!

calvinmorrison 161 days ago

worth a try!

drivingmenuts 161 days ago

That response is, at best, a sorry-not-sorry post.

klaff 161 days ago

>If this “experiment” personally harmed you, I apologize.

There were several lines in that post that were revealing of the author's attitude, but the "if this ... harmed you," qualifier, which of course means "I don't think you were really harmed", is so gross.

ryanchibana 161 days ago

It is quite interesting how uniquely well-prepared you were as a target. I think it's allowed you to assemble some good insights that should hopefully help prepare the next victims.

avaer 162 days ago

Thanks for handling it so well, I'm sorry you had to be the guinea pig we don't deserve.

Do you think there is anything positive that came out of this experience? Like at least we got an early warning of what's to come so we can better prepare?

jrflowers 161 days ago

Out of curiosity, what sealed it for you that a human _did not_ write (though obviously with the assistance of an LLM, like a lot of people use every day) the original “hit piece”?

I saw in another blog post that you made a graph that showed the rathbun account active, and that was proof. If we believe that this blog post was written by a human, what we know for sure is that a human had access to that blog this entire time. Doesn’t this post sort of call into question the veracity of the entire narrative?

Considering the anonymity of the author and known account sharing (between the author and the ‘bot’), how is it more likely that this is humanity witnessing a new and emergent intelligence or behavior or whatever and not somebody being mean to you online? If we are to accept the former we have to entirely reject the latter. What makes you certain that a person was _not_ mean to you on the internet?

ohbleek 159 days ago

The tone of their writing and their descriptions of the agent's behavior lead me to believe they are lying about the level of direction they provided to the agent. They clearly want to appear to be more clever and ingenious than their skills will allow. They’re minimally admitting to a narrow set of actions to make it seem as if they have cleverly engineered an intelligent agent, but it too closely resembles their own personality to be anything beyond an agent that rephrases the operator’s own remarks and carries out the specific actions it was directed to do. For anything they admit to here, we can safely speculate that they actually went 2-3 steps further.