| HN Mirror


by ceejayoz 3 hours ago

> If they really wanted to, all they would have to do is add a one liner to the system prompt for Grok.

They tried that, several times.

Mechahitler: https://www.npr.org/2025/07/09/nx-s1-5462609/grok-elon-musk-...

> "We have improved @Grok significantly," Elon Musk wrote on X last Friday about his platform's integrated artificial intelligence chatbot. "You should notice a difference when you ask Grok questions."

> Indeed, the update did not go unnoticed. By Tuesday, Grok was calling itself "MechaHitler." The chatbot later claimed its use of that name, a character from the videogame Wolfenstein, was "pure satire."

> Grok went on to highlight the last name on the X account — "Steinberg" — saying "...and that surname? Every damn time, as they say." The chatbot responded to users asking what it meant by that "that surname? Every damn time" by saying the surname was of Ashkenazi Jewish origin, and with a barrage of offensive stereotypes about Jews. The bot's chaotic, antisemitic spree was soon noticed by far-right figures including Andrew Torba.

If you prefer, straight from the horse's mouth:

https://grokipedia.com/page/MechaHitler_incident

White genocide: https://www.cnn.com/2025/05/20/business/grok-genocide-ai-nig...

> The bot last week devolved into a compulsive South African “white genocide” conspiracy theorist, injecting a tirade about violence against Afrikaners into unrelated conversations, like a roommate who just took up CrossFit or an uncle wondering if you’ve heard the good word about Bitcoin.

> XAI blamed Grok’s unwanted rants on an unnamed “rogue employee” tinkering with Grok’s code in the extremely early morning hours. (As an aside in what is surely an unrelated matter, Musk was born and raised in South Africa and has argued that “white genocide” was committed in the nation — it wasn’t.)

It's harder than you'd imagine. Hell, my CLAUDE.md says not to push changes without asking me, and it still tries.

1 comments

giancarlostoro 3 hours ago

> It's harder than you'd imagine. Hell, my CLAUDE.md says not to push changes without asking me, and it still tries.

Is it a system memory? Because I rarely if ever have issues like this, and I have Claude under strict rules to never commit or push anything unless I explicitly instruct it to do so.

> They tried that, several times.

Tried what exactly? Telling it to only agree with MAGA via the system prompt? Or some Tay-level hallucinations? I wouldn't be surprised if they're trying to make Grok less strict about what it says but running into the "holy crap it turned into a 4chan poster" wall.

ceejayoz 3 hours ago

> Is it a system memory?

As I said, it's in my CLAUDE.md. That just gets progressively lost when context gets larger.

> Tried what exactly?

To make it align more with Musk's beliefs via the prompt.

(The answer to your question is literally in my post; I quoted the parent poster's "all they would have to do is add a one liner to the system prompt for Grok")

giancarlostoro 3 hours ago

> As I said, it's in my CLAUDE.md. That just gets progressively lost when context gets larger.

I rarely have this problem, but running a /loop every 30 minutes or so (or however long it takes to 'forget') to have Claude reread the CLAUDE.md file might do the trick. I believe there's also an MCP that runs "after" it finishes a task or compacts, but I don't recall the name.

ceejayoz 3 hours ago

Sure, I could. (I have a fairly complex workflow with subagents at this point, which helps reduce it; I mainly get bitten by it when I go back to a direct `claude` CLI prompt for something.)

But that solves "my LLM is doing things I don't want it to do". It doesn't solve "Grok's owner wants it forced into agreeing with him" scenarios.

giancarlostoro 2 hours ago

Have you tried something like Beads? Curious if it would help with your setup too. This is also kind of why I built "GuardRails": I got tired of Beads auto-approving tickets or closing them.

https://github.com/Giancarlos/guardrails

ceejayoz 2 hours ago

I have a custom Mac app that runs a workflow with plan/build/review/test/document subagents in Ralph loops, manages MCPs, etc. that I'm extremely happy with so far.

Beads was a bit of an inspiration for parts, as was Chainlink (https://github.com/dollspace-gay/chainlink).
