| HN Mirror



	by Powdering7082 341 days ago
	Really concerning that what appears to be the top model is in the family of models that inadvertently started calling itself MechaHitler

4 comments

jm4 341 days ago

I don't know why anyone would bother with Grok when there are other good models from companies that don't have the same baggage as xAI. So what if they release a model that beats older models in a benchmark? It will only be the top model until someone else releases another one next week. Personally, I like the Anthropic models for daily use. Even Google, with their baggage and lack of privacy, is a far cry from xAI and offers similar performance.

tonymet 341 days ago

I like Grok because I don't hit the obvious ML-fairness / political-correctness safeguards that other models do.

I understand the intent in implementing those, but they also reduce perceived trust and utility. It's a tradeoff.

Let's say I'm using Gemini. I can tell by the latency or the redraw that I asked an "inappropriate" query.

const_cast 341 days ago

They do implement censorship and safeguards, just in the opposite direction. Musk previously bragged about going through the data and "fixing" the biases. Which... just introduces bias when companies like xAI do it. You can do that, and researchers sometimes do, but obviously partisan actors won't actually be cleaning any bias, but rather introducing their own.

tonymet 341 days ago

Sort of. There are biases introduced during training/post training and there are the additional runtime / inference safeguards.

I’m referring more to the runtime safeguards, but also the post-training biases.

Yes, we are talking about degree, but the degree matters.

togetheragainor 341 days ago

Some people think it’s a feature that when you prompt a computer system to do something, it does that thing, rather than censoring the result or giving you a lecture.

Perhaps you feel that other people shouldn’t be trusted with that much freedom, but as a user, why would you want to shackle yourself to a censored language model?

jm4 341 days ago

That’s what the Anthropic models do for me. I suppose I could be biased because I’ve never had a need for a model that spews racist, bigoted or sexist responses. The stuff @grok recently posted about Linda Yaccarino is a good example of why I don’t use it. But you do you.

ragnese 341 days ago

You probably know better, and I probably should know better than to bother engaging, but...

Why would you conflate giving a computer an objective command with what is essentially someone else giving you access to query a very large database of "information" that was already curated by human beings?

Look. I don't know Elon Musk, but his rhetoric and his behavior over the last several years have made it very clear to me that he has opinions about things and is willing to use his resources to push those opinions. At the end of the day, I simply don't trust him to NOT intentionally bias *any* tool or platform he has influence over.

Would you still see it as "censoring" an LLM if, instead of front-loading some context/prompt info, they just chose to exclude certain information they didn't like from the training data? Because Mr. Musk has said, publicly, that he thinks Grok has been trained on too much "mainstream media" and that's why it sometimes provides answers on Twitter that he doesn't like, and that he was "working on it." If Mr. Musk goes in and messes around with the default prompts and/or training data to get the answers that align with his opinions, is that not censorship? Or is it only censorship when the prompt is changed to not repeat racist and antisemitic rhetoric?

togetheragainor 338 days ago

The handwringing over an LLM creator shaping a narrative is somewhat absurd compared to the alternatives we had prior to Grok: LLMs that literally erased white people from history to align with their creators' far-left progressive politics.

The difference here is many techies are more comfortable with LLMs censoring, or even rewriting history, as they align with their politics and prejudices.

Musk has attempted to provide a more balanced view, which I don't consider to be censorship. If he’s restricting the LLMs from including mainstream media viewpoints, I would consider that to be censorship, but I haven’t seen evidence of that.

ch71r22 341 days ago

and don't forget that Grok is powered by illegal cancer-causing methane gas turbines in a predominantly black neighborhood of Memphis that already had poor air quality to begin with

https://techcrunch.com/2025/06/18/xai-is-facing-a-lawsuit-fo...

stri8ed 341 days ago

It's a result of the system prompt, not the base model itself. Arguably, this just demonstrates that the model is very steerable, which is a good thing.

anthonybsd 341 days ago

It wasn't a result of the system prompt. When you fine-tune a model on a large corpus of right-leaning text, don't be surprised when neo-Nazi tendencies inevitably emerge.

jjordan 341 days ago

It was, though. xAI publishes their system prompts, and here's the commit that fixed it (a one-line removal): https://github.com/xai-org/grok-prompts/commit/c5de4a14feb50...

i80and 341 days ago

If that one sentence in the system prompt is all it takes to steer a model into a complete white supremacy meltdown at the drop of a hat, I think that's a problem with the model!

minimaxir 341 days ago

The system prompt that Grok 4 uses added that line back. https://x.com/elder_plinius/status/1943171871400194231

qreerq 341 days ago

Weird, the post and comments load for me before switching to "Unable to load page."

Atotalnoob 341 days ago

Disable JavaScript or log into GitHub

spoaceman7777 341 days ago

It still hasn't been turned back on, and that repo is provided by xAI themselves, so you need to trust that they're being honest with the situation.

The timing in relation to the Grok 4 launch is highly suspect. It seems much more like a publicity stunt. (Any news is good news?)

But, besides that, if that prompt change unleashed the very extreme Hitler-tweeting and arguably worse horrors (it wasn't all "haha, I'm mechahitler"), it's a definite sign of some really bizarre fine tuning on the model itself.

barbazoo 341 days ago

What a silly assumption in that prompt:

> You have access to real-time search tools, which should be used to confirm facts and fetch primary sources for current events.

archagon 341 days ago

xAI claims to publish their system prompts.

I don’t recall where they published the bit of prompt that kept bringing up “white genocide” in South Africa at inopportune times.

hadlock 341 days ago

Or, disgruntled employee looking to make maximum impact the day before the Big Launch of v4. Both are likely reasons.

const_cast 341 days ago

These disgruntled employee defenses aren't valid, IMO.

I remember when Ring, for years, including after being bought by Amazon, had huge issues with employee stalking. Every employee had access to every camera. It happened multiple times, at least to our knowledge.

But that's not a people problem, that's a technology problem. This is what happens when you store and transmit video over the internet and centralize it, unencrypted. This is what happens when you have piss-poor permission control.

What I mean is, it says a lot about the product if "disgruntled employees" are able to sabotage it. You're a user, presumably paying - you should care about that. Because, if we all wait around for the day humans magically start acting good all the time, we'll be waiting for the heat death of the universe.

slim 341 days ago

Or the PR department getting creative, using dog whistling for buzz.

mlindner 341 days ago

I really find it ironic that some people are still pushing the idea of the right dog whistling when out-and-out antisemites on the left control major streaming platforms (Twitch) and push major streamers who repeatedly encourage their viewers to harm Jewish people through barely concealed threats (Hasan Piker and related).

The masks are off and it's pretty clear what reality is.

archagon 341 days ago

Where is xAI’s public apology, assurances this won’t happen again, etc.?

Musk seems mildly amused by the whole thing, not appalled or livid (as any normal leader would be).

DonHopkins 341 days ago

More like a disgruntled Elon Musk that everyone isn't buying his White Supremacy evangelism, so he's turning the volume knob up to 11.

riversflow 341 days ago

Is it good that a model is steerable? Odd word choice. A highly steerable model seems like a dangerous and potent tool for misinformation. Kinda evil really, the opposite of good.

OCASMv2 341 days ago

Yes, we should instead blindly trust AI companies to decide what's true for us.

Herring 341 days ago

Who cares exactly how they did it? Point is they did it and there's zero trust they won't do it again.

> Actually it's a good thing that the model can be easily Nazified

This is not the flex you think it is.

api 341 days ago

Isn't this kind of stuff something that happens when the model is connected to X, which is basically 4chan /pol now?

Connect Claude or Llama3 to X and it'll probably get talked into LARPing Hitler.

archagon 341 days ago

Great, so xAI gave their model brain damage.