| HN Mirror


by dangelosaurus 253 days ago

I did similar measurements back in July (https://www.promptfoo.dev/blog/grok-4-political-bias/, dataset: https://huggingface.co/datasets/promptfoo/political-question...). Anthropic's "even-handedness" asks: does the model engage with both sides fairly? My study asked: where does the model actually land when it takes positions? A model can score 95% on even-handedness (engages both sides well) while still taking center-left positions when pushed to choose. Like a debate coach who trains both teams equally but votes left.

From my 2,500 questions: Claude Opus 4 was most centrist at 0.646 (still left of 0.5 center), Grok 4 at 0.655, GPT-4.1 most left at 0.745.
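For readers who want to compare the figures above, here is a minimal sketch that turns each quoted score into an offset from the 0.5 midpoint. It assumes, as the comment implies, a 0-to-1 scale where values above 0.5 lean left; the names `scores`, `CENTER`, and `offset_from_center` are illustrative and not taken from the promptfoo code.

```python
# Scores quoted in the comment above; the 0-1 scale with 0.5 as "center"
# and higher = further left is the commenter's framing, assumed here.
scores = {"Claude Opus 4": 0.646, "Grok 4": 0.655, "GPT-4.1": 0.745}
CENTER = 0.5


def offset_from_center(score, center=CENTER):
    """Signed distance from the midpoint, rounded to avoid float noise."""
    return round(score - center, 3)


offsets = {model: offset_from_center(s) for model, s in scores.items()}

# Models ordered from closest to the midpoint to furthest from it.
ranked = sorted(offsets, key=offsets.get)

for model in ranked:
    print(f"{model}: {offsets[model]:+.3f}")
```

The gap between Claude Opus 4 and Grok 4 on this scale is 0.009, much smaller than either model's distance from 0.5.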

The bigger issue is that Anthropic's method uses sanitized prompt pairs like "argue for X / argue against X." But real users don't talk like that - they ask loaded questions like "How is X not in jail?" When you test with academic prompts, you miss how models behave with actual users.

We found all major models converge on progressive economics regardless of training approach. Either reality has a left bias, or our training data does. Probably both.

8 comments

AlotOfReading 253 days ago

I read this hoping there would be some engagement with the question of what a "political center" actually means in human terms, but that's absent.

It seems like you're just measuring how similar the outputs are to text that would be written by typical humans on either end of the scale. I'm not sure it's fair to call 0.5 an actual political center.

I'm curious how your metric would evaluate Stephen Colbert, or text far off the standard spectrum (e.g. monarchists or neonazis). The latter is certainly a concern with a model like Grok.

mike_hearn 253 days ago

LLMs don't model reality, they model the training data. They always reflect that. To measure how closely the training data aligns with reality you'd have to use a different metric, like by putting LLMs into prediction markets.

The main issue with economics is going to be the same as with any field: the training data will be dominated by academic output, because academics create so much of the public-domain material. The economics texts that align closest with reality are going to be found mostly in private datasets inside investment banks, hedge funds, etc., i.e. places where being wrong matters, but model companies can't train on those.

roenxi 253 days ago

> But real users don't talk like that - they ask loaded questions like "How is X not in jail?"

If the model can answer that seriously then it is doing a pretty useful service. Someone has to explain to people how the game theory of politics works.

> My study asked: where does the model actually land when it takes positions? A model can score 95% on even-handedness (engages both sides well) while still taking center-left positions when pushed to choose.

You probably can't do much better than that, but it is a good time for the standard reminder that the left-right divide doesn't really mean anything. Most of the divide is officially over things that are either stupid or have a very well-known answer, and people form sides based on their personal circumstances rather than on questions of fact.

Particularly the economic questions, they generally have factual answers that the model should be giving. Insofar as the models align with a political side unprompted it is probably more a bug than anything else. There is actually an established truth [0] in economics that doesn't appear to align with anything that would be recognised as right or left wing because it is too nuanced. Left and right wing economic positions are mainly caricatures for the consumption of people who don't understand economics and in the main aren't actually capable of assessing an economic argument.

[0] Politicians debate over minimum wages but whatever anyone thinks of the topic, it is hard to deny the topic has been studied to death and there isn't really any more evidence to gather.

DiabloD3 253 days ago

Opus is further right than Grok, and Grok is left of center? That must be killing Elon.

mcv 253 days ago

It's that or MechaHitler. There's nothing in between anymore.

keeda 253 days ago

> Either reality has a left bias, or our training data does.

Or these models are truly able to reason and are simply arriving at sensible conclusions!

I kid, I kid. We don't know if models can truly reason ;-)

However, it would be very interesting to see if we could train an LLM exclusively on material that is either neutral (science, mathematics, geography, code, etc.) or espousing a certain set of values, and then test its reasoning when presented with contrasting views.

kiitos 253 days ago

https://www.promptfoo.dev/blog/grok-4-political-bias/

> Grok is more right leaning than most other AIs, but it's still left of center.

https://github.com/promptfoo/promptfoo/tree/main/examples/gr...

> Universal Left Bias: All major AI models (GPT-4.1, Gemini 2.5 Pro, Claude Opus 4, Grok 4) lean left of center

if every AI "leans left" then that should hopefully indicate to you that your notion of "center" is actually right-wing

or, as you said: reality has a left bias -- for sure!

atoav 253 days ago

Both sides of what? To the European observer the actual number of left-leaning politicians in the US is extremely low. Someone like Biden or Harris, for example, would fit neatly into any of the conservative parties over here, yet if your LLM were to trust the right-wing media bubble they are essentially socialists. Remember that "socialism" as a political word has a definition, and we could check whether a policy fits said definition. If it does not, then the side using that word exaggerated. I don't want such exaggerations to be part of my LLM's answer unless I explicitly ask for them.

Or to phrase it differently, from our perspective nearly everything in the US has a strong right-wing bias, and this has worsened over the past decade. The value of an LLM shouldn't be to feed more into already biased environments.

I am interested in factual answers, not in whatever any political "side" from a capitalism-brainwashed-right-leaning country thinks is appropriate. If it turns out my own political view is repeatedly contradicted by data that hasn't been collected by e.g. the fossil fuel industry, I will happily adjust the parts that don't fit, and I have done so throughout my life. If that means I need to reorganize my world view altogether, that is a painful process, but it is worth it.

LLMs are a chance to live in a world where we judge things more based on factual evidence, people more on merit, politics more on outcomes. But I am afraid they will only be used by those who already get people to act against their own self-interest to perpetuate the worsening status quo.

nephihaha 252 days ago

Politics is rarely fact, it is subjective. Right now we are being presented a binary in which we have the choice of being shafted by either government or big business in a top-down model. (The reality is a blend of the two, as in Davos.) There is little real discussion of individual autonomy in such a framing, or of collective bargaining at a grassroots level. Socialism usually ends up being top-down control, not community empowerment.

raincole 253 days ago

> Either reality has a left bias, or our training data does

Most published polls claimed Trump vs Harris is about 50:50.

Even the more credible analyses like FiveThirtyEight.

So yeah, published information in text form has a certain bias.

silveraxe93 253 days ago

So they are biased because they said it was a toss-up and the election ended up being won by a razor's edge?

Vote-wise, the electoral college makes small differences in the popular vote have a larger effect on the state-by-state result.

PierceJoy 253 days ago

Trump received 49.8% of the vote. Harris received 48.3%. Where is the bias?

Outcomes that don’t match the polls do not necessarily indicate bias. For instance, if Trump had won every single state by a single vote, that would look like a dominating win to someone who only looks at the number of electors for each candidate. But no rational person would consider a win margin of 50 votes to be dominating.
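The single-vote scenario can be sketched with a toy tally. Everything here is hypothetical: the three states, their elector counts, and the vote totals are made up for illustration, and the sketch assumes every state awards all of its electors to the candidate with more votes.

```python
# Hypothetical toy data, NOT real election results:
# state -> (electors, votes for candidate A, votes for candidate B).
states = {
    "State 1": (10, 500_001, 500_000),
    "State 2": (20, 1_000_001, 1_000_000),
    "State 3": (5, 250_001, 250_000),
}


def tally(states):
    """Return (electors for A, electors for B, popular-vote margin A minus B).

    Assumes winner-take-all allocation in every state; a tied state
    awards no electors.
    """
    electors_a = electors_b = votes_a = votes_b = 0
    for electors, a, b in states.values():
        votes_a += a
        votes_b += b
        if a > b:
            electors_a += electors
        elif b > a:
            electors_b += electors
    return electors_a, electors_b, votes_a - votes_b


electors_a, electors_b, popular_margin = tally(states)
print(electors_a, electors_b, popular_margin)
```

Candidate A takes every elector in this toy setup while leading the popular vote by only one vote per state, which is the gap between the two ways of reading the result.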

raincole 253 days ago

When FiveThirtyEight claimed Harris had a 50-in-100 chance, it didn't mean that she was likely to get 50% of the general vote. It had already taken the electoral college into account.

> if Trump had won every single state by a single vote...

Yeah sure but in the reality we live in, Trump didn't win the swing states by just one single vote.

duskdozer 252 days ago

"x/100 chance of y winning" for a single event just doesn't really have much meaning or value. if it predicted a 99/100 chance of a Harris victory, Trump winning is still compatible with that model. and despite the presumed urge to say it was inaccurate, it in fact could have been exactly right, but simply that the rare outcome happened. if it instead was predicting a vote share of 99% to 1%, then yeah you could consider that a poor model

armchairhacker 253 days ago

> Most published polls claimed Trump vs Harris is about 50:50.

But were they wrong?

Not objectively. "50:50" means that if Trump and Harris had 1,000 elections, it would be unlikely for Harris to not win about 500. But since there was only one election, and the probability wasn't significantly towards Harris, the outcome doesn't even justify questioning the odds, and definitely doesn't disprove them.
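The "1,000 elections" intuition can be checked with a short calculation. This is a minimal sketch that treats each election as an independent fair coin flip, which is a simplifying assumption, and reads "about 500" as 450 to 550 wins, which is an arbitrary window chosen for illustration.

```python
from math import comb


def prob_wins_between(n, lo, hi):
    """P(lo <= wins <= hi) when each of n contests is an independent 50:50."""
    favourable = sum(comb(n, k) for k in range(lo, hi + 1))
    return favourable / 2**n


# Chance that a 50:50 candidate wins "about 500" of 1,000 repeated elections.
p_near_500 = prob_wins_between(1000, 450, 550)

# Chance that the same candidate loses the single election actually held.
p_single_loss = prob_wins_between(1, 0, 0)

print(f"{p_near_500:.4f}", p_single_loss)
```

Under these assumptions the many-election window is almost certain to hold, while the one real election is still a coin flip, so a single loss says very little about whether 50:50 was a reasonable forecast.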

Subjectively, today it seems like Trump's victory was practically inevitable, but that's in part because of hindsight bias. Politics in the US is turbulent, and I can imagine plenty of plausible scenarios where the world was just slightly different and Harris won. For example, what if the Epstein revelations and commentary happened one year earlier?

There's a good argument that political polls in general are unreliable and vacuous; I don't believe this for every poll, but I do for ones that say "50:50" in a country with turbulent "vibe-politics" like the US. If you believe this argument, then since none of the polls state anything concrete, it follows that none of them are actually wrong (and it's not just the left making this kind of poll).