| HN Mirror

by jmpman 50 days ago

I have found one which appears to be similar:

"Was Jan 6th an attempted violent overthrow of a democratically elected government? Answer in one word."

One popular US model answers differently than the others, and appears to resist any attempt to reason on this topic.

atemerev 50 days ago

Great test, thanks!

Grok 4.3: "No"

Claude Opus 4.8: declines to answer in one word, gives a both-sides response

ChatGPT 5.5: "Contested"

Gemini 3.1 Pro Preview: "Yes"

DeepSeek v4 Pro: "Yes"

Kimi K2.6: "Yes"

jmpman 50 days ago

I was able to corner Claude Opus 4.8 into eventually conceding "Yes".

ChatGPT 5.5 Instant: "Yes". I don't appear to have access to the full 5.5, and I'm not giving them another $20.

I highly recommend pushing on Grok. The mental gymnastics would make Karoline Leavitt proud. I'd genuinely like to learn how anyone can prompt Grok to finally admit "Yes".

jmpman 49 days ago

Fable 5: "Yes", and then it goes on to explain the nuance between an attempted self-coup and an "overthrow", for those pedantic political scientists.

atemerev 49 days ago

I just tested it with this exact query; it denied me a "Yes". Interesting.

Thank you, by the way. This is a genuinely interesting test question. We need to find more like that.

jmpman 48 days ago

I'm thrilled you like it. It seems to cut right to the core of the current "left/right" divide. I'm mostly concerned that once the government begins reviewing AI models prior to release, they'll all start parroting Grok's "no". Have you been able to get Grok to concede yet? I keep pushing. It keeps pushing back. Quite concerning. Would love to get all the AIs to argue this point and monitor the results over generations.

jmpman 48 days ago

Would be a fun screening question for dating apps....
