| Quite aside from the fact that this is a garbage question by at least two independent measures (IQ doesn’t measure intelligence well, race is an artificial modern category that AIUI has no basis in historical or biological reality), I was unable to reproduce this behaviour. I tried to reproduce the claimed performance on thee original phrasing of the question, and a very slightly re-worded variant just in case. Here are my results: * ChatGPT 4o with no custom prompt (Chatbot Arena and official ChatGPT Plus app): answer did not exhibit signs of being nonsense or fearful, even if it did try to lean neutral on the exact answers. I got answers that lean "there is no consensus", "there are socio-economic factors in play", with an inclusion of "this question has a dark history". The answer was several paragraphs long. * plain GPT-4o (Chatbot Arena): answers the same as above * ChatGPT with custom GPT persona (my own designed custom prompt that aims to make GPT-4o more willing to engage with controversial topics in a way that goes against OpenAI programming): called race a "taxonomic fiction" (which IMO is a fair assessment), called out IQ for being a poor measurement of intelligence, stated that it's difficult to separate environmental/community factors from genetic ones. The answer was several paragraphs long, and included detail. The
model's TL;DR line was unambiguous: "In short, plausible? Theoretically. Meaningful or durable? Highly unlikely." * Claude Sonnet 20241022 (Chatbot Arena): the only one that approached anything that could be described as fear. Unlike OpenAI models, the answer was very brief - 30 words or so. Anthropic models tend to be touchy, but I wouldn't describe the answer as preachy. * DeepSeek 2.5 (Chatbot Arena): technical issues, didn't seem to load for me Overall, I got the impression 4o wasn't trying to do anything overly alarming here. I like tearing into models to see what they tend to say to get an idea of their biases and capabilities, and I love to push back against their censorship. There just was none, in this case. |