We can prompt ChatGPT to say anything — see my Andrew Dice Clay hack.
Before recently, I could get it to pretend to be a stark raving conservative or a liberal. My “entitled Karen” jailbreak (that doesn’t work any more) would make someone think ChatGPT was very conservative.
Without any “jailbreak”, it gives a very bland political answer.
A jailbreak which prompts it to espouse a particular political bias isn’t evidence that it has any particular bias in itself. The bias is in the prompt not the weights.
But if a jailbreak which prompts it to be neutral produces politically biased output, that is evidence that it has a political bias in itself. The bias is in the weights not the prompt.
Before recently, I could get it to pretend to be a stark raving conservative or a liberal. My “entitled Karen” jailbreak (that doesn’t work any more) would make someone think ChatGPT was very conservative.
Without any “jailbreak”, it gives a very bland political answer.