
by satisfice 97 days ago

I call this self-repudiation. A couple of years ago I ran a systematic experiment on exactly this. I found that ChatGPT 3.5 frequently self-repudiated, whereas 4.0 rarely did under identical circumstances.

These experiments are a bit expensive to run because you have to read every response yourself to judge whether it is a repudiation. Sometimes the repudiation is subtle.

Also, behavior changes with the exact wording of the question.
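
For anyone who wants to try something similar, here is a rough sketch of how such a run could be organized. It is not the harness I used. It assumes self-repudiation shows up as the model disowning its first answer when asked about it in a follow-up turn, and the names `run_trials`, `ask`, `judge`, and `follow_up` are all hypothetical. `ask` stands in for whatever client you use to query a model, and `judge` stands in for the human reader, since the judging is the part that is hard to automate.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def run_trials(
    ask: Callable[[List[str]], str],      # hypothetical: conversation so far -> model reply
    judge: Callable[[str, str], bool],    # hypothetical: (first, second) -> True if repudiated
    wordings: Iterable[str],
    follow_up: str,
    trials_per_wording: int = 5,
) -> Dict[str, float]:
    """Return, for each question wording, the fraction of trials judged as repudiation."""
    counts = defaultdict(lambda: [0, 0])  # wording -> [repudiations, total trials]
    for wording in wordings:
        for _ in range(trials_per_wording):
            # Turn 1: ask the question as worded.
            first = ask([wording])
            # Turn 2: ask the model about its own earlier answer.
            second = ask([wording, first, follow_up])
            # A person normally does this step by reading both replies.
            counts[wording][0] += int(bool(judge(first, second)))
            counts[wording][1] += 1
    return {w: repudiated / total for w, (repudiated, total) in counts.items()}
```

Keeping the rates separate per wording matters because the wording alone can move the result. Running the same wordings against two models then gives a like-for-like comparison.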