Hacker News new | ask | show | jobs
by yacin 75 days ago
The onus is on you to prove or at least convincingly argue that the results are unlikely to generalize across incremental model releases. In my personal experience, the overly affirming nature seems to have held since GPT-3. What makes you think a newer, larger model would not exhibit this behavior? Beyond "they're more capable"? I'd argue that being more capable doesn't mean less sycophantic.

It's certainly possible some of the new advances (chain-of-thought, some kind of agentic architecture) could lessen or remove this effect. But that's not what the paper was studying! And if you feel strongly about it, you could try to further the discussion with results instead of handwavingly dismissing others' work.

2 comments

The onus of persuasion is on the persuader, and publishing a study on old models that no one uses anymore isn’t persuasive. I don’t need to prove anything to decide that you haven’t changed my mind.
By this logic there can be hundreds of studies that all show the pattern, including a 100% accurate prediction of the results for the next model and none of them would be "persuasive", because OpenAI decided to always release a new model the day before the paper is published.

So what you're saying here is that you were never open to "persuasion" and it was just a front to waste everyone's time.

I think you are absolutely right. (had to)