Hacker News
by anorwell 79 days ago
> I don't understand why the models being a year or two old now is worth noting as though it's a clear weakness?

I do think it's a clear weakness. Capabilities are extremely different than they were twelve months ago.

> What should they do, publish sub-standard results more quickly?

Ideally, publish quality results more quickly.

I'm quite open to competing viewpoints here, but it's my impression that the academic publishing cycle isn't really contributing to the AI discussion in a substantive way. The landscape is just moving too quickly.

2 comments

The onus is on you to prove or at least convincingly argue that the results are unlikely to generalize across incremental model releases. In my personal experience, the overly affirming nature seems to have held since GPT-3. What makes you think a newer, larger model would not exhibit this behavior? Beyond "they're more capable"? I'd argue that being more capable doesn't mean less sycophantic.

It's certainly possible some of the new advances (chain-of-thought, some kind of agentic architecture) could lessen or remove this effect. But that's not what the paper was studying! And if you feel strongly about it, you could try to further the discussion with results instead of handwavingly dismissing others' work.

The onus of persuasion is on the persuader, and publishing a study on old models that no one uses anymore isn’t persuasive. I don’t need to prove anything to decide that you haven’t changed my mind.

By this logic there can be hundreds of studies that all show the pattern, including a 100% accurate prediction of the results for the next model, and none of them would be "persuasive", because OpenAI decided to always release a new model the day before the paper is published.

So what you're saying here is that you were never open to "persuasion" and it was just a front to waste everyone's time.

I think you are absolutely right. (had to)

Capabilities are not the same thing as personality.

Upgrading a robot that knows how to lay bricks to one that also knows how to lay plaster won't make it a better therapist.