by zjp 82 days ago
Also, nothing has changed! Claude will still yes-and whatever you give it. ChatGPT still has its insufferable personality, where it takes what you said and hands it back to you in different terms as if it's ChatGPT's insight.
4 comments

OTOH, for Claude the study says 39% yessy, same as humans, 2nd lowest yessing of the LLMs; GPT5 above 50% yessy.
No dude, you don’t understand! It’s just so advanced now that you aren’t allowed to levy any criticism whatsoever!
It's almost like it is based on the training data and regimen that is largely the same between versions.
Well yes, but no. There are also open-weight models, and literally all of the ones listed above are not used anymore, at least by most end users and developers as far as I'm aware.
No study of AI can ever be done or be relevant, because every couple of months they add a new number to the name of the model, thus invalidating all work around model behavior.
Yes, you are right. Sorry, I missed that. It's just that all the open-weight models mentioned were... one year old or older. I just forgot that, firstly, such research is rarely done on frontier models because it takes time (you start with Llama 3.3, but look, one month later there's Llama 4), and secondly, there's also a publication delay. I think I'm just too used to the world of software, where everything moves at lightning speed. Sorry : )