Hacker News
by kolinko 132 days ago
They didn't test Opus at all, only Sonnet.

One of the tasks was "Build an interactive dashboard for exploring data from the World Happiness Report." -- I can't imagine how Opus 4.5 could've failed that.

1 comment

Check the link to the study. It has been updated for Opus 4.5.