Hacker News
by jascha_eng 2 hours ago
This mostly reads as a comparison between Opus 4.7 and 4.1. It would be more interesting if they reran the experiment with a team of humans using 4.7, to see how much the humans still improve the results today.