Hacker News
by swyx 2 days ago
shared older model numbers here: https://www.latent.space/p/ainews-frontiercode-benchmarking

tl;dr there's been broad progress despite your observed regressions