Hacker News
by sigmoid10 52 days ago
Any static benchmark older than 12-18 months is basically worthless, because its content will have spread all over the internet and found its way into the latest models' training sets.