Hacker News
by 100ms 58 days ago
Tiny model overfit on benchmark published 3 years prior to its training. News at 10
2 comments

It wasn't important enough to make the 11 o'clock program.
But GPT-3.5 was benchmaxxing too.
GPT-3.5 Turbo's knowledge cutoff was circa 2021. MT-Bench is from 2023. I'm not suggesting improvements to small models aren't possible (or forthcoming; the 1.58-bit models and the like look exciting), but this almost certainly isn't that.