Hacker News
by attemptone 380 days ago
>I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.

Wait, what kind of metric are you talking about? When I did my master's in 2023, SOTA models were trying to push the boundaries by minuscule amounts, and sometimes blatantly changing the way they measured "success" to beat the previous SOTA.

1 comments

Almost every single major benchmark. And yes, progress is incremental, but it adds up; this has always been the case.
We were talking about linear improvements, and I have yet to see any.
Check the benchmarks or make one of your own.
I checked the BLEU score and perplexity of popular models, and both stagnated around 2021. As a disclaimer, this was a cursory check and I didn't dive into the details of how individual scores were evaluated.
On what benchmarks? Pretty much every major one shows linear improvement.