HN Mirror



	by attemptone 380 days ago
	> I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.

	Wait, what kind of metric are you talking about? When I did my master's in 2023, SOTA models were trying to push the boundaries by minuscule amounts, and sometimes blatantly changing the way they measured "success" to beat the previous SOTA.


mountainriver 380 days ago

Almost every single major benchmark. And yes, progress is incremental, but it adds up; this has always been the case.


attemptone 380 days ago

We were talking about linear improvement, and I have yet to see it.


mountainriver 379 days ago

Check the benchmarks, or make one of your own.


attemptone 379 days ago

I checked the BLEU score and perplexity of popular models, and both have stagnated since around 2021. As a disclaimer, this was a cursory check and I didn't dive into the details of how individual scores were evaluated.


mountainriver 378 days ago

On what benchmarks? Pretty much every major one shows linear improvement.
