by jackblemming
1158 days ago
I work in vision. Go look up the ImageNet leaderboard. Look at the results of AlexNet vs. the top result today. The trend is a log line. The top contending architectures still include CNNs trained with backprop; they’ve just had a decade of tricks applied to eke out some improvements. The transformer-based vision models aren’t much better. Talk to any machine learning expert and they’ll tell you the math and fundamentals haven’t really changed since the 90s; we’ve just gotten better at scaling. Transformers came onto the scene half a decade ago and we could scale them much better than CNNs, but like the CNNs of today, we’ve hit the limit of diminishing returns. Maybe look at actual data instead of being dismissive of different opinions.
|
And what did you expect other than a log curve? The maximum is obviously 100%.