by i_am_proteus
1485 days ago
While I am inclined to personally agree with your sentiment, I don't think I have better insights than Richard Sutton:
http://incompleteideas.net/IncIdeas/BitterLesson.html "The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin."
EfficientNet is an exemplar of this approach: the authors built much better small models and, because the architecture was better overall, wound up with much higher-quality big models as well: https://arxiv.org/pdf/1905.11946.pdf
We're currently seeing some great results with more efficient attention layers, which will make the current 'big' models much more efficient and unlock a next generation of higher-quality big models.