by turkeygizzard
1140 days ago
Thank you for articulating this. I remember similar problems and arguments arising after RNNs and CNNs became massively successful. People argued that training larger models would be infeasible for several reasons, all of which were made moot by "Attention Is All You Need". Somebody always seems to figure out a new approach.
That said, this doesn't really seem all that comparable. The article points out a very fundamental property shared by all the diverse current approaches: they are tightly data-constrained. You need either cheap simulation or massive real-world data. That's not an arcane technical point.