by joe_the_user
2420 days ago
> IMO this is not a problem. The people building insanely huge models are expanding the set of tasks that can be done by a computer. Who cares how much memory it takes?

But are they? The example in the article describes an incremental improvement on a benchmark in exchange for a massive increase in training time.

Deep learning has achieved success on a number of tasks that computers had previously been unable to do. Since that initial period of success, it has been an area of debate whether deep learning has expanded its basic area of applicability or has only improved incrementally on its initial achievements.

And if it is true that deep learning is stuck on just expanding what it's already doing, the next fundamental advance might come from one person with one machine rather than a massive team with a massive machine. Consider that neural nets as a theory had been around since the 1990s if not the 1960s, but the fundamental advance of DL came when grad students could use GPUs in the 2010s, not when massively parallel machines came into existence (quite a bit earlier).

Here, the further wrinkle is that Moore's law is gradually ending. We won't have access to that much more computing power twenty years hence, so making less do more does make sense.

One thing that I can't help wondering, however sci-fi it sounds, is whether model simplifications like the one in this post might lead to models humans can fully understand, which might then lead to new styles of traditional programming, opening up whole new ways of doing things.