by solidasparagus
2457 days ago
It really does. You've got to remember that a good SotA paper takes hundreds of training runs, at least. I can't go into detail about budgets, but suffice it to say that if you think $1M is a university compute budget that lets you be a competitive research team on the cutting edge, you are __severely__ underestimating the amount of compute that leading corporate researchers are using. Orders of magnitude off. On-prem is good for a bit, until you're 18 months into your 3-year purchase cycle and you're on K80s while the major research leaders are running V100s and TPUs, and you can't even fit the SotA model in your GPUs' memory anymore. Longer to train can mean weeks or even months for one experiment - that iteration speed makes it so hard to stay on the cutting edge. And this is before considering things like neural architecture search and internet-scale image/video/speech datasets, where costs skyrocket. The boundary between corporate research and academia is incredibly porous, and a big part of that is the cost of research (compute, but also things like data labelling and staffing ML talent).
You have yet to provide any concrete sources to back up your claims. We're talking about contributing to research here. If multi-million-dollar training jobs are what it takes to be at the cutting edge, you should be able to provide ample sources for that claim.