| This is an odd framing. Training has become much more accessible, due to a variety of things (ASICs, offerings from public clouds, innovations on the data science side). Comparing it to Moore's Law doesn't make any sense to me, though. Moore's Law is an observation on the pace of increase of a tightly scoped thing, the number of transistors. The cost of training a model is not a single "thing"; it's a cumulative effect of many things, including things as fluid as cloud pricing. Completely possible that I'm missing something obvious, though. |
I assume it's meant as a qualitative comparison rather than a meaningful quantitative one. Sort of a (sub-)cultural touchstone to illustrate a point about which phase of development we're in.
With CPUs, during the phase of consistent, year-after-year exponential growth, there were ripple effects on software. For example, for a while it was cost-prohibitive to run HTTPS for everything; then CPUs got faster and it wasn't anymore. So during that phase, you expected all kinds of things to keep changing.
If deep learning is in a similar phase, then whatever the numbers are, we can expect other things to keep changing as a result.