by galkk
2420 days ago
I never understand such remarks.
> Given the power requirements per card, a back of the envelope estimate put the amount of energy used to train this model at over 3X the yearly energy consumption of the average American.
So what? Training the model is the hardest part; after that you just reuse the results.
> First, it hinders democratization. If we believe in a world where millions of engineers are going to use deep learning to make every application and device better, we won’t get there with massive models that take large amounts of time and money to train.
So what? I can't run a weather simulation on my laptop.
|
I doubt anyone is going to want to run a 33GB model on their phone.
> So what? I can't run a weather simulation on my laptop.
You only need to run the weather simulation once and then broadcast your forecast to everyone’s devices. You can’t do that with NLP. In order to be useful, NLP models need to run on different input data for every user. With a giant 33GB model, that means round-tripping to the data centre.
If you have to run everything in the cloud, your applications are limited. The cost is also very high, given that there are way more user devices than servers in the world. That means you need to build more data centres if you plan to run these giant models for every application you want to offer your users.