by gchamonlive
2171 days ago
I don't really know whether the hardware breakthroughs the article refers to are already reflected in cloud GPU performance, but the software advances are. So even though pricing has fluctuated only marginally since 2018, it is, from what I understood, simply faster to train a neural network today because of software advances.
Here are some figures from an actual benchmark [1] w.r.t. training costs:
1. [Mar 2020] $7.43 (AlibabaCloud, 8xV100, TF v2.1)
2. [Sep 2018] $12.60 (Google, 8 TPU cores, TF v1.11)
3. [Mar 2020] $14.42 (AlibabaCloud, 128xV100, TF v2.1)
--
Training time didn't go down exponentially either [1]:
1. [Mar 2020] 0:02:38 (AlibabaCloud, 128 x V100, TF v2.1)
2. [May 2019] 0:02:43 (Huawei Cloud, 128 x V100, TF v1.13)
3. [Dec 2018] 0:09:22 (Huawei Cloud, 128 x V100, MXNet)
So again, I have to ask: where exactly do these magical improvements occur (regarding training; inference is another matter entirely, I understand that)? I've yet to find a source that supports 4x to 10x cost reductions.
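For what it's worth, here is a rough sketch of the arithmetic behind that, using only the figures listed above. The helper name `to_seconds` and the choice of which entries to compare (oldest vs. newest in each list) are my own assumptions, and the cost entries use different hardware (TPU vs. V100), so the ratio is only indicative:

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to a number of seconds (hypothetical helper)."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# Training cost: Sep 2018 (Google, 8 TPU cores) vs. Mar 2020 (AlibabaCloud, 8xV100)
cost_sep_2018 = 12.60
cost_mar_2020 = 7.43
cost_ratio = cost_sep_2018 / cost_mar_2020

# Training time on 128 x V100: Dec 2018 (MXNet) vs. Mar 2020 (TF v2.1)
time_dec_2018 = to_seconds("0:09:22")
time_mar_2020 = to_seconds("0:02:38")
time_ratio = time_dec_2018 / time_mar_2020

print(f"cost reduction: {cost_ratio:.2f}x")   # about 1.70x
print(f"time reduction: {time_ratio:.2f}x")   # about 3.56x
```

So on these numbers the cost improvement is under 2x and the time improvement is under 4x, and nearly all of the latter happened between Dec 2018 and May 2019.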
[1] https://dawn.cs.stanford.edu/benchmark/index.html