by argonaut
3751 days ago
The difference is between training taking a week and training taking 10 weeks. It takes a week to train a standard AlexNet model on ImageNet with 1 GPU (and AlexNet is pretty far from state of the art). It takes 4 GPUs 2 weeks to train an image classifier that is marginally below state of the art on ImageNet, the 101-layer deep residual network (http://torch.ch/blog/2016/02/04/resnets.html). That would be 20 weeks on an ensemble of CPUs. (State of the art is 152 layers; I don't have the numbers, but I'd guesstimate 3-4 weeks to train on 4 GPUs.)
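
To make the arithmetic explicit, here is a rough sketch of the scaling in Python. The 10x factor is just the ratio implied by the figures above (1 week vs. 10 weeks, 2 weeks vs. 20 weeks), not a measured benchmark, and the names `CPU_SLOWDOWN` and `cpu_weeks` are made up for illustration.

```python
# Assumed slowdown of the CPU setup relative to the GPU setup.
# This is the ratio implied by the comment's figures, not a benchmark.
CPU_SLOWDOWN = 10


def cpu_weeks(gpu_weeks, slowdown=CPU_SLOWDOWN):
    """Scale a GPU training time (in weeks) by an assumed CPU slowdown factor."""
    return gpu_weeks * slowdown


# GPU training times in weeks, as quoted in the comment.
gpu_estimates = {
    "AlexNet, 1 GPU": 1,
    "ResNet-101, 4 GPUs": 2,
}

for name, weeks in gpu_estimates.items():
    print(f"{name}: {weeks} wk on GPU, roughly {cpu_weeks(weeks)} wk on CPUs")
```

Plugging in a different slowdown factor changes the absolute numbers, but the point stays the same: a constant multiplier turns weeks into months.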