by piiswrong
3501 days ago
Amazon probably used P2 because they want to advertise it.
We can get almost linear speedup on 10 8xM40 machines using MXNet. Batch size is increased linearly with the number of machines, but empirically that doesn't hurt convergence, at least on ImageNet. I mean, who cares about AlexNet any more? It's 2016 already. It trains in under 2h on a single machine. Distributing it doesn't make much sense.
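To make the batch-size scaling concrete, here is a rough sketch in plain Python. The per-GPU batch size of 32 and the throughput figures are made-up numbers for illustration only, not measurements, and the helper names are hypothetical rather than anything from MXNet.

```python
def global_batch_size(per_gpu_batch: int, gpus_per_machine: int, machines: int) -> int:
    """Effective batch size when every GPU processes its own mini-batch per step."""
    return per_gpu_batch * gpus_per_machine * machines


def scaling_efficiency(single_machine_throughput: float,
                       cluster_throughput: float,
                       machines: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 means perfectly linear)."""
    return cluster_throughput / (machines * single_machine_throughput)


# Hypothetical setup: 10 machines with 8 GPUs each, 32 samples per GPU (assumed).
batch = global_batch_size(per_gpu_batch=32, gpus_per_machine=8, machines=10)

# Hypothetical throughputs in images/sec, chosen only to show the arithmetic.
efficiency = scaling_efficiency(single_machine_throughput=1000.0,
                                cluster_throughput=9500.0,
                                machines=10)

print(batch, efficiency)
```

"Almost linear" here would mean the second number stays close to 1.0 as machines are added, while the first grows in proportion to the machine count.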
|
Amazon is at its best when it's customer obsessed and at its worst when it puts politics first.
All IMO of course.