by oneshot908
3501 days ago
Using 3-year-old GPUs on a much deeper network than the other guys(tm) to demonstrate awesome scaling efficiency == Intel-level FUD. Note also the absence of the overall batch size. I wonder what would happen to that scaling efficiency if those GPUs were P40s? See also the absence of equivalent AlexNet numbers, which further obscures attempts at comparing this to the other guys(tm). Can't wait for Intel's response to this.
If they can achieve a 109x speedup with 128 GPUs using synchronous data parallelism, with a batch size tuned for optimal single-GPU convergence time, then this is very impressive (but quite unlikely).
However, I don't think that publishing training benchmarks on Inception v3 (vs., say, AlexNet) is a fraud. Inception v3 is close to the state of the art and very good at using few parameters and inference FLOPs for good test accuracy.
Inception v3 has been publicly available for quite a long time in a variety of DL toolkits along with pre-trained weights.
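
For a rough sense of the numbers being argued about, here is a minimal Python sketch of the arithmetic behind "109x with 128 GPUs" and of why the missing overall batch size matters under synchronous data parallelism. The function names and the per-GPU batch size of 32 are hypothetical, chosen only for illustration; the benchmark under discussion does not state its batch size.

```python
def scaling_efficiency(speedup: float, num_gpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / num_gpus


def global_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Effective batch per step when every GPU processes its own mini-batch
    and gradients are averaged synchronously."""
    return per_gpu_batch * num_gpus


# Figures quoted in the thread: 109x speedup on 128 GPUs.
efficiency = scaling_efficiency(109, 128)
print(f"scaling efficiency: {efficiency:.1%}")

# Hypothetical per-GPU batch of 32 (an assumption, not from the benchmark).
print("global batch size:", global_batch_size(32, 128))
```

The second number is the point of contention: if the per-GPU batch stays fixed, the global batch grows with the GPU count, which helps throughput scaling but can change convergence behavior, so a speedup figure without the overall batch size is hard to interpret.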