Just tried it.

On integrated graphics (Intel HD 620, batch size 1000):

- I was able to train a simple dense network, but saw no speedup over CPU training on the same processor (i3-7100U)
- a ResNet-style architecture failed on the same HD 620 with "LLVM ERROR: SPIRV internal error: Invalid magic number"

On a machine with an NVIDIA GPU (batch size 1000):

- unlike on the Intel GPU, ResNet trained without any errors (so the failure might have been an Intel driver issue)
- DirectML came out about 3 times faster than the machine's CPU (i7-8700K)
- DirectML came out about 12 times slower than regular tensorflow-gpu with CUDA

So far I have mixed feelings, but I am excited to see how it runs on AMD GPUs and on Windows on ARM64 (e.g. Surface Pro X).

P.S. I ran
https://github.com/losttech/Gradient-Samples/tree/master/Fas...
and
https://github.com/losttech/Gradient-Samples/tree/master/Res...
and had to add "batch_size: 1000" to the fit call to see speedups over CPU.
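
To put the numbers above side by side, here is a rough back-of-the-envelope sketch in plain Python. The two ratios come from the measurements in this comment. Everything else is an assumption for illustration: the 60,000-sample dataset size is hypothetical, and the comparison against a batch size of 32 assumes the samples would otherwise fall back to Keras's documented default for `fit`. The helper names are made up for this sketch.

```python
import math

# Ratios reported in the comment above (NVIDIA machine, batch size 1000).
DIRECTML_VS_CPU = 3.0    # DirectML was about 3x faster than the CPU
CUDA_VS_DIRECTML = 12.0  # CUDA was about 12x faster than DirectML


def implied_cuda_vs_cpu(directml_vs_cpu, cuda_vs_directml):
    """Chain the two measured ratios into a rough CUDA-over-CPU estimate."""
    return directml_vs_cpu * cuda_vs_directml


def steps_per_epoch(num_samples, batch_size):
    """Number of training steps (one device dispatch each) per epoch."""
    return math.ceil(num_samples / batch_size)


# Assumption: a hypothetical dataset of 60,000 training samples.
NUM_SAMPLES = 60_000

if __name__ == "__main__":
    ratio = implied_cuda_vs_cpu(DIRECTML_VS_CPU, CUDA_VS_DIRECTML)
    print(f"implied CUDA vs CPU speedup: about {ratio:.0f}x")

    # Assumption: 32 is the batch size used when none is passed to fit.
    for batch_size in (32, 1000):
        steps = steps_per_epoch(NUM_SAMPLES, batch_size)
        print(f"batch_size={batch_size}: {steps} steps per epoch")
```

Fewer, larger steps mean fewer round trips to the GPU per epoch, which is one plausible reason the larger batch size was needed before the GPU pulled ahead of the CPU. The chained ratio is only an estimate, since it assumes both measurements used the same model and workload.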