by lostmsu 2159 days ago

Just tried it.

On integrated graphics (Intel HD 620) (batch size 1000):

- was able to train a simple dense network, but saw no speedup over CPU training on the same processor (i3-7100U)

- ResNet style architecture failed on the same HD 620 with "LLVM ERROR: SPIRV internal error: Invalid magic number"

On a machine with NVidia GPU (batch size 1000):

- unlike on the Intel GPU, ResNet trained without any errors (so it might have been an Intel driver issue)

- using DirectML came out about 3 times faster than the machine's CPU (i7-8700K)

- using DirectML came out about 12 times slower than regular tensorflow-gpu with CUDA
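
Taking the two rough ratios above at face value, they also imply how CUDA compares to the CPU on that machine. A small sketch of the arithmetic (the variable names are made up for illustration, and the inputs are only the approximate figures from the list):

```python
# Approximate ratios reported above for the i7-8700K machine.
directml_speedup_over_cpu = 3    # DirectML is ~3x faster than CPU
cuda_speedup_over_directml = 12  # DirectML is ~12x slower than CUDA

# Chaining the two ratios gives the implied CUDA-vs-CPU speedup.
cuda_speedup_over_cpu = directml_speedup_over_cpu * cuda_speedup_over_directml

print(cuda_speedup_over_cpu)  # 36
```

So CUDA would be roughly 36 times faster than the CPU here, with all the imprecision of two "about" figures multiplied together.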

So far mixed feelings, but I am excited to see how it runs on AMD GPUs, and on Windows on ARM64 (e.g. Surface Pro X).

Had to add "batch_size: 1000" to the fit call to see speedups over CPU.
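
One plausible reason the batch size matters (an assumption, not something measured here): Keras `fit` defaults to a batch size of 32, so each epoch is split into many small steps, and per-step dispatch overhead can swamp whatever the GPU gains. A rough sketch of the step counts, using a hypothetical dataset size of 60,000 samples that is not from the runs above:

```python
import math

# Keras Model.fit uses batch_size=32 when none is given.
KERAS_DEFAULT_BATCH_SIZE = 32


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of training steps (one device dispatch each) in one epoch."""
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset size, for illustration only.
num_samples = 60_000

default_steps = steps_per_epoch(num_samples, KERAS_DEFAULT_BATCH_SIZE)
large_steps = steps_per_epoch(num_samples, 1000)

print(default_steps)  # 1875
print(large_steps)    # 60
```

With these assumed numbers, a batch size of 1000 cuts the steps per epoch from 1875 to 60, so each step carries far more work per dispatch.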