Hacker News
by cma 3490 days ago
How do FPGAs compare with GPUs for the inference stage of Deep Learning algorithms? Can they accelerate it a lot?
1 comments

No, but they do use less power:

To the best of our knowledge, state-of-the-art performance for forward propagation of CNNs on FPGAs was achieved by a team at Microsoft. Ovtcharov et al. have reported a throughput of 134 images/second on the ImageNet 1K dataset [28], which amounts to roughly 3x the throughput of the next closest competitor, while operating at 25 W on a Stratix V D5 [30]. This performance is projected to increase by using top-of-the-line FPGAs, with an estimated throughput of roughly 233 images/second while consuming roughly the same power on an Arria 10 GX1150. This is compared to high-performing GPU implementations (Caffe + cuDNN), which achieve 500-824 images/second, while consuming 235 W. Interestingly, this was achieved using Microsoft-designed FPGA boards and servers, an experimental project which integrates FPGAs into datacenter applications.

https://arxiv.org/pdf/1602.04283v1.pdf
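
To put the quoted figures on a per-watt basis, here is a rough back-of-the-envelope sketch. The throughput and power numbers come straight from the excerpt; treating "roughly the same power" as exactly 25 W for the Arria 10 is my assumption, and the dictionary keys and the `images_per_second_per_watt` helper are just illustrative names.

```python
# Figures quoted in the paper excerpt: (images/second, watts).
# Assumption: the Arria 10 projection is taken at exactly 25 W,
# since the excerpt only says "roughly the same power".
systems = {
    "Stratix V D5 (measured)": (134, 25),
    "Arria 10 GX1150 (projected)": (233, 25),
    "GPU, Caffe + cuDNN (low end)": (500, 235),
    "GPU, Caffe + cuDNN (high end)": (824, 235),
}


def images_per_second_per_watt(throughput, watts):
    """Throughput normalized by power draw (equivalently, images per joule)."""
    return throughput / watts


efficiency = {
    name: images_per_second_per_watt(throughput, watts)
    for name, (throughput, watts) in systems.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} images/s/W")
```

On these numbers the GPU is roughly 3.7x to 6.1x faster in raw throughput than the Stratix V, while the Stratix V comes out roughly 1.5x to 2.5x ahead per watt.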

That's hard to compare. FPGAs typically do fixed-point math, so they can do more operations with less power. GPUs have traditionally done floating point. However, with the new Pascal architecture, certain cards (P4/P40) support 8-bit integer dot products, which give a massive boost in performance/W. The P40's power draw is still fairly high at 250W, but that's for an entire card with 24GB of memory. For an apples-to-apples comparison, you'd have to compare that to an FPGA with that much memory on a PCIe card. Something like this is appropriate for comparison: http://www.nallatech.com/store/fpga-accelerated-computing/pc...
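
For anyone unfamiliar with what an 8-bit integer dot product buys you, here is a minimal sketch in plain Python. The symmetric scaling scheme and the `quantize` / `int8_dot` helpers are assumptions for illustration only; this is not how any particular GPU or FPGA design implements it.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers with a symmetric scale (assumed scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q):
    """Dot product of two int8 vectors, summed in a wider integer accumulator."""
    return sum(x * y for x, y in zip(a_q, b_q))


# Illustrative inputs, e.g. a few weights and activations.
a = [0.6, -1.0, 0.2, 0.8]
b = [1.0, 0.4, -0.3, 0.1]

# One scale per vector, chosen so the largest magnitude maps to 127.
scale_a = max(abs(v) for v in a) / 127
scale_b = max(abs(v) for v in b) / 127

a_q = quantize(a, scale_a)
b_q = quantize(b, scale_b)

# Integer multiply-accumulate, then a single rescale back to real units.
approx = int8_dot(a_q, b_q) * scale_a * scale_b
exact = sum(x * y for x, y in zip(a, b))

print(f"int8 approximation: {approx:.4f}, float result: {exact:.4f}")
```

The multiply-accumulate loop only touches small integers, which is the part that is cheap in hardware; the floating-point work is reduced to one rescale at the end.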