by CthulhuOvermind
3755 days ago
I feel I can talk about this! My master's thesis was on computing a neural network using FPGA + CPU. The original SNN code was in C++; my thesis implemented it in OpenCL, using Altera's OpenCL-to-FPGA toolchain. Essentially, I took the innermost loop (the one that computed whether a neuron would spike or not) and implemented it as an OpenCL kernel. Step 1 was showing a speedup from single-threaded C++ to the OpenCL kernel. The speedup was 6-10x on an i7-2600K running on all logical cores.
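To make the "innermost loop as a kernel" idea concrete, here is a minimal sketch in plain C++. It assumes a leaky integrate-and-fire style update, which is my guess for illustration and not necessarily the model the thesis used; `LifParams`, `step_neurons` and all parameter names are hypothetical. The body of the `for` loop is roughly what would become the per-work-item body of an OpenCL kernel, with `i` coming from `get_global_id(0)`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical leaky integrate-and-fire parameters (assumed, not from the thesis).
struct LifParams {
    float decay;      // multiplier applied to the membrane potential each step
    float threshold;  // potential at or above which the neuron spikes
    float reset;      // potential assigned right after a spike
};

// Advance every neuron by one time step and report which ones spiked.
// Each iteration is independent of the others, which is what makes the
// loop body a natural candidate for a data-parallel kernel.
std::vector<bool> step_neurons(std::vector<float>& potential,
                               const std::vector<float>& input,
                               const LifParams& p) {
    assert(potential.size() == input.size());
    std::vector<bool> spiked(potential.size(), false);
    for (std::size_t i = 0; i < potential.size(); ++i) {
        float v = potential[i] * p.decay + input[i];  // leak, then integrate
        if (v >= p.threshold) {                       // spike decision
            spiked[i] = true;
            v = p.reset;
        }
        potential[i] = v;
    }
    return spiked;
}
```

Because no iteration reads another neuron's updated state within the step, the same body can run on all CPU cores or be pipelined on an FPGA without changing the logic.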
Step 2 was implementing it on the FPGA. This meant pre-shipping data to the FPGA while the CPU calculated other things, starting the computation on the FPGA, and receiving the results back on the CPU. Performance was 75x that of the single-threaded C++ code. Important notes that I didn't expect:
Bottleneck was memory transfer bandwidth across PCI-E.
Power consumption was less on FPGA compared to CPU.
Development time was significantly reduced. Altering the design is simple when going from OpenCL > FPGA, compared to Verilog > FPGA.
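A rough way to see why PCI-E bandwidth ends up as the bottleneck once transfers are overlapped with compute: the sketch below is a simple two-stage pipeline model, not code from the thesis. `BatchCost`, `serial_ms` and `overlapped_ms` are hypothetical names, and the costs you feed in are made-up inputs, not measurements.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-batch costs in milliseconds (illustrative only).
struct BatchCost {
    double transfer_ms;  // host <-> FPGA copy over PCI-E
    double compute_ms;   // kernel execution on the FPGA
};

// No overlap: every batch pays for its transfer, then its compute.
double serial_ms(BatchCost c, int batches) {
    assert(batches >= 0);
    return batches * (c.transfer_ms + c.compute_ms);
}

// Pre-shipping (double buffering): batch k+1 is transferred while batch k
// is being computed, so in steady state each batch costs the slower stage.
double overlapped_ms(BatchCost c, int batches) {
    assert(batches >= 0);
    if (batches == 0) return 0.0;
    double slower = std::max(c.transfer_ms, c.compute_ms);
    return c.transfer_ms + (batches - 1) * slower + c.compute_ms;
}
```

In this model, once `transfer_ms` exceeds `compute_ms`, making the kernel faster barely changes the total; only moving less data (or moving it faster) helps. It ignores result read-back contention and setup latency, so treat it as intuition rather than a prediction.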