by astrodust 3538 days ago
I'm not so much hyped about FPGA-in-CPU as I am about the prospect of Intel releasing a specification for their FPGAs that would allow third-party tools to be developed to program them.

Right now the various vendors seem to insist on their own proprietary everything which makes it hard to streamline your development toolchain. Many of the tools I've used are inseparably linked to a Windows-only GUI application.

We're starting a project at Stanford to solve just this problem! The Agile Hardware Center: https://aha.stanford.edu/

The plan is to have a completely open source toolchain from the HDL to P&R.

Your lab has some interesting publications. Doing good work. Appreciate you taking the time to try to solve the FPGA problem. Here are a few related works in case you want to build on or integrate with them:

https://github.com/haojunliu/OpenFPGA

https://www.synflow.com/#home-Synflow

http://legup.eecg.utoronto.ca/

The Leon3 GPL SoC might also make a good test case for you since it's designed for easy customization. Lots of academics use it for experiments.

http://www.gaisler.com/index.php/products/processors/leon3

I'm not too familiar with FPGAs, but isn't the tradeoff that, since they are flexible, they are usually much slower than CPUs/GPUs and are mostly used to prototype an ASIC? How is FPGA-in-CPU going to be a good thing?
They're slower in terms of clock speed, but they're not slower in terms of results.

You can do things in an FPGA that a CPU can't even touch. It can be configured to do massively parallel computations, for example.

If Bitcoin is any example, GPU is faster than CPU, FPGA is faster than GPU, and ASIC is faster than FPGA. Each one is at least an order of magnitude faster than the one before it.

A GPU can do thousands of calculations in parallel, but an FPGA can do even more if you have enough gates to support it.

I haven't looked too closely at the SHA256 implementations for Bitcoin, but it is possible to not only do a lot of calculations in parallel, but also have them pipelined so each round of your logic is actually executing simultaneously on different data rather than sequentially.
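To make the pipelining point concrete, here is a toy cycle-count simulation. It is only a sketch, not a model of any real Bitcoin miner: the function names are made up for illustration, each stage is assumed to take exactly one clock cycle, and 64 stages is chosen only because the SHA-256 compression function has 64 rounds.

```python
def sequential_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when one item must finish every stage before the next starts."""
    return num_items * num_stages


def pipelined_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when every stage works on a different item each clock cycle."""
    stages = [None] * num_stages  # stages[i] holds the item currently in stage i
    next_item = 0
    finished = 0
    cycles = 0
    while finished < num_items:
        # Clock edge: every item advances one stage and a new item enters stage 0.
        incoming = next_item if next_item < num_items else None
        if incoming is not None:
            next_item += 1
        stages = [incoming] + stages[:-1]
        cycles += 1
        # Whatever sits in the last stage during this cycle completes in it.
        if stages[-1] is not None:
            finished += 1
    return cycles


if __name__ == "__main__":
    items, rounds = 1000, 64  # hypothetical workload, chosen for illustration
    print("sequential:", sequential_cycles(items, rounds))
    print("pipelined: ", pipelined_cycles(items, rounds))
```

Once the pipeline is full, one result comes out every cycle, so the pipelined count is roughly the number of items plus the pipeline depth instead of their product.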

One of the biggest caveats of FPGAs that I don't see mentioned often is that they're slow to program. This implies some unusual restrictions on where they can be used, i.e. data centers will benefit, general purpose computing not so much.
Well, normally what you lose in terms of clock speed when using an FPGA you make up for by being able to establish hardware dataflows for increased parallelism. But I don't have a sense for whether deep learning problems are amenable to that sort of thing.
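As a back-of-the-envelope illustration of the clock speed versus parallelism trade-off, the sketch below computes how many parallel lanes a slower-clocked device needs to match a faster-clocked one. All the numbers are made-up assumptions for the arithmetic, not measurements of any real CPU or FPGA, and the function names are hypothetical.

```python
def throughput(clock_hz: int, ops_per_cycle: int) -> int:
    """Operations per second for a device doing ops_per_cycle operations each clock."""
    return clock_hz * ops_per_cycle


def breakeven_lanes(fast_clock_hz: int, fast_ops_per_cycle: int, slow_clock_hz: int) -> int:
    """Smallest number of parallel lanes the slow-clocked device needs to keep up."""
    target = throughput(fast_clock_hz, fast_ops_per_cycle)
    return -(-target // slow_clock_hz)  # ceiling division using integers only


if __name__ == "__main__":
    # Assumed figures for illustration only.
    cpu_clock, cpu_ops = 3_000_000_000, 4  # 3 GHz, 4 operations per cycle
    fpga_clock = 200_000_000               # 200 MHz
    lanes = breakeven_lanes(cpu_clock, cpu_ops, fpga_clock)
    print("lanes needed to break even:", lanes)
```

Under these assumed numbers the slower device breaks even at 60 lanes; anything beyond that is where the parallel dataflow starts to win, provided the problem actually splits into that many independent operations.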