by vzidex 1587 days ago
You've struck on the fundamental problem that the FPGA industry has been trying to solve for 30+ years - how to get an FPGA into the hands of every developer, like how GPUs have propagated to be essential tools.

Nobody has come up with a good answer yet. Developing for an FPGA still requires domain-specific knowledge, and because place & route (the "compile" for an FPGA) is a couple of intertwined NP-hard problems, development cycles are necessarily long. Small designs might take an hour to compile; the largest designs deployed these days take ~24 hours.
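To make the "NP-hard" point concrete, here is a minimal sketch of the placement half of the problem. The netlist, the 2x2 grid, and the function names are all made up for illustration, and real tools use heuristics rather than brute force. It tries every assignment of blocks to sites and keeps the one with the shortest total wire length, then prints how quickly the number of candidates grows.

```python
from itertools import permutations
from math import factorial

# Hypothetical toy netlist: each pair is a wire between two logic blocks.
NETS = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
# Hypothetical 2x2 grid of placement sites, as (x, y) coordinates.
SITES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def wirelength(placement):
    """Total Manhattan wire length when block b sits at SITES[placement[b]]."""
    total = 0
    for a, b in NETS:
        (xa, ya), (xb, yb) = SITES[placement[a]], SITES[placement[b]]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def best_placement():
    """Exhaustively try every assignment of blocks to sites."""
    return min(permutations(range(len(SITES))), key=wirelength)


if __name__ == "__main__":
    best = best_placement()
    print("best placement:", best, "wirelength:", wirelength(best))
    # The search space is n! for n blocks on n sites.
    for n in (4, 16, 64):
        print(n, "blocks ->", factorial(n), "candidate placements")
```

Four blocks means only 24 candidates, but the count is n!, so exhaustive search is hopeless long before you reach the size of a real design. That is why production place & route leans on heuristics and still takes hours.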

All this to say that, while they are neat, nobody has found the magic bullet use case that will make everyone want one enough to put up with the pain of developing for them (a la machine learning for GPUs). Simultaneously, nobody has found the magic bullet to make developing for them any easier, whether by reducing the knowledge required or improving the tooling.

Effort has been made in places like High-Level Synthesis (HLS, compiling C/C++ code down to an FPGA), open-source tooling, and (everyone's favorite) simulation, but they all still kinda suck compared to developing software, or even the ecosystem that exists around GPUs these days. You'll often hear FPGA people saying stuff like "just simulate your design during development, compiling to hardware is just a last step to check everything works" - but simulation still takes a long time (large designs can take hours) and tracking down a bug in waveforms is akin to Neo learning to see the Matrix.


If the FPGA industry thinks it has been trying to do this for decades, then it has been going about it seriously wrong! Keeping your systems as black boxes, with unit prices and development prices that make them prohibitive for anything but high-margin devices, effectively guarantees they'll never become popular consumer commodities.

With how open development works, the straightforward minimal investment is to publicly document some devices' bitstream formats and bootstrap the ecosystem by releasing some reliable Libre place and route software. The software doesn't even have to contain all of the trade secret heuristics, it just has to work with (./configure && make && make install) and be functionally adequate enough that individual developers can scratch their own itches.

Why not ship integrated FPGA in CPUs?

Being able to offload a repeated, complex MIMD computation to an FPGA treated like an instruction could be a huge win for scientific computing and any large, steady workload that is expensive enough for companies to invest in optimizing for the FPGA. If this became commonplace and relatively inexpensive then large corporations would likely fund improvements into compilers to make the developer experience simpler and faster.

There are such CPUs, and the uptake has been minimal, because, as proven by GPGPUs, not every developer is capable of actually using them.

Your example could be as easily done in a GPGPU.

I just wanted to note that Intel tried that and it didn't work. See pjmlp's reply.

I still think the idea is sound, but the way to go about it needs a lot of rethinking.

You don't seem bullish on the prospects of using Vitis [0] to deploy a machine learning model to a Xilinx FPGA?

[0] https://www.xilinx.com/products/design-tools/vitis/vitis-pla...

Disclaimer: I work in this space (not at Xilinx), comments are strictly my own opinions and do not reflect any positions of my employer, etc.

Broadly speaking, FPGA-based ML model accelerators are in an interesting space right now, where they aren't particularly compelling from a performance (or perf / Watt, perf / $, etc.) perspective. If you just need performance, then a GPU or ASIC-based accelerator will serve you better - the GPU will be easier to program, and ASIC-based accelerators from the various startups are performing pretty well. Where an FPGA accelerator makes a lot of sense is if you need an FPGA anyway, or need the other benefits of FPGAs (e.g. lots of easily-controlled IO) - but then you're just back to square 1 of "there's some cases where an FPGA makes sense and many where it doesn't". Besides that, there are a few niche cases where a mid-range FPGA might beat a mid-range GPU on perf / Watt or whatever metric is important for you.

Again, opinions are my own and all that. As someone in the space, I am very much hoping that someone - whether an ASIC startup or Xilinx / Intel - comes up with a "better" (performant, cheaper, easier to use, etc.) solution than GPUs for ML applications. If the winner ends up being FPGAs, that would be really really cool! Just at the moment it's not too compelling, and I'm trying to be realistic.

All that said, FPGAs and their related supports (software, boards, etc.) are an $Xb / Y market - nothing to shake a stick at, and there are many cases where an FPGA makes sense. Just doesn't currently make sense for every dev to buy an FPGA card to drop in their desktop to play with.

>come up with a "better" (performant, cheaper, easier to use, etc.) solution than GPUs for ML applications

you probably are aware but Xilinx themselves are attempting this with their versal aie boards, which are (in spirit) similar to GPUs, in that they group together a programmable fabric of programmable SIMD-type compute cores.

https://www.xilinx.com/support/documentation/architecture-ma...

i have not played with one but i've been told (by a xilinx person, so grain of salt) the flow from high-level representation to that arch is more open

https://github.com/Xilinx/mlir-aie

Fascinating, thank you! Admittedly I don't keep the closest tabs on what Xilinx is doing.