Hacker News
by ranman 3494 days ago
If you don't click through to read about this: you can write an FPGA image in Verilog/VHDL and upload it... and then run it. To me that seems like magic.

HDK here: https://github.com/aws/aws-fpga

(I work for AWS)

13 comments

This is so awesome, I can't even. I wrote arachne-pnr [0] to learn about FPGAs to get ready for this day. Just signed up, can't wait to play with these!

I hope the growing popularity of FPGAs for general-purpose computing will help push the vendors to open up bitstreams and invest in open-source design tools.

[0] https://github.com/cseed/arachne-pnr

Wow Clifford, is that you? I hope this, exciting as it may be, won't make you leave open FPGA efforts for the dark side (saw your talk at the last FOSDEM, it was very exciting)
Cotton is the author of arachne-pnr. Clifford is the author of Yosys and IceStorm, which are all separate projects. Not the same person.

FWIW, Clifford has recently started reversing the bits of the modern Xilinx FPGA series. So, stay tuned for a Xilinx IceStorm-equivalent sometime down the road (a few years, probably...)

No, Clifford is cliffordvienna on HN. He wrote Yosys (an amazing piece of software) and did the iCE40 reverse engineering (amazing work). I wrote the place and router, arachne-pnr.
And kudos for that.
I'm very curious if/how you have managed to make the developer experience sane and enjoyable. I have experience with an FPGA cluster of ~800 FPGAs and it definitely does not get used to its full potential because of the tooling around it.
Is that repo going to be made public? It looks to be private right now.
Yup, sorry -- working on fixing that now. Check back in a bit.
Still not fixed. I'll reply here when it is. Might be a few days because of re:Invent stuff.
Thanks for the update. Been chasing that link all morning :-)
If you guys are curious about these announcements I'll be recapping them and going into more detail on twitch.tv/aws at 12:30 pacific
Huh? Isn't Twitch just for gaming content?
What others have said is true, and also note that Amazon bought Twitch 2 years ago, so I'm sure Amazon can run their own product announcements through Twitch if they want :)

EDIT: updated when amazon bought twitch, woops

Nope. Twitch is excellent for all kinds of live content.
https://www.twitch.tv/p/rules-of-conduct

"All content that is neither gaming-related nor permitted under the rules for Twitch Creative Conduct is prohibited from broadcast."

I've seen many people programming on Twitch https://www.twitch.tv/directory/game/Creative/programming

While it's mainly game dev or game-dev related, it's not limited to game dev stuff. From their FAQ https://help.twitch.tv/customer/portal/articles/2176641

  Examples of what you can broadcast on Twitch Creative:
  ...
  Programming and coding  
  Software and game development  
  Web development
EDIT: It seems that re:invent is being streamed on twitch anyway.
This is a product announcement, though
Amazon might just let Amazon talking about their products slide ;)
I'm guessing it is covered under the Twitch Creative Conduct, since there is an entire Creative category now that is getting more popular which involves people painting, cosplay, digital art, etc.
What's the cost?
So it's tied to the PCIe bus - how do you interact with your FPGA once you've programmed it - are there general drivers you can use, or do you also have to create a Linux driver to talk to your FPGA?
Xilinx provide software drivers and IP for PCIe DMA and memory-mapped interfaces. These are fairly easy to integrate (probably not the best for latency, though: I've developed my own, but I have a specific use case where I need low latency and don't care about bandwidth).
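To make the "memory-mapped interface" idea concrete, here is a minimal sketch of reading a 32-bit register through a mapped file. It is not the Xilinx driver or anything AWS ships. It assumes Linux, where a PCI device's BAR can typically be mapped from a sysfs file such as /sys/bus/pci/devices/<BDF>/resource0 (root access needed). The helper name `read_reg32` and the register offsets are hypothetical.

```python
import mmap
import struct


def read_reg32(resource_path: str, offset: int) -> int:
    """Read one little-endian 32-bit value from a memory-mapped file.

    For a PCI BAR exposed through sysfs this would read a device register;
    the function itself works on any file that can be mapped.
    """
    if offset < 0 or offset % 4 != 0:
        raise ValueError("offset must be non-negative and 4-byte aligned")
    with open(resource_path, "rb") as f:
        # Length 0 maps the whole file, read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            if offset + 4 > len(mm):
                raise ValueError("offset is outside the mapped region")
            (value,) = struct.unpack_from("<I", mm, offset)
            return value
```

A real driver also has to care about access width, ordering, and DMA setup, which this sketch ignores. It only shows why register access over a mapped BAR looks like ordinary memory reads from user space.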
I'm not sure what you mean by the "magic" part here, can you please clarify?

[background: many years of writing VHDL specifically for FPGAs, using various dev boards and custom boards]

The magic part is the thing we have gotten used to with the cloud -- virtual hardware you never see and rent by the minute. Imagine having an FPGA idea and not needing to make a board, pay for a dev board, or even find a dev board in your lab... Like your idea and need more? Spin up 100 more right now...
Exactly what I thought. This is amazing. FPGAs are commonly used in embedded systems to perform application-specific tasks, and now application developers have access to this power too. I guess many machine learning applications might take advantage of that power instead of using comparatively very expensive graphics hardware.
How do FPGAs compare with GPUs for the inference stage of Deep Learning algorithms? Can they accelerate it a lot?
No, but they do use less power:

To the best of our knowledge, state-of-the-art performance for forward propagation of CNNs on FPGAs was achieved by a team at Microsoft. Ovtcharov et al. have reported a throughput of 134 images/second on the ImageNet 1K dataset [28], which amounts to roughly 3x the throughput of the next closest competitor, while operating at 25 W on a Stratix V D5 [30]. This performance is projected to increase by using top-of-the-line FPGAs, with an estimated throughput of roughly 233 images/second while consuming roughly the same power on an Arria 10 GX1150. This is compared to high-performing GPU implementations (Caffe + cuDNN), which achieve 500-824 images/second, while consuming 235 W. Interestingly, this was achieved using Microsoft-designed FPGA boards and servers, an experimental project which integrates FPGAs into datacenter applications.

https://arxiv.org/pdf/1602.04283v1.pdf
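The "less power" point is easier to see as throughput per watt. The sketch below only divides the figures quoted above. The Arria 10 number is the paper's projection, and treating "roughly the same power" as exactly 25 W is an assumption made here for illustration.

```python
def images_per_joule(images_per_second: float, watts: float) -> float:
    """Throughput per watt: (images/s) / (J/s) = images per joule."""
    return images_per_second / watts


# Figures quoted from the paper excerpt above.
fpga_stratix_v = images_per_joule(134, 25)
fpga_arria_10 = images_per_joule(233, 25)  # projected; 25 W is an assumption
gpu_low = images_per_joule(500, 235)
gpu_high = images_per_joule(824, 235)

for name, value in [
    ("Stratix V D5", fpga_stratix_v),
    ("Arria 10 GX1150 (projected)", fpga_arria_10),
    ("GPU, low end", gpu_low),
    ("GPU, high end", gpu_high),
]:
    print(f"{name}: {value:.2f} images/J")
```

On these numbers the GPU has the higher raw throughput, while the Stratix V delivers roughly 5.4 images per joule against roughly 2.1 to 3.5 for the GPU.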

That's hard to compare. Typically FPGAs are doing fixed-point math, so they can do more operations with less power. GPUs have traditionally done floating point. However, with the new Pascal architecture, certain cards (P4/P40) support 8-bit integer dot products, which give a massive boost in performance/W. It's still fairly high at 250W, but that's for an entire card with 24GB of memory. You'd have to compare that to an FPGA with that much memory on a PCIe card if you're doing apples to apples. Something like this is appropriate for comparison: http://www.nallatech.com/store/fpga-accelerated-computing/pc...
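For anyone unfamiliar with the fixed-point point above, here is a small sketch of an 8-bit integer dot product with a rescale at the end. It is plain Python for illustration, not how an FPGA or a P4/P40 implements it. The scales and sample values are hypothetical, chosen as powers of two so the result is exact.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers with a fixed scale, saturating."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q):
    """Integer-only multiply-accumulate; the accumulator is wider than 8 bits."""
    return sum(x * y for x, y in zip(a_q, b_q))


weights = [0.5, -0.25, 0.75, 0.125]
inputs = [1.0, 2.0, -1.0, 0.5]
w_scale = 1 / 128  # hypothetical scale for the weights
x_scale = 1 / 32   # hypothetical scale for the inputs

w_q = quantize(weights, w_scale)
x_q = quantize(inputs, x_scale)

# One floating-point rescale at the end recovers the real-valued result.
approx = int8_dot(w_q, x_q) * w_scale * x_scale
exact = sum(w * x for w, x in zip(weights, inputs))
print(approx, exact)
```

All the work inside the loop is small-integer multiplies and adds, which is the kind of arithmetic that maps cheaply onto FPGA logic and onto the 8-bit dot-product instructions mentioned above.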
This is very awesome. Could you add some more thoughts on the tooling and the development workflow? Is it possible to target the Xilinx hardware using only open source (or AWS proprietary) tools? Or is Vivado still required for advanced stuff?
Vivado is required for all advanced features and programming Xilinx chips in general; like the sibling post said, there is no open FPGA toolchain implementation for Xilinx devices, especially for extremely high end ones like the ones being offered on the F1 (I expect they'd run at like, several thousand USD per device, on top of a several thousand dollar Vivado license for all the features).

It doesn't look like there's much AWS proprietary stuff here, though we'd have to wait for the SDK to be opened properly to be sure. I imagine it's mostly just making all of the stuff prepackaged and easily consumable for usage, and maybe some extra IP Cores or something for common stuff, and lots of examples. If you're already using Vivado I imagine using the F1/Cloud won't introduce any kind of major changes to what you expect.

> I expect they'd run at like, several thousand USD per device...

You're guessing about an order of magnitude too low, actually. The VU9P FPGAs Amazon is using cost between $30,000 and $55,000 each, depending on the speed grade.

Yes, this means a fully equipped F1 instance costs nearly half a million dollars. Don't count on the instances being cheap to run.
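The "nearly half a million dollars" figure follows if a fully equipped instance carries eight FPGAs. The comment does not state that count, so it is an assumption here. A quick sketch of the arithmetic, using the per-chip price range quoted above:

```python
FPGAS_PER_INSTANCE = 8  # assumption: not stated in the comment above


def fpga_cost_range(count: int, unit_low: int, unit_high: int) -> tuple:
    """Total FPGA cost for `count` chips at the low and high unit prices."""
    return count * unit_low, count * unit_high


low_total, high_total = fpga_cost_range(FPGAS_PER_INSTANCE, 30_000, 55_000)
print(f"${low_total:,} to ${high_total:,}")
```

That covers the FPGAs alone at the quoted prices. The host server, memory, and any volume discount are not accounted for.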

Do you have a source? I am curious. http://www.digikey.com/product-detail/en/xilinx-inc/XCKU040-... this surely is not the right chip then.
https://aws.amazon.com/ec2/instance-types/

Scroll down to "F1"; it says:

> Xilinx UltraScale+ VU9P FPGAs

The VU9P isn't available through DigiKey, but is listed by Avnet. I don't know which specific package and speed grade Amazon is using, but here's one:

https://products.avnet.com/shop/en/asia/programmable-logic/f...

The press release says:

"This AMI includes a set of developer tools that you can use in the AWS Cloud at no charge. You write your FPGA code using VHDL or Verilog and then compile, simulate, and verify it using tools from the Xilinx Vivado Design Suite (you can also use third-party simulators, higher-level language compilers, graphical programming tools, and FPGA IP libraries)."

So basically, buying a copy of Vivado is the minimum. There aren't any open source tools that directly output Xilinx FPGA bitstreams that I know of.

It looks like the FPGA Developer AMI includes Vivado and a license explicitly for use on these platforms (look at the PuTTY screenshot in the blog post; it has a customized MOTD). You just need to set up the license server that Vivado will use and point it to the right license.

So I guess the real question is: what exactly is granted by the Vivado license on these AMIs? Do we get things like SDSoC, SDAccel, etc, and all the libraries? [1] The blog seems to imply you can program these things with OpenCL too (AKA SDAccel), so I'm guessing that these features are all enabled, but details about the included Vivado license in the AMI would be nice.

[1]: https://www.xilinx.com/products/design-tools/vivado.html#buy

+1 for VHDL. :)
that repository is 404?
Hmm this still isn't public. Any ETA?
This is really cool. Do you think it will be possible to run MongoDB on an FPGA anytime soon?
I really hope that is sarcasm.
I'm currently working on this. Speedup around 2x for most operations. Not kidding, quite a few startups are currently trying to optimize typical data operations with special algorithms.
Aren't there other software databases that are already more than 2x faster than Mongo and don't lose data?
Maybe, I'm not talking about Mongo specifically.

You can find 'equivalents' to CPU data structures for FPGAs and speed up operations on/with them while still saving power. There's lots of trouble with how buffers are used and memory is accessed. So it's not a trivial task, but IF you can optimize generic data structures and replace the existing ones you basically have 2x the speed or half the energy consumption for any DB.

But what's the developer time/cost for that?
Sure, but they aren't Web Scale!