Hacker News
by omeze 2435 days ago
This is a pretty nice tutorial! My FPGA design courses in school taught me a ton about 1) concurrency and 2) good state machine design. In modern backend web development these topics get very little attention (from interviewing all the way to writing technical specs, I've rarely seen them brought up explicitly), but they are important. I was a bit hesitant about this guide suggesting C++, since I tend to dislike mixing traditional languages with hardware languages, but I realized it was just for testbenches, which is very reasonable (and even VHDL exposes things like `for` loops that are really only useful for testing and meaningless otherwise, sans some special cases[1]).

[1] You can abuse some imperative paradigms to implement things like Conway's Game of Life as a systolic array - https://en.wikipedia.org/wiki/Systolic_array

3 comments

I disagree about for loops. You actually end up using them quite a lot in VHDL/Verilog (with an understanding of what logic you are going to end up with) when you want to do the same operation on multiple things:

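  // Module name and default parameter value are placeholders.
  module multi_mult #(parameter NUM_OF_MULTIPLIERS = 4) (
    input      [NUM_OF_MULTIPLIERS*32-1:0] a_in,
    input      [NUM_OF_MULTIPLIERS*32-1:0] b_in,
    output reg [NUM_OF_MULTIPLIERS*64-1:0] mult_out  // reg: assigned in an always block
  );
    reg [31:0] tmp_a, tmp_b;
    reg [63:0] tmp_mult;
    integer i;

    always @(*) begin
      mult_out = {(NUM_OF_MULTIPLIERS*64){1'b0}};
      for (i = 0; i < NUM_OF_MULTIPLIERS; i = i + 1) begin
        tmp_a = a_in >> (i*32);      // truncation keeps lane i of a_in
        tmp_b = b_in >> (i*32);      // truncation keeps lane i of b_in
        tmp_mult = tmp_a * tmp_b;    // 32x32 -> 64-bit product
        mult_out = mult_out | (tmp_mult << (i*64));  // place product in lane i
      end
    end
  endmodule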
This would give you NUM_OF_MULTIPLIERS multipliers. If you wrote each multiply out by hand, it would be more code and you also couldn't parametrize it.
The key is that for loops are essentially pre-processor macros (like C) so they must have a fixed number of iterations known at compile time. So yes, you have a for loop, but it's very different to what you expect from a for loop in software.
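To make the macro comparison concrete, here is a minimal sketch; the module and signal names are made up for illustration. It computes the same parity bit twice, once with a constant-bound loop and once written out by hand, which is roughly what the tool sees after unrolling:

```verilog
// Hypothetical example: XOR-reduce a 4-bit input two ways.
module parity4 (
  input  [3:0] d,
  output reg   p_loop,
  output reg   p_unrolled
);
  integer i;

  always @(*) begin
    // Loop form: the bound (4) is a constant, so the loop can be unrolled.
    p_loop = 1'b0;
    for (i = 0; i < 4; i = i + 1)
      p_loop = p_loop ^ d[i];

    // Hand-unrolled form: the same chain of four XORs, one per iteration.
    p_unrolled = 1'b0;
    p_unrolled = p_unrolled ^ d[0];
    p_unrolled = p_unrolled ^ d[1];
    p_unrolled = p_unrolled ^ d[2];
    p_unrolled = p_unrolled ^ d[3];
  end
endmodule
```

Both outputs describe the same combinational logic. No counter or sequencing hardware is created for `i`.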
Yes, the key is that loops are always unrolled, so the number of iterations (the number of copies of the hardware) is fixed. But whether the output of each iteration is used or not can be entirely dynamic, potentially resulting in something very similar to a loop in software.
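As a rough illustration of that last point, here is a sketch of a "find the lowest set bit" block; the module name, port names and widths are assumptions of mine. Every iteration exists as hardware, but the `if` picks at run time which iteration's value reaches the output, much like a software loop with an early exit:

```verilog
// Hypothetical example: index of the lowest set bit in a request vector.
module first_set #(parameter WIDTH = 8) (
  input      [WIDTH-1:0] req,
  output reg [3:0]       index,  // wide enough for WIDTH <= 16
  output reg             found
);
  integer i;

  always @(*) begin
    index = 4'd0;
    found = 1'b0;
    // Scan from the top bit down; the last assignment wins, so the
    // lowest set bit determines the final value of index.
    for (i = WIDTH-1; i >= 0; i = i - 1) begin
      if (req[i]) begin
        index = i[3:0];
        found = 1'b1;
      end
    end
  end
endmodule
```

The loop bound is still fixed by `WIDTH` at elaboration time; only the selection between the unrolled copies depends on the data. I would expect this to synthesize to a priority chain of multiplexers, one stage per iteration.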
I agree about using C++ for actual IP block implementation. My experience has been pretty mixed, mostly because the tools (Intel HLS in my case) don't always give you a great idea of which constructs cause inefficient HDL code to be generated.

For example, passing a variable by reference in one context cost me an extra 10% logic blocks, and in another lowered it by 10%. It became a bit of a shotgun approach to optimising.

One does not pass a variable in an HDL design ;-). Trying to graft software principles onto FPGAs wastes so much performance. Become one with the underlying hardware and map your problem onto it, not onto an intermediate SW-like representation. Like some other comment mentioned, become one with the clock and your design will fly.
Simply not true. If you understand how the tools handle loops, you can create useful hardware with them just as easily as with any other construct in a given HDL.