
by gsmecher 1272 days ago

It is wonderful to see open-source tooling continue to invade the EDA space. Two pieces of context for non-practitioners:

First: the "slow" part of an HDL toolflow is synthesis/place-and-route, not elaboration or translation from an HLS into Verilog (which is discussed here). It's not even close: synthesis/P&R times for complex designs are measured in hours. Even complex HLS elaboration/translation (e.g. where scheduling algorithms are involved) is a much simpler, smaller computational problem than synth/P&R.

In other words: this work does not aim to make FPGA development look like software development. As the authors point out, it's most relevant for simulation toolflows, where synth/P&R are not involved anyways.

Second: there are several reasons synth/P&R flows do not make fully effective use of multiple threads. These reasons are just as applicable here. One critical factor is repeatability: to whatever extent work can be parallelized, the results must stay identical run-to-run - otherwise the design outputs change even when the input HDL doesn't. EDA vendors are very careful to prevent (or limit) this variation because their biggest clients insist on it.

It's not clear whether this paper (discussing elaboration) considers repeatability or not. To pick an example: if an HDL translator needs to generate a new net or instance name in one thread, it cannot race with another naming allocation on another thread except on a completely different piece of code - otherwise, net or instance names will not be repeatable run-to-run. (Even the order of generated code needs to stay consistent run-to-run - even if a permuted order is just two blocks of text changing position with no semantic difference.) Serializing around every possible race is obviously a huge buzzkill for parallel performance, unless the problem can be consistently partitioned so these races don't occur.
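
To make the "consistently partitioned" case concrete, here is a minimal Python sketch. It is not taken from the paper: `elaborate_module`, `elaborate_design`, and the `module__nN` naming scheme are all hypothetical, invented for illustration. Each module gets its own name counter, so threads never race on a shared allocator, and the output is assembled in a canonical (sorted) order rather than in completion order.

```python
from concurrent.futures import ThreadPoolExecutor


def elaborate_module(module_name, n_nets):
    """Hypothetical per-module elaboration step.

    The name counter is local to this module's scope, so the generated
    names depend only on the module itself, never on thread timing.
    """
    lines = []
    for counter in range(n_nets):
        lines.append(f"wire {module_name}__n{counter};")
    return "\n".join(lines)


def elaborate_design(modules, workers=4):
    """Elaborate (name, n_nets) pairs in parallel with repeatable output."""
    # Fix a canonical order up front so the emitted text does not depend
    # on the order in which the caller happened to list the modules.
    ordered = sorted(modules)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, regardless of
        # which worker finishes first.
        chunks = list(pool.map(lambda m: elaborate_module(*m), ordered))
    return "\n".join(chunks)


if __name__ == "__main__":
    design = [("uart", 3), ("alu", 2), ("fifo", 1)]
    print(elaborate_design(design))
```

With this partitioning, the thread count and the scheduling order should have no effect on the generated text. A single global counter shared across threads would not have that property without a lock, and a lock only prevents corruption; it does not make the name assignment repeatable.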

Normally, the synth/P&R flow gets all the attention - I view the authors' contribution primarily as a way to accelerate simulation and chip away at Verilog's rather ratty position as a machine-generated IR. Both of these are excellent places to be focusing and I'm delighted to see papers like this.