by tremon 1493 days ago
> No clocks to tick - no energy spent

If only it were that simple. Logic gates take time to settle, and each input gate switch or transient will have a ripple effect on all its downstream gates, which can be many in a complex circuit. Synchronous logic elements such as latches will block the spurious transients from propagating beyond the next clock barrier, but if you lack those, you also lose the protection against propagating logic transients. And every transient draws a little bit of power.
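To make the transient argument concrete, here is a minimal sketch of a single xor-gate whose two inputs flip at the same logical moment but arrive at different times. The function names, the waveforms, and the unit-step time model are all assumptions made for illustration; this is a toy model, not a gate-level simulator.

```python
def xor_output_trace(a_wave, b_wave, b_delay):
    """Sample an xor-gate's output at each time step.

    Input b is seen b_delay steps late (hypothetical wire/gate skew);
    before any delayed value has arrived, the gate sees b's initial value.
    """
    trace = []
    for t in range(len(a_wave)):
        b_seen = b_wave[max(t - b_delay, 0)]
        trace.append(a_wave[t] ^ b_seen)
    return trace


def count_transitions(trace):
    """Count output toggles; each one costs a little switching energy."""
    return sum(1 for prev, cur in zip(trace, trace[1:]) if prev != cur)


# Both inputs flip 0 -> 1 at step 2, so the settled output never changes.
a = [0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 1, 1]

aligned = xor_output_trace(a, b, b_delay=0)  # inputs arrive together
skewed = xor_output_trace(a, b, b_delay=2)   # b arrives two steps late

print(aligned, count_transitions(aligned))
print(skewed, count_transitions(skewed))
```

With aligned inputs the output never moves. With skewed inputs it pulses high and then returns, and every downstream gate sees that pulse unless something (such as a clocked element) blocks it.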

Imagine the ripple effects of a 64-bit 2-operand multiplier (simple ripple-carry, as it's the easiest to reason about). Since the inputs are probably not gated either, each of the 4096 adder tree inputs may arrive at a different time, and each input has an average of 96 downstream gates (64/2 adder tree height, 128/2 carry propagation length). The carry propagation is done through and-gates, which have an attenuating effect on the propagation length (each input bit flip only has a 50% chance of propagating the change), but the xor-gates for the adder propagate every transient. On average, you still get 64 transients per adder input transient, and 2048 transients for every operand bit flip (64 and-gates * 50% * 64 transients each). That's a lot to account for in your worst-case power envelope.
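The arithmetic behind those figures can be written out as a short sketch. It only reproduces the back-of-the-envelope estimate from the paragraph above; the variable names are made up for illustration, and the 64-transients-per-adder-input figure is taken from the comment as given rather than derived.

```python
# Rough transient estimate for a 64x64-bit multiplier, following the
# comment's own numbers (not a gate-level simulation).
BITS = 64

# One and-gate per pair of operand bits feeds the adder tree.
adder_tree_inputs = BITS * BITS                      # 4096

# Average number of gates downstream of one adder tree input.
avg_tree_height = BITS // 2                          # 64/2 = 32
avg_carry_length = (2 * BITS) // 2                   # 128/2 = 64
avg_downstream_gates = avg_tree_height + avg_carry_length  # 96

# Figure stated in the comment: average transients caused by one
# adder-input transient, after and-gate attenuation along the carry chain.
transients_per_adder_input = 64

# Each operand bit drives 64 partial-product and-gates; a flip passes
# through an and-gate only when the other input is 1 (assumed 50%).
and_gates_per_operand_bit = BITS
and_pass_probability = 0.5

transients_per_operand_flip = int(
    and_gates_per_operand_bit * and_pass_probability * transients_per_adder_input
)

print(adder_tree_inputs, avg_downstream_gates, transients_per_operand_flip)
```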

Yes, asynchronous designs are more flexible to work with. But they are less predictable in operation, not just in propagation delay but also in power usage. And you still need some form of inter-module communication, and that communication needs to account for differences in signal path length -- which is much easier to do if you can refer to a global clock.

I'm sure there have been successful asynchronous designs for specific applications (e.g. analog feedback control loops), and I haven't kept up with the last ten years of IC development, which is a lifetime, but most asynchronous logic designs weren't necessarily faster than their synchronous counterparts last time I checked.

1 comments

Contemporary intermodule designs are pipelined and message-oriented exactly because it is hard to predict differences in signal path length for long paths. I am talking about high-speed buses from ARM; I think I read about them in 2016 or so.

The same can be done with asynchronous designs, in a more relaxed way.

You said that asynchronous designs are less predictable in their use of power. Can you elaborate on that?

> The same can be done with asynchronous designs, in a more relaxed way.

Sure, just ask these guys:

https://chronostech.com/technology

Chronos Link: A QDI Interconnect for Modern SoCs https://ieeexplore.ieee.org/document/9179196

It's compatible with TileLink, which is SiFive's Fabric. https://bar.eecs.berkeley.edu/projects/tilelink.html