by bee_rider 1386 days ago
Anyway, since there aren't any dependencies between a, b, c, and d, I would expect the two divisions to end up basically in parallel in the pipeline. So the critical path is a division and a multiplication either way. Of course that is just a guess.

That assumes you can do multiple divisions in parallel. Back in the good old days, a single division unit was the norm, and it still is on most microcontrollers (assuming they even have hardware floating-point division[1]).
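To make the disagreement concrete, here is a toy latency model. It is only a sketch under loud assumptions: that the two expressions being compared are (a / b) * (c / d) and (a * c) / (b * d), that a divider is not pipelined, and that the two multiplications can overlap. The cycle counts and function names are made up for illustration and are not measurements of any real CPU.

```python
# Toy critical-path model. All cycle counts are illustrative assumptions,
# not measurements of any real CPU.
DIV_LATENCY = 14  # hypothetical cycles for one FP division
MUL_LATENCY = 4   # hypothetical cycles for one FP multiplication


def two_divs_then_mul(div_units: int) -> int:
    """Cycles for (a / b) * (c / d) with `div_units` unpipelined dividers."""
    if div_units >= 2:
        # Both independent divisions run side by side.
        divs_done = DIV_LATENCY
    else:
        # A single unpipelined divider forces the second division to wait.
        divs_done = 2 * DIV_LATENCY
    # The multiplication depends on both quotients.
    return divs_done + MUL_LATENCY


def two_muls_then_div() -> int:
    """Cycles for (a * c) / (b * d); the two multiplies are assumed to overlap."""
    return MUL_LATENCY + DIV_LATENCY


print(two_divs_then_mul(div_units=2))  # 18
print(two_divs_then_mul(div_units=1))  # 32
print(two_muls_then_div())             # 18
```

With two dividers the critical path is one division plus one multiplication either way, which matches the parent's guess. With a single unpipelined divider the two-division form is longer by one division latency. A real divider that is partially pipelined would land somewhere in between.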

Does anyone have any references on the current state of affairs on modern AMD/Intel CPUs?

[1]: The ARM Cortex-M4, for example, can have a hardware FPU, but division and sqrt are optional; see https://developer.arm.com/documentation/102832/latest/

https://en.wikichip.org/wiki/intel/microarchitectures/sunny_...

Looks like one FP divider on modern Intel, though you can pack multiple divisions into a single instruction.

For AMD, a brief search turned up throughput numbers but not how many dividers there are. I'd guess two??

Interesting! Looks like my guess was off -- mea culpa.
It appears that Agner Fog's website is down at the moment, so we must conclude that the universe does not want to share this knowledge.