by mitthrowaway2 534 days ago
I wonder to what extent the dedicated hardware is essentially implementing the same steps but at the transistor level.

The big cores do. They essentially pump division through something like an FMA (fused multiply-add) unit, possibly the same one used for multiplication and addition, which performs the Newton-Raphson or Goldschmidt refinement steps.
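To make the refinement steps concrete, here is a rough software sketch of both iterations. The function names and the starting guess `x0` are made up for illustration, and plain Python floats stand in for the FMA unit, so each marked line is rounded twice rather than once as a real fused operation would be.

```python
def newton_reciprocal(d, x, steps):
    """Refine x, a rough guess at 1/d, with Newton-Raphson steps."""
    for _ in range(steps):
        e = 1.0 - d * x   # residual; one FMA in hardware, fma(-d, x, 1.0)
        x = x + x * e     # correction; a second FMA, fma(x, e, x)
    return x


def newton_divide(n, d, x0, steps=4):
    """Compute n / d as n * (1/d), starting from the estimate x0 ~ 1/d."""
    return n * newton_reciprocal(d, x0, steps)


def goldschmidt_divide(n, d, x0, steps=4):
    """Scale n and d by the same factors until d reaches 1; n is then n/d."""
    n, d = n * x0, d * x0         # prescale so d is close to 1
    for _ in range(steps):
        f = 2.0 - d               # correction factor
        n, d = n * f, d * f       # two independent multiplies per step
    return n


print(newton_divide(1.0, 3.0, x0=0.3))
print(goldschmidt_divide(1.0, 3.0, x0=0.3))
```

The two multiplies in each Goldschmidt step don't depend on each other, so they can overlap in a pipelined multiplier, whereas the two operations in each Newton-Raphson step are sequential.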

In hardware it's much easier to do a LUT-based approximation for the initial estimate rather than the subtraction trick, though.
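A LUT-based estimate might look roughly like this in software. The 7-bit index, the midpoint-reciprocal table entries, and the name `reciprocal_estimate` are all assumptions for the sketch; real hardware tables store entries at reduced precision and differ between designs. It only handles positive, finite, normal inputs.

```python
import math

TABLE_BITS = 7  # assumption: 128-entry table indexed by the top mantissa bits

# frexp() yields a mantissa m in [0.5, 1). Split that range into equal
# intervals and store the reciprocal of each interval's midpoint.
TABLE = [
    1.0 / (0.5 + (i + 0.5) / 2 ** (TABLE_BITS + 1))
    for i in range(2 ** TABLE_BITS)
]


def reciprocal_estimate(d):
    """Rough 1/d for positive finite d: table lookup plus exponent negation."""
    m, e = math.frexp(d)                          # d == m * 2**e, 0.5 <= m < 1
    i = int((m - 0.5) * 2 ** (TABLE_BITS + 1))    # top mantissa bits as index
    return math.ldexp(TABLE[i], -e)               # 1/d ~ (1/m) * 2**-e


for d in (0.1, 3.0, 1000.0):
    rel_err = abs(1.0 - d * reciprocal_estimate(d))
    print(d, reciprocal_estimate(d), rel_err)
```

With this layout the relative error stays below 2^-(TABLE_BITS + 1), so a 128-entry table gives about 8 accurate bits.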

It's common for CPUs to give 6-8 accurate bits in the approximation. x86's RCPSS gives about 12 accurate bits (documented relative error of at most 1.5 × 2^-12). Back in 1975, the Cray-1 gave 30 (!) accurate bits in the first approximation, and it didn't even have a division instruction (everything about that machine was big and fast).
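The starting accuracy matters because each Newton-Raphson step roughly doubles the number of accurate bits. A back-of-the-envelope sketch, ignoring rounding error in the steps themselves (`newton_steps_needed` is a hypothetical helper, and 24 and 53 are the significand widths of IEEE single and double precision):

```python
def newton_steps_needed(initial_bits, target_bits):
    """Count refinement steps, assuming each one doubles the accurate bits."""
    steps, bits = 0, initial_bits
    while bits < target_bits:
        bits *= 2
        steps += 1
    return steps


for initial in (6, 8, 30):
    print(initial,
          newton_steps_needed(initial, 24),   # single precision
          newton_steps_needed(initial, 53))   # double precision
```

Under that idealized model, a 30-bit first approximation needs only one step to pass 53 bits.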