by adgjlsfhk1 828 days ago
no it's not. cordic has awful convergence of 1 bit per iteration. pretty much everyone uses power series.

Actually pretty much everyone implements double precision sin/cos using the same (IIRC) pair of 6th order polynomials. The same SunPro code exists unchanged in essentially every C library everywhere. It's just a fitted curve, no fancy series definition beyond what appears in the output coefficients. One for the "mostly linear" segment where the line crosses the origin and another for the "mostly parabolic" peak of the curve.
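For anyone who wants to see the shape of the polynomial approach, here is a minimal sketch. It assumes the argument has already been reduced to roughly [-pi/4, pi/4], and it uses plain Taylor coefficients as a stand-in, not the fitted coefficients from the SunPro code (which are not reproduced here). The names COEFFS and sin_poly are made up for illustration.

```python
import math

# Coefficients of sin(x)/x in powers of x^2, up to the x^13 term of sin(x).
# ASSUMPTION: Taylor coefficients used as a stand-in for the fitted ones.
COEFFS = [(-1) ** k / math.factorial(2 * k + 1) for k in range(7)]

def sin_poly(x):
    """Approximate sin(x) for x in about [-pi/4, pi/4] using Horner's rule."""
    x2 = x * x
    acc = 0.0
    # Horner evaluation in x^2: one multiply and one add per coefficient.
    for c in reversed(COEFFS):
        acc = acc * x2 + c
    # Multiply by x at the end to get the odd polynomial back.
    return x * acc
```

The whole evaluation is a handful of multiply-adds. A fitted polynomial would get slightly better worst-case error from the same number of terms, which is presumably why the real code uses one.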
That is 64 iterations for a double, that is nothing!
53, but that's still a lot more than the 5th degree polynomial that you need.
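To make the "one bit per iteration" point concrete, here is a rough sketch of rotation-mode CORDIC. It uses Python floats in place of the fixed-point registers real hardware would use, and the function name cordic_sin_cos and its parameters are hypothetical. It only handles angles in [-pi/2, pi/2].

```python
import math

def cordic_sin_cos(theta, iterations=53):
    """Rotation-mode CORDIC sketch; returns (sin, cos) for theta in [-pi/2, pi/2]."""
    # Elementary rotation angles atan(2^-i); hardware would keep these in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector, so start with the inverse gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        # Rotate toward zero residual angle.
        d = 1.0 if z >= 0.0 else -1.0
        # The multiplications by 2^-i are plain shifts in fixed-point hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x
```

Every iteration depends on the x, y, z produced by the previous one, and the residual angle only shrinks by about a factor of two each time, so the error after n iterations is on the order of 2^-(n-1).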
yeah, but 52 adds can be a lot cheaper than a few multiplies, if you're making them out of shift registers and logic gates (or LUT). in a CPU or GPU, who cares, moving around the data is 100x more expensive than the ALU operation.
> in a CPU or GPU, who cares, moving around the data is 100x more expensive than the ALU operation

Moving data is indeed expensive, but there’s another reason not to care: modern CPUs take the same time to add or multiply floats.

For example, the computer I’m using, with AMD Zen 3 CPU cores, takes 3 cycles to add or multiply numbers, which applies to both 32- and 64-bit flavors of floats. See the addps, mulps, addpd, and mulpd SSE instructions in this table: https://www.uops.info/table.html

> moving around the data is 100x more expensive than the ALU operation.

This is exactly the problem with CORDIC. 52 dependent adds require moving data from a register to the ALU and back 52 times.

It's the problem with CORDIC in that context, yes!