by eru 652 days ago
It's also from an era when floats were rather expensive.

I still picture them as expensive. Things like trig functions are still very expensive.
I learned relatively recently that trig functions on the GPU are free if you don’t use too many of them; there’s a separate hardware pipe, so they can execute in parallel with float adds and muls. There’s still extra latency, but it’ll hide if there’s enough other stuff in the vicinity.
The CUDA documentation tells me that there are more performant but less precise trigonometric functions: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index....

Do you know if that hardware pipeline works only for these intrinsic variants?

Yep, these intrinsics are what I was referring to. And yes, the software versions won’t use the hardware trig unit; I would assume they’re written using an approximating spline and/or Newton’s method, probably mostly with adds and multiplies. Note that the loss of precision with these fast-math intrinsics isn’t very much; it’s usually 1 or 2 bits at most.
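
To make "mostly adds and multiplies" concrete, here is a rough sketch in C. It is not CUDA's actual implementation: `poly_sin` is a made-up name, and a plain Taylor polynomial is used only because it is easy to read. Production math libraries typically pair a tuned polynomial with a range-reduction step, which this sketch skips by assuming the input is already in [-pi/2, pi/2].

```c
/* Hypothetical helper, for illustration only.
   Degree-7 Taylor polynomial for sin(x), written in Horner form so the
   whole evaluation is a short chain of multiplies and adds.
   Assumption: x is already in [-pi/2, pi/2]; no range reduction here. */
static float poly_sin(float x)
{
    const float c3 = -1.0f / 6.0f;     /* -1/3! */
    const float c5 =  1.0f / 120.0f;   /* +1/5! */
    const float c7 = -1.0f / 5040.0f;  /* -1/7! */

    float x2 = x * x;                  /* reused by every term */

    /* x * (1 + x^2 * (c3 + x^2 * (c5 + x^2 * c7))) */
    return x * (1.0f + x2 * (c3 + x2 * (c5 + x2 * c7)));
}
```

On that interval the truncation error peaks near the endpoints at roughly 1.6e-4, which is far worse than a real library routine; the point is only the shape of the computation, not the accuracy.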
I couldn't find much information on those. I assume that they don't include range reduction?
I’m not totally sure but I think fast math usually comes with loss of support for denormals, which is a bit of range reduction. Note that even if they had denormals, the absolute error listed in the chart is much bigger than the biggest denorm. So you don’t lose range out at the large ends, but you might for very small numbers. Shouldn’t be a problem for sin/cos since the result is never large, but maybe it could be an issue for other ops.
Yes, but here it's about avoiding multiplication (and division).

I suspect on a modern processor the branches (i.e. "if"s) in Bresenham's algorithm are gonna be more expensive than the multiplications and divisions in the naive algorithm.

Bresenham is easy to implement branchlessly using conditional moves or arithmetic. It also produces a regular pattern of branches that is favorable for modern branch predictors that can do pattern prediction.
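
A minimal sketch of the arithmetic variant, in C. Assumptions: it only handles the first octant (0 <= dy <= dx), and `line_first_octant` and its output-array parameters are names invented for this example. The per-pixel "if" is replaced by a 0/1 comparison result and a mask; whether the compiler actually emits a conditional move or set instruction instead of a jump depends on the compiler and target.

```c
/* Hypothetical helper, for illustration only.
   Rasterizes the line (x0,y0)-(x1,y1), assuming 0 <= y1-y0 <= x1-x0.
   Writes at most cap points into xs/ys and returns how many were written. */
static int line_first_octant(int x0, int y0, int x1, int y1,
                             int *xs, int *ys, int cap)
{
    int two_dx = 2 * (x1 - x0);
    int two_dy = 2 * (y1 - y0);
    int err = two_dy - (x1 - x0);   /* classic integer decision variable */
    int y = y0;
    int n = 0;

    for (int x = x0; x <= x1 && n < cap; ++x) {
        xs[n] = x;
        ys[n] = y;
        ++n;

        int step = err > 0;          /* 1 if we should move up a row, else 0 */
        y += step;                   /* replaces: if (err > 0) y++; */

        /* -step is 0 or all ones, so the mask selects 0 or two_dx.
           Replaces: if (err > 0) err -= two_dx; */
        err += two_dy - (-step & two_dx);
    }
    return n;
}
```

For the line from (0,0) to (5,2) this produces y values 0, 0, 1, 1, 2, 2. The loop condition is still a branch, but it is taken on every iteration except the last, so it predicts well; only the data-dependent per-pixel decision has been turned into arithmetic.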