by scottlamb
97 days ago
Isn't the faster approach SIMD [edit: or GPU]? A 1.05x to 1.90x speedup is great. A 16x speedup is better! They could be orthogonal improvements, but if I were prioritizing, I'd go for SIMD first. I searched for asin in Intel's intrinsics guide. They have an AVX-512 intrinsic, `_mm512_asin_ps`, but it's listed as a "sequence" rather than a single instruction. Presumably the actual sequence they use is in some header file somewhere, but I don't know offhand where to look, so I don't know how it compares to a SIMDified version of `fast_asin_cg`. https://www.intel.com/content/www/us/en/docs/intrinsics-guid...