by rrhjm53270
618 days ago
Thank you for sharing such interesting work. One small comment: try adding some more aggressive optimization options to the SIMD C++ code to see the performance difference. On my side, with an AMD Ryzen 9 7900X3D CPU, I get:
- 0.0592569 ms for `-O3 -march=native`
- 1.7741e-05 ms for `-funsafe-math-optimizations -Ofast -flto=auto -pipe -march=native`
Pretty sure your 17.7 nanoseconds result had the whole function optimized away. Workarounds are tricky and compiler-specific.
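
For what it's worth, here is a rough sketch of one common workaround on GCC and Clang: an empty inline-asm statement used as an optimization barrier, so the compiler has to assume the inputs may have changed and the result is read. Everything here is an assumption for illustration: `dot` is a hypothetical stand-in for the kernel in the post, and `do_not_optimize` and `time_dot_ms` are names I made up. It will not build on MSVC, and I have not checked how it interacts with every flag combination above.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Optimization barrier (GCC/Clang inline asm only; an assumption, not portable).
// The empty asm claims to read `value` from memory and to touch arbitrary
// memory, so the compiler cannot delete the code that produced `value`.
template <class T>
inline void do_not_optimize(const T& value) {
    asm volatile("" : : "m"(value) : "memory");
}

// Hypothetical stand-in for the kernel being benchmarked.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// Average wall-clock time per call to dot(), in milliseconds.
double time_dot_ms(const std::vector<float>& a,
                   const std::vector<float>& b, int iters) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iters; ++i) {
        const float* pa = a.data();
        const float* pb = b.data();
        do_not_optimize(pa);  // inputs may have changed: no hoisting out of the loop
        do_not_optimize(pb);
        const float r = dot(a, b);
        do_not_optimize(r);   // result is "used": the call is not dead code
    }
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / iters;
}
```

Calling something like `time_dot_ms(a, b, 100000)` with and without the barriers, under both flag sets, should show whether the fast number survives. It is also worth looking at the generated assembly to confirm the loop body is still there.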