by nkurz 3707 days ago
Yes, that's a great link, and I agree that if intrinsics give you the performance you want, they are usually the better choice. But if you need high performance that is portable across compilers, I find it can be really hard to get good results from GCC, ICC, and Clang simultaneously with intrinsics.

Another approach that's not quite there yet but is becoming more viable is to use Cilk Plus (https://www.cilkplus.org) to annotate your C code to force automatic vectorization. It's native to ICC, built into GCC 5.0+, and available as an extension to Clang: https://news.ycombinator.com/item?id=11550250
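
To make the annotation idea concrete, here is a minimal sketch, assuming the Cilk Plus `#pragma simd` loop annotation. The `saxpy` function and its parameters are hypothetical names chosen for illustration. The `__cilk` guard relies on my understanding that compilers define that macro when Cilk Plus is enabled; without it the pragma is skipped and the loop is left to the ordinary auto-vectorizer.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n).
 * `restrict` promises the compiler that x and y do not overlap,
 * which removes one common obstacle to vectorization. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* Assumption: __cilk is defined only when Cilk Plus is enabled
     * (for example, GCC with -fcilkplus). Otherwise the pragma is
     * compiled out and this is a plain scalar loop. */
#ifdef __cilk
#pragma simd
#endif
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

On GCC this would need `-fcilkplus` for the pragma to take effect. The pragma asks the compiler to vectorize rather than leaving the decision to its cost model, so it is up to you to make sure the loop really has no cross-iteration dependencies.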