by __s 1192 days ago
If you look at the output of your compiler, you'll see many unnecessary loads/stores. Vectorized code in particular still comes out lacking, even with intrinsics.

In fact, you can benchmark openssl's assembly vs openssl's C: https://github.com/openssl/openssl/blob/master/crypto/aes/ae...

Granted, they aren't using intrinsics in that code, but a sufficiently smart compiler shouldn't need intrinsics
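For a small way to see the load/store issue yourself, here's a minimal sketch (not taken from openssl; the function names `xor_key` and `xor_key_restrict` are made up for illustration). Since `uint8_t` is normally a character type, the compiler has to assume `dst` may alias `key` in the first version, so it will often reload `*key` on every iteration. The `restrict` version promises no aliasing, which usually lets the load be hoisted out of the loop. Exact output depends on compiler, version, and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: dst may alias key, so *key is often
   reloaded from memory on each iteration. */
void xor_key(uint8_t *dst, const uint8_t *key, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= *key;
}

/* Same loop, but restrict tells the compiler dst and key do not
   overlap, so it is free to load *key once before the loop. */
void xor_key_restrict(uint8_t *restrict dst,
                      const uint8_t *restrict key, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= *key;
}
```

Compile with something like `cc -O2 -S` and compare the two loop bodies in the generated assembly. Both functions compute the same result when the buffers don't overlap; only the generated code should differ.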