by __s
1192 days ago
If you look at the output of your compiler, you'll see many unnecessary loads/stores. Vectorized code in particular still comes out lacking, even with intrinsics. In fact, you can benchmark openssl's assembly vs openssl's C: https://github.com/openssl/openssl/blob/master/crypto/aes/ae... Granted, they aren't using intrinsics in that code, but a sufficiently smart compiler shouldn't need intrinsics.
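One small way to see the kind of redundant load meant here is sketched below. It is not taken from openssl; the function names `scale_add` and `scale_add_restrict` are made up for illustration, and what a given compiler actually emits depends on its version and flags.

```c
#include <stddef.h>

/* Hypothetical example. Without restrict, dst may alias scale, so the
   compiler must assume each store to dst[i] could change *scale. It
   typically either reloads *scale on every iteration or guards a faster
   loop with a runtime overlap check. */
void scale_add(float *dst, const float *src, const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * *scale;
}

/* Same loop, but restrict promises the three pointers do not overlap,
   so the compiler is free to load *scale once and keep it in a register. */
void scale_add_restrict(float *restrict dst, const float *restrict src,
                        const float *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i] * *scale;
}
```

Compiling with something like `cc -O2 -S` and comparing the two loop bodies should show the difference, though the exact assembly will vary.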