Hacker News
by dist-epoch 21 days ago
Because you are working in the cache.

Also, you should use SIMD.

1 comment

> Also, you should use SIMD.

Ironically, no. Clang is better at auto-vectorizing.

Better than what? And do you use `-mavx2` or do you let it target baseline x86_64 and miss out on 8-float vectors? How do you make sure its autovectorisation is successful?
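
For what it's worth, here is a minimal sketch of one way to check that last point, assuming clang. The `saxpy` kernel and the file name `saxpy.c` are made up for illustration and are not from the thread. Clang's `-Rpass=loop-vectorize` and `-Rpass-missed=loop-vectorize` options make it emit a remark for each loop it did or did not vectorize, so you don't have to guess:

```c
/* saxpy.c: hypothetical kernel for checking clang's auto-vectorizer.
 *
 * Build with AVX2 enabled and ask for vectorization remarks:
 *   clang -O3 -mavx2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -c saxpy.c
 *
 * Build for baseline x86_64 (SSE2 only, 4 floats per vector) to compare:
 *   clang -O3 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize -c saxpy.c
 */
#include <stddef.h>

/* Computes y[i] += a * x[i] for i in [0, n).
 * `restrict` promises the compiler that x and y do not overlap, so the
 * vectorizer does not need a runtime aliasing check before the loop. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

If the loop was vectorized, the remark should report the vectorization width it picked, and you can confirm by looking for `ymm` registers in the output of `clang -O3 -mavx2 -S saxpy.c`. If it was not, the `-Rpass-missed` remark points at the loop, and `-Rpass-analysis=loop-vectorize` usually gives the reason.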