| Sure; but it’s hard to do and very few programs get optimised to this point. Before reaching for vector instructions, I’ll: - Benchmark, and verify that the code is hot. - Rewrite from Python, Ruby, JS into a systems language (if necessary). Honorary mention for C# / Go / Java, which are often fast enough. - Change to better data structures. Bad data structure choices are still so common. - Reduce heap allocations. They’re more expensive than you think, especially when you take into account the effect on the cpu cache Do those things well, and you can often get 3 or more orders of magnitude improved performance. At that point, is it worth reaching for SIMD intrinsics? Maybe. But I just haven’t written many programs where fast code written in a fast language (c, rust, etc) still wasn’t fast enough. I think it would be different if languages like rust had a high level wrapper around simd that gave you similar performance to hand written simd. But right now, simd is horrible to use and debug. And you usually need to write it per-architecture. Even Intel and amd need different code paths because Intel has dumped avx2. Outside generic tools like Unicode validation, json parsing and video decoding, I doubt modern simd gets much use. Llvm does what it can but …. |