by exDM69
815 days ago
Compared to what? Compared to scalar, loopy C code, sure: LLVM's auto-vectorization is not great. But give LLVM explicit SIMD code as input and it will optimize it well. It does a great job with register allocation, spill code, instruction scheduling, etc. Instruction selection isn't as good, and you still need intrinsics for specialized instructions. You also get all of this on every CPU architecture LLVM targets, and it deals with future microarchitecture changes for free. E.g. if Intel adds more execution ports, they will get used with no code changes on your side, just a recompile. With infinite time you can still do better by hand, but it gets expensive fast, especially if you have several CPU architectures to deal with.
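To make "SIMD code as input" concrete, here is a minimal sketch using the GCC/Clang `vector_size` extension, which is not ISO C. The `f32x4` type and the `dot_f32` helper are hypothetical names made up for this example, and it assumes `n` is a multiple of 4. The source only says which operations happen on four lanes; which registers get used, what gets spilled and how the instructions are scheduled is left to the compiler for whatever target you build for.

```c
#include <stddef.h>
#include <string.h>

/* Four packed floats. vector_size is a GCC/Clang extension, not ISO C. */
typedef float f32x4 __attribute__((vector_size(16)));

/* Hypothetical helper: dot product of two float arrays.
 * Assumes n is a multiple of 4; there is no scalar tail loop. */
float dot_f32(const float *a, const float *b, size_t n)
{
    f32x4 acc = {0.0f, 0.0f, 0.0f, 0.0f};

    for (size_t i = 0; i < n; i += 4) {
        f32x4 va, vb;
        /* memcpy is the portable way to express an unaligned vector load;
         * compilers typically turn it into a single load instruction. */
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        acc += va * vb; /* lane-wise multiply and accumulate */
    }

    /* Horizontal sum of the four lanes. */
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

The same file should build for x86-64, AArch64 or any other target the compiler supports. Specialized instructions with no generic vector equivalent would still need target intrinsics.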