by exDM69
2725 days ago
> I wonder if GCC has improved since then.

Yes, it has. I've written a lot of SIMD code and spent a good amount of time reading the compiler's assembly output, and there has been a huge improvement over the last decade. GCC's register allocation wasn't great, then it got better with x86 SSE but still sucked at ARM NEON, and now it seems to be decent with both. Clang was better at SIMD code before GCC was. It was equally good with SSE and NEON.

In my experience, compilers are much better than humans at instruction scheduling. Especially when using portable vector extensions, you don't have to write the same code twice and then tweak the scheduling for every architecture separately.
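For anyone who hasn't seen the portable vector extensions mentioned above, here is a minimal sketch using the `vector_size` attribute that both GCC and Clang accept. The names `v4f` and `saxpy_v4` are made up for illustration, and the requirement that `n` be a multiple of 4 is an assumption of the sketch, not something from the comment. The same source should compile to SSE on x86 and NEON on ARM, with scheduling left to the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: four floats packed into one 16-byte vector. */
typedef float v4f __attribute__((vector_size(16)));

/* Computes y[i] += a * x[i] four lanes at a time.
 * Assumption of this sketch: n is a multiple of 4. */
void saxpy_v4(float a, const float *x, float *y, size_t n)
{
    assert(n % 4 == 0);
    const v4f va = {a, a, a, a};          /* broadcast the scalar */
    for (size_t i = 0; i < n; i += 4) {
        v4f vx, vy;
        memcpy(&vx, x + i, sizeof vx);    /* load without assuming alignment */
        memcpy(&vy, y + i, sizeof vy);
        vy += va * vx;                    /* element-wise multiply and add */
        memcpy(y + i, &vy, sizeof vy);    /* store the result back */
    }
}
```

The `memcpy` calls are there so the sketch makes no alignment assumptions; at normal optimization levels compilers usually turn them into plain vector loads and stores, but that is worth confirming in the generated assembly.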
It'd be more accurate to say they're much better than humans when the heuristics, or whatever they use, work. Sometimes the compiler messes up badly.
The workflow is often to compile and then examine the disassembly to see whether the compiler managed to generate something sensible.
Another issue is that the compiler's pattern matching sometimes fails to produce the right SIMD instruction, even when the data is aligned to the SIMD width. For example, I recently saw ICC fail to generate a horizontal add in the most basic scenario imaginable. *shrug*
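To make the horizontal add case concrete, here is a rough sketch of the kind of code where you would want to check the output. It uses the GCC/Clang vector extensions rather than ICC-specific code, and the names `v4f`, `hsum_v4` and `sum_f32` are hypothetical. Whether a given compiler turns the lane reduction into a horizontal add instruction, a couple of shuffles, or something worse is exactly what you would look for in the disassembly (for example with `gcc -O2 -S`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC/Clang vector extension: four floats packed into one 16-byte vector. */
typedef float v4f __attribute__((vector_size(16)));

/* Horizontal add: collapse the four lanes of v into a single scalar. */
static float hsum_v4(v4f v)
{
    return (v[0] + v[1]) + (v[2] + v[3]);
}

/* Sums n floats using a 4-lane accumulator, then reduces the lanes.
 * Assumption of this sketch: n is a multiple of 4. */
float sum_f32(const float *p, size_t n)
{
    assert(n % 4 == 0);
    v4f acc = {0.0f, 0.0f, 0.0f, 0.0f};
    for (size_t i = 0; i < n; i += 4) {
        v4f v;
        memcpy(&v, p + i, sizeof v);   /* load without assuming alignment */
        acc += v;                      /* vertical (lane-wise) add */
    }
    return hsum_v4(acc);               /* the horizontal step */
}
```

Note that the lane-wise accumulation adds the floats in a different order than a plain scalar loop, so results can differ slightly in the last bits for general inputs.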