by zamadatix
1490 days ago
I'm not sure I'd say many compilers are even that great with SIMD these days, and that is an easier problem than what the Itanium was asking of compilers. There are real gains to be had from SIMD, but they tend to come in massively parallel data processing workloads (image/video processing, neural networks) with specially written SIMD code or even hand-tuned assembly, not from just feeding in a source file, compiling with the SIMD flag, and then realizing meaningful gains.
SIMD is harder because you have to have a uniform operation across a set of data.
Imagine a for loop whose body computes p[i], d[i], and q[i] on each iteration.
For SIMD, this is a complicated mess for the compiler to unravel. What the compiler would LIKE to do is turn this into 3 for loops and use the SIMD instructions to perform those operations in parallel.
The Itanium optimization, however, is a lot easier. The compiler can see that none of p, d, or q depend on the results of the previous stage (that is, q[i] doesn't depend on p[i]). As a result, the three operations can be packed into a single instruction bundle.
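To make that concrete, here is a rough sketch in C of the kind of loop being described and the 3-loop form a SIMD-minded compiler would want. The arithmetic (+1, *2, -3) and the names `fused`, `split`, and `loops_agree` are made up for illustration; the only property taken from the comment above is that the p, d, and q updates don't depend on each other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fused loop: three independent updates per iteration.
 * The specific arithmetic is an assumption, chosen only so that
 * p, d, and q never read each other's results. */
void fused(float *p, float *d, float *q, size_t n) {
    assert(p && d && q);
    for (size_t i = 0; i < n; i++) {
        p[i] = p[i] + 1.0f;
        d[i] = d[i] * 2.0f;
        q[i] = q[i] - 3.0f;
    }
}

/* The same work after splitting into 3 loops. Each loop now applies
 * one uniform operation across an array, which is the shape SIMD
 * instructions need. */
void split(float *p, float *d, float *q, size_t n) {
    assert(p && d && q);
    for (size_t i = 0; i < n; i++) p[i] = p[i] + 1.0f;
    for (size_t i = 0; i < n; i++) d[i] = d[i] * 2.0f;
    for (size_t i = 0; i < n; i++) q[i] = q[i] - 3.0f;
}

/* Returns 1 if both forms produce identical arrays on a small input. */
int loops_agree(void) {
    enum { N = 8 };
    float p1[N], d1[N], q1[N];
    float p2[N], d2[N], q2[N];
    for (size_t i = 0; i < N; i++) {
        p1[i] = p2[i] = (float)i;
        d1[i] = d2[i] = (float)i;
        q1[i] = q2[i] = (float)i;
    }
    fused(p1, d1, q1, N);
    split(p2, d2, q2, N);
    for (size_t i = 0; i < N; i++) {
        if (p1[i] != p2[i] || d1[i] != d2[i] || q1[i] != q2[i]) {
            return 0;
        }
    }
    return 1;
}
```

The split is only legal because the three statements are independent. The same independence is what lets a VLIW-style compiler schedule the three operations of one iteration side by side without splitting the loop at all.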
Now, of course, modern out-of-order (OOO) processors can do the same optimization in hardware, so maybe it's not a huge win? Still, it would have been worth exploring more (IMO), but market forces killed it. Moving that sort of optimization out of the processor hardware and into the compiler software seems like it could lead to some nice power/performance benefits.