by tomxor
1486 days ago
I think in summary what you are alluding to is instruction decoding and scheduling, as the sibling comment points out, which is indeed a large cost in both speed and power.

> SIMD vs no SIMD has a neglectable overhead compared to entering a lower power state sooner.

Yes, on the same CPU, as I suggested, the real-world difference may be unmeasurable. However, note that this particular case is interesting because it's not comparing fewer serial multiplies to more SIMD multiplies; it's comparing SIMD multiplies to no multiplies but with a serial constraint due to variable dependence, i.e. it's SIMD vs no multiply without any other difference in the number or type of ALU ops. Again, that could make no difference on a big x86 in practice, but it would be interesting to know.

All of this changes if you are coding for a lower-power device and have a choice of hardware.
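
To make the shape of that comparison concrete, here is a rough sketch in C. It is not the code under discussion in the thread; the quadratic, the coefficients `A`, `B`, `C`, and the function names `fill_multiply` and `fill_additive` are all hypothetical, picked only to show one loop with independent multiplies (which a compiler may vectorize) next to one with additions only but a loop-carried dependency.

```c
#include <stddef.h>

/* Hypothetical coefficients, chosen only for illustration. */
#define A 3
#define B 5
#define C 7

/* Multiply version: each element depends only on i, so the iterations
   are independent and a compiler is free to vectorize them. */
void fill_multiply(long *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        long x = (long)i;
        out[i] = A * x * x + B * x + C;
    }
}

/* Addition-only version: the same values via finite differences.
   y and d are carried from one iteration to the next, so each step
   has to wait for the previous one (the serial constraint). */
void fill_additive(long *out, size_t n) {
    long y = C;      /* f(0) */
    long d = A + B;  /* first difference: f(1) - f(0) */
    for (size_t i = 0; i < n; i++) {
        out[i] = y;
        y += d;      /* depends on the previous y */
        d += 2 * A;  /* second difference of a quadratic is constant */
    }
}
```

Both functions fill the buffer with the same values, so any difference in speed or power between them would come from how the hardware handles vectorizable multiplies versus a dependent chain of adds. Whether a given compiler actually vectorizes the first loop depends on the flags and target, so it is worth checking the generated assembly before measuring.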