by ascar
1483 days ago
Oh, you're right, my bad. I got tripped up by the "4xSIMD": I assumed you meant it's using 4x 64-bit SIMD there, which it isn't. The mulpd and addpd here operate on 2x 64-bit values, which you can also see from the xmm rather than ymm registers. I also mixed up counting all instructions, including the loop logic, with counting just the instructions needed for the main computation. Obviously the former is the correct measure, and I was sloppy.
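
To spell out the register-width arithmetic, here is a tiny sketch. It is only an illustration: the helper name `packed_lanes` and the lookup table are made up for this comment and are not taken from the code under discussion. The widths themselves (xmm is 128 bits, ymm is 256 bits) are standard x86 facts.

```python
# Illustrative only: packed_lanes and REGISTER_BITS are hypothetical names.
# xmm registers are 128 bits wide (SSE), ymm registers are 256 bits wide (AVX).
REGISTER_BITS = {"xmm": 128, "ymm": 256}


def packed_lanes(register: str, element_bits: int = 64) -> int:
    """Return how many elements of element_bits fit in one register."""
    return REGISTER_BITS[register] // element_bits


# Packed-double instructions work on 64-bit elements.
print(packed_lanes("xmm"))  # 2 doubles per xmm register
print(packed_lanes("ymm"))  # 4 doubles per ymm register
```

So a packed-double instruction on xmm registers handles two doubles at a time, and it would take ymm registers to get four.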