by sampo
4635 days ago
> assuming vectorization doesn't come into play.

Now that 256-bit AVX registers process 4 numbers in one go even with 64-bit floats (and 8 with 32-bit floats), vectorization comes into play more and more. With 64-bit floats in 128-bit SSE registers, it was kinda possible to ignore vectorization, since it was less than a 2x speedup. But no more.
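
To make the arithmetic concrete, here is a rough sketch of the lane counts behind those numbers. The helper `lanes` is a made-up name for illustration, and the result is only an upper bound on the per-instruction speedup, not a measurement.

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """How many elements of the given width fit in one SIMD register."""
    return register_bits // element_bits

# Register widths taken from the comment above: 128-bit SSE, 256-bit AVX.
for name, reg_bits in (("SSE", 128), ("AVX", 256)):
    for elem_bits in (64, 32):
        n = lanes(reg_bits, elem_bits)
        print(f"{name} ({reg_bits}-bit), {elem_bits}-bit floats: {n} per instruction")
```

Real speedups will usually come in below the lane count, since memory bandwidth and non-vectorizable code get in the way.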