by vardump
3979 days ago
Scalar output would be way less than that number, 25 GFLOPS. Scalar code tops out at about 2 FLOPs per cycle, so at most 2x the clock frequency in GFLOPS. It's likely their benchmark just doesn't support AVX2 (and FMA [1]). You get about 25 GFLOPS if you use SSE only.

[1]: https://en.wikipedia.org/wiki/FMA_instruction_set
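
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration, not something stated in the thread: a single core at a hypothetical 3.0 GHz, single-precision floats, one add plus one multiply issued per cycle for scalar and SSE code, and two FMA units for AVX2. The function name `peak_gflops` is made up for this example.

```python
def peak_gflops(clock_ghz: float, lanes: int, ops_per_cycle: int,
                flops_per_op: int = 1) -> float:
    """Theoretical peak GFLOPS for one core.

    clock_ghz     -- core clock in GHz (assumed value, not from the thread)
    lanes         -- floats processed per instruction (1 scalar, 4 SSE, 8 AVX2)
    ops_per_cycle -- arithmetic instructions issued per cycle (assumed 2)
    flops_per_op  -- FLOPs per lane per instruction (2 for a fused multiply-add)
    """
    return clock_ghz * lanes * ops_per_cycle * flops_per_op


CLOCK_GHZ = 3.0  # hypothetical clock; substitute the real CPU's frequency

scalar = peak_gflops(CLOCK_GHZ, lanes=1, ops_per_cycle=2)                     # 2x clock
sse = peak_gflops(CLOCK_GHZ, lanes=4, ops_per_cycle=2)                        # 4-wide, no FMA
avx2_fma = peak_gflops(CLOCK_GHZ, lanes=8, ops_per_cycle=2, flops_per_op=2)   # 8-wide with FMA

print(f"scalar:   {scalar:.0f} GFLOPS")
print(f"SSE:      {sse:.0f} GFLOPS")
print(f"AVX2+FMA: {avx2_fma:.0f} GFLOPS")
```

Under these assumptions the SSE figure lands in the mid-20s, in line with the "about 25 GFLOPS" above, while scalar stays at 2x the clock and AVX2 with FMA is four times the SSE number.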