by janwas 1481 days ago
Yes, we can sort 64-bit ints. The speedup on AVX2 is roughly 2/3 of the 10x we see on AVX-512. Longsort appears to be an autovectorized sorting network. That's only going to be competitive, or even viable, for relatively small arrays (thousands of elements). See comments above on djbsort.

Why not use whichever AVX the CPU has? Not a problem when using runtime dispatch :)
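For anyone unfamiliar with runtime dispatch, here is a minimal sketch of the idea, not the actual library code. The names SortAVX512, SortAVX2, SortScalar, ChooseSort and Sort are hypothetical, and all three "implementations" just call std::sort so the example runs anywhere. It assumes GCC or Clang on x86 for the `__builtin_cpu_supports` check and falls back to the scalar path elsewhere.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical per-target entry points. In a real library each would be
// compiled for a different instruction set and use its vector instructions;
// here they all forward to std::sort so the sketch stays runnable.
void SortAVX512(uint64_t* keys, size_t n) { std::sort(keys, keys + n); }
void SortAVX2(uint64_t* keys, size_t n) { std::sort(keys, keys + n); }
void SortScalar(uint64_t* keys, size_t n) { std::sort(keys, keys + n); }

using SortFn = void (*)(uint64_t*, size_t);

// Pick the best implementation the CPU we are running on supports.
SortFn ChooseSort() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx512f")) return SortAVX512;
  if (__builtin_cpu_supports("avx2")) return SortAVX2;
#endif
  return SortScalar;  // assumption: portable fallback for everything else
}

// Public entry point: detection happens once, on the first call.
void Sort(uint64_t* keys, size_t n) {
  static const SortFn fn = ChooseSort();
  fn(keys, n);
}
```

Callers only ever see Sort(keys, n); the same binary then uses AVX-512 where it exists and AVX2 or scalar code elsewhere.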

1 comment

What about performance-per-watt?
Main memory accesses dominate energy consumption, so the less data an algorithm moves to and from main memory, the less energy it will use.

https://www.researchgate.net/figure/Data-movement-is-overtak...

The chart above shows a main memory access costing roughly 1000x (3 orders of magnitude) the energy of a register move (which really should be called a copy).

The bigger the vectors, the better the performance per watt.