by holy_city
2586 days ago
This, 1000%. Especially for floating-point arithmetic. Compilers aren't that great at vectorization (or, more accurately, people aren't that great at writing algorithms that can be vectorized), and scalar floating-point operations on x86_64 generally have the same latency as their vectorized counterparts. When you account for how smart the pipelining/out-of-order engines on x86_64 CPUs are, you can achieve >4x throughput for the same algorithm, even if the vectorized version does additional or redundant arithmetic. Audio is one of the big areas where we can see huge gains, and I think philosophies about optimization are changing. That said, there are myths out there, like "recursive filters can't be vectorized", that need to be dispelled.
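To make the recursive-filter point concrete, here is a rough sketch of one way it can be done, assuming a simple one-pole filter y[n] = b*x[n] + a*y[n-1]. The function names (`onepole_scalar`, `onepole_block4`) and the block size of 4 are made up for illustration. The trick is to unroll the recurrence so that four outputs depend only on four inputs and the last output of the previous block. That costs extra multiplies (the redundant arithmetic mentioned above), but the four outputs no longer depend on each other, so they map onto four SIMD lanes. This is plain C, not intrinsics, and whether a given compiler actually vectorizes it is not guaranteed.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Reference: scalar one-pole filter, y[n] = b*x[n] + a*y[n-1].
 * Every output waits on the previous one (serial dependency chain). */
void onepole_scalar(const float *x, float *y, size_t n,
                    float a, float b, float *state)
{
    float prev = *state;
    for (size_t i = 0; i < n; ++i) {
        prev = b * x[i] + a * prev;
        y[i] = prev;
    }
    *state = prev;
}

/* Hypothetical block-of-4 form of the same filter. Unrolling gives
 *   y[i+k] = sum_{j<=k} a^(k-j) * b*x[i+j] + a^(k+1) * y[i-1]
 * so the four outputs are independent of each other; only one
 * dependency is carried from block to block. */
void onepole_block4(const float *x, float *y, size_t n,
                    float a, float b, float *state)
{
    assert(x != NULL && y != NULL && state != NULL);

    const float a2 = a * a, a3 = a2 * a, a4 = a2 * a2;
    float prev = *state;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* Scaled inputs for this block. */
        const float s0 = b * x[i],     s1 = b * x[i + 1];
        const float s2 = b * x[i + 2], s3 = b * x[i + 3];

        /* Lower-triangular weights on the inputs, plus powers of a
         * applied to the previous block's last output. */
        y[i]     = s0                               + a  * prev;
        y[i + 1] = s1 + a * s0                      + a2 * prev;
        y[i + 2] = s2 + a * s1 + a2 * s0            + a3 * prev;
        y[i + 3] = s3 + a * s2 + a2 * s1 + a3 * s0  + a4 * prev;

        prev = y[i + 3];
    }

    /* Leftover samples (n not a multiple of 4) use the scalar form. */
    for (; i < n; ++i) {
        prev = b * x[i] + a * prev;
        y[i] = prev;
    }
    *state = prev;
}
```

Both functions should agree up to floating-point rounding, since the block form reorders the arithmetic. The same idea extends to wider blocks and higher-order filters, at the cost of more redundant work per sample.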