by phire
2006 days ago
No. For many algorithms, AVX isn't a 2x speedup over SSE, especially when lanes are conditionally masked. Often you are happy to get a 1.25x speedup with AVX, and sometimes it actually goes slower. If you were to emulate code that gets a 1.25x speedup from AVX on the M1, you would end up with all the disadvantages of going 8-wide but none of the speedup. Each 256-bit AVX operation has to be split into two 128-bit NEON operations, so that 1.25x speedup is halved, and the emulated AVX code path actually runs at about 0.625x the speed of the emulated SSE code path.
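
To spell out the arithmetic above, here is a rough sketch. It assumes every emulated 8-wide AVX operation costs exactly twice as much as an emulated 4-wide SSE operation, which is a simplification. The function name and the `split_factor` parameter are made up for illustration.

```python
def emulated_avx_vs_sse(native_avx_speedup: float, split_factor: float = 2.0) -> float:
    """Speed of the emulated AVX path relative to the emulated SSE path.

    native_avx_speedup: how much faster AVX is than SSE on real x86 hardware.
    split_factor: assumed cost multiplier for emulating one 8-wide op
                  on 4-wide hardware (2.0 = each op is split in two).
    """
    return native_avx_speedup / split_factor


# Ideal 2x case, the more typical 1.25x case, and a case where AVX is already slower.
for speedup in (2.0, 1.25, 0.9):
    ratio = emulated_avx_vs_sse(speedup)
    print(f"native AVX speedup {speedup:.2f}x -> emulated AVX at {ratio:.3f}x of emulated SSE")
```

Under that assumption, emulated AVX only breaks even with emulated SSE when the native AVX speedup is a full 2x. Anything less is a net loss.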