|
|
|
|
|
by lifthrasiir
406 days ago
|
|
Personally I prefer fixed-size SIMD mainly because it enables more usages than usual vector instructions while vector instructions can be rather trivially lowered to fixed-size SIMD instructions. I'd call them as "opportunistic" usages, because those are perfectly fine without SIMD or vector but only get vectorized due to the relatively small size of SIMD registers. Those usages are significant enough that I see them as useful even with the presence of vector instructions. |
|
New x86 processor don't executes 128-bit SIMD, the vecto ALUs are all wider now and 128 and 256-bit instructions have the same throughput and latency.
Also, do you have an example for such "opportunistic" usages?
I suppose mainly things the SLP vectorizer can usually do already (in compiled languages, I'm not sure how good the JIT is these days).
I worry that we now may end up in a world, where "hand optimized SIMD" in WASM ends up slower than autovectorization, because you can't use the wider SIMD instructions and leave 2x (zen4) to 4x (zen5) of the performance on the table.