by johnt15
3731 days ago
The property of AVX and AVX2 you mentioned actually helps with having a single code path. If the SIMD wrapper allows parameterization on vector width (most do), you can simply increase the vector width when compiling for AVX and that's it.
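To make the "parameterize on vector width" idea concrete, here is a minimal sketch. `Vec<N>` and `add_bytes` are hypothetical names, not taken from any particular SIMD wrapper, and a plain byte array stands in for the real `__m128i` / `__m256i` register so the example compiles anywhere.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: N is the vector width in bytes (16 for SSE, 32 for AVX2).
// A real wrapper would hold a SIMD register; a plain array stands in for it here.
template <std::size_t N>
struct Vec {
    std::array<std::uint8_t, N> lanes{};

    static Vec load(const std::uint8_t* p) {
        Vec v;
        for (std::size_t i = 0; i < N; ++i) v.lanes[i] = p[i];
        return v;
    }

    void store(std::uint8_t* p) const {
        for (std::size_t i = 0; i < N; ++i) p[i] = lanes[i];
    }

    // Element-wise byte addition with wraparound.
    friend Vec operator+(const Vec& a, const Vec& b) {
        Vec r;
        for (std::size_t i = 0; i < N; ++i)
            r.lanes[i] = static_cast<std::uint8_t>(a.lanes[i] + b.lanes[i]);
        return r;
    }
};

// Single code path: only the width parameter N changes between builds.
template <std::size_t N>
void add_bytes(const std::uint8_t* a, const std::uint8_t* b,
               std::uint8_t* out, std::size_t len) {
    std::size_t i = 0;
    for (; i + N <= len; i += N)
        (Vec<N>::load(a + i) + Vec<N>::load(b + i)).store(out + i);
    for (; i < len; ++i)  // scalar tail for the leftover bytes
        out[i] = static_cast<std::uint8_t>(a[i] + b[i]);
}
```

Calling `add_bytes<16>` or `add_bytes<32>` gives the same result. This works because the operation is purely element-wise, so nothing in it depends on where one 128-bit lane ends and the next begins.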
Think about shuffle instructions (pshufb): the lookup vector for the instruction is different in AVX2 and SSE, because the AVX2 version shuffles within each 128-bit lane separately. Even if an AVX2 vector could be created by cloning an SSE vector twice, this must be the programmer's decision.
Another example is an algorithm that uses the video-encoding instruction mpsadbw to locate substrings (http://0x80.pl/articles/sse4_substring_locate.html#introduct...). The AVX2 instruction vmpsadbw operates on 128-bit lanes, and parts of the algorithm have to be rewritten to fit this limitation.
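The pshufb point can be illustrated with a scalar model of the two instructions. This is only a sketch of the documented semantics, not real intrinsics code: `pshufb_model`, `vpshufb_model` and `clone_twice` are hypothetical helper names, and plain byte arrays stand in for the registers.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;
using Bytes32 = std::array<std::uint8_t, 32>;

// Scalar model of SSSE3 pshufb: each output byte is picked from src by the
// low 4 bits of the index byte, or zeroed if the index's high bit is set.
Bytes16 pshufb_model(const Bytes16& src, const Bytes16& idx) {
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i)
        out[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x0F];
    return out;
}

// Scalar model of AVX2 vpshufb: the same rule, applied to each 128-bit lane
// independently. An index can only reach bytes inside its own lane.
Bytes32 vpshufb_model(const Bytes32& src, const Bytes32& idx) {
    Bytes32 out{};
    for (std::size_t i = 0; i < 32; ++i) {
        const std::size_t lane_base = i & 16;  // 0 for low lane, 16 for high lane
        out[i] = (idx[i] & 0x80) ? 0 : src[lane_base + (idx[i] & 0x0F)];
    }
    return out;
}

// "Cloning the SSE vector twice": the same 16 index bytes in both lanes.
Bytes32 clone_twice(const Bytes16& v) {
    Bytes32 out{};
    for (std::size_t i = 0; i < 16; ++i) {
        out[i] = v[i];
        out[i + 16] = v[i];
    }
    return out;
}
```

With a byte-reversing index vector (`idx[i] = 15 - i`), the 16-byte model reverses the whole register. The cloned 32-byte version reverses each half in place instead of reversing all 32 bytes. Whether that is what the algorithm needs is exactly the decision a wrapper cannot make on its own.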