by janwas
572 days ago
I hope people aren't writing directly against AVX2 intrinsics. When using a wrapper such as Highway, you get exactly this kind of update after a recompile, or even just by running your code on a CPU that supports newer instructions (via runtime dispatch). The cost is that the binary carries around both AVX2 and AVX-512 codepaths, but that is not an issue IMO.
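To make "the binary carries both codepaths" concrete, here is a rough hand-rolled sketch of runtime dispatch. To be clear, this is not Highway's API: it assumes GCC or Clang on x86-64 and uses their `__builtin_cpu_supports` builtin and `target` function attribute, and the names `Add`, `AddScalar`, `AddAvx2`, `AddAvx512`, `ChooseAdd` and `SKETCH_X86_DISPATCH` are made up for illustration. On any other compiler or CPU it simply falls back to the scalar loop.

```cpp
#include <cstddef>

// Assumption: GCC/Clang on x86-64. Anywhere else only the scalar path exists.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_X86_DISPATCH 1
#else
#define SKETCH_X86_DISPATCH 0
#endif

// Portable baseline, always compiled in.
static void AddScalar(const float* a, const float* b, float* out,
                      std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if SKETCH_X86_DISPATCH
// Compiled for AVX2 regardless of the global -m flags: 8 floats per step.
__attribute__((target("avx2")))
static void AddAvx2(const float* a, const float* b, float* out,
                    std::size_t n) {
  std::size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    const __m256 va = _mm256_loadu_ps(a + i);
    const __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
  for (; i < n; ++i) out[i] = a[i] + b[i];  // scalar remainder
}

// Compiled for AVX-512F: 16 floats per step.
__attribute__((target("avx512f")))
static void AddAvx512(const float* a, const float* b, float* out,
                      std::size_t n) {
  std::size_t i = 0;
  for (; i + 16 <= n; i += 16) {
    const __m512 va = _mm512_loadu_ps(a + i);
    const __m512 vb = _mm512_loadu_ps(b + i);
    _mm512_storeu_ps(out + i, _mm512_add_ps(va, vb));
  }
  for (; i < n; ++i) out[i] = a[i] + b[i];  // scalar remainder
}
#endif

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Picks the widest codepath the current CPU reports support for.
static AddFn ChooseAdd() {
#if SKETCH_X86_DISPATCH
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx512f")) return AddAvx512;
  if (__builtin_cpu_supports("avx2")) return AddAvx2;
#endif
  return AddScalar;
}

// Public entry point: the choice is made once, on first call.
void Add(const float* a, const float* b, float* out, std::size_t n) {
  static const AddFn fn = ChooseAdd();
  fn(a, b, out, n);
}
```

Callers only ever see `Add`. The same binary takes the AVX-512 path on a CPU that has it and the AVX2 or scalar path elsewhere, at the price of shipping all three bodies. A wrapper library presumably saves you from writing the per-target bodies by hand, which is the part this sketch does not show.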
|
One issue the abstractions do not remove is that the optimal code architecture, well above the level of the SIMD wrappers, depends on the capabilities of the silicon. The wrappers can't solve for that. And if you optimize the code architecture for the silicon architecture, it quickly approximates writing architecture-specific intrinsics with an additional layer of indirection, which significantly reduces any notional benefit from the abstractions.
The wrappers can't abstract enough, and higher-level abstractions (written with architecture-aware intrinsics) are often too use-case-specific to reuse widely.