by an1sotropy
1288 days ago
Not really, unfortunately, and it’s a pre-existing framework for teaching a class, so simplicity of compilation is extra important. Also, if I try to isolate the SIMD bits in C++, I’ll lose the opportunity to have them inlined, which would defeat the purpose of the optimization.
For those that are new to this, can you give an example of a kind of computation or algorithm which is well-served by your project, but not possible with vector extensions like https://clang.llvm.org/docs/LanguageExtensions.html#vectors-... ?
Agreed. Usually the interface would be something like RunEntireAlgorithm(), not DotProduct().
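To make the granularity point concrete, here is a minimal sketch, assuming the framework is C and the vectorizable work is a matrix-vector product. The signature, the `extern "C"` linkage and the scalar loop body are all assumptions for illustration; only the names RunEntireAlgorithm and DotProduct come from the comment above. The small kernel stays in the same translation unit so the compiler can still inline it, and only the coarse entry point crosses the language boundary.

```cpp
#include <cstddef>

namespace {

// Small kernel: lives in the same translation unit as its caller, so the
// compiler remains free to inline it. A real version would use SIMD here;
// this scalar loop is only a stand-in.
inline float DotProduct(const float* a, const float* b, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum += a[i] * b[i];
  }
  return sum;
}

}  // namespace

// Coarse-grained entry point (hypothetical signature). One call does the whole
// job, so the cost of the non-inlined call is paid once, not once per row.
// extern "C" is an assumption so that a C framework could link against it.
extern "C" void RunEntireAlgorithm(const float* matrix, const float* vec,
                                   float* out, std::size_t rows,
                                   std::size_t cols) {
  for (std::size_t r = 0; r < rows; ++r) {
    out[r] = DotProduct(matrix + r * cols, vec, cols);
  }
}
```

The C side would then only need a one-line prototype for RunEntireAlgorithm, and everything below that boundary can be inlined freely.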
> For those that are new to this, can you give an example of a kind of computation or algorithm which is well-served by your project but not possible with vector extensions
Sure. Vector extensions are OKish for simple math, but JPEG XL includes nontrivial cross-lane operations such as transpose and boundary handling for convolution. __builtin_shufflevector requires a vector length known at compile time, and the compiler can pessimize it (for example, by fusing two shuffles into one general all-to-all permute, which is more expensive than the two simple shuffles).
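For readers new to the extensions, a rough sketch of what the fixed-length constraint looks like in practice. The names Float4, InterleaveLower and HAVE_SHUFFLEVECTOR are made up for this example, and the scalar fallback is only there so the snippet still builds on compilers that lack the builtin. The vector type has its size baked in (16 bytes, four floats), and the shuffle indices must be integer constants, so the same source cannot adapt to a vector length that is only known at run time.

```cpp
// GCC/Clang vector extension type: exactly four floats (16 bytes).
// The lane count is part of the type and fixed at compile time.
typedef float Float4 __attribute__((vector_size(16)));

// Detect the builtin where the compiler lets us ask.
#ifdef __has_builtin
#if __has_builtin(__builtin_shufflevector)
#define HAVE_SHUFFLEVECTOR 1
#endif
#endif

// Interleaves the lower halves of a and b: {a0, b0, a1, b1}.
// This is one building block of a 4x4 transpose.
inline Float4 InterleaveLower(Float4 a, Float4 b) {
#ifdef HAVE_SHUFFLEVECTOR
  // Indices 0..3 pick lanes of a, 4..7 pick lanes of b.
  // They must be integer constants, not run-time values.
  return __builtin_shufflevector(a, b, 0, 4, 1, 5);
#else
  // Scalar fallback with the same result, for compilers without the builtin.
  Float4 r = {a[0], b[0], a[1], b[1]};
  return r;
#endif
}
```

Both the four-lane type and the literal index list would have to be rewritten for a different vector width, which is the limitation being described.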
Also, vqsort (https://github.com/google/highway/tree/master/hwy/contrib/so...) consists almost entirely of operations that the extensions do not support, and it works out of the box on variable-length RISC-V and SVE, which code using the compiler extensions cannot.