Hacker News new | ask | show | jobs
by vblanco 526 days ago
Interesting library, but i see it falls back into what happens to almost all SIMD libraries, which is that they hardcode the vector target completely and you cant mix/match feature levels within a build. The documentation recommends writing your kernels into DLLs and dynamic-loading them which is a huge mess https://jfalcou.github.io/eve/multiarch.html

Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the feature level as a template parameter on its vector objects, which lets you branch at runtime between simd levels as you wish. I find its a far better way of doing things if you actually want to ship the simd code to users.

5 comments

100% agreed. This is the main reason ISPC is my go-to tool for explicit vectorization.
+1, dynamic dispatch is important. Our Highway library has extensive support for this.

Detailed intro by kfjahnke here: https://github.com/kfjahnke/zimt/blob/multi_isa/examples/mul...

Thanks, that's an important caveat!

> Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the feature level as a template parameter on its vector objects

That's pretty cool because you can write function templates and instantiate different versions that you can select at runtime.

Yeah thts the fun of it, you create your kernel/function so that the simd level is a template parameter, and then you can use simple branching like:

if(supports<avx512>){ myAlgo<avx512>(); } else{ myAlgo<avx>(); }

Ive also used it for benchmarking to see if my code scales to different simd widths well and its a huge help

FYI: You don't want to do this. `supports<avx512>` is an expensive check. You really want to put this check in a static.
I guess this was just pseudo-code. Of course you don't want to do a runtime feature check over and over again.
Our answer to this - is dynamic dispatch. If you want to have multiple version of the same kernel compiled - compile multiple dlls.

The big problem here is: ODR violations. We really didn't want to do the xsimd thing of forcing the user to pass an arch everywhere.

Also that kinda defeats the purpose of "simd portability" - any code with avx2 can't work for an arm platform.

eve just works everywhere.

Example: https://godbolt.org/z/bEGd7Tnb3

It is possible to avoid ODR violations :) We put the per-target code into unique namespaces, and export a function pointer to them.
You can do many thing with macros and inline namespaces but I believe they run into problems when modules come into play. Can you compile the same code twice, with different flags with modules?
We use pragma target instead of compiler flags :)
I don't think we understand each other.

We want to take one function and compile it twice:

``` namespace MEGA_MACRO {

void foo(std::span<int> s) { super_awesome_platform_specific_thing(s); }

} // namespace MEGA_MACRO ```

Whatever you do - the code above has to be written once but compiled twice. In one file/in many files - doesn't matter.

My point is - I don't think you can compile that code twice if you support modules.

I think I do understand, this is exactly what we do. (MEGA_MACRO == HWY_NAMESPACE)

Then we have a table of function pointers to &AVX2::foo, &AVX3::foo etc. As long as the module exports one single thing, which either calls into or exports this table, I do not see how it is incompatible with building your project using modules enabled?

(The way we compile the code twice is to re-include our source file, taking care that only the SIMD parts are actually seen by the compiler, and stuff like the module exports would only be compiled once.)

Since you seem knowledgeable about this, what does this do differently from other SIMD libraries like xsimd / highway? Is it the addition of algorithms similar to the STD library that are explicitly SIMD optimized?
The algorithms I tried to make as good as I knew how. Maybe 95% there. Nice tail handling. A lot of things supported. I like or interface over other alternatives, but I'm biased here. Really massive math library.