	by vblanco 526 days ago
	Interesting library, but I see it falls into the same trap as almost all SIMD libraries: it hardcodes the vector target completely, so you can't mix and match feature levels within a build. The documentation recommends writing your kernels into DLLs and dynamic-loading them, which is a huge mess: https://jfalcou.github.io/eve/multiarch.html Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the feature level as a template parameter on its vector objects, which lets you branch at runtime between SIMD levels as you wish. I find it's a far better way of doing things if you actually want to ship the SIMD code to users.

5 comments

kookamamie 526 days ago

100% agreed. This is the main reason ISPC is my go-to tool for explicit vectorization.

janwas 526 days ago

+1, dynamic dispatch is important. Our Highway library has extensive support for this.

Detailed intro by kfjahnke here: https://github.com/kfjahnke/zimt/blob/multi_isa/examples/mul...

spacechild1 526 days ago

Thanks, that's an important caveat!

> Meanwhile xsimd (https://github.com/xtensor-stack/xsimd) has the feature level as a template parameter on its vector objects

That's pretty cool because you can write function templates and instantiate different versions that you can select at runtime.

vblanco 526 days ago

Yeah, that's the fun of it: you write your kernel/function so that the SIMD level is a template parameter, and then you can use simple branching like:

if (supports<avx512>) { myAlgo<avx512>(); } else { myAlgo<avx>(); }

I've also used it for benchmarking, to see whether my code scales well to different SIMD widths, and it's a huge help.
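
A minimal sketch of the pattern described above, in plain C++ with no SIMD library. The tag types `scalar_tag` and `wide_tag`, the kernel `sum_kernel`, and the `has_wide` flag are hypothetical names for illustration, not xsimd's API; the inner loop over `lanes` only stands in for a real vector operation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical feature-level tags; a real library would supply its own.
struct scalar_tag { static constexpr std::size_t lanes = 1; };
struct wide_tag   { static constexpr std::size_t lanes = 8; };

// Kernel written once, with the feature level as a template parameter.
template <class Arch>
float sum_kernel(const std::vector<float>& v) {
    constexpr std::size_t n = Arch::lanes;
    float acc[n] = {};                       // one partial sum per lane
    std::size_t i = 0;
    for (; i + n <= v.size(); i += n)        // full "vector" steps
        for (std::size_t l = 0; l < n; ++l) acc[l] += v[i + l];
    float total = 0.0f;
    for (std::size_t l = 0; l < n; ++l) total += acc[l];
    for (; i < v.size(); ++i) total += v[i]; // scalar tail
    return total;
}

// Runtime branch between the two instantiations.
// Assumption: `has_wide` comes from a CPU feature query done elsewhere.
float sum(const std::vector<float>& v, bool has_wide) {
    return has_wide ? sum_kernel<wide_tag>(v) : sum_kernel<scalar_tag>(v);
}
```

Because both instantiations live in the same binary, the same harness can also time `sum_kernel<scalar_tag>` against `sum_kernel<wide_tag>` directly, which is the benchmarking use mentioned above.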

dyaroshev 526 days ago

FYI: You don't want to do this. `supports<avx512>` is an expensive check. You really want to put this check in a static.
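
A sketch of what "put this check in a static" could look like. `probe_wide_simd` is a hypothetical stand-in for whatever expensive feature query the library performs (it just counts its calls and returns a fixed answer here); the point is only that a function-local static runs the query once.

```cpp
#include <atomic>

// Counts how often the (pretend) expensive query actually runs.
static std::atomic<int> probe_calls{0};

// Hypothetical stand-in for an expensive CPU feature query.
inline bool probe_wide_simd() {
    ++probe_calls;
    return false;  // assumption: pretend the wide level is unavailable
}

// The query runs on the first call only; initialization of function-local
// statics is thread-safe since C++11.
inline bool has_wide_simd() {
    static const bool cached = probe_wide_simd();
    return cached;
}

inline int scalar_path() { return 1; }
inline int wide_path()   { return 2; }

// Hot-path dispatch: after the first call this is just a cached bool test.
inline int run() { return has_wide_simd() ? wide_path() : scalar_path(); }
```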

spacechild1 525 days ago

I guess this was just pseudo-code. Of course you don't want to do a runtime feature check over and over again.

dyaroshev 526 days ago

Our answer to this is dynamic dispatch. If you want multiple versions of the same kernel compiled, compile multiple DLLs.

The big problem here is: ODR violations. We really didn't want to do the xsimd thing of forcing the user to pass an arch everywhere.

Also, that kinda defeats the purpose of "SIMD portability": any code that names avx2 can't work on an ARM platform.

eve just works everywhere.

Example: https://godbolt.org/z/bEGd7Tnb3

janwas 526 days ago

It is possible to avoid ODR violations :) We put the per-target code into unique namespaces, and export a function pointer to them.

dyaroshev 526 days ago

You can do many things with macros and inline namespaces, but I believe they run into problems when modules come into play. Can you compile the same code twice, with different flags, with modules?

link

janwas 525 days ago

We use pragma target instead of compiler flags :)
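
On GCC or Clang for x86, one way to get a per-function target without per-file compiler flags is the `target` function attribute, the per-function counterpart of `#pragma GCC target`. The sketch below assumes that toolchain and falls back to the baseline elsewhere; `add_baseline`, `add_avx2`, and `add` are hypothetical names, and this is not claimed to be how Highway itself is laid out.

```cpp
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_TARGET_ATTR 1
#else
#define HAVE_TARGET_ATTR 0
#endif

// Baseline version: compiled with whatever flags the translation unit got.
void add_baseline(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if HAVE_TARGET_ATTR
// Same loop, but the compiler may emit AVX2 instructions for this function
// even if -mavx2 was not passed on the command line.
__attribute__((target("avx2")))
void add_avx2(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Single entry point: checks the CPU once, then picks a version.
void add(const float* a, const float* b, float* out, std::size_t n) {
#if HAVE_TARGET_ATTR
    static const bool has_avx2 = __builtin_cpu_supports("avx2") != 0;
    if (has_avx2) { add_avx2(a, b, out, n); return; }
#endif
    add_baseline(a, b, out, n);
}
```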

dyaroshev 525 days ago

I don't think we understand each other.

We want to take one function and compile it twice:

```
namespace MEGA_MACRO {

void foo(std::span<int> s) { super_awesome_platform_specific_thing(s); }

}  // namespace MEGA_MACRO
```

Whatever you do - the code above has to be written once but compiled twice. In one file/in many files - doesn't matter.

My point is - I don't think you can compile that code twice if you support modules.

janwas 525 days ago

I think I do understand, this is exactly what we do. (MEGA_MACRO == HWY_NAMESPACE)

Then we have a table of function pointers to &AVX2::foo, &AVX3::foo etc. As long as the module exports one single thing, which either calls into or exports this table, I do not see how it is incompatible with building your project with modules enabled?

(The way we compile the code twice is to re-include our source file, taking care that only the SIMD parts are actually seen by the compiler, and stuff like the module exports would only be compiled once.)
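
A single-file sketch of the "written once, compiled once per target, selected through a table" idea. It uses a macro instead of re-including a source file, so it is an illustration of the shape and not Highway's actual mechanism; `DEFINE_TARGET`, `target_a`, `target_b`, `Target`, and `choose` are hypothetical names. Each expansion lands in its own namespace, so the two `foo` functions are distinct symbols and there is no ODR clash.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// The kernel body is written once, inside the macro. NS plays the role of
// MEGA_MACRO / HWY_NAMESPACE from the thread.
#define DEFINE_TARGET(NS)                                   \
    namespace NS {                                          \
    inline const char* name() { return #NS; }               \
    inline int foo(const std::vector<int>& s) {             \
        return std::accumulate(s.begin(), s.end(), 0);      \
    }                                                       \
    }

DEFINE_TARGET(target_a)  // stands in for the baseline build of foo
DEFINE_TARGET(target_b)  // stands in for a higher feature level

struct Target {
    const char* (*name)();
    int (*foo)(const std::vector<int>&);
};

// Table of function pointers, ordered best-first.
static const Target kTargets[] = {
    {&target_b::name, &target_b::foo},
    {&target_a::name, &target_a::foo},
};

// The one thing a module would need to export.
// Assumption: supported[i] says whether kTargets[i] can run on this CPU.
const Target& choose(const bool (&supported)[2]) {
    for (std::size_t i = 0; i < 2; ++i)
        if (supported[i]) return kTargets[i];
    return kTargets[1];  // treat the last entry as the always-available baseline
}
```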

vlovich123 526 days ago

Since you seem knowledgeable about this, what does this do differently from other SIMD libraries like xsimd / Highway? Is it the addition of algorithms, similar to those in the standard library, that are explicitly SIMD-optimized?

dyaroshev 526 days ago

The algorithms I tried to make as good as I knew how. Maybe 95% there. Nice tail handling. A lot of things supported. I like our interface over other alternatives, but I'm biased here. Really massive math library.
