by celeritascelery 1883 days ago
I find it interesting that, given the state of Rust SIMD, many implementations just opted for reimplementing the code once per intrinsic set (SSE, Neon, AVX, etc.). One would think that one of the generic libraries (faster, simdeez, packed_simd) would start to see widespread use, but they all seem to have issues that have prevented adoption.
3 comments

I work on a SIMD-optimized font library [0] and have stumbled into the same situation of hand-writing SIMD intrinsics. With some things it's just kinda hard to make sure they get optimized correctly, and the platforms differ enough that it matters when fiddling with bits. I also kinda have fun writing SIMD code like this.

[0]: https://github.com/mooman219/fontdue/blob/master/src/platfor...

The various SIMD ISAs are different enough from each other that you really want a custom implementation per ISA for perf reasons. Particularly if you're doing something off the beaten path like this rather than cranking through some vector math.
I don't know. Looking at their code, they have one for SSE and another for AVX2. The algorithm is identical except that one is 128 bits and the other is 256 bits. It seems like something that would be easy to abstract over so you only had a single implementation (at least for x86 SIMD). But it is obviously a hard problem or it would be solved already.
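
To make the "single implementation" idea concrete, here is a minimal sketch, assuming plain arrays as stand-ins for the 128-bit and 256-bit registers. `Lanes` and `sum_of_squares` are hypothetical names for illustration, not anything from the linked code or an existing crate, and a real version would put intrinsics where the loops are.

```rust
use std::ops::{Add, Mul};

/// Hypothetical stand-in for a SIMD register holding N f32 lanes
/// (N = 4 for a 128-bit register, N = 8 for a 256-bit one).
#[derive(Clone, Copy)]
pub struct Lanes<const N: usize>(pub [f32; N]);

impl<const N: usize> Lanes<N> {
    /// Broadcast one value to every lane.
    pub fn splat(v: f32) -> Self {
        Lanes([v; N])
    }

    /// Load the first N values of `chunk` (panics if it is shorter than N).
    pub fn load(chunk: &[f32]) -> Self {
        let mut out = [0.0f32; N];
        out.copy_from_slice(&chunk[..N]);
        Lanes(out)
    }

    /// Add all lanes together into a scalar.
    pub fn horizontal_sum(self) -> f32 {
        self.0.iter().sum()
    }
}

impl<const N: usize> Add for Lanes<N> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let mut out = self.0;
        for i in 0..N {
            out[i] += rhs.0[i];
        }
        Lanes(out)
    }
}

impl<const N: usize> Mul for Lanes<N> {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        let mut out = self.0;
        for i in 0..N {
            out[i] *= rhs.0[i];
        }
        Lanes(out)
    }
}

/// One algorithm body, written once and instantiated per lane count.
pub fn sum_of_squares<const N: usize>(data: &[f32]) -> f32 {
    let mut acc = Lanes::<N>::splat(0.0);
    let mut chunks = data.chunks_exact(N);
    for chunk in &mut chunks {
        let v = Lanes::<N>::load(chunk);
        acc = v * v + acc;
    }
    // Leftover elements that do not fill a whole register go through scalar code.
    let tail: f32 = chunks.remainder().iter().map(|x| x * x).sum();
    acc.horizontal_sum() + tail
}
```

`sum_of_squares::<4>(&data)` and `sum_of_squares::<8>(&data)` then share one body. This only covers the easy case where the two widths really are lane-for-lane identical; it says nothing about operations that behave differently across widths, which is presumably where the hard part lives.
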
Even for vector math, there are subtle differences, such as the numerical precision of the hardware reciprocal, leading to different implementations for x/y.
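
A rough scalar sketch of why the reciprocal precision matters, under loud assumptions: `approx_recip` fakes a hardware estimate by truncating the mantissa of `1.0 / y` to `kept_bits` bits (real estimate instructions are not specified this way), and all function names are made up for illustration. The point is only that a coarser estimate needs more Newton-Raphson steps before `x * (1/y)` is usable as `x / y`, so the per-ISA code ends up different.

```rust
/// Assumed stand-in for a hardware reciprocal estimate: the correctly rounded
/// 1/y with its mantissa truncated to `kept_bits` bits (at most 23).
pub fn approx_recip(y: f32, kept_bits: u32) -> f32 {
    assert!(kept_bits <= 23, "f32 has 23 explicit mantissa bits");
    let mask = !0u32 << (23 - kept_bits);
    f32::from_bits((1.0 / y).to_bits() & mask)
}

/// One Newton-Raphson step for 1/y: r' = r * (2 - y * r).
/// Each step roughly doubles the number of correct bits.
pub fn refine(y: f32, r: f32) -> f32 {
    r * (2.0 - y * r)
}

/// x / y computed as x * (1/y), starting from an estimate with `kept_bits`
/// bits of mantissa and applying `steps` refinement steps.
pub fn div_via_recip(x: f32, y: f32, kept_bits: u32, steps: u32) -> f32 {
    let mut r = approx_recip(y, kept_bits);
    for _ in 0..steps {
        r = refine(y, r);
    }
    x * r
}

/// Relative error of `got` against the f64 quotient x / y.
pub fn rel_err(got: f32, x: f32, y: f32) -> f64 {
    let exact = x as f64 / y as f64;
    ((got as f64 - exact) / exact).abs()
}
```

Comparing `rel_err(div_via_recip(1.0, 3.0, 8, 1), 1.0, 3.0)` with the same call using `12` kept bits shows the coarser estimate lagging by about one refinement step, which is the sort of difference that forces separate code paths.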

The Eigen library did recently combine most SSE and AVX code paths, however: https://eigen.tuxfamily.org/index.php?title=3.4

Can you comment on the Vector API that Java is set to include? Do you think it's entirely misguided, or more of a best effort given what one can reasonably expect from Java?

http://openjdk.java.net/jeps/8261663

(I'm completely ignorant on the subject)

packed_simd is now packed_simd2; the original author is MIA and no one else has crate or repo permissions.

I really think the general push right now is towards stdsimd being the future for portability.