Hacker News
by exDM69 3435 days ago
These are all great options for massive parallelism, but that's not what I'm after.

I want explicit SIMD with 2/4/8/16-wide vectors, primarily for 3D graphics and physics calculations.
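
For illustration, here is a rough sketch of what "explicit SIMD with 4-wide vectors" can look like. It assumes GCC or Clang, since `vector_size` is a compiler extension rather than standard C++. The names `float4`, `dot4` and `integrate` are made up for this example and are not taken from any library mentioned in the thread.

```cpp
// Assumption: GCC or Clang. vector_size is a compiler extension, not ISO C++.
// float4 is a hypothetical name for a 16-byte vector of four floats.
typedef float float4 __attribute__((vector_size(16)));

// Dot product: one lane-wise multiply, then a horizontal sum of the lanes.
static inline float dot4(float4 a, float4 b) {
    float4 p = a * b;
    return p[0] + p[1] + p[2] + p[3];
}

// One explicit Euler step, pos + vel * dt, applied to all four lanes at once.
static inline float4 integrate(float4 pos, float4 vel, float dt) {
    float4 dt4 = {dt, dt, dt, dt};  // broadcast the scalar to every lane
    return pos + vel * dt4;
}
```

Calling `integrate(pos, vel, 0.5f)` with `pos = {0, 1, 2, 0}` and `vel = {1, 0, -2, 0}` should give `{0.5, 1, 1, 0}`. The compiler maps the arithmetic to SSE or NEON where available and falls back to scalar code otherwise, so the same source builds on x86 and ARM.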

2 comments

I use SIMDPP [0], which allows you to explicitly write SIMD instructions in a portable way. See the documentation [1] for the available commands. Specifically, I write code to be used on both x86 and ARM systems.

> libsimdpp is a portable header-only zero-overhead C++ wrapper around single-instruction multiple-data (SIMD) intrinsics found in many compilers. The library presents a single interface over several instruction sets in such a way that the same source code may be compiled for different instruction sets. The resulting object files then may be hooked into internal dynamic dispatch mechanism.

> The library resolves differences between instruction sets by implementing the missing functionality as a combination of several intrinsics. Moreover, the library supplies a lot of additional, commonly used functionality, such as various variants of matrix transpositions, interleaving loads/stores, optimized compile-time shuffling instructions, etc. Each of these are implemented in the most efficient manner for the target instruction set. Finally, it's possible to fall back to native intrinsics when necessary, without compromising maintainability.

[0] https://github.com/p12tic/libsimdpp

[1] http://p12tic.github.io/libsimdpp/v2.0%7Erc2/libsimdpp/

Halide is for explicit SIMD, and a couple of the others provide good support for it as well. These tools are made by graphics and physics optimization people. Look at the examples.