by electricslpnsld 3067 days ago
A better comparison might be against a C++ library like Eigen, which is intended for use on these small systems, should optimize out function calls in a similar way, and uses intrinsics.
1 comment

If your matrices are of arbitrary size, Blaze [0] would be my pick. It has the best performance on the Blaze project's own benchmarks, on Baptiste Wicht's benchmarks, and on my own.

If the matrices are guaranteed to be small (as in this example), though, libxsmm [1] is specialized and highly optimized for that use case and beats the libraries mentioned above.

And yes, it's absolutely essential to make sure the benchmark didn't just move the computation to compile time, for example by letting the compiler constant-fold a product of fixed inputs.
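
To make that last point concrete, here is a rough sketch of one way to keep a small-matrix benchmark honest in plain C++. It is not taken from any of the libraries above: `Mat3`, `multiply`, `launder`, and `time_multiply` are hypothetical names made up for illustration. The idea is to pass the inputs through `volatile` reads so the optimizer cannot know their values, and to write part of each result to a `volatile` sink so the work cannot be discarded.

```cpp
#include <array>
#include <chrono>
#include <cstddef>

// Hypothetical row-major 3x3 matrix type, for illustration only.
using Mat3 = std::array<double, 9>;

// Plain 3x3 product; small enough that a compiler can fully unroll it
// and, given constant inputs, evaluate it entirely at compile time.
inline Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t k = 0; k < 3; ++k)
            for (std::size_t j = 0; j < 3; ++j)
                c[3 * i + j] += a[3 * i + k] * b[3 * k + j];
    return c;
}

// Copy a matrix through volatile accesses so the optimizer has to treat
// every element as a value it only learns at run time.
inline Mat3 launder(const Mat3& m) {
    Mat3 out{};
    for (std::size_t i = 0; i < 9; ++i) {
        volatile double v = m[i];
        out[i] = v;
    }
    return out;
}

// Times `iterations` products and returns the mean nanoseconds per
// iteration. `checksum` receives the accumulated trace of the results,
// which doubles as a sanity check that the work really happened.
inline double time_multiply(const Mat3& a0, const Mat3& b0,
                            std::size_t iterations, double& checksum) {
    volatile double sink = 0.0;  // volatile writes cannot be optimized away
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t n = 0; n < iterations; ++n) {
        // Laundering inside the loop also stops the compiler from
        // hoisting the loop-invariant product out of the loop.
        const Mat3 a = launder(a0);
        const Mat3 b = launder(b0);
        const Mat3 c = multiply(a, b);
        sink = sink + c[0] + c[4] + c[8];
    }
    const auto stop = std::chrono::steady_clock::now();
    checksum = sink;
    if (iterations == 0) return 0.0;
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

The laundering copies are themselves included in the measured time, so this overstates the cost of a single product; it only guarantees the product runs at run time. Looking at the generated assembly is still the most reliable check.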

[0]: https://bitbucket.org/blaze-lib/blaze

[1]: https://github.com/hfp/libxsmm