by stochastic_monk
2957 days ago
Eigen, Armadillo, Blaze, and ETL all ship their own replacement implementations of the BLAS routines, but each can also be linked against any external BLAS implementation. By the way, MKL supports AVX-512, while OpenBLAS does not yet. Benchmarks show MKL ahead by a factor of 4 for gemm.
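For anyone unsure what "gemm" covers: it is the general matrix-matrix multiply, C = alpha * A * B + beta * C, which BLAS exposes as dgemm (cblas_dgemm in the C interface) for doubles. Below is a minimal, unoptimized sketch of that operation, assuming row-major, densely packed storage. The name naive_gemm is made up for illustration and is not part of any of the libraries mentioned; it only shows what the tuned kernels compute, not how they compute it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (unoptimized) GEMM: C = alpha * A * B + beta * C.
// A is m x k, B is k x n, C is m x n, all row-major and densely packed.
// naive_gemm is a hypothetical name used only for this illustration.
void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                double alpha, const std::vector<double>& A,
                const std::vector<double>& B,
                double beta, std::vector<double>& C) {
    // Check that the buffers match the stated dimensions.
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    assert(C.size() == m * n);

    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Dot product of row i of A with column j of B.
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * B[p * n + j];
            }
            // Scale the product and blend it with the existing C entry.
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

A tuned BLAS does the same arithmetic but blocks it for cache and uses wide vector instructions, which is where instruction-set support such as AVX-512 matters.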
For large BLAS operations on AVX-512 (and maybe other x86_64 targets, now that dispatch is dynamic), use BLIS. BLIS also provides a non-BLAS interface. For small matrix multiplication, use libxsmm, of course.
Remember that the world isn't all amd64/x86_64. Off x86, BLIS is infinitely faster than MKL, since MKL doesn't run there at all, and it's probably faster even on Bulldozer/Zen. (I haven't compared on Bulldozer recently, and I don't have a Zen machine.)