by celrod
2175 days ago
That's what I did when calling OpenBLAS and MKL, but I confess I don't know the internal details of a non-inlined `matmul` call in gfortran when you don't use `-fexternal-blas`. Just writing three loops and letting the compiler optimize it was much faster for `A * B'`, so it must be a pretty naive implementation getting called.
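For anyone who wants to try the comparison, here is a minimal sketch of what "three loops" for `A * B'` could look like. The matrix sizes, variable names, and loop order are my own assumptions for illustration, not the code that was actually benchmarked. It only checks the loops against the `matmul` intrinsic for correctness and does no timing.

```fortran
program abt_loops
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sizes, chosen only for illustration.
  integer, parameter :: m = 64, n = 48, k = 32
  real(dp) :: A(m, k), B(n, k), C(m, n), C_ref(m, n)
  integer :: i, j, p

  call random_number(A)
  call random_number(B)

  ! C = A * B', i.e. C(i,j) = sum over p of A(i,p) * B(j,p).
  ! The innermost loop runs over i so that C(:,j) and A(:,p) are
  ! accessed contiguously (Fortran arrays are column-major).
  C = 0.0_dp
  do p = 1, k
    do j = 1, n
      do i = 1, m
        C(i, j) = C(i, j) + A(i, p) * B(j, p)
      end do
    end do
  end do

  ! Reference result from the intrinsic, for a correctness check only.
  C_ref = matmul(A, transpose(B))
  print *, 'max abs difference:', maxval(abs(C - C_ref))
end program abt_loops
```

Something like `gfortran -O3 -march=native abt_loops.f90` should build it. Adding `-fexternal-blas` and linking a BLAS library is the way to have `matmul` dispatch to BLAS instead, though gfortran only does that above a size threshold.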