by giaf
1679 days ago
Thanks for mentioning Octavian, I didn't know about this interesting project.
Are you referring to single- or multi-threaded applications? In embedded optimal control (i.e. the original framework motivating the Prometeo development), applications are typically single-threaded. In that case, for matrices of size 100x100, MKL is already _very_ close to peak performance, so nothing can be 2x faster without breaking the laws of physics.
[Trust that I know what I'm saying here: as the main BLASFEO developer, I check MKL performance often enough ;) ]
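To make the peak-performance argument concrete, here is a rough back-of-the-envelope sketch. The hardware numbers (3.0 GHz, 16 double-precision flops per cycle on one core) and the fractions of peak are assumptions chosen for illustration, not measurements of MKL or of any particular CPU.

```python
# Back-of-the-envelope bound on how much faster a single-threaded
# 100x100 matrix-matrix multiply can get. All hardware figures below
# are ASSUMED for illustration; plug in your own CPU's numbers.

def dgemm_flops(n: int) -> int:
    """Flops for an n x n times n x n product: one multiply and one add per inner step."""
    return 2 * n**3

def peak_gflops(freq_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical single-core peak in Gflop/s."""
    return freq_ghz * flops_per_cycle

def max_speedup(fraction_of_peak: float) -> float:
    """Largest possible speedup over a library already running at this fraction of peak."""
    return 1.0 / fraction_of_peak

FREQ_GHZ = 3.0          # assumed clock frequency
FLOPS_PER_CYCLE = 16    # assumed: 2 FMA units x 4 doubles x 2 flops per FMA

n = 100
peak = peak_gflops(FREQ_GHZ, FLOPS_PER_CYCLE)
min_time_us = dgemm_flops(n) / (peak * 1e9) * 1e6

print(f"flops: {dgemm_flops(n)}, peak: {peak} Gflop/s, lower bound: {min_time_us:.1f} us")
for fraction in (0.5, 0.8, 0.9):  # hypothetical fractions of peak
    print(f"library at {fraction:.0%} of peak -> at most {max_speedup(fraction):.2f}x faster")
```

The point is only that a 2x speedup requires the baseline to sit at 50% of peak or lower; once a library is above that, the remaining headroom is less than 2x no matter how the competing code is written.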
Just for reference, MKL has the special flags MKL_DIRECT_CALL and MKL_DIRECT_CALL_SEQ, which enable extra optimizations that improve performance for small matrices (e.g. by turning off most input-argument checks); these should definitely be used in a fair comparison. On top of that, linear algebra is much more than matrix-matrix multiplication: in embedded optimal control, for example, the performance of factorization routines plays a key role.