by cdavid
5274 days ago
NumPy dev here. Note that NumPy may not use BLAS (this was done to avoid any hard dependency on third-party libraries). I am not sure what you mean by putting the FLOP count below what's required. BLAS will still need O(N^3) operations for an NxN matrix multiplication, whether it is optimized or not. The biggest difference between libraries is usually in clever data organization/passing to use the CPU cache as efficiently as possible (memory throughput is usually the bottleneck until your data no longer fits in RAM). You can easily gain one order of magnitude using MKL compared to a naive implementation in C (which you should never, ever write, BTW).
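To make the FLOP-count point concrete, here is a minimal Python sketch (not NumPy's internal code). The helper names `naive_matmul` and `matmul_flops` are made up for illustration. The triple loop and a BLAS-backed product do the same classical O(N^3) arithmetic; any speed gap comes from how the work is organized, and whether `@` actually reaches an optimized BLAS depends on how your NumPy was built.

```python
import time

import numpy as np


def naive_matmul(a, b):
    """Textbook triple-loop product on nested lists (illustrative only)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_flops(n):
    """Classical count for an n x n product: n^3 multiplies + n^2 * (n - 1) adds."""
    return n**3 + n**2 * (n - 1)


if __name__ == "__main__":
    n = 100  # small, because the pure-Python loop is very slow
    rng = np.random.default_rng(0)
    a = rng.random((n, n))
    b = rng.random((n, n))

    t0 = time.perf_counter()
    c_naive = naive_matmul(a.tolist(), b.tolist())
    t1 = time.perf_counter()
    c_numpy = a @ b  # may or may not call an optimized BLAS, depending on the build
    t2 = time.perf_counter()

    print("same result:", np.allclose(c_naive, c_numpy))
    print("classical FLOPs:", matmul_flops(n))
    print(f"naive loop: {t1 - t0:.4f} s, numpy: {t2 - t1:.6f} s")
```

Most of the gap you will see here is Python interpreter overhead, not cache behaviour, so treat the timing as a rough illustration rather than a benchmark of BLAS versus C.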
|
Why wouldn't they use an algorithm that is better than O(N^3)?