by stephencanon
1577 days ago
That said, it _is_ optimizable; it's just that:

- the gains are the typical 2-8x speedups you get from vectorization, not the multiple-orders-of-magnitude gains you can get on dense GEMM.
- the absolute number of flops performed is O(n^2), versus O(n^3) for GEMM, so even if you could make tridiagonal operations infinitely fast, that optimization effort would be better spent on even small speedups to the O(n^3) work that probably makes up other parts of your algorithm.
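To put rough numbers on the second point, here is a minimal Python sketch. It assumes the total cost is exactly n^3 + n^2 flops with unit constants and that every flop costs the same, which real code will not satisfy. The function names are made up for illustration.

```python
def fraction_saved_tridiagonal(n):
    """Fraction of total cost removed if the O(n^2) part became free.

    Assumes (hypothetically) total cost = n^3 + n^2 with unit constants.
    """
    cubic = n ** 3
    quadratic = n ** 2
    return quadratic / (cubic + quadratic)


def fraction_saved_gemm(n, speedup):
    """Fraction of total cost removed by making the O(n^3) part
    `speedup` times faster, under the same cost model."""
    cubic = n ** 3
    quadratic = n ** 2
    return (cubic - cubic / speedup) / (cubic + quadratic)


if __name__ == "__main__":
    n = 1000
    # Infinitely fast tridiagonal work saves 1/(n+1) of the total.
    print(f"tridiagonal made free: {fraction_saved_tridiagonal(n):.4%}")
    # A modest 5% speedup on the cubic work saves far more.
    print(f"GEMM 1.05x faster:     {fraction_saved_gemm(n, 1.05):.4%}")
```

Under this model the first quantity is 1/(n+1), so it shrinks as n grows, while the second stays close to 1 - 1/speedup.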