by fdej
2948 days ago
This trick does work. If the matrices are stored in row-major order, you transpose B in memory and then compute A * (B^T)^T, so the multiplication reads both matrices in row order. This does improve performance over the naive algorithm, but it's still not as good as a tiling algorithm.
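A rough sketch of the access pattern, in case it helps. The names `transpose` and `matmul_transposed` are made up for illustration, and plain Python lists are not contiguous arrays, so this only shows the loop structure, not the cache benefit you would see in C or Fortran:

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul_transposed(a, b):
    """Compute a * b by transposing b first, so both operands are read row by row."""
    bt = transpose(b)  # one-time O(n^2) pass over b
    result = []
    for a_row in a:
        out_row = []
        for bt_row in bt:  # each row of bt is a column of b
            # Dot product of two rows: both are walked sequentially.
            out_row.append(sum(x * y for x, y in zip(a_row, bt_row)))
        result.append(out_row)
    return result


if __name__ == "__main__":
    print(matmul_transposed([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The inner loop never strides down a column of B, which is the whole point. A tiled version would instead work on small blocks of A and B that fit in cache, which reuses loaded data more times than this does.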