by silentvoice
4309 days ago
Matrix multiplication is one of the most abused computational kernels for showing off cache locality and vectorizing, optimizing compilers. Unfortunately, very few scientific codes consist of massive matrix-matrix multiplies, and even more unfortunately, quite a few of them require many vector additions and dot products. These operations are memory bound, and they drag down the performance of scientific codes that make even the cleverest use of BLAS. Your CPU may be able to churn out a bajillion gigaflops on a matrix-matrix multiply, but once you get to the vector adds and dot products you just can't feed that FLOPS-hungry beast fast enough to keep up the gains.
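To put rough numbers on that, here is a back-of-the-envelope sketch in Python of the usual roofline-style estimate: attainable FLOP rate is capped by either peak compute or memory bandwidth times arithmetic intensity (FLOPs per byte moved). The function names are made up for illustration, the 100 GFLOP/s peak and 20 GB/s bandwidth are hypothetical machine numbers, and the matrix-multiply traffic assumes ideal cache reuse (each matrix touched in main memory exactly once), so treat the output as an upper bound rather than a measurement.

```python
# Roofline-style estimate. All machine numbers below are hypothetical.
BYTES_PER_DOUBLE = 8


def intensity_matmul(n):
    """FLOPs per byte for C = A @ B on n x n doubles, assuming ideal cache reuse."""
    flops = 2 * n**3                       # n^3 multiplies + n^3 adds
    traffic = 3 * n**2 * BYTES_PER_DOUBLE  # read A, read B, write C once each
    return flops / traffic


def intensity_vector_add(n):
    """FLOPs per byte for z = x + y on length-n double vectors."""
    flops = n                              # one add per element
    traffic = 3 * n * BYTES_PER_DOUBLE     # read x, read y, write z
    return flops / traffic


def intensity_dot(n):
    """FLOPs per byte for a dot product of two length-n double vectors."""
    flops = 2 * n                          # one multiply + one add per element
    traffic = 2 * n * BYTES_PER_DOUBLE     # read x, read y
    return flops / traffic


def attainable_gflops(intensity, peak_gflops, bandwidth_gb_s):
    """Roofline bound: limited by compute or by memory bandwidth, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * intensity)


if __name__ == "__main__":
    PEAK = 100.0       # hypothetical peak, GFLOP/s
    BANDWIDTH = 20.0   # hypothetical memory bandwidth, GB/s
    kernels = [
        ("matmul n=1200", intensity_matmul(1200)),
        ("vector add", intensity_vector_add(10**6)),
        ("dot product", intensity_dot(10**6)),
    ]
    for name, ai in kernels:
        bound = attainable_gflops(ai, PEAK, BANDWIDTH)
        print(f"{name:>14}: {ai:8.4f} FLOP/byte -> at most {bound:7.2f} GFLOP/s")
```

The intensity of the matrix multiply grows with n, so it can hit the compute ceiling, while the vector add and dot product sit at a fixed 1/24 and 1/8 FLOP per byte no matter how long the vectors are. On this made-up machine that caps them at under 1 GFLOP/s and 2.5 GFLOP/s respectively, against a 100 GFLOP/s peak.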