by eindiran 2600 days ago

On the CPU, a straightforward matrix multiplication follows the same procedure you'd use to multiply matrices by hand, computing one entry at a time. GPUs, on the other hand, are good at performing the same operation on a bunch of data at the same time. Any operation that is embarrassingly parallel is a good fit for the GPU, and large matrix multiplications often are: each entry of the result depends only on one row of the first matrix and one column of the second, so the entries can be computed independently. So the premise is that the steps that don't need to be done sequentially get done in parallel on the GPU.
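To make the "independent steps" point concrete, here is a minimal sketch in plain Python (not GPU code). The names `matmul_element` and `matmul` are made up for illustration. The idea is that `matmul_element` plays the role of the work a single GPU thread would do, while the CPU version just calls it for every output position in order.

```python
def matmul_element(a, b, row, col):
    """Compute one entry of a*b: the dot product of a's row and b's column.

    This reads only row `row` of a and column `col` of b, so calls for
    different (row, col) pairs do not depend on each other.
    """
    return sum(a[row][k] * b[k][col] for k in range(len(b)))


def matmul(a, b):
    """By-hand procedure: visit every output position one after another.

    On a GPU, each (row, col) pair could instead be handed to its own
    thread, since no entry needs the result of any other entry.
    """
    n_rows, n_cols = len(a), len(b[0])
    return [[matmul_element(a, b, r, c) for c in range(n_cols)]
            for r in range(n_rows)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```

Because the order of the calls doesn't matter, you could compute the entries in any order, or all at once, and get the same result. That is what makes the problem a good fit for the GPU.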

Most approaches to matrix multiplication on the GPU arrange the work to play to the behavior described above, make good use of caching, and adapt to what the data in your matrix actually looks like (e.g. whether it's sparse).
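As a rough illustration of the caching point, here is a sketch of blocked (tiled) multiplication, again in plain Python rather than GPU code. The function name `matmul_tiled` and the `tile` parameter are assumptions for the example. The work is split into small square blocks so that the same small pieces of both inputs get reused while they are still close at hand. CUDA implementations commonly apply the same idea by staging tiles in fast on-chip shared memory, but this sketch only shows the loop structure.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (n x m) by b (m x p) one tile-sized block at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    # The outer three loops pick a block of the output (i0, j0) and a
    # matching block along the shared dimension (k0).
    for i0 in range(0, n, tile):
        for j0 in range(0, p, tile):
            for k0 in range(0, m, tile):
                # The inner loops touch only a tile-sized piece of a and b,
                # accumulating partial sums into the output block.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + tile, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul_tiled(a, identity))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The result is the same as the straightforward version; only the order in which the multiply-adds happen changes. The `min(...)` bounds handle matrices whose sizes are not a multiple of the tile size.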

To learn how you'd do matrix multiplication on the GPU, you might want to look up how it's done via CUDA, since many applications that make use of the GPU do so via CUDA and it doesn't require specific knowledge of graphics programming. This seems like a good introduction: https://www.shodor.org/media/content/petascale/materials/UPM...