by financltravsty 938 days ago

Parallelization.

Each "unit of work" in matrix multiplication is independent of every other unit of work. Pack as many cores as you can onto a chip, then feed in all your vectors at the same time.

I.e., basically a beefed-up GPU or an "AI" chip.
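
To make the independence concrete, here is a minimal sketch in plain Python. The helper names `row_times_matrix` and `parallel_matmul` are made up for illustration. Each output row is treated as one unit of work and handed to a thread pool; no unit reads another unit's result. CPython threads will not actually speed this up because of the GIL, so read it as a picture of the dependency structure, not as a fast implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def row_times_matrix(row, b):
    """One independent unit of work: a single output row of A @ B."""
    inner = len(b)
    cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]


def parallel_matmul(a, b, workers=4):
    """Multiply a (m x k) by b (k x n), computing each output row as its own task."""
    # Each result row needs only one row of `a` plus all of `b`,
    # so the rows can be computed in any order, on any worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(parallel_matmul(a, b))  # [[19, 22], [43, 50]]
```

On a GPU or an "AI" chip the same split goes further, typically down to tiles or individual output elements, but the reason it works is the same.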

1 comment

Symmetry 937 days ago

A million-element square matrix is a lot of data. Processing that in a second takes much more bandwidth than a single socket can support, so you'll need many sockets too.
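
A rough back-of-the-envelope sketch of that bandwidth argument follows. Everything numeric in it is an assumption, not something stated above: it reads "million element" as a million rows by a million columns, uses 4-byte elements, counts a single pass over one matrix, and takes a hypothetical 200 GB/s per socket. The function names are made up for illustration.

```python
import math


def bytes_in_square_matrix(n, bytes_per_element=4):
    """Storage for an n x n matrix (assumed 4-byte elements by default)."""
    return n * n * bytes_per_element


def sockets_needed(n, seconds=1.0, bytes_per_element=4, socket_bandwidth=200e9):
    """Sockets required just to stream the matrix once in `seconds`.

    socket_bandwidth is a hypothetical per-socket figure in bytes/second.
    """
    required = bytes_in_square_matrix(n, bytes_per_element) / seconds
    return math.ceil(required / socket_bandwidth)


n = 1_000_000
print(bytes_in_square_matrix(n) / 1e12)  # 4.0 (terabytes)
print(sockets_needed(n))                 # 20
```

Under those assumptions, merely reading the matrix once per second needs about 4 TB/s, which is why one socket is not enough however many cores it holds. A real multiplication touches the data more than once, so this is a lower bound.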
