Hacker News
by saagarjha 385 days ago
Matrix multiplies are typically compute-bound, but you don't get much room to improve the actual algorithm: Nvidia gives you a hardware accelerator for one algorithm, and anything else would be slower.
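
To make the "typically compute-bound" part concrete, here is a rough back-of-the-envelope sketch using the usual roofline-style comparison of arithmetic intensity against machine balance. The function names are made up for illustration, and the peak throughput and bandwidth figures are round hypothetical numbers, not the specs of any particular GPU. It also assumes each matrix is read or written from memory exactly once, which is the best case.

```python
def matmul_arithmetic_intensity(m, n, k, bytes_per_element):
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n], assuming ideal reuse."""
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    # Best case: A and B are each read once, C is written once.
    bytes_moved = (m * k + k * n + m * n) * bytes_per_element
    return flops / bytes_moved


def is_compute_bound(intensity, peak_flops, peak_bandwidth):
    """Compute-bound when intensity exceeds the machine balance (FLOPs per byte)."""
    return intensity > peak_flops / peak_bandwidth


# Hypothetical accelerator, round numbers chosen for illustration only.
PEAK_FLOPS = 1.0e15      # 1000 TFLOP/s of 16-bit matrix math (assumed)
PEAK_BANDWIDTH = 3.0e12  # 3 TB/s of memory bandwidth (assumed)

# Large square multiply with 2-byte elements: intensity grows with the size.
big = matmul_arithmetic_intensity(4096, 4096, 4096, 2)

# Matrix-vector shaped multiply: about one FLOP per byte, so memory-bound.
skinny = matmul_arithmetic_intensity(1, 4096, 4096, 2)

print(f"big:    {big:.1f} FLOP/byte, compute-bound: "
      f"{is_compute_bound(big, PEAK_FLOPS, PEAK_BANDWIDTH)}")
print(f"skinny: {skinny:.3f} FLOP/byte, compute-bound: "
      f"{is_compute_bound(skinny, PEAK_FLOPS, PEAK_BANDWIDTH)}")
```

Under these assumed numbers the large square case sits well above the machine balance while the matrix-vector shaped case sits far below it, which is why the claim is "typically" rather than "always".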