Hacker News
by rerx 1018 days ago
GPUs will be better optimized for large matrix multiplications than CPUs, by design.

And you need to run inference again and again, not just once (as you do with training).

1 comment

Desktop CPUs have integrated GPUs, so it's more complicated. If I infer on the GPU inside my CPU, how do you count that?
True. Memory bandwidth may well be the main limiting factor.
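
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. It assumes a roofline-style estimate in which each matrix is read or written exactly once, with 2-byte (fp16) elements. The function name and the 4096 layer size are made up for illustration, not taken from any particular model.

```python
def arithmetic_intensity(m, k, n, bytes_per_elem=2):
    """FLOPs per byte moved for an (m x k) @ (k x n) matmul.

    Assumption: each of the three matrices crosses the memory bus exactly
    once (no cache reuse modelled), and elements are fp16 (2 bytes).
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)
    return flops / bytes_moved


# Hypothetical 4096 x 4096 weight matrix.
single = arithmetic_intensity(1, 4096, 4096)     # one input vector at a time
batched = arithmetic_intensity(512, 4096, 4096)  # 512 inputs at once

print(f"batch 1:   {single:.2f} FLOPs/byte")
print(f"batch 512: {batched:.1f} FLOPs/byte")
```

With a single input the weights are streamed from memory for only about one FLOP per byte, so the memory bus saturates long before the arithmetic units do. Batching reuses each weight across many inputs, which is where raw matmul throughput starts to matter.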