Hacker News
by cardine 1259 days ago
A lot is offloaded to the CPU, such as the gradients and optimizer states. You are right, though, that quite a bit of computation is still done on the GPU.
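To put rough numbers on that split, here is a back-of-the-envelope sketch. It assumes a common mixed-precision Adam setup (fp16 parameters and gradients, fp32 master weights, momentum, and variance) and a hypothetical policy where gradients and optimizer states sit in CPU memory. The byte counts and the `memory_split_gb` helper are my own illustration, not something taken from the setup being discussed, and activations are ignored.

```python
# Back-of-the-envelope memory split for mixed-precision Adam training.
# Byte counts per parameter are assumptions (a common fp16 + Adam setup),
# not figures taken from the thread.
BYTES_PER_PARAM = {
    "fp16_params": 2,
    "fp16_grads": 2,
    "fp32_master_params": 4,
    "fp32_adam_momentum": 4,
    "fp32_adam_variance": 4,
}

# Hypothetical offload policy: gradients and optimizer states live on the CPU.
OFFLOADED = {
    "fp16_grads",
    "fp32_master_params",
    "fp32_adam_momentum",
    "fp32_adam_variance",
}


def memory_split_gb(num_params):
    """Return (gpu_gb, cpu_gb) for model states only, ignoring activations."""
    gpu = sum(b for k, b in BYTES_PER_PARAM.items() if k not in OFFLOADED)
    cpu = sum(b for k, b in BYTES_PER_PARAM.items() if k in OFFLOADED)
    return num_params * gpu / 1e9, num_params * cpu / 1e9


gpu_gb, cpu_gb = memory_split_gb(1_000_000_000)
print(f"GPU: {gpu_gb:.0f} GB, CPU: {cpu_gb:.0f} GB")
```

Under those assumptions most of the model-state memory ends up on the CPU side, while the forward and backward passes, which are the bulk of the arithmetic, would still run on the GPU.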