| HN Mirror


	by cardine 1260 days ago
	Offloading is when the computation is done on the CPU instead of the GPU. DeepSpeed is an example of this.

2 comments

borzunov 1260 days ago

In the case of offloading, the computations are usually still performed on the GPU, but the model is kept in RAM or on SSD instead of in GPU memory, and chunks of it are copied into GPU memory when needed.
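To make the "copy chunks in when necessary" idea concrete, here is a toy sketch in plain Python. It is only a simulation under stated assumptions: there is no real GPU, the `Device` class and `forward_offloaded` function are hypothetical names made up for illustration, and each "layer" is just an elementwise scale. It is not how DeepSpeed or any particular library implements offloading.

```python
# Toy simulation of weight offloading: no real GPU is involved.
# "host" stands in for RAM/SSD, "device" for the much smaller GPU memory.

class Device:
    """Hypothetical stand-in for GPU memory with a fixed capacity (in floats)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}      # layer name -> weights currently on the device
        self.peak_usage = 0     # largest number of floats held at once

    def used(self):
        return sum(len(w) for w in self.resident.values())

    def load(self, name, weights):
        if self.used() + len(weights) > self.capacity:
            raise MemoryError(f"{name} does not fit on the device")
        self.resident[name] = list(weights)   # copy host -> device
        self.peak_usage = max(self.peak_usage, self.used())

    def evict(self, name):
        del self.resident[name]               # free device memory


def forward_offloaded(host_layers, device, x):
    """Run the layers in order, keeping only one layer on the device at a time."""
    for name, weights in host_layers.items():
        device.load(name, weights)            # copy this chunk in when needed
        w = device.resident[name]
        # The computation itself happens "on the device": an elementwise scale.
        x = [xi * wi for xi, wi in zip(x, w)]
        device.evict(name)                    # make room for the next chunk
    return x


# Three layers of 4 weights each: 12 floats total, but the device holds only 4.
host_layers = {
    "layer0": [1.0, 2.0, 3.0, 4.0],
    "layer1": [0.5, 0.5, 0.5, 0.5],
    "layer2": [2.0, 2.0, 2.0, 2.0],
}
device = Device(capacity=4)
output = forward_offloaded(host_layers, device, [1.0, 1.0, 1.0, 1.0])
```

The point of the sketch is that the whole model (12 floats) never has to fit on the device at once; only the chunk currently in use does. The price is a host-to-device copy for every chunk on every pass.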

cardine 1260 days ago

A lot of work is offloaded to the CPU, such as holding the gradients and optimizer states and running the optimizer update on them. You are right, though, that quite a bit of computation is still done on the GPU.

rolenthedeep 1260 days ago

I remember when GPUs were starting to support arbitrary computation and offloading meant shifting work away from the CPU.
