Also, moving things to the GPU may be good for throughput but bad for latency, depending on the workload, since offloading to GPU has a cost and data exchanges too.
> offloading to GPU has a cost and data exchanges too.
That cost is significant with dGPUs over the PCIe bus, but much less so with GPUs that share a very fast memory bus with the CPU. In that case, the harder problem may be keeping one data layout that works well on both the CPU and the GPU.
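
To put rough numbers on the latency point, here is a back-of-envelope sketch. It assumes a simple linear cost model (fixed launch overhead, plus a copy in and a copy out over the link, plus processing), and every bandwidth and overhead figure below is a made-up placeholder rather than a measurement of any real hardware.

```python
def cpu_time(n_bytes, cpu_rate):
    """Seconds to process n_bytes on the CPU at cpu_rate bytes/s."""
    return n_bytes / cpu_rate


def gpu_time(n_bytes, gpu_rate, link_bw, launch_overhead):
    """Seconds to offload: fixed overhead + copy in and out + processing."""
    return launch_overhead + 2 * n_bytes / link_bw + n_bytes / gpu_rate


def break_even_bytes(cpu_rate, gpu_rate, link_bw, launch_overhead):
    """Smallest input size at which offloading stops losing on latency.

    Returns None if the GPU path never catches up under this model.
    """
    per_byte_gain = 1 / cpu_rate - (2 / link_bw + 1 / gpu_rate)
    if per_byte_gain <= 0:
        return None
    return launch_overhead / per_byte_gain


# Hypothetical figures, chosen only to illustrate the shape of the trade-off.
CPU_RATE = 10e9      # bytes/s the CPU can process
GPU_RATE = 200e9     # bytes/s the GPU can process
PCIE_BW = 25e9       # bytes/s over a discrete GPU's PCIe link
SHARED_BW = 200e9    # bytes/s when CPU and GPU share a fast memory bus
OVERHEAD = 20e-6     # seconds of fixed launch/sync cost per offload

pcie_break_even = break_even_bytes(CPU_RATE, GPU_RATE, PCIE_BW, OVERHEAD)
shared_break_even = break_even_bytes(CPU_RATE, GPU_RATE, SHARED_BW, OVERHEAD)

print(f"PCIe break-even:   {pcie_break_even / 1e6:.2f} MB")
print(f"Shared break-even: {shared_break_even / 1e6:.2f} MB")
```

Under these assumed numbers, small requests are faster on the CPU, and the break-even size is several times smaller with the shared bus than over PCIe. The model ignores overlap of copies with compute and says nothing about the data layout problem.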