by jcranmer 2778 days ago
Data transfer from the host CPU to the GPU card can kill the performance of offloading. You need a hefty data-parallel kernel, with a high-ish amount of work per element, to get a speedup that's worth the data transfer costs.
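
To put rough numbers on that tradeoff, here is a back-of-the-envelope cost model. It is only a sketch: the function name and every hardware figure (CPU and GPU throughput, link bandwidth) are made-up assumptions for illustration, not measurements, and it ignores launch latency, overlap of copies with compute, and memory-bound kernels.

```python
def offload_speedup(n_elements, bytes_per_element, flops_per_element,
                    cpu_flops, gpu_flops, link_bytes_per_s):
    """Estimate CPU time / GPU time for an offloaded kernel.

    Assumes the data is copied to the GPU and results of the same size
    are copied back, with no overlap between copies and compute.
    """
    work = n_elements * flops_per_element
    cpu_time = work / cpu_flops
    # Round trip over the host<->device link (input in, output out).
    transfer_time = 2 * n_elements * bytes_per_element / link_bytes_per_s
    gpu_time = work / gpu_flops + transfer_time
    return cpu_time / gpu_time


# Hypothetical hardware figures, chosen only to show the shape of the curve.
ASSUMED = dict(
    n_elements=1_000_000,
    bytes_per_element=4,       # one float32 per element
    cpu_flops=100e9,           # assumed sustained CPU throughput
    gpu_flops=5e12,            # assumed sustained GPU throughput
    link_bytes_per_s=12e9,     # assumed effective host<->device bandwidth
)

if __name__ == "__main__":
    for work_per_element in (1, 10, 100, 1000):
        s = offload_speedup(flops_per_element=work_per_element, **ASSUMED)
        verdict = "worth it" if s > 1 else "slower than staying on the CPU"
        print(f"{work_per_element:>5} flops/element: {s:.2f}x ({verdict})")
```

Under these assumed numbers, a kernel doing one or ten operations per element loses to the CPU because the copy dominates, and the offload only pays off once the work per element gets well into the tens of operations. Real break-even points depend heavily on the actual hardware and on whether the data can stay resident on the GPU between kernels.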