	by zcbenz 340 days ago
	In the absence of hardware unified memory, CUDA will automatically copy data between CPU/GPU when there are page faults.
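	A rough sketch of what that looks like in practice (illustrative only, not code from this thread): with `cudaMallocManaged`, one pointer is valid on both host and device, and on platforms that support demand paging (roughly Pascal or newer GPUs on Linux) the driver migrates pages when either side faults on them. The kernel name `add_one` and the buffer size are made up for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: increments every element in place.
__global__ void add_one(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int n = 1 << 20;  // arbitrary size chosen for the example
    float* data = nullptr;

    // One managed allocation; the same pointer is used on CPU and GPU.
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMallocManaged failed\n");
        return 1;
    }

    // The CPU touches the buffer first, so the pages are typically
    // resident in host memory at this point.
    for (int i = 0; i < n; ++i) data[i] = 0.0f;

    // The GPU touches the same pages. Without hardware-coherent memory,
    // the driver migrates them to the device (on fault where demand
    // paging is supported, otherwise around the kernel launch).
    add_one<<<(n + 255) / 256, 256>>>(data, n);
    if (cudaDeviceSynchronize() != cudaSuccess) {
        std::fprintf(stderr, "kernel execution failed\n");
        cudaFree(data);
        return 1;
    }

    // The CPU reads again, which brings the touched page back to the host.
    std::printf("data[0] = %f\n", data[0]);

    cudaFree(data);
    return 0;
}
```

	No explicit `cudaMemcpy` appears anywhere; the cost is paid as migration traffic instead, which is where the stalls mentioned further down the thread would come from.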

3 comments

fenced_load 340 days ago

There is also NVLink-C2C support between Nvidia's CPUs and GPUs that doesn't require any copy: CPUs and GPUs directly access each other's memory over a coherent bus. IIRC, they have 4 CPU + 4 GPU servers already available.


benreesman 340 days ago

Yeah, NCCL is a whole world and it's not even the only thing involved, but IIRC that's the difference between 8xH100 PCIe and 8xH100 SXM5.


saagarjha 340 days ago

This seems like it would be slow…


freeone3000 340 days ago

Matches my experience. It’s memory stalls all over the place, aggravated by the fact that (on 12.3 at least) there wasn’t even a prefetcher.


nickysielicki 340 days ago
