by
zcbenz
340 days ago
In the absence of hardware unified memory, CUDA will automatically copy data between CPU/GPU when there are page faults.
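To make that concrete, here is a minimal sketch of the managed-memory pattern, not a definitive implementation. It assumes a CUDA toolkit and a GPU that supports on-demand page migration (roughly Pascal or newer on Linux; on other setups the runtime may instead move the whole allocation at kernel launch). The kernel `increment` and the helper `managed_increment_ok` are hypothetical names made up for illustration; `cudaMallocManaged`, `cudaDeviceSynchronize`, `cudaGetLastError`, and `cudaFree` are the real runtime calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: add 1.0f to every element.
__global__ void increment(float* data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: returns true if every element equals 1.0f
// after a CPU write -> GPU update -> CPU read round trip.
bool managed_increment_ok(size_t n) {
    float* data = nullptr;
    // One allocation, one pointer, usable from both host and device code.
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return false;

    // First touch happens on the CPU.
    for (size_t i = 0; i < n; ++i) data[i] = 0.0f;

    // The GPU now touches the same pointer. No explicit cudaMemcpy:
    // the driver is responsible for making the data visible to the device.
    unsigned threads = 256;
    unsigned blocks = (unsigned)((n + threads - 1) / threads);
    increment<<<blocks, threads>>>(data, n);
    bool ok = cudaGetLastError() == cudaSuccess;

    // Wait for the kernel before the CPU reads the buffer again.
    ok = (cudaDeviceSynchronize() == cudaSuccess) && ok;

    // Reading on the CPU brings the data back to the host side as needed.
    for (size_t i = 0; ok && i < n; ++i) ok = (data[i] == 1.0f);

    cudaFree(data);
    return ok;
}

int main() {
    bool ok = managed_increment_ok(1 << 20);
    std::printf("%s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Built with something like `nvcc managed.cu -o managed`, the point is that neither direction needs an explicit copy in the source; the cost shows up at run time instead, wherever a page has to move.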
fenced_load
340 days ago
There is also NVLink-C2C support between Nvidia's CPUs and GPUs that doesn't require any copy: CPUs and GPUs directly access each other's memory over a coherent bus. IIRC, they have 4 CPU + 4 GPU servers already available.
benreesman
340 days ago
Yeah, NCCL is a whole world and it's not even the only thing involved, but IIRC that's the difference between 8xH100 PCIe and 8xH100 SXM5.
saagarjha
340 days ago
This seems like it would be slow…
freeone3000
340 days ago
Matches my experience. It’s memory stalls all over the place, aggravated by the fact that (on 12.3 at least) there wasn’t even a prefetcher.
nickysielicki
340 days ago
See also:
https://www.kernel.org/doc/html/v5.0/vm/hmm.html