Hacker News new | ask | show | jobs
by fragmede 646 days ago
NVLink is what makes multiGPU work. It lets the GPUs talk to each other across a high bandwidth (600 Gbps), low latency link. Tensorflow and PyTorch both support it, among other things. It's not this weird thing that's a side note, the interconnect between nodes is what makes a supercomputer super. You don't hear about it much because you don't hear about a lot of details of supercomputer stuff in mainstream media.
1 comments

Thank you, but this doesn't really answer OPs or my question. Is NVLink required if you want to run an LLM model which exceeds the memory of a single GPU? What are the benchmark comparisons with and without it?

I've heard that NVLink helps with training, but not so much with inferencing.