by csdreamer7 2170 days ago
Most consumer Ryzen motherboards are limited to 128 GB of RAM and 16-20 PCIe lanes wired directly to the CPU. Are 128 GB of RAM and x8 PCIe lanes per GPU in a dual-GPU setup a bottleneck for ML workloads?

I can see the lanes not being an issue for the next-gen Titans, which will likely use PCIe 4.0, but those are months away.

Asking as someone outside the ML field.

2 comments

In order, the bottlenecks are: GPU RAM, CPU RAM, then PCIe lanes.

Moving data from RAM to VRAM to run a task on the GPU incurs a big delay, so much so that if the data can't all fit on the GPU you'd be better off running the task on the CPU, unless you are very clever about how the data is buffered, which isn't an option for neural networks. Because of this, the PCIe link is not saturated except when first sending the data to VRAM. PCIe 3.0 x8 runs at about 7880 MB/s, so if your GPU has 16 GB of VRAM, filling it takes roughly 2 seconds at x8 versus roughly 1 second at x16. That is a difference of about 1 second on a task that can typically take 8+ hours to complete.
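
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. It assumes about 985 MB/s of usable bandwidth per PCIe 3.0 lane (which gives the ~7880 MB/s figure for x8), treats 16 GB as 16,000 MB, and ignores protocol and driver overhead, so treat the results as idealized estimates. The names `PCIE3_MB_PER_S_PER_LANE` and `transfer_seconds` are made up for illustration.

```python
# Assumed usable PCIe 3.0 throughput per lane in MB/s
# (8 GT/s with 128b/130b encoding, rounded; real-world numbers are lower).
PCIE3_MB_PER_S_PER_LANE = 985


def transfer_seconds(size_mb: float, lanes: int) -> float:
    """Idealized time to copy size_mb megabytes over a PCIe 3.0 link."""
    return size_mb / (PCIE3_MB_PER_S_PER_LANE * lanes)


vram_mb = 16 * 1000          # 16 GB of VRAM, in decimal megabytes
job_seconds = 8 * 3600       # an 8-hour training run

t_x8 = transfer_seconds(vram_mb, 8)
t_x16 = transfer_seconds(vram_mb, 16)
extra = t_x8 - t_x16                   # extra time spent filling VRAM once at x8
fraction_of_job = extra / job_seconds  # share of the whole run

print(f"x8:  {t_x8:.2f} s to fill VRAM")
print(f"x16: {t_x16:.2f} s to fill VRAM")
print(f"extra at x8: {extra:.2f} s ({fraction_of_job:.6%} of an 8-hour job)")
```

This only models a single one-off fill of VRAM. A workload that streams fresh batches over the bus for the whole run would need the same calculation applied per batch.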

Dropping from x16 to x8 PCIe lanes is usually not a bottleneck for ML. Still, it's always a good idea to benchmark and validate the configuration, especially if you're planning to spend a lot of money on a bunch of identical systems.

As for RAM, only you can know how big your datasets are. But if you're training models on GPUs, the bottleneck is almost certainly going to be GPU RAM, not system RAM.