by elmarhaussmann 2983 days ago
For this benchmark, neither NVLink nor gradient reduction is the bottleneck. Performance scales almost perfectly linearly from one GPU to four.